mirror of
https://github.com/google/comprehensive-rust.git
synced 2025-11-25 23:53:12 +02:00
"borrow checker invariants" section of the "leveraging the type system" chapter (#2867)
Adds materials on the "leveraging the type system/borrow checker invariants" subject. I'm still calibrating what's expected subject-and-style wise, so do spell out things where I've drifted off mark. --------- Co-authored-by: tall-vase <fiona@mainmatter.com> Co-authored-by: Dmitri Gribenko <gribozavr@gmail.com>
@@ -455,6 +455,14 @@
- [Serializer: implement Struct](idiomatic/leveraging-the-type-system/typestate-pattern/typestate-generics/struct.md)
- [Serializer: implement Property](idiomatic/leveraging-the-type-system/typestate-pattern/typestate-generics/property.md)
- [Serializer: Complete implementation](idiomatic/leveraging-the-type-system/typestate-pattern/typestate-generics/complete.md)
- [Borrow checking invariants](idiomatic/leveraging-the-type-system/borrow-checker-invariants.md)
- [Lifetimes and Borrows: the Abstract Rules](idiomatic/leveraging-the-type-system/borrow-checker-invariants/generalizing-ownership.md)
- [Single-use values](idiomatic/leveraging-the-type-system/borrow-checker-invariants/single-use-values.md)
- [Mutually Exclusive References / "Aliasing XOR Mutability"](idiomatic/leveraging-the-type-system/borrow-checker-invariants/aliasing-xor-mutability.md)
- [PhantomData and Types](idiomatic/leveraging-the-type-system/borrow-checker-invariants/phantomdata-01-types.md)
- [PhantomData and Types (implementation)](idiomatic/leveraging-the-type-system/borrow-checker-invariants/phantomdata-02-types-implemented.md)
- [PhantomData: Lifetimes for External Resources](idiomatic/leveraging-the-type-system/borrow-checker-invariants/phantomdata-03-lifetimes.md)
- [PhantomData: OwnedFd & BorrowedFd](idiomatic/leveraging-the-type-system/borrow-checker-invariants/phantomdata-04-borrowedfd.md)
- [Token Types](idiomatic/leveraging-the-type-system/token-types.md)
- [Permission Tokens](idiomatic/leveraging-the-type-system/token-types/permission-tokens.md)
- [Token Types with Data: Mutex Guards](idiomatic/leveraging-the-type-system/token-types/mutex-guard.md)
@@ -0,0 +1,116 @@
|
|||||||
|
---
|
||||||
|
minutes: 15
|
||||||
|
---
|
||||||
|
|
||||||
|
# Using the Borrow checker to enforce Invariants
|
||||||
|
|
||||||
|
The borrow checker, while added to enforce memory ownership, can model other
|
||||||
|
problems and prevent API misuse.
|
||||||
|
|
||||||
|
```rust,editable
|
||||||
|
/// Doors can be open or closed, and you need the right key to lock or unlock
|
||||||
|
/// one. Modelled with a Shared key and Owned door.
|
||||||
|
pub struct DoorKey {
|
||||||
|
pub key_shape: u32,
|
||||||
|
}
|
||||||
|
pub struct LockedDoor {
|
||||||
|
lock_shape: u32,
|
||||||
|
}
|
||||||
|
pub struct OpenDoor {
|
||||||
|
lock_shape: u32,
|
||||||
|
}
|
||||||
|
|
||||||
|
fn open_door(key: &DoorKey, door: LockedDoor) -> Result<OpenDoor, LockedDoor> {
|
||||||
|
if door.lock_shape == key.key_shape {
|
||||||
|
Ok(OpenDoor { lock_shape: door.lock_shape })
|
||||||
|
} else {
|
||||||
|
Err(door)
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
fn close_door(key: &DoorKey, door: OpenDoor) -> Result<LockedDoor, OpenDoor> {
|
||||||
|
if door.lock_shape == key.key_shape {
|
||||||
|
Ok(LockedDoor { lock_shape: door.lock_shape })
|
||||||
|
} else {
|
||||||
|
Err(door)
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
fn main() {
|
||||||
|
let key = DoorKey { key_shape: 7 };
|
||||||
|
let closed_door = LockedDoor { lock_shape: 7 };
|
||||||
|
let opened_door = open_door(&key, closed_door);
|
||||||
|
if let Ok(opened_door) = opened_door {
|
||||||
|
println!("Opened the door with key shape '{}'", key.key_shape);
|
||||||
|
} else {
|
||||||
|
eprintln!(
|
||||||
|
"Door wasn't opened! Your key only opens locks with shape '{}'",
|
||||||
|
key.key_shape
|
||||||
|
);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
<details>
|
||||||
|
|
||||||
|
- We've seen the borrow checker prevent memory safety bugs (use-after-free, data
|
||||||
|
races).
|
||||||
|
|
||||||
|
- We've also used types to shape and restrict APIs already using
|
||||||
|
[the Typestate pattern](../leveraging-the-type-system/typestate-pattern.md).
|
||||||
|
|
||||||
|
- Language features are often introduced for a specific purpose.
|
||||||
|
|
||||||
|
Over time, users may find uses for a feature that were not
|
||||||
|
predicted when it was introduced.
|
||||||
|
|
||||||
|
Java 5 introduced Generics in 2004 with the
|
||||||
|
[main stated purpose of enabling type-safe collections](https://jcp.org/en/jsr/detail?id=14).
|
||||||
|
|
||||||
|
Adoption was slow at first, but some new projects began designing their APIs
|
||||||
|
around generics from the beginning.
|
||||||
|
|
||||||
|
Since then, users and developers of the language expanded the use of generics
|
||||||
|
to other areas of type-safe API design:
|
||||||
|
- Class information can be held onto via Java's `Class<T>` or Guava's
|
||||||
|
`TypeToken<T>`.
|
||||||
|
- The Builder pattern can be implemented using Recursive Generics.
|
||||||
|
|
||||||
|
We aim to do something similar here: Even though the borrow checker was
|
||||||
|
introduced to prevent use-after-free and data races, we treat it as just
|
||||||
|
another API design tool.
|
||||||
|
|
||||||
|
It can be used to model program properties that have nothing to do with
|
||||||
|
preventing memory safety bugs.
|
||||||
|
|
||||||
|
- To use the borrow checker as a problem solving tool, we will need to "forget"
|
||||||
|
that the original purpose of it is to prevent mutable aliasing in the context
|
||||||
|
of preventing use-after-frees and data races.
|
||||||
|
|
||||||
|
We should imagine working within situations where the rules are the same but
|
||||||
|
the meaning is slightly different.
|
||||||
|
|
||||||
|
- This example uses ownership and borrowing to model the state of a
|
||||||
|
physical door.
|
||||||
|
|
||||||
|
`open_door` **consumes** a `LockedDoor` and returns a new `OpenDoor`. The old
|
||||||
|
`LockedDoor` value is no longer available.
|
||||||
|
|
||||||
|
If the wrong key is used, the door is left locked. It is returned as an `Err`
|
||||||
|
case of the `Result`.
|
||||||
|
|
||||||
|
It is a compile-time error to try and use a door that has already been opened.
|
||||||
|
|
||||||
|
- Similarly, `close_door` consumes an `OpenDoor`, preventing closing the door
|
||||||
|
twice at compile time.
|
||||||
|
|
||||||
|
- The rules of the borrow checker exist to prevent memory safety bugs, but the
|
||||||
|
underlying logical system does not "know" what memory is.
|
||||||
|
|
||||||
|
All the borrow checker does is enforce a specific set of rules of how users
|
||||||
|
can order operations.
|
||||||
|
|
||||||
|
This is just one case of piggy-backing onto the rules of the borrow checker to
|
||||||
|
design APIs to be harder or impossible to misuse.
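The following is a minimal sketch of why handing the door back in the `Err` case matters: the caller keeps ownership after a failed attempt and can retry with another key. The `try_keys` helper is hypothetical and not part of the slide; the other items mirror the example above.

```rust
pub struct DoorKey {
    pub key_shape: u32,
}
pub struct LockedDoor {
    lock_shape: u32,
}
pub struct OpenDoor {
    lock_shape: u32,
}

fn open_door(key: &DoorKey, door: LockedDoor) -> Result<OpenDoor, LockedDoor> {
    if door.lock_shape == key.key_shape {
        Ok(OpenDoor { lock_shape: door.lock_shape })
    } else {
        Err(door)
    }
}

/// Hypothetical helper: try each key in turn on the same door.
fn try_keys(keys: &[DoorKey], mut door: LockedDoor) -> Result<OpenDoor, LockedDoor> {
    for key in keys {
        // `door` is moved into `open_door`; we only get it back on failure.
        door = match open_door(key, door) {
            Ok(open) => return Ok(open),
            Err(still_locked) => still_locked,
        };
    }
    Err(door)
}

fn main() {
    let keys = [DoorKey { key_shape: 1 }, DoorKey { key_shape: 7 }];
    match try_keys(&keys, LockedDoor { lock_shape: 7 }) {
        Ok(_open) => println!("One of the keys fit."),
        Err(_locked) => println!("No key fit; the door is still locked."),
    }
}
```

Because `open_door` takes the door by value, there is no way to hold both a `LockedDoor` and an `OpenDoor` for the same door at once.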
|
||||||
|
|
||||||
|
</details>
|
||||||
@@ -0,0 +1,108 @@
|
|||||||
|
---
|
||||||
|
minutes: 15
|
||||||
|
---
|
||||||
|
|
||||||
|
# Mutually Exclusive References / "Aliasing XOR Mutability"
|
||||||
|
|
||||||
|
We can use the mutual exclusion of `&T` and `&mut T` references to prevent data
|
||||||
|
from being used before it is ready.
|
||||||
|
|
||||||
|
```rust,editable
|
||||||
|
pub struct QueryResult;
|
||||||
|
pub struct DatabaseConnection {/* fields omitted */}
|
||||||
|
|
||||||
|
impl DatabaseConnection {
|
||||||
|
pub fn new() -> Self {
|
||||||
|
Self {}
|
||||||
|
}
|
||||||
|
pub fn results(&self) -> &[QueryResult] {
|
||||||
|
&[] // fake results
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
pub struct Transaction<'a> {
|
||||||
|
connection: &'a mut DatabaseConnection,
|
||||||
|
}
|
||||||
|
|
||||||
|
impl<'a> Transaction<'a> {
|
||||||
|
pub fn new(connection: &'a mut DatabaseConnection) -> Self {
|
||||||
|
Self { connection }
|
||||||
|
}
|
||||||
|
pub fn query(&mut self, _query: &str) {
|
||||||
|
// Send the query over, but don't wait for results.
|
||||||
|
}
|
||||||
|
pub fn commit(self) {
|
||||||
|
// Finish executing the transaction and retrieve the results.
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
fn main() {
|
||||||
|
let mut db = DatabaseConnection::new();
|
||||||
|
|
||||||
|
// The transaction `tx` mutably borrows `db`.
|
||||||
|
let mut tx = Transaction::new(&mut db);
|
||||||
|
tx.query("SELECT * FROM users");
|
||||||
|
|
||||||
|
// This won't compile because `db` is already mutably borrowed by `tx`.
|
||||||
|
// let results = db.results(); // ❌🔨
|
||||||
|
|
||||||
|
// The borrow of `db` ends when `tx` is consumed by `commit()`.
|
||||||
|
tx.commit();
|
||||||
|
|
||||||
|
// Now it is possible to borrow `db` again.
|
||||||
|
let results = db.results();
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
<details>
|
||||||
|
|
||||||
|
- Motivation: In this database API, queries are kicked off for asynchronous
|
||||||
|
execution and the results are only available once the whole transaction is
|
||||||
|
finished.
|
||||||
|
|
||||||
|
A user might think that queries are executed immediately, and try to read
|
||||||
|
results before they are made available. This API misuse could make the app
|
||||||
|
read incomplete or incorrect data.
|
||||||
|
|
||||||
|
While an obvious misunderstanding, situations such as this can happen in
|
||||||
|
practice.
|
||||||
|
|
||||||
|
Ask: Has anyone misunderstood an API by not reading the docs for proper use?
|
||||||
|
|
||||||
|
Expect: Examples of early-career or in-university mistakes and
|
||||||
|
misunderstandings.
|
||||||
|
|
||||||
|
As an API grows in size and user base, a smaller percentage of users has deep
|
||||||
|
knowledge of the system the API represents.
|
||||||
|
|
||||||
|
- This example shows how we can use Aliasing XOR Mutability to prevent this kind
|
||||||
|
of misuse.
|
||||||
|
|
||||||
|
- The code might read results before they are ready if the programmer assumes
|
||||||
|
that the queries execute immediately rather than being kicked off for asynchronous
|
||||||
|
execution.
|
||||||
|
|
||||||
|
- The constructor for the `Transaction` type takes a mutable reference to the
|
||||||
|
database connection, and stores it in the returned `Transaction` value.
|
||||||
|
|
||||||
|
The explicit lifetime here doesn't have to be intimidating; it just means
|
||||||
|
"`Transaction` is outlived by the `DatabaseConnection` that was passed to it"
|
||||||
|
in this case.
|
||||||
|
|
||||||
|
The reference is mutable to completely lock out the `DatabaseConnection` from
|
||||||
|
other usage, such as starting further transactions or reading the results.
|
||||||
|
|
||||||
|
- While a `Transaction` exists, we can't touch the `DatabaseConnection` variable
|
||||||
|
that it was created from.
|
||||||
|
|
||||||
|
Demonstrate: uncomment the `db.results()` line. Doing so will result in a
|
||||||
|
compile error, as `db` is already mutably borrowed.
|
||||||
|
|
||||||
|
- Note: The query results not being public and placed behind a getter function
|
||||||
|
lets us enforce the invariant "users can only look at query results if there
|
||||||
|
are no active transactions."
|
||||||
|
|
||||||
|
If the query results were placed in a public struct field, this invariant
|
||||||
|
could be violated.
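As a sketch of the invariant in action, the variant below actually stores results, so the effect of `commit()` is observable. The `pending` field, the `transaction()` constructor, the `run_queries` helper, and the fake result strings are assumptions made for illustration; they are not part of the slide.

```rust
pub struct QueryResult(String);

pub struct DatabaseConnection {
    results: Vec<QueryResult>, // private: only reachable through `results()`
}

impl DatabaseConnection {
    pub fn new() -> Self {
        Self { results: Vec::new() }
    }
    pub fn results(&self) -> &[QueryResult] {
        &self.results
    }
    /// Hypothetical constructor: the transaction mutably borrows `self`.
    pub fn transaction(&mut self) -> Transaction<'_> {
        Transaction { connection: self, pending: Vec::new() }
    }
}

pub struct Transaction<'a> {
    connection: &'a mut DatabaseConnection,
    pending: Vec<String>,
}

impl<'a> Transaction<'a> {
    pub fn query(&mut self, query: &str) {
        // Only record the query; nothing is visible in `results()` yet.
        self.pending.push(query.to_string());
    }
    pub fn commit(self) {
        let Transaction { connection, pending } = self;
        for query in pending {
            // Fake "execution": one result per query.
            connection.results.push(QueryResult(format!("result of `{query}`")));
        }
    }
}

/// Hypothetical helper: run all queries in one transaction, return the result count.
fn run_queries(db: &mut DatabaseConnection, queries: &[&str]) -> usize {
    let mut tx = db.transaction();
    for query in queries {
        tx.query(query);
    }
    // db.results(); // does not compile: `db` is mutably borrowed by `tx`
    tx.commit();
    db.results().len()
}

fn main() {
    let mut db = DatabaseConnection::new();
    let count = run_queries(&mut db, &["SELECT 1", "SELECT 2"]);
    println!("{count} results available after commit");
}
```

Reading results is only possible before a transaction starts or after `commit()` consumes it, which is the invariant stated in the note above.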
|
||||||
|
|
||||||
|
</details>
|
||||||
@@ -0,0 +1,72 @@
|
|||||||
|
---
|
||||||
|
minutes: 10
|
||||||
|
---
|
||||||
|
|
||||||
|
# Lifetimes and Borrows: the Abstract Rules
|
||||||
|
|
||||||
|
```rust,editable
|
||||||
|
// An internal data type to have something to hold onto.
|
||||||
|
pub struct Internal;
|
||||||
|
// The "outer" data.
|
||||||
|
pub struct Data(Internal);
|
||||||
|
|
||||||
|
fn shared_use(value: &Data) -> &Internal {
|
||||||
|
&value.0
|
||||||
|
}
|
||||||
|
fn exclusive_use(value: &mut Data) -> &mut Internal {
|
||||||
|
&mut value.0
|
||||||
|
}
|
||||||
|
fn deny_future_use(value: Data) {}
|
||||||
|
|
||||||
|
fn demo_exclusive() {
|
||||||
|
let mut value = Data(Internal);
|
||||||
|
let shared = shared_use(&value);
|
||||||
|
// let exclusive = exclusive_use(&mut value); // ❌🔨
|
||||||
|
let shared_again = &shared;
|
||||||
|
}
|
||||||
|
|
||||||
|
fn demo_denied() {
|
||||||
|
let value = Data(Internal);
|
||||||
|
deny_future_use(value);
|
||||||
|
// let shared = shared_use(&value); // ❌🔨
|
||||||
|
}
|
||||||
|
|
||||||
|
# fn main() {}
|
||||||
|
```
|
||||||
|
|
||||||
|
<details>
|
||||||
|
|
||||||
|
- This example re-frames the borrow checker rules away from references and
|
||||||
|
towards semantic meaning in non-memory-safety settings.
|
||||||
|
|
||||||
|
Nothing is being mutated, nothing is being sent across threads.
|
||||||
|
|
||||||
|
- In Rust's borrow checker we have access to three different ways of "taking" a
|
||||||
|
value:
|
||||||
|
|
||||||
|
- Owned value `T`. The value is dropped when the scope ends, unless it is
|
||||||
|
returned to another scope.
|
||||||
|
|
||||||
|
- Shared Reference `&T`. Allows aliasing but prevents mutable access while
|
||||||
|
shared references are in use.
|
||||||
|
|
||||||
|
- Mutable Reference `&mut T`. Only one of these is allowed to exist for a
|
||||||
|
value at any one point, but can be used to create shared references.
|
||||||
|
|
||||||
|
- Ask: The two commented-out lines in the `demo` functions would cause
|
||||||
|
compilation errors. Why?
|
||||||
|
|
||||||
|
`demo_exclusive`: Because the `shared` reference is still in use after the
|
||||||
|
`exclusive` reference is taken.
|
||||||
|
|
||||||
|
`demo_denied`: Because `value` is consumed the line before the
|
||||||
|
`shared` reference is taken from `&value`.
|
||||||
|
|
||||||
|
- Remember that every `&T` and `&mut T` has a lifetime, just one the user
|
||||||
|
doesn't have to annotate or think about most of the time.
|
||||||
|
|
||||||
|
We rarely specify lifetimes because the Rust compiler allows us to _elide_
|
||||||
|
them in most cases. See:
|
||||||
|
[Lifetime Elision](../../../lifetimes/lifetime-elision.md)
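A small sketch of the three ways of "taking" a value, framed as a ticket rather than as memory. The `Ticket` type and the `inspect`, `stamp`, `redeem`, and `lifecycle` functions are hypothetical names chosen for illustration. Unlike the slide, this sketch does mutate a value, purely to make the exclusive step observable.

```rust
pub struct Ticket(u32);

/// Shared use: any number of callers may look at the ticket.
fn inspect(ticket: &Ticket) -> u32 {
    ticket.0
}

/// Exclusive use: only one party may stamp the ticket at a time.
fn stamp(ticket: &mut Ticket) {
    ticket.0 += 1;
}

/// Owned use: redeeming the ticket denies all future use.
fn redeem(ticket: Ticket) -> u32 {
    ticket.0
}

fn lifecycle() -> (u32, u32, u32) {
    let mut ticket = Ticket(0);

    // Two shared borrows may coexist.
    let first = &ticket;
    let second = &ticket;
    let sum = inspect(first) + inspect(second);

    // The shared borrows are no longer used, so an exclusive borrow is allowed.
    stamp(&mut ticket);

    // A mutable reference can be used where a shared one is expected.
    let exclusive = &mut ticket;
    let seen = inspect(exclusive);

    // Moving the ticket ends every possibility of borrowing it again.
    let redeemed = redeem(ticket);
    // inspect(&ticket); // does not compile: `ticket` was moved

    (sum, seen, redeemed)
}

fn main() {
    println!("{:?}", lifecycle());
}
```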
|
||||||
|
|
||||||
|
</details>
|
||||||
@@ -0,0 +1,48 @@
|
|||||||
|
---
|
||||||
|
minutes: 5
|
||||||
|
---
|
||||||
|
|
||||||
|
# PhantomData 1/4: De-duplicating Same Data & Semantics
|
||||||
|
|
||||||
|
The newtype pattern can sometimes come up against the DRY principle. How do we
|
||||||
|
solve this?
|
||||||
|
|
||||||
|
<!-- dprint-ignore-start -->
|
||||||
|
```rust,editable,compile_fail
|
||||||
|
pub struct UserId(u64);
|
||||||
|
impl ChatUser for UserId { /* ... */ }
|
||||||
|
|
||||||
|
pub struct PatronId(u64);
|
||||||
|
impl ChatUser for PatronId { /* ... */ }
|
||||||
|
|
||||||
|
pub struct ModeratorId(u64);
|
||||||
|
impl ChatUser for ModeratorId { /* ... */ }
|
||||||
|
impl ChatModerator for ModeratorId { /* ... */ }
|
||||||
|
|
||||||
|
pub struct AdminId(u64);
|
||||||
|
impl ChatUser for AdminId { /* ... */ }
|
||||||
|
impl ChatModerator for AdminId { /* ... */ }
|
||||||
|
impl ChatAdmin for AdminId { /* ... */ }
|
||||||
|
|
||||||
|
// And so on ...
|
||||||
|
fn main() {}
|
||||||
|
```
|
||||||
|
<!-- dprint-ignore-end -->
|
||||||
|
|
||||||
|
<details>
|
||||||
|
|
||||||
|
- Problem: We want to use the newtype pattern to differentiate permissions, but
|
||||||
|
we're having to implement the same traits over and over again for the same
|
||||||
|
data.
|
||||||
|
|
||||||
|
- Ask: Assume the details of each implementation here are the same between
|
||||||
|
types, what are ways we can avoid repeating ourselves?
|
||||||
|
|
||||||
|
Expect:
|
||||||
|
- Make this an enum, not distinct data types.
|
||||||
|
- Bundle the user ID with permission tokens like
|
||||||
|
`struct Admin(u64, UserPermission, ModeratorPermission, AdminPermission);`
|
||||||
|
- Adding a type parameter which encodes permissions.
|
||||||
|
- Mentioning `PhantomData` ahead of schedule (it's in the title).
|
||||||
|
|
||||||
|
</details>
|
||||||
@@ -0,0 +1,91 @@
|
|||||||
|
---
|
||||||
|
minutes: 10
|
||||||
|
---
|
||||||
|
|
||||||
|
# PhantomData 2/4: Type-level tagging
|
||||||
|
|
||||||
|
Let's solve the problem from the previous slide by adding a type parameter.
|
||||||
|
|
||||||
|
<!-- dprint-ignore-start -->
|
||||||
|
```rust,editable
|
||||||
|
// use std::marker::PhantomData;
|
||||||
|
|
||||||
|
pub struct ChatId<T> { id: u64, tag: T }
|
||||||
|
|
||||||
|
pub struct UserTag;
|
||||||
|
pub struct AdminTag;
|
||||||
|
|
||||||
|
pub trait ChatUser {/* ... */}
|
||||||
|
pub trait ChatAdmin {/* ... */}
|
||||||
|
|
||||||
|
impl ChatUser for UserTag {/* ... */}
|
||||||
|
impl ChatUser for AdminTag {/* ... */} // Admins are users
|
||||||
|
impl ChatAdmin for AdminTag {/* ... */}
|
||||||
|
|
||||||
|
// impl <T> Debug for ChatId<T> {/* ... */}
|
||||||
|
// impl <T> PartialEq for ChatId<T> {/* ... */}
|
||||||
|
// impl <T> Eq for ChatId<T> {/* ... */}
|
||||||
|
// And so on ...
|
||||||
|
|
||||||
|
impl <T: ChatUser> ChatId<T> {/* All functionality for users and above */}
|
||||||
|
impl <T: ChatAdmin> ChatId<T> {/* All functionality for only admins */}
|
||||||
|
|
||||||
|
fn main() {}
|
||||||
|
```
|
||||||
|
<!-- dprint-ignore-end -->
|
||||||
|
|
||||||
|
<details>
|
||||||
|
|
||||||
|
- Here we're using a type parameter and gating permissions behind "tag" types
|
||||||
|
that implement different permission traits.
|
||||||
|
|
||||||
|
Tag types, or marker types, are zero-sized types that have some semantic
|
||||||
|
meaning to users and API designers.
|
||||||
|
|
||||||
|
- Ask: What issues does storing an actual instance of the tag type pose?
|
||||||
|
|
||||||
|
Answer: If it's not a zero-sized type (like `()` or `struct MyTag;`), then
|
||||||
|
we're allocating more memory than we need to when all we care for is type
|
||||||
|
information that is only relevant at compile-time.
|
||||||
|
|
||||||
|
- Demonstrate: remove the `tag` value entirely, then compile!
|
||||||
|
|
||||||
|
This won't compile, as there's an unused (phantom) type parameter.
|
||||||
|
|
||||||
|
This is where `PhantomData` comes in!
|
||||||
|
|
||||||
|
- Demonstrate: Uncomment the `PhantomData` import, and make `ChatId<T>` the
|
||||||
|
following:
|
||||||
|
|
||||||
|
```rust,compile_fail
|
||||||
|
pub struct ChatId<T> {
|
||||||
|
id: u64,
|
||||||
|
tag: PhantomData<T>,
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
- `PhantomData<T>` is a zero-sized type with a type parameter. We can construct
|
||||||
|
values of it like other ZSTs with
|
||||||
|
`let phantom: PhantomData<UserTag> = PhantomData;` or with the
|
||||||
|
`PhantomData::default()` implementation.
|
||||||
|
|
||||||
|
Demonstrate: implement `From<u64>` for `ChatId<T>`, emphasizing the
|
||||||
|
construction of `PhantomData`
|
||||||
|
|
||||||
|
```rust,compile_fail
|
||||||
|
impl<T> From<u64> for ChatId<T> {
|
||||||
|
fn from(value: u64) -> Self {
|
||||||
|
ChatId {
|
||||||
|
id: value,
|
||||||
|
// Or `PhantomData::default()`
|
||||||
|
tag: PhantomData,
|
||||||
|
}
|
||||||
|
}
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
- `PhantomData` can be used as part of the Typestate pattern to have data with
|
||||||
|
the same structure but different methods, e.g., have `TaggedData<Start>`
|
||||||
|
implement methods or trait implementations that `TaggedData<End>` doesn't.
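Putting the steps from these notes together, a complete version of the tagged ID might look like the sketch below. The `post` and `ban` methods and their return strings are hypothetical placeholders for "functionality for users" and "functionality for admins".

```rust
use std::marker::PhantomData;

pub struct ChatId<T> {
    id: u64,
    tag: PhantomData<T>, // zero-sized: only carries the type
}

pub struct UserTag;
pub struct AdminTag;

pub trait ChatUser {}
pub trait ChatAdmin {}

impl ChatUser for UserTag {}
impl ChatUser for AdminTag {} // Admins are users
impl ChatAdmin for AdminTag {}

impl<T> From<u64> for ChatId<T> {
    fn from(id: u64) -> Self {
        ChatId { id, tag: PhantomData }
    }
}

impl<T: ChatUser> ChatId<T> {
    /// Hypothetical method available to users and above.
    pub fn post(&self, message: &str) -> String {
        format!("user {} says: {}", self.id, message)
    }
}

impl<T: ChatAdmin> ChatId<T> {
    /// Hypothetical method available to admins only.
    pub fn ban(&self, target: u64) -> String {
        format!("user {} banned user {}", self.id, target)
    }
}

fn main() {
    let user: ChatId<UserTag> = ChatId::from(1);
    let admin: ChatId<AdminTag> = ChatId::from(2);
    println!("{}", user.post("hello"));
    println!("{}", admin.post("hello"));
    println!("{}", admin.ban(1));
    // user.ban(2); // does not compile: `UserTag` does not implement `ChatAdmin`
}
```

The tag costs nothing at runtime: `ChatId<UserTag>` has the same size as the `u64` it wraps.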
|
||||||
|
|
||||||
|
</details>
|
||||||
@@ -0,0 +1,114 @@
|
|||||||
|
---
|
||||||
|
minutes: 15
|
||||||
|
---
|
||||||
|
|
||||||
|
# PhantomData 3/4: Lifetimes for External Resources
|
||||||
|
|
||||||
|
The invariants of external resources often match what we can do with lifetime
|
||||||
|
rules.
|
||||||
|
|
||||||
|
```rust,editable
|
||||||
|
// use std::marker::PhantomData;
|
||||||
|
|
||||||
|
/// Direct FFI to a database library in C.
|
||||||
|
/// We got this API as is, we have no influence over it.
|
||||||
|
mod ffi {
|
||||||
|
pub type DatabaseHandle = u8; // maximum 255 databases open at the same time
|
||||||
|
|
||||||
|
fn database_open(name: *const std::os::raw::c_char) -> DatabaseHandle {
|
||||||
|
unimplemented!()
|
||||||
|
}
|
||||||
|
// ... etc.
|
||||||
|
}
|
||||||
|
|
||||||
|
struct DatabaseConnection(ffi::DatabaseHandle);
|
||||||
|
struct Transaction<'a>(&'a mut DatabaseConnection);
|
||||||
|
|
||||||
|
impl DatabaseConnection {
|
||||||
|
fn new_transaction(&mut self) -> Transaction<'_> {
|
||||||
|
Transaction(self)
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
fn main() {}
|
||||||
|
```
|
||||||
|
|
||||||
|
<details>
|
||||||
|
|
||||||
|
- Remember the transaction API from the
|
||||||
|
[Aliasing XOR Mutability](./aliasing-xor-mutability.md) example.
|
||||||
|
|
||||||
|
We held onto a mutable reference to the database connection within the
|
||||||
|
transaction type to lock out the database while a transaction is active.
|
||||||
|
|
||||||
|
In this example, we want to implement a `Transaction` API on top of an
|
||||||
|
external, non-Rust API.
|
||||||
|
|
||||||
|
We start by defining a `Transaction` type that holds onto
|
||||||
|
`&mut DatabaseConnection`.
|
||||||
|
|
||||||
|
- Ask: What are the limits of this implementation? Assume the `u8` is accurate
|
||||||
|
implementation-wise and enough information for us to use the external API.
|
||||||
|
|
||||||
|
Expect:
|
||||||
|
- Indirection takes up 7 bytes more than we need on a 64-bit platform, as
|
||||||
|
well as costing a pointer dereference at runtime.
|
||||||
|
|
||||||
|
- Problem: We want the transaction to borrow the database connection that
|
||||||
|
created it, but we don't want the `Transaction` object to store a real
|
||||||
|
reference.
|
||||||
|
|
||||||
|
- Ask: What happens when we remove the mutable reference in `Transaction` while
|
||||||
|
keeping the lifetime parameter?
|
||||||
|
|
||||||
|
Expect: Unused lifetime parameter!
|
||||||
|
|
||||||
|
- Like with the type tagging from the previous slides, we can bring in
|
||||||
|
`PhantomData` to capture this unused lifetime parameter for us.
|
||||||
|
|
||||||
|
The difference is that we will need to use the lifetime alongside another
|
||||||
|
type, but that other type does not matter too much.
|
||||||
|
|
||||||
|
- Demonstrate: change `Transaction` to the following:
|
||||||
|
|
||||||
|
```rust,compile_fail
|
||||||
|
pub struct Transaction<'a> {
|
||||||
|
connection: DatabaseConnection,
|
||||||
|
_phantom: PhantomData<&'a mut ()>,
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
Update the `DatabaseConnection::new_transaction()` method:
|
||||||
|
|
||||||
|
```rust,compile_fail
|
||||||
|
fn new_transaction<'a>(&'a mut self) -> Transaction<'a> {
|
||||||
|
Transaction { connection: DatabaseConnection(self.0), _phantom: PhantomData }
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
This gives an owned database connection that is tied to the
|
||||||
|
`DatabaseConnection` that created it, but with less runtime memory footprint
|
||||||
|
than the store-a-reference version did.
|
||||||
|
|
||||||
|
Because `PhantomData` is a zero-sized type (like `()` or
|
||||||
|
`struct MyZeroSizedType;`), the size of `Transaction` is now the same as `u8`.
|
||||||
|
|
||||||
|
The implementation that held onto a reference instead was as large as a
|
||||||
|
`usize`.
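The size claim can be checked with a short sketch that places both designs side by side. The names `RefTransaction`, `PhantomTransaction`, and `sizes` are hypothetical and exist only for this comparison; `DatabaseHandle` stands in for the FFI type from the slide.

```rust
use std::marker::PhantomData;
use std::mem::size_of;

type DatabaseHandle = u8; // stand-in for `ffi::DatabaseHandle`

struct DatabaseConnection(DatabaseHandle);

/// The store-a-reference design: one pointer wide.
struct RefTransaction<'a>(&'a mut DatabaseConnection);

/// The phantom design: stores the handle directly and only borrows in the type.
struct PhantomTransaction<'a> {
    connection: DatabaseConnection,
    _phantom: PhantomData<&'a mut ()>,
}

impl DatabaseConnection {
    fn new_transaction(&mut self) -> PhantomTransaction<'_> {
        PhantomTransaction {
            connection: DatabaseConnection(self.0),
            _phantom: PhantomData,
        }
    }
}

/// Returns (size of the reference design, size of the phantom design).
fn sizes() -> (usize, usize) {
    (
        size_of::<RefTransaction<'static>>(),
        size_of::<PhantomTransaction<'static>>(),
    )
}

fn main() {
    let mut db = DatabaseConnection(3);
    let tx = db.new_transaction();
    // let other = db.new_transaction(); // does not compile while `tx` is in use
    println!("transaction uses handle {}", tx.connection.0);
    println!("(reference, phantom) sizes in bytes: {:?}", sizes());
}
```

Even though `PhantomTransaction` holds no reference, the elided lifetime in `new_transaction` still ties it to the mutable borrow of the connection.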
|
||||||
|
|
||||||
|
## More to Explore
|
||||||
|
|
||||||
|
- This way of encoding relationships between types and values is very powerful
|
||||||
|
when combined with unsafe, as the ways one can manipulate lifetimes become
|
||||||
|
almost arbitrary. This is also dangerous, but when combined with tools like
|
||||||
|
external, mechanically-verified proofs we can safely encode
|
||||||
|
cyclic/self-referential types while encoding lifetime & safety expectations in
|
||||||
|
the relevant data types.
|
||||||
|
|
||||||
|
- The [GhostCell (2021)](https://plv.mpi-sws.org/rustbelt/ghostcell/) paper and
|
||||||
|
its [relevant implementation](https://gitlab.mpi-sws.org/FP/ghostcell) show
|
||||||
|
this kind of work off. While the borrow checker is restrictive, there are
|
||||||
|
still ways to use escape hatches and then _show that the ways you used those
|
||||||
|
escape hatches are consistent and safe._
|
||||||
|
|
||||||
|
</details>
|
||||||
@@ -0,0 +1,112 @@
|
|||||||
|
---
|
||||||
|
minutes: 10
|
||||||
|
---
|
||||||
|
|
||||||
|
# PhantomData 4/4: OwnedFd & BorrowedFd
|
||||||
|
|
||||||
|
`BorrowedFd` is a prime example of `PhantomData` in action.
|
||||||
|
|
||||||
|
<!--
|
||||||
|
This code has to define a fake libc module even though libc works fine on
|
||||||
|
rust playground because the CI does not currently support dependencies.
|
||||||
|
|
||||||
|
TODO: Once we can use libc as a dependency in rust tests, replace the
|
||||||
|
faux libc code with appropriate imports & `O_WRONLY | O_CREAT` permissions.
|
||||||
|
-->
|
||||||
|
|
||||||
|
```rust,editable
|
||||||
|
use std::marker::PhantomData;
|
||||||
|
use std::os::raw::c_int;
|
||||||
|
|
||||||
|
mod libc_ffi {
|
||||||
|
use std::os::raw::{c_char, c_int};
|
||||||
|
pub unsafe fn open(path: *const c_char, oflag: c_int) -> c_int {
|
||||||
|
3
|
||||||
|
}
|
||||||
|
pub unsafe fn close(fd: c_int) {}
|
||||||
|
}
|
||||||
|
|
||||||
|
struct OwnedFd {
|
||||||
|
fd: c_int,
|
||||||
|
}
|
||||||
|
|
||||||
|
impl OwnedFd {
|
||||||
|
fn try_from_fd(fd: c_int) -> Option<Self> {
|
||||||
|
if fd < 0 {
|
||||||
|
return None;
|
||||||
|
}
|
||||||
|
Some(OwnedFd { fd })
|
||||||
|
}
|
||||||
|
|
||||||
|
fn as_fd<'a>(&'a self) -> BorrowedFd<'a> {
|
||||||
|
BorrowedFd { fd: self.fd, _phantom: PhantomData }
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
impl Drop for OwnedFd {
|
||||||
|
fn drop(&mut self) {
|
||||||
|
unsafe { libc_ffi::close(self.fd) };
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
struct BorrowedFd<'a> {
|
||||||
|
fd: c_int,
|
||||||
|
_phantom: PhantomData<&'a ()>,
|
||||||
|
}
|
||||||
|
|
||||||
|
fn main() {
|
||||||
|
// Create a file with a raw syscall with write-only and create permissions.
|
||||||
|
let fd = unsafe { libc_ffi::open(c"c_str.txt".as_ptr(), 0o101) };
|
||||||
|
// Pass the ownership of an integer file descriptor to an `OwnedFd`.
|
||||||
|
// `OwnedFd::drop()` closes the file descriptor.
|
||||||
|
let owned_fd =
|
||||||
|
OwnedFd::try_from_fd(fd).expect("Could not open file with syscall!");
|
||||||
|
|
||||||
|
// Create a `BorrowedFd` from an `OwnedFd`.
|
||||||
|
// `BorrowedFd::drop()` does not close the file because it doesn't own it!
|
||||||
|
let borrowed_fd: BorrowedFd<'_> = owned_fd.as_fd();
|
||||||
|
// std::mem::drop(owned_fd); // ❌🔨
|
||||||
|
std::mem::drop(borrowed_fd);
|
||||||
|
let second_borrowed = owned_fd.as_fd();
|
||||||
|
// owned_fd will be dropped here, and the file will be closed.
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
<details>
|
||||||
|
|
||||||
|
- A file descriptor represents a specific process's access to a file.
|
||||||
|
|
||||||
|
Reminder: Device and OS-specific features are exposed as if they were files on
|
||||||
|
unix-style systems.
|
||||||
|
|
||||||
|
- [`OwnedFd`](https://rust-lang.github.io/rfcs/3128-io-safety.html#ownedfd-and-borrowedfdfd)
|
||||||
|
is an owned wrapper type for a file descriptor. It _owns_ the file descriptor,
|
||||||
|
and closes it when dropped.
|
||||||
|
|
||||||
|
Note: We have our own implementation of it here; draw attention to the
|
||||||
|
explicit `Drop` implementation.
|
||||||
|
|
||||||
|
`BorrowedFd` is its borrowed counterpart; it does not need to close the file
|
||||||
|
when it is dropped.
|
||||||
|
|
||||||
|
Note: We have not explicitly implemented `Drop` for `BorrowedFd`.
|
||||||
|
|
||||||
|
- `BorrowedFd` uses a lifetime captured with a `PhantomData` to enforce the
|
||||||
|
invariant "if this file descriptor exists, the OS file descriptor is still
|
||||||
|
open even though it is not responsible for closing that file descriptor."
|
||||||
|
|
||||||
|
The lifetime parameter of `BorrowedFd` demands that there exists another value
|
||||||
|
in your program that lasts as long as that specific `BorrowedFd` or outlives
|
||||||
|
it (in this case an `OwnedFd`).
|
||||||
|
|
||||||
|
Demonstrate: Uncomment the `std::mem::drop(owned_fd)` line and try to compile
|
||||||
|
to show that `borrowed_fd` relies on the lifetime of `owned_fd`.
|
||||||
|
|
||||||
|
This has been encoded by the API designers to mean _that other value is what
|
||||||
|
keeps the access to the file open_.
|
||||||
|
|
||||||
|
Because Rust's borrow checker enforces this relationship where one value must
|
||||||
|
last at least as long as another, users of this API do not need to worry about
|
||||||
|
handling file descriptor aliasing and closing logic correctly themselves.
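To make the "who closes the file" rule observable, here is a sketch that counts calls to a fake `close`. `CLOSE_CALLS`, `fake_close`, and `close_counts` are hypothetical stand-ins for the real syscall; the two wrapper types mirror the simplified ones on the slide, not the actual `std::os::fd` definitions.

```rust
use std::marker::PhantomData;
use std::sync::atomic::{AtomicUsize, Ordering};

/// Counts how often the fake `close` was called.
static CLOSE_CALLS: AtomicUsize = AtomicUsize::new(0);

fn fake_close(_fd: i32) {
    CLOSE_CALLS.fetch_add(1, Ordering::SeqCst);
}

struct OwnedFd {
    fd: i32,
}

impl OwnedFd {
    fn as_fd(&self) -> BorrowedFd<'_> {
        BorrowedFd { fd: self.fd, _phantom: PhantomData }
    }
}

impl Drop for OwnedFd {
    fn drop(&mut self) {
        fake_close(self.fd);
    }
}

/// No `Drop` implementation: a borrowed descriptor never closes anything.
struct BorrowedFd<'a> {
    fd: i32,
    _phantom: PhantomData<&'a ()>,
}

/// Returns the number of closes seen after dropping the borrow, then the owner.
fn close_counts() -> (usize, usize) {
    let before = CLOSE_CALLS.load(Ordering::SeqCst);
    let owned = OwnedFd { fd: 3 };
    {
        let borrowed = owned.as_fd();
        assert_eq!(borrowed.fd, 3);
    } // `borrowed` goes out of scope here; the descriptor stays open.
    let after_borrow = CLOSE_CALLS.load(Ordering::SeqCst) - before;
    drop(owned); // The owner closes the descriptor exactly once.
    let after_owned = CLOSE_CALLS.load(Ordering::SeqCst) - before;
    (after_borrow, after_owned)
}

fn main() {
    println!("(closes after borrow, closes after owner): {:?}", close_counts());
}
```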
|
||||||
|
|
||||||
|
</details>
|
||||||
@@ -0,0 +1,80 @@
|
|||||||
|
---
|
||||||
|
minutes: 10
|
||||||
|
---
|
||||||
|
|
||||||
|
# Single-use values
|
||||||
|
|
||||||
|
Sometimes we want values that _can only be used once_. One critical example of
|
||||||
|
this is in cryptography: A "Nonce."
|
||||||
|
|
||||||
|
```rust,editable
|
||||||
|
pub struct Key(/* specifics omitted */);
|
||||||
|
/// A single-use number suitable for cryptographic purposes.
|
||||||
|
pub struct Nonce(u32);
|
||||||
|
/// A cryptographically sound random generator function.
|
||||||
|
pub fn new_nonce() -> Nonce {
|
||||||
|
Nonce(4) // chosen by a fair dice roll, https://xkcd.com/221/
|
||||||
|
}
|
||||||
|
/// Consume a nonce, but not the key or the data.
|
||||||
|
pub fn encrypt(nonce: Nonce, key: &Key, data: &[u8]) {}
|
||||||
|
|
||||||
|
fn main() {
|
||||||
|
let nonce = new_nonce();
|
||||||
|
let data_1: [u8; 4] = [1, 2, 3, 4];
|
||||||
|
let data_2: [u8; 4] = [4, 3, 2, 1];
|
||||||
|
let key = Key(/* specifics omitted */);
|
||||||
|
|
||||||
|
// The key and data can be re-used, copied, etc. but the nonce cannot.
|
||||||
|
encrypt(nonce, &key, &data_1);
|
||||||
|
// encrypt(nonce, &key, &data_2); // 🛠️❌
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
<details>
|
||||||
|
|
||||||
|
- Problem: How can we guarantee a value is used only once?
|
||||||
|
|
||||||
|
- Motivation: A nonce is a piece of random, unique data used in cryptographic
|
||||||
|
protocols to prevent replay attacks.
|
||||||
|
|
||||||
|
Background: In practice people have ended up accidentally re-using nonces.
|
||||||
|
Most commonly, this causes the cryptographic protocol to completely break down
|
||||||
|
and stop fulfilling its function.
|
||||||
|
|
||||||
|
Depending on the specifics of nonce reuse and cryptography at hand, private
|
||||||
|
keys can also become computable by attackers.
|
||||||
|
|
||||||
|
- Rust has an obvious tool for achieving the invariant "Once you use this, you
|
||||||
|
can't use it again": passing a value as an _owned argument_.
|
||||||
|
|
||||||
|
- Highlight: the `encrypt` function takes `nonce` by value (an owned argument),
|
||||||
|
but `key` and `data` by reference.
|
||||||
|
|
||||||
|
- The technique for single-use values is as follows:
|
||||||
|
|
||||||
|
- Keep constructors private, so a user can't construct values with the same
|
||||||
|
inner value twice.
|
||||||
|
|
||||||
|
- Don't implement `Clone`/`Copy` traits or equivalent methods, so a user can't
|
||||||
|
duplicate data we want to keep unique.
|
||||||
|
|
||||||
|
- Make the interior type opaque (like with the newtype pattern), so the user
|
||||||
|
cannot modify an existing value on their own.
|
||||||
|
|
||||||
|
- Ask: What are we missing from the newtype pattern in the slide's code?
|
||||||
|
|
||||||
|
Expect: Module boundary.
|
||||||
|
|
||||||
|
Demonstrate: Without a module boundary a user can construct a nonce on their
|
||||||
|
own.
|
||||||
|
|
||||||
|
Fix: Put `Key`, `Nonce`, and `new_nonce` behind a module.
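A sketch of that fix is shown below, assuming a module named `crypto`. The counter-based `new_nonce`, the `Key::generate` constructor, and the XOR "encryption" are placeholders chosen so the example runs; none of them is cryptographically sound.

```rust
mod crypto {
    use std::sync::atomic::{AtomicU32, Ordering};

    pub struct Key(()); // private field: cannot be built outside this module
    pub struct Nonce(u32); // private field, no `Clone` or `Copy`

    impl Key {
        /// Hypothetical constructor; a real one would generate key material.
        pub fn generate() -> Key {
            Key(())
        }
    }

    /// Placeholder generator: a counter, so every call yields a fresh value.
    pub fn new_nonce() -> Nonce {
        static NEXT: AtomicU32 = AtomicU32::new(0);
        Nonce(NEXT.fetch_add(1, Ordering::Relaxed))
    }

    /// Consumes the nonce. The XOR below is a toy, NOT real encryption.
    pub fn encrypt(nonce: Nonce, _key: &Key, data: &[u8]) -> Vec<u8> {
        let pad = nonce.0.to_le_bytes();
        data.iter()
            .enumerate()
            .map(|(i, byte)| byte ^ pad[i % pad.len()])
            .collect()
    }
}

fn main() {
    let key = crypto::Key::generate();
    let data = [1u8, 2, 3, 4];

    let nonce = crypto::new_nonce();
    let first = crypto::encrypt(nonce, &key, &data);
    // crypto::encrypt(nonce, &key, &data); // does not compile: `nonce` was moved
    // let forged = crypto::Nonce(0); // does not compile: the field is private

    let second = crypto::encrypt(crypto::new_nonce(), &key, &data);
    println!("{first:?} {second:?}");
}
```

With the module boundary in place, the only way to obtain a `Nonce` is `new_nonce()`, and the only way to spend one is to give it away.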
|
||||||
|
|
||||||
|
## More to Explore
|
||||||
|
|
||||||
|
- Cryptography Nuance: A nonce might still be used twice if it was created
|
||||||
|
through a pseudo-random process with no actual randomness. That can't be
|
||||||
|
prevented through this method. This API design prevents one kind of nonce duplication,
|
||||||
|
but not all logic bugs.
|
||||||
|
|
||||||
|
</details>
|
||||||