Mirror of https://github.com/google/comprehensive-rust.git, synced 2025-08-08 08:22:52 +02:00
add typestate pattern chapter
@@ -437,6 +437,8 @@
- [Semantic Confusion](idiomatic/leveraging-the-type-system/newtype-pattern/semantic-confusion.md)
- [Parse, Don't Validate](idiomatic/leveraging-the-type-system/newtype-pattern/parse-don-t-validate.md)
- [Is It Encapsulated?](idiomatic/leveraging-the-type-system/newtype-pattern/is-it-encapsulated.md)
- [Typestate Pattern](idiomatic/leveraging-the-type-system/typestate-pattern.md)
- [Typestate Pattern with Generics](idiomatic/leveraging-the-type-system/typestate-pattern/typestate-generics.md)

---

@@ -0,0 +1,84 @@
---
minutes: 15
---

## Typestate Pattern

The typestate pattern uses Rust’s type system to make **invalid states
unrepresentable**.

```rust
# use std::fmt::Write;
#[derive(Default)]
struct Serializer { output: String }
// State entered while a struct is being written; it owns the `Serializer`.
struct SerializeStruct { ser: Serializer }

impl Serializer {
    // Consumes the serializer and moves into the `SerializeStruct` state.
    fn serialize_struct(mut self, name: &str) -> SerializeStruct {
        let _ = writeln!(&mut self.output, "{name} {{");
        SerializeStruct { ser: self }
    }
}

impl SerializeStruct {
    // Stays in the same state, so any number of fields can be chained.
    fn serialize_field(mut self, key: &str, value: &str) -> Self {
        let _ = writeln!(&mut self.ser.output, " {key}={value};");
        self
    }

    // Closes the struct and hands the `Serializer` back to the caller.
    fn finish_struct(mut self) -> Serializer {
        self.ser.output.push_str("}\n");
        self.ser
    }
}

let ser = Serializer::default()
    .serialize_struct("User")
    .serialize_field("id", "42")
    .serialize_field("name", "Alice")
    .finish_struct();
println!("{}", ser.output);
```

<details>

- This example is inspired by
  [Serde's `Serializer` trait](https://docs.rs/serde/latest/serde/ser/trait.Serializer.html).
  For a deeper explanation of how Serde models serialization as a state machine,
  see <https://serde.rs/impl-serializer.html>.

- The typestate pattern allows us to model state machines using Rust’s type
  system. In this case, the state machine is a simple serializer.

- The key idea is that each state in the process is represented by its own
  type: `Serializer` before a struct is started and after it is finished,
  `SerializeStruct` while its fields are being written. Transitions between
  states happen by consuming one value and producing another.

- In the example above:

  - Once we begin serializing a struct, the `Serializer` is moved into the
    `SerializeStruct` state. At that point, we no longer have access to the
    original `Serializer`.

  - While in the `SerializeStruct` state, we can only call methods related to
    writing fields. We cannot use the same instance to serialize a tuple, list,
    or primitive: methods for those simply do not exist on `SerializeStruct`.

  - Only after calling `finish_struct` do we get the `Serializer` back. At that
    point, we can inspect the output or start a new serialization session.

  - If we forget to call `finish_struct` and drop the `SerializeStruct` instead,
    the `Serializer` it owns is dropped with it. This ensures that incomplete
    or invalid output can never be observed.

- By contrast, if all methods were defined on `Serializer` itself, nothing would
  prevent users from mixing serialization modes or leaving a struct unfinished.

- This pattern avoids such misuse by making it **impossible to represent invalid
  transitions**.

- One downside of typestate modeling is potential code duplication between
  states. In the next section, we will see how to use **generics** to reduce
  duplication while preserving correctness.

</details>
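
For contrast, the single-type design that the notes above warn against could
look roughly like the sketch below. `FlatSerializer`, its methods, and `misuse`
are hypothetical names chosen for illustration; they are not part of Serde or
of the example above. Because every method lives on one type, the compiler
accepts a field written outside any struct and a struct that is never closed.

```rust
use std::fmt::Write;

/// Hypothetical single-type serializer: every operation is always callable.
#[derive(Default)]
struct FlatSerializer {
    output: String,
}

impl FlatSerializer {
    fn begin_struct(&mut self, name: &str) {
        let _ = writeln!(self.output, "{name} {{");
    }

    fn field(&mut self, key: &str, value: &str) {
        let _ = writeln!(self.output, " {key}={value};");
    }

    #[allow(dead_code)]
    fn end_struct(&mut self) {
        self.output.push_str("}\n");
    }
}

/// Misuses the API in two ways that still compile.
fn misuse() -> String {
    let mut ser = FlatSerializer::default();
    ser.field("id", "42"); // a field before any struct was started
    ser.begin_struct("User");
    ser.field("name", "Alice");
    // `end_struct` is never called, yet the output can still be read.
    ser.output
}

fn main() {
    print!("{}", misuse());
}
```

The printed text starts with a stray field and never closes `User {`. With the
typestate version, neither mistake can be written: `serialize_field` does not
exist on `Serializer`, and `output` is unreachable until `finish_struct`
returns the `Serializer`.
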
@@ -0,0 +1,141 @@
## Typestate Pattern with Generics

Generics can be used with the typestate pattern to reduce duplication and allow
shared logic across state variants, while still encoding state transitions in
the type system.

```rust
# fn main() -> std::io::Result<()> {
// Transport choices. Only `Secure` carries extra configuration.
#[non_exhaustive]
struct Insecure;
struct Secure {
    client_cert: Option<Vec<u8>>,
}

trait Transport {
    /* ... */
}
impl Transport for Insecure {
    /* ... */
}
impl Transport for Secure {
    /* ... */
}

// Builder stages: no transport chosen yet, or ready with transport `T`.
#[non_exhaustive]
struct WantsTransport;
struct Ready<T> {
    transport: T,
}

// `host` and `timeout` are shared by every stage; `stage` is the typestate.
struct ConnectionBuilder<T> {
    host: String,
    timeout: Option<u64>,
    stage: T,
}

struct Connection {/* ... */}

impl Connection {
    fn new(host: &str) -> ConnectionBuilder<WantsTransport> {
        ConnectionBuilder {
            host: host.to_owned(),
            timeout: None,
            stage: WantsTransport,
        }
    }
}

// Available in every stage.
impl<T> ConnectionBuilder<T> {
    fn timeout(mut self, secs: u64) -> Self {
        self.timeout = Some(secs);
        self
    }
}

// Available only before a transport has been chosen.
impl ConnectionBuilder<WantsTransport> {
    fn insecure(self) -> ConnectionBuilder<Ready<Insecure>> {
        ConnectionBuilder {
            host: self.host,
            timeout: self.timeout,
            stage: Ready { transport: Insecure },
        }
    }

    fn secure(self) -> ConnectionBuilder<Ready<Secure>> {
        ConnectionBuilder {
            host: self.host,
            timeout: self.timeout,
            stage: Ready { transport: Secure { client_cert: None } },
        }
    }
}

// Available only for the secure transport.
impl ConnectionBuilder<Ready<Secure>> {
    fn client_certificate(mut self, raw: Vec<u8>) -> Self {
        self.stage.transport.client_cert = Some(raw);
        self
    }
}

// Available once any transport has been chosen.
impl<T: Transport> ConnectionBuilder<Ready<T>> {
    fn connect(self) -> std::io::Result<Connection> {
        // ... use valid state to establish the configured connection
        Ok(Connection {})
    }
}

let _conn = Connection::new("db.local")
    .secure()
    .client_certificate(vec![1, 2, 3])
    .timeout(10)
    .connect()?;
Ok(())
# }
```

<details>

- This example extends the typestate pattern using **generic parameters** to
  avoid duplication of common logic.

- We use a generic type `T` to represent the current stage of the builder, and
  share fields like `host` and `timeout` across all stages.

- The transport phase uses `insecure()` and `secure()` to transition from
  `WantsTransport` into `Ready<T>`, where `T` is a type that implements the
  `Transport` trait.

- We can call `.connect()` only once the builder is in a `Ready<T>` state, and
  the compiler enforces this.

- Using generics allows us to avoid writing separate `BuilderForSecure`,
  `BuilderForInsecure`, etc. structs.

  Shared behavior, like `.timeout(...)`, can be implemented once and reused
  across all states.

- This same design appears
  [in real-world libraries like **Rustls**](https://docs.rs/rustls/latest/rustls/struct.ConfigBuilder.html),
  where the `ConfigBuilder` uses typestate and generics to guide users through a
  safe, ordered configuration flow.

  It enforces at compile time that users must choose protocol versions, a
  certificate verifier, and client certificate options, in the correct sequence,
  before building a config.

- **Downsides** of this approach include:
  - The documentation of the various builder types can become difficult to
    follow, since their names are composed from generic parameters and internal
    structs like `Ready<T>`.
  - Error messages from the compiler may become more opaque, especially if a
    trait bound is not satisfied or a state transition is incomplete.

    Deeply nested generic types can also make the error messages themselves
    hard to read.

- Still, in return for this complexity, you get compile-time enforcement of
  valid configuration, clear builder sequencing, and no possibility of
  forgetting a required step or misusing the API at runtime.

</details>
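
To make the duplication argument concrete, here is a rough sketch of what the
builders might look like without a generic stage parameter. The struct names
`BuilderForInsecure` and `BuilderForSecure` are taken from the notes above;
their fields and methods are assumptions made for illustration. Every shared
field and every shared method has to be written once per struct.

```rust
/// Hypothetical non-generic builder for the insecure transport.
struct BuilderForInsecure {
    host: String,
    timeout: Option<u64>,
}

/// Hypothetical non-generic builder for the secure transport.
struct BuilderForSecure {
    host: String,
    timeout: Option<u64>,
    client_cert: Option<Vec<u8>>,
}

impl BuilderForInsecure {
    fn new(host: &str) -> Self {
        Self { host: host.to_owned(), timeout: None }
    }

    // Duplicated: same body as `BuilderForSecure::timeout`.
    fn timeout(mut self, secs: u64) -> Self {
        self.timeout = Some(secs);
        self
    }
}

impl BuilderForSecure {
    fn new(host: &str) -> Self {
        Self { host: host.to_owned(), timeout: None, client_cert: None }
    }

    // Duplicated: same body as `BuilderForInsecure::timeout`.
    fn timeout(mut self, secs: u64) -> Self {
        self.timeout = Some(secs);
        self
    }

    fn client_certificate(mut self, raw: Vec<u8>) -> Self {
        self.client_cert = Some(raw);
        self
    }
}

fn main() {
    let insecure = BuilderForInsecure::new("db.local").timeout(10);
    let secure = BuilderForSecure::new("db.local")
        .timeout(10)
        .client_certificate(vec![1, 2, 3]);
    println!("{} {:?}", insecure.host, insecure.timeout);
    println!("{} {:?} {:?}", secure.host, secure.timeout, secure.client_cert);
}
```

In the generic version, the single `impl<T> ConnectionBuilder<T>` block
replaces both copies of `timeout`, and adding another transport does not
require another copy.
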