mirror of https://github.com/google/comprehensive-rust.git synced 2025-08-08 08:22:52 +02:00

add typestate pattern chapter

Glen De Cauwsemaecker
2025-07-16 15:56:18 +02:00
parent 570a726cb5
commit c69bffcb74
3 changed files with 227 additions and 0 deletions


@@ -437,6 +437,8 @@
- [Semantic Confusion](idiomatic/leveraging-the-type-system/newtype-pattern/semantic-confusion.md)
- [Parse, Don't Validate](idiomatic/leveraging-the-type-system/newtype-pattern/parse-don-t-validate.md)
- [Is It Encapsulated?](idiomatic/leveraging-the-type-system/newtype-pattern/is-it-encapsulated.md)
- [Typestate Pattern](idiomatic/leveraging-the-type-system/typestate-pattern.md)
- [Typestate Pattern with Generics](idiomatic/leveraging-the-type-system/typestate-pattern/typestate-generics.md)
---


@ -0,0 +1,84 @@
---
minutes: 15
---
## Typestate Pattern
The typestate pattern uses Rust’s type system to make **invalid states
unrepresentable**.
```rust
# use std::fmt::Write;
#[derive(Default)]
struct Serializer {
    output: String,
}

struct SerializeStruct {
    ser: Serializer,
}

impl Serializer {
    fn serialize_struct(mut self, name: &str) -> SerializeStruct {
        let _ = writeln!(&mut self.output, "{name} {{");
        SerializeStruct { ser: self }
    }
}

impl SerializeStruct {
    fn serialize_field(mut self, key: &str, value: &str) -> Self {
        let _ = writeln!(&mut self.ser.output, " {key}={value};");
        self
    }

    fn finish_struct(mut self) -> Serializer {
        self.ser.output.push_str("}\n");
        self.ser
    }
}

let ser = Serializer::default()
    .serialize_struct("User")
    .serialize_field("id", "42")
    .serialize_field("name", "Alice")
    .finish_struct();
println!("{}", ser.output);
```
<details>
- This example is inspired by
[Serde's `Serializer` trait](https://docs.rs/serde/latest/serde/ser/trait.Serializer.html).
For a deeper explanation of how Serde models serialization as a state machine,
see <https://serde.rs/impl-serializer.html>.
- The typestate pattern allows us to model state machines using Rust’s type
system. In this case, the state machine is a simple serializer.
- The key idea is that each state in the process (starting a struct, writing
fields, and finishing) is represented by a different type. Transitions between
states happen by consuming one value and producing another.
- In the example above:
  - Once we begin serializing a struct, the `Serializer` is moved into the
    `SerializeStruct` state. At that point, we no longer have access to the
    original `Serializer`.
  - While in the `SerializeStruct` state, we can only call methods related to
    writing fields. We cannot use the same instance to serialize a tuple, list,
    or primitive. Those constructors simply do not exist here.
  - Only after calling `finish_struct` do we get the `Serializer` back. At that
    point, we can inspect the output or start a new serialization session.
  - If we forget to call `finish_struct` and drop the `SerializeStruct` instead,
    the original `Serializer` is lost. This ensures that incomplete or invalid
    output can never be observed.
- By contrast, if all methods were defined on `Serializer` itself, nothing would
prevent users from mixing serialization modes or leaving a struct unfinished.
- This pattern avoids such misuse by making it **impossible to represent invalid
transitions**.
- One downside of typestate modeling is potential code duplication between
states. In the next section, we will see how to use **generics** to reduce
duplication while preserving correctness.
</details>
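
For contrast, here is a minimal sketch of the alternative mentioned in the notes above, where every method lives on a single type. `NaiveSerializer` and `misuse` are hypothetical names used only for illustration; they are not part of the chapter or of Serde. The code compiles and runs even though it writes a field before any struct is started and never closes the struct:

```rust
use std::fmt::Write;

/// Hypothetical single-type serializer: all methods live on one struct,
/// so the compiler cannot tell "inside a struct" from "at the top level".
#[derive(Default)]
struct NaiveSerializer {
    output: String,
}

impl NaiveSerializer {
    fn serialize_struct(&mut self, name: &str) {
        let _ = writeln!(self.output, "{name} {{");
    }

    fn serialize_field(&mut self, key: &str, value: &str) {
        let _ = writeln!(self.output, "  {key}={value};");
    }

    #[allow(dead_code)]
    fn finish_struct(&mut self) {
        self.output.push_str("}\n");
    }
}

/// Calls the methods in an invalid order and returns the malformed output.
fn misuse() -> String {
    let mut ser = NaiveSerializer::default();
    // A field is written before any struct was started: this compiles.
    ser.serialize_field("id", "42");
    ser.serialize_struct("User");
    // `finish_struct` is never called, yet the output can still be read.
    ser.output
}

fn main() {
    print!("{}", misuse());
}
```

With the typestate version, the equivalent call sequence is rejected by the compiler, because `serialize_field` only exists on `SerializeStruct` and the output is only reachable through a `Serializer`.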


@ -0,0 +1,141 @@
## Typestate Pattern with Generics
Generics can be used with the typestate pattern to reduce duplication and allow
shared logic across state variants, while still encoding state transitions in
the type system.
```rust
# fn main() -> std::io::Result<()> {
#[non_exhaustive]
struct Insecure;
struct Secure {
    client_cert: Option<Vec<u8>>,
}

trait Transport {
    /* ... */
}
impl Transport for Insecure {
    /* ... */
}
impl Transport for Secure {
    /* ... */
}

#[non_exhaustive]
struct WantsTransport;
struct Ready<T> {
    transport: T,
}

struct ConnectionBuilder<T> {
    host: String,
    timeout: Option<u64>,
    stage: T,
}

struct Connection {/* ... */}

impl Connection {
    fn new(host: &str) -> ConnectionBuilder<WantsTransport> {
        ConnectionBuilder {
            host: host.to_owned(),
            timeout: None,
            stage: WantsTransport,
        }
    }
}

impl<T> ConnectionBuilder<T> {
    fn timeout(mut self, secs: u64) -> Self {
        self.timeout = Some(secs);
        self
    }
}

impl ConnectionBuilder<WantsTransport> {
    fn insecure(self) -> ConnectionBuilder<Ready<Insecure>> {
        ConnectionBuilder {
            host: self.host,
            timeout: self.timeout,
            stage: Ready { transport: Insecure },
        }
    }

    fn secure(self) -> ConnectionBuilder<Ready<Secure>> {
        ConnectionBuilder {
            host: self.host,
            timeout: self.timeout,
            stage: Ready { transport: Secure { client_cert: None } },
        }
    }
}

impl ConnectionBuilder<Ready<Secure>> {
    fn client_certificate(mut self, raw: Vec<u8>) -> Self {
        self.stage.transport.client_cert = Some(raw);
        self
    }
}

impl<T: Transport> ConnectionBuilder<Ready<T>> {
    fn connect(self) -> std::io::Result<Connection> {
        // ... use valid state to establish the configured connection
        Ok(Connection {})
    }
}

let _conn = Connection::new("db.local")
    .secure()
    .client_certificate(vec![1, 2, 3])
    .timeout(10)
    .connect()?;
Ok(())
# }
```
<details>
- This example extends the typestate pattern using **generic parameters** to
avoid duplication of common logic.
- We use a generic type `T` to represent the current stage of the builder, and
share fields like `host` and `timeout` across all stages.
- The transport phase uses `insecure()` and `secure()` to transition from
`WantsTransport` into `Ready<T>`, where `T` is a type that implements the
`Transport` trait.
- Only once the builder is in a `Ready<T>` state can we call `.connect()`. This
is enforced at compile time.
- Using generics allows us to avoid writing separate structs such as
`BuilderForSecure` and `BuilderForInsecure`.
Shared behavior, like `.timeout(...)`, can be implemented once and reused
across all states.
- This same design appears
[in real-world libraries like **Rustls**](https://docs.rs/rustls/latest/rustls/struct.ConfigBuilder.html),
where the `ConfigBuilder` uses typestate and generics to guide users through a
safe, ordered configuration flow.
It enforces at compile time that users must choose protocol versions, a
certificate verifier, and client certificate options, in the correct sequence,
before building a config.
- **Downsides** of this approach include:
  - The documentation of the various builder types can become difficult to
    follow, since their names are generated by generics and internal structs
    like `Ready<T>`.
  - Error messages from the compiler may become more opaque, especially if a
    trait bound is not satisfied or a state transition is incomplete.
    Nested generic types also make these messages longer and harder to read.
- Still, in return for this complexity, you get compile-time enforcement of
valid configuration, clear builder sequencing, and no possibility of
forgetting a required step or misusing the API at runtime.
</details>
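
The same idea can be applied to the serializer from the previous chapter. The following is a minimal sketch, not the course's implementation: the marker types `Root` and `InStruct`, the `bytes_written` helper, and the `demo` function are hypothetical names chosen for illustration. Because the states carry no data here, the sketch tracks them with `PhantomData` instead of a `stage` field:

```rust
use std::fmt::Write;
use std::marker::PhantomData;

// Hypothetical zero-sized marker types naming the two states.
struct Root;
struct InStruct;

struct Serializer<S> {
    output: String,
    // The state exists only at the type level; no data is stored for it.
    state: PhantomData<S>,
}

impl Serializer<Root> {
    fn new() -> Self {
        Serializer { output: String::new(), state: PhantomData }
    }

    fn serialize_struct(mut self, name: &str) -> Serializer<InStruct> {
        let _ = writeln!(self.output, "{name} {{");
        Serializer { output: self.output, state: PhantomData }
    }

    // The output is only reachable from the `Root` state.
    fn finish(self) -> String {
        self.output
    }
}

impl Serializer<InStruct> {
    fn serialize_field(mut self, key: &str, value: &str) -> Self {
        let _ = writeln!(self.output, "  {key}={value};");
        self
    }

    fn finish_struct(mut self) -> Serializer<Root> {
        self.output.push_str("}\n");
        Serializer { output: self.output, state: PhantomData }
    }
}

// Shared logic is written once and is available in every state.
impl<S> Serializer<S> {
    fn bytes_written(&self) -> usize {
        self.output.len()
    }
}

fn demo() -> String {
    let ser = Serializer::<Root>::new()
        .serialize_struct("User")
        .serialize_field("id", "42");
    // `bytes_written` works mid-struct because it is defined for all `S`.
    assert!(ser.bytes_written() > 0);
    // `ser.finish()` would not compile here: `finish` only exists on
    // `Serializer<Root>`, so the struct must be closed first.
    ser.finish_struct().finish()
}

fn main() {
    print!("{}", demo());
}
```

Compared with the `ConnectionBuilder` above, the trade-off is the same: one generic struct and one shared `impl<S>` block replace per-state duplication, at the cost of type names such as `Serializer<InStruct>` appearing in documentation and error messages.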