diff --git a/src/SUMMARY.md b/src/SUMMARY.md
index 1dca1f14..437c2b0b 100644
--- a/src/SUMMARY.md
+++ b/src/SUMMARY.md
@@ -437,6 +437,8 @@
     - [Semantic Confusion](idiomatic/leveraging-the-type-system/newtype-pattern/semantic-confusion.md)
     - [Parse, Don't Validate](idiomatic/leveraging-the-type-system/newtype-pattern/parse-don-t-validate.md)
     - [Is It Encapsulated?](idiomatic/leveraging-the-type-system/newtype-pattern/is-it-encapsulated.md)
+  - [Typestate Pattern](idiomatic/leveraging-the-type-system/typestate-pattern.md)
+    - [Typestate Pattern with Generics](idiomatic/leveraging-the-type-system/typestate-pattern/typestate-generics.md)
 
 ---
 
diff --git a/src/idiomatic/leveraging-the-type-system/typestate-pattern.md b/src/idiomatic/leveraging-the-type-system/typestate-pattern.md
new file mode 100644
index 00000000..1753780f
--- /dev/null
+++ b/src/idiomatic/leveraging-the-type-system/typestate-pattern.md
@@ -0,0 +1,84 @@
+---
+minutes: 15
+---
+
+## Typestate Pattern
+
+The typestate pattern uses Rust’s type system to make **invalid states
+unrepresentable**.
+
+```rust
+# use std::fmt::Write;
+#[derive(Default)]
+struct Serializer { output: String }
+struct SerializeStruct { ser: Serializer }
+
+impl Serializer {
+    fn serialize_struct(mut self, name: &str) -> SerializeStruct {
+        let _ = writeln!(&mut self.output, "{name} {{");
+        SerializeStruct { ser: self }
+    }
+}
+
+impl SerializeStruct {
+    fn serialize_field(mut self, key: &str, value: &str) -> Self {
+        let _ = writeln!(&mut self.ser.output, " {key}={value};");
+        self
+    }
+
+    fn finish_struct(mut self) -> Serializer {
+        self.ser.output.push_str("}\n");
+        self.ser
+    }
+}
+
+let ser = Serializer::default()
+    .serialize_struct("User")
+    .serialize_field("id", "42")
+    .serialize_field("name", "Alice")
+    .finish_struct();
+println!("{}", ser.output);
+```
+
+
+
+- This example is inspired by
+  [Serde's `Serializer` trait](https://docs.rs/serde/latest/serde/ser/trait.Serializer.html).
+  For a deeper explanation of how Serde models serialization as a state machine,
+  see .
+
+- The typestate pattern allows us to model state machines using Rust’s type
+  system. In this case, the state machine is a simple serializer.
+
+- The key idea is that each state in the process (starting a struct, writing
+  fields, and finishing) is represented by a different type. Transitions between
+  states happen by consuming one value and producing another.
+
+- In the example above:
+
+  - Once we begin serializing a struct, the `Serializer` is moved into the
+    `SerializeStruct` state. At that point, we no longer have access to the
+    original `Serializer`.
+
+  - While in the `SerializeStruct` state, we can only call methods related to
+    writing fields. We cannot use the same instance to serialize a tuple, list,
+    or primitive. Those constructors simply do not exist here.
+
+  - Only after calling `finish_struct` do we get the `Serializer` back. At that
+    point, we can inspect the output or start a new serialization session.
+
+  - If we forget to call `finish_struct` and drop the `SerializeStruct` instead,
+    the original `Serializer` is lost. This ensures that incomplete or invalid
+    output can never be observed.
+
+- By contrast, if all methods were defined on `Serializer` itself, nothing would
+  prevent users from mixing serialization modes or leaving a struct unfinished.
+
+- This pattern avoids such misuse by making it **impossible to represent invalid
+  transitions**.
+
+- One downside of typestate modeling is potential code duplication between
+  states. In the next section, we will see how to use **generics** to reduce
+  duplication while preserving correctness.
+
+
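To make the "by contrast" point in the notes concrete, here is a minimal sketch of the single-type design the typestate pattern replaces. `FlatSerializer`, `intended_use`, and `misuse` are hypothetical names invented for this illustration and are not part of the change above; the sketch assumes every method takes `&mut self` on one type.

```rust
use std::fmt::Write;

// Hypothetical counter-example (not part of the change above): a single type
// exposes every method, so the compiler cannot check the call order.
#[derive(Default)]
struct FlatSerializer {
    output: String,
}

impl FlatSerializer {
    fn serialize_struct(&mut self, name: &str) {
        let _ = writeln!(self.output, "{name} {{");
    }

    fn serialize_field(&mut self, key: &str, value: &str) {
        let _ = writeln!(self.output, "  {key}={value};");
    }

    fn finish_struct(&mut self) {
        self.output.push_str("}\n");
    }
}

// Intended call order: open the struct, write a field, close the struct.
fn intended_use() -> String {
    let mut ser = FlatSerializer::default();
    ser.serialize_struct("User");
    ser.serialize_field("id", "42");
    ser.finish_struct();
    ser.output
}

// Misuse that still compiles: a field is written before any struct is open,
// and the struct that is opened afterwards is never closed.
fn misuse() -> String {
    let mut ser = FlatSerializer::default();
    ser.serialize_field("id", "42");
    ser.serialize_struct("User");
    ser.output
}
```

Both functions compile, and `misuse` returns malformed output that a caller can observe. In the typestate version, the same call sequence is rejected at compile time because `Serializer` has no `serialize_field` method.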
diff --git a/src/idiomatic/leveraging-the-type-system/typestate-pattern/typestate-generics.md b/src/idiomatic/leveraging-the-type-system/typestate-pattern/typestate-generics.md
new file mode 100644
index 00000000..ecbd526a
--- /dev/null
+++ b/src/idiomatic/leveraging-the-type-system/typestate-pattern/typestate-generics.md
@@ -0,0 +1,141 @@
+## Typestate Pattern with Generics
+
+Generics can be used with the typestate pattern to reduce duplication and allow
+shared logic across state variants, while still encoding state transitions in
+the type system.
+
+```rust
+# fn main() -> std::io::Result<()> {
+#[non_exhaustive]
+struct Insecure;
+struct Secure {
+    client_cert: Option<Vec<u8>>,
+}
+
+trait Transport {
+    /* ... */
+}
+impl Transport for Insecure {
+    /* ... */
+}
+impl Transport for Secure {
+    /* ... */
+}
+
+#[non_exhaustive]
+struct WantsTransport;
+struct Ready<T> {
+    transport: T,
+}
+
+struct ConnectionBuilder<T> {
+    host: String,
+    timeout: Option<u64>,
+    stage: T,
+}
+
+struct Connection {/* ... */}
+
+impl Connection {
+    fn new(host: &str) -> ConnectionBuilder<WantsTransport> {
+        ConnectionBuilder {
+            host: host.to_owned(),
+            timeout: None,
+            stage: WantsTransport,
+        }
+    }
+}
+
+impl<T> ConnectionBuilder<T> {
+    fn timeout(mut self, secs: u64) -> Self {
+        self.timeout = Some(secs);
+        self
+    }
+}
+
+impl ConnectionBuilder<WantsTransport> {
+    fn insecure(self) -> ConnectionBuilder<Ready<Insecure>> {
+        ConnectionBuilder {
+            host: self.host,
+            timeout: self.timeout,
+            stage: Ready { transport: Insecure },
+        }
+    }
+
+    fn secure(self) -> ConnectionBuilder<Ready<Secure>> {
+        ConnectionBuilder {
+            host: self.host,
+            timeout: self.timeout,
+            stage: Ready { transport: Secure { client_cert: None } },
+        }
+    }
+}
+
+impl ConnectionBuilder<Ready<Secure>> {
+    fn client_certificate(mut self, raw: Vec<u8>) -> Self {
+        self.stage.transport.client_cert = Some(raw);
+        self
+    }
+}
+
+impl<T: Transport> ConnectionBuilder<Ready<T>> {
+    fn connect(self) -> std::io::Result<Connection> {
+        // ... use valid state to establish the configured connection
+        Ok(Connection {})
+    }
+}
+
+let _conn = Connection::new("db.local")
+    .secure()
+    .client_certificate(vec![1, 2, 3])
+    .timeout(10)
+    .connect()?;
+Ok(())
+# }
+```
+
+
+
+- This example extends the typestate pattern using **generic parameters** to
+  avoid duplication of common logic.
+
+- We use a generic type `T` to represent the current stage of the builder, and
+  share fields like `host` and `timeout` across all stages.
+
+- The transport phase uses `insecure()` and `secure()` to transition from
+  `WantsTransport` into `Ready<T>`, where `T` is a type that implements the
+  `Transport` trait.
+
+- Only once the builder is in a `Ready<T>` state can we call `.connect()`; this
+  is enforced at compile time.
+
+- Using generics allows us to avoid writing separate `BuilderForSecure`,
+  `BuilderForInsecure`, etc. structs.
+
+  Shared behavior, like `.timeout(...)`, can be implemented once and reused
+  across all states.
+
+- This same design appears
+  [in real-world libraries like **Rustls**](https://docs.rs/rustls/latest/rustls/struct.ConfigBuilder.html),
+  where the `ConfigBuilder` uses typestate and generics to guide users through a
+  safe, ordered configuration flow.
+
+  It enforces at compile time that users must choose protocol versions, a
+  certificate verifier, and client certificate options, in the correct sequence,
+  before building a config.
+
+- **Downsides** of this approach include:
+  - The documentation of the various builder types can become difficult to
+    follow, since their names are built from generic parameters and internal
+    structs like `Ready<T>`.
+  - Error messages from the compiler may become more opaque, especially if a
+    trait bound is not satisfied or a state transition is incomplete.
+
+    Deeply nested generic types can also make these error messages hard to
+    follow.
+
+- Still, in return for this complexity, you get compile-time enforcement of
+  valid configuration, clear builder sequencing, and no possibility of
+  forgetting a required step or misusing the API at runtime.
+
+
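The points about shared behavior and stage-gated methods can be sketched in a condensed form. This is an illustration under assumptions, not the code from the change above: the `scheme` method on `Transport` and the `describe` method are hypothetical additions that give the example observable output, and the stage parameter is named `S` here.

```rust
// Hypothetical: `scheme` is invented for this sketch so each transport
// has something observable to report.
trait Transport {
    fn scheme(&self) -> &'static str;
}

struct Insecure;
struct Secure;

impl Transport for Insecure {
    fn scheme(&self) -> &'static str {
        "tcp"
    }
}

impl Transport for Secure {
    fn scheme(&self) -> &'static str {
        "tls"
    }
}

// Stage markers: no transport chosen yet, or ready with transport `T`.
struct WantsTransport;
struct Ready<T> {
    transport: T,
}

struct ConnectionBuilder<S> {
    host: String,
    timeout: Option<u64>,
    stage: S,
}

// Available only before a transport has been chosen.
impl ConnectionBuilder<WantsTransport> {
    fn new(host: &str) -> Self {
        ConnectionBuilder { host: host.to_owned(), timeout: None, stage: WantsTransport }
    }

    fn insecure(self) -> ConnectionBuilder<Ready<Insecure>> {
        ConnectionBuilder {
            host: self.host,
            timeout: self.timeout,
            stage: Ready { transport: Insecure },
        }
    }

    fn secure(self) -> ConnectionBuilder<Ready<Secure>> {
        ConnectionBuilder {
            host: self.host,
            timeout: self.timeout,
            stage: Ready { transport: Secure },
        }
    }
}

// Written once, available in every stage.
impl<S> ConnectionBuilder<S> {
    fn timeout(mut self, secs: u64) -> Self {
        self.timeout = Some(secs);
        self
    }
}

// Available only once some transport has been chosen.
// Hypothetical helper standing in for `connect`.
impl<T: Transport> ConnectionBuilder<Ready<T>> {
    fn describe(&self) -> String {
        format!("{}://{} (timeout: {:?})", self.stage.transport.scheme(), self.host, self.timeout)
    }
}
```

With these definitions, `ConnectionBuilder::new("db.local").timeout(5).secure()` and `ConnectionBuilder::new("db.local").insecure().timeout(10)` both type-check, since `timeout` is defined for every stage. Calling `describe()` directly on the result of `ConnectionBuilder::new(...)` does not compile, because no such method exists for `ConnectionBuilder<WantsTransport>`.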