diff --git a/src/SUMMARY.md b/src/SUMMARY.md index 437c2b0b..fce36275 100644 --- a/src/SUMMARY.md +++ b/src/SUMMARY.md @@ -438,6 +438,7 @@ - [Parse, Don't Validate](idiomatic/leveraging-the-type-system/newtype-pattern/parse-don-t-validate.md) - [Is It Encapsulated?](idiomatic/leveraging-the-type-system/newtype-pattern/is-it-encapsulated.md) - [Typestate Pattern](idiomatic/leveraging-the-type-system/typestate-pattern.md) + - [Typestate Pattern Example](idiomatic/leveraging-the-type-system/typestate-pattern/typestate-example.md) - [Typestate Pattern with Generics](idiomatic/leveraging-the-type-system/typestate-pattern/typestate-generics.md) --- diff --git a/src/idiomatic/leveraging-the-type-system/typestate-pattern.md b/src/idiomatic/leveraging-the-type-system/typestate-pattern.md index a67dadd2..627101c8 100644 --- a/src/idiomatic/leveraging-the-type-system/typestate-pattern.md +++ b/src/idiomatic/leveraging-the-type-system/typestate-pattern.md @@ -1,94 +1,88 @@ --- -minutes: 15 +minutes: 30 --- -## Typestate Pattern +## Typestate Pattern: Problem -Typestate is the practice of encoding a part of the state of the value in its type, preventing incorrect or inapplicable operations from being called on the value. +How can we ensure that only valid operations are allowed on a value based on its +current state? 
+ +```rust,editable +use std::fmt::Write as _; -```rust -# use std::fmt::Write; #[derive(Default)] -struct Serializer { output: String } -struct SerializeStruct { serializer: Serializer } - -impl Serializer { - fn serialize_struct(mut self, name: &str) -> SerializeStruct { - let _ = writeln!(&mut self.output, "{name} {{"); - SerializeStruct { serializer: self } - } +struct Serializer { + output: String, } -impl SerializeStruct { - fn serialize_field(mut self, key: &str, value: &str) -> Self { - let _ = writeln!(&mut self.serializer.output, " {key}={value};"); - self +impl Serializer { + fn serialize_struct_start(&mut self, name: &str) { + let _ = writeln!(&mut self.output, "{name} {{"); } - fn finish_struct(mut self) -> Serializer { - self.serializer.output.push_str("}\n"); - self.serializer + fn serialize_struct_field(&mut self, key: &str, value: &str) { + let _ = writeln!(&mut self.output, " {key}={value};"); + } + + fn serialize_struct_end(&mut self) { + self.output.push_str("}\n"); + } + + fn finish(self) -> String { + self.output } } fn main() { - let serializer = Serializer::default() - .serialize_struct("User") - .serialize_field("id", "42") - .serialize_field("name", "Alice") - .finish_struct(); - println!("{}", serializer.output); + let mut serializer = Serializer::default(); + serializer.serialize_struct_start("User"); + serializer.serialize_struct_field("id", "42"); + serializer.serialize_struct_field("name", "Alice"); + + // serializer.serialize_struct_end(); // ← Oops! Forgotten + + println!("{}", serializer.finish()); } ```
-- This example is inspired by - [Serde's `Serializer` trait](https://docs.rs/serde/latest/serde/ser/trait.Serializer.html). - For a deeper explanation of how Serde models serialization as a state machine, - see . - -- The typestate pattern allows us to model state machines using Rust’s type - system. In this case, the state machine is a simple serializer. - -- The key idea is that at each state in the process, we can only - do the actions which are valid for that state. Transitions between - states happen by consuming one value and producing another. +- This `Serializer` is meant to write a structured value. The expected usage + follows this sequence: ```bob -+------------+ serialize struct +-----------------+ -| Serializer +-------------------->| SerializeStruct |<-------+ -+------------+ +-+-----+---------+ | - ^ | | | - | finish struct | | serialize field | - +-----------------------------+ +------------------+ +serialize struct start +-+--------------------- + | + +--> serialize struct field + -+--------------------- + | + +--> serialize struct field + -+--------------------- + | + +--> serialize struct end ``` -- In the example above: +- However, in this example we forgot to call `serialize_struct_end()` before + `finish()`. As a result, the serialized output is incomplete or syntactically + incorrect. - - Once we begin serializing a struct, the `Serializer` is moved into the - `SerializeStruct` state. At that point, we no longer have access to the - original `Serializer`. +- One approach to fix this would be to track internal state manually, and return + a `Result` from methods like `serialize_struct_field()` or `finish()` if the + current state is invalid. - - While in the `SerializeStruct` state, we can only call methods related to - writing fields. We cannot use the same instance to serialize a tuple, list, - or primitive. Those constructors simply do not exist here. 
+- But this has downsides: - - Only after calling `finish_struct` do we get the `Serializer` back. At that - point, we can inspect the output or start a new serialization session. + - It is easy to get wrong as an implementer. Rust’s type system cannot help + enforce the correctness of our state transitions. - - If we forget to call `finish_struct` and drop the `SerializeStruct` instead, - the original `Serializer` is lost. This ensures that incomplete or invalid - output can never be observed. + - It also places an unnecessary burden on the user, who must handle `Result` + values for mistakes that live in the source code, not in runtime conditions. -- By contrast, if all methods were defined on `Serializer` itself, nothing would - prevent users from mixing serialization modes or leaving a struct unfinished. +- A better solution is to model the valid state transitions directly in the type + system. -- This pattern avoids such misuse by making it **impossible to represent invalid - transitions**. - -- One downside of typestate modeling is potential code duplication between - states. In the next section, we will see how to use **generics** to reduce - duplication while preserving correctness. + On the next slide, we will apply the **typestate pattern** to enforce correct + usage at compile time and make invalid states unrepresentable.
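Review note: the speaker notes describe the runtime-checked alternative (track state manually and return a `Result`) without showing it. A minimal sketch of that alternative might look like the following. The `State` and `SerializeError` enums, their variants, `Serializer::new`, and the `forgot_struct_end` helper are hypothetical names chosen for illustration and are not part of the slide; the two-space field indent is also an assumption.

```rust
use std::fmt::Write as _;

// Hypothetical state and error types; the slide does not define these.
#[derive(Debug, PartialEq)]
enum State {
    Idle,
    InStruct,
}

#[derive(Debug, PartialEq)]
enum SerializeError {
    AlreadyInStruct,
    NotInStruct,
    UnfinishedStruct,
}

struct Serializer {
    output: String,
    state: State,
}

impl Serializer {
    fn new() -> Self {
        Serializer { output: String::new(), state: State::Idle }
    }

    fn serialize_struct_start(&mut self, name: &str) -> Result<(), SerializeError> {
        if self.state != State::Idle {
            return Err(SerializeError::AlreadyInStruct);
        }
        let _ = writeln!(&mut self.output, "{name} {{");
        self.state = State::InStruct;
        Ok(())
    }

    fn serialize_struct_field(&mut self, key: &str, value: &str) -> Result<(), SerializeError> {
        if self.state != State::InStruct {
            return Err(SerializeError::NotInStruct);
        }
        let _ = writeln!(&mut self.output, "  {key}={value};");
        Ok(())
    }

    fn serialize_struct_end(&mut self) -> Result<(), SerializeError> {
        if self.state != State::InStruct {
            return Err(SerializeError::NotInStruct);
        }
        self.output.push_str("}\n");
        self.state = State::Idle;
        Ok(())
    }

    // Every state check above and below is hand-written; the compiler
    // cannot tell us if one of them is missing or wrong.
    fn finish(self) -> Result<String, SerializeError> {
        if self.state != State::Idle {
            return Err(SerializeError::UnfinishedStruct);
        }
        Ok(self.output)
    }
}

// Same mistake as on the slide: the struct is never closed.
fn forgot_struct_end() -> Result<String, SerializeError> {
    let mut serializer = Serializer::new();
    serializer.serialize_struct_start("User")?;
    serializer.serialize_struct_field("id", "42")?;
    serializer.finish()
}

fn main() {
    match forgot_struct_end() {
        Ok(output) => println!("{output}"),
        Err(error) => println!("rejected at runtime: {error:?}"),
    }
}
```

The mistake is now caught, but only when the program runs, and every caller has to thread `?` or a `match` through code whose only possible failure is a programming error. That is the burden the notes refer to.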
diff --git a/src/idiomatic/leveraging-the-type-system/typestate-pattern/typestate-example.md b/src/idiomatic/leveraging-the-type-system/typestate-pattern/typestate-example.md new file mode 100644 index 00000000..bde35052 --- /dev/null +++ b/src/idiomatic/leveraging-the-type-system/typestate-pattern/typestate-example.md @@ -0,0 +1,98 @@ +## Typestate Pattern: Example + +The typestate pattern encodes part of a value’s runtime state into its type. +This allows us to prevent invalid or inapplicable operations at compile time. + +```rust,editable +use std::fmt::Write as _; + +#[derive(Default)] +struct Serializer { + output: String, +} + +struct SerializeStruct { + serializer: Serializer, +} + +impl Serializer { + fn serialize_struct(mut self, name: &str) -> SerializeStruct { + let _ = writeln!(&mut self.output, "{name} {{"); + SerializeStruct { serializer: self } + } + + fn finish(self) -> String { + self.output + } +} + +impl SerializeStruct { + fn serialize_field(mut self, key: &str, value: &str) -> Self { + let _ = writeln!(&mut self.serializer.output, " {key}={value};"); + self + } + + fn finish_struct(mut self) -> Serializer { + self.serializer.output.push_str("}\n"); + self.serializer + } +} + +fn main() { + let serializer = Serializer::default() + .serialize_struct("User") + .serialize_field("id", "42") + .serialize_field("name", "Alice") + .finish_struct(); + + println!("{}", serializer.finish()); +} +``` + +
+ +- This example is inspired by Serde’s + [`Serializer` trait](https://docs.rs/serde/latest/serde/ser/trait.Serializer.html). + Serde uses typestates internally to ensure serialization follows a valid + structure. For more, see: + +- The key idea behind typestate is that state transitions happen by consuming a + value and producing a new one. At each step, only operations valid for that + state are available. + +```bob ++------------+ serialize struct +-----------------+ +| Serializer +-------------------->| SerializeStruct |<-------+ ++--+---------+ +-+-----+---------+ | + | ^ | | | + | | finish struct | | serialize field | + | +-----------------------------+ +------------------+ + | + +---> finish +``` + +- In this example: + + - We begin with a `Serializer`, which only lets us start serializing a + struct or call `finish()`. + + - Once we call `.serialize_struct(...)`, ownership of the `Serializer` moves + into a `SerializeStruct` value. From that point on, we can only call methods + related to serializing struct fields. + + - The original `Serializer` is no longer accessible, which prevents us from + mixing modes (like writing a tuple or primitive mid-struct) or calling + `finish()` too early. + + - Only after calling `.finish_struct()` do we receive the `Serializer` back. + At that point, the output can be finalized or reused. + +- If we forget to call `finish_struct()` and drop the `SerializeStruct` early, + the `Serializer` is also dropped. This ensures incomplete output cannot leak + into the system. + +- By contrast, if we had implemented everything on `Serializer` directly, as + seen on the previous slide, nothing would stop someone from skipping important + steps or mixing serialization flows. +
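Review note: the notes say that after `.finish_struct()` the output "can be finalized or reused", but the slide's `main` only finalizes. A short sketch of the reuse path, built on the same two types as the slide, could help. The `serialize_two_structs` helper is a hypothetical name added for illustration, and the two-space field indent is an assumption.

```rust
use std::fmt::Write as _;

#[derive(Default)]
struct Serializer {
    output: String,
}

struct SerializeStruct {
    serializer: Serializer,
}

impl Serializer {
    // Consumes the `Serializer` and moves it into the struct-writing state.
    fn serialize_struct(mut self, name: &str) -> SerializeStruct {
        let _ = writeln!(&mut self.output, "{name} {{");
        SerializeStruct { serializer: self }
    }

    fn finish(self) -> String {
        self.output
    }
}

impl SerializeStruct {
    fn serialize_field(mut self, key: &str, value: &str) -> Self {
        let _ = writeln!(&mut self.serializer.output, "  {key}={value};");
        self
    }

    // Closes the struct and hands the `Serializer` back to the caller.
    fn finish_struct(mut self) -> Serializer {
        self.serializer.output.push_str("}\n");
        self.serializer
    }
}

// Hypothetical helper: writes two structs with one serializer.
fn serialize_two_structs() -> String {
    let serializer = Serializer::default()
        .serialize_struct("User")
        .serialize_field("id", "42")
        .finish_struct(); // back in the `Serializer` state

    // The next two lines would not compile if uncommented, because
    // `SerializeStruct` has no `finish` method:
    // let in_struct = Serializer::default().serialize_struct("User");
    // in_struct.finish();

    serializer
        .serialize_struct("Point")
        .serialize_field("x", "1")
        .finish_struct()
        .finish()
}

fn main() {
    print!("{}", serialize_two_structs());
}
```

Because each method takes `self` by value, the only way to reach `finish()` is through a `Serializer`, and the only way to get a `Serializer` back from a `SerializeStruct` is `finish_struct()`.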