mirror of https://github.com/google/comprehensive-rust.git synced 2025-08-08 08:22:52 +02:00

rework the initial typestate no-generic content

Glen De Cauwsemaecker
2025-08-02 11:46:36 +02:00
parent 4b0870eb35
commit 14cc136c3e
3 changed files with 156 additions and 63 deletions


@@ -438,6 +438,7 @@
 - [Parse, Don't Validate](idiomatic/leveraging-the-type-system/newtype-pattern/parse-don-t-validate.md)
 - [Is It Encapsulated?](idiomatic/leveraging-the-type-system/newtype-pattern/is-it-encapsulated.md)
 - [Typestate Pattern](idiomatic/leveraging-the-type-system/typestate-pattern.md)
+- [Typestate Pattern Example](idiomatic/leveraging-the-type-system/typestate-pattern/typestate-example.md)
 - [Typestate Pattern with Generics](idiomatic/leveraging-the-type-system/typestate-pattern/typestate-generics.md)
 ---


@@ -1,94 +1,88 @@
 ---
-minutes: 15
+minutes: 30
 ---
-## Typestate Pattern
-Typestate is the practice of encoding a part of the state of the value in its type, preventing incorrect or inapplicable operations from being called on the value.
+## Typestate Pattern: Problem
+How can we ensure that only valid operations are allowed on a value based on its
+current state?
+```rust,editable
+use std::fmt::Write as _;
-```rust
-# use std::fmt::Write;
 #[derive(Default)]
-struct Serializer { output: String }
-struct SerializeStruct { serializer: Serializer }
-impl Serializer {
-    fn serialize_struct(mut self, name: &str) -> SerializeStruct {
-        let _ = writeln!(&mut self.output, "{name} {{");
-        SerializeStruct { serializer: self }
-    }
+struct Serializer {
+    output: String,
 }
-impl SerializeStruct {
-    fn serialize_field(mut self, key: &str, value: &str) -> Self {
-        let _ = writeln!(&mut self.serializer.output, " {key}={value};");
-        self
+impl Serializer {
+    fn serialize_struct_start(&mut self, name: &str) {
+        let _ = writeln!(&mut self.output, "{name} {{");
     }
-    fn finish_struct(mut self) -> Serializer {
-        self.serializer.output.push_str("}\n");
-        self.serializer
+    fn serialize_struct_field(&mut self, key: &str, value: &str) {
+        let _ = writeln!(&mut self.output, " {key}={value};");
+    }
+    fn serialize_struct_end(&mut self) {
+        self.output.push_str("}\n");
+    }
+    fn finish(self) -> String {
+        self.output
     }
 }
 fn main() {
-    let serializer = Serializer::default()
-        .serialize_struct("User")
-        .serialize_field("id", "42")
-        .serialize_field("name", "Alice")
-        .finish_struct();
-    println!("{}", serializer.output);
+    let mut serializer = Serializer::default();
+    serializer.serialize_struct_start("User");
+    serializer.serialize_struct_field("id", "42");
+    serializer.serialize_struct_field("name", "Alice");
+    // serializer.serialize_struct_end(); // ← Oops! Forgotten
+    println!("{}", serializer.finish());
 }
 ```
 <details>
-- This example is inspired by
-[Serde's `Serializer` trait](https://docs.rs/serde/latest/serde/ser/trait.Serializer.html).
-For a deeper explanation of how Serde models serialization as a state machine,
-see <https://serde.rs/impl-serializer.html>.
-- The typestate pattern allows us to model state machines using Rust’s type
-system. In this case, the state machine is a simple serializer.
-- The key idea is that at each state in the process, we can only
-do the actions which are valid for that state. Transitions between
-states happen by consuming one value and producing another.
+- This `Serializer` is meant to write a structured value. The expected usage
+follows this sequence:
 ```bob
-+------------+  serialize struct   +-----------------+
-| Serializer +-------------------->| SerializeStruct |<-------+
-+------------+                     +-+-----+---------+        |
-       ^                             |     |                  |
-       |    finish struct            |     | serialize field  |
-       +-----------------------------+     +------------------+
+serialize struct start
+-+---------------------
+ |
+ +--> serialize struct field
+      -+---------------------
+       |
+       +--> serialize struct field
+            -+---------------------
+             |
+             +--> serialize struct end
 ```
-- In the example above:
-- Once we begin serializing a struct, the `Serializer` is moved into the
-`SerializeStruct` state. At that point, we no longer have access to the
-original `Serializer`.
-- While in the `SerializeStruct` state, we can only call methods related to
-writing fields. We cannot use the same instance to serialize a tuple, list,
-or primitive. Those constructors simply do not exist here.
-- Only after calling `finish_struct` do we get the `Serializer` back. At that
-point, we can inspect the output or start a new serialization session.
-- If we forget to call `finish_struct` and drop the `SerializeStruct` instead,
-the original `Serializer` is lost. This ensures that incomplete or invalid
-output can never be observed.
-- By contrast, if all methods were defined on `Serializer` itself, nothing would
-prevent users from mixing serialization modes or leaving a struct unfinished.
-- This pattern avoids such misuse by making it **impossible to represent invalid
-transitions**.
-- One downside of typestate modeling is potential code duplication between
-states. In the next section, we will see how to use **generics** to reduce
-duplication while preserving correctness.
+- However, in this example we forgot to call `serialize_struct_end()` before
+`finish()`. As a result, the serialized output is incomplete or syntactically
+incorrect.
+- One approach to fix this would be to track internal state manually, and return
+a `Result` from methods like `serialize_struct_field()` or `finish()` if the
+current state is invalid.
+- But this has downsides:
+- It is easy to get wrong as an implementer. Rust’s type system cannot help
+enforce the correctness of our state transitions.
+- It also adds unnecessary burden on the user, who must handle `Result` values
+for operations that are misused in source code rather than at runtime.
+- A better solution is to model the valid state transitions directly in the type
+system.
+In the next slide, we will apply the **typestate pattern** to enforce correct
+usage at compile time and make invalid states unrepresentable.
 </details>
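
The speaker notes in this hunk mention tracking state manually and returning a `Result` when a method is called in the wrong state. As a rough illustration of what that could look like, here is a minimal sketch. The `State` and `StateError` enums and the `new()` constructor are assumptions made for this example and are not part of the commit; only the method names follow the slide.

```rust
use std::fmt::Write as _;

// Hypothetical runtime state, not part of the slide.
#[derive(Debug, PartialEq)]
enum State {
    Idle,
    InStruct,
}

// Hypothetical error type describing each kind of misuse.
#[derive(Debug, PartialEq)]
enum StateError {
    AlreadyInStruct,
    NotInStruct,
    UnfinishedStruct,
}

struct Serializer {
    output: String,
    state: State,
}

impl Serializer {
    fn new() -> Self {
        Serializer { output: String::new(), state: State::Idle }
    }

    fn serialize_struct_start(&mut self, name: &str) -> Result<(), StateError> {
        if self.state != State::Idle {
            return Err(StateError::AlreadyInStruct);
        }
        let _ = writeln!(&mut self.output, "{name} {{");
        self.state = State::InStruct;
        Ok(())
    }

    fn serialize_struct_field(&mut self, key: &str, value: &str) -> Result<(), StateError> {
        if self.state != State::InStruct {
            return Err(StateError::NotInStruct);
        }
        let _ = writeln!(&mut self.output, " {key}={value};");
        Ok(())
    }

    fn serialize_struct_end(&mut self) -> Result<(), StateError> {
        if self.state != State::InStruct {
            return Err(StateError::NotInStruct);
        }
        self.output.push_str("}\n");
        self.state = State::Idle;
        Ok(())
    }

    fn finish(self) -> Result<String, StateError> {
        if self.state != State::Idle {
            return Err(StateError::UnfinishedStruct);
        }
        Ok(self.output)
    }
}

fn main() {
    let mut serializer = Serializer::new();
    serializer.serialize_struct_start("User").unwrap();
    serializer.serialize_struct_field("id", "42").unwrap();
    // `serialize_struct_end()` is forgotten again; `finish()` now reports it,
    // but only when the program runs.
    match serializer.finish() {
        Ok(output) => println!("{output}"),
        Err(err) => println!("misuse detected at runtime: {err:?}"),
    }
}
```

Every method repeats a hand-written state check, and every caller has to handle a `Result` for what is really a programming mistake. These are the two downsides the notes point out.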


@@ -0,0 +1,98 @@
## Typestate Pattern: Example
The typestate pattern encodes part of a value’s runtime state into its type.
This allows us to prevent invalid or inapplicable operations at compile time.
```rust,editable
use std::fmt::Write as _;

#[derive(Default)]
struct Serializer {
    output: String,
}

struct SerializeStruct {
    serializer: Serializer,
}

impl Serializer {
    fn serialize_struct(mut self, name: &str) -> SerializeStruct {
        let _ = writeln!(&mut self.output, "{name} {{");
        SerializeStruct { serializer: self }
    }

    fn finish(self) -> String {
        self.output
    }
}

impl SerializeStruct {
    fn serialize_field(mut self, key: &str, value: &str) -> Self {
        let _ = writeln!(&mut self.serializer.output, " {key}={value};");
        self
    }

    fn finish_struct(mut self) -> Serializer {
        self.serializer.output.push_str("}\n");
        self.serializer
    }
}

fn main() {
    let serializer = Serializer::default()
        .serialize_struct("User")
        .serialize_field("id", "42")
        .serialize_field("name", "Alice")
        .finish_struct();
    println!("{}", serializer.finish());
}
```
<details>
- This example is inspired by Serde’s
[`Serializer` trait](https://docs.rs/serde/latest/serde/ser/trait.Serializer.html).
Serde uses typestates internally to ensure serialization follows a valid
structure. For more, see: <https://serde.rs/impl-serializer.html>
- The key idea behind typestate is that state transitions happen by consuming a
value and producing a new one. At each step, only operations valid for that
state are available.
```bob
+------------+  serialize struct   +-----------------+
| Serializer +-------------------->| SerializeStruct |<-------+
+--+---------+                     +-+-----+---------+        |
   |   ^                             |     |                  |
   |   |    finish struct            |     | serialize field  |
   |   +-----------------------------+     +------------------+
   |
   +---> finish
```
- In this example:
- We begin with a `Serializer`, which only allows us to start serializing a
struct.
- Once we call `.serialize_struct(...)`, ownership moves into a
`SerializeStruct` value. From that point on, we can only call methods
related to serializing struct fields.
- The original `Serializer` is no longer accessible — preventing us from
mixing modes (like writing a tuple or primitive mid-struct) or calling
`finish()` too early.
- Only after calling `.finish_struct()` do we receive the `Serializer` back.
At that point, the output can be finalized or reused.
- If we forget to call `finish_struct()` and drop the `SerializeStruct` early,
the `Serializer` is also dropped. This ensures incomplete output cannot leak
into the system.
- By contrast, if we had implemented everything on `Serializer` directly, as
seen on the previous slide, nothing would stop someone from skipping important
steps or mixing serialization flows.
</details>
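
To illustrate the point that `finish_struct()` hands the `Serializer` back for reuse, here is a minimal sketch built on the same types as the slide. The `serialize_two_users` helper is hypothetical and exists only for this example. The commented-out lines show calls that the compiler rejects because the method does not exist on the type for that state.

```rust
use std::fmt::Write as _;

#[derive(Default)]
struct Serializer {
    output: String,
}

struct SerializeStruct {
    serializer: Serializer,
}

impl Serializer {
    fn serialize_struct(mut self, name: &str) -> SerializeStruct {
        let _ = writeln!(&mut self.output, "{name} {{");
        SerializeStruct { serializer: self }
    }

    fn finish(self) -> String {
        self.output
    }
}

impl SerializeStruct {
    fn serialize_field(mut self, key: &str, value: &str) -> Self {
        let _ = writeln!(&mut self.serializer.output, " {key}={value};");
        self
    }

    fn finish_struct(mut self) -> Serializer {
        self.serializer.output.push_str("}\n");
        self.serializer
    }
}

// Hypothetical helper: each `finish_struct()` returns the `Serializer`,
// so a second struct can be started on the same output.
fn serialize_two_users() -> String {
    Serializer::default()
        .serialize_struct("User")
        .serialize_field("id", "42")
        .finish_struct()
        .serialize_struct("User")
        .serialize_field("id", "43")
        .finish_struct()
        .finish()
}

fn main() {
    print!("{}", serialize_two_users());

    // Neither of these lines compiles if uncommented:
    // Serializer::default().serialize_field("id", "42"); // no such method on `Serializer`
    // Serializer::default().serialize_struct("User").finish(); // no such method on `SerializeStruct`
}
```

Because each transition consumes `self`, there is no way to keep a handle to the previous state, so the only route to `finish()` goes through `finish_struct()`.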