From 4483602cd2cf3365b901399b93d8312e403b02e8 Mon Sep 17 00:00:00 2001 From: Luca Palmieri <20745048+LukeMathWalker@users.noreply.github.com> Date: Wed, 9 Jul 2025 21:03:41 +0200 Subject: [PATCH] Introduce 'Idiomatic Rust' learning module (#2800) This PR introduces: - A new section for the "Idiomatic Rust" learning module - (The beginning of) the section on newtype patterns --------- Co-authored-by: Dmitri Gribenko --- mdbook-course/src/replacements.rs | 3 +- src/SUMMARY.md | 11 ++ src/idiomatic/leveraging-the-type-system.md | 53 +++++++++ .../newtype-pattern.md | 51 ++++++++ .../newtype-pattern/is-it-encapsulated.md | 54 +++++++++ .../newtype-pattern/parse-don-t-validate.md | 58 +++++++++ .../newtype-pattern/semantic-confusion.md | 73 ++++++++++++ src/idiomatic/welcome.md | 111 ++++++++++++++++++ src/running-the-course/course-structure.md | 10 ++ src/user-defined-types/tuple-structs.md | 2 + 10 files changed, 425 insertions(+), 1 deletion(-) create mode 100644 src/idiomatic/leveraging-the-type-system.md create mode 100644 src/idiomatic/leveraging-the-type-system/newtype-pattern.md create mode 100644 src/idiomatic/leveraging-the-type-system/newtype-pattern/is-it-encapsulated.md create mode 100644 src/idiomatic/leveraging-the-type-system/newtype-pattern/parse-don-t-validate.md create mode 100644 src/idiomatic/leveraging-the-type-system/newtype-pattern/semantic-confusion.md create mode 100644 src/idiomatic/welcome.md diff --git a/mdbook-course/src/replacements.rs b/mdbook-course/src/replacements.rs index 3695aa79..48354929 100644 --- a/mdbook-course/src/replacements.rs +++ b/mdbook-course/src/replacements.rs @@ -48,7 +48,8 @@ pub fn replace( ["course", "outline"] if course.is_some() => { course.unwrap().schedule() } - ["course", "outline", course_name] => { + ["course", "outline", course_name @ ..] 
=> { + let course_name = course_name.join(" "); let Some(course) = courses.find_course(course_name) else { return format!("not found - {}", captures[0].to_string()); }; diff --git a/src/SUMMARY.md b/src/SUMMARY.md index 4caccd92..1950476a 100644 --- a/src/SUMMARY.md +++ b/src/SUMMARY.md @@ -429,6 +429,17 @@ --- +# Idiomatic Rust + +- [Welcome](idiomatic/welcome.md) +- [Leveraging the Type System](idiomatic/leveraging-the-type-system.md) + - [Newtype Pattern](idiomatic/leveraging-the-type-system/newtype-pattern.md) + - [Semantic Confusion](idiomatic/leveraging-the-type-system/newtype-pattern/semantic-confusion.md) + - [Parse, Don't Validate](idiomatic/leveraging-the-type-system/newtype-pattern/parse-don-t-validate.md) + - [Is It Encapsulated?](idiomatic/leveraging-the-type-system/newtype-pattern/is-it-encapsulated.md) + +--- + # Final Words - [Thanks!](thanks.md) diff --git a/src/idiomatic/leveraging-the-type-system.md b/src/idiomatic/leveraging-the-type-system.md new file mode 100644 index 00000000..d7a871b4 --- /dev/null +++ b/src/idiomatic/leveraging-the-type-system.md @@ -0,0 +1,53 @@ +--- +minutes: 5 +--- + +# Leveraging the Type System + +Rust's type system is _expressive_: you can use types and traits to build +abstractions that make your code harder to misuse. + +In some cases, you can go as far as enforcing correctness at _compile-time_, +with no runtime overhead. + +Types and traits can model concepts and constraints from your business domain. +With careful design, you can improve the clarity and maintainability of the +entire codebase. + +
+ +Additional items the speaker may mention: + +- Rust's type system borrows a lot of ideas from functional programming + languages. + + For example, Rust's enums are known as "algebraic data types" in languages + like Haskell and OCaml. You can take inspiration from learning material geared + towards functional languages when looking for guidance on how to design with + types. ["Domain Modeling Made Functional"][1] is a great resource on the + topic, with examples written in F#. + +- Despite Rust's functional roots, not all functional design patterns can be + easily translated to Rust. + + For example, you must have a solid grasp of a broad selection of advanced + topics to design APIs that leverage higher-order functions and higher-kinded + types in Rust. + + Evaluate, on a case-by-case basis, whether a more imperative approach may be + easier to implement. Consider using in-place mutation, relying on Rust's + borrow checker and type system to control what can be mutated, and where. + +- The same caution should be applied to object-oriented design patterns. Rust + doesn't support inheritance, and object decomposition should take into account + the constraints introduced by the borrow checker. + +- Mention that type-level programming can often be used to create "zero-cost + abstractions", although the label can be misleading: the impact on compile + times and code complexity may be significant. + +
+ +{{%segment outline}} + +[1]: https://pragprog.com/titles/swdddf/domain-modeling-made-functional/ diff --git a/src/idiomatic/leveraging-the-type-system/newtype-pattern.md b/src/idiomatic/leveraging-the-type-system/newtype-pattern.md new file mode 100644 index 00000000..330e465b --- /dev/null +++ b/src/idiomatic/leveraging-the-type-system/newtype-pattern.md @@ -0,0 +1,51 @@ +--- +minutes: 5 +--- + +# Newtype Pattern + +A _newtype_ is a wrapper around an existing type, often a primitive: + +```rust +/// A unique user identifier, implemented as a newtype around `u64`. +pub struct UserId(u64); +``` + +Unlike type aliases, newtypes aren't interchangeable with the wrapped type: + +```rust,compile_fail +# pub struct UserId(u64); +fn double(n: u64) -> u64 { + n * 2 +} + +double(UserId(1)); // 🛠️❌ +``` + +The Rust compiler won't let you use methods or operators defined on the +underlying type either: + +```rust,compile_fail +# pub struct UserId(u64); +assert_ne!(UserId(1), UserId(2)); // 🛠️❌ +``` + +
+ +- Students should have encountered the newtype pattern in the "Fundamentals" + course, when they learned about + [tuple structs](../../user-defined-types/tuple-structs.md). + +- Run the example to show students the error message from the compiler. + +- Modify the example to use a type alias instead of a newtype, such as + `type MessageId = u64`. The modified example should compile, thus highlighting + the differences between the two approaches. + +- Stress that newtypes, out of the box, have no behaviour attached to them. You + need to be intentional about which methods and operators you are willing to + forward from the underlying type. In our `UserId` example, it is reasonable to + allow comparisons between `UserId`s, but it wouldn't make sense to allow + arithmetic operations like addition or subtraction. + +
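The last speaker note can be illustrated with a minimal sketch. It assumes that comparison, hashing and copying are the only behaviours worth forwarding for `UserId`; the `get` accessor is a hypothetical addition for illustration and is not part of the slide.

```rust
/// A unique user identifier, implemented as a newtype around `u64`.
/// Deriving these traits opts in to comparison, hashing and copying only.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord, Hash)]
pub struct UserId(u64);

impl UserId {
    /// Hypothetical accessor: read-only access to the raw value.
    pub fn get(self) -> u64 {
        self.0
    }
}

fn main() {
    // Comparisons compile now, because `PartialEq` and `Debug` are derived.
    assert_ne!(UserId(1), UserId(2));
    assert!(UserId(1) < UserId(2));

    // Arithmetic is still rejected: `Add` was deliberately not implemented.
    // let sum = UserId(1) + UserId(2); // would not compile

    assert_eq!(UserId(7).get(), 7);
}
```

Each derived trait is an explicit decision: removing `PartialOrd` and `Ord` from the list would make the `<` comparison fail to compile again.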
diff --git a/src/idiomatic/leveraging-the-type-system/newtype-pattern/is-it-encapsulated.md b/src/idiomatic/leveraging-the-type-system/newtype-pattern/is-it-encapsulated.md new file mode 100644 index 00000000..32decb7a --- /dev/null +++ b/src/idiomatic/leveraging-the-type-system/newtype-pattern/is-it-encapsulated.md @@ -0,0 +1,54 @@ +--- +minutes: 5 +--- + +# Is It Truly Encapsulated? + +You must evaluate _the entire API surface_ exposed by a newtype to determine if +invariants are indeed bullet-proof. It is crucial to consider all possible +interactions, including trait implementations, that may allow users to bypass +validation checks. + +```rust +pub struct Username(String); + +impl Username { + pub fn new(username: String) -> Result<Self, InvalidUsername> { + // Validation checks... + Ok(Self(username)) + } +} + +impl std::ops::DerefMut for Username { // ‼️ + fn deref_mut(&mut self) -> &mut Self::Target { + &mut self.0 + } +} +# impl std::ops::Deref for Username { +# type Target = str; +# +# fn deref(&self) -> &Self::Target { +# &self.0 +# } +# } +# pub struct InvalidUsername; +``` + +
+ +- `DerefMut` allows users to get a mutable reference to the wrapped value. + + The mutable reference can be used to modify the underlying data in ways that + may violate the invariants enforced by `Username::new`! + +- When auditing the API surface of a newtype, you can narrow down the review + scope to methods and traits that provide mutable access to the underlying + data. + +- Remind students of privacy boundaries. + + In particular, functions and methods defined in the same module of the newtype + can access its underlying data directly. If possible, move the newtype + definition to its own separate module to reduce the scope of the audit. + +
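The notes above can be made concrete with a short sketch. The invariant used here (non-empty, lowercase ASCII letters only) is an assumption chosen for illustration, since the slide leaves the validation checks unspecified. The newtype lives in its own module, as the last note suggests, yet the `DerefMut` impl still lets outside code break the invariant.

```rust
mod username {
    use std::ops::{Deref, DerefMut};

    /// The field is private: code outside this module can't touch it directly.
    pub struct Username(String);

    #[derive(Debug)]
    pub struct InvalidUsername;

    impl Username {
        /// Assumed invariant for this sketch: non-empty, lowercase ASCII only.
        pub fn new(username: String) -> Result<Self, InvalidUsername> {
            if username.is_empty()
                || !username.bytes().all(|b| b.is_ascii_lowercase())
            {
                return Err(InvalidUsername);
            }
            Ok(Self(username))
        }
    }

    impl Deref for Username {
        type Target = str;

        fn deref(&self) -> &str {
            &self.0
        }
    }

    // ‼️ Hands out `&mut str`, which bypasses the checks in `new`.
    impl DerefMut for Username {
        fn deref_mut(&mut self) -> &mut str {
            &mut self.0
        }
    }
}

use username::Username;

fn main() {
    let mut name = Username::new("ferris".to_string()).unwrap();

    // name.0.clear(); // would not compile: the field is private

    // This compiles through `DerefMut` and violates the assumed invariant.
    name.make_ascii_uppercase();
    assert_eq!(&*name, "FERRIS");
}
```

Removing the `DerefMut` impl while keeping `Deref` would leave callers with read-only access, which is enough for most uses of the wrapped string.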
diff --git a/src/idiomatic/leveraging-the-type-system/newtype-pattern/parse-don-t-validate.md b/src/idiomatic/leveraging-the-type-system/newtype-pattern/parse-don-t-validate.md new file mode 100644 index 00000000..3989fcae --- /dev/null +++ b/src/idiomatic/leveraging-the-type-system/newtype-pattern/parse-don-t-validate.md @@ -0,0 +1,58 @@ +--- +minutes: 5 +--- + +# Parse, Don't Validate + +The newtype pattern can be leveraged to enforce _invariants_. + +```rust +pub struct Username(String); + +impl Username { + pub fn new(username: String) -> Result<Self, InvalidUsername> { + if username.is_empty() { + return Err(InvalidUsername::CannotBeEmpty) + } + if username.len() > 32 { + return Err(InvalidUsername::TooLong { len: username.len() }) + } + // Other validation checks... + Ok(Self(username)) + } + + pub fn as_str(&self) -> &str { + &self.0 + } +} +# pub enum InvalidUsername { +# CannotBeEmpty, +# TooLong { len: usize }, +# } +``` + +
+ +- The newtype pattern, combined with Rust's module and visibility system, can be + used to _guarantee_ that instances of a given type satisfy a set of + invariants. + + In the example above, the raw `String` stored inside the `Username` struct + can't be accessed directly from other modules or crates, since it's not marked + as `pub` or `pub(in ...)`. Consumers of the `Username` type are forced to use + the `new` method to create instances. In turn, `new` performs validation, thus + ensuring that all instances of `Username` satisfy those checks. + +- The `as_str` method allows consumers to access the raw string representation + (e.g., to store it in a database). However, consumers can't modify the + underlying value since `&str`, the returned type, restricts them to read-only + access. + +- Type-level invariants have second-order benefits. + + The input is validated once, at the boundary, and the rest of the program can + rely on the invariants being upheld. We can avoid redundant validation and + "defensive programming" checks throughout the program, reducing noise and + improving performance. + +
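The "validate once, at the boundary" point can be sketched as follows. The `greeting` function is a hypothetical downstream consumer, and the derives on `InvalidUsername` are added only so the example can assert on the error; neither is part of the slide.

```rust
pub struct Username(String);

#[derive(Debug, PartialEq)]
pub enum InvalidUsername {
    CannotBeEmpty,
    TooLong { len: usize },
}

impl Username {
    pub fn new(username: String) -> Result<Self, InvalidUsername> {
        if username.is_empty() {
            return Err(InvalidUsername::CannotBeEmpty);
        }
        if username.len() > 32 {
            return Err(InvalidUsername::TooLong { len: username.len() });
        }
        Ok(Self(username))
    }

    pub fn as_str(&self) -> &str {
        &self.0
    }
}

/// Hypothetical downstream function. It takes `&Username`, so it can rely
/// on the checks in `new` instead of re-validating its input.
fn greeting(username: &Username) -> String {
    format!("Hello, {}!", username.as_str())
}

fn main() {
    // Parse once, at the boundary where raw input enters the program.
    let username = Username::new(String::from("ferris")).expect("valid username");
    assert_eq!(greeting(&username), "Hello, ferris!");

    // Invalid input is rejected before it can reach `greeting`.
    let too_long = "x".repeat(33);
    assert_eq!(
        Username::new(too_long).err(),
        Some(InvalidUsername::TooLong { len: 33 })
    );
}
```

Had `greeting` accepted `&str`, every caller would need to remember to validate first, and the function could not tell whether they did.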
diff --git a/src/idiomatic/leveraging-the-type-system/newtype-pattern/semantic-confusion.md b/src/idiomatic/leveraging-the-type-system/newtype-pattern/semantic-confusion.md new file mode 100644 index 00000000..9c66acba --- /dev/null +++ b/src/idiomatic/leveraging-the-type-system/newtype-pattern/semantic-confusion.md @@ -0,0 +1,73 @@ +--- +minutes: 5 +--- + +# Semantic Confusion + +When a function takes multiple arguments of the same type, call sites are +unclear: + +```rust +# struct LoginError; +pub fn login(username: &str, password: &str) -> Result<(), LoginError> { + // [...] + # Ok(()) +} + +# let password = "password"; +# let username = "username"; +// In another part of the codebase, we swap arguments by mistake. +// Bug (best case), security vulnerability (worst case) +login(password, username); +``` + +The newtype pattern can prevent this class of errors at compile time: + +```rust,compile_fail +pub struct Username(String); +pub struct Password(String); +# struct LoginError; + +pub fn login(username: &Username, password: &Password) -> Result<(), LoginError> { + // [...] + # Ok(()) +} + +# let password = Password("password".into()); +# let username = Username("username".into()); +login(password, username); // 🛠️❌ +``` + +
+ +- Run both examples to show students the successful compilation for the original + example, and the compiler error returned by the modified example. + +- Stress the _semantic_ angle. The newtype pattern should be leveraged to use + distinct types for distinct concepts, thus ruling out this class of errors + entirely. + +- Nonetheless, note that there are legitimate scenarios where a function may + take multiple arguments of the same type. In those scenarios, if correctness + is of paramount importance, consider using a struct with named fields as + input: + ```rust + pub struct LoginArguments<'a> { + pub username: &'a str, + pub password: &'a str, + } + # fn login(i: LoginArguments) {} + # let password = "password"; + # let username = "username"; + + // No need to check the definition of the `login` function to spot the issue. + login(LoginArguments { + username: password, + password: username, + }) + ``` + Users are forced, at the callsite, to assign values to each field, thus + increasing the likelihood of spotting bugs. + + +
diff --git a/src/idiomatic/welcome.md b/src/idiomatic/welcome.md new file mode 100644 index 00000000..889ba721 --- /dev/null +++ b/src/idiomatic/welcome.md @@ -0,0 +1,111 @@ +--- +course: Idiomatic Rust +session: Morning +target_minutes: 180 +--- + +# Welcome to Idiomatic Rust + +[Rust Fundamentals](../welcome-day-1.md) introduced Rust syntax and core +concepts. We now want to go one step further: how do you use Rust _effectively_ +in your projects? What does _idiomatic_ Rust look like? + +This course is opinionated: we will nudge you towards some patterns, and away +from others. Nonetheless, we do recognize that some projects may have different +needs. We always provide the necessary information to help you make informed +decisions within the context and constraints of your own projects. + +> ⚠️ This course is under **active development**. +> +> The material may change frequently and there might be errors that have not yet +> been spotted. Nonetheless, we encourage you to browse through and provide +> early feedback! + +## Schedule + +{{%session outline}} + +
+ + + +The course will cover the topics listed below. Each topic may be covered in one +or more slides, depending on its complexity and relevance. + +### Foundations of API design + +- Golden rule: prioritize clarity and readability at the call site. People will + spend much more time reading the call sites than declarations of the functions + being called. +- Make your API predictable + - Follow naming conventions (case conventions, prefer vocabulary that has + precedent in the standard library, e.g., methods should be called "push" not + "push_back", "is_empty" not "empty", etc.) + - Know the vocabulary types and traits in the standard library, and use them + in your APIs. If something feels like a basic type/algorithm, check in the + standard library first. + - Use well-established API design patterns that we will discuss later in this + class (e.g., newtype, owned/view type pairs, error handling) +- Write meaningful and effective doc comments (e.g., don't merely repeat the + method name with spaces instead of underscores, don't repeat the same + information just to fill out every markdown tag, provide usage examples) + +### Leveraging the type system + +- Short recap on enums, structs and type aliases +- Newtype pattern and encapsulation: parse, don't validate +- Extension traits: avoid the newtype pattern when you want to provide + additional behaviour +- RAII, scope guards and drop bombs: using `Drop` to clean up resources, trigger + actions or enforce invariants +- "Token" types: force users to prove they've performed a specific action +- The typestate pattern: enforce correct state transitions at compile-time +- Using the borrow checker to enforce invariants that have nothing to do with + memory ownership + - OwnedFd/BorrowedFd in the standard library + - [Branded types](https://plv.mpi-sws.org/rustbelt/ghostcell/paper.pdf) + +### Don't fight the borrow checker + +- "Owned" types and "view" types: `&str` and `String`, `Path` and `PathBuf`, + etc. 
+- Don't hide ownership requirements: avoid hidden `.clone()`, learn to love + `Cow` +- Split types along ownership boundaries +- Structure your ownership hierarchy like a tree +- Strategies to manage circular dependencies: reference counting, using indexes + instead of references +- Interior mutability (Cell, RefCell) +- Working with lifetime parameters on user-defined data types + +### Polymorphism in Rust + +- A quick refresher on traits and generic functions +- Rust has no inheritance: what are the implications? + - Using enums for polymorphism + - Using traits for polymorphism + - Using composition + - How do I pick the most appropriate pattern? +- Working with generics + - Generic type parameter in a function or trait object as an argument? + - Trait bounds don't have to refer to the generic parameter + - Type parameters in traits: should it be a generic parameter or an associated + type? +- Macros: a valuable tool to DRY up code when traits are not enough (or too + complex) + +### Error Handling + +- What is the purpose of errors? Recovery vs. reporting. +- Result vs. Option +- Designing good errors: + - Determine the error scope. + - Capture additional context as the error flows upwards, crossing scope + boundaries. + - Leverage the `Error` trait to keep track of the full error chain. + - Leverage `thiserror` to reduce boilerplate when defining error types. + - `anyhow` +- Distinguish fatal errors from recoverable errors using + `Result<Result<T, RecoverableError>, FatalError>`. + +
diff --git a/src/running-the-course/course-structure.md b/src/running-the-course/course-structure.md index d47e56d0..cc8067e1 100644 --- a/src/running-the-course/course-structure.md +++ b/src/running-the-course/course-structure.md @@ -72,6 +72,16 @@ cargo run {{%course outline Concurrency}} +### Idiomatic Rust + +The [Idiomatic Rust](../idiomatic/welcome.md) deep dive is a 2-day class on Rust +idioms and patterns. + +You should be familiar with the material in +[Rust Fundamentals](../welcome-day-1.md) before starting this course. + +{{%course outline Idiomatic Rust}} + ## Format The course is meant to be very interactive and we recommend letting the diff --git a/src/user-defined-types/tuple-structs.md b/src/user-defined-types/tuple-structs.md index 53dd2b8f..f2340bf1 100644 --- a/src/user-defined-types/tuple-structs.md +++ b/src/user-defined-types/tuple-structs.md @@ -43,6 +43,8 @@ fn main() { - The value passed some validation when it was created, so you no longer have to validate it again at every use: `PhoneNumber(String)` or `OddNumber(u32)`. +- The newtype pattern is covered extensively in the + ["Idiomatic Rust" module](../idiomatic/leveraging-the-type-system/newtype-pattern.md). - Demonstrate how to add a `f64` value to a `Newtons` type by accessing the single field in the newtype. - Rust generally doesn’t like inexplicit things, like automatic unwrapping or