mirror of
https://github.com/google/comprehensive-rust.git
synced 2025-08-08 08:22:52 +02:00
Add Unsafe Rust Deep Dive (#2806)
Adds the start of an unsafe deep dive to Comprehensive Rust.

The `unsafe` keyword is easy to type, but hard to master. When used
appropriately, it forms a useful and indeed essential part of the Rust
programming language.

By the end of this deep dive, you'll know how to work with `unsafe` code,
review others' changes that include the `unsafe` keyword, and produce your own.

What you'll learn:

- What the terms undefined behavior, soundness, and safety mean
- Why the `unsafe` keyword exists in the Rust language
- How to write your own code using `unsafe` safely
- How to review `unsafe` code

Here is a tentative outline of a 10h (2 day) treatment:

Day 1: Using and Reviewing Unsafe

- Welcome
- Motivations: explain why the `unsafe` keyword exists
- Foundations: provide background knowledge; what is soundness? what is
  undefined behavior? what is validity in respect to pointers?
- Mechanics: what a safe `unsafe` block should look like
- Representations and Interoperability: explore how data is laid out in memory
  and how that can be sent across the wire and/or stored on disk.
- Reviewing unsafe
- Patterns for safer unsafe: Encapsulating unsafe code in safe-to-use
  abstractions, such as marking a type's constructor as `unsafe` so that
  invariants only need to be enforced once by the programmer.

Day 2: Deploying Unsafe to Build Abstractions

- Welcome
- Validity in detail: A refresher. Emphasis on the details of the invariants
  that are being upheld by a “typical” unsafe block, such as aliasing,
  alignment, data validity, padding.
- Concurrency and thread safety: understanding `Send` and `Sync`, knowing how
  to implement them on a user-defined type
- Case study: Small string optimization
- Case study: Zero-copy parsing
- Review

---------

Co-authored-by: Dmitri Gribenko <gribozavr@gmail.com>
@ -440,6 +440,23 @@

---

# Unsafe

- [Welcome](unsafe-deep-dive/welcome.md)
- [Setup](unsafe-deep-dive/setup.md)
- [Motivations](unsafe-deep-dive/motivations.md)
  - [Interoperability](unsafe-deep-dive/motivations/interop.md)
  - [Data Structures](unsafe-deep-dive/motivations/data-structures.md)
  - [Performance](unsafe-deep-dive/motivations/performance.md)
- [Foundations](unsafe-deep-dive/foundations.md)
  - [What is unsafe?](unsafe-deep-dive/foundations/what-is-unsafe.md)
  - [When is unsafe used?](unsafe-deep-dive/foundations/when-is-unsafe-used.md)
  - [Data structures are safe](unsafe-deep-dive/foundations/data-structures-are-safe.md)
  - [Actions might not be](unsafe-deep-dive/foundations/actions-might-not-be.md)
  - [Less powerful than it seems](unsafe-deep-dive/foundations/less-powerful.md)

---

# Final Words

- [Thanks!](thanks.md)
@ -82,6 +82,15 @@ You should be familiar with the material in

{{%course outline Idiomatic Rust}}

### Unsafe (Work in Progress)

The [Unsafe](../unsafe-deep-dive/welcome.md) deep dive is a two-day class on the
_unsafe_ Rust language. It covers the fundamentals of Rust's safety guarantees,
the motivation for `unsafe`, review process for `unsafe` code, FFI basics, and
building data structures that the borrow checker would normally reject.

{{%course outline Unsafe}}

## Format

The course is meant to be very interactive and we recommend letting the
0
src/unsafe-deep-dive/Cargo.toml
Normal file
5
src/unsafe-deep-dive/foundations.md
Normal file
@ -0,0 +1,5 @@
# Foundations

Some fundamental concepts and terms.

{{%segment outline}}
19
src/unsafe-deep-dive/foundations/actions-might-not-be.md
Normal file
@ -0,0 +1,19 @@
---
minutes: 2
---

# ... but actions on them might not be

```rust
fn main() {
    let n: i64 = 12345;
    let safe = &n as *const _;
    println!("{safe:p}");
}
```

<details>

Modify the example to de-reference `safe` without an `unsafe` block.
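
A minimal sketch of where this exercise is likely to end up, reusing the names `n` and `safe` from the slide. The helper `read_through_raw_pointer` is hypothetical and exists only so the result can be checked. Dereferencing the pointer outside an `unsafe` block is rejected by the compiler (error E0133); wrapping it in `unsafe` is what makes the program compile.

```rust
/// Hypothetical helper: creates a raw pointer in safe code, then reads
/// through it inside an `unsafe` block.
fn read_through_raw_pointer(n: i64) -> i64 {
    // Creating the raw pointer does not need `unsafe`.
    let safe = &n as *const i64;

    // Writing `*safe` here, outside of `unsafe`, fails with error E0133.

    // SAFETY: `safe` was derived from a reference to `n`, which is still
    // alive, initialized, and correctly aligned.
    unsafe { *safe }
}

fn main() {
    println!("{}", read_through_raw_pointer(12345));
}
```

The pointer is the same in both versions of the program. Only the action performed on it requires the programmer's promise.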

</details>
25
src/unsafe-deep-dive/foundations/data-structures-are-safe.md
Normal file
@ -0,0 +1,25 @@
---
minutes: 2
---

# Data structures are safe ...

Data structures are inert. They cannot do any harm by themselves.

Safe Rust code can create raw pointers:

```rust
fn main() {
    let n: i64 = 12345;
    let safe = &raw const n;
    println!("{safe:p}");
}
```

<details>

Consider a raw pointer to an integer, i.e., the value `safe` has the raw pointer
type `*const i64`. Raw pointers can be out-of-bounds, misaligned, or null.
But the unsafe keyword is not required when creating them.
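
As an illustration of that point, the sketch below builds a null, an out-of-bounds, and a misaligned pointer without writing `unsafe` once. The function name `questionable_pointers` is hypothetical. None of the pointers is ever dereferenced, which is why the code is sound.

```rust
use std::ptr;

/// Hypothetical helper: builds three questionable raw pointers using only
/// safe code. The pointers must never be dereferenced.
fn questionable_pointers() -> (*const i64, *const i64, *const i64) {
    let values = [1_i64, 2, 3];

    // A null pointer.
    let null: *const i64 = ptr::null();

    // A pointer far past the end of the array.
    let out_of_bounds = values.as_ptr().wrapping_add(values.len() + 10);

    // A pointer one byte into the first element, so it is not aligned
    // for `i64`.
    let misaligned = values.as_ptr().cast::<u8>().wrapping_add(1).cast::<i64>();

    (null, out_of_bounds, misaligned)
}

fn main() {
    let (null, out_of_bounds, misaligned) = questionable_pointers();
    println!("{null:p} {out_of_bounds:p} {misaligned:p}");
}
```

All three pointers also dangle once the function returns, because `values` has gone out of scope. Safe Rust accepts this because a raw pointer is only data until someone acts on it.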

</details>
52
src/unsafe-deep-dive/foundations/less-powerful.md
Normal file
@ -0,0 +1,52 @@
---
minutes: 10
---

# Less powerful than it seems

The `unsafe` keyword does not allow you to break Rust.

```rust,ignore
use std::mem::transmute;

let orig = b"RUST";
let n: i32 = unsafe { transmute(orig) };

println!("{n}")
```

<details>

## Suggested outline

- Request that someone explains what `std::mem::transmute` does
- Discuss why it doesn't compile
- Fix the code

## Expected compiler output

```ignore
   Compiling playground v0.0.1 (/playground)
error[E0512]: cannot transmute between types of different sizes, or dependently-sized types
 --> src/main.rs:5:27
  |
5 |     let n: i32 = unsafe { transmute(orig) };
  |                           ^^^^^^^^^
  |
  = note: source type: `&[u8; 4]` (64 bits)
  = note: target type: `i32` (32 bits)
```

## Suggested change

```diff
- let n: i32 = unsafe { transmute(orig) };
+ let n: i64 = unsafe { transmute(orig) };
```
```
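
A possible follow-up for the discussion, offered as a sketch rather than as the intended solution: the change above compiles on 64-bit targets because it reinterprets the reference (an address) as an integer. If the goal is to reinterpret the four bytes themselves, one option is to dereference `orig` first so that source and target are both 32 bits wide. The helper name `bytes_to_i32` is hypothetical.

```rust
use std::mem::transmute;

/// Hypothetical helper: reinterprets four bytes as an `i32` in two ways.
fn bytes_to_i32(orig: &[u8; 4]) -> (i32, i32) {
    // SAFETY: `[u8; 4]` and `i32` have the same size, and every bit
    // pattern is a valid `i32`.
    let via_transmute: i32 = unsafe { transmute::<[u8; 4], i32>(*orig) };

    // The standard library offers the same conversion as a safe function.
    let via_safe_api = i32::from_ne_bytes(*orig);

    (via_transmute, via_safe_api)
}

fn main() {
    let (a, b) = bytes_to_i32(b"RUST");
    println!("{a} {b}");
}
```

The printed number depends on the byte order of the target, which is why the sketch uses `from_ne_bytes` (native endianness) for the comparison.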

## Notes on less familiar Rust

- the `b` prefix on a string literal makes it a byte string literal, whose type
  is a reference to a fixed-size byte array (`&[u8; 4]` here) rather than a
  string slice (`&str`)

</details>
98
src/unsafe-deep-dive/foundations/what-is-unsafe.md
Normal file
@ -0,0 +1,98 @@
---
minutes: 6
---

# What is “unsafety”?

Unsafe Rust is a superset of Safe Rust.

Let's create a list of things that are enabled by the `unsafe` keyword.

<details>

## Definitions from authoritative docs:

From the [unsafe keyword's documentation]():

> Code or interfaces whose memory safety cannot be verified by the type system.
>
> ...
>
> Here are the abilities Unsafe Rust has in addition to Safe Rust:
>
> - Dereference raw pointers
> - Implement unsafe traits
> - Call unsafe functions
> - Mutate statics (including external ones)
> - Access fields of unions

From the [reference](https://doc.rust-lang.org/reference/unsafety.html)

> The following language level features cannot be used in the safe subset of
> Rust:
>
> - Dereferencing a raw pointer.
> - Reading or writing a mutable or external static variable.
> - Accessing a field of a union, other than to assign to it.
> - Calling an unsafe function (including an intrinsic or foreign function).
> - Calling a safe function marked with a target_feature from a function that
>   does not have a target_feature attribute enabling the same features (see
>   attributes.codegen.target_feature.safety-restrictions).
> - Implementing an unsafe trait.
> - Declaring an extern block.
> - Applying an unsafe attribute to an item.
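
To make a few of these abilities concrete, here is a small sketch that exercises three of them: dereferencing a raw pointer, calling an unsafe function, and reading a union field. The names `IntOrFloat`, `read_u32`, and `demo` are hypothetical and not part of the course material.

```rust
/// A union whose two fields share the same four bytes.
union IntOrFloat {
    int: u32,
    float: f32,
}

/// # Safety
///
/// `ptr` must be non-null, aligned, and point to an initialized `u32`.
unsafe fn read_u32(ptr: *const u32) -> u32 {
    // Ability: dereferencing a raw pointer.
    // SAFETY: the caller upholds the contract documented above.
    unsafe { *ptr }
}

fn demo() -> (u32, u32) {
    let x = 7_u32;

    // Ability: calling an unsafe function.
    // SAFETY: the pointer comes from a reference to a live `u32`.
    let from_pointer = unsafe { read_u32(&x) };

    let value = IntOrFloat { float: 1.0 };

    // Ability: reading a field of a union.
    // SAFETY: both fields are four bytes wide and every bit pattern is a
    // valid `u32`.
    let from_union = unsafe { value.int };

    (from_pointer, from_union)
}

fn main() {
    let (from_pointer, from_union) = demo();
    println!("{from_pointer} {from_union:#x}");
}
```

Each `unsafe` block carries a `SAFETY` comment naming the pre-condition it relies on, a convention the class can use when reviewing code later.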

## Group exercise

> You may have a group of learners who are not familiar with each other yet.
> This is a way for you to gather some data about their confidence levels and
> the psychological safety that they're feeling.

### Part 1: Informal definition

> Use this to gauge the confidence level of the group. If they are uncertain,
> then tailor the next section to be more directed.

Ask the class: **By raising your hand, indicate if you would feel comfortable
defining unsafe?**

If anyone's feeling confident, allow them to try to explain.

### Part 2: Evidence gathering

Ask the class to spend 3-5 minutes on the following:

- Find a use of the unsafe keyword. What contract/invariant/pre-condition is
  being established or satisfied?
- Write down terms that need to be defined (unsafe, memory safety, soundness,
  undefined behavior)

### Part 3: Write a working definition

### Part 4: Remarks

Mention that we'll be reviewing our definition at the end of the day.

## Note: Avoid detailed discussion about precise semantics of memory safety

It's possible that the group will slide into a discussion about the precise
semantics of what memory safety actually is and how to define pointer validity.
This isn't a productive line of discussion. It can undermine confidence in less
experienced learners.

Perhaps refer people who wish to discuss this to the discussion within the
official [documentation for pointer types] (excerpt below) as a place for
further research.

> Many functions in [this module] take raw pointers as arguments and read from
> or write to them. For this to be safe, these pointers must be _valid_ for the
> given access.
>
> ...
>
> The precise rules for validity are not determined yet.

[this module]: https://doc.rust-lang.org/std/ptr/index.html
[documentation for pointer types]: https://doc.rust-lang.org/std/ptr/index.html#safety

</details>
48
src/unsafe-deep-dive/foundations/when-is-unsafe-used.md
Normal file
@ -0,0 +1,48 @@
---
minutes: 2
---

# When is unsafe used?

The unsafe keyword indicates that the programmer is responsible for upholding
Rust's safety guarantees.

The keyword has two roles:

- define pre-conditions that must be satisfied
- assert to the compiler (= promise) that those defined pre-conditions are
  satisfied

## Further references

- [The unsafe keyword chapter of the Rust Reference](https://doc.rust-lang.org/reference/unsafe-keyword.html)

<details>

Places where pre-conditions can be defined (Role 1)

- [unsafe functions] (`unsafe fn foo() { ... }`). Example: `get_unchecked`
  method on slices, which requires callers to verify that the index is
  in-bounds.
- [unsafe traits] (`unsafe trait`). Examples: [`Send`] and [`Sync`] marker traits
  in the standard library.

Places where pre-conditions must be satisfied (Role 2)

- unsafe blocks (`unsafe { ... }`)
- implementing unsafe traits (`unsafe impl`)
- access external items (`unsafe extern`)
- adding
  [unsafe attributes](https://doc.rust-lang.org/reference/attributes.html) to an
  item. Examples: [`export_name`], [`link_section`] and [`no_mangle`]. Usage:
`#[unsafe(no_mangle)]`
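
The two roles can be sketched side by side. In the example below, which uses the hypothetical names `nth_unchecked` and `nth_or_zero`, the `unsafe fn` defines a pre-condition (Role 1) and the `unsafe` block in the safe wrapper promises that the pre-condition holds (Role 2).

```rust
/// Role 1: the `unsafe fn` defines a pre-condition for its callers.
///
/// # Safety
///
/// `index` must be less than `items.len()`.
unsafe fn nth_unchecked(items: &[u32], index: usize) -> u32 {
    // SAFETY: the caller promises that `index` is in bounds.
    unsafe { *items.get_unchecked(index) }
}

/// Role 2: the `unsafe` block asserts that the pre-condition is satisfied.
fn nth_or_zero(items: &[u32], index: usize) -> u32 {
    if index < items.len() {
        // SAFETY: `index` was checked against `items.len()` just above.
        unsafe { nth_unchecked(items, index) }
    } else {
        0
    }
}

fn main() {
    let items = [10, 20, 30];
    println!("{} {}", nth_or_zero(&items, 1), nth_or_zero(&items, 3));
}
```

Callers of `nth_or_zero` never see `unsafe`, because the wrapper has taken on the obligation for them.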

[unsafe functions]: https://doc.rust-lang.org/reference/unsafe-keyword.html#unsafe-functions-unsafe-fn
[unsafe traits]: https://doc.rust-lang.org/reference/unsafe-keyword.html#unsafe-traits-unsafe-trait
[`export_name`]: https://doc.rust-lang.org/reference/abi.html#the-export_name-attribute
[`link_section`]: https://doc.rust-lang.org/reference/abi.html#the-link_section-attribute
[`no_mangle`]: https://doc.rust-lang.org/reference/abi.html#the-no_mangle-attribute
[`Send`]: https://doc.rust-lang.org/std/marker/trait.Send.html
[`Sync`]: https://doc.rust-lang.org/std/marker/trait.Sync.html

</details>
24
src/unsafe-deep-dive/motivations.md
Normal file
@ -0,0 +1,24 @@
---
minutes: 1
---

# Motivations

We know that writing code without the guarantees that Rust provides ...

> “Use-after-free (UAF), integer overflows, and out of bounds (OOB) reads/writes
> comprise 90% of vulnerabilities with OOB being the most common.”
>
> -- **Jeff Vander Stoep and Chong Zhang**, Google.
> "[Queue the Hardening Enhancements](https://security.googleblog.com/2019/05/queue-hardening-enhancements.html)"

... so why is `unsafe` part of the language?

{{%segment outline}}

<details>

The `unsafe` keyword exists because there is no compiler technology available
today that makes it obsolete. Compilers cannot verify everything.

</details>
30
src/unsafe-deep-dive/motivations/data-structures.md
Normal file
@ -0,0 +1,30 @@
---
minutes: 5
---

# Data Structures

Some families of data structures are impossible to create in safe Rust.

- graphs
- bit twiddling
- self-referential types
- intrusive data structures

<details>

Graphs: General-purpose graphs cannot be created as they may need to represent
cycles. Cycles are impossible for the type system to reason about.

Bit twiddling: Overloading bits with multiple meanings. Examples include using
the NaN bits in `f64` for some other purpose or the higher-order bits of
pointers on `x86_64` platforms. This is somewhat common when writing language
interpreters to keep representations within the word size of the target
platform.

Self-referential types are too hard for the borrow checker to verify.

Intrusive data structures: store structural metadata (like pointers to other
elements) inside the elements themselves, which requires careful handling of
aliasing.
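
As a small illustration of the graph case, the sketch below links two heap-allocated nodes into a cycle using raw pointers. The type `Node` and the function `follow_cycle` are hypothetical. All accesses go through the raw pointers returned by `Box::into_raw`, and both nodes are freed exactly once at the end.

```rust
/// Hypothetical node type for a two-element cycle.
struct Node {
    value: u32,
    next: *mut Node,
}

/// Builds the cycle a -> b -> a, follows it back to `a`, and returns the
/// value found there.
fn follow_cycle() -> u32 {
    let a = Box::into_raw(Box::new(Node { value: 1, next: std::ptr::null_mut() }));
    let b = Box::into_raw(Box::new(Node { value: 2, next: std::ptr::null_mut() }));

    // SAFETY: `a` and `b` come from `Box::into_raw`, so they are non-null,
    // aligned, and point to initialized nodes. Nothing else accesses the
    // nodes, and each one is freed exactly once below.
    unsafe {
        (*a).next = b;
        (*b).next = a;

        // Two hops around the cycle lead back to `a`.
        let back_at_a = (*(*a).next).next;
        let value = (*back_at_a).value;

        drop(Box::from_raw(a));
        drop(Box::from_raw(b));

        value
    }
}

fn main() {
    println!("{}", follow_cycle());
}
```

With references instead of raw pointers, the borrow checker would reject the second assignment, since each node would need to both borrow and be borrowed by the other.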

</details>
245
src/unsafe-deep-dive/motivations/interop.md
Normal file
@ -0,0 +1,245 @@
---
minutes: 5
---

> TODO: Refactor this content into multiple slides as this slide is intended as
> an introduction to the motivations only, rather than to be an elaborate
> discussion of the whole problem.

# Interoperability

Language interoperability allows you to:

- Call functions written in other languages from Rust
- Write functions in Rust that are callable from other languages

However, this requires unsafe.

```rust,editable,ignore
unsafe extern "C" {
    safe fn random() -> libc::c_long;
}

fn main() {
    let a = random() as i64;
    println!("{a:?}");
}
```

<details>

The Rust compiler can't enforce any safety guarantees for programs that it
hasn't compiled, so it delegates that responsibility to you through the unsafe
keyword.

The code example we're seeing shows how to call the random function provided by
libc within Rust. libc is available to scripts in the Rust Playground.

This uses Rust's _foreign function interface_.

This isn't the only style of interoperability, but it is the method that's
needed if you want to work between Rust and some other language in a zero-cost
way. Another important strategy is message passing.

Message passing avoids unsafe, but serialization, allocation, data transfer and
parsing all take energy and time.

## Answers to questions

- _Where does "random" come from?_\
  libc is dynamically linked to Rust programs by default, allowing our code to
  rely on its symbols, including `random`, being available to our program.
- _What is the "safe" keyword?_\
  It allows callers to call the function without needing to wrap that call in
  `unsafe`. The [`safe` function qualifier] was introduced in the 2024 edition
  of Rust and can only be used within `extern` blocks. It was introduced because
  `unsafe` became a mandatory qualifier for `extern` blocks in that edition.
- _What is the [`std::ffi::c_long`] type?_\
  According to the C standard, an integer that's at least 32 bits wide. On
  today's systems, it's an `i32` on Windows and an `i64` on Linux.

[`safe` function qualifier]: https://doc.rust-lang.org/reference/safe-keyword.html
[`std::ffi::c_long`]: https://doc.rust-lang.org/std/ffi/type.c_long.html
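
If someone asks how wide `c_long` is on their own machine, a quick check like the following can be run locally. It is only a sketch, and the helper name `c_long_bits` is hypothetical.

```rust
use std::ffi::c_long;
use std::mem::size_of;

/// Hypothetical helper: width of C's `long` on the compilation target.
fn c_long_bits() -> usize {
    size_of::<c_long>() * 8
}

fn main() {
    println!("c_long is {} bits wide on this target", c_long_bits());
}
```

The result is a property of the compilation target, not of the source code, which is what makes hard-coding `i32` or `i64` in an `extern` declaration risky.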

## Consideration: type safety

Modify the code example to remove the need for type casting later. Discuss the
potential UB - long's width is defined by the target.

```rust
unsafe extern "C" {
    safe fn random() -> i64;
}

fn main() {
    let a = random();
    println!("{a:?}");
}
```

> Changes from the original:
>
> ```diff
>   unsafe extern "C" {
> -     safe fn random() -> libc::c_long;
> +     safe fn random() -> i64;
>   }
>
>   fn main() {
> -     let a = random() as i64;
> +     let a = random();
>       println!("{a:?}");
>   }
> ```

It's also possible to completely ignore the intended type and create undefined
behavior in multiple ways. The code below may run and produce output, but it
can also crash with a stack overflow. It may also produce illegal `char`
values. Although `char` is represented in 4 bytes (32 bits),
[not all bit patterns are permitted as a `char`][char].

Stress that the Rust compiler will trust that the wrapper is telling the truth.

[char]: https://doc.rust-lang.org/std/primitive.char.html#validity-and-layout

<!-- TODO(timclicks): add libc to the mdbook build system so that the example can be tested -->

```rust,ignore
unsafe extern "C" {
    safe fn random() -> [char; 2];
}

fn main() {
    let a = random();
    println!("{a:?}");
}
```

> Changes from the original:
>
> ```diff
>   unsafe extern "C" {
> -     safe fn random() -> libc::c_long;
> +     safe fn random() -> [char; 2];
>   }
>
>   fn main() {
> -     let a = random() as i64;
> +     let a = random();
>       println!("{a:?}");
>   }
> ```

> Attempting to print a `[char; 2]` from randomly generated input will often
> produce strange output, including:
>
> ```ignore
> thread 'main' panicked at library/std/src/io/stdio.rs:1165:9:
> failed printing to stdout: Bad address (os error 14)
> ```
>
> ```ignore
> thread 'main' has overflowed its stack
> fatal runtime error: stack overflow, aborting
> ```

Mention that type safety is generally not a large concern in practice. Tools
that produce wrappers automatically, such as bindgen, are excellent at reading
header files and producing values of the correct type.
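
To back up the earlier remark that not every 32-bit pattern is a valid `char`, the following safe sketch probes a few values with `char::from_u32`, which returns `None` for invalid ones. The helper name `is_valid_char` is hypothetical.

```rust
/// Hypothetical helper: reports whether `bits` is a valid `char`.
fn is_valid_char(bits: u32) -> bool {
    char::from_u32(bits).is_some()
}

fn main() {
    // 0x41 is 'A', 0xD800 is a surrogate, 0x10FFFF is the largest valid
    // code point, and 0x110000 is one past it.
    for bits in [0x41, 0xD800, 0x10FFFF, 0x110000] {
        println!("{bits:#x}: {}", is_valid_char(bits));
    }
}
```

A mistyped `extern` declaration skips this validation entirely, so an invalid value can reach code that assumes every `char` is valid.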

## Consideration: Ownership and lifetime management

While libc's `random` function doesn't use pointers, many C functions do. This
creates many more possibilities for unsoundness.

- both sides might attempt to free the memory (double free)
- both sides can attempt to write to the data

For example, some C libraries expose functions that write to static buffers that
are re-used between calls.

<!--

TODO(timclicks): consider adding a safety comment in the docstring that discusses thread safety and the ownership of the returned pointer.

See <https://github.com/google/comprehensive-rust/pull/2806#discussion_r2207171041>.

-->

<!-- TODO(timclicks): add libc to the mdbook build system so that the example can be tested -->

```rust,ignore
use std::ffi::{CStr, c_char};
use std::time::{SystemTime, UNIX_EPOCH};

unsafe extern "C" {
    /// Create a formatted time based on time `t`, including trailing newline.
    /// Read `man 3 ctime` for details.
    fn ctime(t: *const libc::time_t) -> *const c_char;
}

unsafe fn format_timestamp<'a>(t: u64) -> &'a str {
    let t = t as libc::time_t;

    unsafe {
        let fmt_ptr = ctime(&t);
        CStr::from_ptr(fmt_ptr).to_str().unwrap()
    }
}

fn main() {
    let now = SystemTime::now().duration_since(UNIX_EPOCH).unwrap();

    let now = now.as_secs();
    let now_fmt = unsafe { format_timestamp(now) };
    print!("now (1): {}", now_fmt);

    let future = now + 60;
    let future_fmt = unsafe { format_timestamp(future) };
    print!("future: {}", future_fmt);

    print!("now (2): {}", now_fmt);
}
```

> Aside: Lifetimes in the `format_timestamp()` function
>
> Neither `'a`, nor `'static` correctly describe the lifetime of the string
> that's returned. Rust treats it as an immutable reference, but subsequent
> calls to `ctime` will overwrite the static buffer that the string occupies.

Bonus points: can anyone spot the lifetime bug? `format_timestamp()` should
return a `&'static str`.
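
One way to sidestep the shared static buffer, shown here only as a sketch, is to copy the bytes into an owned `String` before `ctime` can be called again. The wrapper name `format_timestamp_owned` and the `TimeT` alias are hypothetical. The sketch assumes a Unix-like target where `time_t` has the same representation as `c_long`, which holds on common 64-bit Linux and macOS targets but is not guaranteed elsewhere.

```rust
use std::ffi::{c_char, c_long, CStr};

/// Assumption: `time_t` matches `c_long` on this target.
type TimeT = c_long;

unsafe extern "C" {
    /// See `man 3 ctime`. Returns a pointer into a static buffer, or null.
    fn ctime(t: *const TimeT) -> *const c_char;
}

/// Hypothetical wrapper that returns an owned copy of the formatted time.
///
/// # Safety
///
/// `ctime` is not thread-safe: no other thread may call `ctime` (or a
/// function sharing its static buffer) while this function runs.
unsafe fn format_timestamp_owned(t: TimeT) -> Option<String> {
    // SAFETY: `&t` is valid for the duration of the call, and the caller
    // guarantees exclusive use of the static buffer. A non-null result
    // points to a NUL-terminated string.
    unsafe {
        let ptr = ctime(&t);
        if ptr.is_null() {
            return None;
        }
        // Copy out of the static buffer before it can be overwritten.
        Some(CStr::from_ptr(ptr).to_string_lossy().into_owned())
    }
}

fn main() {
    // SAFETY: this program is single-threaded.
    let first = unsafe { format_timestamp_owned(0) };
    let second = unsafe { format_timestamp_owned(60) };

    // `first` is unaffected by the second call because it owns its bytes.
    print!("{}", first.unwrap_or_default());
    print!("{}", second.unwrap_or_default());
}
```

The copy costs an allocation, but the returned value no longer aliases memory that C code may overwrite, so its lifetime is described correctly by ordinary ownership.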

## Consideration: Representation mismatch

Different programming languages have made different design decisions and this
can create impedance mismatches between different domains.

Consider string handling. C++ defines `std::string`, which has an incompatible
memory layout with Rust's `String` type. `String` also requires text to be
encoded as UTF-8, whereas `std::string` does not. In C, text is represented by a
null-terminated sequence of bytes (`char*`).

```rust
fn main() {
    let c_repr = b"Hello, C\0";
    let rust_repr = (b"Hello, Rust", 11);

    let c: &str = unsafe {
        // `c_char` is `i8` on some targets and `u8` on others, so cast to the
        // alias rather than to a fixed integer type.
        let ptr = c_repr.as_ptr() as *const std::ffi::c_char;
        std::ffi::CStr::from_ptr(ptr).to_str().unwrap()
    };
    println!("{c}");

    let rust: &str = unsafe {
        let ptr = rust_repr.0.as_ptr();
        let bytes = std::slice::from_raw_parts(ptr, rust_repr.1);
        std::str::from_utf8_unchecked(bytes)
    };
    println!("{rust}");
}
```
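
For contrast, the standard library's `CString` and `CStr` types handle the same conversions with checks instead of promises. The sketch below is an assumption about how one might present that alternative, and the helper name `round_trip` is hypothetical.

```rust
use std::ffi::{CStr, CString};

/// Hypothetical helper: converts Rust text to a C string and back using
/// only safe, checked conversions.
fn round_trip(text: &str) -> Option<String> {
    // Rust -> C: appends the trailing NUL byte. This fails if `text`
    // already contains a NUL, which C would treat as the end of the string.
    let c_string = CString::new(text).ok()?;

    // Borrow the owned C string as a `&CStr`, the type used for C strings
    // that Rust does not own.
    let c_view: &CStr = c_string.as_c_str();

    // C -> Rust: validates that the bytes are UTF-8.
    c_view.to_str().ok().map(str::to_owned)
}

fn main() {
    println!("{:?}", round_trip("Hello, C"));
    println!("{:?}", round_trip("interior\0nul"));
}
```

The checked versions make both failure modes visible in the types: interior NUL bytes on the way to C, and invalid UTF-8 on the way back.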

</details>
10
src/unsafe-deep-dive/motivations/performance.md
Normal file
@ -0,0 +1,10 @@
---
minutes: 5
---

# Performance

> TODO: Stub for now

It's easy to think of performance as the main reason for unsafe, but
high-performance code makes up only a minority of unsafe blocks.
46
src/unsafe-deep-dive/setup.md
Normal file
@ -0,0 +1,46 @@
---
minutes: 2
---

# Setting Up

## Local Rust installation

You should have a Rust compiler installed that supports the 2024 edition of the
language, which is any version of rustc higher than 1.84.

```console
$ rustc --version
rustc 1.87
```

<!--

TODO (tim): Adding this for later while I'm here.
TODO (tim): We should be able to avoid this by just relying on the `cc` crate

We recommend that you install the [Bazel build system](https://bazel.build/install).
This will allow you to easily compile projects that combine multiple languages.

-->

## (Optional) Create a local instance of the course

```console
$ git clone --depth=1 https://github.com/google/comprehensive-rust.git
Cloning into 'comprehensive-rust'...
...
$ cd comprehensive-rust
$ cargo install-tools
...
$ cargo serve # then open http://127.0.0.1:3000/ in a browser
```

<details>

Ask everyone to confirm that they can run `rustc` and that its version is
higher than 1.84.

For those who cannot, tell them that we'll resolve that in the break.

</details>
46
src/unsafe-deep-dive/welcome.md
Normal file
@ -0,0 +1,46 @@
---
course: Unsafe
session: Day 1 Morning
target_minutes: 300
---

# Welcome to Unsafe Rust

> IMPORTANT: THIS MODULE IS IN AN EARLY STAGE OF DEVELOPMENT
>
> Please do not consider this module of Comprehensive Rust to be complete. With
> that in mind, your feedback, comments, and especially your concerns, are very
> welcome.
>
> To comment on this module's development, please use the
> [GitHub issue tracker].

[GitHub issue tracker]: https://github.com/google/comprehensive-rust/issues

The `unsafe` keyword is easy to type, but hard to master. When used
appropriately, it forms a useful and indeed essential part of the Rust
programming language.

By the end of this deep dive, you'll know how to work with `unsafe` code, review
others' changes that include the `unsafe` keyword, and produce your own.

What you'll learn:

- What the terms undefined behavior, soundness, and safety mean
- Why the `unsafe` keyword exists in the Rust language
- How to write your own code using `unsafe` safely
- How to review `unsafe` code

## Links to other sections of the course

The `unsafe` keyword is also covered in:

- _Rust Fundamentals_, the main module of Comprehensive Rust, includes a session
  on [Unsafe Rust] in its last day.
- _Rust in Chromium_ discusses how to [interoperate with C++]. Consult that
  material if you are looking into FFI.
- _Bare Metal Rust_ uses unsafe heavily to interact with the underlying host,
  among other things.

[interoperate with C++]: ../chromium/interoperability-with-cpp.md
[Unsafe Rust]: ../unsafe-rust.html