Mirror of https://github.com/google/comprehensive-rust.git, synced 2025-06-16 14:17:34 +02:00
Update Concurrency course with times (#2007)
As I mentioned in #1536:

* Break into segments at approximately the places @fw-immunant put breaks
* Move all of the files into `src/concurrency`
* Add timings and segment/session metadata so course outlines appear

There's room for more work here, including some additional feedback from @fw-immunant after the session I observed, but let's do one step at a time :)
Committed by GitHub
Parent: a03b7e68e5
Commit: face5af783
@ -1,21 +0,0 @@
[package]
name = "comprehensive-rust"
version = "0.1.0"
edition = "2021"
publish = false

[[bin]]
name = "dining-philosophers"
path = "concurrency/dining-philosophers.rs"

[[bin]]
name = "link-checker"
path = "concurrency/link-checker.rs"

[dependencies]
reqwest = { version = "0.12.4", features = ["blocking"] }
scraper = "0.19.0"
thiserror = "1.0.59"

[dev-dependencies]
tempfile = "3.10.1"
@ -1,17 +0,0 @@
# Exercises

To practice your Async Rust skills, we again have two exercises for you:

- Dining philosophers: we already saw this problem in the morning. This time you
  are going to implement it with Async Rust.

- A Broadcast Chat Application: this is a larger project that allows you to
  experiment with more advanced Async Rust features.

<details>

After looking at the exercises, you can look at the [solutions] provided.

[solutions]: solutions-afternoon.md

</details>
@ -1,109 +0,0 @@
# Broadcast Chat Application

In this exercise, we want to use our new knowledge to implement a broadcast chat
application. We have a chat server that the clients connect to and publish their
messages. The client reads user messages from the standard input, and sends them
to the server. The chat server broadcasts each message that it receives to all
the clients.

For this, we use [a broadcast channel][1] on the server, and
[`tokio_websockets`][2] for the communication between the client and the server.

Create a new Cargo project and add the following dependencies:

_Cargo.toml_:

<!-- File Cargo.toml -->

```toml
{{#include chat-async/Cargo.toml}}
```

## The required APIs

You are going to need the following functions from `tokio`, `futures_util`, and
[`tokio_websockets`][2]. Spend a few minutes to familiarize yourself with the
API.

- [StreamExt::next()][3] implemented by `WebSocketStream`: for asynchronously
  reading messages from a WebSocket stream.
- [SinkExt::send()][4] implemented by `WebSocketStream`: for asynchronously
  sending messages on a WebSocket stream.
- [Lines::next_line()][5]: for asynchronously reading user messages from the
  standard input.
- [Sender::subscribe()][6]: for subscribing to a broadcast channel.

## Two binaries

By default, a Cargo project has a single binary, built from `src/main.rs`. In
this project, we need two binaries: one for the client, and one for the server.
You could make them two separate Cargo projects, but we are going to put them
in a single Cargo project with two binaries. For this to work, the client and
the server code should go under `src/bin` (see the [documentation][7]).

Copy the following server and client code into `src/bin/server.rs` and
`src/bin/client.rs`, respectively. Your task is to complete these files as
described below.

_src/bin/server.rs_:

<!-- File src/bin/server.rs -->

```rust,compile_fail
{{#include chat-async/src/bin/server.rs:setup}}

{{#include chat-async/src/bin/server.rs:handle_connection}}

    // TODO: For a hint, see the description of the task below.

{{#include chat-async/src/bin/server.rs:main}}
```

_src/bin/client.rs_:

<!-- File src/bin/client.rs -->

```rust,compile_fail
{{#include chat-async/src/bin/client.rs:setup}}

    // TODO: For a hint, see the description of the task below.

}
```

## Running the binaries

Run the server with:

```shell
cargo run --bin server
```

and the client with:

```shell
cargo run --bin client
```

## Tasks

- Implement the `handle_connection` function in `src/bin/server.rs`.
  - Hint: Use `tokio::select!` for concurrently performing two tasks in a
    continuous loop. One task receives messages from the client and broadcasts
    them. The other sends messages received by the server to the client.
- Complete the main function in `src/bin/client.rs`.
  - Hint: As before, use `tokio::select!` in a continuous loop for concurrently
    performing two tasks: (1) reading user messages from standard input and
    sending them to the server, and (2) receiving messages from the server and
    displaying them for the user.
- Optional: Once you are done, change the code to broadcast messages to all
  clients except the sender of the message.
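
For the optional task, one possible approach (an assumption, not necessarily the course's intended solution) is to tag every broadcast message with the address of the client that sent it, and have each connection skip the messages it originated. The sketch below models only that filtering step with the standard library. The `Tagged` type and the `should_forward` helper are hypothetical names; in the server, the broadcast channel's item type would presumably change from `String` to something like `Tagged`.

```rust
use std::net::SocketAddr;

/// Hypothetical message type: the text plus the address of the sending client.
#[derive(Clone, Debug, PartialEq)]
struct Tagged {
    from: SocketAddr,
    text: String,
}

/// Returns true when `msg` should be delivered to the connection at `me`,
/// that is, when `me` is not the client that sent it.
fn should_forward(me: SocketAddr, msg: &Tagged) -> bool {
    msg.from != me
}

fn main() {
    let alice: SocketAddr = "127.0.0.1:5001".parse().unwrap();
    let bob: SocketAddr = "127.0.0.1:5002".parse().unwrap();
    let msg = Tagged { from: alice, text: "hello".to_string() };

    // Each connection applies the same check before writing to its socket.
    for me in [alice, bob] {
        if should_forward(me, &msg) {
            println!("deliver to {me}: {}", msg.text);
        }
    }
}
```

Inside `handle_connection`, the same comparison would run on each value received from the broadcast receiver, using the `addr` argument as `me`.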

[1]: https://docs.rs/tokio/latest/tokio/sync/broadcast/fn.channel.html
[2]: https://docs.rs/tokio-websockets/
[3]: https://docs.rs/futures-util/0.3.28/futures_util/stream/trait.StreamExt.html#method.next
[4]: https://docs.rs/futures-util/0.3.28/futures_util/sink/trait.SinkExt.html#method.send
[5]: https://docs.rs/tokio/latest/tokio/io/struct.Lines.html#method.next_line
[6]: https://docs.rs/tokio/latest/tokio/sync/broadcast/struct.Sender.html#method.subscribe
[7]: https://doc.rust-lang.org/cargo/reference/cargo-targets.html#binaries
@ -1,10 +0,0 @@
[package]
name = "chat-async"
version = "0.1.0"
edition = "2021"

[dependencies]
futures-util = { version = "0.3.30", features = ["sink"] }
http = "1.1.0"
tokio = { version = "1.37.0", features = ["full"] }
tokio-websockets = { version = "0.8.2", features = ["client", "fastrand", "server", "sha1_smol"] }
@ -1,58 +0,0 @@
|
||||
// Copyright 2023 Google LLC
|
||||
//
|
||||
// Licensed under the Apache License, Version 2.0 (the "License");
|
||||
// you may not use this file except in compliance with the License.
|
||||
// You may obtain a copy of the License at
|
||||
//
|
||||
// http://www.apache.org/licenses/LICENSE-2.0
|
||||
//
|
||||
// Unless required by applicable law or agreed to in writing, software
|
||||
// distributed under the License is distributed on an "AS IS" BASIS,
|
||||
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
// See the License for the specific language governing permissions and
|
||||
// limitations under the License.
|
||||
|
||||
// ANCHOR: solution
|
||||
// ANCHOR: setup
|
||||
use futures_util::stream::StreamExt;
|
||||
use futures_util::SinkExt;
|
||||
use http::Uri;
|
||||
use tokio::io::{AsyncBufReadExt, BufReader};
|
||||
use tokio_websockets::{ClientBuilder, Message};
|
||||
|
||||
#[tokio::main]
|
||||
async fn main() -> Result<(), tokio_websockets::Error> {
|
||||
let (mut ws_stream, _) =
|
||||
ClientBuilder::from_uri(Uri::from_static("ws://127.0.0.1:2000"))
|
||||
.connect()
|
||||
.await?;
|
||||
|
||||
let stdin = tokio::io::stdin();
|
||||
let mut stdin = BufReader::new(stdin).lines();
|
||||
|
||||
// ANCHOR_END: setup
|
||||
// Continuous loop for concurrently sending and receiving messages.
|
||||
loop {
|
||||
tokio::select! {
|
||||
incoming = ws_stream.next() => {
|
||||
match incoming {
|
||||
Some(Ok(msg)) => {
|
||||
if let Some(text) = msg.as_text() {
|
||||
println!("From server: {}", text);
|
||||
}
|
||||
},
|
||||
Some(Err(err)) => return Err(err.into()),
|
||||
None => return Ok(()),
|
||||
}
|
||||
}
|
||||
res = stdin.next_line() => {
|
||||
match res {
|
||||
Ok(None) => return Ok(()),
|
||||
Ok(Some(line)) => ws_stream.send(Message::text(line.to_string())).await?,
|
||||
Err(err) => return Err(err.into()),
|
||||
}
|
||||
}
|
||||
|
||||
}
|
||||
}
|
||||
}
|
@ -1,83 +0,0 @@
|
||||
// Copyright 2023 Google LLC
|
||||
//
|
||||
// Licensed under the Apache License, Version 2.0 (the "License");
|
||||
// you may not use this file except in compliance with the License.
|
||||
// You may obtain a copy of the License at
|
||||
//
|
||||
// http://www.apache.org/licenses/LICENSE-2.0
|
||||
//
|
||||
// Unless required by applicable law or agreed to in writing, software
|
||||
// distributed under the License is distributed on an "AS IS" BASIS,
|
||||
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
// See the License for the specific language governing permissions and
|
||||
// limitations under the License.
|
||||
|
||||
// ANCHOR: solution
|
||||
// ANCHOR: setup
|
||||
use futures_util::sink::SinkExt;
|
||||
use futures_util::stream::StreamExt;
|
||||
use std::error::Error;
|
||||
use std::net::SocketAddr;
|
||||
use tokio::net::{TcpListener, TcpStream};
|
||||
use tokio::sync::broadcast::{channel, Sender};
|
||||
use tokio_websockets::{Message, ServerBuilder, WebSocketStream};
|
||||
// ANCHOR_END: setup
|
||||
|
||||
// ANCHOR: handle_connection
|
||||
async fn handle_connection(
|
||||
addr: SocketAddr,
|
||||
mut ws_stream: WebSocketStream<TcpStream>,
|
||||
bcast_tx: Sender<String>,
|
||||
) -> Result<(), Box<dyn Error + Send + Sync>> {
|
||||
// ANCHOR_END: handle_connection
|
||||
|
||||
ws_stream
|
||||
.send(Message::text("Welcome to chat! Type a message".to_string()))
|
||||
.await?;
|
||||
let mut bcast_rx = bcast_tx.subscribe();
|
||||
|
||||
// A continuous loop for concurrently performing two tasks: (1) receiving
|
||||
// messages from `ws_stream` and broadcasting them, and (2) receiving
|
||||
// messages on `bcast_rx` and sending them to the client.
|
||||
loop {
|
||||
tokio::select! {
|
||||
incoming = ws_stream.next() => {
|
||||
match incoming {
|
||||
Some(Ok(msg)) => {
|
||||
if let Some(text) = msg.as_text() {
|
||||
println!("From client {addr:?} {text:?}");
|
||||
bcast_tx.send(text.into())?;
|
||||
}
|
||||
}
|
||||
Some(Err(err)) => return Err(err.into()),
|
||||
None => return Ok(()),
|
||||
}
|
||||
}
|
||||
msg = bcast_rx.recv() => {
|
||||
ws_stream.send(Message::text(msg?)).await?;
|
||||
}
|
||||
}
|
||||
}
|
||||
// ANCHOR: main
|
||||
}
|
||||
|
||||
#[tokio::main]
|
||||
async fn main() -> Result<(), Box<dyn Error + Send + Sync>> {
|
||||
let (bcast_tx, _) = channel(16);
|
||||
|
||||
let listener = TcpListener::bind("127.0.0.1:2000").await?;
|
||||
println!("listening on port 2000");
|
||||
|
||||
loop {
|
||||
let (socket, addr) = listener.accept().await?;
|
||||
println!("New connection from {addr:?}");
|
||||
let bcast_tx = bcast_tx.clone();
|
||||
tokio::spawn(async move {
|
||||
// Wrap the raw TCP stream into a websocket.
|
||||
let ws_stream = ServerBuilder::new().accept(socket).await?;
|
||||
|
||||
handle_connection(addr, ws_stream, bcast_tx).await
|
||||
});
|
||||
}
|
||||
}
|
||||
// ANCHOR_END: main
|
@ -1,57 +0,0 @@
# Dining Philosophers --- Async

See [dining philosophers](dining-philosophers.md) for a description of the
problem.

As before, you will need a local
[Cargo installation](../../cargo/running-locally.md) for this exercise. Copy the
code below to a file called `src/main.rs`, fill in the blanks, and test that
`cargo run` does not deadlock:

<!-- File src/main.rs -->

```rust,compile_fail
{{#include dining-philosophers-async.rs:Philosopher}}
    // left_fork: ...
    // right_fork: ...
    // thoughts: ...
}

{{#include dining-philosophers-async.rs:Philosopher-think}}

{{#include dining-philosophers-async.rs:Philosopher-eat}}
{{#include dining-philosophers-async.rs:Philosopher-eat-body}}
{{#include dining-philosophers-async.rs:Philosopher-eat-end}}
    // Create forks

    // Create philosophers

    // Make them think and eat

    // Output their thoughts
}
```

Since this time you are using Async Rust, you'll need a `tokio` dependency. You
can use the following `Cargo.toml`:

<!-- File Cargo.toml -->

```toml
[package]
name = "dining-philosophers-async-dine"
version = "0.1.0"
edition = "2021"

[dependencies]
tokio = { version = "1.26.0", features = ["sync", "time", "macros", "rt-multi-thread"] }
```

Also note that this time you have to use the `Mutex` and the `mpsc` module from
the `tokio` crate.

<details>

- Can you make your implementation single-threaded?

</details>
@ -1,119 +0,0 @@
|
||||
// Copyright 2023 Google LLC
|
||||
//
|
||||
// Licensed under the Apache License, Version 2.0 (the "License");
|
||||
// you may not use this file except in compliance with the License.
|
||||
// You may obtain a copy of the License at
|
||||
//
|
||||
// http://www.apache.org/licenses/LICENSE-2.0
|
||||
//
|
||||
// Unless required by applicable law or agreed to in writing, software
|
||||
// distributed under the License is distributed on an "AS IS" BASIS,
|
||||
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
// See the License for the specific language governing permissions and
|
||||
// limitations under the License.
|
||||
|
||||
// ANCHOR: solution
|
||||
// ANCHOR: Philosopher
|
||||
use std::sync::Arc;
|
||||
use tokio::sync::mpsc::{self, Sender};
|
||||
use tokio::sync::Mutex;
|
||||
use tokio::time;
|
||||
|
||||
struct Fork;
|
||||
|
||||
struct Philosopher {
|
||||
name: String,
|
||||
// ANCHOR_END: Philosopher
|
||||
left_fork: Arc<Mutex<Fork>>,
|
||||
right_fork: Arc<Mutex<Fork>>,
|
||||
thoughts: Sender<String>,
|
||||
}
|
||||
|
||||
// ANCHOR: Philosopher-think
|
||||
impl Philosopher {
|
||||
async fn think(&self) {
|
||||
self.thoughts
|
||||
.send(format!("Eureka! {} has a new idea!", &self.name))
|
||||
.await
|
||||
.unwrap();
|
||||
}
|
||||
// ANCHOR_END: Philosopher-think
|
||||
|
||||
// ANCHOR: Philosopher-eat
|
||||
async fn eat(&self) {
|
||||
// Keep trying until we have both forks
|
||||
// ANCHOR_END: Philosopher-eat
|
||||
let (_left_fork, _right_fork) = loop {
|
||||
// Pick up forks...
|
||||
let left_fork = self.left_fork.try_lock();
|
||||
let right_fork = self.right_fork.try_lock();
|
||||
let Ok(left_fork) = left_fork else {
|
||||
// If we didn't get the left fork, drop the right fork if we
|
||||
// have it and let other tasks make progress.
|
||||
drop(right_fork);
|
||||
time::sleep(time::Duration::from_millis(1)).await;
|
||||
continue;
|
||||
};
|
||||
let Ok(right_fork) = right_fork else {
|
||||
// If we didn't get the right fork, drop the left fork and let
|
||||
// other tasks make progress.
|
||||
drop(left_fork);
|
||||
time::sleep(time::Duration::from_millis(1)).await;
|
||||
continue;
|
||||
};
|
||||
break (left_fork, right_fork);
|
||||
};
|
||||
|
||||
// ANCHOR: Philosopher-eat-body
|
||||
println!("{} is eating...", &self.name);
|
||||
time::sleep(time::Duration::from_millis(5)).await;
|
||||
// ANCHOR_END: Philosopher-eat-body
|
||||
|
||||
// The locks are dropped here
|
||||
// ANCHOR: Philosopher-eat-end
|
||||
}
|
||||
}
|
||||
|
||||
static PHILOSOPHERS: &[&str] =
|
||||
&["Socrates", "Hypatia", "Plato", "Aristotle", "Pythagoras"];
|
||||
|
||||
#[tokio::main]
|
||||
async fn main() {
|
||||
// ANCHOR_END: Philosopher-eat-end
|
||||
// Create forks
|
||||
let mut forks = vec![];
|
||||
(0..PHILOSOPHERS.len()).for_each(|_| forks.push(Arc::new(Mutex::new(Fork))));
|
||||
|
||||
// Create philosophers
|
||||
let (philosophers, mut rx) = {
|
||||
let mut philosophers = vec![];
|
||||
let (tx, rx) = mpsc::channel(10);
|
||||
for (i, name) in PHILOSOPHERS.iter().enumerate() {
|
||||
let left_fork = Arc::clone(&forks[i]);
|
||||
let right_fork = Arc::clone(&forks[(i + 1) % PHILOSOPHERS.len()]);
|
||||
philosophers.push(Philosopher {
|
||||
name: name.to_string(),
|
||||
left_fork,
|
||||
right_fork,
|
||||
thoughts: tx.clone(),
|
||||
});
|
||||
}
|
||||
(philosophers, rx)
|
||||
// tx is dropped here, so we don't need to explicitly drop it later
|
||||
};
|
||||
|
||||
// Make them think and eat
|
||||
for phil in philosophers {
|
||||
tokio::spawn(async move {
|
||||
for _ in 0..100 {
|
||||
phil.think().await;
|
||||
phil.eat().await;
|
||||
}
|
||||
});
|
||||
}
|
||||
|
||||
// Output their thoughts
|
||||
while let Some(thought) = rx.recv().await {
|
||||
println!("Here is a thought: {thought}");
|
||||
}
|
||||
}
|
@ -1,50 +0,0 @@
# Dining Philosophers

The dining philosophers problem is a classic problem in concurrency:

> Five philosophers dine together at the same table. Each philosopher has their
> own place at the table. There is a fork between each plate. The dish served is
> a kind of spaghetti which has to be eaten with two forks. Each philosopher can
> only alternately think and eat. Moreover, a philosopher can only eat their
> spaghetti when they have both a left and right fork. Thus two forks will only
> be available when their two nearest neighbors are thinking, not eating. After
> an individual philosopher finishes eating, they will put down both forks.

You will need a local [Cargo installation](../../cargo/running-locally.md) for
this exercise. Copy the code below to a file called `src/main.rs`, fill in the
blanks, and test that `cargo run` does not deadlock:

<!-- File src/main.rs -->

```rust,compile_fail
{{#include dining-philosophers.rs:Philosopher}}
    // left_fork: ...
    // right_fork: ...
    // thoughts: ...
}

{{#include dining-philosophers.rs:Philosopher-think}}

{{#include dining-philosophers.rs:Philosopher-eat}}
        // Pick up forks...
{{#include dining-philosophers.rs:Philosopher-eat-end}}
    // Create forks

    // Create philosophers

    // Make each of them think and eat 100 times

    // Output their thoughts
}
```

You can use the following `Cargo.toml`:

<!-- File Cargo.toml -->

```toml
[package]
name = "dining-philosophers"
version = "0.1.0"
edition = "2021"
```
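
If every philosopher picks up the left fork first, all five can end up holding one fork and waiting forever for the other. One well-known remedy is to give the forks a global order and always lock the lower-numbered fork first. The sketch below is a minimal illustration under that assumption, with the forks reduced to empty mutexes. `fork_order` is a hypothetical helper, not part of the exercise skeleton, and it assumes at least two philosophers.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Hypothetical helper: the two fork indices for philosopher `i` of `n`
/// (with `n >= 2`), ordered so the lower-numbered fork is picked up first.
fn fork_order(i: usize, n: usize) -> (usize, usize) {
    let left = i;
    let right = (i + 1) % n;
    if left < right { (left, right) } else { (right, left) }
}

fn main() {
    let n = 5;
    let forks: Vec<_> = (0..n).map(|_| Arc::new(Mutex::new(()))).collect();

    let handles: Vec<_> = (0..n)
        .map(|i| {
            let (first, second) = fork_order(i, n);
            let first = Arc::clone(&forks[first]);
            let second = Arc::clone(&forks[second]);
            thread::spawn(move || {
                for _ in 0..100 {
                    // Every thread locks in ascending index order, so no
                    // cycle of threads waiting on each other can form.
                    let _a = first.lock().unwrap();
                    let _b = second.lock().unwrap();
                }
            })
        })
        .collect();

    for handle in handles {
        handle.join().unwrap();
    }
    println!("all philosophers finished");
}
```

Only the last philosopher ends up with a swapped pair, because that is the only seat where the right fork has a lower index than the left one.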
@ -1,95 +0,0 @@
|
||||
// Copyright 2022 Google LLC
|
||||
//
|
||||
// Licensed under the Apache License, Version 2.0 (the "License");
|
||||
// you may not use this file except in compliance with the License.
|
||||
// You may obtain a copy of the License at
|
||||
//
|
||||
// http://www.apache.org/licenses/LICENSE-2.0
|
||||
//
|
||||
// Unless required by applicable law or agreed to in writing, software
|
||||
// distributed under the License is distributed on an "AS IS" BASIS,
|
||||
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
// See the License for the specific language governing permissions and
|
||||
// limitations under the License.
|
||||
|
||||
// ANCHOR: solution
|
||||
// ANCHOR: Philosopher
|
||||
use std::sync::{mpsc, Arc, Mutex};
|
||||
use std::thread;
|
||||
use std::time::Duration;
|
||||
|
||||
struct Fork;
|
||||
|
||||
struct Philosopher {
|
||||
name: String,
|
||||
// ANCHOR_END: Philosopher
|
||||
left_fork: Arc<Mutex<Fork>>,
|
||||
right_fork: Arc<Mutex<Fork>>,
|
||||
thoughts: mpsc::SyncSender<String>,
|
||||
}
|
||||
|
||||
// ANCHOR: Philosopher-think
|
||||
impl Philosopher {
|
||||
fn think(&self) {
|
||||
self.thoughts
|
||||
.send(format!("Eureka! {} has a new idea!", &self.name))
|
||||
.unwrap();
|
||||
}
|
||||
// ANCHOR_END: Philosopher-think
|
||||
|
||||
// ANCHOR: Philosopher-eat
|
||||
fn eat(&self) {
|
||||
// ANCHOR_END: Philosopher-eat
|
||||
println!("{} is trying to eat", &self.name);
|
||||
let _left = self.left_fork.lock().unwrap();
|
||||
let _right = self.right_fork.lock().unwrap();
|
||||
|
||||
// ANCHOR: Philosopher-eat-end
|
||||
println!("{} is eating...", &self.name);
|
||||
thread::sleep(Duration::from_millis(10));
|
||||
}
|
||||
}
|
||||
|
||||
static PHILOSOPHERS: &[&str] =
|
||||
&["Socrates", "Hypatia", "Plato", "Aristotle", "Pythagoras"];
|
||||
|
||||
fn main() {
|
||||
// ANCHOR_END: Philosopher-eat-end
|
||||
let (tx, rx) = mpsc::sync_channel(10);
|
||||
|
||||
let forks = (0..PHILOSOPHERS.len())
|
||||
.map(|_| Arc::new(Mutex::new(Fork)))
|
||||
.collect::<Vec<_>>();
|
||||
|
||||
for i in 0..forks.len() {
|
||||
let tx = tx.clone();
|
||||
let mut left_fork = Arc::clone(&forks[i]);
|
||||
let mut right_fork = Arc::clone(&forks[(i + 1) % forks.len()]);
|
||||
|
||||
// To avoid a deadlock, we have to break the symmetry
|
||||
// somewhere. This will swap the forks without deinitializing
|
||||
// either of them.
|
||||
if i == forks.len() - 1 {
|
||||
std::mem::swap(&mut left_fork, &mut right_fork);
|
||||
}
|
||||
|
||||
let philosopher = Philosopher {
|
||||
name: PHILOSOPHERS[i].to_string(),
|
||||
thoughts: tx,
|
||||
left_fork,
|
||||
right_fork,
|
||||
};
|
||||
|
||||
thread::spawn(move || {
|
||||
for _ in 0..100 {
|
||||
philosopher.eat();
|
||||
philosopher.think();
|
||||
}
|
||||
});
|
||||
}
|
||||
|
||||
drop(tx);
|
||||
for thought in rx {
|
||||
println!("{thought}");
|
||||
}
|
||||
}
|
@ -1,81 +0,0 @@
# Multi-threaded Link Checker

Let us use our new knowledge to create a multi-threaded link checker. It should
start at a webpage and check that links on the page are valid. It should
recursively check other pages on the same domain and keep doing this until all
pages have been validated.
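
A recursive crawl needs some bookkeeping so that it terminates: it has to remember which pages it has already seen, and it helps to cap the total number of pages. The sketch below shows one way to do that with a `HashSet`. The `Frontier` type, its `admit` method, and the use of plain strings for URLs are assumptions made for illustration.

```rust
use std::collections::HashSet;

/// Hypothetical crawl bookkeeping: the pages seen so far and a cap on them.
struct Frontier {
    seen: HashSet<String>,
    limit: usize,
}

impl Frontier {
    fn new(limit: usize) -> Self {
        Frontier { seen: HashSet::new(), limit }
    }

    /// Returns true if `url` has not been seen before and the cap has not
    /// been reached yet. Admitted URLs are recorded as seen.
    fn admit(&mut self, url: &str) -> bool {
        if self.seen.len() >= self.limit {
            return false;
        }
        // `insert` returns false when the value was already present.
        self.seen.insert(url.to_string())
    }
}

fn main() {
    let mut frontier = Frontier::new(2);
    let urls = [
        "https://example.com/",
        "https://example.com/a",
        "https://example.com/",
        "https://example.com/b",
    ];
    for url in urls {
        println!("{url}: admitted = {}", frontier.admit(url));
    }
}
```

A crawler would call `admit` for every link it extracts and only queue the links for which it returns true.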

For this, you will need an HTTP client such as [`reqwest`][1]. You will also
need a way to find links; we can use [`scraper`][2]. Finally, we'll need some
way of handling errors; we will use [`thiserror`][3].

Create a new Cargo project and add `reqwest`, `scraper`, and `thiserror` as
dependencies with:

```shell
cargo new link-checker
cd link-checker
cargo add --features blocking,rustls-tls reqwest
cargo add scraper
cargo add thiserror
```

> If `cargo add` fails with `error: no such subcommand`, then please edit the
> `Cargo.toml` file by hand. Add the dependencies listed below.

The `cargo add` calls will update the `Cargo.toml` file to look like this:

<!-- File Cargo.toml -->

```toml
[package]
name = "link-checker"
version = "0.1.0"
edition = "2021"
publish = false

[dependencies]
reqwest = { version = "0.11.12", features = ["blocking", "rustls-tls"] }
scraper = "0.13.0"
thiserror = "1.0.37"
```

You can now download the start page. Try with a small site such as
`https://www.google.org/`.

Your `src/main.rs` file should look something like this:

<!-- File src/main.rs -->

```rust,compile_fail
{{#include link-checker.rs:setup}}

{{#include link-checker.rs:visit_page}}

fn main() {
    let client = Client::new();
    let start_url = Url::parse("https://www.google.org").unwrap();
    let crawl_command = CrawlCommand { url: start_url, extract_links: true };
    match visit_page(&client, &crawl_command) {
        Ok(links) => println!("Links: {links:#?}"),
        Err(err) => println!("Could not extract links: {err:#}"),
    }
}
```

Run the code in `src/main.rs` with:

```shell
cargo run
```

## Tasks

- Use threads to check the links in parallel: send the URLs to be checked to a
  channel and let a few threads check the URLs in parallel.
- Extend this to recursively extract links from all pages on the
  `www.google.org` domain. Put an upper limit of 100 pages or so, so that you
  don't end up being blocked by the site.
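
The first task boils down to a small worker pool: jobs go into one channel, a few threads pull from it, and results come back on a second channel. The sketch below shows that shape using only the standard library. `check_all` and the `check` callback are placeholder names, and the callback stands in for a real HTTP request, so treat this as an illustration of the channel plumbing only.

```rust
use std::sync::{mpsc, Arc, Mutex};
use std::thread;

/// Runs `check` on every job using `workers` threads and returns the results
/// in completion order. Both names are placeholders for this sketch.
fn check_all(
    jobs: Vec<String>,
    workers: usize,
    check: fn(&str) -> bool,
) -> Vec<(String, bool)> {
    let (job_tx, job_rx) = mpsc::channel::<String>();
    let (result_tx, result_rx) = mpsc::channel();
    // An mpsc receiver cannot be cloned, so the workers share it behind a mutex.
    let job_rx = Arc::new(Mutex::new(job_rx));

    for _ in 0..workers {
        let job_rx = Arc::clone(&job_rx);
        let result_tx = result_tx.clone();
        thread::spawn(move || loop {
            // The lock guard is a temporary, so it is released at the end of
            // this statement and is not held while the job is being checked.
            let job = job_rx.lock().unwrap().recv();
            let Ok(url) = job else { break };
            let ok = check(&url);
            result_tx.send((url, ok)).unwrap();
        });
    }
    // Drop the original sender so `result_rx` ends once all workers exit.
    drop(result_tx);

    for job in jobs {
        job_tx.send(job).unwrap();
    }
    // Closing the job channel tells idle workers to stop.
    drop(job_tx);

    result_rx.into_iter().collect()
}

fn main() {
    let jobs = vec![
        "https://example.com/".to_string(),
        "ftp://example.com/".to_string(),
    ];
    // Stand-in for a real request: accept only https URLs.
    let results = check_all(jobs, 4, |url| url.starts_with("https://"));
    for (url, ok) in results {
        println!("{url}: {}", if ok { "ok" } else { "bad" });
    }
}
```

For the recursive task, the controlling thread would keep the job sender open and feed newly discovered links back into it, which means it also needs to count pending jobs to know when the crawl is finished.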

[1]: https://docs.rs/reqwest/
[2]: https://docs.rs/scraper/
[3]: https://docs.rs/thiserror/
@ -1,181 +0,0 @@
|
||||
// Copyright 2022 Google LLC
|
||||
//
|
||||
// Licensed under the Apache License, Version 2.0 (the "License");
|
||||
// you may not use this file except in compliance with the License.
|
||||
// You may obtain a copy of the License at
|
||||
//
|
||||
// http://www.apache.org/licenses/LICENSE-2.0
|
||||
//
|
||||
// Unless required by applicable law or agreed to in writing, software
|
||||
// distributed under the License is distributed on an "AS IS" BASIS,
|
||||
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
// See the License for the specific language governing permissions and
|
||||
// limitations under the License.
|
||||
|
||||
// ANCHOR: solution
|
||||
use std::sync::{mpsc, Arc, Mutex};
|
||||
use std::thread;
|
||||
|
||||
// ANCHOR: setup
|
||||
use reqwest::blocking::Client;
|
||||
use reqwest::Url;
|
||||
use scraper::{Html, Selector};
|
||||
use thiserror::Error;
|
||||
|
||||
#[derive(Error, Debug)]
|
||||
enum Error {
|
||||
#[error("request error: {0}")]
|
||||
ReqwestError(#[from] reqwest::Error),
|
||||
#[error("bad http response: {0}")]
|
||||
BadResponse(String),
|
||||
}
|
||||
// ANCHOR_END: setup
|
||||
|
||||
// ANCHOR: visit_page
|
||||
#[derive(Debug)]
|
||||
struct CrawlCommand {
|
||||
url: Url,
|
||||
extract_links: bool,
|
||||
}
|
||||
|
||||
fn visit_page(client: &Client, command: &CrawlCommand) -> Result<Vec<Url>, Error> {
|
||||
println!("Checking {:#}", command.url);
|
||||
let response = client.get(command.url.clone()).send()?;
|
||||
if !response.status().is_success() {
|
||||
return Err(Error::BadResponse(response.status().to_string()));
|
||||
}
|
||||
|
||||
let mut link_urls = Vec::new();
|
||||
if !command.extract_links {
|
||||
return Ok(link_urls);
|
||||
}
|
||||
|
||||
let base_url = response.url().to_owned();
|
||||
let body_text = response.text()?;
|
||||
let document = Html::parse_document(&body_text);
|
||||
|
||||
let selector = Selector::parse("a").unwrap();
|
||||
let href_values = document
|
||||
.select(&selector)
|
||||
.filter_map(|element| element.value().attr("href"));
|
||||
for href in href_values {
|
||||
match base_url.join(href) {
|
||||
Ok(link_url) => {
|
||||
link_urls.push(link_url);
|
||||
}
|
||||
Err(err) => {
|
||||
println!("On {base_url:#}: ignored unparsable {href:?}: {err}");
|
||||
}
|
||||
}
|
||||
}
|
||||
Ok(link_urls)
|
||||
}
|
||||
// ANCHOR_END: visit_page
|
||||
|
||||
struct CrawlState {
|
||||
domain: String,
|
||||
visited_pages: std::collections::HashSet<String>,
|
||||
}
|
||||
|
||||
impl CrawlState {
|
||||
fn new(start_url: &Url) -> CrawlState {
|
||||
let mut visited_pages = std::collections::HashSet::new();
|
||||
visited_pages.insert(start_url.as_str().to_string());
|
||||
CrawlState { domain: start_url.domain().unwrap().to_string(), visited_pages }
|
||||
}
|
||||
|
||||
/// Determine whether links within the given page should be extracted.
|
||||
fn should_extract_links(&self, url: &Url) -> bool {
|
||||
let Some(url_domain) = url.domain() else {
|
||||
return false;
|
||||
};
|
||||
url_domain == self.domain
|
||||
}
|
||||
|
||||
/// Mark the given page as visited, returning false if it had already
|
||||
/// been visited.
|
||||
fn mark_visited(&mut self, url: &Url) -> bool {
|
||||
self.visited_pages.insert(url.as_str().to_string())
|
||||
}
|
||||
}
|
||||
|
||||
type CrawlResult = Result<Vec<Url>, (Url, Error)>;
|
||||
fn spawn_crawler_threads(
|
||||
command_receiver: mpsc::Receiver<CrawlCommand>,
|
||||
result_sender: mpsc::Sender<CrawlResult>,
|
||||
thread_count: u32,
|
||||
) {
|
||||
let command_receiver = Arc::new(Mutex::new(command_receiver));
|
||||
|
||||
for _ in 0..thread_count {
|
||||
let result_sender = result_sender.clone();
|
||||
let command_receiver = command_receiver.clone();
|
||||
thread::spawn(move || {
|
||||
let client = Client::new();
|
||||
loop {
|
||||
let command_result = {
|
||||
let receiver_guard = command_receiver.lock().unwrap();
|
||||
receiver_guard.recv()
|
||||
};
|
||||
let Ok(crawl_command) = command_result else {
|
||||
// The sender got dropped. No more commands coming in.
|
||||
break;
|
||||
};
|
||||
let crawl_result = match visit_page(&client, &crawl_command) {
|
||||
Ok(link_urls) => Ok(link_urls),
|
||||
Err(error) => Err((crawl_command.url, error)),
|
||||
};
|
||||
result_sender.send(crawl_result).unwrap();
|
||||
}
|
||||
});
|
||||
}
|
||||
}
|
||||
|
||||
fn control_crawl(
|
||||
start_url: Url,
|
||||
command_sender: mpsc::Sender<CrawlCommand>,
|
||||
result_receiver: mpsc::Receiver<CrawlResult>,
|
||||
) -> Vec<Url> {
|
||||
let mut crawl_state = CrawlState::new(&start_url);
|
||||
let start_command = CrawlCommand { url: start_url, extract_links: true };
|
||||
command_sender.send(start_command).unwrap();
|
||||
let mut pending_urls = 1;
|
||||
|
||||
let mut bad_urls = Vec::new();
|
||||
while pending_urls > 0 {
|
||||
let crawl_result = result_receiver.recv().unwrap();
|
||||
pending_urls -= 1;
|
||||
|
||||
match crawl_result {
|
||||
Ok(link_urls) => {
|
||||
for url in link_urls {
|
||||
if crawl_state.mark_visited(&url) {
|
||||
let extract_links = crawl_state.should_extract_links(&url);
|
||||
let crawl_command = CrawlCommand { url, extract_links };
|
||||
command_sender.send(crawl_command).unwrap();
|
||||
pending_urls += 1;
|
||||
}
|
||||
}
|
||||
}
|
||||
Err((url, error)) => {
|
||||
bad_urls.push(url);
|
||||
println!("Got crawling error: {:#}", error);
|
||||
continue;
|
||||
}
|
||||
}
|
||||
}
|
||||
bad_urls
|
||||
}
|
||||
|
||||
fn check_links(start_url: Url) -> Vec<Url> {
|
||||
let (result_sender, result_receiver) = mpsc::channel::<CrawlResult>();
|
||||
let (command_sender, command_receiver) = mpsc::channel::<CrawlCommand>();
|
||||
spawn_crawler_threads(command_receiver, result_sender, 16);
|
||||
control_crawl(start_url, command_sender, result_receiver)
|
||||
}
|
||||
|
||||
fn main() {
|
||||
let start_url = reqwest::Url::parse("https://www.google.org").unwrap();
|
||||
let bad_urls = check_links(start_url);
|
||||
println!("Bad URLs: {:#?}", bad_urls);
|
||||
}
|
@ -1,25 +0,0 @@
# Concurrency Afternoon Exercise

## Dining Philosophers --- Async

([back to exercise](dining-philosophers-async.md))

```rust,compile_fail
{{#include dining-philosophers-async.rs:solution}}
```

## Broadcast Chat Application

([back to exercise](chat-app.md))

_src/bin/server.rs_:

```rust,compile_fail
{{#include chat-async/src/bin/server.rs:solution}}
```

_src/bin/client.rs_:

```rust,compile_fail
{{#include chat-async/src/bin/client.rs:solution}}
```
@ -1,17 +0,0 @@
# Concurrency Morning Exercise

## Dining Philosophers

([back to exercise](dining-philosophers.md))

```rust
{{#include dining-philosophers.rs:solution}}
```

## Link Checker

([back to exercise](link-checker.md))

```rust,compile_fail
{{#include link-checker.rs:solution}}
```