mirror of https://github.com/BurntSushi/ripgrep.git synced 2024-12-12 19:18:24 +02:00
ripgrep/crates/ignore
Andrew Gallant f314b0d55f ignore: fix parallel traversal
It turns out that the previous version wasn't quite correct. Namely, it
was possible for the following sequence to occur:

1. Consider that all workers, except for one, are `waiting`.
2. The last remaining worker finds one more job to do and sends it on
   the channel.
3. One of the previously `waiting` workers wakes up from the job that
   the last running worker sent, but `self.resume()` has not been
   called yet.
4. The last worker, from (2), calls `get_work` and sees that the
   channel has nothing on it, so it executes `self.waiting() ==
   1`. Since the worker in (3) hasn't called `self.resume()` yet,
   `self.waiting() == 1` evaluates to true.
5. This sets off a chain reaction that stops all workers, despite the
   fact that (3) got more work (which could itself spawn more work).

The end result is that the traversal may terminate while there are still
outstanding work items to process. This problem was observed through
spurious failures in CI. I was not actually able to reproduce the bug
locally.

We fix this by changing our strategy to detect termination using a
counter. Namely, we increment the counter just before sending new work
and decrement the counter just after finishing work. In this way, we
guarantee that the counter only ever reaches 0 once there is no more
work to process.
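
The counter strategy can be sketched with the standard library alone. This is a minimal illustration and not the code from this commit: `Job`, `run`, and the shared `VecDeque` queue are hypothetical stand-ins for the crate's work items and channel, and idle workers simply spin with `yield_now` instead of blocking.

```rust
use std::collections::VecDeque;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Arc, Mutex};
use std::thread;

/// Hypothetical work item: a node that spawns two children per level
/// until `depth` reaches zero.
struct Job {
    depth: u32,
}

/// Processes a binary tree of jobs on `workers` threads and returns how
/// many jobs were processed in total.
fn run(workers: usize, root_depth: u32) -> usize {
    let queue = Arc::new(Mutex::new(VecDeque::<Job>::new()));
    let pending = Arc::new(AtomicUsize::new(0));
    let processed = Arc::new(AtomicUsize::new(0));

    // Increment the counter just before sending new work.
    pending.fetch_add(1, Ordering::SeqCst);
    queue.lock().unwrap().push_back(Job { depth: root_depth });

    let handles: Vec<_> = (0..workers)
        .map(|_| {
            let queue = Arc::clone(&queue);
            let pending = Arc::clone(&pending);
            let processed = Arc::clone(&processed);
            thread::spawn(move || loop {
                // The lock guard is released at the end of this statement.
                let job = queue.lock().unwrap().pop_front();
                match job {
                    Some(job) => {
                        if job.depth > 0 {
                            for _ in 0..2 {
                                pending.fetch_add(1, Ordering::SeqCst);
                                queue
                                    .lock()
                                    .unwrap()
                                    .push_back(Job { depth: job.depth - 1 });
                            }
                        }
                        processed.fetch_add(1, Ordering::SeqCst);
                        // Decrement only after this job, including any
                        // children it sent, is completely finished.
                        pending.fetch_sub(1, Ordering::SeqCst);
                    }
                    // An empty queue alone does not mean we are done:
                    // another worker may still be about to send work.
                    None => {
                        if pending.load(Ordering::SeqCst) == 0 {
                            break;
                        }
                        thread::yield_now();
                    }
                }
            })
        })
        .collect();

    for handle in handles {
        handle.join().unwrap();
    }
    processed.load(Ordering::SeqCst)
}
```

Because a job's children are counted before the job itself is subtracted, a worker that finds the queue empty can only observe a zero counter once every job has been fully processed. Calling `run(4, 3)` walks a complete binary tree of depth 3 on four threads.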

See #1337 for more discussion. Many thanks to @zsugabubus for helping me
work through this.
2020-02-20 16:07:51 -05:00
examples repo: move all source code in crates directory 2020-02-17 19:24:53 -05:00
src ignore: fix parallel traversal 2020-02-20 16:07:51 -05:00
tests repo: move all source code in crates directory 2020-02-17 19:24:53 -05:00
Cargo.toml repo: move all source code in crates directory 2020-02-17 19:24:53 -05:00
COPYING repo: move all source code in crates directory 2020-02-17 19:24:53 -05:00
LICENSE-MIT repo: move all source code in crates directory 2020-02-17 19:24:53 -05:00
README.md repo: move all source code in crates directory 2020-02-17 19:24:53 -05:00
UNLICENSE repo: move all source code in crates directory 2020-02-17 19:24:53 -05:00

ignore

The ignore crate provides a fast recursive directory iterator that respects various filters such as globs, file types and .gitignore files. This crate also provides lower level direct access to gitignore and file type matchers.


Dual-licensed under MIT or the UNLICENSE.

Documentation

https://docs.rs/ignore

Usage

Add this to your Cargo.toml:

[dependencies]
ignore = "0.4"

and this to your crate root (needed only on the Rust 2015 edition; on 2018 and later the dependency is in scope without it):

extern crate ignore;

Example

This example shows the most basic usage of this crate. This code will recursively traverse the current directory while automatically filtering out files and directories according to ignore globs found in files like .ignore and .gitignore:

use ignore::Walk;

fn main() {
    for result in Walk::new("./") {
        // Each item yielded by the iterator is either a directory entry or an
        // error, so either print the path or the error.
        match result {
            Ok(entry) => println!("{}", entry.path().display()),
            Err(err) => println!("ERROR: {}", err),
        }
    }
}

Example: advanced

By default, the recursive directory iterator will ignore hidden files and directories. This can be disabled by building the iterator with WalkBuilder:

use ignore::WalkBuilder;

fn main() {
    // hidden(false) turns off the default filter that skips hidden entries.
    for result in WalkBuilder::new("./").hidden(false).build() {
        println!("{:?}", result);
    }
}

See the documentation for WalkBuilder for many other options.