1
0
mirror of https://github.com/BurntSushi/ripgrep.git synced 2024-12-12 19:18:24 +02:00
ripgrep/crates/globset
Andres Suarez 7534d5144f globset: fix recursive suffix over matching
Previous, 'foo/**' would match 'foo', but it shouldn't have. In this
case, not matching 'foo' is what is documented and also seems consistent
with other recursive globbing implementations (like that in zsh).

This also updates the prefix extractor to pull 'foo/' out of 'foo/**'.

Closes #1756
2021-05-31 21:51:18 -04:00
..
benches repo: move all source code in crates directory 2020-02-17 19:24:53 -05:00
src globset: fix recursive suffix over matching 2021-05-31 21:51:18 -04:00
Cargo.toml globset-0.4.6 2020-10-21 21:10:43 -04:00
COPYING repo: move all source code in crates directory 2020-02-17 19:24:53 -05:00
LICENSE-MIT repo: move all source code in crates directory 2020-02-17 19:24:53 -05:00
README.md doc: update CI links in crate READMEs 2020-11-16 19:07:16 -05:00
UNLICENSE repo: move all source code in crates directory 2020-02-17 19:24:53 -05:00

globset

Cross platform single glob and glob set matching. Glob set matching is the process of matching one or more glob patterns against a single candidate path simultaneously, and returning all of the globs that matched.

Build status

Dual-licensed under MIT or the UNLICENSE.

Documentation

https://docs.rs/globset

Usage

Add this to your Cargo.toml:

[dependencies]
globset = "0.3"

and this to your crate root:

extern crate globset;

Features

  • serde1: Enables implementing Serde traits on the Glob type.

Example: one glob

This example shows how to match a single glob against a single file path.

use globset::Glob;

let glob = Glob::new("*.rs")?.compile_matcher();

assert!(glob.is_match("foo.rs"));
assert!(glob.is_match("foo/bar.rs"));
assert!(!glob.is_match("Cargo.toml"));

Example: configuring a glob matcher

This example shows how to use a GlobBuilder to configure aspects of match semantics. In this example, we prevent wildcards from matching path separators.

use globset::GlobBuilder;

let glob = GlobBuilder::new("*.rs")
    .literal_separator(true).build()?.compile_matcher();

assert!(glob.is_match("foo.rs"));
assert!(!glob.is_match("foo/bar.rs")); // no longer matches
assert!(!glob.is_match("Cargo.toml"));

Example: match multiple globs at once

This example shows how to match multiple glob patterns at once.

use globset::{Glob, GlobSetBuilder};

let mut builder = GlobSetBuilder::new();
// A GlobBuilder can be used to configure each glob's match semantics
// independently.
builder.add(Glob::new("*.rs")?);
builder.add(Glob::new("src/lib.rs")?);
builder.add(Glob::new("src/**/foo.rs")?);
let set = builder.build()?;

assert_eq!(set.matches("src/bar/baz/foo.rs"), vec![0, 2]);

Performance

This crate implements globs by converting them to regular expressions, and executing them with the regex crate.

For single glob matching, performance of this crate should be roughly on par with the performance of the glob crate. (*_regex correspond to benchmarks for this library while *_glob correspond to benchmarks for the glob library.) Optimizations in the regex crate may propel this library past glob, particularly when matching longer paths.

test ext_glob             ... bench:         425 ns/iter (+/- 21)
test ext_regex            ... bench:         175 ns/iter (+/- 10)
test long_glob            ... bench:         182 ns/iter (+/- 11)
test long_regex           ... bench:         173 ns/iter (+/- 10)
test short_glob           ... bench:          69 ns/iter (+/- 4)
test short_regex          ... bench:          83 ns/iter (+/- 2)

The primary performance advantage of this crate is when matching multiple globs against a single path. With the glob crate, one must match each glob synchronously, one after the other. In this crate, many can be matched simultaneously. For example:

test many_short_glob      ... bench:       1,063 ns/iter (+/- 47)
test many_short_regex_set ... bench:         186 ns/iter (+/- 11)

Comparison with the glob crate

  • Supports alternate "or" globs, e.g., *.{foo,bar}.
  • Can match non-UTF-8 file paths correctly.
  • Supports matching multiple globs at once.
  • Doesn't provide a recursive directory iterator of matching file paths, although I believe this crate should grow one eventually.
  • Supports case insensitive and require-literal-separator match options, but doesn't support the require-literal-leading-dot option.