9c940b45f4
Previously, `man gitignore` specified that `**` was invalid unless it
was used in one of a few specific circumstances, i.e., `**`, `a/**`,
`**/b` or `a/**/b`. That is, `**` always had to be surrounded by either
a path separator or the beginning/end of the pattern.
It turns out that git itself has treated `**` outside the above contexts
as valid for quite a while, so there was an inconsistency between the
spec `man gitignore` and the implementation, and it wasn't clear which
was actually correct.
@okdana filed a bug against git [1] and got this fixed. The spec was wrong,
and it has now been fixed and updated [2].
This commit brings ripgrep in line with git and treats `**` outside of
the above contexts as two consecutive `*` patterns. We deprecate the
`InvalidRecursive` error since it is no longer used.
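
The rule above can be sketched as a small tokenizer. This is a minimal, standard-library-only illustration and not the parser used by ripgrep or globset: the `Token` enum and the `tokenize` function are hypothetical names, and the treatment of runs longer than two `*` characters is an assumption made for the sketch.

```rust
/// A token in a simplified glob pattern (hypothetical type, for illustration only).
#[derive(Debug, PartialEq)]
enum Token {
    /// Any character other than `*`.
    Literal(char),
    /// A single `*` wildcard.
    Star,
    /// A `**` bounded on both sides by `/` or by the start/end of the pattern.
    Recursive,
}

/// Splits `pattern` into tokens. A `**` is recursive only when it is
/// surrounded by path separators or pattern boundaries; anywhere else it is
/// emitted as two consecutive `Star` tokens instead of being rejected.
fn tokenize(pattern: &str) -> Vec<Token> {
    let chars: Vec<char> = pattern.chars().collect();
    let mut tokens = Vec::new();
    let mut i = 0;
    while i < chars.len() {
        if chars[i] != '*' {
            tokens.push(Token::Literal(chars[i]));
            i += 1;
            continue;
        }
        // Measure the full run of consecutive `*` characters.
        let start = i;
        while i < chars.len() && chars[i] == '*' {
            i += 1;
        }
        let run = i - start;
        let bounded_left = start == 0 || chars[start - 1] == '/';
        let bounded_right = i == chars.len() || chars[i] == '/';
        if run == 2 && bounded_left && bounded_right {
            tokens.push(Token::Recursive);
        } else {
            // Assumption of this sketch: every other run is a series of plain stars.
            for _ in 0..run {
                tokens.push(Token::Star);
            }
        }
    }
    tokens
}
```

Under these assumptions, `tokenize("a/**/b")` yields a `Recursive` token between the two separators, while `tokenize("a**b")` yields two `Star` tokens between the literals.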
Fixes #373, Fixes #1098
[1] - https://public-inbox.org/git/C16A9F17-0375-42F9-90A9-A92C9F3D8BBA@dana.is
[2] -
globset
Cross platform single glob and glob set matching. Glob set matching is the process of matching one or more glob patterns against a single candidate path simultaneously, and returning all of the globs that matched.
Dual-licensed under MIT or the UNLICENSE.
Documentation
Usage
Add this to your `Cargo.toml`:
[dependencies]
globset = "0.3"
and this to your crate root:
extern crate globset;
Example: one glob
This example shows how to match a single glob against a single file path.
use globset::Glob;
let glob = Glob::new("*.rs")?.compile_matcher();
assert!(glob.is_match("foo.rs"));
assert!(glob.is_match("foo/bar.rs"));
assert!(!glob.is_match("Cargo.toml"));
Example: configuring a glob matcher
This example shows how to use a `GlobBuilder` to configure aspects of match
semantics. In this example, we prevent wildcards from matching path separators.
use globset::GlobBuilder;
let glob = GlobBuilder::new("*.rs")
.literal_separator(true).build()?.compile_matcher();
assert!(glob.is_match("foo.rs"));
assert!(!glob.is_match("foo/bar.rs")); // no longer matches
assert!(!glob.is_match("Cargo.toml"));
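
To make the effect of the literal-separator option concrete, here is a minimal standard-library sketch of a wildcard matcher that supports only `*`. It is not how globset matches paths (the crate compiles globs to regular expressions); `wildcard_match` is a hypothetical helper written for this illustration, and its backtracking approach can be slow on patterns with many `*`.

```rust
/// Returns true if `text` matches `pattern`, where `*` matches any run of
/// bytes (including none). When `literal_separator` is true, `*` is not
/// allowed to consume a `/`.
fn wildcard_match(pattern: &[u8], text: &[u8], literal_separator: bool) -> bool {
    match pattern.split_first() {
        // An empty pattern matches only empty text.
        None => text.is_empty(),
        Some((&b'*', rest)) => {
            // Let `*` consume the first `i` bytes of `text`, for growing `i`.
            for i in 0..=text.len() {
                if wildcard_match(rest, &text[i..], literal_separator) {
                    return true;
                }
                // Stop growing once the next byte is a separator that `*`
                // is not permitted to consume.
                if i < text.len() && literal_separator && text[i] == b'/' {
                    break;
                }
            }
            false
        }
        // Any other pattern byte must equal the next byte of `text`.
        Some((&p, rest)) => match text.split_first() {
            Some((&t, text_rest)) if t == p => {
                wildcard_match(rest, text_rest, literal_separator)
            }
            _ => false,
        },
    }
}
```

With `literal_separator` set to `false`, `*.rs` matches both `foo.rs` and `foo/bar.rs`; with it set to `true`, only `foo.rs` matches, mirroring the two examples above.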
Example: match multiple globs at once
This example shows how to match multiple glob patterns at once.
use globset::{Glob, GlobSetBuilder};
let mut builder = GlobSetBuilder::new();
// A GlobBuilder can be used to configure each glob's match semantics
// independently.
builder.add(Glob::new("*.rs")?);
builder.add(Glob::new("src/lib.rs")?);
builder.add(Glob::new("src/**/foo.rs")?);
let set = builder.build()?;
assert_eq!(set.matches("src/bar/baz/foo.rs"), vec![0, 2]);
Performance
This crate implements globs by converting them to regular expressions, and
executing them with the `regex` crate.
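
As a rough illustration of the glob-to-regex idea, the sketch below translates a very small glob subset into a regex string using only the standard library. It is a simplification and not the crate's actual translation: `glob_to_regex` is a hypothetical function, and it deliberately ignores `**`, character classes and `{a,b}` alternation.

```rust
/// Translates a simple glob into an anchored regex string.
/// Only `*` and `?` are treated as wildcards; everything else is literal.
fn glob_to_regex(glob: &str, literal_separator: bool) -> String {
    let mut re = String::from("^");
    for c in glob.chars() {
        match c {
            // `*` matches any run of characters, optionally excluding `/`.
            '*' => re.push_str(if literal_separator { "[^/]*" } else { ".*" }),
            // `?` matches exactly one character, optionally excluding `/`.
            '?' => re.push_str(if literal_separator { "[^/]" } else { "." }),
            // Escape characters that carry meaning in regex syntax.
            '.' | '+' | '(' | ')' | '|' | '^' | '$' | '[' | ']' | '{' | '}' | '\\' => {
                re.push('\\');
                re.push(c);
            }
            _ => re.push(c),
        }
    }
    re.push('$');
    re
}
```

The resulting string could then be handed to a regex engine; for instance, `glob_to_regex("*.rs", false)` produces `^.*\.rs$`.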
For single glob matching, performance of this crate should be roughly on par
with the performance of the `glob` crate. (`*_regex` correspond to benchmarks
for this library while `*_glob` correspond to benchmarks for the `glob`
library.) Optimizations in the `regex` crate may propel this library past
`glob`, particularly when matching longer paths.
test ext_glob ... bench: 425 ns/iter (+/- 21)
test ext_regex ... bench: 175 ns/iter (+/- 10)
test long_glob ... bench: 182 ns/iter (+/- 11)
test long_regex ... bench: 173 ns/iter (+/- 10)
test short_glob ... bench: 69 ns/iter (+/- 4)
test short_regex ... bench: 83 ns/iter (+/- 2)
The primary performance advantage of this crate is when matching multiple
globs against a single path. With the `glob` crate, one must match each glob
synchronously, one after the other. In this crate, many can be matched
simultaneously. For example:
test many_short_glob ... bench: 1,063 ns/iter (+/- 47)
test many_short_regex_set ... bench: 186 ns/iter (+/- 11)
Comparison with the `glob` crate
- Supports alternate "or" globs, e.g., `*.{foo,bar}`.
- Can match non-UTF-8 file paths correctly.
- Supports matching multiple globs at once.
- Doesn't provide a recursive directory iterator of matching file paths, although I believe this crate should grow one eventually.
- Supports case insensitive and require-literal-separator match options, but doesn't support the require-literal-leading-dot option.