mirror of https://github.com/BurntSushi/ripgrep.git synced 2024-12-12 19:18:24 +02:00
59 Commits

Author SHA1 Message Date
Andrew Gallant
0bc4f0447b style: rustfmt everything
This is why I was so intent on clearing the PR queue. This will
effectively invalidate all existing patches, so I wanted to start from a
clean slate.

We do make one little tweak: we put the default type definitions in
their own file and tell rustfmt to keep its grubby mitts off of it. We
also sort it lexicographically and hopefully will enforce that from here
on.
2020-02-17 19:24:53 -05:00
Manfred Endres
804b43ecd8 globset: implement FromStr for Glob
The `globset::Glob` type's [`new`] function takes a `&str` parameter and
returns a `Result<Glob, Error>`. This is exactly the signature that
[`std::str::FromStr::from_str`][`std::str::FromStr`] defines.
Libraries like [`clap`] use [`std::str::FromStr`] to create objects from
provided commandline arguments. This change makes this library usable
without a newtype wrapper.

[`std::str::FromStr`]: https://doc.rust-lang.org/std/str/trait.FromStr.html
[`clap`]: https://docs.rs/clap/2.33.0/clap/macro.value_t.html
[`new`]: https://docs.rs/globset/0.4.4/globset/struct.Glob.html#method.new

Closes #1447
2020-02-17 17:16:28 -05:00
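The idea behind 804b43ecd8 can be sketched with the standard library alone. The `Pattern` and `PatternError` types below are hypothetical stand-ins, not the real `globset::Glob` or its error type; the point is only that a `new(&str) -> Result<..>` constructor maps directly onto `FromStr`, which is what lets callers write `.parse()`.

```rust
use std::str::FromStr;

// Hypothetical stand-in for a glob type; this is NOT `globset::Glob`.
#[derive(Debug, PartialEq)]
struct Pattern(String);

// Hypothetical error type for the sketch.
#[derive(Debug, PartialEq)]
struct PatternError(String);

impl Pattern {
    // Same shape as the constructor described above: `&str` in, `Result` out.
    fn new(s: &str) -> Result<Pattern, PatternError> {
        if s.is_empty() {
            Err(PatternError("empty pattern".to_string()))
        } else {
            Ok(Pattern(s.to_string()))
        }
    }
}

// Delegating to `new` is all that `FromStr` needs.
impl FromStr for Pattern {
    type Err = PatternError;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        Pattern::new(s)
    }
}

fn main() {
    // With `FromStr` in place, no newtype wrapper is needed to parse.
    let p: Pattern = "*.rs".parse().unwrap();
    println!("{:?}", p);
}
```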
Lucien Greathouse
2263b8ac92 globset: add GlobMatcher::glob
This exposes the underlying `Glob` used to compile the matcher. This can
be useful for wrapping up the glob matcher in other types.

Closes #1454
2020-02-17 17:16:28 -05:00
Ximin Luo
f8418c6a52 explicitly declare lazy_static dependency
`benches/bench.rs` uses lazy_static but Cargo.toml does not declare a
dependency on it. This causes rustc to use its own internal private
copy instead. Sometimes this causes unintuitive errors like this Debian
bug: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=942243

The underlying issue is https://github.com/rust-lang/rust#27812 but it
can be avoided by explicitly declaring the dependency, which you are
supposed to do anyways.

Closes #1435
2020-02-17 17:16:28 -05:00
Andrew Gallant
f8fb65f7e3 globset: fix benchmarks
There were apparently a lot of unused things, including lazy_static.
2020-01-27 16:45:12 -05:00
Andrew Gallant
785c1f1766 release: globset, grep-cli, grep-printer, grep-searcher 2019-06-26 16:53:30 -04:00
Andrew Gallant
b93762ea7a bstr: update everything to bstr 0.2 2019-06-26 16:47:33 -04:00
Andrew Gallant
e79085e9e4 release: globset 0.4.3 2019-04-15 14:07:03 -04:00
Andrew Gallant
9952ba2068 deps: update glob dev-dependency 2019-04-14 19:29:27 -04:00
Andrew Gallant
254b8b67bb globset: small perf improvements
This tweaks the path handling functions slightly to make them a hair
faster. In particular, `file_name` is called on every path that ripgrep
visits, and it was possible to remove a few branches without changing
behavior.
2019-04-05 23:24:08 -04:00
Andrew Gallant
8a7f43b84d globset: use bstr
This simplifies the various path related functions and pushes more platform
dependent code down into bstr. This likely also makes things a bit more
efficient on Windows, since we now only do a single UTF-8 check for each
file path.
2019-04-05 23:24:08 -04:00
Andrew Gallant
cd9815cb37 deps: update to aho-corasick 0.7
We do the simplest possible change to migrate to the new version.

Fixes #1228
2019-04-03 13:51:26 -04:00
Andrew Gallant
9c940b45f4 globset: permit ** to appear anywhere
Previously, `man gitignore` specified that `**` was invalid unless it
was used in one of a few specific circumstances, i.e., `**`, `a/**`,
`**/b` or `a/**/b`. That is, `**` always had to be surrounded by either
a path separator or the beginning/end of the pattern.

It turns out that git itself has treated `**` outside the above contexts
as valid for quite a while, so there was an inconsistency between the
spec `man gitignore` and the implementation, and it wasn't clear which
was actually correct.

@okdana filed a bug against git[1] and got this fixed. The spec was wrong,
which has now been fixed[2] and updated[3].

This commit brings ripgrep in line with git and treats `**` outside of
the above contexts as two consecutive `*` patterns. We deprecate the
`InvalidRecursive` error since it is no longer used.

Fixes #373, Fixes #1098

[1] - https://public-inbox.org/git/C16A9F17-0375-42F9-90A9-A92C9F3D8BBA@dana.is
[2] - 627186d020
[3] - https://git-scm.com/docs/gitignore
2019-01-23 19:59:39 -05:00
Andrew Gallant
aeaa5fc1b1 globset: fix repeated use of **
This fixes a bug where repeated use of ** didn't behave as it should. In
particular, each use of `**` added a new directory depth
requirement. For example, something like `**/**/b` would match
`foo/bar/b`, but it wouldn't match `foo/b` even though it should. In
particular, `**` semantics demand "infinite" depth, so repeated uses of
`**` should just coalesce as if only one was given.

We do this coalescing in the parser. It's a little tricky because we
treat `**/a`, `a/**` and `a/**/b` as distinct tokens with their own
regex conversions. We also test the crap out of it.

Fixes #1174
2019-01-23 19:15:02 -05:00
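The coalescing described in aeaa5fc1b1 can be illustrated with a much simpler model than the real parser. The sketch below works on `/`-separated components of the pattern string rather than on globset's token types, and the function name `coalesce_recursive` is made up for this example.

```rust
// Collapse runs of consecutive `**` path components into a single `**`.
// This is a string-level approximation; the actual fix operates on parser
// tokens, which this sketch does not model.
fn coalesce_recursive(pattern: &str) -> String {
    let mut out: Vec<&str> = Vec::new();
    for part in pattern.split('/') {
        // Skip a `**` that directly follows another `**`.
        if part == "**" && out.last() == Some(&"**") {
            continue;
        }
        out.push(part);
    }
    out.join("/")
}

fn main() {
    // `**/**/b` should behave like `**/b`, so it can match `foo/b`.
    println!("{}", coalesce_recursive("**/**/b"));
    println!("{}", coalesce_recursive("a/**/**/**/b"));
}
```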
Andrew Gallant
63b0f31a22 deps: update various dependencies
We also increase the MSRV to 1.32, the current stable release, which sets
the stage for migrating to Rust 2018.
2019-01-19 10:44:30 -05:00
Andrew Gallant
d14f0b37d6 deps: update versions for all crates
I don't think every change here is needed, but this ensures we're using
the latest version of every direct dependency.
2018-09-07 14:00:22 -04:00
Andrew Gallant
bb110c1ebe ripgrep: migrate to libripgrep
This commit does the work to delete the old `grep` crate and effectively
rewrite most of ripgrep core to use the new libripgrep crates. The new
`grep` crate is now a facade that collects the various crates that make
up libripgrep.

The most complex part of ripgrep core is now arguably the translation
between command line parameters and the library options, which is
ultimately where we want to be.
2018-08-20 07:10:19 -04:00
Andrew Gallant
84585908ac globset-0.4.1 2018-07-28 10:59:54 -04:00
Andrew Gallant
d09e2f6af1 globset: clarify documentation on regex method
This makes it clear that the `bytes` API of the regex crate should be
used instead of the Unicode API.

Fixes #985
2018-07-21 13:23:46 -04:00
Bastien Orivel
49f36c7dcd deps: update regex to 1.0
We retain the `simd-accel` feature on globset for backwards
compatibility, but will remove it in the next semver release.
2018-05-07 13:07:30 -04:00
Andrew Gallant
ab64da73ab ignore: speed up Gitignore::empty
This commit makes Gitignore::empty a bit faster by avoiding allocation
and manually specializing the implementation instead of routing it through
the GitignoreBuilder.

This helps improve uses of ripgrep that traverse *many* directories, and
in particular, when the use of ignores is disabled via command line
switches.

Fixes #835, Closes #836
2018-04-24 11:19:03 -04:00
Andrew Gallant
6c8b1e93d5 globset: release 0.4.0 2018-04-21 12:09:15 -04:00
FlorentBecker
c4dd927a13 ignore: add Clone/Debug for builders 2018-04-05 08:06:26 -04:00
Andrew Gallant
1f70e9187c deps: update regex crate
This update brings with it a new feature of the regex crate which will
now use SIMD optimizations automatically at runtime with no necessary
compile time flags. All that's needed is to enable the `unstable` feature.

Other crates, such as bytecount and encoding_rs, are still using the
old-style SIMD support, so we leave the simd-accel and avx-accel features.
However, the binaries we distribute on Github no longer have those
features enabled, which makes them truly portable.

Fixes #135
2018-03-12 23:21:42 -04:00
Markus Staab
7120f32258 globset/doc: update README for 0.3 release 2018-03-12 07:19:55 -04:00
Andrew Gallant
54256515b4 globset: make ErrorKind enum extensible
This commit makes the ErrorKind enum extensible by adding a
__Nonexhaustive variant. Callers should use this as a hint that
exhaustive case analysis isn't possible in a stable way since new
variants may be added in the future without a semver bump.
2018-03-10 09:30:55 -05:00
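The hidden-variant technique from 54256515b4 looks roughly like the following. The enum here is a hypothetical subset written for illustration, not the crate's full `ErrorKind`. Because callers are told not to name `__Nonexhaustive`, they are pushed toward a wildcard arm, which keeps their code compiling when new variants appear. (Since Rust 1.40 the `#[non_exhaustive]` attribute expresses the same intent directly.)

```rust
// Hypothetical subset of an error-kind enum, for illustration only.
#[derive(Debug)]
enum ErrorKind {
    UnclosedClass,
    DanglingEscape,
    // Hidden from docs; its only job is to discourage exhaustive matching.
    #[doc(hidden)]
    __Nonexhaustive,
}

// Downstream code matches the variants it knows and falls back on `_`.
fn describe(kind: &ErrorKind) -> &'static str {
    match kind {
        ErrorKind::UnclosedClass => "unclosed character class",
        ErrorKind::DanglingEscape => "dangling escape",
        // Covers the hidden variant and anything added later.
        _ => "unrecognized error",
    }
}

fn main() {
    println!("{}", describe(&ErrorKind::UnclosedClass));
    println!("{}", describe(&ErrorKind::__Nonexhaustive));
}
```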
Brian Malehorn
e2516ed095 globset: support backslash escaping
From `man 7 glob`:

    One can remove the special meaning of '?', '*' and '[' by preceding
    them by a backslash, or, in case this is part of a shell command
    line, enclosing them in quotes.

Conform to glob / fnmatch / git implementations by making `\` escape the
following character - for example `\?` will match a literal `?`.

However, only enable this by default on Unix platforms. Windows builds
will continue to use `\` as a path separator, but can still get the new
behavior by calling `globset.backslash_escape(true)`.

Adding tests for the `Globset::backslash_escape` option was a bit
involved, since the default value of this option is platform-dependent.

Extend the options framework to hold an `Option<T>` for each
knob, where `None` means "default" and `Some(v)` means "override with
`v`". This way we only have to specify the default values once in
`GlobOptions::default()` rather than replicated in both code and tests.

Finally write a few behavioral tests, and some tests to confirm it
varies by platform.
2018-03-10 09:30:55 -05:00
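The `Option<T>`-per-knob design in e2516ed095 can be sketched as below. The struct and field names (`Overrides`, `Resolved`, `case_insensitive`) are assumptions made for this example and do not mirror the crate's internals; the platform default uses `cfg!(windows)` as an approximation of "enabled by default on Unix only".

```rust
// Each knob is `None` for "use the default" or `Some(v)` for "override".
#[derive(Clone, Copy, Debug, Default)]
struct Overrides {
    case_insensitive: Option<bool>,
    backslash_escape: Option<bool>,
}

// Fully resolved options with concrete values.
#[derive(Clone, Copy, Debug, PartialEq)]
struct Resolved {
    case_insensitive: bool,
    backslash_escape: bool,
}

impl Resolved {
    // The defaults are written down exactly once, here.
    fn defaults() -> Resolved {
        Resolved {
            case_insensitive: false,
            // Assumed platform rule: escaping is off by default on Windows,
            // where `\` is a path separator.
            backslash_escape: !cfg!(windows),
        }
    }
}

impl Overrides {
    // Apply overrides on top of the defaults.
    fn resolve(self) -> Resolved {
        let d = Resolved::defaults();
        Resolved {
            case_insensitive: self.case_insensitive.unwrap_or(d.case_insensitive),
            backslash_escape: self.backslash_escape.unwrap_or(d.backslash_escape),
        }
    }
}

fn main() {
    let forced = Overrides { backslash_escape: Some(true), ..Overrides::default() };
    println!("{:?}", Overrides::default().resolve());
    println!("{:?}", forced.resolve());
}
```

Tests can then compare against `Resolved::defaults()` instead of hard-coding a platform-dependent value.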
Andrew Gallant
a431160d4c globset: release 0.3.0 2018-02-11 13:41:36 -05:00
Andrew Gallant
96ee4482cd globset: remove use of unsafe
This commit removes, in retrospect, a silly use of `unsafe`. In particular,
to extract a file name extension (distinct from how `std` implements it),
we were transmuting an OsStr to its underlying WTF-8 byte representation
and then searching that. This required `unsafe` and relied on an
undocumented std API, so it was a bad choice to make, but everything gets
sacrificed at the Altar of Performance.

The thing I didn't seem to realize at the time was that:

  1. On Unix, you can already get the raw byte representation in a manner
     that has zero cost.
  2. On Windows, paths are already being encoded and copied every which
     way. So doing a UTF-8 check and, in rare cases (for invalid UTF-8),
     an extra copy, doesn't seem like that much more of an added expense.

Thus, rewrite the extension extraction using safe APIs. On Unix, this
should have identical performance characteristics as the previous
implementation. On Windows, we do pay a higher cost in the UTF-8
check, but Windows is already paying a similar cost a few times over
anyway.
2018-02-10 22:28:12 -05:00
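A minimal sketch of the safe approach from 96ee4482cd, using only `std`. The helper names `os_str_bytes` and `file_name_ext` are chosen for this example, and the convention that the extension includes its leading dot is an assumption of the sketch. On Unix the bytes are borrowed at no cost; elsewhere a UTF-8 check is made and a copy happens only when the name is not valid UTF-8.

```rust
use std::borrow::Cow;
use std::ffi::OsStr;

// Unix: an `OsStr` is already raw bytes, so this is a free borrow.
#[cfg(unix)]
fn os_str_bytes(s: &OsStr) -> Cow<'_, [u8]> {
    use std::os::unix::ffi::OsStrExt;
    Cow::Borrowed(s.as_bytes())
}

// Elsewhere: borrow when valid UTF-8, otherwise pay for a lossy copy.
#[cfg(not(unix))]
fn os_str_bytes(s: &OsStr) -> Cow<'_, [u8]> {
    match s.to_string_lossy() {
        Cow::Borrowed(valid) => Cow::Borrowed(valid.as_bytes()),
        Cow::Owned(lossy) => Cow::Owned(lossy.into_bytes()),
    }
}

// Everything from the final `.` onward, or `None` if there is no `.`.
fn file_name_ext(name: &[u8]) -> Option<&[u8]> {
    let dot = name.iter().rposition(|&b| b == b'.')?;
    Some(&name[dot..])
}

fn main() {
    let bytes = os_str_bytes(OsStr::new("archive.tar.gz"));
    if let Some(ext) = file_name_ext(&bytes) {
        println!("{}", String::from_utf8_lossy(ext));
    }
}
```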
Behnam Esfahbod
706323ad8f globset: add more tests for single-asterisk pattern
This adds a few tests that check for bugs reported here:
https://github.com/rust-lang/cargo/issues/4268

The bugs reported in the aforementioned issue are probably caused by not
enabling the `literal_separator` option in `GlobBuilder`. Enabling that
in the tests under question fixes the issue.

Closes #773
2018-02-06 17:42:22 -05:00
Andrew Gallant
3535047094 logger: drop env_logger
This commit updates the `log` crate to 0.4 and drops the dependency on
env_logger. In particular, the latest version of env_logger brings in
additional non-optional dependencies such as chrono that I don't think are
worth including in ripgrep.

It turns out ripgrep doesn't need any fancy logging. We just need a concept
of log levels and the ability to print to stderr. Therefore, we just roll
our own super simple logger.

This update is motivated by the persistent configuration task. In
particular, we need the ability to toggle the global log level more than
once, and this doesn't appear to be possible with older versions of the
log crate.
2018-02-04 10:40:20 -05:00
Balaji Sivaraman
b6177f0459 cleanup: replace try! with ? 2018-01-01 09:22:35 -05:00
dana
679198e71a Support both [^...] and [!...] for globset class negation
Adds support for [^...] class negation in globs for parity with git, et al.

Fixes #663
2017-11-22 06:56:03 -05:00
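Accepting both negation markers from 679198e71a amounts to checking one extra leading character. The helper below is a hypothetical fragment, not the crate's parser: it takes the text that follows an opening `[` and reports whether the class is negated.

```rust
// Given the text after `[`, return whether the class is negated and the
// remainder of the class body. Both `!` and `^` count as negation.
fn split_negation(class_body: &str) -> (bool, &str) {
    match class_body
        .strip_prefix('!')
        .or_else(|| class_body.strip_prefix('^'))
    {
        Some(rest) => (true, rest),
        None => (false, class_body),
    }
}

fn main() {
    println!("{:?}", split_negation("!a-z"));
    println!("{:?}", split_negation("^a-z"));
    println!("{:?}", split_negation("a-z"));
}
```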
Martin Lindhe
c794ef2f04 fix some typos 2017-11-01 07:10:54 -04:00
Andrew Gallant
efa4de8126 cargo: bump to 0.7.0 2017-10-21 22:40:10 -04:00
Andrew Gallant
1267f01c24 deps: upgrade to memchr 2 2017-10-21 22:40:09 -04:00
Benjamin Sago
4f1d6af296 globset README version bump 2017-10-08 08:01:50 -04:00
Andrew Gallant
5a666b042d bump ripgrep, ignore, globset
The `ignore` and `globset` crates both got breaking changes in the
course of fixing #444, so increase 0.x to 0.(x+1).
2017-05-11 19:12:20 -04:00
Andrew Gallant
c50b8b4125 Add better error messages for invalid globs.
This threads the original glob given by end users through all of the
glob parsing errors. This was slightly trickier than it might appear
because the gitignore implementation actually modifies the glob before
compiling it. So in order to get better glob error messages everywhere,
we need to track the original glob both in the glob parser and in the
higher-level abstractions in the `ignore` crate.

Fixes #444
2017-04-12 18:14:23 -04:00
Andrew Gallant
c648eadbaa Bump and update deps. 2017-03-12 21:33:13 -04:00
Andrew Gallant
c1b841e934 Add license files to each crate.
Fixes #381
2017-03-12 16:57:15 -04:00
Marc Tiehuis
066f97d855 Add enclosing group to alternations in globs
Fixes #391.
2017-03-08 10:13:28 -05:00
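Why the enclosing group in 066f97d855 matters can be shown with a small sketch. It assumes, for illustration, that each alternation branch has already been translated to regex syntax; the function name is made up. Without the group, a glob such as `a{b,c}d` would turn into `ab|cd`, which means "`ab` or `cd`" rather than "`abd` or `acd`".

```rust
// Join already-translated branches with `|` inside a non-capturing group so
// that the alternation does not swallow neighbouring parts of the pattern.
fn alternates_to_regex(branches: &[&str]) -> String {
    format!("(?:{})", branches.join("|"))
}

fn main() {
    // Regex fragment for the glob `a{b,c}d` under the assumptions above.
    let grouped = format!("a{}d", alternates_to_regex(&["b", "c"]));
    println!("{}", grouped);
}
```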
Stu Hood
cf750a190f Implement Hash for Glob, and re-implement PartialEq using only non-redundant fields. 2017-02-18 11:46:03 -05:00
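The approach named in cf750a190f, hashing and comparing on only the fields that define identity, can be sketched as follows. `CompiledGlob` and its fields are hypothetical; the key property is that `Hash` and `PartialEq` use the same fields, so equal values always hash equally.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Hypothetical type: `regex` is derived from the other two fields, so it is
// redundant for identity purposes.
#[derive(Clone, Debug)]
struct CompiledGlob {
    glob: String,
    case_insensitive: bool,
    regex: String,
}

impl PartialEq for CompiledGlob {
    fn eq(&self, other: &Self) -> bool {
        self.glob == other.glob && self.case_insensitive == other.case_insensitive
    }
}

impl Eq for CompiledGlob {}

impl Hash for CompiledGlob {
    // Hash exactly the fields that `eq` compares.
    fn hash<H: Hasher>(&self, state: &mut H) {
        self.glob.hash(state);
        self.case_insensitive.hash(state);
    }
}

// Small helper to compute a hash value for demonstration.
fn hash_of<T: Hash>(value: &T) -> u64 {
    let mut hasher = DefaultHasher::new();
    value.hash(&mut hasher);
    hasher.finish()
}

fn main() {
    let a = CompiledGlob {
        glob: "*.rs".to_string(),
        case_insensitive: false,
        regex: "first derivation".to_string(),
    };
    let b = CompiledGlob { regex: "second derivation".to_string(), ..a.clone() };
    println!("{}", a == b && hash_of(&a) == hash_of(&b));
}
```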
Andrew Gallant
d825648b86 Remove lazy_static from globset 2017-02-12 15:37:50 -05:00
Andrew Gallant
057ed6305a 0.4.0 2017-01-13 23:46:21 -05:00
Andrew Gallant
aed315e80a bump deps 2017-01-03 07:27:51 -05:00
Andrew Gallant
163e00677a Update to regex 0.2. 2017-01-01 01:03:21 -05:00
Andrew Gallant
d58236fbdc bump various versions 2016-12-30 15:44:08 -05:00
Andrew Gallant
652c70f207 Fix cut-off line in globset docs.
Fixes #277
2016-12-12 06:58:33 -05:00
Andrew Gallant
301ee6d3f5 globset-0.1.2 2016-11-06 15:35:05 -05:00