This basically rewrites every integration test. We reduce the amount of
magic involved here in terms of which arguments are being passed to
ripgrep processes. To make up for the boiler plate saved by the magic,
we make the Dir (formerly WorkDir) type a bit nicer to use, along with a
new TestCommand that wraps a std::process::Command. In exchange, we get
tests that are easier to read and write.
We also run every test with the `--pcre2` flag to make sure that works,
when PCRE2 is available.
This commit does the work to delete the old `grep` crate and effectively
rewrite most of ripgrep core to use the new libripgrep crates. The new
`grep` crate is now a facade that collects the various crates that make
up libripgrep.
The most complex part of ripgrep core is now arguably the translation
between command line parameters and the library options, which is
ultimately where we want to be.
libripgrep is not any one library, but rather, a collection of libraries
that roughly separate the following key distinct phases in a grep
implementation:
1. Pattern matching (e.g., by a regex engine).
2. Searching a file using a pattern matcher.
3. Printing results.
Ultimately, both (1) and (3) are defined by de-coupled interfaces, of
which there may be multiple implementations. Namely, (1) is satisfied by
the `Matcher` trait in the `grep-matcher` crate and (3) is satisfied by
the `Sink` trait in the `grep2` crate. The searcher (2) ties everything
together and finds results using a matcher and reports those results
using a `Sink` implementation.
Closes#162
winapi 0.3.5 changed how it represents some of its structs, which caused
a bug to surface in atty that prevents tty detection on Windows. atty
has an open PR to fix this: https://github.com/softprops/atty/pull/28
Until a new release of atty, we pin winapi to a version that works.
This commit mostly moves the transcoder implementation to its own
crate: https://github.com/BurntSushi/encoding_rs_io
The new crate adds clear documentation and cleans up the implementation
to fully implement the contract of io::Read.
This causes SIMD to kick in automatically when compiling with stable
Rust 1.27+.
We also update the README to describe the current state of things.
Thanks to @hartley for pointing this out:
https://twitter.com/hartley/status/1009950392862453760
atty 0.2.7 (and 0.2.8) contain a regression in cygwin terminals that
prevents basic use of ripgrep, and is also the cause of the Windows CI
test failures. For now, we pin to 0.2.6, but a patch has been submitted
upstream: https://github.com/softprops/atty/pull/25
Nothing to see here.
Note that we continue to refrain to update tempdir, which means we are
still bringing in rand 0.4 and rand 0.3. Updating tempdir brings in an
old version of remove_dir_all, which in turn brings in winapi 0.2. No
thanks.
This update brings with it many bug fixes:
* Better error messages are printed overall. We also include
explicit call out for unsupported features like backreferences
and look-around.
* Regexes like `\s*{` no longer emit incomprehensible errors.
* Unicode escape sequences, such as `\u{..}` are now supported.
For the most part, this upgrade was done in a straight-forward way. We
resist the urge to refactor the `grep` crate, in anticipation of it
being rewritten anyway.
Note that we removed the `--fixed-strings` suggestion whenever a regex
syntax error occurs. In practice, I've found that it results in a lot of
false positives, and I believe that its use is not as paramount now that
regex parse errors are much more readable.
Closes#268, Closes#395, Closes#702, Closes#853
This update brings with it a new feature of the regex crate which will
now use SIMD optimizations automatically at runtime with no necessary
compile time flags. All that's needed is to enable the `unstable` feature.
Other crates, such as bytecount and encoding_rs, are still using the
old-style SIMD support, so we leave the simd-accel and avx-accel features.
However, the binaries we distribute on Github no longer have those
features enabled, which makes them truly portable.
Fixes#135
This commit fixes a performance regression in Windows that resulted from
fallout from fixing #705. In particular, we introduced an additional
stat call for every single directory entry, which can be quite
disastrous for performance.
There is a corresponding companion PR that fixes the same bug in
walkdir: https://github.com/BurntSushi/walkdir/pull/96Fixes#820
This regex update disabled the Tuned Boyer-Moore literal searcher which
has a bug in it that isn't straight-forward to fix. We bring that update
into ripgrep with this commit.
Fixes#780, Fixes#781
We use the new AppSettings::AllArgsOverrideSelf to permit all flags to
be specified multiple times. This removes the need for our previous
work-around where we would enable `multiple` for every flag and then
just extract the last value when consuming clap's matches.
We also add a couple regression tests that ensure repeated switches and
flags work as expected.
This removes the vec-map feature from clap. clap's README claims that
vec-map provides a small performance benefit, but I could observe any in
ripgrep workloads.
The benefit here is that it drops a dependency.
Amazingly, this drops whole release build times for ripgrep from 68s to
33s, and debug build time also decreases from 22s to 15.5s. This was
entirely unintentional but a welcome surprise.
This commit updates the `log` crate to 0.4 and drops the dependency on
env_logger. In particular, the latest version of env_logger brings in
additional non-optional dependencies such as chrono that I don't think is
worth including into ripgrep.
It turns out ripgrep doesn't need any fancy logging. We just need a concept
of log levels and the ability to print to stderr. Therefore, we just roll
our own super simple logger.
This update is motivated by the persistent configuration task. In
particular, we need the ability to toggle the global log level more than
once, and this doesn't appear to be possible with older versions of the
log crate.
This commit fixes a bug on Windows where directory traversals were
completely broken when attempting to scan OneDrive directories that use
the "file on demand" strategy.
The specific problem was that Rust's standard library treats OneDrive
directories as reparse points instead of directories, which causes
methods like `FileType::is_file` and `FileType::is_dir` to always return
false, even when retrieved via methods like `metadata` that purport to
follow symbolic links.
We fix this by peppering our code with checks on the underlying file
attributes exposed by Windows. We consider an entry a directory if and
only if the directory bit is set on the attributes. We are careful to
make sure that the code remains the same on non-Windows platforms.
Note that we also bump the dependency on `walkdir`, which contains a
similar fix for its traversals.
This bug is recorded upstream:
https://github.com/rust-lang/rust/issues/46484
Upstream also has a pending PR:
https://github.com/rust-lang/rust/pull/47956Fixes#705