This undoes the patch to stop using bytecount on big-endian
architectures. In particular, we bump our bytecount dependency to the
latest release, which has a fix.
This reverts commit a4868b8835.
Fixes#1144 (again), Closes#1194
This brings in an updated `encoding_rs` crate that uses `packed_simd`,
which compiles on the latest nightly. Compilation times do appear to be
impacted significantly though.
Fixes#1175 (again)
This commit fixes a bug where ripgrep only treated files beginning with
a `.` as hidden. On Windows, we continue this tradition, but
additionally check whether a file has the special Windows "hidden"
attribute set. If so, we treat it as a hidden file.
In order to make this work without an additional stat call, we had to
rearrange some of the plumbing from the directory traverser.
Fixes#1154
This fixes a bug where a BOM prefix was included. While this was somewhat
intentional in order to have a faithful "UTF8 passthru" option, in
practice, this causes problems such as breaking patterns like `^` in a
really non-obvious way.
The actual fix was to add a new API to encoding_rs_io, which this commit
brings in.
Fixes#1163
bytecount now uses runtime dispatch for enabling SIMD, which means we can
no longer need the avx-accel features. We remove it from ripgrep since the
next release will be a minor version bump, but leave them as no-ops for
the crates that previously used it.
This commit is the result of doing:
$ cargo update
$ cargo update -p encoding_rs --precise 0.8.10
where the latter line prevents encoding_rs from updating to 0.8.11 (or
newer). In particular, the 0.8.11 release increased the minimum Rust
version to 1.29, where as ripgrep 0.10.x is still on 1.28. We stay on an
older version for now until ripgrep is ready to move to 0.11.x.
This also requires corresponding updates to both rand and rand_core. Doing
an update of rand without doing an update of rand_core results in
compilation errors because two distinct versions of rand_core are included
in the build, and the traits they expose are distinct and incompatible.
We also switch over to using tempfile instead of tempdir, which drops the
last remaining thing keeping rand 0.4 in the build.
Fixes#1141, Fixes#1142
This brings in some new Unicode properties, such as \p{Emoji}.
It is now also technically possible construct a regex that recognizes
grapheme clusters.
This commit moves a lot of "utility" code from ripgrep core into
grep-cli. Any one of these things might not be worth creating a new
crate, but combining everything together results in a fair number of a
convenience routines that make up a decent sized crate.
There is potentially more we could move into the crate, but much of what
remains in ripgrep core is almost entirely dealing with the number of
flags we support.
In the course of doing moving things to the grep-cli crate, we clean up
a lot of gunk and improve failure modes in a number of cases. In
particular, we've fixed a bug where other processes could deadlock if
they write too much to stderr.
Fixes#990
This commit adds a 'same_file_system' option to the walk builder. For
single threaded walking, it defers to the walkdir crate, which has the
same option. The bulk of this commit implements this flag for the parallel
walker. We add one very feeble test for this.
The parallel walker is now officially a complete mess.
Closes#321
This commit fixes a bug where the first path always reported itself as
as symlink via `path_is_symlink`.
Part of this fix includes updating walkdir to 2.2.1, which also includes
a corresponding bug fix.
Fixes#984
This also updates some code to make use of our more liberal versioning
requirement, including the use of crossbeam-channel instead of the MsQueue
from the older an unmaintained crossbeam 0.3. This does regrettably add
a sizable number of dependencies, however, compile times seem mostly
unaffected.
Closes#1019
This basically rewrites every integration test. We reduce the amount of
magic involved here in terms of which arguments are being passed to
ripgrep processes. To make up for the boiler plate saved by the magic,
we make the Dir (formerly WorkDir) type a bit nicer to use, along with a
new TestCommand that wraps a std::process::Command. In exchange, we get
tests that are easier to read and write.
We also run every test with the `--pcre2` flag to make sure that works,
when PCRE2 is available.
This commit does the work to delete the old `grep` crate and effectively
rewrite most of ripgrep core to use the new libripgrep crates. The new
`grep` crate is now a facade that collects the various crates that make
up libripgrep.
The most complex part of ripgrep core is now arguably the translation
between command line parameters and the library options, which is
ultimately where we want to be.
libripgrep is not any one library, but rather, a collection of libraries
that roughly separate the following key distinct phases in a grep
implementation:
1. Pattern matching (e.g., by a regex engine).
2. Searching a file using a pattern matcher.
3. Printing results.
Ultimately, both (1) and (3) are defined by de-coupled interfaces, of
which there may be multiple implementations. Namely, (1) is satisfied by
the `Matcher` trait in the `grep-matcher` crate and (3) is satisfied by
the `Sink` trait in the `grep2` crate. The searcher (2) ties everything
together and finds results using a matcher and reports those results
using a `Sink` implementation.
Closes#162