This commit removes, in retrospect, a silly use of `unsafe`. In particular,
to extract a file name extension (distinct from how `std` implements it),
we were transmuting an OsStr to its underlying WTF-8 byte representation
and then searching that. This required `unsafe` and relied on an
undocumented std API, so it was a bad choice to make, but everything gets
sacrificed at the Alter of Performance.
The thing I didn't seem to realize at the time was that:
1. On Unix, you can already get the raw byte representation in a manner
that has zero cost.
2. On Windows, paths are already being encoded and copied every which
way. So doing a UTF-8 check and, in rare cases (for invalid UTF-8),
an extra copy, doesn't seem like that much more of an added expense.
Thus, rewrite the extension extraction using safe APIs. On Unix, this
should have identical performance characteristics as the previous
implementation. On Windows, we do pay a higher cost in the UTF-8
check, but Windows is already paying a similar cost a few times over
anyway.
This commit uses the new virtual terminal processing feature in Windows 10
to enable the use of ANSI escape codes to color ripgrep's output. This
technique is preferred over the console APIs because it seems like where
the future is heading, but also because it avoids needing to use an
intermediate buffer to deal with the Windows console in a multithreaded
environment.
This commit adds a new method to the Console type which permits toggling
the VIRTUAL_TERMINAL_PROCESSING mode on a console. Specifically, this
enables the use of ANSI escape sequences for color in Windows terminals.
This fixes CI to handle the new documentation files. We also continue to
do more cleanup. In particular, we devise a nicer way of detecting the
most recent Cargo OUT_DIR by writing a dummy file, and looking for the
most recently modified version of that file.
This commit cleans up the README and splits portions of it out into
a user guide (GUIDE.md) and a FAQ (FAQ.md). The README now provides a
small list of documentation "quick" links to various parts of the docs.
This commit also does a few other minor touchups.
This regex update disabled the Tuned Boyer-Moore literal searcher which
has a bug in it that isn't straight-forward to fix. We bring that update
into ripgrep with this commit.
Fixes#780, Fixes#781
This adds hidden counter-flags for the following:
-L/--follow [--no-follow]
--hidden [--no-hidden]
--no-ignore [--ignore]
--no-ignore-parent [--ignore-parent]
--no-ignore-vcs [--ignore-vcs]
--no-messages [--messages]
--search-zip [--no-search-zip]
--sort-files [--no-sort-files]
--text [--no-text]
In the above list, the counter-flags are in brackets.
While these flags are hidden, we document the counter-flags in the
documentation for the flags they are countering.
This commit adds support for hidden flags. The purpose of hidden flags
is for things that end users likely won't need unless they have a
configuration file that disables ripgrep's defaults. These flags will
provide a way to re-enable ripgrep's defaults.
This adds a few tests that check for bugs reported here:
https://github.com/rust-lang/cargo/issues/4268
The bugs reported in the aforementioned issue are probably caused by not
enabling the `literal_separator` option in `GlobBuilder`. Enabling that
in the tests under question fixes the issue.
Closes#773
This commit uses the recent refactoring for defining flags to
automatically generate a man page. This finally allows us to define the
documentation for each flag in a single place.
The man page is generated on every build, if and only if `asciidoc` is
installed. When generated, it is placed in Cargo's `OUT_DIR` directory,
which is the same place that shell completions live.
This cleans up our CI scripts but doesn't significantly change anything.
Mostly this is removing dead code and wrong comments, and making the style
a bit more consistent.
This commit makes a small tweak to the --max-columns flag. Namely, if
the value of the flag is 0, then ripgrep behaves as-if the flag were
absent.
This is useful in the context of ripgrep reading configuration from the
environment. For example, an end user might set --max-columns=150, but we
should permit the user to disable this setting when needed. Using -M0 is
a nice way to do that.
We do this because a zero value for --max-columns isn't particularly
meaningful. We do leave the --max-count, --max-filesize and --maxdepth
flags alone though, since a zero value for those flags is potentially
meaningful. (--max-count even has tests for ripgrep's behavior when
given a value of 0.)
We use the new AppSettings::AllArgsOverrideSelf to permit all flags to
be specified multiple times. This removes the need for our previous
work-around where we would enable `multiple` for every flag and then
just extract the last value when consuming clap's matches.
We also add a couple regression tests that ensure repeated switches and
flags work as expected.
When referencing the PATTERN positional argument,
we should use `pattern` and not `PATTERN`. The former
is the clap identifier name while the latter is the argument
value name.
This commit adds support for reading configuration files that change
ripgrep's default behavior. The format of the configuration file is an
"rc" style and is very simple. It is defined by two rules:
1. Every line is a shell argument, after trimming ASCII whitespace.
2. Lines starting with '#' (optionally preceded by any amount of
ASCII whitespace) are ignored.
ripgrep will look for a single configuration file if and only if the
RIPGREP_CONFIG_PATH environment variable is set and is non-empty.
ripgrep will parse shell arguments from this file on startup and will
behave as if the arguments in this file were prepended to any explicit
arguments given to ripgrep on the command line.
For example, if your ripgreprc file contained a single line:
--smart-case
then the following command
RIPGREP_CONFIG_PATH=wherever/.ripgreprc rg foo
would behave identically to the following command
rg --smart-case foo
This commit also adds a new flag, --no-config, that when present will
suppress any and all support for configuration. This includes any future
support for auto-loading configuration files from pre-determined paths
(which this commit does not add).
Conflicts between configuration files and explicit arguments are handled
exactly like conflicts in the same command line invocation. That is,
this command:
RIPGREP_CONFIG_PATH=wherever/.ripgreprc rg foo --case-sensitive
is exactly equivalent to
rg --smart-case foo --case-sensitive
in which case, the --case-sensitive flag would override the --smart-case
flag.
Closes#196
This commit builds on the previous argv refactor by being more principled
about how we declared our flags. In particular, we now require that every
clap argument is one of three things: a positional argument, a switch or
a flag that accepts exactly one value. The latter two are always permitted
to be repeated, and we modify the code that consumes a clap::ArgMatches to
always use the *last* value of an argument. (clap by default always uses
the first value of argument, if it has been repeated and is accessed via
one of the singleton accessors.)
Fixes#553
This commit refactors how we define flags. In theory, this commit
should not result in any behavioral changes (other than perhaps more
consistent rules for flag overrides).
There are two important changes:
Firstly, each flag (or tightly coupled group of flags) is defined in its
own function. This function includes the documentation for that flag. This
improves the locality for each flag; everything you need to know about it
is self-contained in one small region of code.
Secondly, each flag is defined in terms of a very small ripgrep-specific
layer above clap. This permits us to have a set of structured arguments
independent of clap. The intention here is that we can use this indirection
to generate other documentation such as man pages.
This commit lays the ground work for modifying our use of clap in
principled way such that flags can be specified multiple times without
conflict. This in turn will help us implement support for persistent
configuration.
This removes the vec-map feature from clap. clap's README claims that
vec-map provides a small performance benefit, but I could observe any in
ripgrep workloads.
The benefit here is that it drops a dependency.
Amazingly, this drops whole release build times for ripgrep from 68s to
33s, and debug build time also decreases from 22s to 15.5s. This was
entirely unintentional but a welcome surprise.
This commit makes the git hash ripgrep was built with available for use
in the version string.
We also do a few minor touchups in build.rs and src/app.rs.
This commit updates the `log` crate to 0.4 and drops the dependency on
env_logger. In particular, the latest version of env_logger brings in
additional non-optional dependencies such as chrono that I don't think is
worth including into ripgrep.
It turns out ripgrep doesn't need any fancy logging. We just need a concept
of log levels and the ability to print to stderr. Therefore, we just roll
our own super simple logger.
This update is motivated by the persistent configuration task. In
particular, we need the ability to toggle the global log level more than
once, and this doesn't appear to be possible with older versions of the
log crate.
This commit fixes a bug on Windows where directory traversals were
completely broken when attempting to scan OneDrive directories that use
the "file on demand" strategy.
The specific problem was that Rust's standard library treats OneDrive
directories as reparse points instead of directories, which causes
methods like `FileType::is_file` and `FileType::is_dir` to always return
false, even when retrieved via methods like `metadata` that purport to
follow symbolic links.
We fix this by peppering our code with checks on the underlying file
attributes exposed by Windows. We consider an entry a directory if and
only if the directory bit is set on the attributes. We are careful to
make sure that the code remains the same on non-Windows platforms.
Note that we also bump the dependency on `walkdir`, which contains a
similar fix for its traversals.
This bug is recorded upstream:
https://github.com/rust-lang/rust/issues/46484
Upstream also has a pending PR:
https://github.com/rust-lang/rust/pull/47956Fixes#705
This commit updates to the latest walkdir release, which fixes a bug on
Windows where ripgrep would panic if it was told to traverse a directory
while following symlinks *and* if opening one of those symlinks failed.
Fixes#633
Previously, we would bail out of using memory maps if we could detect
ahead of time that opening a memory map would fail. The only case we
checked was whether the file size was 0 or not.
This is actually insufficient. The mmap call can return ENODEV errors
when a file doesn't support memory maps. This is the case for new files
exposed by Linux, for example,
/sys/devices/system/cpu/vulnerabilities/meltdown.
We fix this by checking the actual error codes returned by the mmap call.
If ENODEV (or EOVERFLOW) is returned, then we fall back to regular `read`
calls. If any other error occurs, we report it to the user.
Fixes#760
The eprintln! macro was added to Rust's standard library in Rust 1.19.0,
which is below ripgrep's minimum Rust version. Therefore, we can rely on
the standard library variant now.
This commit adds opt-in support for searching compressed files during
recursive search. This behavior is only enabled when the
`-z/--search-zip` flag is passed to ripgrep. When enabled, a limited set
of common compression formats are recognized via file extension, and a
new process is spawned to perform the decompression. ripgrep then
searches the stdout of that spawned process.
Closes#539
This commit adds 256-color and 24-bit truecolor support to ripgrep.
This only provides output support on ANSI terminals. If the Windows
console is used for coloring, then 256-color and 24-bit color settings
are ignored.