The `ignore` crate currently handles two different kinds of "global"
gitignore files: gitignores from `~/.gitconfig`'s `core.excludesFile`
and gitignores passed in via `WalkBuilder::add_ignore` (corresponding to
ripgrep's `--ignore-file` flag).
In contrast to any other kind of gitignore file, these gitignore files
should have their patterns interpreted relative to the current working
directory. (Arguably there are other choices we could make here, e.g.,
based on the paths given. But the `ignore` infrastructure can't handle
that, and it's not clearly correct to me.) Normally, a gitignore file
has its patterns interpreted relative to where the gitignore file is.
This relative interpretation matters for patterns like `/foo`, which are
anchored to _some_ directory.
Previously, we would generally get the global gitignores correct because
it's most common to use ripgrep without providing a path. Thus, it
searches the current working directory. In this case, no stripping of
the paths is needed in order for the gitignore patterns to be applied
directly.
But if one provides an absolute path (or something else) to ripgrep to
search, the paths aren't stripped correctly. Indeed, in the core, I had
just given up and not provided a "root" path to these global gitignores.
So it had no hope of getting this correct.
We fix this assigning the CWD to the `Gitignore` values created from
global gitignore files. This was a painful thing to do because we'd
ideally:
1. Call `std::env::current_dir()` at most once for each traversal.
2. Provide a way to avoid the library calling `std::env::current_dir()`
at all. (Since this is global process state and folks might want to
set it to different values for $reasons.)
The `ignore` crate's internals are a total mess. But I think I've
addressed the above 2 points in a semver compatible manner.
Fixes#3179
Building it can consume resources. In particular, on Windows, the
various binaries are eagerly resolved.
I think this originally wasn't done. The eager resolution was added
later for security purposes. But the "eager" part isn't actually
necessary.
It would probably be better to change the decompression reader to do
lazy resolution only when the binary is needed. But this will at least
avoid doing anything when the `-z/--search-zip` flag isn't used. But
when it is, ripgrep will still eagerly resolve all possible binaries.
Fixes#2111
This finishes what I started in commit
a6e0be3c90.
Specifically, the `max_matches` configuration has been moved to the
`grep-searcher` crate and *removed* from the `grep-printer` crate. The
commit message has the details for why we're doing this, but the short
story is to fix#3076.
Note that this is a breaking change for `grep-printer`, so this will
require a semver incompatible release.
In #2843, it's requested that these trailing contextual lines should be
displayed as non-matching because they exceed the limit. While
reasonable, I think that:
1. This would be a weird complication to the implementation.
2. This would overall be less intuitive and more complex. Today, there
is never a case where ripgrep emits a matching line in a way where
the match isn't highlighted.
Closes#2843
Specifically, it is only equivalent to `--count-matches` when the
pattern(s) given can match over multiple lines.
We could have instead made `--multiline --count` always equivalent to
`--multiline --count-matches`, but this seems plausibly less useful.
Indeed, I think it's generally a good thing that users can enable
`-U/--multiline` but still use patterns that only match a single line.
Changing how that behaves would I think be more surprising.
Either way we slice this, it's unfortunately pretty subtle.
Fixes#2852
This is unfortunate, but is a known bug that I don't think can be fixed
without either making `-l/--files-with-matches` much slower or changing
what "binary filtering" means by default.
In this PR, we document this inconsistency since users may find it quite
surprising. The actual work-around is to disable binary filtering with
the `--binary` flag.
We add a test confirming this behavior.
Closes#3131
Maybe 2024 changes?
Note that we now set `edition = "2024"` explicitly in `rustfmt.toml`.
Without this, it seems like it's possible in some cases for rustfmt to
run under an older edition's style. Not sure how though.
This is a bit of a brutal change, but I believe is necessary in order to
fix a bug in how we handle the "max matches" limit in multi-line mode
while simultaneously handling context lines correctly.
The main problem here is that "max matches" refers to the shorter of
"one match per line" or "a single match." In typical grep, matches
*can't* span multiple lines, so there's never a difference. But in
multi-line mode, they can. So match counts necessarily must be handled
differently for multi-line mode.
The printer was previously responsible for this. But for $reasons, the
printer is fundamentally not in charge of how matches are found and
reported.
See my comments in #3094 for even more context.
This is a breaking change for `grep-printer`.
Fixes#3076, Closes#3094
Apparently, if we don't do this, some roff renderers with use a special
Unicode hyphen. That in turn makes searching a man page not work as one
would expect.
Fixes#3140
The goal is to make the completion for `rg --hyperlink-format v<TAB>`
work in the fish shell.
These are not exhaustive (the user can also specify custom formats).
This is somewhat unfortunate, but is probably better than not doing
anything at all.
The `grep+` value necessitated a change to a test.
Closes#3096
This exports a new `HyperlinkAlias` type in the `grep-printer` crate.
This includes a "display priority" with each alias and a function for
getting all supported aliases from the crate.
This should hopefully make it possible for downstream users of this
crate to include a list of supported aliases in the documentation.
Closes#3103
Previously, `Quiet` mode in the summary printer always acted like
"print matching paths," except without the printing. This happened even
if we wanted to "print non-matching paths." Since this only afflicted
quiet mode, this had the effect of flipping the exit status when
`--files-without-match --quiet` was used.
Fixes#3108, Ref #3118
Previously, when `file` is empty (literally empty, as in, zero byte),
`rg -f file` and `rg -vf file` would behave identically. This is odd
and also doesn't match how GNU grep behaves. It's also not logically
correct. An empty file means _zero_ patterns which is an empty set. An
empty set matches nothing. Inverting the empty set should result in
matching everything.
This was because of an errant optimization that lets ripgrep quit early
if it can statically detect that no matches are possible.
Moreover, there was *also* a bug in how we constructed the PCRE2 pattern
when there are zero patterns. PCRE2 doesn't have a concept of sets of
patterns (unlike the `regex` crate), so we need to fake it with an empty
character class.
Fixes#1332, Fixes#3001, Closes#3041
This adds a `replacement` field to each submatch object in the JSON
output. In effect, this extends the `-r/--replace` flag so that it works
with `--json`.
This adds a new field instead of replacing the match text (which is how
the standard printer works) for maximum flexibility. This way, consumers
of the JSON output can access the original match text (and always rely
on it corresponding to the original match text) while also getting the
replacement text without needing to do the replacement themselves.
Closes#1872, Closes#2883
The fish completions now also pay attention to the configuration file
to determine whether to suggest negation options and not just to the
current command line.
This doesn't cover all edge cases. For example the config file is
cached, and so changes may not take effect until the next shell
session. But the cases it doesn't cover are hopefully very rare.
Closes#2708
These all seem pretty straight-forward. Compared with #2706, I dropped
the changes to the atomic orderings used in `ignore` because I haven't
had time to think through that carefully. But the ops in this PR seem
fine.
Closes#2706
Previously, you needed to save the completion script to a file and
then source it. Now, you can dynamically source completions in zsh by
running
$ source <(rg --generate complete-zsh)
Before this commit, you would get an error after step 1.
After this commit, it should work as expected.
We also improve the FAQ item for zsh completions.
Fixes#2956
This feature causes nothing but problems and is frequently broken. The
only optimization it was enabling were SIMD optimizations for
transcoding. In particular, for UTF-16 transcoding. This is performed by
the [`encoding_rs`](https://github.com/hsivonen/encoding_rs) crate,
which specifically uses unstable portable SIMD APIs instead of the
stable non-portable SIMD APIs.
SIMD optimizations that apply to search have long been making use of
stable APIs, and are automatically enabled when your target supports
them. This is, IMO, the correct user experience and one that
`encoding_rs` refuses to support. I'm done dealing with it, so
transcoding will only use scalar code until the SIMD optimizations in
`encoding_rs` work on stable. (This doesn't mean that `encoding_rs` has
to change. This could also be fixed by stabilizing `std::simd`.)
Fixes#2748
In effect, we switch from `path.is_file()` to `!path.is_dir()`. In cases
where process substitution is used, for example, the path can actually
have type "fifo" instead of "file." Even if it's a fifo, we want to
treat it as-if it were a file. The real key here is that we basically
always want to consider a lone argument as a file so long as we know it
isn't a directory. Because a directory is the only thing that will
causes us to (potentially) search more than one thing.
Fixes#2736
- Stop using `-n __fish_use_subcommand`. This had the effect of
ignoring options if a positional argument has already been given, but
that's not how ripgrep works.
- Only suggest negation options if the option they're negating is
passed (e.g., only complete `--no-pcre2` if `--pcre2` is present). The
zsh completions already do this.
- Take into account whether an option takes an argument. If an option
is not a switch then it won't suggest further options until the
argument is given, e.g. `-C<tab>` won't suggest options but `-i<tab>`
will.
- Suggest correct arguments for options. We already completed a fixed
set of choices where available, but now we go further:
- Filenames are only suggested for options that take filenames.
- `--pre` and `--hostname-bin` suggest binaries from `$PATH`.
- `-t`/`--type`/&c use `--type-list` for suggestions, like in zsh,
with a preview of the glob patterns.
- `--encoding` uses a hardcoded list extracted from the zsh
completions. This has been refactored into a separate file, and the
range globs (`{1..5}`) replaced by comma globs (`{1,2,3,4,5}`) since
those work in both shells. I verified that this produces the same
list as before in zsh, and the same list in fish (albeit in a
different order).
PR #2684
This is an embarrassing oversight. A `todo!()` actually made its way
into a release! Oof.
This was working in ripgrep 13, but I had redone some aspects of sorting
and this just got left undone.
Fixes#2664
And also, negated options don't take arguments.
Specifically, the fish completion generator currently forgets to add
`-l` to negation options, leading to a list of these errors:
complete: too many arguments
~/.config/fish/completions/rg.fish (line 146):
complete -c rg -n '__fish_use_subcommand' no-sort-files -d '(DEPRECATED) Sort results by file path.'
^
from sourcing file ~/.config/fish/completions/rg.fish
(Type 'help complete' for related documentation)
To reproduce, run `fish -c 'rg --generate=complete-fish | source'`.
It also potentially suggests a list of choices for negation options,
even though those never take arguments. That case doesn't occur with
any of the current options but it's an easy fix.
Fixes#2659, Closes#2655
We look for similar flag names via Jaccard index on ngrams. In my
experience this tends to work better than Levenshtein or other edit
distance based metrics. Principally because it allows for out-of-order
suggestions. For example, --case-smart will result in a suggestion for
--smart-case, even though the edit distance between them is pretty big.
This is something Clap did for us. I initially thought it wasn't
necessary to add this back in, but I realized it wouldn't be much work
and might actually be helpful to folks.
Basically, unless the -a/--text flag is given, it is generally always an
error to search for an explicit NUL byte because the binary detection
will prevent it from matching.
Fixes#1838