In #2843, it's requested that these trailing contextual lines should be
displayed as non-matching because they exceed the limit. While
reasonable, I think that:
1. This would be a weird complication to the implementation.
2. This would overall be less intuitive and more complex. Today, there
is never a case where ripgrep emits a matching line in a way where
the match isn't highlighted.
Closes#2843
Specifically, it is only equivalent to `--count-matches` when the
pattern(s) given can match over multiple lines.
We could have instead made `--multiline --count` always equivalent to
`--multiline --count-matches`, but this seems plausibly less useful.
Indeed, I think it's generally a good thing that users can enable
`-U/--multiline` but still use patterns that only match a single line.
Changing how that behaves would I think be more surprising.
Either way we slice this, it's unfortunately pretty subtle.
Fixes#2852
This is unfortunate, but is a known bug that I don't think can be fixed
without either making `-l/--files-with-matches` much slower or changing
what "binary filtering" means by default.
In this PR, we document this inconsistency since users may find it quite
surprising. The actual work-around is to disable binary filtering with
the `--binary` flag.
We add a test confirming this behavior.
Closes#3131
The underlying issue here is #2528, which was introduced by commit
efd9cfb2fc which fixed another bug.
For the specific case of "did a file match," we can always assume the
match count is at least 1 here. But this doesn't fix the underlying
problem.
Fixes#3139
Maybe 2024 changes?
Note that we now set `edition = "2024"` explicitly in `rustfmt.toml`.
Without this, it seems like it's possible in some cases for rustfmt to
run under an older edition's style. Not sure how though.
When enabled, patterns like `[abc`, `[]`, `[!]` are treated as if the
opening `[` is just a literal. This is in contrast the default behavior,
which prioritizes better error messages, of returning a parse error.
Fixes#3127, Closes#3145
This is a bit of a brutal change, but I believe is necessary in order to
fix a bug in how we handle the "max matches" limit in multi-line mode
while simultaneously handling context lines correctly.
The main problem here is that "max matches" refers to the shorter of
"one match per line" or "a single match." In typical grep, matches
*can't* span multiple lines, so there's never a difference. But in
multi-line mode, they can. So match counts necessarily must be handled
differently for multi-line mode.
The printer was previously responsible for this. But for $reasons, the
printer is fundamentally not in charge of how matches are found and
reported.
See my comments in #3094 for even more context.
This is a breaking change for `grep-printer`.
Fixes#3076, Closes#3094
qrc[1] are the resource files for data related to user interfaces, and
ui[2] is the extension that the Qt Designer generates, for Widget based
projects.
Note that the initial PR used `ui` as a name for `*.ui`, but this seems
overly general. Instead, we use `qui` here instead.
Closes#3141
[1]: https://doc.qt.io/qt-6/resources.html
[2]: https://doc.qt.io/qt-6/uic.html
Apparently, if we don't do this, some roff renderers with use a special
Unicode hyphen. That in turn makes searching a man page not work as one
would expect.
Fixes#3140
The goal is to make the completion for `rg --hyperlink-format v<TAB>`
work in the fish shell.
These are not exhaustive (the user can also specify custom formats).
This is somewhat unfortunate, but is probably better than not doing
anything at all.
The `grep+` value necessitated a change to a test.
Closes#3096
This exports a new `HyperlinkAlias` type in the `grep-printer` crate.
This includes a "display priority" with each alias and a function for
getting all supported aliases from the crate.
This should hopefully make it possible for downstream users of this
crate to include a list of supported aliases in the documentation.
Closes#3103
Previously, `Quiet` mode in the summary printer always acted like
"print matching paths," except without the printing. This happened even
if we wanted to "print non-matching paths." Since this only afflicted
quiet mode, this had the effect of flipping the exit status when
`--files-without-match --quiet` was used.
Fixes#3108, Ref #3118
For users of globset who already have a `Vec<Glob>` (or similar),
the current API requires them to iterate over their `Vec<Glob>`,
calling `GlobSetBuilder::add` for each `Glob`, thus constructing a new
`Vec<Glob>` internal to the GlobSetBuilder. This makes the consuming
code unnecessarily verbose. (There is unlikely to be any meaningful
performance impact of this, however, since the cost of allocating a new
`Vec` is likely marginal compared to the cost of glob compilation.)
Instead of taking a `&[Glob]`, we accept an iterator of anything that
can be borrowed as a `&Glob`. This required some light refactoring of
the constructor, but nothing onerous.
Closes#3066
[Gleam] is a general-purpose, concurrent, functional high-level
programming language that compiles to Erlang or JavaScript source code.
Closes#3105
[Gleam]: https://gleam.run/
This PR adds llvm to the list of default types, matching files with
extension ll which is used widely for the textual form of LLVM's
Intermediate Representation.
Ref: https://llvm.org/docs/LangRef.htmlCloses#3079
`.env` or "dotenv" is used quite often in cross-compilation/embedded
development environments to load environment variables, define shell
functions or even to execute shell commands. Just like `.zshenv` in
this list, I think `.env` should also be added here.
Closes#3063
I'm not sure why I did this, but I think I was trying to imitate the
contract of [`std::path::Path::file_name`]:
> Returns None if the path terminates in `..`.
But the status quo clearly did not implement this. And as a result, if
you have a glob that ends in a `.`, it was instead treated as the empty
string (which only matches the empty string).
We fix this by implementing the semantic from the standard library
correctly.
Fixes#2990
[`std::path::Path::file_name`]: https://doc.rust-lang.org/std/path/struct.Path.html#method.file_name
This is already technically possible to do on Unix by going through
`OsStr` and `&[u8]` conversions. This just makes it easier to do in all
circumstances and is reasonable to intentionally support.
Closes#2954, Closes#2955
Specifically, if the search was instructed to quit early, we might not
have correctly marked the number of bytes consumed.
I don't think this bug occurs when memory maps are used to read the
haystack.
Closes#2944
Kconfig files are used to represent the configuration database of
Kbuild build system. Kbuild is developed as part of the Linux kernel.
There are numerous other users including OpenWrt and U-Boot.
Ref: https://docs.kernel.org/kbuild/index.htmlCloses#2942
I'm not sure why it was written with `map` previously. It almost looks
like I was trying to make it deref, but apparently that isn't needed.
Closes#2941
This existed before the `#[non_exhaustive]` attribute was a thing. Since
it was not part of the API of the crate, it is not a semver incompatible
change.
For example, `**/{node_modules/**/*/{ts,js},crates/**/*.{rs,toml}`.
I originally didn't add this I think for implementation simplicity, but
it turns out that it really isn't much work to do. There might have also
been some odd behavior in the regex engine for dealing with empty
alternates, but that has all been long fixed.
Closes#3048, Closes#3112
I was somewhat unsure about adding this, since `.svelte.ts` seems
primarily like a TypeScript file and it could be surprising to show up
in a search for Svelte files. In particular, ripgrep doesn't know how to
only search the Svelte stuff inside of a `.svelte.ts` file, so you could
end up with lots of false positives.
However, I was swayed[1] by the argument that the extension does
actually include `svelte` in it, so maybe this is fine. Please open an
issue if this change ends up being too annoying for most users.
Closes#2874, Closes#2909
[1]: https://github.com/BurntSushi/ripgrep/issues/2874#issuecomment-3126892931
Previously, when `file` is empty (literally empty, as in, zero byte),
`rg -f file` and `rg -vf file` would behave identically. This is odd
and also doesn't match how GNU grep behaves. It's also not logically
correct. An empty file means _zero_ patterns which is an empty set. An
empty set matches nothing. Inverting the empty set should result in
matching everything.
This was because of an errant optimization that lets ripgrep quit early
if it can statically detect that no matches are possible.
Moreover, there was *also* a bug in how we constructed the PCRE2 pattern
when there are zero patterns. PCRE2 doesn't have a concept of sets of
patterns (unlike the `regex` crate), so we need to fake it with an empty
character class.
Fixes#1332, Fixes#3001, Closes#3041
Ideally we'd have a compact impl for `GlobSet` too, but that's a lot
more work. In particular, the constituent types don't all store the
original pattern string, so that would need to be added.
Closes#3026
This PR adds the .rake extension to the Ruby type. It's a pretty common
file extension in Rails apps—in my experience, the Rakefile is often
pretty empty and only sets some stuff up while most of the code lives
in various .rake files.
See: https://ruby.github.io/rake/doc/rakefile_rdoc.html#label-Multiple+Rake+FilesCloses#2921
This adds a `replacement` field to each submatch object in the JSON
output. In effect, this extends the `-r/--replace` flag so that it works
with `--json`.
This adds a new field instead of replacing the match text (which is how
the standard printer works) for maximum flexibility. This way, consumers
of the JSON output can access the original match text (and always rely
on it corresponding to the original match text) while also getting the
replacement text without needing to do the replacement themselves.
Closes#1872, Closes#2883