I was somewhat unsure about adding this, since `.svelte.ts` seems
primarily like a TypeScript file and it could be surprising to show up
in a search for Svelte files. In particular, ripgrep doesn't know how to
only search the Svelte stuff inside of a `.svelte.ts` file, so you could
end up with lots of false positives.
However, I was swayed[1] by the argument that the extension does
actually include `svelte` in it, so maybe this is fine. Please open an
issue if this change ends up being too annoying for most users.
Closes#2874, Closes#2909
[1]: https://github.com/BurntSushi/ripgrep/issues/2874#issuecomment-3126892931
Previously, when `file` is empty (literally empty, as in, zero byte),
`rg -f file` and `rg -vf file` would behave identically. This is odd
and also doesn't match how GNU grep behaves. It's also not logically
correct. An empty file means _zero_ patterns which is an empty set. An
empty set matches nothing. Inverting the empty set should result in
matching everything.
This was because of an errant optimization that lets ripgrep quit early
if it can statically detect that no matches are possible.
Moreover, there was *also* a bug in how we constructed the PCRE2 pattern
when there are zero patterns. PCRE2 doesn't have a concept of sets of
patterns (unlike the `regex` crate), so we need to fake it with an empty
character class.
Fixes#1332, Fixes#3001, Closes#3041
Ideally we'd have a compact impl for `GlobSet` too, but that's a lot
more work. In particular, the constituent types don't all store the
original pattern string, so that would need to be added.
Closes#3026
This PR adds the .rake extension to the Ruby type. It's a pretty common
file extension in Rails apps—in my experience, the Rakefile is often
pretty empty and only sets some stuff up while most of the code lives
in various .rake files.
See: https://ruby.github.io/rake/doc/rakefile_rdoc.html#label-Multiple+Rake+FilesCloses#2921
This adds a `replacement` field to each submatch object in the JSON
output. In effect, this extends the `-r/--replace` flag so that it works
with `--json`.
This adds a new field instead of replacing the match text (which is how
the standard printer works) for maximum flexibility. This way, consumers
of the JSON output can access the original match text (and always rely
on it corresponding to the original match text) while also getting the
replacement text without needing to do the replacement themselves.
Closes#1872, Closes#2883
When searching in parallel with many more arguments than threads, the
first arguments are searched last -- unlike in the -j1 case.
This is unexpected for users who know about the parallel nature of rg
and think they can give the scheduler a hint by positioning larger
input files (L1, L2, ..) before smaller ones (█, ██). Instead, this can
result in sub-optimal thread usage and thus longer runtime (simplified
example with 2 threads):
T1: █ ██ █ █ █ █ ██ █ █ █ █ █ ██ ╠═════════════L1════════════╣
T2: █ █ ██ █ █ ██ █ █ █ ██ █ █ ╠═════L2════╣
┏━━━━┳━━━━┳━━━━┳━━━━┓
This is caused by assigning work to ┃ T1 ┃ T2 ┃ T3 ┃ T4 ┃
per-thread stacks in a round-robin ┡━━━━╇━━━━╇━━━━╇━━━━┩
manner, starting here → │ L1 │ L2 │ L3 │ L4 │ ↵
├────├────┼────┼────┤
│ s5 │ s6 │ s7 │ s8 │ ↵
├────┼────┼────┼────┤
╷ .. ╷ .. ╷ .. ╷ .. ╷
├────┼────┼────┼────┤
│ st │ su │ sv │ sw │ ↵
├────┼────┼────┼────┘
│ sx │ sy │ sz │
└────┴────┴────┘
and then processing them bottom-up: ↥ ↥ ↥ ↥
╷ .. ╷ .. ╷ .. ╷ .. ╷
This patch reverses the input order ├────┼────┼────┼────┤
so the two reversals cancel each other │ s7 │ s6 │ s5 │ L4 │ ↵
out. Now at least the first N ├────┼────┼────┼────┘
arguments, N=number-of-threads, are │ L3 │ L2 │ L1 │
processed before any others (then └────┴────┴────┘
work-stealing may happen):
T1: ╠═════════════L1════════════╣ █ ██ █ █ █ █ █ █ ██
T2: ╠═════L2════╣ █ █ ██ █ █ ██ █ █ █ ██ █ █ ██ █ █ █
(With some more shuffling T1 could always be assigned L1 etc., but
that would mostly be for optics).
Closes#2849
The *BSD build systems make use of "Makefile.inc" a lot. Make the
"make" type recognize this file by default. And more generally,
`Makefile.*` seems to be a convention, so just generalize it.
Closes#2846
This makes it so the presence of `.jj` will cause ripgrep to treat it
as a VCS directory, just as if `.git` were present. This is useful for
ripgrep's default behavior when working with jj repositories that don't
have a `.git` but do have `.gitignore`. Namely, ripgrep requires the
presence of a VCS repository in order to respect `.gitignore`.
We don't handle clone-specific exclude rules for jj repositories without
`.git` though. It seems it isn't 100% set yet where we can find
those[1].
Closes#2842
[1]: https://github.com/BurntSushi/ripgrep/pull/2842#discussion_r2020076722
The previous code deleted too many parts of the path when constructing
the absolute path, resulting in a shortened final path. This patch
creates the correct absolute path by only removing the necessary parts.
Fixes#829, Fixes#2731, Fixes#2747, Fixes#2778, Fixes#2836, Fixes#2933Closes#2933
The fish completions now also pay attention to the configuration file
to determine whether to suggest negation options and not just to the
current command line.
This doesn't cover all edge cases. For example the config file is
cached, and so changes may not take effect until the next shell
session. But the cases it doesn't cover are hopefully very rare.
Closes#2708
These all seem pretty straight-forward. Compared with #2706, I dropped
the changes to the atomic orderings used in `ignore` because I haven't
had time to think through that carefully. But the ops in this PR seem
fine.
Closes#2706
Previously, you needed to save the completion script to a file and
then source it. Now, you can dynamically source completions in zsh by
running
$ source <(rg --generate complete-zsh)
Before this commit, you would get an error after step 1.
After this commit, it should work as expected.
We also improve the FAQ item for zsh completions.
Fixes#2956
In some rare cases, it was possible for ripgrep's inner literal detector
to extract a set of literals that could produce a false negative. #2884
gives an example: `(?i:e.x|ex)`. In this case, the set extracted can be
discovered by running `rg '(?i:e.x|ex) --trace`:
Seq[E("EX"), E("Ex"), E("eX"), E("ex")]
This extraction leads to building a multi-substring matcher for `EX`,
`Ex`, `eX` and `ex`. Searching the haystack `e-x` produces no match,
and thus, ripgrep shows no matches. But the regex `(?i:e.x|ex)` matches
`e-x`.
The issue at play here was that when two extracted literal sequences
were unioned, we were correctly unioning their "prefix" attribute.
And this in turn leads to those literal sequences being combined
incorrectly via cross product. This case in particular triggers it
because two different optimizations combine to produce an incorrect
result. Firslty, the regex has a common prefix extracted and is
rewritten as `(?i:e(?:.x|x))`. Secondly, the `x` in the first branch of
the alternation has its `prefix` attribute set to `false` (correctly),
which means it can't be cross producted with another concatenation. But
in this case, it is unioned with the `x` from the second branch, and
this results in the union result having `prefix` set to `true`. This
in turn pops up and lets it get cross producted with the `e` prefix,
producing an incorrect literal sequence.
We fix this by changing the implementation of `union` to return
`prefix` set to `true` only when *both* literal sequences being unioned
have `prefix` set to `true`.
Doing this exposed a second bug that was present, but was purely
cosmetic: the extracted literals in this case, after the fix, are
`X` and `x`. They were considered "exact" (i.e., lead to a match),
but of course they are not. Observing an `X` or an `x` does not mean
there is a match. This was fixed by making `choose` always return
an inexact literal sequence. This is perhaps too conservative in
aggregate in some cases, but always correct. The idea here is that if
one is choosing between two concatenations, then it is likely the case
that the sequence returned should be considered inexact. The issue
is that this can lead to avoiding cross products in some cases that
would otherwise be correct. This is bad because it means extracting
shorter literals in some cases. (In general, the longer the literal the
better.) But we prioritize correctness for now and fix it. You can see
a few tests where this shortens some extracted literals.
Fixes#2884