In my fix for #3184, I actually had two fixes. One was a tweak to how we
read data and the other was a tweak to how we determined how much of the
buffer we needed to keep around. It turns out that fixing #3184 only
required the latter fix, found in commit
d4b77a8d89. The former fix also helped the
specific case of #3184, but it ended up regressing `--line-buffered`.
Specifically, previous to 8c6595c215 (the
first fix), we would do one `read` syscall. This call might not fill our
caller provided buffer. And in particular, `stdin` seemed to fill fewer
bytes than reading from a file. So the "fix" was to put `read` in a loop
and keep calling it until the caller provided buffer was full or until
the stream was exhausted. This helped alleviate #3184 by amortizing
`read` syscalls better.
But of course, in retrospect, this change is clearly contrary to how
`--line-buffered` works. We specifically do _not_ want to wait around
until the buffer is full. We want to read what we can, search it and
move on.
So this reverts the first fix but leaves the second, which still
keeps #3184 fixed and also fixes #3194 (the regression).
This reverts commit 8c6595c215.
Fixes #3194
There seems to be a modest improvement on some workloads:
```
$ time rg -co '\w+' sixteenth.txt
158520346
real 8.457
user 8.426
sys 0.020
maxmem 779 MB
faults 0
$ time rg-lto -co '\w+' sixteenth.txt
158520346
real 8.200
user 8.178
sys 0.012
maxmem 778 MB
faults 0
```
I've somewhat reversed course on my previous thoughts here. The
improvement isn't much, but the hit to compile times in CI isn't
terrible. Mostly I'm doing this out of "good sense," and I think it's
generally unlikely to make it more difficult for me to diagnose
performance problems. (I still use the default `release` profile
locally, since it's about an order of magnitude quicker to compile.)
Ref #325, Ref #413, Ref #1187, Ref #1255
Their CI workflows broke for different reasons.
I perceive these as niche platforms that aren't worth blocking
a release on. And not worth my time investigating CI problems.
Somehow, the JSON printer seems to have never emitted correct summary
statistics. And I believe #3178 is the first time anyone has ever
reported it. I believe this bug has persisted for years. That's
surprising.
Anyway, the problem here was that we were bailing out of `finish()` on
the sink if we weren't supposed to print anything. But we bailed out
before we tallied our summary statistics. Obviously we shouldn't do
that.
Fixes #3178
The `ignore` crate currently handles two different kinds of "global"
gitignore files: gitignores from `~/.gitconfig`'s `core.excludesFile`
and gitignores passed in via `WalkBuilder::add_ignore` (corresponding to
ripgrep's `--ignore-file` flag).
In contrast to any other kind of gitignore file, these gitignore files
should have their patterns interpreted relative to the current working
directory. (Arguably there are other choices we could make here, e.g.,
based on the paths given. But the `ignore` infrastructure can't handle
that, and it's not clear to me that it would be correct.) Normally, a
gitignore file
has its patterns interpreted relative to where the gitignore file is.
This relative interpretation matters for patterns like `/foo`, which are
anchored to _some_ directory.
Previously, we would generally get the global gitignores correct because
it's most common to use ripgrep without providing a path. Thus, it
searches the current working directory. In this case, no stripping of
the paths is needed in order for the gitignore patterns to be applied
directly.
But if one provides an absolute path (or something else) to ripgrep to
search, the paths aren't stripped correctly. Indeed, in the core, I had
just given up and not provided a "root" path to these global gitignores.
So it had no hope of getting this correct.
We fix this by assigning the CWD to the `Gitignore` values created from
global gitignore files. This was a painful thing to do because we'd
ideally:
1. Call `std::env::current_dir()` at most once for each traversal.
2. Provide a way to avoid the library calling `std::env::current_dir()`
at all. (Since this is global process state and folks might want to
set it to different values for $reasons.)
The `ignore` crate's internals are a total mess. But I think I've
addressed the above 2 points in a semver compatible manner.
Fixes #3179
This was a crazy subtle bug where ripgrep could slow down exponentially
as increasingly larger values of `-A/--after-context` were used. But,
interestingly, this would only occur when searching `stdin` and _not_
when searching the same data as a regular file.
This confounded me because ripgrep, pretty early on, erases the
difference between searching a single file and `stdin`. So it wasn't
like there were different code paths. And I mistakenly assumed they
would otherwise behave the same, since both are just treated as streams.
But... it turns out that calling `read` on `stdin` versus a regular
file seems to behave differently. At least on my Linux system, with
`stdin`, `read` never seems to fill the buffer with more than 64K. But
with a regular file, `read` pretty reliably fills the caller's buffer
completely.
Of course, it is expected that `read` doesn't *have* to fill up the
caller's buffer, and ripgrep is generally fine with that. But when
`-A/--after-context` is used with a very large value---big enough that
the default buffer capacity is too small---then more heap memory needs
to be allocated to correctly handle all cases. This can result in
passing buffers bigger than 64K to `read`.
While we *correctly* handle `read` calls that don't fill the buffer,
it turns out that if we don't fill the buffer, then we get into a
pathological case where we aren't processing as many bytes as we could.
That is, because of the `-A/--after-context` causing us to keep a lot of
bytes around while we roll the buffer and because reading from `stdin`
gives us fewer bytes than normal, we weren't amortizing our `read` calls
as well as we should have been. Indeed, our buffer capacity increases
specifically take this amortization into account, but we weren't taking
advantage of it.
We fix this by putting `read` into an inner loop that ensures our
buffer gets filled up. This fixes the performance bug:
```
$ (time rg ZQZQZQZQZQ bigger.txt --no-mmap -A9999) | wc -l
real 1.330
user 0.767
sys 0.559
maxmem 29 MB
faults 0
10000
$ cat bigger.txt | (time rg ZQZQZQZQZQ --no-mmap -A9999) | wc -l
real 2.355
user 0.860
sys 0.613
maxmem 29 MB
faults 0
10000
$ (time rg ZQZQZQZQZQ bigger.txt --no-mmap -A99999) | wc -l
real 3.636
user 3.091
sys 0.537
maxmem 29 MB
faults 0
100000
$ cat bigger.txt | (time rg ZQZQZQZQZQ --no-mmap -A99999) | wc -l
real 4.918
user 3.236
sys 0.710
maxmem 29 MB
faults 0
100000
$ (time rg ZQZQZQZQZQ bigger.txt --no-mmap -A999999) | wc -l
real 5.430
user 4.666
sys 0.750
maxmem 51 MB
faults 0
1000000
$ cat bigger.txt | (time rg ZQZQZQZQZQ --no-mmap -A999999) | wc -l
real 6.894
user 4.907
sys 0.850
maxmem 51 MB
faults 0
1000000
```
For comparison, here is GNU grep:
```
$ cat bigger.txt | (time grep ZQZQZQZQZQ -A9999) | wc -l
real 1.466
user 0.159
sys 0.839
maxmem 29 MB
faults 0
10000
$ cat bigger.txt | (time grep ZQZQZQZQZQ -A99999) | wc -l
real 1.663
user 0.166
sys 0.941
maxmem 29 MB
faults 0
100000
$ cat bigger.txt | (time grep ZQZQZQZQZQ -A999999) | wc -l
real 1.631
user 0.204
sys 0.910
maxmem 29 MB
faults 0
1000000
```
GNU grep is still notably faster. We'll fix that in the next commit.
Fixes #3184
The abstraction boundary fuck-up is the gift that keeps on giving. It
turns out that the invariant that a match never exceeds the given range
does not always hold. So we kludge around it.
Also, update the CHANGELOG to include the fix for #2111.
Fixes #3180
Building it can consume resources. In particular, on Windows, the
various binaries are eagerly resolved.
I think this originally wasn't done. The eager resolution was added
later for security purposes. But the "eager" part isn't actually
necessary.
It would probably be better to change the decompression reader to do
lazy resolution only when the binary is needed. But this change at
least avoids doing anything when the `-z/--search-zip` flag isn't used;
when it is, ripgrep still eagerly resolves all possible binaries.
Fixes #2111
Every single call site wants to pass a path relative to the directory
the command was created for. So just make it do that automatically,
similar to `Dir::create` and friends.
Note that we skip lz4/brotli/zstd tests on RISC-V.
CI runs the RISC-V tests under cross/QEMU emulation, and the
decompression tools (lz4, brotli, zstd) are x86_64 binaries on the
host that cannot execute in the RISC-V QEMU environment. We skip these
three tests at compile time on RISC-V to avoid spurious failures.
The -z/--search-zip functionality itself works correctly on real RISC-V
hardware where native decompression tools are available.
PR #3165