In a prior commit, we fixed a performance problem with the -w flag by
doing a little extra work to extract literals. It turns out that using
literals in this case when the -w flag is NOT used results in a
performance regression. The reasoning is that we end up using a "fast"
regex as a prefilter when the regex engine itself uses its own
equivalent prefilter, so ripgrep ends up redoing a fair amount of work.
Instead, we only do this extra work when we know the -w flag is enabled.
We should not assume that the commondir file actually exists. If it
doesn't, then just move on. This otherwise emits an error message when
searching normal submodules, which is not OK.
This regression was introduced in #1446.
Fixes#1520
If a literal is entirely whitespace, then it's quite likely that it is
very common. So when that case occurs, just don't do (inner) literal
optimizations at all.
The regex engine may still make sub-optimal decisions here, but that's a
problem for another day.
Fixes#1087
The purpose of this flag is to force ripgrep to ignore all --ignore-file
flags (whether they come before or after --no-ignore-files).
This flag can be overridden with --ignore-files.
Fixes#1466
It doesn't really belong in the man page since it's an artifact of a
build/runtime configuration. Moreover, it inhibits reproducible builds.
Fixes#1441
This permits switching between the different regex engine modes that
ripgrep supports. The purpose of this flag is to make it easier to
extend ripgrep with additional regex engines.
Closes#1488, Closes#1502
This is in preparation for adding a new --engine flag which is intended
to eventually supplant --auto-hybrid-regex.
While there are no immediate plans to add more regex engines to ripgrep,
this is intended to make it easier to maintain a patch to ripgrep with
an additional regex engine. See #1488 for more details.
We can just ask the channel whether any work has been loaded. Normally
querying a channel for its length is a strong predictor of bugs, but in
this case, we do it before we ever attempt a `recv`, so it should work.
Kudos to @zsugabubus for suggesting this!
It turns out that the previous version wasn't quite correct. Namely, it
was possible for the following sequence to occur:
1. Consider that all workers, except for one, are `waiting`.
2. The last remaining worker finds one more job to do and sends it on
the channel.
3. One of the previously `waiting` workers wakes up from the job that
the last running worker sent, but `self.resume()` has not been
called yet.
4. The last worker, from (2), calls `get_work` and sees that the
channel has nothing on it, so it executes `self.waiting() ==
1`. Since the worker in (3) hasn't called `self.resume()` yet,
`self.waiting() == 1` evaluates to true.
5. This sets off a chain reaction that stops all workers, despite that
fact that (3) got more work (which could itself spawn more work).
The end result is that the traversal may terminate while their are still
outstanding work items to process. This problem was observed through
spurious failures in CI. I was not actually able to reproduce the bug
locally.
We fix this by changing our strategy to detect termination using a
counter. Namely, we increment the counter just before sending new work
and decrement the counter just after finishing work. In this way, we
guarantee that the counter only ever reaches 0 once there is no more
work to process.
See #1337 for more discussion. Many thanks to @zsugabubus for helping me
work through this.
The top-level listing was just getting a bit too long for my taste. So
put all of the code in one directory and shrink the large top-level mess
to a small top-level mess.
NOTE: This commit only contains renames. The subsequent commit will
actually make ripgrep build again. We do it this way with the naive hope
that this will make it easier for git history to track the renames.
Sigh.