The specific issue is that -w causes the regex to be wrapped in Unicode
word boundaries. Regrettably, Unicode word boundaries are the one thing
our regex engine can't handle well in the presence of non-ASCII text. We
work around its slowness by stripping word boundaries in some
circumstances, and using the resulting expression as a way to produce match
candidates that are then verified by the full original regex.
This doesn't fix all cases, but it should fix all cases where -w is used.
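
The candidate-then-verify idea above can be sketched roughly as follows. This is an illustration only, not the actual ripgrep code: it uses Python's `re` module in place of the real regex engine, and the helper name `word_matches` is made up for this example.

```python
import re


def word_matches(pattern, lines):
    """Return the lines that contain `pattern` as a whole word.

    Hypothetical helper for illustration; not part of ripgrep.
    """
    # Full expression: the pattern wrapped in (Unicode) word boundaries,
    # which is roughly what -w asks for.
    full = re.compile(r"\b(?:" + pattern + r")\b")
    # Candidate expression: the same pattern with the boundaries stripped.
    # Every match of `full` is also a match of `candidate`, so skipping
    # lines that fail `candidate` can never drop a true match.
    candidate = re.compile(pattern)

    matched = []
    for line in lines:
        if candidate.search(line) is None:
            continue  # rejected without any word boundary checks
        if full.search(line) is not None:
            matched.append(line)  # confirmed by the full original regex
    return matched
```

In Python this gains nothing in speed. The point is only the shape: a boundary-free expression filters candidates, and the original expression has the final say.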
We should probably still test on it, but I'd prefer distributing exactly
one Linux binary. Since the musl build is a totally static executable,
we should prefer that.
(The right answer is to test on GNU nightly, but don't produce a release
artifact.)
If you're in a directory that has a parent .gitignore (like, your $HOME),
then it can cause ripgrep to simply not do anything depending on your
ignore rules.
There are probably other scenarios where ripgrep applies some filter that
an end user doesn't expect, so try to catch the worst case (when ripgrep
doesn't search anything).
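
A minimal sketch of the "nothing was searched" check, assuming the walker can count the files that survive filtering. The function name, the `is_ignored` callback, and the warning text are all hypothetical and do not reflect ripgrep's real implementation or its actual message.

```python
import os


def search_tree(root, is_ignored):
    """Walk `root`, skipping files for which `is_ignored(path)` is true.

    Returns (searched_paths, warning). `warning` is None unless no file
    at all was searched.
    """
    searched = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if not is_ignored(path):
                searched.append(path)

    warning = None
    if not searched:
        # Worst case: every file was filtered out, so tell the user.
        warning = "No files were searched; a filter probably excluded everything."
    return searched, warning
```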
I don't like having multiple flags do the same thing, but -u, -uu and -uuu
are much easier to remember, particularly with -uuu meaning "search
everything."
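
One way to model a repeatable flag like this is a counter, sketched here with Python's `argparse`. The sketch assumes each extra `u` lifts one more filter (ignore files, then hidden files, then binary files). The option dictionary and its key names are invented for the example and are not ripgrep's internals.

```python
import argparse


def parse_unrestricted(argv):
    """Map -u / -uu / -uuu onto three hypothetical filter switches."""
    parser = argparse.ArgumentParser(prog="search-sketch")
    # action="count" turns -u, -uu, -uuu into 1, 2, 3.
    parser.add_argument("-u", "--unrestricted", action="count", default=0)
    level = parser.parse_args(argv).unrestricted
    return {
        "respect_ignore_files": level < 1,  # lifted by -u
        "skip_hidden": level < 2,           # lifted by -uu
        "skip_binary": level < 3,           # lifted by -uuu
    }
```

With this shape, `parse_unrestricted(["-uuu"])` turns every filter off, which matches the "search everything" reading.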
These benchmarks are exactly like the ones run on 2016-09-17 with three
changes:
1. `pt` was added back to a few more benchmarks so that it appears any
time `sift` appears.
2. Warmup iterations were bumped from 1 to 3.
3. Actual benchmark iterations were bumped from 3 to 10.
These benchmarks took around two hours to run.
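
The warmup and iteration counts above correspond to a loop along these lines. This is a sketch, not the actual runner: `run_benchmark` is a made-up name, and it times a Python callable, whereas the real benchmarks run external commands.

```python
import time


def run_benchmark(fn, warmup=3, iterations=10):
    """Call `fn` `warmup` times untimed, then time `iterations` calls."""
    for _ in range(warmup):
        fn()  # result and timing discarded; this only warms caches

    durations = []
    for _ in range(iterations):
        start = time.perf_counter()
        fn()
        durations.append(time.perf_counter() - start)
    return durations  # seconds per timed iteration
```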
The runner now detects if commands exist and permits running incomplete
benchmarks.
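
A rough sketch of that detection, assuming a `shutil.which` lookup is enough to decide whether a tool is installed. The function name and the shape of the `commands` mapping are hypothetical, not taken from the runner.

```python
import shutil


def split_by_availability(commands, which=shutil.which):
    """Split {name: argv} into runnable commands and missing names.

    `which` is injectable so the check can be exercised without the
    real tools installed.
    """
    runnable, missing = {}, []
    for name, argv in commands.items():
        if which(argv[0]) is None:
            missing.append(name)  # binary not on PATH: skip this command
        else:
            runnable[name] = argv
    return runnable, missing
```

A runner built this way can benchmark whatever is in `runnable` and merely report `missing`, instead of aborting.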
Also, explicitly use Python 3 since that's what default Ubuntu 16.04 seems
to want.