In regex 1.10.0, the \b{start-half} and \b{end-half} assertions were
introduced to improve how the -w/--word-regexp flag handles patterns
containing non-word characters. While ripgrep's CLI help text was
updated to mention this syntax during the lexopt transition (082245d),
it omitted the rationale, potentially leaving users puzzled by this
casual mention of a highly specific, non-standard regex feature.
GUIDE.md was also out of date: it still claimed that -w wraps
patterns in standard \b boundaries, which added to the
inconsistency and potential for confusion.
This commit updates GUIDE.md to reflect the actual underlying regex
employed by ripgrep when the -w/--word-regexp flag is used.
It also adds a brief explanation to both the guide and the manpage
acknowledging the non-standard nature of these half-boundary markers,
and explaining what they're meant for.
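
As a rough illustration of the difference (the helper names below are
made up for this sketch and are not ripgrep's actual code), here is how
the two wrappings compare when written out as pattern strings:

```rust
/// Hypothetical helper: the wrapping that GUIDE.md used to describe,
/// with a plain `\b` on both sides of the pattern.
fn wrap_with_plain_boundaries(pattern: &str) -> String {
    format!(r"\b(?:{})\b", pattern)
}

/// Hypothetical helper: the wrapping described for -w/--word-regexp,
/// using the half-boundary assertions from regex 1.10.0.
fn wrap_with_half_boundaries(pattern: &str) -> String {
    // `{{` and `}}` are escaped braces for `format!`, so the output
    // contains a literal `\b{start-half}` and `\b{end-half}`.
    format!(r"\b{{start-half}}(?:{})\b{{end-half}}", pattern)
}
```

For a pattern like `-foo`, the plain `\b` version needs a word character
immediately before the `-`, so it fails on ` -foo `. The half-boundary
version only requires that the character before the match is not a word
character (or that the match is at the start of the haystack), which is
the behavior you'd expect from -w.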
PR #3279
It seems to be common that folks either don't understand how to set
environment variables in a way that propagates to subprocesses, or just
forget to include `export` and think ripgrep is somehow broken.
While `RIPGREP_CONFIG_PATH` not being set is not an error, it still
seems useful to log a debug message when it isn't. This should hopefully
provide a clue that, no, ripgrep isn't broken. It just doesn't see the
environment variable.
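
A minimal sketch of the idea, assuming a made-up helper name and
message text (neither is taken from ripgrep's source):

```rust
use std::env;
use std::ffi::OsString;

/// Hypothetical helper: returns a debug message when the variable is
/// unset, and `None` when a value is present.
fn config_path_debug_message(value: Option<OsString>) -> Option<String> {
    match value {
        Some(_) => None,
        // Not an error, but worth mentioning in debug output.
        None => Some(
            "RIPGREP_CONFIG_PATH is not set, so no config file is read"
                .to_string(),
        ),
    }
}

/// Hypothetical call site: reads the real environment and prints the
/// message to stderr if there is one.
fn log_if_config_path_unset() {
    let value = env::var_os("RIPGREP_CONFIG_PATH");
    if let Some(msg) = config_path_debug_message(value) {
        eprintln!("DEBUG: {}", msg);
    }
}
```

Note that a shell assignment without `export` is not visible to child
processes, so `env::var_os` returns `None` in exactly the case that
trips people up.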
Ref #3277
* tests: fix cmd_exists for QEMU environments
QEMU user-mode has a bug where posix_spawn returns success even when
the command doesn't exist. The child exits with 127, but the parent
thinks it succeeded.
Change cmd_exists to check if the command actually ran successfully
(exit code 0), not just if spawn returned Ok.
This fixes compression tests on riscv64 and other QEMU-emulated
architectures.
Ref https://github.com/rust-lang/rust/issues/90825
* tests: remove riscv64 skip for compression tests
Remove the cfg guards that disabled lz4, brotli, and zstd tests on
riscv64. These now work with the QEMU fix.
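
For reference, the shape of the cmd_exists check is roughly as follows.
This is a sketch and not the exact test helper: probing with
`--version` is an assumption made here for illustration.

```rust
use std::process::{Command, Stdio};

/// Returns true only if the command could be spawned AND exited with
/// status 0. Checking only that spawning returned `Ok` is not enough
/// under QEMU user-mode, where a missing command can still "spawn"
/// and then exit with 127.
fn cmd_exists(program: &str) -> bool {
    Command::new(program)
        // Assumption: the probed tool accepts `--version`.
        .arg("--version")
        .stdin(Stdio::null())
        .stdout(Stdio::null())
        .stderr(Stdio::null())
        .status()
        .map(|status| status.success())
        .unwrap_or(false)
}
```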
I was comparing the work being done by fd and find and noticed (with
`strace -f -c -S` calls) that fd was doing a ton of failed `statx`
calls. Upon closer inspection it was stating `.jj` even though I
was passing `--no-ignore`. Eventually I turned up this check in
`Ignore::add_child_path` that was doing stat on `.jj` regardless of
whether the options request it.
With this patch it'll only stat `.jj` if that's relevant to the query.
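
The gist, as a sketch with hypothetical names (the struct, its field
and the function below are illustrative and do not mirror the real
`Ignore` internals):

```rust
use std::path::Path;

/// Hypothetical stand-in for the walker's options. The single flag
/// represents "some option that makes the presence of a VCS directory
/// relevant", which is false under `--no-ignore`.
struct Options {
    vcs_dir_relevant: bool,
}

/// Only touches the filesystem when the options require it. `&&`
/// short-circuits, so `exists()` (a stat call) is skipped entirely
/// when the flag is off.
fn has_jj_dir(dir: &Path, opts: &Options) -> bool {
    opts.vcs_dir_relevant && dir.join(".jj").exists()
}
```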
PR #3212
In my fix for #3184, I actually had two fixes. One was a tweak to how we
read data and the other was a tweak to how we determined how much of the
buffer we needed to keep around. It turns out that fixing #3184 only
required the latter fix, found in commit
d4b77a8d89. The former fix also helped the
specific case of #3184, but it ended up regressing `--line-buffered`.
Specifically, prior to 8c6595c215 (the
first fix), we would do one `read` syscall. This call might not fill our
caller-provided buffer. And in particular, reads from `stdin` seemed to
return fewer bytes than reads from a file. So the "fix" was to put `read`
in a loop and keep calling it until the caller-provided buffer was full
or until the stream was exhausted. This helped alleviate #3184 by
amortizing `read` syscalls better.
But of course, in retrospect, this change is clearly contrary to how
`--line-buffered` works. We specifically do _not_ want to wait around
until the buffer is full. We want to read what we can, search it and
move on.
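
To make the difference concrete, here is a small sketch of the two
strategies. The function names and the `Trickle` reader are made up for
illustration and are not the actual line buffer code:

```rust
use std::io::{self, Read};

/// One `read` call: returns as soon as the reader hands over anything.
/// This is what `--line-buffered` wants.
fn read_once<R: Read>(rdr: &mut R, buf: &mut [u8]) -> io::Result<usize> {
    rdr.read(buf)
}

/// Keeps calling `read` until the buffer is full or the stream ends.
/// Fewer syscalls per byte searched, but it waits for more input.
fn read_until_full<R: Read>(rdr: &mut R, buf: &mut [u8]) -> io::Result<usize> {
    let mut filled = 0;
    while filled < buf.len() {
        match rdr.read(&mut buf[filled..]) {
            Ok(0) => break, // stream exhausted
            Ok(n) => filled += n,
            Err(e) if e.kind() == io::ErrorKind::Interrupted => continue,
            Err(e) => return Err(e),
        }
    }
    Ok(filled)
}

/// Hypothetical reader that yields at most `chunk` bytes per call,
/// standing in for a pipe that delivers a little data at a time.
struct Trickle<'a> {
    data: &'a [u8],
    chunk: usize,
}

impl Read for Trickle<'_> {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        let n = self.chunk.min(buf.len()).min(self.data.len());
        let (head, tail) = self.data.split_at(n);
        buf[..n].copy_from_slice(head);
        self.data = tail;
        Ok(n)
    }
}
```

With a `Trickle` over six bytes and `chunk: 2`, `read_once` returns
after the first two bytes, while `read_until_full` keeps going until all
six have arrived. On a live pipe, that second behavior means blocking
until the producer writes more, which is exactly the regression.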
So this reverts the first fix but leaves the second, which still
keeps #3184 fixed and also fixes #3194 (the regression).
This reverts commit 8c6595c215.
Fixes #3194