1
0
mirror of https://github.com/BurntSushi/ripgrep.git synced 2025-01-19 05:49:14 +02:00

161 Commits

Author SHA1 Message Date
Austin Wise
46d0130597 cargo: statically link binary on Windows/MSVC
Before this change, rg.exe depended on vcruntime140.dll, which does not
exist on a fresh install of Windows.

Closes #1613
2021-05-31 21:51:18 -04:00
Andres Suarez
7534d5144f globset: fix recursive suffix over matching
Previous, 'foo/**' would match 'foo', but it shouldn't have. In this
case, not matching 'foo' is what is documented and also seems consistent
with other recursive globbing implementations (like that in zsh).

This also updates the prefix extractor to pull 'foo/' out of 'foo/**'.

Closes #1756
2021-05-31 21:51:18 -04:00
Richard Khoury
a28e664abd ignore: check ignore rules before issuing stat calls
This seems like an obvious optimization but becomes critical when
filesystem operations even as simple as stat can result in significant
overheads; an example of this was a bespoke filesystem layer in Windows
that hosted files remotely and would download them on-demand when
particular filesystem operations occurred. Users of this system who
ensured correct file-type fileters were being used could still get
unnecessary file access resulting in large downloads.

Fixes #1657, Closes #1660
2021-05-31 21:51:18 -04:00
Pen Tree
0ca96e004c printer: fix context bug when --max-count is used
In the case where after-context is requested with a match count limit,
we need to be careful not to reset the state tracking the remaining
context lines.

Fixes #1380, Closes #1642
2021-05-31 21:51:18 -04:00
Alessandro Menezes
2295061e80 searcher: do UTF-8 BOM sniffing like UTF-16
Previously, we were only looking for the UTF-16 BOM for determining
whether to do transcoding or not. But we should also look for the UTF-8
BOM as well.

Fixes #1638, Closes #1697
2021-05-31 21:51:18 -04:00
Raimon Grau
53c4855517 ignore/types: add red
See: https://www.red-lang.org/

Closes #1663
2021-05-31 21:51:18 -04:00
Simon Morgan
121e0135c1 ignore/types: replace duplicate glob with *.aspx.vb
*.aspx.cs was listed twice and the VB variant is missing.

Closes #1683
2021-05-31 21:51:18 -04:00
João Marcos
4566882521 cli: add -. as short option for --hidden
This is somewhat non-standard, but it seems nice on the surface: short
flag names are in short supply, --hidden is probably somewhat common and
-. has an obvious connection with how hidden files are named on Unix.

Closes #1680
2021-05-31 21:51:18 -04:00
Andrew Gallant
12dd455ee9 printer: fix \r\n line terminator handling
This fixes a bug where it was assumed that 'is_suffix' when CRLF
handling was enabled mean that '\r\n' was present. But that's not the
case, and it is intentional that 'is_suffix' only looks for '\n'. (Which
is why #1803 wasn't taken, which tries to fix this by changing
'is_suffix'.)

Fixes #1765, Closes #1803
2021-05-31 21:51:18 -04:00
goto-engineering
e6cac8b119 cli: print warning if nothing was searched
This was once part of ripgrep, but at some point, was unintentionally
removed. The value of this warning is that since ripgrep tries to be
"smart" by default, it can be surprising if it doesn't search certain
things. This warning covers the case when ripgrep searches *nothing*,
which happens somewhat more frequently than you might expect. e.g., If
you're searching within an ignore directory.

Note that for now, we only print this message when the user has not
supplied any explicit paths. It's not clear that we want to print this
otherwise, and in particular, it seems that the message shows up too
eagerly. e.g., 'rg foo does-not-exist' will both print an error about
'does-not-exist' not existing, *and* the message about no files being
searched, which seems annoying in this case. We can always refine this
logic later.

Fixes #1404, Closes #1762
2021-05-31 21:51:18 -04:00
Ilya Grigoriev
51d2db7f19 doc: document '{a,b}' glob syntax
This syntax does not exist in `git`, so it is not documented in `man
gitignore`. There is a question of whether it *should* exist, but as
long as it does, it should be documented somewhere.

See also:
https://github.com/BurntSushi/ripgrep/issues/1221
https://github.com/BurntSushi/ripgrep/issues/1368

Closes #1816
2021-05-31 21:51:18 -04:00
Jade
26a29c750e doc: clarify --files-with-matches and --files-without-match
Ref https://github.com/BurntSushi/ripgrep/issues/103#issuecomment-763083510

Closes #1869
2021-05-31 21:51:18 -04:00
Andrew Gallant
a77b914e7a args: make --passthru and -A/-B/-C override each other
Fixes #1868
2021-05-31 21:51:18 -04:00
Andrew Gallant
2e2af50a4d
doc: add vulnerability report docs
Fixes #1773
2021-05-29 09:53:18 -04:00
Andrew Gallant
229d1a8d41
cli: fix arbitrary execution of program bug
This fixes a bug only present on Windows that would permit someone to
execute an arbitrary program if they crafted an appropriate directory
tree. Namely, if someone put an executable named 'xz.exe' in the root of
a directory tree and one ran 'rg -z foo' from the root of that tree,
then the 'xz.exe' executable in that tree would execute if there are any
'xz' files anywhere in the tree.

The root cause of this problem is that 'CreateProcess' on Windows will
implicitly look in the current working directory for an executable when
it is given a relative path to a program. Rust's standard library allows
this behavior to occur, so we work around it here. We work around it by
explicitly resolving programs like 'xz' via 'PATH'. That way, we only
ever pass an absolute path to 'CreateProcess', which avoids the implicit
behavior of checking the current working directory.

This fix doesn't apply to non-Windows systems as it is believed to only
impact Windows. In theory, the bug could apply on Unix if '.' is in
one's PATH, but at that point, you reap what you sow.

While the extent to which this is a security problem isn't clear, I
think users generally expect to be able to download or clone
repositories from the Internet and run ripgrep on them without fear of
anything too awful happening. Being able to execute an arbitrary program
probably violates that expectation. Therefore, CVE-2021-3013[1] was
created for this issue.

We apply the same logic to the --pre command, since the --pre command is
likely in a user's config file and it would be surprising for something
that the user is searching to modify which preprocessor command is used.

The --pre and -z/--search-zip flags are the only two ways that ripgrep
will invoke external programs, so this should cover any possible
exploitable cases of this bug.

[1] - https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2021-3013
2021-05-29 09:36:48 -04:00
Andrew Gallant
8ec6ef373f
changelog: sync with commits since last release
I'm hoping to get a release out soon, and this is the first step.
2021-05-29 08:26:46 -04:00
Andrew Gallant
581a35e568
impl: fix --multiline anchored match bug
This fixes a bug where using \A or (?-m)^ in combination with
-U/--multiline would permit matches that aren't anchored to the
beginning of the file. The underlying cause was an optimization that
occurred when mmaps couldn't be used. Namely, ripgrep tries to still
read the input incrementally if it knows the pattern can't match through
a new line. But the detection logic was flawed, since it didn't account
for line anchors. This commit fixes that.

Fixes #1878, Fixes #1879
2021-05-29 07:37:28 -04:00
Andrew Gallant
94e4b8e301
printer: fix --vimgrep for multi-line mode
It turned out that --vimgrep wasn't quite getting the column of each
match correctly. Instead of printing column numbers relative to the
current line, it was printing column numbers as byte offsets relative to
where the match began. To fix this, we simply subtract the offset of the
line number from the beginning of the match. If the beginning of the
match came before the start of the current line, then there's really
nothing sensible we can do other than to use a column number of 1, which
we now document.

Interestingly, existing tests were checking that the previous behavior
was intended. My only defense is that I somehow tricked myself into
thinking it was a byte offset instead of a column number.

Kudos to @bfrg for calling this out in #1866:
https://github.com/BurntSushi/ripgrep/issues/1866#issuecomment-841635553
2021-05-15 08:27:59 -04:00
Roey Darwish Dror
020c5453a5
cli: fix stdin detection for Powershell on Unix
It seems that PowerShell uses sockets instead of FIFOs to redirect the
output between commands. So add `is_socket` to our `is_readable_stdin`
check.

This seems unlikely to cause problems and it probably more generally
correct than what we had before. In theory, it could cause problems if
it produces false positives, in which case, ripgrep will try to read
stdin when it should search the current working directory. (And this
usually winds up manifesting as ripgrep blocking forever.) But, if the
stdin handle reports itself as a socket, then it seems like we should
read it.

Fixes #1741, Closes #1742
2020-11-23 10:23:34 -05:00
Andrew Gallant
2819212f89 printer: tweak binary detection message format
This roughly matches similar changes made in GNU grep recently.
2020-11-02 10:52:51 -05:00
Josh Soref
def993bad1
spelling: fix various misspellings
These were found by the check spelling action[1] and reported
here[2].

PR #1685 

[1] - https://github.com/marketplace/actions/check-spelling
[2] - 6f02d05671 (commitcomment-42625778)
2020-09-22 10:29:16 -04:00
Andrew Gallant
e6e50054b0
doc: document cygwin path translation behavior
Kudos to @Pyker for posting more details about this.

Closes #1277
2020-09-13 09:29:28 -04:00
Martin Michlmayr
1b2c1dc675
doc: fix typos
PR #1605
2020-06-04 09:06:09 -04:00
Andrew Gallant
b1e3de246c
changelog: add empty TBD section to CHANGELOG
And update the release checklist to mention this process.
2020-05-29 09:49:45 -04:00
Andrew Gallant
a73c0a21d9
changelog: 12.1.1 2020-05-29 09:26:33 -04:00
Andrew Gallant
a700b75843
doc: clarify capture group indices
And in particular, note the special $0 index, which corresponds to the
entire match.

Fixes #1591
2020-05-21 22:22:51 -04:00
Andrew Gallant
1980630f17
doc: fix egregious markup output
We use '+++' syntax to output a literal '**' for a '--glob' example.
This '+++' syntax is pretty ugly when rendered literally via --help. We
fix this by hackily inserting the '+++' syntax for its one specific case
that we need it during man page generation.

Not ideal but it works. And --help still has some '*foo*' markup, but we
live with that for now.

Fixes #1581
2020-05-13 08:13:05 -04:00
Andrew Gallant
6162b000a3
changelog: 12.1.0 2020-05-09 11:36:44 -04:00
Andrew Gallant
b56315ea84
changelog: add #1550 to CHANGELOG 2020-05-08 23:37:17 -04:00
Andrew Gallant
e02bb6b99a changelog: add downstream notices 2020-05-08 23:24:40 -04:00
Chayoung You
16a1221fc7 doc: use asciidoctor instead of a2x
AsciiDoc development is continued under asciidoctor. See
https://github.com/asciidoc/asciidoc.

We do however fallback to a2x if asciidoctor is not present. This is to
ease migration, but at some point, it's likely that support for a2x will
be dropped.

Originally reported downstream:
https://github.com/Homebrew/linuxbrew-core/issues/19885

Closes #1544
2020-05-08 23:24:40 -04:00
Wieland Hoffmann
df7a3bfc7f grep-cli: support files compressed by compress(1)
While Linux distributions (at least Arch Linux, RHEL, Debian) do not support
compressing files with compress(1), macOS & AIX do (the utility is part of
POSIX). Additionally, gzip is able to uncompress such compressed files and
provides an `uncompress` binary.

Closes #1547
2020-05-08 23:24:40 -04:00
Andrew Gallant
0eb2501b6e doc: add a section about --pre to the GUIDE
Fixes #1252
2020-05-08 23:24:40 -04:00
Andrew Gallant
64a4dee495 cli: improve invalid UTF-8 pattern error message
When a pattern with invalid UTF-8 is given, the error message suggests
unqualified use of hex escape sequences to match arbitrary bytes. But
you *also* need to disable Unicode mode. So include that in the error
message.

Fixes #1339
2020-05-08 23:24:40 -04:00
Andrew Gallant
50840ea43b doc: note how to escape a '$' in --replace
Fixes #1524
2020-05-08 23:24:40 -04:00
Andrew Gallant
9a858e4909 doc: add config file note for --type-{add,clear}
This clarifies that persistence is possible via a configuration file.

Fixes #1571
2020-05-08 23:24:40 -04:00
Andrew Gallant
7ed9a31819 printer: fix --count-matches output
In order to implement --count-matches, we simply re-execute the regex on
the spans reported by the searcher. The spans always correspond to the
lines that participated in the match. This is the correct thing to do,
except when the regex contains look-ahead (or look-behind).

In particular, the look-around permits the regex's match success to
depends on an arbitrary point before or after the lines actually
reported as participating in the match. Since only the matched lines are
reported to the printer, it is possible for subsequent searching on
those lines to fail.

A true fix for this would somehow make the total span available to the
printer. But that seems tricky since it isn't always available. For
PCRE2's case in multiline mode, it is available because we force it to
be so for correctness.

For now, we simply detect this corner case heuristically. If the match
count is zero, then it necessarily means there is some kind of
look-around that isn't matching. So we set the match count to 1. This is
probably incorrect in some cases, although my brain can't quite come up
with a concrete example. Nevertheless, this is strictly better than the
status quo.

Fixes #1573
2020-05-08 23:24:40 -04:00
Andrew Gallant
1c4b5adb7b
regex: fix another inner literal bug
It looks like `is_simple` wasn't quite correct.

I can't wait until this code is rewritten. It is still not quite clearly
correct to me.

Fixes #1537
2020-04-01 20:37:48 -04:00
Andrew Gallant
1bb30b72fc
changelog: prepare for 12.0.1 release, redux 2020-03-29 18:50:31 -04:00
Andrew Gallant
58c428827d
changelog: prepare for 12.0.1 release 2020-03-29 18:47:46 -04:00
Andrew Gallant
34edb8123a
ignore: squash noisy error message
We should not assume that the commondir file actually exists. If it
doesn't, then just move on. This otherwise emits an error message when
searching normal submodules, which is not OK.

This regression was introduced in #1446.

Fixes #1520
2020-03-16 18:50:02 -04:00
Andrew Gallant
a8c1fb7c88 changelog: prepare for 12.0.0 release 2020-03-15 21:06:45 -04:00
Andrew Gallant
e772a95b58 regex: avoid using literal optimizations when whitespace is detected
If a literal is entirely whitespace, then it's quite likely that it is
very common. So when that case occurs, just don't do (inner) literal
optimizations at all.

The regex engine may still make sub-optimal decisions here, but that's a
problem for another day.

Fixes #1087
2020-03-15 13:19:14 -04:00
Andrew Gallant
c4c43c733e cli: add --no-ignore-files flag
The purpose of this flag is to force ripgrep to ignore all --ignore-file
flags (whether they come before or after --no-ignore-files).

This flag can be overridden with --ignore-files.

Fixes #1466
2020-03-15 13:19:14 -04:00
Andrew Gallant
447506ebe0 doc: clarify globing behavior
Fixes #1442, Fixes #1478
2020-03-15 13:19:14 -04:00
Andrew Gallant
12e4180985 doc: remove CPU features from man pages
It doesn't really belong in the man page since it's an artifact of a
build/runtime configuration. Moreover, it inhibits reproducible builds.

Fixes #1441
2020-03-15 13:19:14 -04:00
Andrew Gallant
daa8319398 doc: note ripgrep's stdin behavior
Fixes #1439
2020-03-15 13:19:14 -04:00
pierrenn
3a6a24a52a
cli: add engine flag
This permits switching between the different regex engine modes that
ripgrep supports. The purpose of this flag is to make it easier to
extend ripgrep with additional regex engines.

Closes #1488, Closes #1502
2020-03-15 09:30:58 -04:00
Andrew Gallant
66f045e055
changelog: add commit links
... now that we have stable identifiers.
2020-02-17 17:34:19 -05:00
Andrew Gallant
52d7f47420 ignore: treat symbolic links to directories as directories
Due to how walkdir works if symlinks are not followed, symlinks to
directories are seen as simple files by ripgrep. This caused a panic
in some cases due to receiving a WalkEvent::Exit event without a
corresponding WalkEvent::Dir event.

This is fixed by looking at the metadata of the file in the case of a
symlink to determine if it's a directory. We are careful to only do
this stat check when the depth of the entry is 0, as this bug only
impacts us when 1) we aren't following symlinks generally and 2) the
user provides a symlinked directory that we do follow as a top-level
path to search.

Fixes #1389, Closes #1397
2020-02-17 17:16:28 -05:00