1
0
mirror of https://github.com/BurntSushi/ripgrep.git synced 2024-12-12 19:18:24 +02:00
Commit Graph

58 Commits

Author SHA1 Message Date
dana
145cef2eff
doc: elaborate on the function of -u/--unrestricted
Fixes #1703
2020-10-16 09:52:42 -04:00
Andy Freeland
fc2a99bb1f
ignore/types: add vcl (#1659)
VCL is the Varnish Configuration Language used by Varnish and Fastly.

https://varnish-cache.org/docs/trunk/users-guide/vcl.html

PR #1659
2020-08-19 16:28:14 -04:00
Raimon Grau (rgrau)
ffd4c9ccba
ignore/types: add racket
PR #1628
2020-06-25 08:51:32 -04:00
jtrakk
a16bfcb3d6
ignore/types: add dvc
This provides support for DVC files (https://dvc.org/).

PR #1608
2020-06-09 07:44:09 -04:00
Martin Michlmayr
1b2c1dc675
doc: fix typos
PR #1605
2020-06-04 09:06:09 -04:00
Andrew Gallant
f97cc623f7
grep-0.2.7 2020-05-29 09:17:24 -04:00
Andrew Gallant
f35de5c523
grep: update minimal dependency versions 2020-05-29 09:17:08 -04:00
Andrew Gallant
c9bb78ceba
grep-cli-0.1.5 2020-05-29 09:14:18 -04:00
Andrew Gallant
72bdde6771
ignore-0.4.16 2020-05-29 09:13:02 -04:00
Andy Salerno
e8822ce97a
ignore/doc: update misleading documentation
This likely originated from a bad copy/paste.

PR #1596
2020-05-24 23:12:53 -04:00
Andrew Gallant
a700b75843
doc: clarify capture group indices
And in particular, note the special $0 index, which corresponds to the
entire match.

Fixes #1591
2020-05-21 22:22:51 -04:00
Gerion Entrup
b72ad8f8aa
ignore/types: add meson filetype
Closes #1586, PR #1587
2020-05-18 14:01:35 -04:00
Andrew Gallant
1980630f17
doc: fix egregious markup output
We use '+++' syntax to output a literal '**' for a '--glob' example.
This '+++' syntax is pretty ugly when rendered literally via --help. We
fix this by hackily inserting the '+++' syntax for its one specific case
that we need it during man page generation.

Not ideal but it works. And --help still has some '*foo*' markup, but we
live with that for now.

Fixes #1581
2020-05-13 08:13:05 -04:00
Andrew Gallant
72807462e8
deps: update minimal versions for dependencies 2020-05-09 10:39:43 -04:00
Andrew Gallant
08dee094dd
grep-0.2.6 2020-05-09 10:37:29 -04:00
Andrew Gallant
caa53b7b09
grep: update minimal dependency versions 2020-05-09 10:37:08 -04:00
Andrew Gallant
c5d6141562
grep-printer-0.1.5 2020-05-09 10:33:02 -04:00
Andrew Gallant
c0f0492b98
grep-regex-0.1.8 2020-05-09 10:31:29 -04:00
Andrew Gallant
568018386b
ignore-0.4.15 2020-05-09 10:27:19 -04:00
Andrew Gallant
b458cf39f2
deps: update to base64 0.12
No code changes were necessary.
2020-05-09 10:25:37 -04:00
Casey Rodarmor
793c1179cc ignore: allow filtering with predicate
Adds `WalkBuilder::filter_entry` that takes a predicate to be applied to
all entries. If the predicate returns `false` on a given entry, that
entry and all children will be skipped.

Fixes #1555, Closes #1557
2020-05-08 23:24:40 -04:00
Wieland Hoffmann
df7a3bfc7f grep-cli: support files compressed by compress(1)
While Linux distributions (at least Arch Linux, RHEL, Debian) do not support
compressing files with compress(1), macOS & AIX do (the utility is part of
POSIX). Additionally, gzip is able to uncompress such compressed files and
provides an `uncompress` binary.

Closes #1547
2020-05-08 23:24:40 -04:00
Andrew Gallant
28f2a93cae doc: shorten -h/--help prelude
It has grown quite long. It would be nice if we could shorten this only
when -h is used and keep it long for --help, but it seems clap doesn't
let this happen. (It does have `about` and `long_about` options, but
they don't work, even when I disable the use of the template.)

The longer prelude is now only available in the man page.

This addresses #189.
2020-05-08 23:24:40 -04:00
Andrew Gallant
64a4dee495 cli: improve invalid UTF-8 pattern error message
When a pattern with invalid UTF-8 is given, the error message suggests
unqualified use of hex escape sequences to match arbitrary bytes. But
you *also* need to disable Unicode mode. So include that in the error
message.

Fixes #1339
2020-05-08 23:24:40 -04:00
Andrew Gallant
50840ea43b doc: note how to escape a '$' in --replace
Fixes #1524
2020-05-08 23:24:40 -04:00
Andrew Gallant
17dcc2bf51 doc: clarify that *files* override gitignores
This attempts to fix some mild confusion that came up as part of #1574.
Specifically:
https://github.com/BurntSushi/ripgrep/issues/1574#issuecomment-625780436
2020-05-08 23:24:40 -04:00
Andrew Gallant
9a858e4909 doc: add config file note for --type-{add,clear}
This clarifies that persistence is possible via a configuration file.

Fixes #1571
2020-05-08 23:24:40 -04:00
Andrew Gallant
7ed9a31819 printer: fix --count-matches output
In order to implement --count-matches, we simply re-execute the regex on
the spans reported by the searcher. The spans always correspond to the
lines that participated in the match. This is the correct thing to do,
except when the regex contains look-ahead (or look-behind).

In particular, the look-around permits the regex's match success to
depends on an arbitrary point before or after the lines actually
reported as participating in the match. Since only the matched lines are
reported to the printer, it is possible for subsequent searching on
those lines to fail.

A true fix for this would somehow make the total span available to the
printer. But that seems tricky since it isn't always available. For
PCRE2's case in multiline mode, it is available because we force it to
be so for correctness.

For now, we simply detect this corner case heuristically. If the match
count is zero, then it necessarily means there is some kind of
look-around that isn't matching. So we set the match count to 1. This is
probably incorrect in some cases, although my brain can't quite come up
with a concrete example. Nevertheless, this is strictly better than the
status quo.

Fixes #1573
2020-05-08 23:24:40 -04:00
Andrew Gallant
139f186e57 crates/ignore: switch to depth first traversal
This replaces the use of channels in the parallel directory traversal
with a simple stack. The primary motivation for this change is to reduce
peak memory usage. In particular, when using a channel (which is a
queue), we wind up visiting files in a breadth first fashion. Using a
stack switches us to a depth first traversal. While there are no real
intrinsic differences, depth first traversal generally tends to use less
memory because directory trees are more commonly wide than they are
deep.

In particular, the queue/stack size itself is not the only concern. In
one recent case documented in #1550, a user wanted to search all Rust
crates. The directory structure was shallow but extremely wide, with a
single directory containing all crates. This in turn results is in
descending into each of those directories and building a gitignore
matcher for each (since most crates have `.gitignore` files) before ever
searching a single file. This means that ripgrep has all such matchers
in memory simultaneously, which winds up using quite a bit of memory.

In a depth first traversal, peak memory usage is much lower because
gitignore matches are built and discarded more quickly. In the case of
searching all crates, the peak memory usage decrease is dramatic. On my
system, it shrinks by an order magnitude, from almost 1GB to 50MB. The
decline in peak memory usage is consistent across other use cases as
well, but is typically more modest. For example, searching the Linux
repo has a 50% decrease in peak memory usage and searching the Chromium
repo has a 25% decrease in peak memory usage.

Search times generally remain unchanged, although some ad hoc benchmarks
that I typically run have gotten a bit slower. As far as I can tell,
this appears to be result of scheduling changes. Namely, the depth first
traversal seems to result in searching some very large files towards the
end of the search, which reduces the effectiveness of parallelism and
makes the overall search take longer. This seems to suggest that a stack
isn't optimal. It would instead perhaps be better to prioritize
searching larger files first, but it's not quite clear how to do this
without introducing more overhead (getting the file size for each file
requires a stat call).

Fixes #1550
2020-04-18 11:33:03 -04:00
Andrew Gallant
a75b4d122a
doc: fix newline escape
Fixes #1551
2020-04-13 08:49:27 -04:00
Andrew Gallant
1c4b5adb7b
regex: fix another inner literal bug
It looks like `is_simple` wasn't quite correct.

I can't wait until this code is rewritten. It is still not quite clearly
correct to me.

Fixes #1537
2020-04-01 20:37:48 -04:00
Marius Schulz
3d6a58faff
doc: fix typo in help description
PR #1536
2020-03-30 17:31:16 -04:00
Andrew Gallant
09a4b75baf
ignore-0.4.14 2020-03-29 18:49:01 -04:00
Zoltan Puskas
4dfea016b9 ignore/types: add ebuild type
Add support for Gentoo's portage package manager spec files:
https://wiki.gentoo.org/wiki/Portage
2020-03-29 18:44:04 -04:00
Andrew Gallant
67c0f576b6
ignore-0.4.13 2020-03-22 21:08:37 -04:00
Andrew Gallant
543f99dbf1
grep-regex-0.1.7 2020-03-22 21:08:19 -04:00
Andrew Gallant
0ea65efd6d
regex: special case literal extraction
In a prior commit, we fixed a performance problem with the -w flag by
doing a little extra work to extract literals. It turns out that using
literals in this case when the -w flag is NOT used results in a
performance regression. The reasoning is that we end up using a "fast"
regex as a prefilter when the regex engine itself uses its own
equivalent prefilter, so ripgrep ends up redoing a fair amount of work.

Instead, we only do this extra work when we know the -w flag is enabled.
2020-03-22 21:02:51 -04:00
Andrew Gallant
8ba6ccd159
ignore: fix failing test
This fixes fallout from fixing #1520.
2020-03-16 19:16:24 -04:00
Andrew Gallant
34edb8123a
ignore: squash noisy error message
We should not assume that the commondir file actually exists. If it
doesn't, then just move on. This otherwise emits an error message when
searching normal submodules, which is not OK.

This regression was introduced in #1446.

Fixes #1520
2020-03-16 18:50:02 -04:00
Andrew Gallant
92daa34eb3
ripgrep: release 12.0.0 2020-03-15 21:42:54 -04:00
Andrew Gallant
e772a95b58 regex: avoid using literal optimizations when whitespace is detected
If a literal is entirely whitespace, then it's quite likely that it is
very common. So when that case occurs, just don't do (inner) literal
optimizations at all.

The regex engine may still make sub-optimal decisions here, but that's a
problem for another day.

Fixes #1087
2020-03-15 13:19:14 -04:00
Andrew Gallant
9dd4bf8d7f style: fix rust-analyzer lint warnings 2020-03-15 13:19:14 -04:00
Andrew Gallant
c4c43c733e cli: add --no-ignore-files flag
The purpose of this flag is to force ripgrep to ignore all --ignore-file
flags (whether they come before or after --no-ignore-files).

This flag can be overridden with --ignore-files.

Fixes #1466
2020-03-15 13:19:14 -04:00
Andrew Gallant
447506ebe0 doc: clarify globing behavior
Fixes #1442, Fixes #1478
2020-03-15 13:19:14 -04:00
Andrew Gallant
12e4180985 doc: remove CPU features from man pages
It doesn't really belong in the man page since it's an artifact of a
build/runtime configuration. Moreover, it inhibits reproducible builds.

Fixes #1441
2020-03-15 13:19:14 -04:00
Andrew Gallant
daa8319398 doc: note ripgrep's stdin behavior
Fixes #1439
2020-03-15 13:19:14 -04:00
pierrenn
3a6a24a52a
cli: add engine flag
This permits switching between the different regex engine modes that
ripgrep supports. The purpose of this flag is to make it easier to
extend ripgrep with additional regex engines.

Closes #1488, Closes #1502
2020-03-15 09:30:58 -04:00
pierrenn
aab3d80374
args: refactor to permit adding other engines
This is in preparation for adding a new --engine flag which is intended
to eventually supplant --auto-hybrid-regex.

While there are no immediate plans to add more regex engines to ripgrep,
this is intended to make it easier to maintain a patch to ripgrep with
an additional regex engine. See #1488 for more details.
2020-03-15 09:24:28 -04:00
Andrew Gallant
1856cda77b
style: fix rust-analyzer lints in core 2020-03-15 09:04:54 -04:00
chip
50d2047ae2
crates: update URLs in Cargo.toml
This corrects an oversight when the repo was re-organized to
have its crates moved into a 'crates' sub-directory.

PR #1505
2020-02-28 20:31:43 -05:00