1
0
mirror of https://github.com/BurntSushi/ripgrep.git synced 2025-01-19 05:49:14 +02:00

1450 Commits

Author SHA1 Message Date
Andrew Gallant
fac47906e6 doc: add a release checklist
The steps are numerous, subtle and complex enough that it's worth
writing them down. In particular, getting the order correct is
important. (i.e., If we released to crates.io first and the GitHub
release infrastructure failed, then we'd be in a pickle.)
2020-05-08 23:24:40 -04:00
Andrew Gallant
e02bb6b99a changelog: add downstream notices 2020-05-08 23:24:40 -04:00
Chayoung You
16a1221fc7 doc: use asciidoctor instead of a2x
AsciiDoc development is continued under asciidoctor. See
https://github.com/asciidoc/asciidoc.

We do however fallback to a2x if asciidoctor is not present. This is to
ease migration, but at some point, it's likely that support for a2x will
be dropped.

Originally reported downstream:
https://github.com/Homebrew/linuxbrew-core/issues/19885

Closes #1544
2020-05-08 23:24:40 -04:00
Casey Rodarmor
793c1179cc ignore: allow filtering with predicate
Adds `WalkBuilder::filter_entry` that takes a predicate to be applied to
all entries. If the predicate returns `false` on a given entry, that
entry and all children will be skipped.

Fixes #1555, Closes #1557
2020-05-08 23:24:40 -04:00
Wieland Hoffmann
df7a3bfc7f grep-cli: support files compressed by compress(1)
While Linux distributions (at least Arch Linux, RHEL, Debian) do not support
compressing files with compress(1), macOS & AIX do (the utility is part of
POSIX). Additionally, gzip is able to uncompress such compressed files and
provides an `uncompress` binary.

Closes #1547
2020-05-08 23:24:40 -04:00
Andrew Gallant
28f2a93cae doc: shorten -h/--help prelude
It has grown quite long. It would be nice if we could shorten this only
when -h is used and keep it long for --help, but it seems clap doesn't
let this happen. (It does have `about` and `long_about` options, but
they don't work, even when I disable the use of the template.)

The longer prelude is now only available in the man page.

This addresses #189.
2020-05-08 23:24:40 -04:00
Andrew Gallant
0eb2501b6e doc: add a section about --pre to the GUIDE
Fixes #1252
2020-05-08 23:24:40 -04:00
Andrew Gallant
184c15882e doc: add -U/--multiline to common options 2020-05-08 23:24:40 -04:00
Andrew Gallant
64a4dee495 cli: improve invalid UTF-8 pattern error message
When a pattern with invalid UTF-8 is given, the error message suggests
unqualified use of hex escape sequences to match arbitrary bytes. But
you *also* need to disable Unicode mode. So include that in the error
message.

Fixes #1339
2020-05-08 23:24:40 -04:00
Andrew Gallant
50840ea43b doc: note how to escape a '$' in --replace
Fixes #1524
2020-05-08 23:24:40 -04:00
Andrew Gallant
17dcc2bf51 doc: clarify that *files* override gitignores
This attempts to fix some mild confusion that came up as part of #1574.
Specifically:
https://github.com/BurntSushi/ripgrep/issues/1574#issuecomment-625780436
2020-05-08 23:24:40 -04:00
Andrew Gallant
9a858e4909 doc: add config file note for --type-{add,clear}
This clarifies that persistence is possible via a configuration file.

Fixes #1571
2020-05-08 23:24:40 -04:00
Andrew Gallant
cbfbe9312f snap: remove snapcraft configuration
This hasn't been updated in ages and it's not clear what purpose it's
serving.
2020-05-08 23:24:40 -04:00
Andrew Gallant
7ed9a31819 printer: fix --count-matches output
In order to implement --count-matches, we simply re-execute the regex on
the spans reported by the searcher. The spans always correspond to the
lines that participated in the match. This is the correct thing to do,
except when the regex contains look-ahead (or look-behind).

In particular, the look-around permits the regex's match success to
depends on an arbitrary point before or after the lines actually
reported as participating in the match. Since only the matched lines are
reported to the printer, it is possible for subsequent searching on
those lines to fail.

A true fix for this would somehow make the total span available to the
printer. But that seems tricky since it isn't always available. For
PCRE2's case in multiline mode, it is available because we force it to
be so for correctness.

For now, we simply detect this corner case heuristically. If the match
count is zero, then it necessarily means there is some kind of
look-around that isn't matching. So we set the match count to 1. This is
probably incorrect in some cases, although my brain can't quite come up
with a concrete example. Nevertheless, this is strictly better than the
status quo.

Fixes #1573
2020-05-08 23:24:40 -04:00
Andrew Gallant
a2e6aec7a4
tests: add new regression test for fixed inner literal bug
This adds a new test case for a bug (#1537) that has already been fixed.
Or more precisely, a new bug with the same root cause.

Closes #1559
2020-04-23 08:37:04 -04:00
Andrew Gallant
73103df6d9
deps: small dependency updates 2020-04-18 11:33:27 -04:00
Andrew Gallant
139f186e57 crates/ignore: switch to depth first traversal
This replaces the use of channels in the parallel directory traversal
with a simple stack. The primary motivation for this change is to reduce
peak memory usage. In particular, when using a channel (which is a
queue), we wind up visiting files in a breadth first fashion. Using a
stack switches us to a depth first traversal. While there are no real
intrinsic differences, depth first traversal generally tends to use less
memory because directory trees are more commonly wide than they are
deep.

In particular, the queue/stack size itself is not the only concern. In
one recent case documented in #1550, a user wanted to search all Rust
crates. The directory structure was shallow but extremely wide, with a
single directory containing all crates. This in turn results is in
descending into each of those directories and building a gitignore
matcher for each (since most crates have `.gitignore` files) before ever
searching a single file. This means that ripgrep has all such matchers
in memory simultaneously, which winds up using quite a bit of memory.

In a depth first traversal, peak memory usage is much lower because
gitignore matches are built and discarded more quickly. In the case of
searching all crates, the peak memory usage decrease is dramatic. On my
system, it shrinks by an order magnitude, from almost 1GB to 50MB. The
decline in peak memory usage is consistent across other use cases as
well, but is typically more modest. For example, searching the Linux
repo has a 50% decrease in peak memory usage and searching the Chromium
repo has a 25% decrease in peak memory usage.

Search times generally remain unchanged, although some ad hoc benchmarks
that I typically run have gotten a bit slower. As far as I can tell,
this appears to be result of scheduling changes. Namely, the depth first
traversal seems to result in searching some very large files towards the
end of the search, which reduces the effectiveness of parallelism and
makes the overall search take longer. This seems to suggest that a stack
isn't optimal. It would instead perhaps be better to prioritize
searching larger files first, but it's not quite clear how to do this
without introducing more overhead (getting the file size for each file
requires a stat call).

Fixes #1550
2020-04-18 11:33:03 -04:00
Andrew Gallant
afb325f733
readme: fix ordering of benchmarks
Results remain the same. I just didn't order them correctly.
2020-04-16 12:03:46 -04:00
Andrew Gallant
40af352d74
github: add necessary metadata 2020-04-14 16:28:09 -04:00
Andrew Gallant
3f1d4b397d
github: switch to new issue template format
And also point folks toward Discussions.
2020-04-14 16:23:47 -04:00
Andrew Gallant
a75b4d122a
doc: fix newline escape
Fixes #1551
2020-04-13 08:49:27 -04:00
Simon Robin
f51b762c6d
pkg: fix brew tap version
It wasn't updated after the 12.0.1 release, even though the
SHA values were.

PR #1545
2020-04-07 19:45:53 -04:00
Andrew Gallant
49de7b119c
ci: disable man page check
It appears to be intermittently failing. Specifically, a2x seems to be
failing occasionally with no apparent reason why. The error message it
gives is inscrutable. Sigh.
2020-04-01 21:18:04 -04:00
Andrew Gallant
1c4b5adb7b
regex: fix another inner literal bug
It looks like `is_simple` wasn't quite correct.

I can't wait until this code is rewritten. It is still not quite clearly
correct to me.

Fixes #1537
2020-04-01 20:37:48 -04:00
Marius Schulz
3d6a58faff
doc: fix typo in help description
PR #1536
2020-03-30 17:31:16 -04:00
Andrew Gallant
5b6ca04e39
ci: upgrade to actions/checkout@v2
In particular, this appears to fix an extremely annoying bug that was
causing PR builds to fail if they were re-run.

For more details:
https://github.com/actions/checkout/issues/23#issuecomment-572688577
2020-03-30 17:09:41 -04:00
Andrew Gallant
47f20c2661
pkg: update brew tap to 12.0.1 2020-03-29 19:18:57 -04:00
Andrew Gallant
1d5b1011e5
12.0.1 12.0.1 2020-03-29 18:59:40 -04:00
Andrew Gallant
1bb30b72fc
changelog: prepare for 12.0.1 release, redux 2020-03-29 18:50:31 -04:00
Andrew Gallant
09a4b75baf
ignore-0.4.14 ignore-0.4.14 2020-03-29 18:49:01 -04:00
Andrew Gallant
58c428827d
changelog: prepare for 12.0.1 release 2020-03-29 18:47:46 -04:00
Andrew Gallant
b9bb04b793
deps: minor dependency updates 2020-03-29 18:47:15 -04:00
Zoltan Puskas
4dfea016b9 ignore/types: add ebuild type
Add support for Gentoo's portage package manager spec files:
https://wiki.gentoo.org/wiki/Portage
2020-03-29 18:44:04 -04:00
Andrew Gallant
3193d57ac1
ci: attempt to fix CI
It looks like a2x isn't working, so take a shot at fixing it.
2020-03-28 21:36:29 -04:00
Andrew Gallant
67c0f576b6
ignore-0.4.13 ignore-0.4.13 2020-03-22 21:08:37 -04:00
Andrew Gallant
543f99dbf1
grep-regex-0.1.7 grep-regex-0.1.7 2020-03-22 21:08:19 -04:00
Andrew Gallant
0ea65efd6d
regex: special case literal extraction
In a prior commit, we fixed a performance problem with the -w flag by
doing a little extra work to extract literals. It turns out that using
literals in this case when the -w flag is NOT used results in a
performance regression. The reasoning is that we end up using a "fast"
regex as a prefilter when the regex engine itself uses its own
equivalent prefilter, so ripgrep ends up redoing a fair amount of work.

Instead, we only do this extra work when we know the -w flag is enabled.
2020-03-22 21:02:51 -04:00
Paul A. Patience
20deae6497
tests: fix typo in test name
PR #1528
2020-03-22 07:43:16 -04:00
Andrew Gallant
655e33219a
crates.io: remove badges
... and don't replace them with anything because crates.io does not
support GitHub Actions yet. But it's almost there:
https://github.com/rust-lang/crates.io/pull/1838

Thanks @atouchet for noticing this.
2020-03-17 17:50:37 -04:00
Andrew Gallant
8ba6ccd159
ignore: fix failing test
This fixes fallout from fixing #1520.
2020-03-16 19:16:24 -04:00
Andrew Gallant
34edb8123a
ignore: squash noisy error message
We should not assume that the commondir file actually exists. If it
doesn't, then just move on. This otherwise emits an error message when
searching normal submodules, which is not OK.

This regression was introduced in #1446.

Fixes #1520
2020-03-16 18:50:02 -04:00
Andrew Gallant
5b30c2aed6
ci: fix deb build script 2020-03-15 22:11:32 -04:00
Andrew Gallant
bf1027a83e
pkg: update brew tap to 12.0.0 2020-03-15 22:10:08 -04:00
Andrew Gallant
031264e5fb
ci: tweak release name
This is consistent with prior releases.
2020-03-15 22:07:22 -04:00
Andrew Gallant
b9cd95faf1
release: 12.0.0, take 2 12.0.0 2020-03-15 21:54:11 -04:00
Andrew Gallant
92daa34eb3
ripgrep: release 12.0.0 grep-searcher-0.1.7 grep-regex-0.1.6 grep-printer-0.1.4 grep-pcre2-0.1.4 grep-matcher-0.1.4 ignore-0.4.12 grep-0.2.5 globset-0.4.5 grep-cli-0.1.4 2020-03-15 21:42:54 -04:00
Andrew Gallant
a8c1fb7c88 changelog: prepare for 12.0.0 release 2020-03-15 21:06:45 -04:00
Andrew Gallant
52ec68799c ci: make script names consistent 2020-03-15 21:06:45 -04:00
Andrew Gallant
c0d78240df ci: remove Travis and appveyor specific stuff 2020-03-15 21:06:45 -04:00
Andrew Gallant
cda9acb876 ci: rebuild release infrastructure on GitHub Actions 2020-03-15 21:06:45 -04:00