ripgrep

mirror of https://github.com/BurntSushi/ripgrep.git synced 2024-12-12 19:18:24 +02:00

Author	SHA1	Message	Date
Andrew Gallant	17dcc2bf51	doc: clarify that files override gitignores This attempts to fix some mild confusion that came up as part of #1574. Specifically: https://github.com/BurntSushi/ripgrep/issues/1574#issuecomment-625780436	2020-05-08 23:24:40 -04:00
Andrew Gallant	9a858e4909	doc: add config file note for --type-{add,clear} This clarifies that persistence is possible via a configuration file. Fixes #1571	2020-05-08 23:24:40 -04:00
Andrew Gallant	cbfbe9312f	snap: remove snapcraft configuration This hasn't been updated in ages and it's not clear what purpose it's serving.	2020-05-08 23:24:40 -04:00
Andrew Gallant	7ed9a31819	printer: fix --count-matches output In order to implement --count-matches, we simply re-execute the regex on the spans reported by the searcher. The spans always correspond to the lines that participated in the match. This is the correct thing to do, except when the regex contains look-ahead (or look-behind). In particular, the look-around permits the regex's match success to depend on an arbitrary point before or after the lines actually reported as participating in the match. Since only the matched lines are reported to the printer, it is possible for subsequent searching on those lines to fail. A true fix for this would somehow make the total span available to the printer. But that seems tricky since it isn't always available. For PCRE2's case in multiline mode, it is available because we force it to be so for correctness. For now, we simply detect this corner case heuristically. If the match count is zero, then it necessarily means there is some kind of look-around that isn't matching. So we set the match count to 1. This is probably incorrect in some cases, although my brain can't quite come up with a concrete example. Nevertheless, this is strictly better than the status quo. Fixes #1573	2020-05-08 23:24:40 -04:00
Andrew Gallant	a2e6aec7a4	tests: add new regression test for fixed inner literal bug This adds a new test case for a bug (#1537) that has already been fixed. Or more precisely, a new bug with the same root cause. Closes #1559	2020-04-23 08:37:04 -04:00
Andrew Gallant	73103df6d9	deps: small dependency updates	2020-04-18 11:33:27 -04:00
Andrew Gallant	139f186e57	crates/ignore: switch to depth first traversal This replaces the use of channels in the parallel directory traversal with a simple stack. The primary motivation for this change is to reduce peak memory usage. In particular, when using a channel (which is a queue), we wind up visiting files in a breadth first fashion. Using a stack switches us to a depth first traversal. While there are no real intrinsic differences, depth first traversal generally tends to use less memory because directory trees are more commonly wide than they are deep. In particular, the queue/stack size itself is not the only concern. In one recent case documented in #1550, a user wanted to search all Rust crates. The directory structure was shallow but extremely wide, with a single directory containing all crates. This in turn results in descending into each of those directories and building a gitignore matcher for each (since most crates have `.gitignore` files) before ever searching a single file. This means that ripgrep has all such matchers in memory simultaneously, which winds up using quite a bit of memory. In a depth first traversal, peak memory usage is much lower because gitignore matchers are built and discarded more quickly. In the case of searching all crates, the peak memory usage decrease is dramatic. On my system, it shrinks by an order of magnitude, from almost 1GB to 50MB. The decline in peak memory usage is consistent across other use cases as well, but is typically more modest. For example, searching the Linux repo has a 50% decrease in peak memory usage and searching the Chromium repo has a 25% decrease in peak memory usage. Search times generally remain unchanged, although some ad hoc benchmarks that I typically run have gotten a bit slower. As far as I can tell, this appears to be the result of scheduling changes. Namely, the depth first traversal seems to result in searching some very large files towards the end of the search, which reduces the effectiveness of parallelism and makes the overall search take longer. This seems to suggest that a stack isn't optimal. It would instead perhaps be better to prioritize searching larger files first, but it's not quite clear how to do this without introducing more overhead (getting the file size for each file requires a stat call). Fixes #1550	2020-04-18 11:33:03 -04:00
Andrew Gallant	afb325f733	readme: fix ordering of benchmarks Results remain the same. I just didn't order them correctly.	2020-04-16 12:03:46 -04:00
Andrew Gallant	40af352d74	github: add necessary metadata	2020-04-14 16:28:09 -04:00
Andrew Gallant	3f1d4b397d	github: switch to new issue template format And also point folks toward Discussions.	2020-04-14 16:23:47 -04:00
Andrew Gallant	a75b4d122a	doc: fix newline escape Fixes #1551	2020-04-13 08:49:27 -04:00
Simon Robin	f51b762c6d	pkg: fix brew tap version It wasn't updated after the 12.0.1 release, even though the SHA values were. PR #1545	2020-04-07 19:45:53 -04:00
Andrew Gallant	49de7b119c	ci: disable man page check It appears to be intermittently failing. Specifically, a2x seems to be failing occasionally with no apparent reason why. The error message it gives is inscrutable. Sigh.	2020-04-01 21:18:04 -04:00
Andrew Gallant	1c4b5adb7b	regex: fix another inner literal bug It looks like `is_simple` wasn't quite correct. I can't wait until this code is rewritten. It is still not quite clearly correct to me. Fixes #1537	2020-04-01 20:37:48 -04:00
Marius Schulz	3d6a58faff	doc: fix typo in help description PR #1536	2020-03-30 17:31:16 -04:00
Andrew Gallant	5b6ca04e39	ci: upgrade to actions/checkout@v2 In particular, this appears to fix an extremely annoying bug that was causing PR builds to fail if they were re-run. For more details: https://github.com/actions/checkout/issues/23#issuecomment-572688577	2020-03-30 17:09:41 -04:00
Andrew Gallant	47f20c2661	pkg: update brew tap to 12.0.1	2020-03-29 19:18:57 -04:00
Andrew Gallant	1d5b1011e5	12.0.1	2020-03-29 18:59:40 -04:00
Andrew Gallant	1bb30b72fc	changelog: prepare for 12.0.1 release, redux	2020-03-29 18:50:31 -04:00
Andrew Gallant	09a4b75baf	ignore-0.4.14	2020-03-29 18:49:01 -04:00
Andrew Gallant	58c428827d	changelog: prepare for 12.0.1 release	2020-03-29 18:47:46 -04:00
Andrew Gallant	b9bb04b793	deps: minor dependency updates	2020-03-29 18:47:15 -04:00
Zoltan Puskas	4dfea016b9	ignore/types: add ebuild type Add support for Gentoo's portage package manager spec files: https://wiki.gentoo.org/wiki/Portage	2020-03-29 18:44:04 -04:00
Andrew Gallant	3193d57ac1	ci: attempt to fix CI It looks like a2x isn't working, so take a shot at fixing it.	2020-03-28 21:36:29 -04:00
Andrew Gallant	67c0f576b6	ignore-0.4.13	2020-03-22 21:08:37 -04:00
Andrew Gallant	543f99dbf1	grep-regex-0.1.7	2020-03-22 21:08:19 -04:00
Andrew Gallant	0ea65efd6d	regex: special case literal extraction In a prior commit, we fixed a performance problem with the -w flag by doing a little extra work to extract literals. It turns out that using literals in this case when the -w flag is NOT used results in a performance regression. The reasoning is that we end up using a "fast" regex as a prefilter when the regex engine itself uses its own equivalent prefilter, so ripgrep ends up redoing a fair amount of work. Instead, we only do this extra work when we know the -w flag is enabled.	2020-03-22 21:02:51 -04:00
Paul A. Patience	20deae6497	tests: fix typo in test name PR #1528	2020-03-22 07:43:16 -04:00
Andrew Gallant	655e33219a	crates.io: remove badges ... and don't replace them with anything because crates.io does not support GitHub Actions yet. But it's almost there: https://github.com/rust-lang/crates.io/pull/1838 Thanks @atouchet for noticing this.	2020-03-17 17:50:37 -04:00
Andrew Gallant	8ba6ccd159	ignore: fix failing test This fixes fallout from fixing #1520.	2020-03-16 19:16:24 -04:00
Andrew Gallant	34edb8123a	ignore: squash noisy error message We should not assume that the commondir file actually exists. If it doesn't, then just move on. This otherwise emits an error message when searching normal submodules, which is not OK. This regression was introduced in #1446. Fixes #1520	2020-03-16 18:50:02 -04:00
Andrew Gallant	5b30c2aed6	ci: fix deb build script	2020-03-15 22:11:32 -04:00
Andrew Gallant	bf1027a83e	pkg: update brew tap to 12.0.0	2020-03-15 22:10:08 -04:00
Andrew Gallant	031264e5fb	ci: tweak release name This is consistent with prior releases.	2020-03-15 22:07:22 -04:00
Andrew Gallant	b9cd95faf1	release: 12.0.0, take 2	2020-03-15 21:54:11 -04:00
Andrew Gallant	92daa34eb3	ripgrep: release 12.0.0	2020-03-15 21:42:54 -04:00
Andrew Gallant	a8c1fb7c88	changelog: prepare for 12.0.0 release	2020-03-15 21:06:45 -04:00
Andrew Gallant	52ec68799c	ci: make script names consistent	2020-03-15 21:06:45 -04:00
Andrew Gallant	c0d78240df	ci: remove Travis and appveyor specific stuff	2020-03-15 21:06:45 -04:00
Andrew Gallant	cda9acb876	ci: rebuild release infrastructure on GitHub Actions	2020-03-15 21:06:45 -04:00
Andrew Gallant	1ece50694e	readme: update file size	2020-03-15 13:27:31 -04:00
Andrew Gallant	f3a966bcbc	readme: add 'Unicode' label to ugrep	2020-03-15 13:26:02 -04:00
Andrew Gallant	a38913b63a	readme: update benchmarks This also updates the corpora used, so previous times (and counts) are not comparable. We also remove some tools, like pt, sift and ucg, since they appear to be no longer maintained. ag isn't really maintained either, but it still has significant mind share, so we retain a benchmark for it. We also upgrade ack to version 3, and remove the clarification on how `-w` is implemented. We also add `git grep -P` (uses PCRE2) which appears to be much faster than `git grep -E`. Finally, we add ugrep which is a new up and comer in this space. Fixes #1474	2020-03-15 13:21:18 -04:00
Andrew Gallant	e772a95b58	regex: avoid using literal optimizations when whitespace is detected If a literal is entirely whitespace, then it's quite likely that it is very common. So when that case occurs, just don't do (inner) literal optimizations at all. The regex engine may still make sub-optimal decisions here, but that's a problem for another day. Fixes #1087	2020-03-15 13:19:14 -04:00
Andrew Gallant	9dd4bf8d7f	style: fix rust-analyzer lint warnings	2020-03-15 13:19:14 -04:00
Andrew Gallant	c4c43c733e	cli: add --no-ignore-files flag The purpose of this flag is to force ripgrep to ignore all --ignore-file flags (whether they come before or after --no-ignore-files). This flag can be overridden with --ignore-files. Fixes #1466	2020-03-15 13:19:14 -04:00
Andrew Gallant	447506ebe0	doc: clarify globbing behavior Fixes #1442, Fixes #1478	2020-03-15 13:19:14 -04:00
Andrew Gallant	12e4180985	doc: remove CPU features from man pages It doesn't really belong in the man page since it's an artifact of a build/runtime configuration. Moreover, it inhibits reproducible builds. Fixes #1441	2020-03-15 13:19:14 -04:00
Andrew Gallant	daa8319398	doc: note ripgrep's stdin behavior Fixes #1439	2020-03-15 13:19:14 -04:00
pierrenn	3a6a24a52a	cli: add engine flag This permits switching between the different regex engine modes that ripgrep supports. The purpose of this flag is to make it easier to extend ripgrep with additional regex engines. Closes #1488, Closes #1502	2020-03-15 09:30:58 -04:00
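
The memory argument in commit 139f186e57 (queue versus stack for directory traversal) can be illustrated with a small single-threaded model. This is only a sketch under stated assumptions, not ripgrep's actual code: `Stats`, `Matcher`, `Work` and `peak_live_matchers` are hypothetical names, the real traversal is parallel, and the "matcher" here is just a counter standing in for a per-directory gitignore matcher. The model uses a root directory with `width` subdirectories, each holding `files` files. Every pending work item keeps its parent directory's matcher alive until the item has been processed.

```rust
use std::cell::Cell;
use std::collections::VecDeque;
use std::rc::Rc;

/// Tracks how many matchers are alive now and the highest count seen so far.
#[derive(Default)]
struct Stats {
    live: Cell<usize>,
    peak: Cell<usize>,
}

/// Hypothetical stand-in for a per-directory gitignore matcher.
struct Matcher {
    stats: Rc<Stats>,
}

impl Matcher {
    /// "Builds" a matcher and records it in the live/peak counters.
    fn build(stats: &Rc<Stats>) -> Rc<Matcher> {
        let live = stats.live.get() + 1;
        stats.live.set(live);
        stats.peak.set(stats.peak.get().max(live));
        Rc::new(Matcher { stats: Rc::clone(stats) })
    }
}

impl Drop for Matcher {
    fn drop(&mut self) {
        self.stats.live.set(self.stats.live.get() - 1);
    }
}

/// A unit of pending work. Each item keeps its parent's matcher alive.
enum Work {
    Dir { depth: usize, _parent: Option<Rc<Matcher>> },
    File { _parent: Rc<Matcher> },
}

/// Walks the model tree and returns the peak number of live matchers.
/// `depth_first == false` pops from the front (queue, breadth first);
/// `depth_first == true` pops from the back (stack, depth first).
fn peak_live_matchers(width: usize, files: usize, depth_first: bool) -> usize {
    let stats = Rc::new(Stats::default());
    let mut pending: VecDeque<Work> = VecDeque::new();
    pending.push_back(Work::Dir { depth: 0, _parent: None });

    loop {
        let next = if depth_first { pending.pop_back() } else { pending.pop_front() };
        let item = match next {
            Some(item) => item,
            None => break,
        };
        if let Work::Dir { depth, .. } = &item {
            // Entering a directory builds its matcher; children share it.
            let matcher = Matcher::build(&stats);
            if *depth == 0 {
                for _ in 0..width {
                    pending.push_back(Work::Dir { depth: 1, _parent: Some(Rc::clone(&matcher)) });
                }
            } else {
                for _ in 0..files {
                    pending.push_back(Work::File { _parent: Rc::clone(&matcher) });
                }
            }
        }
        // `item` is dropped here, releasing its hold on the parent matcher.
    }
    stats.peak.get()
}

fn main() {
    let (width, files) = (1000, 10);
    println!("breadth first peak: {}", peak_live_matchers(width, files, false));
    println!("depth first peak:   {}", peak_live_matchers(width, files, true));
}
```

In this model the breadth-first run has a matcher alive for every subdirectory before the first file is processed, while the depth-first run finishes each subdirectory's files, and drops its matcher, before entering the next one. The absolute numbers say nothing about ripgrep's real memory use; only the shape of the difference carries over.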


1590 Commits