ripgrep

mirror of https://github.com/BurntSushi/ripgrep.git synced 2024-12-12 19:18:24 +02:00

Author	SHA1	Message	Date
Andrew Gallant	91470572cd	changelog: add notes about new file types	2020-02-17 17:16:28 -05:00
Sven-Hendrik Haase	027adbf485	ignore/types: add 'diff' file type This includes .patch and .diff files. Fixes #1418, Closes #1419	2020-02-17 17:16:28 -05:00
Mohammad AlSaleh	e71eedf0eb	cli: add --no-context-separator flag --context-separator='' still adds a new line separator, which could still potentially be useful. So we add a new `--no-context-separator` flag that completely disables context separators even when the -A/-B/-C context flags are used. Closes #1390	2020-02-17 17:16:28 -05:00
Andrew Gallant	88f46d12f1	tests: remove existing test directory I'm surprised this wasn't caught until now, but if a test directory already exists, then it was reused. This can result in hard to debug problems with tests when, e.g., file names are changed and a recursive search is executed.	2020-02-17 17:16:28 -05:00
sharkdp	a18cf6ec39	ignore: add existence check for ignore files

This commit adds a simple `.exists()` check for `.gitignore`, `.ignore`, and other similar files before actually calling `File::open(…)` in `GitIgnoreBuilder::add`. The reason is that a simple existence check via `stat` can be faster than actually trying to `open` the file, see https://stackoverflow.com/a/12774387/704831. As we typically expect(?) the number of directories without ignore files to be much larger than the number of directories with ignore files, this leads to an overall speedup.

The performance gain is not huge for `rg`, but can be quite significant if more `.gitignore`-like files are added via `add_custom_ignore_filename`. The speedup is larger for folders with low files-per-directory ratios.

Note though that we do not do this check on Windows until a specific analysis there suggests this is beneficial. Namely, Windows generally has slower file system operations, so it's not clear whether this speculative check is actually a benefit or not.

Benchmark results
-----------------

`rg --files` in my home folder (200k results, 6.5 files per directory):

| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
|:---|---:|---:|---:|---:|
| `./rg-master --files` | 396.4 ± 3.2 | 390.9 | 400.0 | 1.05 |
| `./rg-feature --files` | 376.0 ± 3.6 | 369.3 | 383.5 | 1.00 |

`rg --files --hidden` in my home folder (800k results, 5.4 files per directory)

| Command | Mean [s] | Min [s] | Max [s] | Relative |
|:---|---:|---:|---:|---:|
| `./rg-master --files --hidden` | 1.575 ± 0.012 | 1.560 | 1.597 | 1.06 |
| `./rg-feature --files --hidden` | 1.479 ± 0.011 | 1.464 | 1.496 | 1.00 |

`rg --files` in the chromium-79.0.3915.2 source tree (300k results, 12.7 files per directory)

| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
|:---|---:|---:|---:|---:|
| `~/rg-master --files` | 445.2 ± 5.3 | 435.6 | 453.0 | 1.04 |
| `~/rg-feature --files` | 428.9 ± 7.0 | 418.2 | 440.0 | 1.00 |

`rg --files` in the linux-5.3 source tree (65k results, 15.1 files per directory)

| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
|:---|---:|---:|---:|---:|
| `./rg-master --files` | 94.5 ± 1.9 | 89.8 | 98.5 | 1.02 |
| `./rg-feature --files` | 92.6 ± 2.7 | 88.4 | 98.7 | 1.00 |

Closes #1381	2020-02-17 17:16:28 -05:00
Gibson Fahnestock	c78c3236a8	readme: remove outdated SIMD info Looks like the upstream brew Formula [0][] now has SIMD support, so remove the extraneous info now that the custom tap is no longer needed [1][]. [0]: https://github.com/Homebrew/homebrew-core/blob/master/Formula/ripgrep.rb [1]: `f3083e4574` PR #1431	2020-02-15 17:19:22 -05:00
Sorin Sbarnea	7cf21600cd	readme: document CentOS 8 support ripgrep install instructions are valid even for the 7 version. The tool works without problems on these too. PR #1428	2020-02-15 17:16:57 -05:00
Jonathan Mast	647b0d3977	ignore/types: add HAML and ERB These are commonly used templating languages for Ruby, add their extensions to the filetypes list for convenient filtering. PR #1407	2020-02-15 09:18:32 -05:00
Jeff S	e572fc1683	ignore/types: add slim, slime, and skim templates PR #1391	2020-02-15 09:17:46 -05:00
Andrew Gallant	9cb93abd11	ignore: allow use of Error::description We can remove it in the next semver incompatible release.	2020-02-10 06:44:21 -05:00
Luca Kredel	41695c66fa	ignore/types: add typoscript file type Add the file types for TypoScript - the configuration language of the TYPO3 CMS. PR #1477	2020-02-07 08:41:00 -05:00
Andrew Gallant	cb0dfda936	faq: add section about donations This is asked often enough that it's worth having a canonical answer.	2020-02-05 13:09:11 -05:00
Andrew Gallant	74d1fe59e9	deps: update everything	2020-01-30 18:33:40 -05:00
Andrew Gallant	9fd1e202e0	deps: update regex, regex-syntax and aho-corasick Notably, this brings in a bug fix reported by @okdana: https://github.com/rust-lang/regex/issues/640	2020-01-30 18:32:56 -05:00
Robert Irelan	e76807b1b5	ignore/types: add *.org_archive to org file type .org_archive is the default extension for Org archive files, created when entries from an Org-mode file are archived (see <https://orgmode.org/org.html#Moving-subtrees>). These files are still in Org mode format, so it's worth searching them at the same time as non-archive Org mode files. PR #1475	2020-01-29 13:59:34 -05:00
Andrew Gallant	f8fb65f7e3	globset: fix benchmarks There were apparently a lot of unused things, including lazy_static.	2020-01-27 16:45:12 -05:00
Tristan Waddington	98de8d248a	ignore/types: make 'gradle' its own type This change maintains the existing behavior of the 'groovy' type, which includes both .groovy and .gradle files. PR #1470	2020-01-23 06:51:11 -05:00
Crestwave	c358700dfb	readme: add instructions for Haiku x86_64 and x86_gcc2 PR #1465	2020-01-21 07:34:24 -05:00
Alex Touchet	8670a4a969	readme: update outdated links PR #1463	2020-01-21 07:32:54 -05:00
Oliver Newman	e3b1f86908	doc: add missing "will" to the user guide PR #1462	2020-01-20 17:26:08 -05:00
Jan Verbeek	46b07bb2ee	ignore/types: fix postscript globs The postscript globs were missing asterisks, so they were treated as literal filenames. PR #1461	2020-01-20 07:48:57 -05:00
Andrew Gallant	8bdf84e3a8	deps: update everything	2020-01-16 19:47:23 -05:00
Andrew Gallant	5a6e17fcc1	deps: various updates Most of these updates (sans thread_local) are from crates I maintain that have seen updates recently. Notably, this includes a bump to `termcolor 1.1.0` which includes support for respecting `NO_COLOR`. This commit therefore means that ripgrep now supports `NO_COLOR`. As an added bonus, we drop a dependency on Windows. (Although the total amount of code compiled remains the same.) Closes #1186	2020-01-11 10:09:10 -05:00
Andrew Gallant	00bfcd14a6	ignore-0.4.11	2020-01-10 15:08:27 -05:00
Andrew Gallant	bf0ddc4675	ci: fix musl docker build Looks like the old japaric images are bunk. We update our docker image to be based on the new rustembedded images and configure cross to use it. Turns out that this wasn't due to a stale docker image, but rather, a bug in cross: https://github.com/rust-embedded/cross/issues/357 We work around that bug by installing the master branch of cross. Sigh.	2020-01-10 15:07:47 -05:00
Andrew Gallant	0fb3f6a159	ci: disable github actions for now The CI build failures are annoying and distracting. Hopefully soon I'll be able to invest more time in the switch.	2020-01-10 15:07:47 -05:00
Andrew Gallant	837fb5e21f	deps: update to crossbeam-channel 0.4 Closes #1427	2020-01-10 15:07:47 -05:00
Andrew Gallant	2e1815606e	deps: update to bytecount 0.6 Looks like there aren't any major changes other than dependency updates.	2020-01-10 15:07:47 -05:00
Andrew Gallant	cb2f6ddc61	deps: update to thread_local 1.0 We also update the pcre2 and regex dependencies, which removes any other lingering uses of thread_local 0.3.	2020-01-10 15:07:47 -05:00
Andrew Gallant	bd7a42602f	deps: bump to base64 0.11	2020-01-10 15:07:47 -05:00
Andrew Gallant	528ce56e1b	deps: run cargo update The only new dependency is an unused target specific dependency hermit via the atty crate.	2020-01-10 15:07:47 -05:00
Yevgen Antymyrov	8892bf648c	doc: fix typo in FAQ	2019-09-25 08:13:27 -04:00
Jonathan Clem	8cb7271b64	ci: get GitHub Actions running again Basically, matrix.os needs to be defined for every build. We were commenting out some of the builds in order to debug CI in the `include` section, but we also need to comment them out in the `build` section.	2019-09-11 09:08:24 -04:00
Andrew Gallant	4858267f3b	ci: initial github actions config	2019-08-31 09:24:44 -04:00
Andrew Gallant	5011dba2fd	ignore: remove unused parameter	2019-08-28 20:21:34 -04:00
Andrew Gallant	e14f9195e5	deps: update everything	2019-08-28 20:18:47 -04:00
Andrew Gallant	ef0e7af56a	deps: update bstr to 0.2.7 The new bstr release contains a small performance bug fix where some trivial methods weren't being inlined.	2019-08-11 10:41:05 -04:00
Todd Walton	b266818aa5	doc: use XDG_CONFIG_HOME in comments XDG_CONFIG_DIR does not actually exist. PR #1347	2019-08-09 13:37:37 -04:00
LawAbidingCactus	81415ae52d	doc: update to reflect glob matching behavior change Specifically, paths containing a `/` are not allowed to match any other slash in the path, even as a prefix. So `!.git` is the correct incantation for ignoring a `.git` directory that occurs anywhere in the path.	2019-08-07 13:47:18 -04:00
Andrew Gallant	5c4584aa7c	grep-regex-0.1.5	2019-08-06 09:51:13 -04:00
Andrew Gallant	0972c6e7c7	grep-searcher-0.1.6	2019-08-06 09:50:52 -04:00
Andrew Gallant	0a372bf2e4	deps: update ignore	2019-08-06 09:50:35 -04:00
Andrew Gallant	345124a7fa	ignore-0.4.10	2019-08-06 09:47:45 -04:00
Andrew Gallant	31807f805a	deps: drop tempfile We were only using it to create temporary directories for `ignore` tests, but it pulls in a bunch of dependencies and we don't really need randomness. So just use our own simple wrapper instead.	2019-08-06 09:46:05 -04:00
Andrew Gallant	4de227fd9a	deps: update everything Mostly this just updates regex and its assorted dependencies. This does drop utf8-ranges and ucd-util, in accordance with changes to regex-syntax and regex.	2019-08-05 13:50:55 -04:00
jimbo1qaz	d7ce274722	readme: Debian Buster is stable now PR #1338	2019-08-04 08:06:10 -04:00
Andrew Gallant	5b10328f41	changelog: update with bug fix	2019-08-02 07:37:27 -04:00
Andrew Gallant	813c676eca	searcher: fix roll buffer bug

This commit fixes a subtle bug in how the line buffer was rolling its contents. Specifically, when ripgrep searches without memory maps, it uses a "roll" buffer for incremental line oriented search without needing to read the entire file into memory at once. The roll buffer works by reading a chunk of bytes from the file into memory, and then searching everything in that buffer up to the last `\n` byte. The bytes after the last `\n` byte are preserved, since they likely correspond to part of the next line. Once ripgrep is done searching the buffer, it "rolls" the buffer such that the start of the next line is at the beginning of the buffer, and then ripgrep reads more data into the buffer starting at the (possibly) partial end of that line.

The implication of this strategy, necessarily so, is that a buffer must be big enough to fit a single line in memory. This is because the regex engine needs a contiguous block of memory to search, so there is no way to search anything smaller than a single line. So if a file contains a single line with 7.5 million bytes, then the buffer will grow to be at least that size. (Many files have super long lines like this, but they tend to be binary files, which ripgrep will detect and stop searching unless the user forces it with the `-a/--text` flag. So in practice, they aren't usually a problem. However, in this case, #1335 found a case where a plain text file had a line with 7.5 million bytes.)

Now, for performance reasons, ripgrep reuses these buffers across its search. Typically, it will create `N` of these line buffers when it starts (where `N` is the number of threads it is using), and then reuse them without creating any new ones as it searches through files.

This means that if you search a file with a very long line, that buffer will expand to be big enough to store that line. ripgrep never contracts these buffers, so once it searches the next file, ripgrep will continue to use this large buffer. While it might be prudent to contract these buffers in some circumstances, this isn't otherwise inherently a problem. The memory has already been allocated, and there isn't much cost to using it, other than the fact that ripgrep hangs on to it and never gives it back to the OS.

However, the `roll` implementation described above had a really important bug in it that was impacted by the size of the buffer. Specifically, it used the following to "roll" the partial line at the end of the buffer to the beginning:

    self.buf.copy_within_str(self.pos.., 0);

Which means that if the buffer is very large, ripgrep will copy everything from `self.pos` (which might be very small, e.g., for small files) to the end of the buffer, and move it to the beginning of the buffer. This will happen repeatedly each time the buffer is used to search small files, which winds up being quite a large slow down if the line was exceptionally large (say, megabytes).

It turns out that copying everything is completely unnecessary. We only need to copy the remainder of the last read to the beginning of the buffer. Everything after the last read in the buffer is just free space that can be filled for the next read. So, all we need to do is copy just those bytes:

    self.buf.copy_within_str(self.pos..self.end, 0);

... which is typically much much smaller than the rest of the buffer.

This was likely also causing small performance losses in other cases as well. For example, when searching a lot of small files, ripgrep would likely do a lot more copying than necessary. Although, given that the default buffer size is 8KB, this extra copying was likely pretty small, and was thus harder to observe.

Fixes #1335	2019-08-02 07:23:27 -04:00
Andrew Gallant	f625d72b6f	pkg: update brew tap to 11.0.2	2019-08-01 19:39:53 -04:00
Andrew Gallant	3de31f7527	ci: fix musl deployment The docker image that the Linux binary is now built in does not have AsciiDoc installed, so set up Cross to point to my own image with those tools installed.	2019-08-01 18:41:44 -04:00
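
The existence check described in commit a18cf6ec39 can be illustrated with a minimal sketch. This is not the code from the `ignore` crate: the function name `read_ignore_file` and its return type are hypothetical, and only the idea (a cheap `exists()` check before `File::open`, skipped on Windows) is taken from the commit message.

```rust
use std::fs::File;
use std::io::{self, Read};
use std::path::Path;

/// Hypothetical helper (not part of the `ignore` crate's API): returns the
/// contents of an ignore file, or `None` if the file does not exist.
fn read_ignore_file(path: &Path) -> io::Result<Option<String>> {
    // Speculative check: a `stat`-style existence test can be cheaper than a
    // failed `open`. Following the commit message, skip it on Windows.
    if !cfg!(windows) && !path.exists() {
        return Ok(None);
    }
    match File::open(path) {
        Ok(mut file) => {
            let mut contents = String::new();
            file.read_to_string(&mut contents)?;
            Ok(Some(contents))
        }
        // The file may vanish between the check and the open, and on Windows
        // the check is skipped entirely, so a missing file is still handled.
        Err(err) if err.kind() == io::ErrorKind::NotFound => Ok(None),
        Err(err) => Err(err),
    }
}
```

The early return only pays off when most directories have no ignore file, which is the assumption the commit message states.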

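The roll buffer fix in commit 813c676eca can be sketched as follows. The `RollBuffer` type and both method names are hypothetical simplifications, not ripgrep's actual searcher code, and the sketch uses the standard library's slice `copy_within` in place of the `copy_within_str` call quoted in the commit message. Each method returns the number of bytes it copies so the difference is visible.

```rust
/// Hypothetical, simplified roll buffer; not ripgrep's actual type.
struct RollBuffer {
    /// Backing storage; it may have grown large after a very long line.
    buf: Vec<u8>,
    /// Start of the bytes that have not been searched yet.
    pos: usize,
    /// End of the bytes filled by the last read.
    end: usize,
}

impl RollBuffer {
    /// The behavior before the fix: copy everything from `pos` to the end
    /// of the allocation, including the unused free space after `end`.
    fn roll_whole_buffer(&mut self) -> usize {
        let copied = self.buf.len() - self.pos;
        self.buf.copy_within(self.pos.., 0);
        self.end -= self.pos;
        self.pos = 0;
        copied
    }

    /// The behavior after the fix: copy only the partial line left over
    /// from the last read, i.e. the bytes in `pos..end`.
    fn roll(&mut self) -> usize {
        let copied = self.end - self.pos;
        self.buf.copy_within(self.pos..self.end, 0);
        self.end = copied;
        self.pos = 0;
        copied
    }
}
```

With a 1 MiB buffer that holds only `ab\ncd` and `pos` pointing just past the newline, `roll` moves 2 bytes, while `roll_whole_buffer` moves almost the entire allocation to reach the same state.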

1590 Commits