ripgrep

mirror of https://github.com/BurntSushi/ripgrep.git synced 2025-08-04 21:52:54 +02:00

Author	SHA1	Message	Date
Andrew Gallant	0a372bf2e4	deps: update ignore	2019-08-06 09:50:35 -04:00
Andrew Gallant	345124a7fa	ignore-0.4.10 ignore-0.4.10	2019-08-06 09:47:45 -04:00
Andrew Gallant	31807f805a	deps: drop tempfile We were only using it to create temporary directories for `ignore` tests, but it pulls in a bunch of dependencies and we don't really need randomness. So just use our own simple wrapper instead.	2019-08-06 09:46:05 -04:00
Andrew Gallant	4de227fd9a	deps: update everything Mostly this just updates regex and its assorted dependencies. This does drop utf8-ranges and ucd-util, in accordance with changes to regex-syntax and regex.	2019-08-05 13:50:55 -04:00
jimbo1qaz	d7ce274722	readme: Debian Buster is stable now PR #1338	2019-08-04 08:06:10 -04:00
Andrew Gallant	5b10328f41	changelog: update with bug fix	2019-08-02 07:37:27 -04:00
Andrew Gallant	813c676eca	searcher: fix roll buffer bug This commit fixes a subtle bug in how the line buffer was rolling its contents. Specifically, when ripgrep searches without memory maps, it uses a "roll" buffer for incremental line oriented search without needing to read the entire file into memory at once. The roll buffer works by reading a chunk of bytes from the file into memory, and then searching everything in that buffer up to the last `\n` byte. The bytes after the last `\n` byte are preserved, since they likely correspond to part of the next line. Once ripgrep is done searching the buffer, it "rolls" the buffer such that the start of the next line is at the beginning of the buffer, and then ripgrep reads more data into the buffer starting at the (possibly) partial end of that line. The implication of this strategy, necessarily so, is that a buffer must be big enough to fit a single line in memory. This is because the regex engine needs a contiguous block of memory to search, so there is no way to search anything smaller than a single line. So if a file contains a single line with 7.5 million bytes, then the buffer will grow to be at least that size. (Many files have super long lines like this, but they tend to be binary files, which ripgrep will detect and stop searching unless the user forces it with the `-a/--text` flag. So in practice, they aren't usually a problem. However, in this case, #1335 found a case where a plain text file had a line with 7.5 million bytes.) Now, for performance reasons, ripgrep reuses these buffers across its search. Typically, it will create `N` of these line buffers when it starts (where `N` is the number of threads it is using), and then reuse them without creating any new ones as it searches through files. This means that if you search a file with a very long line, that buffer will expand to be big enough to store that line. 
ripgrep never contracts these buffers, so once it searches the next file, ripgrep will continue to use this large buffer. While it might be prudent to contract these buffers in some circumstances, this isn't otherwise inherently a problem. The memory has already been allocated, and there isn't much cost to using it, other than the fact that ripgrep hangs on to it and never gives it back to the OS. However, the `roll` implementation described above had a really important bug in it that was impacted by the size of the buffer. Specifically, it used the following to "roll" the partial line at the end of the buffer to the beginning: self.buf.copy_within_str(self.pos.., 0); Which means that if the buffer is very large, ripgrep will copy everything from `self.pos` (which might be very small, e.g., for small files) to the end of the buffer, and move it to the beginning of the buffer. This will happen repeatedly each time the buffer is used to search small files, which winds up being quite a large slow down if the line was exceptionally large (say, megabytes). It turns out that copying everything is completely unnecessary. We only need to copy the remainder of the last read to the beginning of the buffer. Everything after the last read in the buffer is just free space that can be filled for the next read. So, all we need to do is copy just those bytes: self.buf.copy_within_str(self.pos..self.end, 0); ... which is typically much much smaller than the rest of the buffer. This was likely also causing small performance losses in other cases as well. For example, when searching a lot of small files, ripgrep would likely do a lot more copying than necessary. Although, given that the default buffer size is 8KB, this extra copying was likely pretty small, and was thus harder to observe. Fixes #1335	2019-08-02 07:23:27 -04:00
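The roll fix described in this commit can be sketched roughly as follows. This is a simplified illustration, not ripgrep's actual line buffer: `RollBuffer` and its methods are hypothetical names, and the standard library's slice `copy_within` stands in for the `copy_within_str` call quoted in the message.

```rust
/// A simplified stand-in for a line buffer (hypothetical, not ripgrep's type).
struct RollBuffer {
    buf: Vec<u8>,
    /// Start of the bytes that have not been searched yet.
    pos: usize,
    /// End of the bytes filled by reads; everything after this is free space.
    end: usize,
}

impl RollBuffer {
    fn new(capacity: usize) -> RollBuffer {
        RollBuffer { buf: vec![0; capacity], pos: 0, end: 0 }
    }

    /// Append bytes to the filled region, growing the buffer if needed.
    fn fill(&mut self, data: &[u8]) {
        let new_end = self.end + data.len();
        if new_end > self.buf.len() {
            self.buf.resize(new_end, 0);
        }
        self.buf[self.end..new_end].copy_from_slice(data);
        self.end = new_end;
    }

    /// Mark everything up to and including the last `\n` as searched.
    fn consume_complete_lines(&mut self) {
        let filled = &self.buf[self.pos..self.end];
        if let Some(i) = filled.iter().rposition(|&b| b == b'\n') {
            self.pos += i + 1;
        }
    }

    /// Move the partial trailing line to the front of the buffer.
    /// Only `pos..end` is copied, never the free space after `end`.
    /// Returns the number of bytes copied.
    fn roll(&mut self) -> usize {
        let len = self.end - self.pos;
        self.buf.copy_within(self.pos..self.end, 0);
        self.pos = 0;
        self.end = len;
        len
    }

    /// The bytes still waiting to be searched.
    fn pending(&self) -> &[u8] {
        &self.buf[self.pos..self.end]
    }
}
```

With a buffer that has grown to megabytes, `roll` here still copies only the few bytes of the partial line, which is the behavior the commit message describes.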
Andrew Gallant	f625d72b6f	pkg: update brew tap to 11.0.2	2019-08-01 19:39:53 -04:00
Andrew Gallant	3de31f7527	ci: fix musl deployment The docker image that the Linux binary is now built in does not have ASCII doc installed, so setup Cross to point to my own image with those tools installed. 11.0.2	2019-08-01 18:41:44 -04:00
Andrew Gallant	e402d6c260	ripgrep: release 11.0.2	2019-08-01 18:02:15 -04:00
Andrew Gallant	48b5bdc441	src: remove old directories termcolor has had its own repository for a while now. No need for these redirects any more.	2019-08-01 17:49:28 -04:00
Andrew Gallant	709ca91f50	ignore: release 0.4.9 ignore-0.4.9	2019-08-01 17:48:37 -04:00
Andrew Gallant	9c220f9a9b	grep-regex: release 0.1.4 grep-regex-0.1.4	2019-08-01 17:47:45 -04:00
Andrew Gallant	9085bed139	grep-matcher: release 0.1.3 grep-matcher-0.1.3	2019-08-01 17:46:59 -04:00
Andrew Gallant	931ab35f76	changelog: start work on 11.0.2 release	2019-08-01 17:42:38 -04:00
Andrew Gallant	b5e5979ff1	deps: update everything This drops `spin` and `autocfg`, yay.	2019-08-01 17:42:38 -04:00
Andrew Gallant	052c857da0	doc: mention .ignore and .rgignore more prominently Fixes #1284	2019-08-01 17:37:46 -04:00
Andrew Gallant	5e84e784c8	doc: add translations section We note that they may not be up to date and are unofficial. Fixes #1246	2019-08-01 17:37:46 -04:00
Andrew Gallant	01e8e11621	doc: improve PCRE2 failure mode documentation If a user tries to search for an explicit `\n` character in a PCRE2 regex, ripgrep won't report an error and instead will (likely) silently fail to match. Fixes #1261	2019-08-01 17:32:44 -04:00
Ninan John	9268ff8e8d	ripgrep: fix bug when CWD has directory named `-` Specifically, when searching stdin, if the current directory has a directory named `-`, then the `--with-filename` flag would automatically be turned on. This is because `--with-filename` is automatically enabled when ripgrep is given a single path that is a directory. When ripgrep is given empty arguments, and if it is searching stdin, then its default path list is simply `["-"]`. The `is_dir` check passes, and `--with-filename` gets enabled. This commit fixes the problem by checking whether the path is `-` first. If so, then we assume it isn't a directory. This is fine, since if it is a directory and one asks to search it explicitly, then ripgrep will interpret `-` as stdin anyway (which is arguably a bug on its own, but probably not one worth fixing). Fixes #1223, Closes #1292	2019-08-01 17:27:23 -04:00
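The check described in this commit might look something like the sketch below. Both function names are hypothetical and only model the single rule stated in the message (a lone directory path enables `--with-filename`); ripgrep's real argument handling is more involved.

```rust
use std::path::Path;

/// Hypothetical helper: treat `-` as stdin before consulting the file system,
/// so a directory that happens to be named `-` is never reported as one.
fn is_dir_not_stdin(path: &Path) -> bool {
    if path == Path::new("-") {
        return false;
    }
    path.is_dir()
}

/// Hypothetical model of the default: enable `--with-filename` when exactly
/// one path is given and that path is a directory.
fn with_filename_default(paths: &[&Path]) -> bool {
    paths.len() == 1 && is_dir_not_stdin(paths[0])
}
```

Checking for `-` first means the result no longer depends on what the current directory contains.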
dana	c2cb0a4de4	ripgrep: add --glob-case-insensitive This flag forces -g/--glob patterns to be treated case-insensitively, as with --iglob patterns. Fixes #1293	2019-08-01 17:08:58 -04:00
Andrew Gallant	adb9332f52	regex: fix -F aho-corasick optimization It turns out that when the -F flag was used, if any of the patterns contained a regex meta character (such as `.`), then we wound up escaping the pattern first before handing it off to Aho-Corasick, which treats all patterns literally. We continue to apply band-aids here and just avoid Aho-Corasick if there is an escape in any of the literal patterns. This is unfortunate, but making this work better requires more refactoring, and the right solution is to get this optimization pushed down into the regex engine. Fixes #1334	2019-08-01 16:58:12 -04:00
Matthew Davidson	bc37c32717	ignore/types: add edn type from Clojure ecosystem PR #1330	2019-07-29 16:43:28 -04:00
Andrew Gallant	08ae4da2b7	deps: update them There are some nice removals. It looks like rand has slimmed down, and smallvec is gone now as well.	2019-07-25 07:52:33 -04:00
Andrew Gallant	7ac95c1f50	deps: bump ignore	2019-07-24 12:56:47 -04:00
Andrew Gallant	7a6903bd4e	ignore-0.4.8 ignore-0.4.8	2019-07-24 12:56:01 -04:00
Tiziano Santoro	9801fae29f	ignore: support compilation on wasm Currently the crate assumes that exactly one of `cfg(windows)` or `cfg(unix)` is true, but this is not actually the case, for instance when compiling for `wasm32`. Implement the missing functions so that the crate can compile on other platforms, even though those functions will always return an error. PR #1327	2019-07-24 12:55:37 -04:00
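The platform fallback this commit describes can be illustrated with a small `cfg` sketch. The function name `device_num` is an assumption made for illustration, and for brevity the Windows-specific branch is omitted, so every non-Unix target takes the error path here.

```rust
use std::io;
use std::path::Path;

/// On Unix, read the device number from the file's metadata.
#[cfg(unix)]
fn device_num(path: &Path) -> io::Result<u64> {
    use std::os::unix::fs::MetadataExt;
    path.metadata().map(|md| md.dev())
}

/// On any other target (for example `wasm32`), the function still exists so
/// that the crate compiles, but it always returns an error.
#[cfg(not(unix))]
fn device_num(_: &Path) -> io::Result<u64> {
    Err(io::Error::new(
        io::ErrorKind::Other,
        "device numbers are not supported on this platform",
    ))
}
```

Callers see the same signature on every target and can handle the error at run time instead of failing to compile.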
Miloš Stojanović	abdf7140d7	readme: fix broken link to Scoop bucket PR #1324	2019-07-20 12:03:46 -04:00
Conrad Olega	b83e7968ef	ignore/types: add Robot Framework PR #1322	2019-07-14 08:12:34 -04:00
Hugo Locurcio	8ebc113847	doc: improve docs for --replace flag Specifically, we document shell-specific caveats related to the `--replace` flag. PR #1318	2019-07-04 11:42:35 -04:00
Andrew Gallant	785c1f1766	release: globset, grep-cli, grep-printer, grep-searcher grep-cli-0.1.3 grep-printer-0.1.3 grep-searcher-0.1.5 globset-0.4.4	2019-06-26 16:53:30 -04:00
Andrew Gallant	8b734cb490	deps: update everything	2019-06-26 16:51:06 -04:00
Andrew Gallant	b93762ea7a	bstr: update everything to bstr 0.2	2019-06-26 16:47:33 -04:00
Andrew Gallant	34677d2622	search: a few small touchups	2019-06-18 20:23:47 -04:00
Andrew Gallant	d1389db2e3	search: better errors for preprocessor commands If a preprocessor command could not be started, we now show some additional context with the error message. Previously, it showed something like this: some/file: No such file or directory (os error 2) Which is itself pretty misleading. Now it shows: some/file: preprocessor command could not start: '"nonexist" "some/file"': No such file or directory (os error 2) Fixes #1302	2019-06-16 19:02:02 -04:00
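One way to attach this kind of context to a spawn failure is sketched below. The function `spawn_preprocessor` and its signature are hypothetical; only the wording of the error message is taken from the commit message above.

```rust
use std::io;
use std::path::Path;
use std::process::{Child, Command, Stdio};

/// Hypothetical helper: start a preprocessor command for `path` and, if it
/// cannot be started, wrap the OS error with the command that was attempted.
fn spawn_preprocessor(bin: &str, path: &Path) -> io::Result<Child> {
    let mut cmd = Command::new(bin);
    cmd.arg(path).stdin(Stdio::null()).stdout(Stdio::piped());
    cmd.spawn().map_err(|err| {
        // Keep the original error kind so callers can still match on it.
        io::Error::new(
            err.kind(),
            format!("preprocessor command could not start: '{:?}': {}", cmd, err),
        )
    })
}
```

The wrapped error keeps the underlying OS message at the end, so no information from the original report is lost.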
Andrew Gallant	50bcb7409e	deps: update everything	2019-06-16 18:38:45 -04:00
Andrew Gallant	7b9972c308	style: fix deprecations Use `dyn` for trait objects and use `..=` for inclusive ranges.	2019-06-16 18:37:51 -04:00
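For readers unfamiliar with the two deprecations, here is a minimal sketch of the newer syntax. The functions are illustrative only and do not come from the ripgrep source.

```rust
use std::fmt::Display;

/// Trait objects are written with an explicit `dyn` (previously `&Display`).
fn describe(value: &dyn Display) -> String {
    format!("value: {}", value)
}

/// Inclusive ranges in patterns are written with `..=` (previously `...`).
fn is_ascii_digit(b: u8) -> bool {
    match b {
        b'0'..=b'9' => true,
        _ => false,
    }
}
```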
Hitesh Jasani	9f000c2910	ignore/types: add more nim types PR #1297	2019-06-12 14:02:28 -04:00
skierpage	392682d352	doc: point regex doc link to the latest version The latest doc is different, e.g. adds "symmetric differences" under https://docs.rs/regex/*/regex/#character-classes PR #1287	2019-06-01 08:44:55 -04:00
Andrew Gallant	7d3f794588	ignore: remove .git check in some cases When we know we aren't going to process gitignores, we shouldn't waste the syscall in every directory to check for a git repo.	2019-05-29 18:06:11 -04:00
bruce-one	290fd2a7b6	readme: mention Zstandard and Brotli Also alphabetise the list. PR #1288	2019-05-29 13:37:31 -04:00
Fabian Würfl	d1e4d28f30	readme: remove outdated statement Issue #10 already states that "ripgrep is now in most or all of the major package repositories." PR #1280	2019-05-14 18:44:50 -04:00
Andrew Gallant	5ce2d7351d	ci: use cross for musl x86_64 builds This is necessary because jemalloc + musl + Ubuntu 16.04 is apparently broken. Moreover, jemalloc doesn't support i686, so we accept the performance regression there. See also: https://github.com/gnzlbg/jemallocator/issues/124	2019-04-25 11:12:14 -04:00
Andrew Gallant	9dcfd9a205	deps: bump pcre2-sys to 0.2.1 This brings in a bug fix that no longer tries to run `git` to update the submodule if the `git` command doesn't exist. This is useful in more restricted build contexts where `git` isn't installed, such as in the docker image used for running `cross`.	2019-04-25 11:12:14 -04:00
Andrew Gallant	36b276c6d0	printer: remove unnecessary mut	2019-04-24 17:22:27 -04:00
Andrew Gallant	03bf37ff4a	alloc: use jemalloc when building with musl It turns out that musl's allocator is slow enough to cause a fairly noticeable performance regression when ripgrep is built as a static binary with musl. We fix this by using jemalloc when building with musl. We continue to use the default system allocator in all other scenarios. Namely, glibc's allocator doesn't noticeably regress performance compared to jemalloc. But we could add more targets to this logic if other system allocators (macOS, Windows) prove to be slow. This wasn't necessary before because rustc recently stopped using jemalloc by default. Fixes #1268	2019-04-24 17:21:38 -04:00
Andrew Gallant	e7829c05d3	cli: fix bug where last byte was stripped In an effort to strip line terminators, we assumed their existence. But a pattern file may not end with a line terminator, so we shouldn't unconditionally strip them. We fix this by moving to bstr's line handling, which does this for us automatically.	2019-04-19 07:11:44 -04:00
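The bug class fixed here can be sketched with the standard library alone. The commit itself moves to bstr's line handling; the functions below are a hypothetical stand-in that uses `strip_suffix` and `split_inclusive` from `std` to show the same idea of stripping a terminator only when one is present.

```rust
/// Strip one trailing line terminator (`\n` or `\r\n`) only if it is present.
fn trim_line_terminator(line: &[u8]) -> &[u8] {
    match line.strip_suffix(b"\n") {
        None => line,
        Some(rest) => rest.strip_suffix(b"\r").unwrap_or(rest),
    }
}

/// Split the contents of a pattern file into patterns, one per line.
/// The last line keeps its final byte even without a trailing newline.
fn patterns_from_bytes(contents: &[u8]) -> Vec<&[u8]> {
    contents
        .split_inclusive(|&b| b == b'\n')
        .map(|line| trim_line_terminator(line))
        .collect()
}
```

An unconditional `&line[..line.len() - 1]` would turn a final pattern `bar` into `ba`, which matches the bug the commit title describes.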
Rory O’Kane	a6222939f9	readme: mention --pcre2 as long form of -P This is for consistency with the short and long flags given in other bullet points. I originally assumed there was no long flag for `-P` because none was given here. PR #1254	2019-04-16 21:22:48 -04:00
Rory O’Kane	6ffd434232	readme: mention --auto-hybrid-regex in advantages This feature solves a major reason I was skeptical of using ripgrep, so I think it’s good to mention it in the section about why one should use it. I use backreferences a lot, so I had previously thought that ripgrep would provide no speed advantage over ag, since I would always have `-P` enabled. But when I saw `--auto-hybrid-regex` in the 11.0.0 changelog, I learned that ripgrep can use it to speed up simple queries while still allowing me to write backreferences. PR #1253	2019-04-16 17:21:40 -04:00
Andrew Gallant	1f1cd9b467	pkg: update brew tap to 11.0.1	2019-04-16 13:39:56 -04:00

1299 Commits