ripgrep

mirror of https://github.com/BurntSushi/ripgrep.git synced 2024-12-12 19:18:24 +02:00

Author	SHA1	Message	Date
Ian Kerins	ed01e80a79	Provide a mechanism to compose type definitions This extends the syntax of the --type-add flag to allow including the globs of other already defined types. Fixes #83.	2017-01-07 18:14:24 -05:00
Andrew Gallant	b65a8c353b	Add --sort-files flag. When used, parallelism is disabled but the results are sorted by file path. Closes #263	2017-01-06 22:43:59 -05:00
Andrew Gallant	bb70f96743	Fix a non-termination bug. This was a very silly bug. Instead of creating a particular atomic once and cloning it, we created a new value for each worker. Fixes #279	2016-12-12 06:55:49 -05:00
Andrew Gallant	d66812102b	Fix leading hyphen bug by updating clap. Fixes #270	2016-12-06 17:29:34 -05:00
Andrew Gallant	7282706b42	Fix bug reading root symlink. When given an explicit file path on the command line like `foo` where `foo` is a symlink, ripgrep should follow it even if `-L` isn't set. This is consistent with the behavior of `foo/`. Fixes #256	2016-12-05 20:05:57 -05:00
Andrew Gallant	0473df1ef5	Disable Unicode mode for literal regex. When ripgrep detects a literal, it emits them as raw hex escaped byte sequences to Regex::new. This permits literal optimizations for arbitrary byte sequences (i.e., possibly invalid UTF-8). The problem is that Regex::new interprets hex escaped byte sequences as Unicode codepoints by default, but we want them to actually stand for their raw byte values. Therefore, disable Unicode mode. This is OK, since the regex is composed entirely of literals and literal extraction does Unicode case folding. Fixes #251	2016-11-28 18:31:58 -05:00
Andrew Gallant	301a3fd71d	Detect more uppercase literals for --smart-case. This changes the uppercase literal detection for the "smart case" functionality. In particular, a character class is considered to have an uppercase literal if at least one of its ranges starts or stops with an uppercase literal. Fixes #229	2016-11-28 17:57:26 -05:00
Andrew Gallant	03f7605322	Rename --files-without-matches to --files-without-match. This is to be consistent with grep.	2016-11-19 20:15:41 -05:00
Daniel Luz	bd3e7eedb1	Add --files-without-matches flag. Performs the opposite of --files-with-matches: only shows paths of files that contain zero matches. Closes #138	2016-11-19 21:48:59 -02:00
Andrew Gallant	e37f783fc0	Fix issue number mixup. Thanks @bluss!	2016-11-17 20:30:18 -05:00
Andrew Gallant	92dc402f7f	Switch from Docopt to Clap. There were two important reasons for the switch: 1. Performance. Docopt does poorly when the argv becomes large, which is a reasonably common use case for search tools (e.g., use with xargs). 2. Better failure modes. Clap knows a lot more about how a particular argv might be invalid, and can therefore provide much clearer error messages. While both were important, (1) made it urgent. Note that since Clap requires at least Rust 1.11, this will in turn increase the minimum Rust version supported by ripgrep from Rust 1.9 to Rust 1.11. It is therefore a breaking change, so the soonest release of ripgrep with Clap will have to be 0.3. There is also at least one subtle breaking change in real usage. Previous to this commit, this used to work: `rg -e -foo`, where this would cause ripgrep to search for the string `-foo`. Clap currently has problems supporting this use case (see: https://github.com/kbknapp/clap-rs/issues/742), but it can be worked around by using this instead: `rg -e [-]foo` or even `rg [-]foo`, and this still works: `rg -- -foo`. This commit also adds Bash, Fish and PowerShell completion files to the release, fixes a bug that prevented ripgrep from working on file paths containing invalid UTF-8 and shows short descriptions in the output of `-h` but longer descriptions in the output of `--help`. Fixes #136, Fixes #189, Fixes #210, Fixes #230	2016-11-17 19:53:41 -05:00
Eric Kidd	e9cd0a1cc3	Allow specifying patterns with `-f FILE` and `-f-` This is a somewhat basic implementation of `-f-` (#7), with unit tests. Changes include: 1. The internals of the `pattern` function have been refactored to avoid code duplication, but there's a lot more we could do. Right now we read the entire pattern list into a `Vec`. 2. There's now a `WorkDir::pipe` command that allows sending standard input to `rg` when testing. Not implemented: aho-corasick.	2016-11-15 13:00:16 -05:00
Andrew Gallant	4b18f82899	Disable symlink tests on Windows. For some reason, these work on AppVeyor but not in other build systems. Let's just disable them. See: https://github.com/rust-lang/rust/pull/37149	2016-11-11 06:44:23 -05:00
Andrew Gallant	2dce0dc0df	Fix a bug with handling --ignore-file. Namely, passing a directory to --ignore-file caused ripgrep to allocate memory without bound. The issue was that I got a bit overzealous with partial error reporting. Namely, when processing a gitignore file, we should try to use every pattern even if some patterns are invalid globs (e.g., a**b). In the process, I applied the same logic to I/O errors. In this case, it manifested as an attempt to read lines from a directory, which appears to yield Results forever, where each Result is an error of the form "you can't read from a directory silly." Since I treated it as a partial error, ripgrep was just spinning and accruing each error in memory, which caused the OOM killer to kick in. Fixes #228	2016-11-09 16:45:23 -05:00
Andrew Gallant	58aca2efb2	Add -m/--max-count flag. This flag limits the number of matches printed per file. Closes #159	2016-11-06 13:09:53 -05:00
Andrew Gallant	0222e024fe	Fixes a bug with --smart-case. This was a subtle bug, but the big picture was that the smart case information wasn't being carried through to the literal extraction in some cases. When this happened, it was possible to get back an incomplete set of literals, which would therefore miss some valid matches. The fix to this is to actually parse the regex and determine whether smart case applies before doing anything else. It's a little extra work, but parsing is pretty fast. Fixes #199	2016-11-06 12:07:47 -05:00
Andre Bogus	02de97b8ce	Use the bytecount crate for fast line counting. Fixes #128	2016-11-05 22:29:26 -04:00
Andrew Gallant	16975797fe	Fixes a matching bug in the glob override matcher. This was probably a transcription error when moving the ignore matcher code out of ripgrep core. Specifically, the override glob matcher should not ignore directories if they don't match. Fixes #206	2016-10-31 19:54:38 -04:00
Andrew Gallant	d79add341b	Move all gitignore matching to separate crate. This PR introduces a new sub-crate, `ignore`, which primarily provides a fast recursive directory iterator that respects ignore files like gitignore and other configurable filtering rules based on globs or even file types. This results in a substantial source of complexity moved out of ripgrep's core and into a reusable component that others can now (hopefully) benefit from. While much of the ignore code carried over from ripgrep's core, a substantial portion of it was rewritten with the following goals in mind: 1. Reuse matchers built from gitignore files across directory iteration. 2. Design the matcher data structure to be amenable for parallelizing directory iteration. (Indeed, writing the parallel iterator is the next step.) Fixes #9, #44, #45	2016-10-29 20:48:59 -04:00
Andrew Gallant	f2e1711781	Fix bug when processing parent gitignore files. This particular bug was triggered whenever a search was run in a directory with a parent directory that contains a relevant .gitignore file. In particular, before matching against a parent directory's gitignore rules, a path's leading `./` was not stripped, which results in errant matching. We now make sure `./` is stripped. Fixes #184.	2016-10-16 10:15:11 -04:00
Andrew Gallant	4737326ed3	Update regex-syntax for bug fix. The bug fix was in expression pretty printing. ripgrep parses the regex into an AST and may do some modifications to it, which requires the ability to go from string -> AST -> string' -> AST' where string == string' implies AST == AST'. Also, add a regression test for the specific regex that tripped the bug. Fixes #156.	2016-10-10 22:04:29 -04:00
Andrew Gallant	a3537aa32a	Update darwin cfg attributes.	2016-10-10 21:48:47 -04:00
Andrew Gallant	4e52059ad6	Disable regression_131 test on darwin. It's not clear why it's failing. Maybe it doesn't permit certain characters in file paths?	2016-10-10 21:03:11 -04:00
Andrew Gallant	27a980c1bc	Fix symlink test. We attempt to run it on Windows, but I'm getting "access denied" errors when trying to create a file symlink. So we disable the test on Windows.	2016-10-10 19:34:57 -04:00
Andrew Gallant	e8645dc8ae	style nits	2016-10-10 19:27:12 -04:00
Andrew Gallant	e96d93034a	Finish overhaul of glob matching. This commit completes the initial move of glob matching to an external crate, including fixing up cross platform support, polishing the external crate for others to use and fixing a number of bugs in the process. Fixes #87, #127, #131	2016-10-10 19:24:18 -04:00
Ian Kerins	1c964372ad	Always follow symlinks on explicit file arguments.	2016-10-08 22:40:03 -04:00
Andrew Gallant	175406df01	Refactor and test glob sets. This commit goes a long way toward refactoring glob sets so that the code is easier to maintain going forward. In particular, it makes the literal optimizations that glob sets used a lot more structured and much easier to extend. Tests have also been modified to include glob sets. There's still a bit of polish work left to do before a release. This also fixes the immediate issue where large gitignore files were causing ripgrep to slow way down. While we don't technically fix it for good, we're a lot better about reducing the number of regexes we compile. In particular, if a gitignore file contains thousands of patterns that can't be matched more simply using literals, then ripgrep will slow down again. We could fix this for good by avoiding RegexSet if the number of regexes grows too large. Fixes #134.	2016-10-04 20:28:56 -04:00
Andrew Gallant	925d0db9f0	Add -s/--case-sensitive flag. This flag overrides both --smart-case and --ignore-case. Closes #124.	2016-09-28 16:32:29 -04:00
Garrett Squire	babe80d498	add a max-depth option for directory traversal CR and add integration test	2016-09-27 16:14:53 -07:00
Andrew Gallant	3e78fce3a3	Don't print empty lines in single threaded mode. Fixes #99.	2016-09-26 19:57:23 -04:00
Andrew Gallant	7a3fd1f23f	Add a --null flag. This flag causes a NUL byte to follow any file path in ripgrep's output. Closes #89.	2016-09-26 19:21:17 -04:00
Andrew Gallant	d306403440	Fix an off-by-one error with --column. Fixes #105.	2016-09-26 19:09:59 -04:00
Andrew Gallant	b034b77798	Don't replace NUL bytes when searching binary files as text. This was a result of misinterpreting a feature in grep where NUL bytes are replaced with \n. The primary reason for doing this is to avoid excessive memory usage on truly binary data. However, grep only does this when searching binary files as if they were binary, and which only reports whether the file matched or not. When grep is told to search binary data as text (the -a/--text flag), then it doesn't do any replacement so we shouldn't either. In general, this makes sense, because the user is essentially asserting that a particular file that looks like binary is actually text. In that case, we shouldn't try to replace any NUL bytes. ripgrep doesn't actually support searching binary data for whether it matches or not, so we don't actually need the replace_buf function. However, it does seem like a potentially useful feature.	2016-09-25 21:26:49 -04:00
Andrew Gallant	6a8051b258	Don't union inner literals of repetitions. If we do, this results in extracting `foofoofoo` from `(\wfoo){3}`, which is wrong. This does prevent us from extracting `foofoofoo` from `foo{3}`, which is unfortunate, but we miss plenty of other stuff too. Literal extracting needs a good rethink (all the way down into the regex engine). Fixes #93	2016-09-25 20:10:28 -04:00
Andrew Gallant	ed94aedf27	Permit whitelisting hidden files in ignores. Fixes #90	2016-09-25 18:31:41 -04:00
Andrew Gallant	3d6a39be06	Fix tests on Windows. Mostly this is just using \\ instead of / in paths reported by the OS.	2016-09-25 15:45:51 -04:00
Andrew Schwartzmeyer	a8f3d9e87e	Add --files-with-matches flag. Closes #26. Acts like --count but emits only the paths of files with matches, suitable for piping to xargs. Both mmap and no-mmap searches terminate after the first match is found. Documentation updated and tests added.	2016-09-24 21:40:17 -07:00
Andrew Gallant	1595f0faf5	Add --smart-case. It does what it says on the tin. Closes #70.	2016-09-24 21:51:04 -04:00
Andrew Gallant	8eeb0c0b60	Add --no-ignore-vcs flag. This flag will respect .ignore but not .gitignore. Closes #68.	2016-09-24 21:31:24 -04:00
Andrew Gallant	c8227e0cf3	Don't ignore first path when using --files. This is a docopt oddity, but probably not a bug. If --files is given, then just interpret the pattern (if not empty) as the first file path. Fixes #64.	2016-09-24 20:22:02 -04:00
Andrew Gallant	b941c10b90	Fix directory whitelisting. There was a bug in the translation from a gitignore pattern to a standard glob where `!/dir` wasn't being interpreted as an absolute path. Fixes #67.	2016-09-24 20:10:30 -04:00
Andrew Gallant	71ad9bf393	Fix trailing recursive globs in gitignore. A standard glob of `foo/**` will match `foo`, but gitignore semantics specify that `foo/**` should only match the contents of `foo` and not `foo` itself. We capture those semantics by translating `foo/**` to `foo/**/*`. Fixes #30.	2016-09-24 19:44:06 -04:00
Andrew Gallant	a6e3cab65a	Add --no-filename flag. When this flag is set, a filename is never shown for a match. Closes #20	2016-09-24 19:24:24 -04:00
Andrew Gallant	7b860affbe	Change the default output of --files to elide './'. This is kind of a ticky-tack change. I do think ./ as a prefix is reasonable default, but we strip ./ when showing search results, so it does make sense to be consistent. Fixes #21.	2016-09-24 19:18:48 -04:00
Andrew Gallant	346bad7dfc	Fix handling of absolute patterns in parent gitignore files. If a gitignore file in a parent directory is used, then it must be matched relative to the directory it's in. ripgrep wasn't actually adhering to this rule. Consider an example tree containing `.gitignore` and `src/llvm/foo`, where `.gitignore` contains `/llvm/` and `foo` contains `test`. When running `rg test` at the top-level directory, `foo` is correctly searched. If you `cd` into `src` and re-run the same search, `foo` is ignored because the `/llvm/` pattern is interpreted with respect to the current working directory, which is wrong. The problem is that the path of `llvm` is `./llvm`, which makes it look like it should match. We fix this by rebuilding the directory path of each file when traversing gitignores in parent directories. This does come with a small performance hit. Fixes #25.	2016-09-24 18:40:50 -04:00
Andrew Gallant	a3fc4cdded	Fix a bug in the translation from a gitignore pattern to a glob. We were erroneously neglecting to prefix a pattern like `foo/` with `**/` (to make `**/foo/`) because it had a slash in it. In fact, the only reason to neglect a `**/` prefix is if the pattern already starts with `**/`, or if the pattern is absolute. Fixes #16, #49, #50, #65	2016-09-24 16:29:25 -04:00
Andrew Gallant	cc90511ab2	Switch from .rgignore to .ignore. But don't actually remove support for .rgignore until the next semver bump. Note that this puts us in line with the silver searcher: https://github.com/ggreer/the_silver_searcher/pull/974 Fixes #40	2016-09-23 22:44:33 -04:00
Andrew Gallant	6367dd61ba	Column numbers should start at 1. ripgrep was documented to do 1-based indexing, so this is a bug and not a breaking change. Fixes #18	2016-09-23 17:11:09 -04:00
Andrew Gallant	dfebed6cbe	Add --vimgrep flag. The --vimgrep flag forces a line to be printed for every match, with line and column numbers.	2016-09-22 21:32:38 -04:00
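
The smart-case rule described in commit 301a3fd71d can be illustrated with a small sketch. This is not ripgrep's code: ripgrep inspects a parsed regex AST, whereas the `Piece` enum and both function names below are hypothetical simplifications invented for this example. The sketch assumes the usual smart-case behavior (search case-insensitively only when the pattern has no uppercase literal) and applies the commit's rule that a character class counts as uppercase if any of its ranges starts or stops with an uppercase character.

```rust
/// Hypothetical, simplified view of a pattern. ripgrep itself walks a
/// real regex AST; this enum exists only for illustration.
enum Piece {
    /// A single literal character, e.g. the `f` in `foo`.
    Literal(char),
    /// A character class given as inclusive (start, end) ranges,
    /// e.g. `[a-cX-Z]` becomes `vec![('a', 'c'), ('X', 'Z')]`.
    Class(Vec<(char, char)>),
}

/// Returns true if any piece contains an uppercase literal. A class
/// qualifies when at least one range starts or stops with an
/// uppercase character.
fn has_uppercase_literal(pieces: &[Piece]) -> bool {
    pieces.iter().any(|piece| match piece {
        Piece::Literal(c) => c.is_uppercase(),
        Piece::Class(ranges) => ranges
            .iter()
            .any(|&(start, end)| start.is_uppercase() || end.is_uppercase()),
    })
}

/// Smart case: match case-insensitively only when the pattern has no
/// uppercase literal anywhere.
fn smart_case_insensitive(pieces: &[Piece]) -> bool {
    !has_uppercase_literal(pieces)
}
```

Under these assumptions, a pattern such as `foo[a-z]` would be searched case-insensitively, while `foo[A-Z]` would not, because the class range starts and stops with an uppercase character.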

57 Commits