ripgrep

mirror of https://github.com/BurntSushi/ripgrep.git synced 2024-12-12 19:18:24 +02:00

Author	SHA1	Message	Date
Andrew Gallant	bb110c1ebe	ripgrep: migrate to libripgrep This commit does the work to delete the old `grep` crate and effectively rewrite most of ripgrep core to use the new libripgrep crates. The new `grep` crate is now a facade that collects the various crates that make up libripgrep. The most complex part of ripgrep core is now arguably the translation between command line parameters and the library options, which is ultimately where we want to be.	2018-08-20 07:10:19 -04:00
Andrew Gallant	2913fc4cd0	tests: reduce reliance on state in tests This commit improves the integration test setup by running tests inside the system's temporary directory instead of within ripgrep's `target` directory. The motivation here is to attempt to reduce the effect of unanticipated state on ripgrep's integration tests, such as the presence of `.gitignore` files in ripgrep's checkout directory hierarchy (including parent directories). This doesn't remove all possible state. For example, there's no guarantee that the system's temporary directory isn't itself within a git repository. Moreover, there may still be other ignore rules in the directory tree that might impact test behavior. Fixing this seems somewhat difficult. Conceptually, it seems like ripgrep should run each test in its own `chroot`-like environment, but doing this in a non-annoying and portable way (including Windows) doesn't appear to be possible. Another approach to take here might be to teach ripgrep itself that a particular directory should be treated as root, and therefore, never look at anything outside that directory. This also seems complex to implement, but tractable. Let's see how this approach works for now. Fixes #448, #996	2018-07-29 10:41:03 -04:00
Andrew Gallant	7c412bb2fa	tests/style: 80 columns, dammit	2018-07-29 10:41:03 -04:00
Andrew Gallant	dca8110da2	ripgrep: when given no patterns, don't match Generally speaking, ripgrep prevents the case of not having any patterns via its arg parsing. However, it is possible for users to provide a file of patterns via the `-f` flag. If that file is empty, then ripgrep has nothing to search for and therefore should not ever produce any match. One way of fixing this might be to replace the absence of patterns with a pattern that can never match, but this still requires opening and searching through every file, which is quite a waste. Instead, we detect this case explicitly and quit early. Fixes #900	2018-07-22 12:07:18 -04:00
Andrew Gallant	0d11497d21	tests: be looser with gzip failure Don't expect the exact error message. Instead, just ask that the error message exist and be non-empty. Fixes #903	2018-07-22 11:08:16 -04:00
Andrew Gallant	e65ca21a6c	ignore: only respect .gitignore in git repos This commit fixes an interesting bug in the `ignore` crate where it would basically respect any `.gitignore` file anywhere (including global gitignores in `~/.config/git/ignore`), regardless of whether we were searching in a git repository or not. This commit rectifies that behavior to only respect gitignore rules when in a git repo. The key change here is to move the logic of whether to traverse parents into the directory matcher rather than putting the onus on the directory traverser. In particular, we now need to traverse parent directories in more cases than we previously did, since we need to determine whether we're in a git repository or not. Fixes #934	2018-07-22 10:33:23 -04:00
Charles Blake	231456c409	ripgrep: add --pre flag The preprocessor flag accepts a command program and executes this program for every input file that is searched. Instead of searching the file directly, ripgrep will instead search the stdout contents of the program. Closes #978, Closes #981	2018-07-21 17:25:12 -04:00
Kalle Samuels	1d09d4d31b	ripgrep: add support for lz4 decompression This uses the lz4 binary for decompression. Closes #898	2018-07-21 16:26:39 -04:00
Jon Surrell	ca23a170f7	ripgrep: use exit code 2 to indicate error Exit code 1 was shared to indicate both "no results" and "error." Use status code 2 to indicate errors, similar to grep's behavior. Fixes #948 PR #954	2018-06-19 07:41:44 -04:00
Andrew Gallant	bf51058eb2	tests: fix tests on Windows A bug in the atty crate was previously masking a problem with the integration tests on Windows. Namely, the bug in atty resulted in atty::is(Stdin) returning true if we couldn't get the file name for the stdin stream. This in turn caused tests like `rg foo` to search the CWD, which was the intended behavior. However, once the atty bug was fixed, atty::is(Stdin) no longer returned true, causing `rg foo` searches to fail. On Unix-like systems, the atty behavior has always been correct. However, on Unix-like systems we have a decent way of detecting whether stdin is readable or not. If it isn't---which is the case in the integration tests---then we fall back to searching the CWD. On Windows however, we haven't yet implemented anything to detect whether stdin is readable or not, so we must always assume that it is. Therefore, we never get the "go ahead" to search the CWD and the tests fail. Most of the tests are written to search the CWD explicitly, but there were a few stragglers that don't. This isn't great, and we should try to figure out how to do better stdin detection on Windows.	2018-04-23 20:37:59 -04:00
Andrew Gallant	ae6f871491	output: remove --line-number-width flag This commit does what no software project has ever done before: we've outright removed a flag with no possible way to recapture its functionality. This flag presents numerous problems in that it never really worked well in the first place, and completely falls over when ripgrep uses the --no-heading output format. Well meaning users want ripgrep to fix this by getting into the alignment business by buffering all output, but that is a line that I refuse to cross. Fixes #795	2018-04-23 19:57:22 -04:00
Andrew Gallant	b75526bd7f	output: add --no-column flag This disables columns in the output if they were otherwise enabled. Fixes #880	2018-04-23 19:26:58 -04:00
Andrew Gallant	507801c1f2	ignore: support .git directory OR file This improves support for submodules, which seem to use a '.git' file instead of a '.git' directory to indicate a worktree. Fixes #893	2018-04-23 18:33:25 -04:00
Andrew Gallant	cd08707c7c	grep: upgrade to regex-syntax 0.5 This update brings with it many bug fixes: * Better error messages are printed overall. We also include explicit call out for unsupported features like backreferences and look-around. * Regexes like `\s{` no longer emit incomprehensible errors. Unicode escape sequences, such as `\u{..}` are now supported. For the most part, this upgrade was done in a straight-forward way. We resist the urge to refactor the `grep` crate, in anticipation of it being rewritten anyway. Note that we removed the `--fixed-strings` suggestion whenever a regex syntax error occurs. In practice, I've found that it results in a lot of false positives, and I believe that its use is not as paramount now that regex parse errors are much more readable. Closes #268, Closes #395, Closes #702, Closes #853	2018-03-13 22:55:39 -04:00
Balaji Sivaraman	00520b30f5	output: add --stats flag This commit provides basic support for a --stats flag, which will print various aggregate statistics about a search after all of the results have been printed. This is mostly intended to support a similar feature found in the Silver Searcher. Note though that we don't emit the total bytes searched; this is a first pass at an implementation and we can improve upon it later. Closes #411, Closes #799	2018-03-10 10:59:00 -05:00
Andrew Gallant	11a8f0eaf0	args: treat --count --only-matching as --count-matches Namely, when ripgrep is asked to count things and is also asked to print every match on its own line, then we should just automatically count the matches and not the lines. This is a departure from how GNU grep behaves, but there is a compelling argument to be made that GNU grep's behavior doesn't make a lot of sense. Note that since this changes the behavior of combining two existing flags, this is a breaking change.	2018-03-10 10:38:34 -05:00
Balaji Sivaraman	27fc9f2fd3	search: add a --count-matches flag This commit introduces a new flag, --count-matches, which will cause ripgrep to report a total count of all matches instead of a count of total lines matched. Closes #566, Closes #814	2018-03-10 10:38:25 -05:00
Balaji Sivaraman	b006943c01	search: add -b/--byte-offset flag This commit adds support for printing 0-based byte offset before each line. We handle corner cases such as `-o/--only-matching` and `-C/--context` as well. Closes #812	2018-03-10 10:15:19 -05:00
Brian Malehorn	91d0756f62	ignore: support backslash escaping Use the new `Globset::backslash_escape` knob to conform to git behavior: `\` will escape the following character. For example, the pattern `\*` will match a file literally named `*`. Also tweak a test in ripgrep that was relying on this incorrect behavior. Closes #526, Closes #811	2018-03-10 09:30:55 -05:00
Balaji Sivaraman	d57fc58081	termcolor: add underline support This commit adds underline support to the termcolor crate, and exposes it through ripgrep. Fixes #798	2018-02-20 07:10:03 -05:00
Andrew Gallant	361698b90a	ignore: fix improper hidden filtering This commit fixes a bug where `rg --hidden .` would behave differently with respect to ignore filtering than `rg --hidden ./`. In particular, this was due to a bug where the directory name `.` caused the leading `.` in a hidden directory to get stripped, which in turn caused the ignore rules to fail. Fixes #807	2018-02-14 18:16:38 -05:00
Andrew Gallant	8cb5833ef9	argv: update clap to 2.29.4 We use the new AppSettings::AllArgsOverrideSelf to permit all flags to be specified multiple times. This removes the need for our previous work-around where we would enable `multiple` for every flag and then just extract the last value when consuming clap's matches. We also add a couple regression tests that ensure repeated switches and flags work as expected.	2018-02-06 12:07:59 -05:00
Andrew Gallant	c57d0fb4e8	config: add persistent configuration This commit adds support for reading configuration files that change ripgrep's default behavior. The format of the configuration file is an "rc" style and is very simple. It is defined by two rules: 1. Every line is a shell argument, after trimming ASCII whitespace. 2. Lines starting with '#' (optionally preceded by any amount of ASCII whitespace) are ignored. ripgrep will look for a single configuration file if and only if the RIPGREP_CONFIG_PATH environment variable is set and is non-empty. ripgrep will parse shell arguments from this file on startup and will behave as if the arguments in this file were prepended to any explicit arguments given to ripgrep on the command line. For example, if your ripgreprc file contained a single line: --smart-case then the following command RIPGREP_CONFIG_PATH=wherever/.ripgreprc rg foo would behave identically to the following command rg --smart-case foo This commit also adds a new flag, --no-config, that when present will suppress any and all support for configuration. This includes any future support for auto-loading configuration files from pre-determined paths (which this commit does not add). Conflicts between configuration files and explicit arguments are handled exactly like conflicts in the same command line invocation. That is, this command: RIPGREP_CONFIG_PATH=wherever/.ripgreprc rg foo --case-sensitive is exactly equivalent to rg --smart-case foo --case-sensitive in which case, the --case-sensitive flag would override the --smart-case flag. Closes #196	2018-02-04 10:40:20 -05:00
Balaji Sivaraman	f007f940c5	search: add support for searching compressed files This commit adds opt-in support for searching compressed files during recursive search. This behavior is only enabled when the `-z/--search-zip` flag is passed to ripgrep. When enabled, a limited set of common compression formats are recognized via file extension, and a new process is spawned to perform the decompression. ripgrep then searches the stdout of that spawned process. Closes #539	2018-01-30 09:13:53 -05:00
kennytm	8514d4fbb4	termcolor: tweak reset escape Write `Ansi::reset()` using `\x1b[0m` instead of `\x1b[m`. This works around an AppVeyor bug: https://github.com/appveyor/ci/issues/1824	2018-01-29 14:14:55 -05:00
dana	58bdc366ec	printer: add --passthru flag The --passthru flag causes ripgrep to print every line, even if the line does not contain a match. This is a response to the common pattern of `^\|foo` to match every line, while still highlighting things like `foo`. Fixes #740	2018-01-11 18:45:51 -05:00
Balaji Sivaraman	14779ed0ea	ux: suggest --fixed-strings flag If a regex syntax error occurs, then ripgrep will suggest using the --fixed-strings flag. Fixes #727	2018-01-01 11:24:46 -05:00
Balaji Sivaraman	ba1023e1e4	printer: add support for line number alignment Closes #544	2018-01-01 09:00:31 -05:00
Igor Gnatenko	a5855a5d73	couple of trivial fixes to make clippy a bit more happy (#704) clippy: fix a few lints The fixes are: * Use single quotes for single-character * Use ticks in documentation when necessary. * Just bow to clippy's wisdom.	2017-12-30 16:06:16 -05:00
dana	d73a75d6cd	Omit context separators when using a contextless option like -c or -l Fixes #693	2017-11-29 12:55:42 -05:00
Martin Lindhe	c794ef2f04	fix some typos	2017-11-01 07:10:54 -04:00
Andrew Gallant	2a14bf2249	printer: fix colors on empty matches This fixes a bug where a "match" color escape was erroneously emitted after the new line character. This is because `^` is actually allowed to match after the end of a trailing new line, which means `^$` matches both before and after the trailing new line when multiline mode is enabled. The trailing match was causing the phantom escape sequence to appear, which we don't want. Incidentally, this is the root cause of #441 as well, although this commit doesn't fix that issue, since the line itself is printed before we detect the phantom match. Fixes #599	2017-10-21 22:40:10 -04:00
Evgeny Kulikov	f887bc1f86	printer: --only-matching works with --replace When -o/--only-matching is used with -r/--replace, the replacement works as expected. This is not a breaking change because the flags were previously set to conflict.	2017-10-20 20:58:27 -04:00
Sebastian Nowicki	363a4fa9b7	Fix path passed to try_create_bytes()	2017-10-20 20:51:12 -04:00
Sebastian Nowicki	712311fdc6	Don't create command until we know we can test it For regression 210 we may not actually need to test anything if the file system doesn't support creating files with invalid UTF-8 bytes. Don't create the command until we know there will be an assertion.	2017-10-20 20:51:12 -04:00
Sebastian Nowicki	0d2354aca6	Wrap comments to 79 columns	2017-10-20 20:51:12 -04:00
Sebastian Nowicki	8dc513b5d2	Skip regression 210 test on APFS APFS does not support creating filenames with invalid UTF-8 byte codes, thus this test doesn't make sense. Skip it on file systems where this shouldn't be possible. Fixes #559	2017-10-20 20:51:12 -04:00
Andrew Gallant	73c9ac4da5	integration tests: ignore regression_428 on Windows The test is severely constrained to the specific ANSI formatting of ripgrep in accordance with its default color scheme. The default color scheme on Windows changed, which caused the test to fail. For now, just disable the test on Windows.	2017-08-23 17:49:40 -04:00
dana	40bacbcd7c	Add -x/--line-regexp (#520) add -x/--line-regexp flag	2017-08-09 06:53:35 -04:00
dana	b7c3cf314d	Add test for option-arguments with leading hyphens	2017-07-30 17:55:24 -04:00
dana	6dce04963d	Allow options with non-numeric arguments to accept leading hyphens in arguments (fixes #568)	2017-07-30 17:55:24 -04:00
Peter S Panov	4047d9db71	add --iglob flag Working with Chris Stadler, implemented https://github.com/BurntSushi/ripgrep/issues/163#issuecomment-300012592	2017-07-03 06:52:52 -04:00
Evan.Mattiza	06393f888c	fix word boundary w/ capture group fixes BurntSushi/ripgrep#506. Word boundary search as arg had unexpected behavior. added capture group to regex to encapsulate 'or' option search and prevent expansion and partial boundary finds. Signed-off-by: Evan.Mattiza <emattiza@gmail.com>	2017-06-15 06:55:55 -04:00
Andrew Gallant	112b3c5e0a	Fix another bug in -o/--only-matching. The handling of the -o/--only-matching was incorrect. We cannot ever re-run regexes on a subset of a matched line, because it doesn't take into account zero width assertions on the edges of the regex. This occurs whenever an end user uses an assertion explicitly, but also occurs when one is used implicitly, e.g., with the `-w` flag. This instead reuses the initial matched range from the first regex match. We also apply this fix to coloring. Fixes #493	2017-05-29 09:51:58 -04:00
Marc Tiehuis	229b8e3b33	Make --quiet flag apply when using --files option Fixes #483.	2017-05-19 20:00:47 -04:00
Roman Proskuryakov	362abed44a	Fix reiteration of the first found match with --only-matching flag Fixes #451	2017-04-21 08:11:55 -04:00
Andrew Gallant	7ad23e5565	Use for_label_no_replacement. This will cause certain unsupported legacy encodings to act as if they don't exist, in order to avoid using an unhelpful (in the context of file searching) "replacement" encoding. Kudos to @hsivonen for chirping about this!	2017-04-12 18:14:23 -04:00
Marc Tiehuis	66efbad871	Add dfa-size-limit and regex-size-limit arguments Fixes #362.	2017-04-12 18:14:23 -04:00
Roman Proskuryakov	90a11dec5e	Add `-o/--only-matching` flag. Currently, the `--only-matching` flag conflicts with the `--replace` flag. In the future, this restriction may be relaxed. Fixes #34	2017-04-09 08:47:35 -04:00
Roman Proskuryakov	aed3ccb9c7	Improves Printer, fixes some bugs	2017-03-31 14:44:13 -04:00
Roman Proskuryakov	01deac9427	Add -0 shortcut for --null Fixes #419	2017-03-28 18:37:40 -04:00
Ralf Jung	d352b79294	Add new -M/--max-columns option. This permits setting the maximum line width with respect to the number of bytes in a line. Omitted lines (whether part of a match, replacement or context) are replaced with a message stating that the line was elided. Fixes #129	2017-03-12 21:21:28 -04:00
Andrew Gallant	8bbe58d623	Add support for additional text encodings. This includes, but is not limited to, UTF-16, latin-1, GBK, EUC-JP and Shift_JIS. (Courtesy of the `encoding_rs` crate.) Specifically, this feature enables ripgrep to search files that are encoded in an encoding other than UTF-8. The list of available encodings is tied directly to what the `encoding_rs` crate supports, which is in turn tied to the Encoding Standard. The full list of available encodings can be found here: https://encoding.spec.whatwg.org/#concept-encoding-get This pull request also introduces the notion that text encodings can be automatically detected on a best effort basis. Currently, the only support for this is checking for a UTF-16 bom. In all other cases, a text encoding of `auto` (the default) implies a UTF-8 or ASCII compatible source encoding. When a text encoding is otherwise specified, it is unconditionally used for all files searched. Since ripgrep's regex engine is fundamentally built on top of UTF-8, this feature works by transcoding the files to be searched from their source encoding to UTF-8. This transcoding only happens when: 1. `auto` is specified and a non-UTF-8 encoding is detected. 2. A specific encoding is given by end users (including UTF-8). When transcoding occurs, errors are handled by automatically inserting the Unicode replacement character. In this case, ripgrep's output is guaranteed to be valid UTF-8 (excluding non-UTF-8 file paths, if they are printed). In all other cases, the source text is searched directly, which implies an assumption that it is at least ASCII compatible, but where UTF-8 is most useful. In this scenario, encoding errors are not detected. In this case, ripgrep's output will match the input exactly, byte-for-byte. This design may not be optimal in all cases, but it has some advantages: 1. In the happy path ("UTF-8 everywhere") remains happy. I have not been able to witness any performance regressions. 2. In the non-UTF-8 path, implementation complexity is kept relatively low. The cost here is transcoding itself. A potentially superior implementation might build decoding of any encoding into the regex engine itself. In particular, the fundamental problem with transcoding everything first is that literal optimizations are nearly negated. Future work should entail improving the user experience. For example, we might want to auto-detect more text encodings. A more elaborate UX experience might permit end users to specify multiple text encodings, although this seems hard to pull off in an ergonomic way. Fixes #1	2017-03-12 19:54:48 -04:00
Andrew Gallant	6ecffec537	Fix test on Windows. (This is what I get for directly pushing to master.)	2017-03-12 16:07:31 -04:00
Andrew Gallant	80e91a1f1d	Fix leading slash bug when used with `!`. When writing paths like `!/foo` in gitignore files (or when using the -g/--glob flag), the presence of `!` would prevent the gitignore builder from noticing the leading slash, which causes absolute path matching to fail. Fixes #405	2017-03-12 15:51:17 -04:00
Marc Tiehuis	adff43fbb4	Remove clap validator + add max-filesize integration tests	2017-03-08 10:17:18 -05:00
tiehuis	714ae82241	Add `--max-filesize` option to cli The --max-filesize option allows filtering files which are larger than the specified limit. This is potentially useful if one is attempting to search a number of large files without common file-types/suffixes. See #369.	2017-03-08 10:17:18 -05:00
Marc Tiehuis	066f97d855	Add enclosing group to alternations in globs Fixes #391.	2017-03-08 10:13:28 -05:00
Andrew Gallant	7a951f103a	Make --column imply --line-number. Closes #243	2017-01-11 18:53:35 -05:00
Andrew Gallant	8751e55706	Add --path-separator flag. This flag permits setting the path separator used for all file paths printed by ripgrep in normal operation. Fixes #275	2017-01-10 18:16:15 -05:00
Andrew Gallant	97e6873b38	Fix type compose test.	2017-01-07 22:50:38 -05:00
Ian Kerins	ed01e80a79	Provide a mechanism to compose type definitions This extends the syntax of the --type-add flag to allow including the globs of other already defined types. Fixes #83.	2017-01-07 18:14:24 -05:00
Andrew Gallant	b65a8c353b	Add --sort-files flag. When used, parallelism is disabled but the results are sorted by file path. Closes #263	2017-01-06 22:43:59 -05:00
Andrew Gallant	bb70f96743	Fix a non-termination bug. This was a very silly bug. Instead of creating a particular atomic once and cloning it, we created a new value for each worker. Fixes #279	2016-12-12 06:55:49 -05:00
Andrew Gallant	d66812102b	Fix leading hyphen bug by updating clap. Fixes #270	2016-12-06 17:29:34 -05:00
Andrew Gallant	7282706b42	Fix bug reading root symlink. When given an explicit file path on the command line like `foo` where `foo` is a symlink, ripgrep should follow it even if `-L` isn't set. This is consistent with the behavior of `foo/`. Fixes #256	2016-12-05 20:05:57 -05:00
Andrew Gallant	0473df1ef5	Disable Unicode mode for literal regex. When ripgrep detects a literal, it emits them as raw hex escaped byte sequences to Regex::new. This permits literal optimizations for arbitrary byte sequences (i.e., possibly invalid UTF-8). The problem is that Regex::new interprets hex escaped byte sequences as Unicode codepoints by default, but we want them to actually stand for their raw byte values. Therefore, disable Unicode mode. This is OK, since the regex is composed entirely of literals and literal extraction does Unicode case folding. Fixes #251	2016-11-28 18:31:58 -05:00
Andrew Gallant	301a3fd71d	Detect more uppercase literals for --smart-case. This changes the uppercase literal detection for the "smart case" functionality. In particular, a character class is considered to have an uppercase literal if at least one of its ranges starts or stops with an uppercase literal. Fixes #229	2016-11-28 17:57:26 -05:00
Andrew Gallant	03f7605322	Rename --files-without-matches to --files-without-match. This is to be consistent with grep.	2016-11-19 20:15:41 -05:00
Daniel Luz	bd3e7eedb1	Add --files-without-matches flag. Performs the opposite of --files-with-matches: only shows paths of files that contain zero matches. Closes #138	2016-11-19 21:48:59 -02:00
Andrew Gallant	e37f783fc0	Fix issue number mixup. Thanks @bluss!	2016-11-17 20:30:18 -05:00
Andrew Gallant	92dc402f7f	Switch from Docopt to Clap. There were two important reasons for the switch: 1. Performance. Docopt does poorly when the argv becomes large, which is a reasonably common use case for search tools. (e.g., use with xargs) 2. Better failure modes. Clap knows a lot more about how a particular argv might be invalid, and can therefore provide much clearer error messages. While both were important, (1) made it urgent. Note that since Clap requires at least Rust 1.11, this will in turn increase the minimum Rust version supported by ripgrep from Rust 1.9 to Rust 1.11. It is therefore a breaking change, so the soonest release of ripgrep with Clap will have to be 0.3. There is also at least one subtle breaking change in real usage. Previous to this commit, this used to work: rg -e -foo Where this would cause ripgrep to search for the string `-foo`. Clap currently has problems supporting this use case (see: https://github.com/kbknapp/clap-rs/issues/742), but it can be worked around by using this instead: rg -e [-]foo or even rg [-]foo and this still works: rg -- -foo This commit also adds Bash, Fish and PowerShell completion files to the release, fixes a bug that prevented ripgrep from working on file paths containing invalid UTF-8 and shows short descriptions in the output of `-h` but longer descriptions in the output of `--help`. Fixes #136, Fixes #189, Fixes #210, Fixes #230	2016-11-17 19:53:41 -05:00
Eric Kidd	e9cd0a1cc3	Allow specifying patterns with `-f FILE` and `-f-` This is a somewhat basic implementation of `-f-` (#7), with unit tests. Changes include: 1. The internals of the `pattern` function have been refactored to avoid code duplication, but there's a lot more we could do. Right now we read the entire pattern list into a `Vec`. 2. There's now a `WorkDir::pipe` command that allows sending standard input to `rg` when testing. Not implemented: aho-corasick.	2016-11-15 13:00:16 -05:00
Andrew Gallant	4b18f82899	Disable symlink tests on Windows. For some reason, these work on AppVeyor but not in other build systems. Let's just disable them. See: https://github.com/rust-lang/rust/pull/37149	2016-11-11 06:44:23 -05:00
Andrew Gallant	2dce0dc0df	Fix a bug with handling --ignore-file. Namely, passing a directory to --ignore-file caused ripgrep to allocate memory without bound. The issue was that I got a bit overzealous with partial error reporting. Namely, when processing a gitignore file, we should try to use every pattern even if some patterns are invalid globs (e.g., a**b). In the process, I applied the same logic to I/O errors. In this case, it manifested by attempting to read lines from a directory, which appears to yield Results forever, where each Result is an error of the form "you can't read from a directory silly." Since I treated it as a partial error, ripgrep was just spinning and accruing each error in memory, which caused the OOM killer to kick in. Fixes #228	2016-11-09 16:45:23 -05:00
Andrew Gallant	58aca2efb2	Add -m/--max-count flag. This flag limits the number of matches printed per file. Closes #159	2016-11-06 13:09:53 -05:00
Andrew Gallant	0222e024fe	Fixes a bug with --smart-case. This was a subtle bug, but the big picture was that the smart case information wasn't being carried through to the literal extraction in some cases. When this happened, it was possible to get back an incomplete set of literals, which would therefore miss some valid matches. The fix to this is to actually parse the regex and determine whether smart case applies before doing anything else. It's a little extra work, but parsing is pretty fast. Fixes #199	2016-11-06 12:07:47 -05:00
Andre Bogus	02de97b8ce	Use the bytecount crate for fast line counting. Fixes #128	2016-11-05 22:29:26 -04:00
Andrew Gallant	16975797fe	Fixes a matching bug in the glob override matcher. This was probably a transcription error when moving the ignore matcher code out of ripgrep core. Specifically, the override glob matcher should not ignore directories if they don't match. Fixes #206	2016-10-31 19:54:38 -04:00
Andrew Gallant	d79add341b	Move all gitignore matching to separate crate. This PR introduces a new sub-crate, `ignore`, which primarily provides a fast recursive directory iterator that respects ignore files like gitignore and other configurable filtering rules based on globs or even file types. This results in a substantial source of complexity moved out of ripgrep's core and into a reusable component that others can now (hopefully) benefit from. While much of the ignore code carried over from ripgrep's core, a substantial portion of it was rewritten with the following goals in mind: 1. Reuse matchers built from gitignore files across directory iteration. 2. Design the matcher data structure to be amenable for parallelizing directory iteration. (Indeed, writing the parallel iterator is the next step.) Fixes #9, #44, #45	2016-10-29 20:48:59 -04:00
Andrew Gallant	f2e1711781	Fix bug when processing parent gitignore files. This particular bug was triggered whenever a search was run in a directory with a parent directory that contains a relevant .gitignore file. In particular, before matching against a parent directory's gitignore rules, a path's leading `./` was not stripped, which results in errant matching. We now make sure `./` is stripped. Fixes #184.	2016-10-16 10:15:11 -04:00
Andrew Gallant	4737326ed3	Update regex-syntax for bug fix. The bug fix was in expression pretty printing. ripgrep parses the regex into an AST and may do some modifications to it, which requires the ability to go from string -> AST -> string' -> AST' where string == string' implies AST == AST'. Also, add a regression test for the specific regex that tripped the bug. Fixes #156.	2016-10-10 22:04:29 -04:00
Andrew Gallant	a3537aa32a	Update darwin cfg attributes.	2016-10-10 21:48:47 -04:00
Andrew Gallant	4e52059ad6	Disable regression_131 test on darwin. It's not clear why it's failing. Maybe it doesn't permit certain characters in file paths?	2016-10-10 21:03:11 -04:00
Andrew Gallant	27a980c1bc	Fix symlink test. We attempt to run it on Windows, but I'm getting "access denied" errors when trying to create a file symlink. So we disable the test on Windows.	2016-10-10 19:34:57 -04:00
Andrew Gallant	e8645dc8ae	style nits	2016-10-10 19:27:12 -04:00
Andrew Gallant	e96d93034a	Finish overhaul of glob matching. This commit completes the initial move of glob matching to an external crate, including fixing up cross platform support, polishing the external crate for others to use and fixing a number of bugs in the process. Fixes #87, #127, #131	2016-10-10 19:24:18 -04:00
Ian Kerins	1c964372ad	Always follow symlinks on explicit file arguments.	2016-10-08 22:40:03 -04:00
Andrew Gallant	175406df01	Refactor and test glob sets. This commit goes a long way toward refactoring glob sets so that the code is easier to maintain going forward. In particular, it makes the literal optimizations that glob sets used a lot more structured and much easier to extend. Tests have also been modified to include glob sets. There's still a bit of polish work left to do before a release. This also fixes the immediate issue where large gitignore files were causing ripgrep to slow way down. While we don't technically fix it for good, we're a lot better about reducing the number of regexes we compile. In particular, if a gitignore file contains thousands of patterns that can't be matched more simply using literals, then ripgrep will slow down again. We could fix this for good by avoiding RegexSet if the number of regexes grows too large. Fixes #134.	2016-10-04 20:28:56 -04:00
Andrew Gallant	925d0db9f0	Add -s/--case-sensitive flag. This flag overrides both --smart-case and --ignore-case. Closes #124.	2016-09-28 16:32:29 -04:00
Garrett Squire	babe80d498	add a max-depth option for directory traversal CR and add integration test	2016-09-27 16:14:53 -07:00
Andrew Gallant	3e78fce3a3	Don't print empty lines in single threaded mode. Fixes #99.	2016-09-26 19:57:23 -04:00
Andrew Gallant	7a3fd1f23f	Add a --null flag. This flag causes a NUL byte to follow any file path in ripgrep's output. Closes #89.	2016-09-26 19:21:17 -04:00
Andrew Gallant	d306403440	Fix an off-by-one error with --column. Fixes #105.	2016-09-26 19:09:59 -04:00
Andrew Gallant	b034b77798	Don't replace NUL bytes when searching binary files as text. This was a result of misinterpreting a feature in grep where NUL bytes are replaced with \n. The primary reason for doing this is to avoid excessive memory usage on truly binary data. However, grep only does this when searching binary files as if they were binary, and which only reports whether the file matched or not. When grep is told to search binary data as text (the -a/--text flag), then it doesn't do any replacement so we shouldn't either. In general, this makes sense, because the user is essentially asserting that a particular file that looks like binary is actually text. In that case, we shouldn't try to replace any NUL bytes. ripgrep doesn't actually support searching binary data for whether it matches or not, so we don't actually need the replace_buf function. However, it does seem like a potentially useful feature.	2016-09-25 21:26:49 -04:00
Andrew Gallant	6a8051b258	Don't union inner literals of repetitions. If we do, this results in extracting `foofoofoo` from `(\wfoo){3}`, which is wrong. This does prevent us from extracting `foofoofoo` from `foo{3}`, which is unfortunate, but we miss plenty of other stuff too. Literal extracting needs a good rethink (all the way down into the regex engine). Fixes #93	2016-09-25 20:10:28 -04:00
Andrew Gallant	ed94aedf27	Permit whitelisting hidden files in ignores. Fixes #90	2016-09-25 18:31:41 -04:00
Andrew Gallant	3d6a39be06	Fix tests on Windows. Mostly this is just using \\ instead of / in paths reported by the OS.	2016-09-25 15:45:51 -04:00
Andrew Schwartzmeyer	a8f3d9e87e	Add --files-with-matches flag. Closes #26. Acts like --count but emits only the paths of files with matches, suitable for piping to xargs. Both mmap and no-mmap searches terminate after the first match is found. Documentation updated and tests added.	2016-09-24 21:40:17 -07:00
Andrew Gallant	1595f0faf5	Add --smart-case. It does what it says on the tin. Closes #70.	2016-09-24 21:51:04 -04:00
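
The two configuration-file rules stated in the "config: add persistent configuration" commit (c57d0fb4e8) above can be illustrated with a minimal sketch. This is not ripgrep's actual implementation: the function names `parse_config_args` and `effective_args` are hypothetical, the input is assumed to be valid UTF-8 already read into a string, and skipping empty lines is an assumption that the commit message does not spell out.

```rust
/// Hypothetical helper (not ripgrep's API): turn rc-style config text into
/// a list of arguments, following the two rules from the commit message.
fn parse_config_args(contents: &str) -> Vec<String> {
    contents
        .lines()
        // Rule 1: every line is one argument, after trimming ASCII whitespace.
        .map(|line| line.trim_matches(|c: char| c.is_ascii_whitespace()))
        // Rule 2: lines starting with '#' (after optional whitespace) are ignored.
        // Assumption: empty lines are skipped as well.
        .filter(|line| !line.is_empty() && !line.starts_with('#'))
        .map(String::from)
        .collect()
}

/// Hypothetical helper: config arguments are prepended to the explicit
/// command line arguments, so a later explicit flag can override them.
fn effective_args(config: &str, cli: &[&str]) -> Vec<String> {
    let mut args = parse_config_args(config);
    args.extend(cli.iter().map(|s| s.to_string()));
    args
}

fn main() {
    let rc = "# default to smart case\n  --smart-case  \n";
    let args = effective_args(rc, &["foo", "--case-sensitive"]);
    // prints ["--smart-case", "foo", "--case-sensitive"]
    println!("{:?}", args);
}
```

Because the config arguments come first, the resulting list mirrors the commit's own example, `rg --smart-case foo --case-sensitive`, where the explicit `--case-sensitive` appears later and so takes precedence.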

168 Commits