ripgrep

mirror of https://github.com/BurntSushi/ripgrep.git synced 2024-12-12 19:18:24 +02:00

Author	SHA1	Message	Date
Andrew Gallant	7b72e982f2	deps: update everything	2023-07-05 14:04:29 -04:00
Andrew Gallant	a68db3ac02	deps: drop temporary patch and move to bstr 1.6 Now that regex 1.9 is out, we can depend on it from crates.io.	2023-07-05 14:04:29 -04:00
Andrew Gallant	b12905daca	deps: update everything	2023-07-05 14:04:29 -04:00
Andrew Gallant	ca740d9ace	regex: add new inner literal extractor This is mostly a copy of the prefix literal extractor in regex-syntax, but with a tweaked notion of Seq that keeps track of whether it's a prefix of an expression or not. If it isn't, then we can't cross it as a suffix to another Seq. This new extractor should be a lot more robust than the old one. We actually will keep going through the regex to try and find the "best" literals to search for (according to some heuristic).	2023-07-05 14:04:29 -04:00
Andrew Gallant	e80c102dee	regex: tweak formatting of regex-automata version spec This makes it easier to enable the `logging` feature for regex-automata. I wish I could just enable it unconditionally, but it winds up producing a lot of output because ripgrep uses regexes for things other than the primary search (like every glob). Sigh.	2023-07-05 14:04:29 -04:00
Andrew Gallant	8ac66a9e04	regex: refactor matcher construction This does a little bit of refactoring so that we can pass both a ConfiguredHIR and a Regex to the inner literal extraction routine. One downside of this approach is that a regex object hangs on to a ConfiguredHIR. But the extra memory usage is probably negligible. A benefit though is that converting the HIR to its concrete syntax is now lazy and only happens when logging is enabled.	2023-07-05 14:04:29 -04:00
Andrew Gallant	04dde9a4eb	regex: tweak DFA settings This increases the limits a bit for when the regex engine will build and use a fully compiled DFA. They can be faster in some circumstances. For example, '(?-u)^\w{30,}$' gets a nice speed boost from state acceleration. We are also able to remove `regex` proper as a dependency. Wow.	2023-07-05 14:04:29 -04:00
Andrew Gallant	81341702af	regex: push more pattern handling to matcher construction Previously, ripgrep core was responsible for escaping regex patterns and implementing the --line-regexp flag. This commit moves that responsibility down into the matchers such that ripgrep just needs to hand the patterns it gets off to the matcher builder. The builder will then take care of escaping and all that. This was done to make pattern construction completely owned by the matcher builders. With the arrival of regex-automata, this means we can move to the HIR very quickly and then never move back to the concrete syntax. We can then build our regex directly from the HIR. This overall can save quite a bit of time, especially when searching for large dictionaries. We still aren't quite as fast as GNU grep when searching something on the scale of /usr/share/dict/words, but we are basically within spitting distance. Prior to this, we were about an order of magnitude slower. This architecture in particular lets us write a pretty simple fast path that avoids AST parsing and HIR translation entirely: the case where one is just searching for a literal. In that case, we can hand construct the HIR directly.	2023-07-05 14:04:29 -04:00
Andrew Gallant	d34c5c88a7	globset: fix build error in tests I guess we haven't been testing with the Serde feature enabled? Weird.	2023-07-05 14:04:29 -04:00
Andrew Gallant	4b8aa91ae5	deps: update to pcre2 0.2.4 0.2.4 updates to PCRE2 10.42 and has a few other nice changes. For example, when `utf` is enabled, the crate will always set the PCRE2_MATCH_INVALID_UTF option. That means we no longer need to do transcoding or UTF-8 validity checks. Because of this, we actually get to remove one of the two uses of `unsafe` in ripgrep's `main` program. (This also updates a couple other dependencies for convenience.)	2023-07-05 14:04:29 -04:00
Andrew Gallant	a775b493fd	regex: small cleanups Just some small polishing. We also get rid of thread_local in favor of using regex-automata, mostly just in the name of reducing dependencies. (We should eventually be able to drop thread_local completely.)	2023-07-05 14:04:29 -04:00
Andrew Gallant	a6dbff502f	regex: s/locations/captures Now that we use regex-automata, we no longer use any type with "locations" in it. Instead, that's mostly legacy from the top-level regex crate.	2023-07-05 14:04:29 -04:00
Andrew Gallant	51480d57a6	regex: simplify AST analysis a bit The verbatim literal stuff hasn't been used for a while and I don't foresee it being used. If it's really needed, it would probably be better to just implement it by looking at the pattern string itself, which avoids parsing it into an AST altogether.	2023-07-05 14:04:29 -04:00
Andrew Gallant	d9bd261be8	regex: some small cleanup in 'strip.rs' We also utilize bstr's methods to get rid of some helpers we had written by hand.	2023-07-05 14:04:29 -04:00
Andrew Gallant	9d62eb997a	BREAKING: regex: finally remove CRLF hack Now that Rust's regex crate finally supports a CRLF mode, we can remove this giant hack in ripgrep to enable it. (And assuredly did not work in all cases.) The way this works in the regex engine is actually subtly different than what ripgrep previously did. Namely, --crlf would previously treat either \r\n or \n as a line terminator. But now it treats \r\n, \n and \r as line terminators. In effect, it is implemented by treating \r and \n as line terminators, but ^ and $ will never match at a position between a \r and a \n. So basically this means that $ will end up matching in more cases than might be intended, but I don't expect this to be a big problem in practice. Note that passing --crlf to ripgrep and enabling CRLF mode in the regex via the `R` inline flag (e.g., `(?R:$)`) are subtly different. The `R` flag just controls the regex engine, but --crlf instructs all of ripgrep to use \r\n as a line terminator. There are likely some inconsistencies or corner cases that are wrong as a result of this cognitive dissonance, but we choose to leave well enough alone for now. Fixing this for real will probably require re-thinking how line terminators are handled in ripgrep. For example, one "problem" with how they're handled now is that ripgrep will re-insert its own line terminators when printing output instead of copying the input. This is maybe not so great and perhaps unexpected. (ripgrep probably can't get away with not inserting any line terminators. Users probably expect files that don't end with a line terminator whose last line matches to have a line terminator inserted.)	2023-07-05 14:04:29 -04:00
Andrew Gallant	e028ea3792	regex: migrate grep-regex to regex-automata We just do a "basic" dumb migration. We don't try to improve anything here.	2023-07-05 14:04:29 -04:00
Andrew Gallant	1035f6b1ff	deps: initial migration steps to regex 1.9 This leaves the grep-regex crate in tatters. Pretty much the entire thing needs to be re-worked. The upshot is that it should result in some big simplifications. I hope. The idea here is to drop down and actually use regex-automata 0.3 instead of the regex crate itself.	2023-07-05 14:04:29 -04:00
Andrew Gallant	a7f1276021	readme: update Debian instructions We probably don't need to mention Buster specifically nor Debian unstable since ripgrep has been in Debian for a while now. But we can't just get rid of the `deb` file either, because Debian might package a very old version. Fixes #2531	2023-06-12 07:50:13 -04:00
Martin Nordholts	4fcb1b2202	cli: replace atty with std::io::IsTerminal The `atty` crate is unmaintained[1] and `std::io::IsTerminal` was stabilized in Rust 1.70. [1]: https://rustsec.org/advisories/RUSTSEC-2021-0145.html PR #2526	2023-06-05 14:00:46 -04:00
Francois Marier	949092fd22	ignore/types: add 'mdwn' to Markdown PR #2520	2023-05-26 14:44:41 -04:00
Andrew Gallant	4a7e7094ad	deps: update everything else	2023-05-25 13:06:13 -04:00
Andrew Gallant	fc0d9b90a9	deps: bump regex to 1.8.3 This brings in an update from the regex crate that fixes a matching bug for particular kinds of alternations of literals. Fixes #2518	2023-05-25 13:06:13 -04:00
Ville Skyttä	335aa4937a	ignore/types: add *.pyi for Python https://peps.python.org/pep-0484/#stub-files PR #2517	2023-05-23 07:10:02 -04:00
Adam Reichold	803c447845	searcher: re-enable mmap on 32-bit architectures memmap2 v0.3.0 introduced a regression when trying to map files larger than 4GB on 32-bit architectures[1] which was subsequently fixed in v0.3.1[2]. This commit bumps the locked version of the memmap2 dependency to the current v0.5.0 and reverts `fdfc418be5` to re-enable mmap on 32-bit architectures as a different approach to fixing [3]. This was tested to report matches from the end of a 5GB file using MinGW and Wine. Ref #1911, PR #2000 [1] `5e271224c8` [2] `9aa838aed9` [3] https://github.com/BurntSushi/ripgrep/issues/1911	2023-05-19 08:23:53 -04:00
Andrew Gallant	c5415adbe8	deps: update everything This does unfortunately bring in both regex-syntax 0.6 and 0.7, but we'll fix that once regex 1.9 is out.	2023-05-16 13:14:23 -04:00
Andrew Gallant	251376597f	deps: update minimum version of grep crate Ref #2516	2023-05-16 13:13:34 -04:00
Andrew Gallant	e593f5b7ee	grep-0.2.12	2023-05-16 13:12:45 -04:00
Andrew Gallant	6b19be2477	crates/grep: remove 'deny(missing_docs)' This crate is only a shim over a bunch of other crates. I'm not sure that there's anything to add to each of the `pub extern` items. So instead of just writing fluff, I removed the lint. Fixes #2516	2023-05-16 13:10:42 -04:00
Ryan Whitehouse	041544853c	doc: fix --quiet docs The wording was previously inverted, which had the opposite meaning as was intended. Fixes #1962	2023-03-28 07:22:59 -04:00
Manu	a7ae9e4043	ignore/types: add support for docker-compose files Default file is docker-compose.yml and the documentation mentions overrides in the form of docker-compose.*.yml. PR #2469	2023-03-21 12:56:38 -04:00
Andrew Gallant	595e7845b8	readme: add a link to delta's support for ripgrep Ref: https://github.com/BurntSushi/ripgrep/issues/86#issuecomment-1469717706	2023-03-15 08:02:04 -04:00
David Ringo	44fb9fce2c	ignore/types: add *.sln for msbuild .sln is the extension for Visual Studio Project Solution files, one of the file types accepted as inputs by MSBuild. PR #2415	2023-02-09 21:20:49 -05:00
Vincent Bockaert	339c46a6ed	ignore/types: enhance terraform default filter The default filter for terraform only checks for .tf files, but there are quite a few other terraform filetypes. The explanation for all of them can be found below (including link to documentation from Hashicorp at time of writing) - .tf.json & .tfvars.json is to capture the files written in JSON-based variant of the Terraform language - https://developer.hashicorp.com/terraform/language/files - .tfvars is used to supply variables - https://developer.hashicorp.com/terraform/cloud-docs/workspaces/variables#6-auto-tfvars-variable-files - .terraform.lock.hcl is used as a Dependency lock file - https://developer.hashicorp.com/terraform/language/files/dependency-lock - terraform.rc & .terraformrc, *.tfrc - https://developer.hashicorp.com/terraform/cli/config/config-file PR #2412	2023-02-09 12:57:01 -05:00
Andrew Gallant	fe97c0a152	ignore-0.4.20	2023-01-15 08:21:02 -05:00
Christian Vallentin	826f3fad5b	ignore/api: add Clone and Debug impls for OverrideBuilder PR #2397	2023-01-15 08:16:27 -05:00
Andrew Gallant	bc55049327	readme: update MSRV in README ... this was apparently long outdated, wow.	2023-01-05 12:09:46 -05:00
Andrew Gallant	d58e9353fc	deps: update to grep 0.2.11	2023-01-05 09:13:47 -05:00
Andrew Gallant	ca60fef4db	grep-0.2.11	2023-01-05 09:12:49 -05:00
Andrew Gallant	a25307d6c8	deps: update to grep-printer 0.1.7	2023-01-05 09:12:37 -05:00
Andrew Gallant	b80947a8b3	grep-printer-0.1.7	2023-01-05 09:11:16 -05:00
Andrew Gallant	ad793a0d8f	deps: update to grep-searcher 0.1.11	2023-01-05 09:07:49 -05:00
Andrew Gallant	120e55e7c7	grep-searcher-0.1.11	2023-01-05 09:07:09 -05:00
Andrew Gallant	3941a7701d	deps: update to grep-pcre2 0.1.6	2023-01-05 09:06:52 -05:00
Andrew Gallant	96e130fbf9	grep-pcre2-0.1.6	2023-01-05 09:05:59 -05:00
Andrew Gallant	180c4eaf8b	deps: update to grep-regex 0.1.11	2023-01-05 09:05:39 -05:00
Andrew Gallant	81529288cf	grep-regex-0.1.11	2023-01-05 09:02:55 -05:00
Andrew Gallant	bcc7473a87	deps: update to grep-matcher 0.1.6	2023-01-05 09:02:40 -05:00
Andrew Gallant	bc78c644db	grep-matcher-0.1.6	2023-01-05 09:00:33 -05:00
Andrew Gallant	dc7267a0fb	deps: update to grep-cli 0.1.7	2023-01-05 08:58:47 -05:00
Andrew Gallant	3224324e25	grep-cli-0.1.7	2023-01-05 08:57:31 -05:00
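
The line terminator rule described in commit 9d62eb997a above (both \r and \n end a line, but `$` never matches between a \r and the \n that follows it) can be sketched with the standard library alone. This is only an illustration of the rule as the commit message states it, not the regex engine's actual code; the function names `is_line_end_crlf` and `line_ends` are hypothetical.

```rust
/// Returns true if `$` could match at byte offset `at` under the rule from
/// the commit message: \r and \n both terminate a line, but no match is
/// allowed between a \r and an immediately following \n.
/// (Hypothetical helper, not part of ripgrep or the regex crate.)
fn is_line_end_crlf(haystack: &[u8], at: usize) -> bool {
    if at > haystack.len() {
        return false; // out of bounds: never a match
    }
    if at == haystack.len() {
        return true; // the end of the haystack always ends a line
    }
    match haystack[at] {
        // A \r always terminates the line, whether or not a \n follows.
        b'\r' => true,
        // A \n terminates the line unless it is the second half of \r\n.
        b'\n' => at == 0 || haystack[at - 1] != b'\r',
        _ => false,
    }
}

/// Collects every offset at which `$` could match (hypothetical helper).
fn line_ends(haystack: &[u8]) -> Vec<usize> {
    (0..=haystack.len())
        .filter(|&i| is_line_end_crlf(haystack, i))
        .collect()
}

fn main() {
    // Offsets: a=0 \r=1 \n=2 b=3 \r=4 c=5 \n=6 d=7, end=8
    let hay = b"a\r\nb\rc\nd";
    println!("{:?}", line_ends(hay)); // prints [1, 4, 6, 8]
}
```

Offset 2 is skipped because it sits between \r and \n, while the lone \r at offset 4 counts as a line end, which is the "matching in more cases" behavior the commit message mentions.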

1958 Commits
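
Commit 4fcb1b2202 above replaces the `atty` crate with `std::io::IsTerminal`, stabilized in Rust 1.70. A minimal sketch of the standard-library call follows; the `use_color` helper and its parameters are assumptions made for illustration and are not taken from ripgrep's code.

```rust
use std::io::{self, IsTerminal};

/// Hypothetical policy helper: emit colors only when writing to a terminal
/// and the user has not asked for plain output.
fn use_color(is_tty: bool, plain_requested: bool) -> bool {
    is_tty && !plain_requested
}

fn main() {
    // `is_terminal` comes from the `IsTerminal` trait, which the standard
    // library implements for the stdin, stdout and stderr handles.
    let stdout_is_tty = io::stdout().is_terminal();
    let stdin_is_tty = io::stdin().is_terminal();

    println!("stdout is a terminal: {}", stdout_is_tty);
    println!("stdin is a terminal: {}", stdin_is_tty);
    println!("use color: {}", use_color(stdout_is_tty, false));
}
```

The result depends on how the program is run: piping the output to a file or another process makes `stdout().is_terminal()` return false.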