ripgrep

mirror of https://github.com/BurntSushi/ripgrep.git synced 2025-08-10 05:59:25 +02:00

Author	SHA1	Message	Date
Richard Sternagel	f3241fd657	cli: '--no-ignore-dot' should also ignore '.rgignore' Fixes #2198, Closes #2202	2023-07-08 18:52:42 -04:00
Andrew Gallant	cfe357188d	ignore/types: fix formatting	2023-07-08 18:52:42 -04:00
edam	792451e331	ignore/types: added V type V (http://vlang.io) uses '.v' files. Closes #2302	2023-07-08 18:52:42 -04:00
Andrew Gallant	7dafd58a32	readme: use 'sudo' more consistently I definitely wonder whether I should just drop 'sudo' from the install instructions and just rely on the user to "know" to do it. But some commands legitimately do not require 'sudo', so there are actual differences. Overall, this feels clearer to me but reasonable people can disagree.	2023-07-08 18:52:42 -04:00
Andrew Savchenko	b92550b67b	readme: add install command for ALT Linux Closes #2330	2023-07-08 18:52:42 -04:00
Kevin Ushey	383d3b336b	doc: add '--hidden' to example configuration This increases visibility of the fact that hidden files are skipped by default. Closes #2356	2023-07-08 18:52:42 -04:00
James McKinney	fc7e634395	ci/release: Use GITHUB_REF_NAME instead of GITHUB_REF This is a nice quality of life improvement. Closes #2358	2023-07-08 18:52:42 -04:00
James McKinney	c9584b035b	ci/release: use GitHub CLI The old actions I was using are apparently archived because they make use of deprecated features (like `set-output`). Sigh. Closes #2360	2023-07-08 18:52:42 -04:00
Alex Rawson	f34fd5c4b6	globset: introduce option to keep empty alternates Add a method GlobBuilder::empty_alternates and supporting mechanisms. Ref #1368 Closes #2369	2023-07-08 18:52:42 -04:00
Jérome Eertmans	d51c6c005a	globset: permit deserializing Glob from String Closes #2386, Closes #2388	2023-07-08 18:52:42 -04:00
Jakub Wilk	ea05881319	readme: fix awkward grammar Closes #2402	2023-07-08 18:52:42 -04:00
sitiom	1d4e3df19c	readme: add winget installation section Closes #2409	2023-07-08 18:52:42 -04:00
Mark Sisson	0f6181d309	ignore/types: add USD to the default file types Closes #2432	2023-07-08 18:52:42 -04:00
Sam James	e902e2fef4	ignore/types: add Gentoo eclass type Eclasses are "ebuild libraries" and generally if you're filtering for/filtering out an ebuild/eclass, you don't want the other either. Followup to `4dfea016b9` Closes #2437	2023-07-08 18:52:42 -04:00
angrycandy	07cbfee225	ignore/types: improve Elixir globs Closes #2450	2023-07-08 18:52:42 -04:00
Andrew Gallant	d675844510	core: don't let context flags override each other This matches the behavior of GNU grep which does not ignore before-context and after-context completely if the context flag is also provided. Note that this change wasn't done just to match GNU grep. In this case, GNU grep has the more sensible behavior. Fixes #2288, Closes #2451	2023-07-08 18:52:42 -04:00
Andrew Gallant	54e609d657	doc: add another example for the config file Closes #2453	2023-07-08 18:52:42 -04:00
Misaki	43bbcca06f	doc: note '-n' and '-N' override each other Closes #2460	2023-07-08 18:52:42 -04:00
Eric Arellano	ad9bfdd981	ignore/gitignore: expose `gitconfig_excludes_path` I have reservations about this, but it looks useful and doesn't seem terribly onerous to support. The `ignore` crate will really always need to have some kind of logic supporting this in some form I think. Closes #2482	2023-07-08 18:52:42 -04:00
Gal Ofri	36194c2742	test: test that regex inline flags work as intended This was originally fixed by using non-capturing groups when joining patterns in crates/core/args.rs, but before that landed, it ended up getting fixed via a refactor in the course of migrating to regex 1.9. Namely, it's now fixed by pushing pattern joining down into the regex layer, so that patterns can be joined in the most effective way possible. Still, #2488 contains a useful test, so we bring that in here. The test actually failed for `rg -e ')('`, since it expected the command to fail with a syntax error. But my refactor actually causes this command to succeed. And indeed, #2488 worked around this by special casing a single pattern. That work-around fixes it for the single pattern case, but doesn't fix it for the -w or -X or multi-pattern case. So for now, we're content to leave well enough alone. The only real way to fix this for real is to parse each regexp individually and verify that each is valid on its own. It's not clear that doing so is worth it. Fixes #2480, Closes #2488	2023-07-08 18:52:42 -04:00
Jakub Jirutka	0c1cbd99f3	ignore: tweak regex crate features This removes most of the Unicode features as they aren't currently used. We can always add them back later if necessary. We can avoid the unicode-perl feature by changing `\s` to `[[:space:]]`, which uses the ASCII-only definition of `\s`. Since we don't expect non-ASCII whitespace in git config files, this seems okay. Closes #2502	2023-07-08 18:52:42 -04:00
Jon Parise	96cfc0ed13	ignore/types: add 'graphql' type GraphQL file extensions: .graphql and .graphqls (schema) We could also add `.gql`, but perhaps it's less correct to do so. We'll start conservatively here, and we can always add `.gql` later. Closes #2439, Closes #2508	2023-07-08 18:52:42 -04:00
mataha	da8ecddce9	cli: make resolve_binary take COM executables into account When `resolve_binary()` attempts to resolve a path to a program on Windows while searching for a program in `PATH` without an extension, `ripgrep` will assume the extension of the file to be `.exe` as it's the de facto standard, which will work most (99.99%) of the time... ...unless the binary is a COM executable (we're on Windows, duh). Closes #2523	2023-07-08 18:52:42 -04:00
Yifei Teng	545a7dc759	ignore/types: add cml to the default types list It's used in Fuchsia to mean "component manifest language."[1] [1]: https://fuchsia.dev/reference/cml?hl=en Closes #2529	2023-07-08 18:52:42 -04:00
Jonathan Schwender	16f783832e	doc: update rust-version in Cargo.toml The MSRV got bumped a little bit ago, so this is just catchup. Closes #2539	2023-07-08 18:52:42 -04:00
Andrew Gallant	f4d07b9cbd	grep-cli-0.1.8	2023-07-05 17:09:09 -04:00
Andrew Gallant	0b6eccf4d3	ci: try to fix CI	2023-07-05 14:04:29 -04:00
Andrew Gallant	3ac4541e9f	regex: remove old inner literal extractor (It had already been removed from the crate.)	2023-07-05 14:04:29 -04:00
Andrew Gallant	7b72e982f2	deps: update everything	2023-07-05 14:04:29 -04:00
Andrew Gallant	a68db3ac02	deps: drop temporary patch and move to bstr 1.6 Now that regex 1.9 is out, we can depend on it from crates.io.	2023-07-05 14:04:29 -04:00
Andrew Gallant	b12905daca	deps: update everything	2023-07-05 14:04:29 -04:00
Andrew Gallant	ca740d9ace	regex: add new inner literal extractor This is mostly a copy of the prefix literal extractor in regex-syntax, but with a tweaked notion of Seq that keeps track of whether it's a prefix of an expression or not. If it isn't, then we can't cross it as a suffix to another Seq. This new extractor should be a lot more robust than the old one. We actually will keep going through the regex to try and find the "best" literals to search for (according to some heuristic).	2023-07-05 14:04:29 -04:00
Andrew Gallant	e80c102dee	regex: tweak formatting of regex-automata version spec This makes it easier to enable the `logging` feature for regex-automata. I wish I could just enable it unconditionally, but it winds up producing a lot of output because ripgrep uses regexes for things other than the primary search (like every glob). Sigh.	2023-07-05 14:04:29 -04:00
Andrew Gallant	8ac66a9e04	regex: refactor matcher construction This does a little bit of refactoring so that we can pass both a ConfiguredHIR and a Regex to the inner literal extraction routine. One downside of this approach is that a regex object hangs on to a ConfiguredHIR. But the extra memory usage is probably negligible. A benefit though is that converting the HIR to its concrete syntax is now lazy and only happens when logging is enabled.	2023-07-05 14:04:29 -04:00
Andrew Gallant	04dde9a4eb	regex: tweak DFA settings This increases the limits a bit for when the regex engine will build and use a fully compiled DFA. They can be faster in some circumstances. For example, '(?-u)^\w{30,}$' gets a nice speed boost from state acceleration. We are also able to remove `regex` proper as a dependency. Wow.	2023-07-05 14:04:29 -04:00
Andrew Gallant	81341702af	regex: push more pattern handling to matcher construction Previously, ripgrep core was responsible for escaping regex patterns and implementing the --line-regexp flag. This commit moves that responsibility down into the matchers such that ripgrep just needs to hand the patterns it gets off to the matcher builder. The builder will then take care of escaping and all that. This was done to make pattern construction completely owned by the matcher builders. With the arrival of regex-automata, this means we can move to the HIR very quickly and then never move back to the concrete syntax. We can then build our regex directly from the HIR. This overall can save quite a bit of time, especially when searching for large dictionaries. We still aren't quite as fast as GNU grep when searching something on the scale of /usr/share/dict/words, but we are basically within spitting distance. Prior to this, we were about an order of magnitude slower. This architecture in particular lets us write a pretty simple fast path that avoids AST parsing and HIR translation entirely: the case where one is just searching for a literal. In that case, we can hand construct the HIR directly.	2023-07-05 14:04:29 -04:00
Andrew Gallant	d34c5c88a7	globset: fix build error in tests I guess we haven't been testing with the Serde feature enabled? Weird.	2023-07-05 14:04:29 -04:00
Andrew Gallant	4b8aa91ae5	deps: update to pcre2 0.2.4 0.2.4 updates to PCRE2 10.42 and has a few other nice changes. For example, when `utf` is enabled, the crate will always set the PCRE2_MATCH_INVALID_UTF option. That means we no longer need to do transcoding or UTF-8 validity checks. Because of this, we actually get to remove one of the two uses of `unsafe` in ripgrep's `main` program. (This also updates a couple other dependencies for convenience.)	2023-07-05 14:04:29 -04:00
Andrew Gallant	a775b493fd	regex: small cleanups Just some small polishing. We also get rid of thread_local in favor of using regex-automata, mostly just in the name of reducing dependencies. (We should eventually be able to drop thread_local completely.)	2023-07-05 14:04:29 -04:00
Andrew Gallant	a6dbff502f	regex: s/locations/captures Now that we use regex-automata, we no longer use any type with "locations" in it. Instead, that's mostly legacy from the top-level regex crate.	2023-07-05 14:04:29 -04:00
Andrew Gallant	51480d57a6	regex: simplify AST analysis a bit The verbatim literal stuff hasn't been used for a while and I don't foresee it being used. If it's really needed, it would probably be better to just implement it by looking at the pattern string itself, which avoids parsing it into an AST altogether.	2023-07-05 14:04:29 -04:00
Andrew Gallant	d9bd261be8	regex: some small cleanup in 'strip.rs' We also utilize bstr's methods to get rid of some helpers we had written by hand.	2023-07-05 14:04:29 -04:00
Andrew Gallant	9d62eb997a	BREAKING: regex: finally remove CRLF hack Now that Rust's regex crate finally supports a CRLF mode, we can remove this giant hack in ripgrep to enable it. (And assuredly did not work in all cases.) The way this works in the regex engine is actually subtly different than what ripgrep previously did. Namely, --crlf would previously treat either \r\n or \n as a line terminator. But now it treats \r\n, \n and \r as line terminators. In effect, it is implemented by treating \r and \n as line terminators, but ^ and $ will never match at a position between a \r and a \n. So basically this means that $ will end up matching in more cases than it might be intended to, but I don't expect this to be a big problem in practice. Note that passing --crlf to ripgrep and enabling CRLF mode in the regex via the `R` inline flag (e.g., `(?R:$)`) are subtly different. The `R` flag just controls the regex engine, but --crlf instructs all of ripgrep to use \r\n as a line terminator. There are likely some inconsistencies or corner cases that are wrong as a result of this cognitive dissonance, but we choose to leave well enough alone for now. Fixing this for real will probably require re-thinking how line terminators are handled in ripgrep. For example, one "problem" with how they're handled now is that ripgrep will re-insert its own line terminators when printing output instead of copying the input. This is maybe not so great and perhaps unexpected. (ripgrep probably can't get away with not inserting any line terminators. Users probably expect files that don't end with a line terminator whose last line matches to have a line terminator inserted.)	2023-07-05 14:04:29 -04:00
Andrew Gallant	e028ea3792	regex: migrate grep-regex to regex-automata We just do a "basic" dumb migration. We don't try to improve anything here.	2023-07-05 14:04:29 -04:00
Andrew Gallant	1035f6b1ff	deps: initial migration steps to regex 1.9 This leaves the grep-regex crate in tatters. Pretty much the entire thing needs to be re-worked. The upshot is that it should result in some big simplifications. I hope. The idea here is to drop down and actually use regex-automata 0.3 instead of the regex crate itself.	2023-07-05 14:04:29 -04:00
Andrew Gallant	a7f1276021	readme: update Debian instructions We probably don't need to mention Buster specifically nor Debian unstable since ripgrep has been in Debian for a while now. But we can't just get rid of the `deb` file either, because Debian might package a very old version. Fixes #2531	2023-06-12 07:50:13 -04:00
Martin Nordholts	4fcb1b2202	cli: replace atty with std::io::IsTerminal The `atty` crate is unmaintained[1] and `std::io::IsTerminal` was stabilized in Rust 1.70. [1]: https://rustsec.org/advisories/RUSTSEC-2021-0145.html PR #2526	2023-06-05 14:00:46 -04:00
Francois Marier	949092fd22	ignore/types: add 'mdwn' to Markdown PR #2520	2023-05-26 14:44:41 -04:00
Andrew Gallant	4a7e7094ad	deps: update everything else	2023-05-25 13:06:13 -04:00
Andrew Gallant	fc0d9b90a9	deps: bump regex to 1.8.3 This brings in an update from the regex crate that fixes a matching bug for particular kinds of alternations of literals. Fixes #2518	2023-05-25 13:06:13 -04:00
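
The CRLF rule described in commit 9d62eb997a (both \r and \n act as line terminators, but ^ and $ never match between a \r and a \n) can be sketched with two small predicates. This is only an illustration of the rule as the commit message states it, not the regex engine's actual code; the function names `is_line_start_crlf` and `is_line_end_crlf` are hypothetical and use nothing beyond the standard library.

```rust
/// Hypothetical helper: may `$` match at byte offset `at` under the CRLF rule?
/// Panics if `at > haystack.len()`.
fn is_line_end_crlf(haystack: &[u8], at: usize) -> bool {
    at == haystack.len()
        // A `\r` always ends a line, whether or not a `\n` follows.
        || haystack[at] == b'\r'
        // A `\n` ends a line unless we would be sitting between `\r` and `\n`.
        || (haystack[at] == b'\n' && (at == 0 || haystack[at - 1] != b'\r'))
}

/// Hypothetical helper: may `^` match at byte offset `at` under the CRLF rule?
/// Panics if `at > haystack.len()`.
fn is_line_start_crlf(haystack: &[u8], at: usize) -> bool {
    at == 0
        // Anything after a `\n` starts a new line.
        || haystack[at - 1] == b'\n'
        // After a lone `\r` too, but not between `\r` and `\n`.
        || (haystack[at - 1] == b'\r'
            && (at == haystack.len() || haystack[at] != b'\n'))
}

fn main() {
    // Mixed terminators: "\r\n", a lone "\r", and a lone "\n".
    let haystack: &[u8] = b"a\r\nb\rc\nd";
    let ends: Vec<usize> = (0..=haystack.len())
        .filter(|&i| is_line_end_crlf(haystack, i))
        .collect();
    let starts: Vec<usize> = (0..=haystack.len())
        .filter(|&i| is_line_start_crlf(haystack, i))
        .collect();
    println!("line ends:   {:?}", ends);   // [1, 4, 6, 8]
    println!("line starts: {:?}", starts); // [0, 3, 5, 7]
}
```

Offset 2 (between the \r and the \n) appears in neither list, while the lone \r at offset 4 does end a line. That second case is the "matching in more cases" behavior the commit message mentions.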

1786 Commits
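
For readers unfamiliar with the API named in commit 4fcb1b2202, here is a minimal sketch of a terminal check using `std::io::IsTerminal` (stable since Rust 1.70, as that commit notes). The `ColorChoice` enum and `should_color` function are hypothetical names chosen for illustration and are not taken from ripgrep's source.

```rust
use std::io::{self, IsTerminal};

/// Hypothetical three-way color setting, for illustration only.
#[derive(Clone, Copy, Debug, PartialEq)]
enum ColorChoice {
    Auto,
    Always,
    Never,
}

/// Decide whether to emit colors, given the setting and whether the
/// output stream is attached to a terminal.
fn should_color(choice: ColorChoice, is_tty: bool) -> bool {
    match choice {
        ColorChoice::Always => true,
        ColorChoice::Never => false,
        // In "auto" mode, only colorize when writing to a terminal.
        ColorChoice::Auto => is_tty,
    }
}

fn main() {
    // `is_terminal` comes from the standard library's `IsTerminal` trait,
    // so no external crate such as `atty` is needed.
    let is_tty = io::stdout().is_terminal();
    println!("stdout is a terminal: {}", is_tty);
    println!("colorize in auto mode: {}", should_color(ColorChoice::Auto, is_tty));
}
```

Keeping the terminal check separate from the decision function means the decision logic can be tested without a real terminal.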