ripgrep

mirror of https://github.com/BurntSushi/ripgrep.git synced 2025-08-10 05:59:25 +02:00

Author	SHA1	Message	Date
Andrew Gallant	09905560ff	printer: clean-up Like a previous commit did for the grep-cli crate, this does some polishing to the grep-printer crate. We aren't able to achieve as much as we did with grep-cli, but we at least eliminate all rust-analyzer lints and group imports in the way I've been doing recently. Next we'll start doing some more invasive changes.	2023-09-25 14:39:54 -04:00
Andrew Gallant	25a7145c79	cli: add new 'hostname' function This will enable us to query for the current system's hostname in both Unix and Windows environments. We could have pulled in the 'gethostname' crate for this, but: 1. I'm not a huge fan of micro-crates. 2. The 'gethostname' crate panics if an error occurs. (Which, to be fair, an error should never occur, but it seems plausible on borked systems? ripgrep runs in a lot of places, so I'd rather not take the chance of a panic bringing down ripgrep for an optional convenience feature.) 3. The 'gethostname' crate uses the 'windows-targets' crate from Microsoft. This is arguably the "right" thing to do, but ripgrep doesn't use them yet and they appear high-churn. So I just added a safe wrapper to do this to winapi-util[1] and then inlined the Unix version here. This brings in no extra dependencies and the routine is fallible so that callers can recover from potentially strange failures. [1]: https://github.com/BurntSushi/winapi-util/pull/14	2023-09-25 14:39:54 -04:00
Andrew Gallant	19a08bee8a	cli: clean-up crate This does a variety of polishing. 1. Deprecate the tty methods in favor of std's IsTerminal trait. 2. Trim down un-needed dependencies. 3. Use bstr to implement escaping. 4. Various aesthetic polishing. I'm doing this as prep work before adding more to this crate. And as part of a general effort toward reducing ripgrep's dependencies.	2023-09-25 14:39:54 -04:00
Lucas Trzesniewski	1a50324013	printer: add hyperlinks This commit represents the initial work to get hyperlinks working and was submitted as part of PR #2483. Subsequent commits largely retain the functionality and structure of the hyperlink support added here, but rejigger some things around.	2023-09-25 14:39:54 -04:00
Andrew Gallant	86ef683308	deps: update everything Notably, this includes termcolor 1.3, which comes with hyperlink support.	2023-09-20 11:52:42 -04:00
Tavian Barnes	d938e955af	ignore: use work-stealing stack instead of Arc<Mutex<Vec<_>>> This represents yet another iteration on how `ignore` enqueues and distributes work in parallel. The original implementation used a multi-producer/multi-consumer thread safe queue from crossbeam. At some point, I migrated to a simple `Arc<Mutex<Vec<_>>>` and treated it as a stack so that we did depth first traversal. This helped with memory usage in very wide directories. But it turns out that a naive stack-behind-a-mutex can be quite a bit slower than something that's a little smarter, such as a work-stealing stack used in this commit. My hypothesis for why this helps is that without the stealing component, work distribution can get stuck in sub-optimal configurations that depend on which directory entries get assigned to a particular worker. It's likely that this can result in some workers getting "more" work than others, just by chance, and thus remain idle. But the work-stealing approach heads that off. This does re-introduce a dependency on parts of crossbeam which is kind of a bummer, but it's carrying its weight for now. Closes #1823, Closes #2591 Ref https://github.com/sharkdp/fd/issues/28	2023-09-20 11:52:42 -04:00
Thilo Uttendorfer	cad1f5fae2	ignore: fix filtering when searching subdirectories When searching subdirectories the path was not correctly built and included duplicate parts. This fix will remove the duplicate part if possible. Fixes #1757, Closes #2295	2023-09-20 11:52:42 -04:00
dana	2198bd92fa	github: convert bug-report issue template to issue form Trying this to see how well it works. PR #2560	2023-09-18 11:07:46 -04:00
Andrew Gallant	a4387ed491	deps: bump to aho-corasick 1.1.0 This brings in aarch64 SIMD support for Teddy[1]. In effect, it means searches that are multiple (but a small number of) literals extracted will likely get much faster on aarch64 (i.e., Apple silicon). For example, from the PR, on my M2 mac mini: $ time rg-before-teddy-aarch64 -i -c 'Sherlock Holmes' OpenSubtitles2018.half.en 3055 real 8.196 user 7.726 sys 0.469 maxmem 5728 MB faults 17 $ time rg-after-teddy-aarch64 -i -c 'Sherlock Holmes' OpenSubtitles2018.half.en 3055 real 1.127 user 0.701 sys 0.425 maxmem 4880 MB faults 13 w00t. [1]: https://github.com/BurntSushi/aho-corasick/pull/129	2023-09-18 09:35:06 -04:00
Andrew Gallant	d2a409f89f	deps: bump to memchr 2.6.3 This brings in a fix for line counting when SIMD isn't available[1]. [1]: https://github.com/BurntSushi/memchr/pull/137	2023-09-02 14:40:45 -04:00
Andrew Gallant	6cdb99ea61	deps: drop bytecount in favor of memchr_iter(..).count() As of the memchr 2.6 release, its Iterator::count method is specialized to only count the number of occurrences instead of finding the offset of each occurrence. This replaces ripgrep's use of the bytecount crate. While micro-benchmarks suggest that memchr's method has better throughput than bytecount, it turned out to be an illusion. Namely, on a ~13GB haystack prior to this change: $ time rg-bytecount 'You killed my friend, my best friend, my lifelong friend!' OpenSubtitles2018.raw.en --line-number 441450441:- You killed my friend, my best friend, my lifelong friend! real 1.473 user 1.186 sys 0.286 maxmem 12512 MB faults 0 And then after: $ time rg 'You killed my friend, my best friend, my lifelong friend!' OpenSubtitles2018.raw.en --line-number 441450441:- You killed my friend, my best friend, my lifelong friend! real 1.532 user 1.280 sys 0.250 maxmem 12512 MB faults 0 But perf is just about in the same ballpark. That's good enough for me at the moment in order to drop the extra dependency. I did this because the marginal cost of adding the Iterator::count() specialization to memchr was extremely small.	2023-09-02 12:25:34 -04:00
Andrew Gallant	551ad3bada	deps: update bstr	2023-09-02 12:15:15 -04:00
Andrew Gallant	8856f72df5	deps: update the regex family of crates	2023-09-02 12:14:50 -04:00
Yochem van Rosmalen	d596f6ebd0	ignore/types: add *.vsh to V type PR #2604	2023-08-31 08:51:07 -04:00
Christian Vallentin	6cd9479634	ignore: implement FusedIterator for Walk PR #2567	2023-08-28 22:55:19 -04:00
Andrew Gallant	3bfa125b2e	ci: replace mips with powerpc64, aarch64 and s390x We drop our MIPS target because it no longer works.[1] We were previously using it as a means of testing ripgrep in a big endian environment. So to achieve that without MIPS, we test on powerpc64 and s390x. (No particular reason to do both, but why not.) We also add aarch64 as a proxy for at least ensuring everything works for the same architecture as Apple silicon. It's not a guarantee that everything works, but it seems better than nothing until we can actually test Apple silicon in CI. [1]: `c788378d6f`	2023-08-28 22:45:46 -04:00
Andrew Gallant	51765f2f4c	ignore: apply rustfmt I believe this happened because rustfmt now knows how to format `let ... else` constructs.	2023-08-28 20:09:26 -04:00
Andrew Gallant	67abd49678	deps: bump everything else	2023-08-28 20:00:41 -04:00
Andrew Gallant	a7fe296772	deps: bump regex, regex-automata and regex-syntax	2023-08-28 19:59:09 -04:00
Andrew Gallant	f75991538b	deps: bump memchr to 2.6.0 This in particular brings in a PR[1] that provides huge speedups on aarch64 (e.g., Apple silicon). [1]: https://github.com/BurntSushi/memchr/pull/129	2023-08-28 19:56:59 -04:00
mataha	962d47e6a1	ignore/types: add Prolog file types This improves the Prolog file type rules. * `.pl` is the most common extension in the wild, though `.pro` is preferred in places where file extension may clash with Perl[1]. * `.P` is used for compatibility with XSB Prolog dialect[2]. PR #2590 [1]: https://www.swi-prolog.org/pldoc/man?section=fileext [2]: https://www.swi-prolog.org/pldoc/man?section=xsb-source	2023-08-21 10:53:56 -04:00
mataha	19b6a45abb	ignore/types: tweak Gradle file types This PR extends Gradle file types with the following: - Kotlin DSL buildscripts (`.gradle.kts`) - Gradle Java properties (`gradle.properties`) - wrapper files (`gradle-wrapper.`) - wrapper scripts (`gradlew`, `gradlew.bat`) PR #2587	2023-08-20 18:49:02 -04:00
Andrew Gallant	c51790b56d	deps: update everything	2023-08-15 11:09:46 -04:00
Andrew Gallant	2af3734e0c	deps: update aho-corasick This brings in [1,2], which improves memory usage substantially when Aho-Corasick is used. [1]: https://github.com/BurntSushi/aho-corasick/pull/120 [2]: https://github.com/BurntSushi/aho-corasick/pull/121	2023-08-15 11:08:41 -04:00
Andrew Gallant	61733f6378	globset-0.4.13	2023-08-05 09:34:36 -04:00
Andrew Gallant	7227e94ce5	globset: use non-capture groups in regex transform We currently implement globs by converting them to regexes, and in doing so, sometimes use grouping. In all but one case, we used non-capturing groups. But for alternations, we used capturing groups, which was likely just an oversight. We don't make use of capture groups at all, and while they usually don't have any overhead, they lead to weird cases like this one: https://github.com/rust-lang/regex/issues/1059 That particular issue is also a bug in the regex crate itself, which is fixed in https://github.com/rust-lang/regex/pull/1062. Note though that the bug fix in the regex crate is required. Even with this patch to globset, memory usage is reduced (by about half in rust-lang/regex#1059) but is not returned to where it was prior to the regex 1.9 release.	2023-08-05 09:33:57 -04:00
Andrew Gallant	341a19e0d0	regex: fix fast path for -w/--word-regexp flag (#2576) It turns out our fast path for -w/--word-regexp wasn't quite correct in some cases. Namely, we use `(?m:^|\W)(<original-regex>)(?m:\W|$)` as the implementation of -w/--word-regexp since `\b(<original-regex>)\b` has some unintuitive results in certain cases, specifically when <original-regex> matches non-word characters at match boundaries. The problem is that using this formulation means that you need to extract the capture group around <original-regex> to find the "real" match, since the surrounding (^|\W) and (\W|$) aren't part of the match. This is fine, but the capture group engine is usually slow, so we have a fast path where we try to deduce the correct match boundary after an initial match (before running capture groups). The problem is that doing this is rather tricky because it's hard to know, in general, whether the `^` or the `\W` matched. This still doesn't seem quite right overall, but we at least fix one more case. Fixes #2574	2023-07-31 08:51:09 -04:00
Vidar	fed4fea217	ignore/types: add csproj Supports the .NET C# Project file extension. PR #2575	2023-07-31 07:08:44 -04:00
Andrew Gallant	053a1669bb	globset-0.4.12	2023-07-26 19:51:38 -04:00
David Tolnay	31d3f16254	api: impl Deserialize for GlobSet PR #2569	2023-07-26 19:51:22 -04:00
Andrew Gallant	304a60e8e9	grep-cli-0.1.9	2023-07-18 13:25:23 -04:00
Andrew Gallant	1d35859861	globset-0.4.11	2023-07-12 12:58:43 -04:00
mataha	601e122e9f	ignore/types: add Windows Command Prompt files This PR adds `.bat` and `.cmd` file types. In doing so, it makes a distinction between batch files (old standard from the MS-DOS era) and command scripts (new flavor - can operate on batch files, although `*.cmd` is preferred for various reasons, the main one being batch files will set `ERRORLEVEL` following inconsistent MS-DOS style rules[1]). PR #2556 [1]: https://groups.google.com/g/microsoft.public.win2000.cmdprompt.admin/c/XHeUq8oe2wk/m/LIEViGNmkK0J#i106	2023-07-10 15:58:17 -04:00
Andrew Gallant	efb2e8ce1e	ci/release: use latest OS versions	2023-07-09 10:14:03 -04:00
xEgoist	8d464e5c78	ci/release: add sha256 sums to release artifacts Fixes #1924, Closes #2168	2023-07-09 10:14:03 -04:00
Andrew Gallant	d67809d6c4	github: remove dependabot configuration This does not seem to have worked at all. For example, there were Actions being used that were clearly deprecated/archived[1]. But Dependabot didn't make a peep. So just get rid of it to avoid the false sense that someone is checking our dependencies for us. [1]: https://github.com/BurntSushi/ripgrep/pull/2360	2023-07-09 10:14:03 -04:00
nguyenvukhang	6abb962f0d	cli: fix non-path sorting behavior Previously, sorting worked by sorting the parents and then sorting the children within each parent. This was done during traversal, but it only works when sorting parents preserves the overall order. This generally only works for '--sort path' in ascending order. This commit fixes the rest of the sorting behavior by collecting all of the paths to search and then sorting them before searching. We only collect all of the paths when sorting was requested. Fixes #2243, Closes #2361	2023-07-09 10:14:03 -04:00
Edoardo Pirovano	6d95c130d5	cli: add --stop-on-nonmatch flag This causes ripgrep to stop searching an individual file after it has found a non-matching line. But this only occurs after it has found a matching line. Fixes #1790, Closes #1930	2023-07-08 18:52:42 -04:00
Garrett Thornburg	4782ebd5e0	core: lock stdout before printing an error message to stderr Adds a new eprintln_locked macro which locks STDOUT before logging to STDERR. This patch also replaces instances of eprintln with eprintln_locked to avoid interleaving lines. Fixes #1941, Closes #1968	2023-07-08 18:52:42 -04:00
piegames	4993d29a16	globset: add 'escape' routine Fixes #2060, Closes #2061	2023-07-08 18:52:42 -04:00
Seth Stadick	23adbd6795	cli: force binary existence check Previously, we were only doing a binary existence check on Windows. And in fact, the main point there wasn't binary existence, but ensuring we didn't accidentally resolve a binary name relative to the CWD, which could result in executing a program one didn't mean to run. However, it is useful to be able to check whether a binary exists on any platform when associating a glob with a binary. If the binary doesn't exist, then the association can fail eagerly and let some other glob apply. Closes #1946	2023-07-08 18:52:42 -04:00
Kevin Svetlitski	9df8ab42b1	cargo: reduce the size of the .crate file published to crates.io None of this stuff is needed for the main ripgrep crate. Closes #1940	2023-07-08 18:52:42 -04:00
Michal Terepeta	cb7501ff11	doc: clarify the comment on `Worker.work_done` We call `work_done` only once the work has been actually performed (otherwise `num_pending` could go to 0 before the actual work is done). Closes #2039	2023-07-08 18:52:42 -04:00
Kyle Todeschini	3b66f37a31	doc: improve -r/--replace flag syntax docs Fixes #2108, Closes #2123	2023-07-08 18:52:42 -04:00
Andrew Gallant	3eccb7c363	readme: add 'yum-utils' to RHEL/Centos instructions Closes #2103	2023-07-08 18:52:42 -04:00
kotborealis	f30a30867e	ignore/types: name aliases for file types We also make py/python, md/markdown and ts/typescript aliases of one another. Note that this only introduces aliases at the point where default types are defined. This just makes them a bit easier to read/write, and also makes it easier to expose more names that describe the same thing. Fixes #1857, Closes #1895	2023-07-08 18:52:42 -04:00
Klas Mellbourn	7313dca472	ignore/types: add 'typescript' alias for 'ts' Closes #2009	2023-07-08 18:52:42 -04:00
Tama McGlinn	99bf2b01dc	ignore/types: add Ada filetypes, including gprbuild and alire .adb and .ads are the usual extensions for Ada source code, and *.gpr indicates a GPRbuild project file used for Ada, and these days often being combined with alire for package dependency resolution. Alire stores a bunch of files named alire.toml in different directories in your (gitignored) cache/dependencies/... Closes #2013	2023-07-08 18:52:42 -04:00
Juan Francisco Cantero Hurtado	ee1360cc07	ignore/types: add raku extensions to ignore types Closes #2117	2023-07-08 18:52:42 -04:00
Andrew Gallant	db6bb21a62	windows: attempt to enable long path support for MSVC targets See the README and comments in the build.rs. Basically, this embeds an XML file that I guess is a way of setting configuration knobs on Windows. One of those knobs is enabling long path support. You still need to enable it in your registry (lol), but this will handle the other half of it. Fixes #364, Closes #2049	2023-07-08 18:52:42 -04:00
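
The `--stop-on-nonmatch` behavior described in commit 6d95c130d5 (stop searching a file at the first non-matching line, but only once a matching line has been seen) can be sketched roughly as below. This is an illustration using only the standard library, not ripgrep's actual implementation: the function name `search_stop_on_nonmatch` and the `is_match` closure (a stand-in for a real regex matcher) are hypothetical.

```rust
/// Returns the 1-based line numbers of matching lines, stopping at the first
/// non-matching line that follows at least one match.
///
/// `is_match` is a hypothetical stand-in for a real regex matcher.
fn search_stop_on_nonmatch<F>(haystack: &str, is_match: F) -> Vec<usize>
where
    F: Fn(&str) -> bool,
{
    let mut matched = Vec::new();
    for (i, line) in haystack.lines().enumerate() {
        if is_match(line) {
            matched.push(i + 1);
        } else if !matched.is_empty() {
            // A match was already seen, so this non-match ends the search.
            break;
        }
        // Non-matching lines before the first match are skipped as usual.
    }
    matched
}
```

For input whose lines are `boot`, `ERROR a`, `ERROR b`, `info`, `ERROR c` and a matcher that looks for `ERROR`, this sketch reports lines 2 and 3 and never examines the fifth line. That early exit is what makes such a flag useful on inputs where all matching lines are contiguous, such as a sorted file.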


1841 Commits