mirror of
https://github.com/BurntSushi/ripgrep.git
synced 2025-07-11 14:30:24 +02:00
readme: update benchmarks
This also updates the corpora used, so previous times (and counts) are not comparable. We also remove some tools, likt pt, sift and ucg, since they appear to be no longer maintained. ag isn't really maintained either, but it still has significant mind share, so we retain a benchmark for it. We also upgrade ack to version 3, and remove the clarification on how `-w` is implemented. We also add `git grep -P` (uses PCRE2) which appears to be much faster than `git grep -E`. Finally, we add ugrep which is a new up and comer in this space. Fixes #1474
This commit is contained in:
55
README.md
55
README.md
@ -38,10 +38,11 @@ Please see the [CHANGELOG](CHANGELOG.md) for a release history.
|
|||||||
|
|
||||||
### Quick examples comparing tools
|
### Quick examples comparing tools
|
||||||
|
|
||||||
This example searches the entire Linux kernel source tree (after running
|
This example searches the entire
|
||||||
`make defconfig && make -j8`) for `[A-Z]+_SUSPEND`, where all matches must be
|
[Linux kernel source tree](https://github.com/BurntSushi/linux)
|
||||||
words. Timings were collected on a system with an Intel i7-6900K 3.2 GHz, and
|
(after running `make defconfig && make -j8`) for `[A-Z]+_SUSPEND`, where
|
||||||
ripgrep was compiled with SIMD enabled.
|
all matches must be words. Timings were collected on a system with an Intel
|
||||||
|
i7-6900K 3.2 GHz.
|
||||||
|
|
||||||
Please remember that a single benchmark is never enough! See my
|
Please remember that a single benchmark is never enough! See my
|
||||||
[blog post on ripgrep](https://blog.burntsushi.net/ripgrep/)
|
[blog post on ripgrep](https://blog.burntsushi.net/ripgrep/)
|
||||||
@ -49,40 +50,38 @@ for a very detailed comparison with more benchmarks and analysis.
|
|||||||
|
|
||||||
| Tool | Command | Line count | Time |
|
| Tool | Command | Line count | Time |
|
||||||
| ---- | ------- | ---------- | ---- |
|
| ---- | ------- | ---------- | ---- |
|
||||||
| ripgrep (Unicode) | `rg -n -w '[A-Z]+_SUSPEND'` | 450 | **0.106s** |
|
| ripgrep (Unicode) | `rg -n -w '[A-Z]+_SUSPEND'` | 452 | **0.136s** |
|
||||||
| [git grep](https://www.kernel.org/pub/software/scm/git/docs/git-grep.html) | `LC_ALL=C git grep -E -n -w '[A-Z]+_SUSPEND'` | 450 | 0.553s |
|
| [git grep](https://www.kernel.org/pub/software/scm/git/docs/git-grep.html) | `git grep -P -n -w '[A-Z]+_SUSPEND'` | 452 | 0.348s |
|
||||||
| [The Silver Searcher](https://github.com/ggreer/the_silver_searcher) | `ag -w '[A-Z]+_SUSPEND'` | 450 | 0.589s |
|
| [ugrep](https://github.com/Genivia/ugrep) | `ugrep -r --ignore-files --no-hidden -I -w '[A-Z]+_SUSPEND'` | 452 | 0.506s |
|
||||||
| [git grep (Unicode)](https://www.kernel.org/pub/software/scm/git/docs/git-grep.html) | `LC_ALL=en_US.UTF-8 git grep -E -n -w '[A-Z]+_SUSPEND'` | 450 | 2.266s |
|
| [git grep](https://www.kernel.org/pub/software/scm/git/docs/git-grep.html) | `LC_ALL=C git grep -E -n -w '[A-Z]+_SUSPEND'` | 452 | 1.150s |
|
||||||
| [sift](https://github.com/svent/sift) | `sift --git -n -w '[A-Z]+_SUSPEND'` | 450 | 3.505s |
|
| [The Silver Searcher](https://github.com/ggreer/the_silver_searcher) | `ag -w '[A-Z]+_SUSPEND'` | 452 | 0.654s |
|
||||||
| [ack](https://github.com/beyondgrep/ack2) | `ack -w '[A-Z]+_SUSPEND'` | 1878 | 6.823s |
|
| [ack](https://github.com/beyondgrep/ack3) | `ack -w '[A-Z]+_SUSPEND'` | 452 | 4.054s |
|
||||||
| [The Platinum Searcher](https://github.com/monochromegane/the_platinum_searcher) | `pt -w -e '[A-Z]+_SUSPEND'` | 450 | 14.208s |
|
| [git grep (Unicode)](https://www.kernel.org/pub/software/scm/git/docs/git-grep.html) | `LC_ALL=en_US.UTF-8 git grep -E -n -w '[A-Z]+_SUSPEND'` | 452 | 4.205s |
|
||||||
|
|
||||||
(Yes, `ack` [has](https://github.com/beyondgrep/ack2/issues/445) a
|
Here's another benchmark on the same corpus as above that disregards gitignore
|
||||||
[bug](https://github.com/beyondgrep/ack2/issues/14).)
|
files and searches with a whitelist instead. The corpus is the same as in the
|
||||||
|
previous benchmark, and the flags passed to each command ensure that they are
|
||||||
Here's another benchmark that disregards gitignore files and searches with a
|
doing equivalent work:
|
||||||
whitelist instead. The corpus is the same as in the previous benchmark, and the
|
|
||||||
flags passed to each command ensure that they are doing equivalent work:
|
|
||||||
|
|
||||||
| Tool | Command | Line count | Time |
|
| Tool | Command | Line count | Time |
|
||||||
| ---- | ------- | ---------- | ---- |
|
| ---- | ------- | ---------- | ---- |
|
||||||
| ripgrep | `rg -L -u -tc -n -w '[A-Z]+_SUSPEND'` | 404 | **0.079s** |
|
| ripgrep | `rg -uuu -tc -n -w '[A-Z]+_SUSPEND'` | 388 | **0.096s** |
|
||||||
| [ucg](https://github.com/gvansickle/ucg) | `ucg --type=cc -w '[A-Z]+_SUSPEND'` | 390 | 0.163s |
|
| [ugrep](https://github.com/Genivia/ugrep) | `ugrep -r -n --include='*.c' --include='*.h' -w '[A-Z]+_SUSPEND'` | 388 | 0.493s |
|
||||||
| [GNU grep](https://www.gnu.org/software/grep/) | `egrep -R -n --include='*.c' --include='*.h' -w '[A-Z]+_SUSPEND'` | 404 | 0.611s |
|
| [GNU grep](https://www.gnu.org/software/grep/) | `egrep -r -n --include='*.c' --include='*.h' -w '[A-Z]+_SUSPEND'` | 388 | 0.806s |
|
||||||
|
|
||||||
(`ucg` [has slightly different behavior in the presence of symbolic links](https://github.com/gvansickle/ucg/issues/106).)
|
And finally, a straight-up comparison between ripgrep, ugrep and GNU grep on a
|
||||||
|
single large file
|
||||||
And finally, a straight-up comparison between ripgrep and GNU grep on a single
|
(~9.3GB, [`OpenSubtitles.raw.en.gz`](http://opus.nlpl.eu/download.php?f=OpenSubtitles/v2018/mono/OpenSubtitles.raw.en.gz)):
|
||||||
large file (~9.3GB,
|
|
||||||
[`OpenSubtitles2016.raw.en.gz`](http://opus.lingfil.uu.se/OpenSubtitles2016/mono/OpenSubtitles2016.raw.en.gz)):
|
|
||||||
|
|
||||||
| Tool | Command | Line count | Time |
|
| Tool | Command | Line count | Time |
|
||||||
| ---- | ------- | ---------- | ---- |
|
| ---- | ------- | ---------- | ---- |
|
||||||
| ripgrep | `rg -w 'Sherlock [A-Z]\w+'` | 5268 | **2.108s** |
|
| ripgrep | `rg -w 'Sherlock [A-Z]\w+'` | 7882 | **2.769s** |
|
||||||
| [GNU grep](https://www.gnu.org/software/grep/) | `LC_ALL=C egrep -w 'Sherlock [A-Z]\w+'` | 5268 | 7.014s |
|
| [ugrep](https://github.com/Genivia/ugrep) | `ugrep -w 'Sherlock [A-Z]\w+'` | 7882 | 6.802s |
|
||||||
|
| [GNU grep](https://www.gnu.org/software/grep/) | `LC_ALL=en_US.UTF-8 egrep -w 'Sherlock [A-Z]\w+'` | 7882 | 9.027s |
|
||||||
|
|
||||||
In the above benchmark, passing the `-n` flag (for showing line numbers)
|
In the above benchmark, passing the `-n` flag (for showing line numbers)
|
||||||
increases the times to `2.640s` for ripgrep and `10.277s` for GNU grep.
|
increases the times to `3.423s` for ripgrep and `13.031s` for GNU grep. ugrep
|
||||||
|
times are unaffected by the presence or absence of `-n`.
|
||||||
|
|
||||||
|
|
||||||
### Why should I use ripgrep?
|
### Why should I use ripgrep?
|
||||||
|
Reference in New Issue
Block a user