1
0
mirror of https://github.com/BurntSushi/ripgrep.git synced 2025-01-19 05:49:14 +02:00

30 Commits

Author SHA1 Message Date
Andrew Gallant
6b61271bbb
benchsuite/runs: add another run of the benchmarks
Looks like ripgrep is still the king. ;-)
2022-12-16 11:24:10 -05:00
Andrew Gallant
1be86392e0
benchsuite: pass '-a' to ugrep in some cases
It looks like it incorrectly treats a file that is purely valid UTF-8 as
a binary file, which in turn effectively renders all of the Russian
subtitle benchmarks moot for ugrep. So we pass '-a' to force ugrep to
treat the file as text.

This technically gives ugrep an edge because it now no longer needs to
look to see if the haystack is binary or not. In practice this is
usually implemented using highly optimized SIMD routines (e.g.,
'memchr'), so it tends not to matter much. We might also consider
passing '-a' to all grep commands. But... I think using '-a' is the less
common case and we should try to benchmark the common case.
2022-12-16 11:21:58 -05:00
Andrew Gallant
63058453fa
benchsuite: update URLs
This removes the old commented out URLs for the 2016 subtitles that
don't work any more. I should probably upload the files to a more stable
URL.

This also switches to a 'https://' GitHub URL as I believe the 'git://'
URLs are no longer supported.
2022-12-16 11:20:45 -05:00
Andrew Gallant
20534fad04
benchsuite/runs: add updated benchmark, with ugrep 2020-10-14 17:01:45 -04:00
Andrew Gallant
de0c24f31c
benchsuite: add ugrep commands to benchmarks 2020-10-14 17:00:35 -04:00
Andrew Gallant
c55e7af675
benchsuite: remove -a flag from grep
It's not quite clear why I added this originally. ripgrep doesn't have
its `-a` flag enabled. It's possible I tricked myself into adding it
because ripgrep's binary detection has evolved to be more like GNU
grep's nowadays.

In any case, using `-a` on data that is non-binary can only improve
performance because it removes the overhead for checking whether the
data is binary or not. So this was giving an artificial boost to GNU
grep.
2020-10-14 15:16:25 -04:00
Andrew Gallant
5ebb3ad039
benchsuite: remove sift, pt and ucg
None of these tools got particularly popular (except for pt briefly),
but they do not appear to be active projects nowadays. While ucg was
fast, sift and pt were ecscruiating slow in a number of cases that
required special care in the benchmarks.

This also fixes the ordering of benchmark output to reflect the ordering
in the source of the benchsuite script.
2020-10-14 15:16:07 -04:00
Andrew Gallant
b0066274cb
benchsuite: update subtitle URLs
Since the English subtitle file actually changed its content, we tweak
the benchmark to use a slightly bigger sample that more closely matches
the file size of the Russian subtitle file.

Also, the BurntSushi/linux repo has been updated and I've confirmed that
it builds on my Linux machine.

Fixes #1257
2020-10-14 14:17:23 -04:00
Josh Soref
def993bad1
spelling: fix various misspellings
These were found by the check spelling action[1] and reported
here[2].

PR #1685 

[1] - https://github.com/marketplace/actions/check-spelling
[2] - 6f02d05671 (commitcomment-42625778)
2020-09-22 10:29:16 -04:00
Andrew Gallant
01b7859399
benchsuite: fix formatting 2018-01-08 19:23:43 -05:00
Andrew Gallant
d1fa295bb2
benchsuite: add updated benchmarks 2018-01-08 19:10:49 -05:00
Andrew Gallant
f86f987d71
benchsuite: fix another bug 2017-07-17 09:28:49 -04:00
Andrew Gallant
bfbd53eb92
benchsuite: fix bugs
This fixes a few bugs in the benchsuite script that have apparently
cropped up over time due to insufficient testing.

Fixes #558
2017-07-17 08:21:42 -04:00
Andrew Gallant
163e00677a Update to regex 0.2. 2017-01-01 01:03:21 -05:00
Andrew Gallant
de91c26bb1 Add some additional benchmark runs.
2016-12-24-archlinux-cheetah uses the same setup/compile config as
2016-09-22-archlinux-cheetah. The improvements are nice.

The other 2016-12-24-archlinux-cheetah-* benchmarks try to suss out
a difference between MUSL, glibc, jemalloc and the system allocator.
2016-12-24 09:15:09 -05:00
Andrew Gallant
d4527854de Add --disabled flag to benchsuite.
This allows one to selectively choose which commands aren't
benchmarked.
2016-12-24 09:08:06 -05:00
Andrew Gallant
d547b92d76 Add benchmarks from local machine. 2016-09-22 19:55:30 -04:00
Andrew Gallant
e5a9cd1b64 Remove old benchmark runs. 2016-09-22 19:29:10 -04:00
Andrew Gallant
c1c484d1a7 Add a rg (no mmap) benchmark.
This is added to the subtitle benchmark. The purpose is to demonstrate
how memory mapping a single file for search is faster.
2016-09-21 21:42:34 -04:00
Andrew Gallant
7698b60256 Add new benchmarks.
These benchmarks are exactly like the ones ran on 2016-09-17 with three
changes:

1. `pt` was added back to a few more benchmarks so that it appears any
   time `sift` appears.
2. Warmup iterations was bumped from 1 to 3.
3. Actual benchmark iterations were bumped from 3 to 10.

These benchmarks took around two hours to run.
2016-09-20 16:35:09 -04:00
Andrew Gallant
0a63158a61 Fix error handling bug. 2016-09-17 15:17:48 -04:00
Andrew Gallant
403ba5fdc8 Add Ubuntu 16.04 benchmark runs 2016-09-17 12:41:10 -04:00
Andrew Gallant
bc9d12c4c8 Improve ergonomics of benchsuite.
The runner now detects if commands exist and permits running incomplete
benchmarks.

Also, explicitly use Python 3 since that's what default Ubuntu 16.04 seems
to want.
2016-09-17 11:30:01 -04:00
Andrew Gallant
5a0c873f61 Fixing, polishing and adding benchmarks. 2016-09-16 21:02:46 -04:00
Andrew Gallant
65fec147d6 rename 2016-09-16 18:27:34 -04:00
Andrew Gallant
7fbf2f014c Reorganize some files. 2016-09-16 18:22:35 -04:00
Andrew Gallant
0e46171e3b Rework glob sets.
We try to reduce the pressure on regexes and offload some of it to
Aho-Corasick or exact lookups.
2016-09-15 22:06:04 -04:00
Andrew Gallant
f11d9fb922 Add a word benchmark.
Add ag to case insensitive benchmark.
2016-09-12 19:35:59 -04:00
Andrew Gallant
466cd70a8e More benchmarks for subtitle corpus. 2016-09-11 18:52:53 -04:00
Andrew Gallant
9bf7696ec8 Initial cut at a benchmark suite for CLI search tools. 2016-09-11 01:05:36 -04:00