mirror of https://github.com/BurntSushi/ripgrep.git synced 2024-12-12 19:18:24 +02:00
ripgrep recursively searches directories for a regex pattern while respecting your gitignore

ripgrep (rg)

ripgrep is a line-oriented search tool that combines the usability of The Silver Searcher (similar to ack) with the raw speed of GNU grep. ripgrep works by recursively searching your current directory for a regex pattern. ripgrep has first-class support on Windows, Mac and Linux, with binary downloads available for every release.

Dual-licensed under MIT or the UNLICENSE.

Screenshot of search results

A screenshot of a sample search with ripgrep

Quick examples comparing tools

This example searches the entire Linux kernel source tree (after running make defconfig && make -j8) for [A-Z]+_SUSPEND, where all matches must be words. Timings were collected on a system with an Intel i7-6900K 3.2 GHz, and ripgrep was compiled using the compile script in this repo.

Please remember that a single benchmark is never enough! See my blog post on ripgrep for a very detailed comparison with more benchmarks and analysis.

| Tool | Command | Line count | Time |
|------|---------|------------|------|
| ripgrep (Unicode) | rg -n -w '[A-Z]+_SUSPEND' | 450 | 0.134s |
| The Silver Searcher | ag -w '[A-Z]+_SUSPEND' | 450 | 0.753s |
| git grep | LC_ALL=C git grep -E -n -w '[A-Z]+_SUSPEND' | 450 | 0.823s |
| git grep (Unicode) | LC_ALL=en_US.UTF-8 git grep -E -n -w '[A-Z]+_SUSPEND' | 450 | 2.880s |
| sift | sift --git -n -w '[A-Z]+_SUSPEND' | 450 | 3.656s |
| The Platinum Searcher | pt -w -e '[A-Z]+_SUSPEND' | 450 | 12.369s |
| ack | ack -w '[A-Z]+_SUSPEND' | 1878 | 16.952s |

(Yes, ack has a bug.)

Here's another benchmark that disregards gitignore files and searches with a whitelist instead. The corpus is the same as in the previous benchmark, and the flags passed to each command ensure that they are doing equivalent work:

| Tool | Command | Line count | Time |
|------|---------|------------|------|
| ripgrep | rg -L -u -tc -n -w '[A-Z]+_SUSPEND' | 404 | 0.108s |
| ucg | ucg --type=cc -w '[A-Z]+_SUSPEND' | 392 | 0.219s |
| GNU grep | egrep -R -n --include='*.c' --include='*.h' -w '[A-Z]+_SUSPEND' | 404 | 0.733s |

(ucg has slightly different behavior in the presence of symbolic links.)

And finally, a straight-up comparison between ripgrep and GNU grep on a single large file (~9.3GB, OpenSubtitles2016.raw.en.gz):

| Tool | Command | Line count | Time |
|------|---------|------------|------|
| ripgrep | rg -w 'Sherlock [A-Z]\w+' | 5268 | 2.520s |
| GNU grep | LC_ALL=C egrep -w 'Sherlock [A-Z]\w+' | 5268 | 7.143s |

In the above benchmark, passing the -n flag (for showing line numbers) increases the times to 3.081s for ripgrep and 11.403s for GNU grep.

Why should I use ripgrep?

  • It can replace both The Silver Searcher and GNU grep because it is faster than both. (N.B. It is not, strictly speaking, a "drop-in" replacement for both, but the feature sets are far more similar than different.)
  • Like The Silver Searcher, ripgrep defaults to recursive directory search and won't search files ignored by your .gitignore files. It also ignores hidden and binary files by default. ripgrep also implements full support for .gitignore, whereas there are many bugs related to that functionality in The Silver Searcher.
  • ripgrep can search specific types of files. For example, rg -tpy foo limits your search to Python files and rg -Tjs foo excludes JavaScript files from your search. ripgrep can be taught about new file types with custom matching rules.
  • ripgrep supports many features found in grep, such as showing the context of search results, searching multiple patterns, highlighting matches with color and full Unicode support. Unlike GNU grep, ripgrep stays fast while supporting Unicode (which is always on).

In other words, use ripgrep if you like speed, filtering by default, fewer bugs and Unicode support.

Why shouldn't I use ripgrep?

I'd like to try to convince you why you shouldn't use ripgrep. This should give you a glimpse at some important downsides or missing features of ripgrep.

  • ripgrep uses a regex engine based on finite automata, so if you want fancy regex features such as backreferences or look around, ripgrep won't give them to you. ripgrep does support lots of things though, including, but not limited to: lazy quantification (e.g., a+?), repetitions (e.g., a{2,5}), begin/end assertions (e.g., ^\w+$), word boundaries (e.g., \bfoo\b), and support for Unicode categories (e.g., \p{Sc} to match currency symbols or \p{Lu} to match any uppercase letter). (Fancier regexes will never be supported.)
  • If you need to search files with text encodings other than UTF-8 (like UTF-16), then ripgrep won't work. ripgrep will still work on ASCII compatible encodings like latin1 or otherwise partially valid UTF-8. ripgrep can search for arbitrary bytes though, which might work in a pinch. (Likely to be supported in the future.)
  • ripgrep doesn't yet support searching compressed files. (Likely to be supported in the future.)
  • ripgrep doesn't have multiline search. (Unlikely to ever be supported.)

In other words, if you like fancy regexes, non-UTF-8 character encodings, searching compressed files or multiline search, then ripgrep may not quite meet your needs (yet).

Is it really faster than everything else?

Yes. A large number of benchmarks with detailed analysis for each is available on my blog.

Summarizing, ripgrep is fast because:

  • It is built on top of Rust's regex engine. Rust's regex engine uses finite automata, SIMD and aggressive literal optimizations to make searching very fast.
  • Rust's regex library maintains performance with full Unicode support by building UTF-8 decoding directly into its deterministic finite automaton engine.
  • It supports searching with either memory maps or by searching incrementally with an intermediate buffer. The former is better for single files and the latter is better for large directories. ripgrep chooses the best searching strategy for you automatically.
  • It applies your ignore patterns in .gitignore files using a RegexSet. That means a single file path can be matched against multiple glob patterns simultaneously.
  • It uses a lock-free parallel recursive directory iterator, courtesy of crossbeam and ignore.

Installation

The binary name for ripgrep is rg.

Binaries for ripgrep are available for Windows, Mac and Linux. Linux binaries are static executables. Windows binaries are available either as built with MinGW (GNU) or with Microsoft Visual C++ (MSVC). When possible, prefer MSVC over GNU, but you'll need to have the Microsoft VC++ 2015 redistributable installed.

If you're a Mac OS X Homebrew user, then you can install ripgrep either from homebrew-core (compiled with Rust stable, no SIMD):

$ brew install ripgrep

or you can install a binary compiled with Rust nightly (including SIMD and all optimizations) by using a custom tap:

$ brew tap burntsushi/ripgrep https://github.com/BurntSushi/ripgrep.git
$ brew install burntsushi/ripgrep/ripgrep-bin

If you're a Windows Chocolatey user, then you can install ripgrep from the official repo:

$ choco install ripgrep

If you're an Arch Linux user, then you can install ripgrep from the official repos:

$ pacman -S ripgrep

If you're a Gentoo user, you can install ripgrep from the official repo:

$ emerge ripgrep

If you're a Fedora 24+ user, you can install ripgrep from copr:

$ dnf copr enable carlgeorge/ripgrep
$ dnf install ripgrep

If you're a RHEL/CentOS 7 user, you can install ripgrep from copr:

$ yum-config-manager --add-repo=https://copr.fedorainfracloud.org/coprs/carlgeorge/ripgrep/repo/epel-7/carlgeorge-ripgrep-epel-7.repo
$ yum install ripgrep

If you're a Nix user, you can install ripgrep from nixpkgs:

$ nix-env --install ripgrep
$ # (Or using the attribute name, which is also `ripgrep`.)

If you're a Rust programmer, ripgrep can be installed with cargo. Note that this requires you to have Rust 1.12 or newer installed.

$ cargo install ripgrep

ripgrep isn't currently in any other package repositories. I'd like to change that.

Whirlwind tour

The command line usage of ripgrep doesn't differ much from other tools that perform a similar function, so you probably already know how to use ripgrep. The full details can be found in rg --help, but let's go on a whirlwind tour.

ripgrep detects when it's printing to a terminal, and will automatically colorize your output and show line numbers, just like The Silver Searcher. Coloring works on Windows too! Colors can be controlled more granularly with the --color flag.

One last thing before we get started: ripgrep assumes UTF-8 everywhere. It can still search files that are invalid UTF-8 (like, say, latin-1), but it will simply not work on UTF-16 encoded files or other more exotic encodings. Support for other encodings may happen.

To recursively search the current directory, while respecting all .gitignore files, ignoring hidden files and directories and skipping binary files:

$ rg foobar

The above command also respects all .ignore files, including in parent directories. .ignore files can be used when .gitignore files are insufficient. In all cases, .ignore patterns take precedence over .gitignore.

To ignore all ignore files, use -u. To additionally search hidden files and directories, use -uu. To additionally search binary files, use -uuu. (In other words, "search everything, dammit!") In particular, rg -uuu is similar to grep -a -r.

$ rg -uu foobar  # similar to `grep -r`
$ rg -uuu foobar  # similar to `grep -a -r`

(Tip: If your ignore files aren't being adhered to like you expect, run your search with the --debug flag.)
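To see the -u levels in action, here's a small sketch on a scratch directory. The file names are made up for illustration, and it assumes rg is on your PATH. It uses a .ignore file rather than a .gitignore, since some ripgrep versions only honor .gitignore inside a git repository. The search root is passed explicitly so rg doesn't try to read stdin when run from a script.

```shell
# Scratch directory with hypothetical files: one plain, one listed in
# .ignore, and one hidden. All three contain the search term.
demo=$(mktemp -d)
printf 'foobar\n' > "$demo/visible.txt"
printf 'foobar\n' > "$demo/skipped.txt"
printf 'foobar\n' > "$demo/.hidden.txt"
printf 'skipped.txt\n' > "$demo/.ignore"

if command -v rg >/dev/null 2>&1; then
  # -l lists matching files; count them at each level.
  plain=$(rg -l foobar "$demo" | wc -l | tr -d ' ')
  no_ignore=$(rg -u -l foobar "$demo" | wc -l | tr -d ' ')
  with_hidden=$(rg -uu -l foobar "$demo" | wc -l | tr -d ' ')
  echo "default: $plain, -u: $no_ignore, -uu: $with_hidden"
else
  echo "rg not found on PATH; skipping" >&2
fi
```

Each extra u should pick up one more file here: first the one listed in .ignore, then the hidden one.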

Make the search case-insensitive with -i, invert the search with -v, or show the 2 lines before and after every search result with -C2.

Force all matches to be surrounded by word boundaries with -w.
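Here's a quick sketch of those flags against a made-up five-line file (sample.txt and its contents are hypothetical, and rg is assumed to be on your PATH). It uses -c to count matching lines so the effect of each flag is easy to compare.

```shell
# Hypothetical sample data: "foo" as a word, inside a word, and uppercased.
demo=$(mktemp -d)
printf 'alpha\nfoo bar\nfoobar\nFOO\nomega\n' > "$demo/sample.txt"

if command -v rg >/dev/null 2>&1; then
  # -i also matches "FOO"; -v counts lines without a lowercase "foo";
  # -w only accepts "foo" when it stands alone as a word.
  insensitive=$(rg -c -i foo "$demo/sample.txt")
  inverted=$(rg -c -v foo "$demo/sample.txt")
  whole_word=$(rg -c -w foo "$demo/sample.txt")
  echo "-i: $insensitive, -v: $inverted, -w: $whole_word"

  # -C2 prints the match plus two lines of context on either side.
  rg -C2 foobar "$demo/sample.txt"
else
  echo "rg not found on PATH; skipping" >&2
fi
```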

Search and replace (find first and last names and swap them):

$ rg '([A-Z][a-z]+)\s+([A-Z][a-z]+)' --replace '$2, $1'

Named groups are supported:

$ rg '(?P<first>[A-Z][a-z]+)\s+(?P<last>[A-Z][a-z]+)' --replace '$last, $first'

Up the ante with full Unicode support, by matching any uppercase Unicode letter followed by any sequence of lowercase Unicode letters (good luck doing this with other search tools!):

$ rg '(\p{Lu}\p{Ll}+)\s+(\p{Lu}\p{Ll}+)' --replace '$2, $1'
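One thing worth checking for yourself: --replace rewrites what rg prints, not the files it searches. The sketch below uses a made-up names.txt (hypothetical data, rg assumed to be on your PATH) and compares the file before and after the search.

```shell
# Hypothetical one-line input file.
demo=$(mktemp -d)
printf 'Jane Doe\n' > "$demo/names.txt"

if command -v rg >/dev/null 2>&1; then
  before=$(cat "$demo/names.txt")
  # Swap the two capture groups in the printed output.
  swapped=$(rg '([A-Z][a-z]+)\s+([A-Z][a-z]+)' --replace '$2, $1' "$demo/names.txt")
  after=$(cat "$demo/names.txt")
  echo "printed: $swapped"
  echo "file before: $before"
  echo "file after:  $after"
else
  echo "rg not found on PATH; skipping" >&2
fi
```

If you want the replaced text on disk, redirect the output to a new file yourself.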

Search only files matching a particular glob:

$ rg foo -g 'README.*'

Or exclude files matching a particular glob:

$ rg foo -g '!*.min.js'

Search and return paths matching a particular glob (i.e., -g flag in ag/ack):

$ rg -g 'doc*' --files

Search only HTML and CSS files:

$ rg -thtml -tcss foobar

Search everything except for JavaScript files:

$ rg -Tjs foobar

To see a list of types supported, run rg --type-list. To add a new type, use --type-add, which must be accompanied by a pattern for searching (rg won't persist your type settings):

$ rg --type-add 'foo:*.{foo,foobar}' -tfoo bar

The type foo will now match any file ending with the .foo or .foobar extensions.

Regex syntax

The syntax supported is documented as part of Rust's regex library.

Shell completions

Shell completion files are included in the release tarball for Bash, Fish, Zsh and PowerShell.

For bash, move rg.bash-completion to $XDG_CONFIG_HOME/bash_completion or /etc/bash_completion.d/.

For fish, move rg.fish to $HOME/.config/fish/completions.

Building

ripgrep is written in Rust, so you'll need to grab a Rust installation in order to compile it. ripgrep compiles with Rust 1.12 (stable) or newer. Building is easy:

$ git clone https://github.com/BurntSushi/ripgrep
$ cd ripgrep
$ cargo build --release
$ ./target/release/rg --version
0.1.3

If you have a Rust nightly compiler, then you can enable optional SIMD acceleration like so:

$ RUSTFLAGS="-C target-cpu=native" cargo build --release --features 'simd-accel avx-accel'

If your machine doesn't support AVX instructions, then simply remove avx-accel from the features list. Likewise, remove simd-accel if it doesn't support SIMD.
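If you aren't sure whether your CPU has AVX, one way to pick the feature list on Linux is to look at /proc/cpuinfo. This is only a sketch under that assumption: on systems without /proc/cpuinfo it falls back to leaving avx-accel out, and it prints the build command instead of running it.

```shell
# Linux-only check: the CPU flags line in /proc/cpuinfo lists "avx"
# when the instruction set is available.
if [ -r /proc/cpuinfo ] && grep -qw avx /proc/cpuinfo; then
  features='simd-accel avx-accel'
else
  features='simd-accel'
fi

# Print the command to run from the repository root (nightly compiler needed).
echo "RUSTFLAGS=\"-C target-cpu=native\" cargo build --release --features '$features'"
```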

Running tests

ripgrep is relatively well tested, including both unit tests and integration tests. To run the full test suite, use:

$ cargo test

from the repository root.