1
0
mirror of https://github.com/BurntSushi/ripgrep.git synced 2025-05-13 21:26:27 +02:00

28 Commits

Author SHA1 Message Date
Andrew Gallant
a7d26c8f14 binary: rejigger ripgrep's handling of binary files
This commit attempts to surface binary filtering in a slightly more
user friendly way. Namely, before, ripgrep would silently stop
searching a file if it detected a NUL byte, even if it had previously
printed a match. This can lead to the user quite reasonably assuming
that there are no more matches, since a partial search is fairly
unintuitive. (ripgrep has this behavior by default because it really
wants to NOT search binary files at all, just like it doesn't search
gitignored or hidden files.)

With this commit, if a match has already been printed and ripgrep detects
a NUL byte, then it will print a warning message indicating that the search
stopped prematurely.

Moreover, this commit adds a new flag, --binary, which causes ripgrep to
stop filtering binary files, but in a way that still avoids dumping
binary data into terminals. That is, the --binary flag makes ripgrep
behave more like grep's default behavior.

For files explicitly specified in a search, e.g., `rg foo some-file`,
then no binary filtering is applied (just like no gitignore and no
hidden file filtering is applied). Instead, ripgrep behaves as if you
gave the --binary flag for all explicitly given files.

This was a fairly invasive change, and potentially increases the UX
complexity of ripgrep around binary files. (Before, there were two
binary modes, where as now there are three.) However, ripgrep is now a
bit louder with warning messages when binary file detection might
otherwise be hiding potential matches, so hopefully this is a net
improvement.

Finally, the `-uuu` convenience now maps to `--no-ignore --hidden
--binary`, since this is closer to the actualy intent of the
`--unrestricted` flag, i.e., to reduce ripgrep's smart filtering. As a
consequence, `rg -uuu foo` should now search roughly the same number of
bytes as `grep -r foo`, and `rg -uuua foo` should search roughly the
same number of bytes as `grep -ra foo`. (The "roughly" weasel word is
used because grep's and ripgrep's binary file detection might differ
somewhat---perhaps based on buffer sizes---which can impact exactly what
is and isn't searched.)

See the numerous tests in tests/binary.rs for intended behavior.

Fixes #306, Fixes #855
2019-04-14 19:29:27 -04:00
Andrew Gallant
7a6a40bae1 edition: move core ripgrep to Rust 2018 2019-01-19 10:44:30 -05:00
Andrew Gallant
241bc8f8fc ripgrep: add --pre-glob flag
The --pre-glob flag is like the --glob flag, except it applies to filtering
files through the preprocessor instead of for search. This makes it
possible to apply the preprocessor to only a small subset of files, which
can greatly reduce the process overhead of using a preprocessor when
searching large directories.
2018-09-04 23:18:55 -04:00
Andrew Gallant
4846d63539 grep-cli: introduce new grep-cli crate
This commit moves a lot of "utility" code from ripgrep core into
grep-cli. Any one of these things might not be worth creating a new
crate, but combining everything together results in a fair number of a
convenience routines that make up a decent sized crate.

There is potentially more we could move into the crate, but much of what
remains in ripgrep core is almost entirely dealing with the number of
flags we support.

In the course of doing moving things to the grep-cli crate, we clean up
a lot of gunk and improve failure modes in a number of cases. In
particular, we've fixed a bug where other processes could deadlock if
they write too much to stderr.

Fixes #990
2018-09-04 23:18:55 -04:00
Andrew Gallant
bb110c1ebe ripgrep: migrate to libripgrep
This commit does the work to delete the old `grep` crate and effectively
rewrite most of ripgrep core to use the new libripgrep crates. The new
`grep` crate is now a facade that collects the various crates that make
up libripgrep.

The most complex part of ripgrep core is now arguably the translation
between command line parameters and the library options, which is
ultimately where we want to be.
2018-08-20 07:10:19 -04:00
Andrew Gallant
e3da726836 Rename search module to search_stream.
The name better reflects the difference between it and the search_buffer
module.
2016-09-10 00:08:42 -04:00
Andrew Gallant
f83cd63b11 Add integration tests. 2016-09-09 22:58:30 -04:00
Andrew Gallant
0766617e07 Refactor how coloring is done.
All in the name of appeasing Windows.
2016-09-08 21:46:14 -04:00
Andrew Gallant
0042dce949 Hack in Windows console coloring.
The code has suffered and needs refactoring/commenting. BUT... IT WORKS!
2016-09-07 21:54:28 -04:00
Andrew Gallant
ca058d7584 Add support for memory maps.
I though plain `read` had usurped them, but when searching a very small
number of files, mmaps can be around 20% faster on Linux. It'd be really
unfortunate to leave that on the table.

Mmap searching doesn't support contexts yet, but we probably don't really
care. And duplicating that logic doesn't sound fun. Without contexts, mmap
searching is delightfully simple.
2016-09-06 21:47:33 -04:00
Andrew Gallant
2bda77c414 Fix deps so that others can build it. 2016-09-05 18:22:12 -04:00
Andrew Gallant
7a149c20fe More progress. With coloring! 2016-09-05 17:36:41 -04:00
Andrew Gallant
d8d7560fd0 TODOs and some cleanup/refactoring. 2016-09-05 10:15:13 -04:00
Andrew Gallant
812cdb13c6 Lots of progress:
- Refactored interaction between CLI args and rest of xrep.
  - Filling in a lot more options, including file type filtering.
  - Fixing some bugs in globbing/ignoring.
  - More documentation.
2016-09-05 00:52:23 -04:00
Andrew Gallant
0bf278e72f making search work (finally) 2016-09-03 21:48:23 -04:00
Andrew Gallant
c2b5577cba progress on after contexts 2016-09-03 01:11:14 -04:00
Andrew Gallant
7f0b1ccbd3 Before contexts seem to work.
Code is in a little better shape.
2016-09-02 23:25:07 -04:00
Andrew Gallant
5450aed9a8 Make "before" context work.
No line numbers. And match inverting is broken.

This is awful.
2016-09-01 21:56:23 -04:00
Andrew Gallant
5aa3b9bc58 struggling with printing contexts, what a mess 2016-08-31 20:02:59 -04:00
Andrew Gallant
03d9df4303 tests and refactoring search 2016-08-31 15:52:35 -04:00
Andrew Gallant
d011cea053 The search code is a mess, but...
... we now support inverted matches and line numbers!
2016-08-29 22:44:15 -04:00
Andrew Gallant
c809679cf2 Lots of improvements. Most notably, removal of memory maps for searching.
Memory maps appear to degrade quite a bit in the presence of multithreading.

Also, switch to lock free data structures for synchronization. Give each
worker an input and output buffer which require no synchronization.
2016-08-28 20:18:34 -04:00
Andrew Gallant
1c8379f55a Implementing core functionality.
Initially experimenting with crossbeam to manage synchronization.
2016-08-28 01:37:12 -04:00
Andrew Gallant
0163b39faa refactor progress 2016-06-20 16:55:13 -04:00
Andrew Gallant
8d9d602945 update 2016-04-03 21:22:09 -04:00
Andrew Gallant
07bff7409b tweaks 2016-03-30 22:24:59 -04:00
Andrew Gallant
f1a91307cd short matches 2016-03-30 20:44:26 -04:00
Andrew Gallant
79a51029c1 progress 2016-03-29 21:21:34 -04:00