1
0
mirror of https://github.com/BurntSushi/ripgrep.git synced 2024-12-12 19:18:24 +02:00
Commit Graph

112 Commits

Author SHA1 Message Date
Andrew Gallant
7dd1194a97 deps: update to latest regex crate
The regex update fixes the Rust nightly build failure by in turn updating
its simd dependency to 2.x.

The regex update also includes a literal optimization that uses Tuned
Boyer Moore.

Fixes #617
2017-12-30 16:50:18 -05:00
Igor Gnatenko
373e0595e6 bump bytecount to 0.2
Signed-off-by: Igor Gnatenko <i.gnatenko.brain@gmail.com>
2017-11-22 09:52:47 -05:00
Dan Burkert
263e8f92b9 Update to memmap 0.6
`memmap` 0.6.0 introduces major API changes in anticipation of a 1.0
release. See https://github.com/danburkert/memmap-rs/releases/tag/0.6.0
for more information. CC danburkert/memmap-rs#33.
2017-11-22 06:57:15 -05:00
Andrew Gallant
c4e1945384
cargo: bump to 0.7.1 2017-10-22 10:33:09 -04:00
Andrew Gallant
8c8c83a1f8
deps: bump ignore to 0.3.1 2017-10-22 10:31:42 -04:00
Andrew Gallant
efa4de8126
cargo: bump to 0.7.0 2017-10-21 22:40:10 -04:00
Andrew Gallant
08060a2105
deps: update everything 2017-10-21 22:40:09 -04:00
Andrew Gallant
cd575d99f8
ignore: upgrade to walkdir 2
The uninteresting bits of this commit involve mechanical changes for
updates to walkdir 2. The more interesting bits of this commit are the
breaking changes, although none of them should require any significant
change on users of this library. The breaking changes are as follows:

* `DirEntry::path_is_symbolic_link` has been renamed to
  `DirEntry::path_is_symlink`. This matches the conventions in the
  standard library, and also the corresponding name change in walkdir.
* Removed the `From<walkdir::Error> for ignore::Error` impl. This was
  intended to only be used internally, but was the only thing that
  made `walkdir` a public dependency of `ignore`. Therefore, we remove
  it since it seems unnecessary.
* Renamed `WalkBuilder::sort_by` to `WalkBuilder::sort_by_file_name`,
  and changed the type of the comparator from

    Fn(&OsString, &OsString) -> cmp::Ordering + 'static

  to

    Fn(&OsStr, &OsStr) -> cmp::Ordering + Send + Sync + 'static

  The corresponding change in `walkdir` retains the `sort_by` name, but
  gives the comparator a pair of `&DirEntry` values instead of a pair
  of `&OsStr` values. Ideally, `ignore` would hand off its own pair of
  `&ignore::DirEntry` values, but this requires more design work. So for
  now, we retain previous functionality, but leave room to make a proper
  `sort_by` method.

[breaking-change]
2017-10-21 22:40:09 -04:00
Andrew Gallant
1267f01c24
deps: upgrade to memchr 2 2017-10-21 22:40:09 -04:00
Henri Sivonen
af77dd55a2 Update encoding_rs to 0.7.0 2017-08-28 08:14:33 -04:00
Jack O'Connor
3065a8c9c8 restore the default SIGPIPE behavior as a temporary workaround
See https://github.com/BurntSushi/ripgrep/issues/200.
2017-08-27 15:01:05 -04:00
Andrew Gallant
208c11af56
deps: bump termcolor in lock file 2017-08-27 11:34:58 -04:00
Andrew Gallant
e10544f819
0.6.0 2017-08-23 19:54:50 -04:00
Andrew Gallant
36c16eb00c
bump deps 2017-08-23 19:52:13 -04:00
Henri Sivonen
fe7fe74b0a Pass the simd-accel feature to encoding_rs 2017-08-20 08:42:31 -04:00
Vurich
b3a9c34515 Remove unused libc dependency 2017-08-08 07:03:58 -04:00
Andrew Gallant
972ec1adc6
bump clap to 2.26
Fixes #482
2017-07-30 18:04:49 -04:00
Igor Gnatenko
a2d4c03c71 bump encoding_rs to 0.6 2017-07-30 18:00:50 -04:00
Andrew Gallant
92e5fad27d
ignore-0.2.2 2017-07-17 17:56:33 -04:00
Andrew Gallant
1c03298903
bump ignore version, take 2 2017-07-12 22:26:59 -04:00
Andrew Gallant
9e51b18ac7
bump wincolor dep 2017-06-19 13:46:43 -04:00
Andrew Gallant
44c03f58bc
bump deps, redux
This only bumps the regex dependency. The new clap version causes a bump
in unicode-segmentation, which in turn requires a Rust 1.15, which is
above ripgrep's currently supported minimum Rust version of 1.12.
2017-05-21 15:56:56 -04:00
Andrew Gallant
d1a6ab922e
Revert "bump deps"
This reverts commit b860fa3acd.
2017-05-21 15:52:58 -04:00
Andrew Gallant
b860fa3acd
bump deps 2017-05-21 12:33:13 -04:00
Andrew Gallant
5a666b042d
bump ripgrep, ignore, globset
The `ignore` and `globset` crates both got breaking changes in the
course of fixing #444, so increase 0.x to 0.(x+1).
2017-05-11 19:12:20 -04:00
Andrew Gallant
0b685c8429 deps: update clap to 2.24
Fixes #442
2017-05-08 19:24:11 -04:00
Andrew Gallant
ac1c95a6d9 0.5.1 2017-04-09 09:47:00 -04:00
Andrew Gallant
685b431d80 bump deps 2017-04-09 09:46:37 -04:00
Andrew Gallant
487713aa34 bump ignore 2017-04-09 09:45:00 -04:00
Kevin K
0c298f60a6 updates clap and removes home rolled -h/--help distinction
This commit updates clap to v2.23.0

The update contained a bug fix in clap that results in broken code in
ripgrep. ripgrep was relying on the bug, but this commit fixes that
issue. The bug centered around not being able to override the
auto-generated help message by supplying a flag with a long of `help`.

Normally, supplying a flag with a long of `help` means whenever the user
passes `--help`, the consuming code (e.g. ripgrep) is responsible for
displaying the help message. However, due to the bug in clap this wasn't
necessary for ripgrep to do unless the user passed `-h`. With the bug
fixed, it meant the user passing `--help` and clap expected ripgrep to
display the help, yet ripgrep expected clap to display the help. This
has been fixed in this commit of ripgrep.

All well now!

v2.23.0 also brings the abilty to use `Arg::help` or `Arg::long_help`
allowing one to distinguish between `-h` and `--help`. This commit
leaves all doc strings in the `lazy_static!` hashmap however only for
aesthetic reasons.

This means all home rolled handling of `-h`/`--help` has been removed
from ripgrep, yet functionality *and* appearances are 100% the same.
2017-04-05 11:38:58 -04:00
Roman Proskuryakov
1425d6735e Bamp clap to 2.22.2
Fixes #426 , #418
2017-03-31 15:56:10 -04:00
Andrew Gallant
33c95d2919 bump deps 2017-03-30 12:33:31 -04:00
Andrew Gallant
08c017330f bump termcolor dep 2017-03-15 07:15:39 -04:00
Andrew Gallant
5cb4bb9ea0 bump ripgrep version in Cargo.lock 2017-03-14 15:09:24 -04:00
Andrew Gallant
c648eadbaa Bump and update deps. 2017-03-12 21:33:13 -04:00
Andrew Gallant
8bbe58d623 Add support for additional text encodings.
This includes, but is not limited to, UTF-16, latin-1, GBK, EUC-JP and
Shift_JIS. (Courtesy of the `encoding_rs` crate.)

Specifically, this feature enables ripgrep to search files that are
encoded in an encoding other than UTF-8. The list of available encodings
is tied directly to what the `encoding_rs` crate supports, which is in
turn tied to the Encoding Standard. The full list of available encodings
can be found here: https://encoding.spec.whatwg.org/#concept-encoding-get

This pull request also introduces the notion that text encodings can be
automatically detected on a best effort basis. Currently, the only
support for this is checking for a UTF-16 bom. In all other cases, a
text encoding of `auto` (the default) implies a UTF-8 or ASCII
compatible source encoding. When a text encoding is otherwise specified,
it is unconditionally used for all files searched.

Since ripgrep's regex engine is fundamentally built on top of UTF-8,
this feature works by transcoding the files to be searched from their
source encoding to UTF-8. This transcoding only happens when:

1. `auto` is specified and a non-UTF-8 encoding is detected.
2. A specific encoding is given by end users (including UTF-8).

When transcoding occurs, errors are handled by automatically inserting
the Unicode replacement character. In this case, ripgrep's output is
guaranteed to be valid UTF-8 (excluding non-UTF-8 file paths, if they
are printed).

In all other cases, the source text is searched directly, which implies
an assumption that it is at least ASCII compatible, but where UTF-8 is
most useful. In this scenario, encoding errors are not detected. In this
case, ripgrep's output will match the input exactly, byte-for-byte.

This design may not be optimal in all cases, but it has some advantages:

1. In the happy path ("UTF-8 everywhere") remains happy. I have not been
   able to witness any performance regressions.
2. In the non-UTF-8 path, implementation complexity is kept relatively
   low. The cost here is transcoding itself. A potentially superior
   implementation might build decoding of any encoding into the regex
   engine itself. In particular, the fundamental problem with
   transcoding everything first is that literal optimizations are nearly
   negated.

Future work should entail improving the user experience. For example, we
might want to auto-detect more text encodings. A more elaborate UX
experience might permit end users to specify multiple text encodings,
although this seems hard to pull off in an ergonomic way.

Fixes #1
2017-03-12 19:54:48 -04:00
Andrew Gallant
7c37065911 update deps 2017-03-08 20:23:12 -05:00
Andrew Gallant
4e8c0fc4ad bump clap to 2.20.5
Fixes #383
2017-02-25 18:43:13 -05:00
Andrew Gallant
a114b86063 update termcolor dep 2017-02-18 15:09:25 -05:00
Andrew Gallant
d825648b86 Remove lazy_static from globset 2017-02-12 15:37:50 -05:00
Andrew Gallant
fecef10c1c update deps 2017-01-17 19:36:23 -05:00
Andrew Gallant
f5a2d022ec Replace internal atty module with atty crate.
This removes all use of explicit unsafe in ripgrep proper except for
one: accessing the contents of a memory map. (Which may never go away.)
2017-01-15 16:32:30 -05:00
Andrew Gallant
057ed6305a 0.4.0 2017-01-13 23:46:21 -05:00
Andrew Gallant
a7ca2d6563 update same-file dep 2017-01-13 20:01:06 -05:00
Andrew Gallant
c3de1f58ea another bytecount update, weird 2017-01-10 21:13:40 -05:00
Andrew Gallant
e940bc956d update bytecount
Fixes #313
2017-01-10 18:30:16 -05:00
Andrew Gallant
461e0c4e33 Don't search stdout redirected file.
When running ripgrep like this:

    rg foo > output

we must be careful not to search `output` since ripgrep is actively writing
to it. Searching it can cause massive blowups where the file grows without
bound.

While this is conceptually easy to fix (check the inode of the redirection
and the inode of the file you're about to search), there are a few problems
with it.

First, inodes are a Unix thing, so we need a Windows specific solution to
this as well. To resolve this concern, I created a new crate, `same-file`,
which provides a cross platform abstraction.

Second, stat'ing every file is costly. This is not avoidable on Windows,
but on Unix, we can get the inode number directly from directory traversal.
However, this information wasn't exposed, but now it is (through both the
ignore and walkdir crates).

Fixes #286
2017-01-09 16:12:08 -05:00
Andrew Gallant
aed315e80a bump deps 2017-01-03 07:27:51 -05:00
Andrew Gallant
163e00677a Update to regex 0.2. 2017-01-01 01:03:21 -05:00
Andrew Gallant
d58236fbdc bump various versions 2016-12-30 15:44:08 -05:00