mirror of https://github.com/BurntSushi/ripgrep.git synced 2024-12-12 19:18:24 +02:00

326 Commits

Author SHA1 Message Date
lesnyrumcajs
5962abc465 searcher: add option to disable BOM sniffing
This commit adds a new encoding feature where the -E/--encoding flag
will now accept a value of 'none'. When given this value, all encoding
related machinery is disabled and ripgrep will search the raw bytes of
the file, including the BOM if it's present.

Closes #1207, Closes #1208
2019-04-06 10:35:08 -04:00
Andrew Gallant
77439f99a4 deps: add bstr to Cargo.lock 2019-04-05 23:24:08 -04:00
Andrew Gallant
cd9815cb37
deps: update to aho-corasick 0.7
We do the simplest possible change to migrate to the new version.

Fixes #1228
2019-04-03 13:51:26 -04:00
Andrew Gallant
3f22c3a658
deps: update everything
This updates all dependencies to their latest versions.

We tolerate a duplicative aho-corasick for now, which we will fix in the
next commit.
2019-04-03 13:07:26 -04:00
Andrew Gallant
0913972104
deps: bump encoding_rs_io
This brings in a new API for disabling BOM sniffing.

This is part of the work toward completing
https://github.com/BurntSushi/ripgrep/issues/1207
2019-03-03 16:36:34 -05:00
Andrew Gallant
f19b84fb23
regex: bump regex dep to fix match bug
See

* 661bf53d5b
* edf45e6f5f

for details on the bug fix, which was in the regex engine.

Fixes #1203
2019-02-27 17:42:14 -05:00
Andrew Gallant
1c7c4e6640
deps: update tempfile 2019-02-21 16:32:17 -05:00
Andrew Gallant
69c5e3938d
deps: bump smallvec
This gets rid of the unmaintained crates `unreachable` and `void`. Yay!
2019-02-21 16:31:48 -05:00
Andrew Gallant
d9cf05ad50
deps: update to aho-corasick 0.6.10
This brings in a fix for this bug:
https://github.com/BurntSushi/aho-corasick/issues/37

Fixes #1079
2019-02-16 11:39:33 -05:00
Andrew Gallant
af8b6caebb
deps: update various dependencies 2019-02-16 09:39:42 -05:00
Andrew Gallant
c84cfb6756
grep-regex-0.1.2 2019-02-16 09:30:06 -05:00
Andrew Gallant
8c95290ff6
deps: miscellaneous updates 2019-02-10 07:45:08 -05:00
Andrew Gallant
d6feeb7ff2
grep-searcher-0.1.3 2019-02-10 07:42:37 -05:00
Andrew Gallant
626ed00c19
searcher: revert big-endian patch
This undoes the patch to stop using bytecount on big-endian
architectures. In particular, we bump our bytecount dependency to the
latest release, which has a fix.

This reverts commit a4868b8835.

Fixes #1144 (again), Closes #1194
2019-02-10 07:40:32 -05:00
Andrew Gallant
fc3cf41247
grep-searcher-0.1.2 2019-02-09 16:13:07 -05:00
Andrew Gallant
de0bc78982
deps: bump encoding_rs to 0.8.16
This brings in an updated `encoding_rs` crate that uses `packed_simd`,
which compiles on the latest nightly. Compilation times do appear to be
impacted significantly though.

Fixes #1175 (again)
2019-02-07 17:05:14 -05:00
Andrew Gallant
f768796e4f
deps: update other deps 2019-01-29 13:08:56 -05:00
Andrew Gallant
da0c0c4705
deps: update to crossbeam-channel 0.3.8
This drops dependencies on parking_lot and rand from ripgrep.

(rand is still used for tests.)
2019-01-29 13:07:37 -05:00
Andrew Gallant
cc93db3b18
cargo: include auto-generated message
This is going to be annoying for a while if one switches between the
latest nightly compiler and older compilers. Sigh.
2019-01-29 13:04:40 -05:00
Andrew Gallant
f158a42a71
ignore: correctly detect hidden files on Windows
This commit fixes a bug where ripgrep only treated files beginning with
a `.` as hidden. On Windows, we continue this tradition, but
additionally check whether a file has the special Windows "hidden"
attribute set. If so, we treat it as a hidden file.

In order to make this work without an additional stat call, we had to
rearrange some of the plumbing from the directory traverser.

Fixes #1154
2019-01-27 12:11:52 -05:00
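As an illustration of the hidden-file rule described in f158a42a71, here is a minimal sketch using only the standard library. The helper names (`is_dot_file`, `has_hidden_attribute`, `is_hidden`) are hypothetical and are not taken from the `ignore` crate; the sketch assumes the caller already holds the entry's metadata, which mirrors the commit's goal of avoiding an extra stat call.

```rust
use std::fs::Metadata;
use std::path::Path;

/// Hypothetical helper: true if the final path component begins with `.`.
fn is_dot_file(path: &Path) -> bool {
    path.file_name()
        .map_or(false, |name| name.to_string_lossy().starts_with('.'))
}

/// Hypothetical helper: on Windows, check the "hidden" file attribute bit.
#[cfg(windows)]
fn has_hidden_attribute(md: &Metadata) -> bool {
    use std::os::windows::fs::MetadataExt;
    // Value of FILE_ATTRIBUTE_HIDDEN in the Windows API.
    const FILE_ATTRIBUTE_HIDDEN: u32 = 0x2;
    (md.file_attributes() & FILE_ATTRIBUTE_HIDDEN) != 0
}

/// On other platforms there is no such attribute, so only the dot rule applies.
#[cfg(not(windows))]
fn has_hidden_attribute(_md: &Metadata) -> bool {
    false
}

/// A file is treated as hidden if either rule matches.
fn is_hidden(path: &Path, md: &Metadata) -> bool {
    is_dot_file(path) || has_hidden_attribute(md)
}
```

Passing the metadata in, rather than looking it up inside `is_hidden`, keeps the check free of additional file system calls.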
Andrew Gallant
e99b6bda0e
deps: bump regex-syntax to 0.6.5
This is necessary for the use of the new is_line_anchored_{start,end}
APIs.
2019-01-26 12:20:02 -05:00
Andrew Gallant
276e2c9b9a
searcher: always strip BOM
This fixes a bug where a BOM prefix was included. While this was somewhat
intentional in order to have a faithful "UTF8 passthru" option, in
practice, this causes problems such as breaking patterns like `^` in a
really non-obvious way.

The actual fix was to add a new API to encoding_rs_io, which this commit
brings in.

Fixes #1163
2019-01-25 17:18:57 -05:00
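To make the `^` problem from 276e2c9b9a concrete, the sketch below shows why a retained UTF-8 BOM defeats a start-of-line match and how stripping it restores the expected behavior. It is an illustration only: `strip_utf8_bom` and `first_line_starts_with` are hypothetical names, and the real fix lives in encoding_rs_io rather than in code like this.

```rust
/// The UTF-8 encoding of U+FEFF, which some editors write at the start of a file.
const UTF8_BOM: &[u8] = b"\xEF\xBB\xBF";

/// Hypothetical helper: drop a leading UTF-8 BOM if one is present.
fn strip_utf8_bom(bytes: &[u8]) -> &[u8] {
    if bytes.starts_with(UTF8_BOM) {
        &bytes[UTF8_BOM.len()..]
    } else {
        bytes
    }
}

/// Hypothetical stand-in for matching `^needle` against the first line.
fn first_line_starts_with(haystack: &[u8], needle: &[u8]) -> bool {
    haystack.starts_with(needle)
}
```

With the BOM left in place, the first three bytes of the file sit between the start of the line and the text the user sees, so an anchored pattern silently fails on the first line.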
Andrew Gallant
47833b9ce7
deps: update removal of grep devdeps 2019-01-23 20:14:37 -05:00
Andrew Gallant
1e9ee2cc85 deps: update memmap 2019-01-19 10:44:30 -05:00
Andrew Gallant
968491f8e9 deps: update to bytecount 0.5
bytecount now uses runtime dispatch for enabling SIMD, which means we no
longer need the avx-accel features. We remove the feature from ripgrep since
the next release will be a minor version bump, but leave the features as
no-ops in the crates that previously used them.
2019-01-19 10:44:30 -05:00
Andrew Gallant
63b0f31a22 deps: update various dependencies
We also increase the MSRV to 1.32, the current stable release, which sets
the stage for migrating to Rust 2018.
2019-01-19 10:44:30 -05:00
Andrew Gallant
17ef4c40f3
ignore-0.4.6 2018-12-30 08:46:09 -05:00
Andrew Gallant
b3c5773266
deps: bump ignore 2018-12-30 08:43:18 -05:00
Andrew Gallant
b45b2f58ea
deps: update most other dependencies
This commit is the result of doing:

  $ cargo update
  $ cargo update -p encoding_rs --precise 0.8.10

where the latter line prevents encoding_rs from updating to 0.8.11 (or
newer). In particular, the 0.8.11 release increased the minimum Rust
version to 1.29, whereas ripgrep 0.10.x is still on 1.28. We stay on an
older version for now until ripgrep is ready to move to 0.11.x.
2018-12-15 08:42:14 -05:00
Andrew Gallant
662a9bc73d
deps: update to crossbeam-channel 0.3
This also requires corresponding updates to both rand and rand_core. Doing
an update of rand without doing an update of rand_core results in
compilation errors because two distinct versions of rand_core are included
in the build, and the traits they expose are distinct and incompatible.

We also switch over to using tempfile instead of tempdir, which drops the
last remaining thing keeping rand 0.4 in the build.

Fixes #1141, Fixes #1142
2018-12-15 08:40:04 -05:00
Andrew Gallant
401add0a99
deps: update regex and regex-syntax
This brings in some new Unicode properties, such as \p{Emoji}.

It is now also technically possible to construct a regex that recognizes
grapheme clusters.
2018-12-09 16:33:37 -05:00
Andrew Gallant
fb62266620
deps: update encoding_rs
This commit bumps the version of encoding_rs to use the latest release.
This appears to fix a panic in UTF-16 decoding.

Fixes #1089
2018-10-22 06:50:35 -04:00
Andrew Gallant
ba533f390e grep-searcher: update to encoding_rs_io 0.1.3
This update includes a work-around for a presumed bug in encoding_rs
that causes a panic:
https://github.com/hsivonen/encoding_rs/issues/34

Specifically, to reproduce this in ripgrep, one can run the following:

    $ curl -LO https://cache.ruby-lang.org/pub/ruby/2.5/ruby-2.5.1.tar.gz
    $ tar xf ruby-2.5.1.tar.gz
    $ rg ZZZZZ ruby-2.5.1/test/rexml/data/t63-2.svg
    thread 'main' panicked at 'index out of bounds: the len is 1 but the index is 1'

Fixes #1052
2018-09-25 16:56:04 -04:00
Andrew Gallant
eb18da0450
pcre2: use jit_if_available
This will allow PCRE2 to fall back to non-JIT matching when running on
platforms without JIT support.

ref https://github.com/BurntSushi/rust-pcre2/issues/3
2018-09-08 17:12:14 -04:00
Andrew Gallant
d14f0b37d6
deps: update versions for all crates
I don't think every change here is needed, but this ensures we're using
the latest version of every direct dependency.
2018-09-07 14:00:22 -04:00
Andrew Gallant
3ddc3c040f
deps: minor updates 2018-09-07 13:03:01 -04:00
Andrew Gallant
0e2f8f7b47
grep: add clap and regex dev dependencies to grep
These are (or will be) used in grep's examples.
2018-09-07 12:06:05 -04:00
Andrew Gallant
83dff33326
deps: update various deps 2018-09-04 23:29:22 -04:00
Andrew Gallant
003c3695f4
deps: update grep version 2018-09-04 23:29:05 -04:00
Andrew Gallant
4846d63539 grep-cli: introduce new grep-cli crate
This commit moves a lot of "utility" code from ripgrep core into
grep-cli. Any one of these things might not be worth creating a new
crate, but combining everything together results in a fair number of
convenience routines that make up a decent-sized crate.

There is potentially more we could move into the crate, but much of what
remains in ripgrep core is almost entirely dealing with the number of
flags we support.

In the course of moving things to the grep-cli crate, we clean up
a lot of gunk and improve failure modes in a number of cases. In
particular, we've fixed a bug where other processes could deadlock if
they write too much to stderr.

Fixes #990
2018-09-04 23:18:55 -04:00
Andrew Gallant
04518e32e7
deps: update other crates 2018-08-30 23:03:07 -04:00
Andrew Gallant
f2eaf5b977
deps: update termcolor for perf tweaks 2018-08-30 22:57:01 -04:00
Andrew Gallant
f9ce7a84a8 ignore: add 'same_file_system' option
This commit adds a 'same_file_system' option to the walk builder. For
single threaded walking, it defers to the walkdir crate, which has the
same option. The bulk of this commit implements this flag for the parallel
walker. We add one very feeble test for this.

The parallel walker is now officially a complete mess.

Closes #321
2018-08-26 18:42:25 -04:00
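One common way to implement a check like the 'same_file_system' option from f9ce7a84a8 is to compare device IDs. The sketch below assumes a Unix platform and uses hypothetical helper names (`device_id`, `on_same_file_system`); it is not the code from `ignore` or `walkdir`, which also handle Windows.

```rust
use std::io;
use std::path::Path;

/// Hypothetical helper: the ID of the device that holds `path` (Unix only).
#[cfg(unix)]
fn device_id(path: &Path) -> io::Result<u64> {
    use std::os::unix::fs::MetadataExt;
    Ok(std::fs::metadata(path)?.dev())
}

/// Other platforms are deliberately left out of this sketch.
#[cfg(not(unix))]
fn device_id(_path: &Path) -> io::Result<u64> {
    Err(io::Error::new(
        io::ErrorKind::Other,
        "device IDs are only sketched for Unix here",
    ))
}

/// True if `entry` lives on the same file system as the traversal root.
fn on_same_file_system(root: &Path, entry: &Path) -> io::Result<bool> {
    Ok(device_id(root)? == device_id(entry)?)
}
```

A walker would typically compute the root's device ID once and skip descending into any directory whose ID differs.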
Andrew Gallant
1b6089674e deps: more updates 2018-08-26 18:42:25 -04:00
Andrew Gallant
05a0389555
ripgrep: use winapi-util for stdin_is_readable 2018-08-25 00:30:15 -04:00
Andrew Gallant
16353bad6e
deps: update various deps
This includes a new crate, winapi-util, that is now used in wincolor,
walkdir and same-file.
2018-08-25 00:19:40 -04:00
Andrew Gallant
f1e025873f
deps: update dependencies
This includes an update to walkdir 2.2.2, which includes a
`same_file_system` option.
2018-08-22 20:50:24 -04:00
Andrew Gallant
033ad2b8e4
deps: update clap
Update clap to the latest version.

Also, drop the ansi_term dependency by disabling color output in clap's
error messages.
2018-08-21 23:10:34 -04:00
Andrew Gallant
098a8ee843 deps: various patch upgrades 2018-08-21 23:05:52 -04:00
Andrew Gallant
2f3dbf5fee ignore: fix false positive in path_is_symlink
This commit fixes a bug where the first path always reported itself as
a symlink via `path_is_symlink`.

Part of this fix includes updating walkdir to 2.2.1, which also includes
a corresponding bug fix.

Fixes #984
2018-08-21 23:05:52 -04:00
Andrew Gallant
0eef05142a ripgrep: move minimum version to Rust stable
This also updates some code to make use of our more liberal versioning
requirement, including the use of crossbeam-channel instead of the MsQueue
from the older and unmaintained crossbeam 0.3. This does regrettably add
a sizable number of dependencies; however, compile times seem mostly
unaffected.

Closes #1019
2018-08-21 23:05:52 -04:00
Andrew Gallant
9df60e164e
deps: update other dependencies to latest 2018-08-20 17:34:45 -04:00
Andrew Gallant
afa06c518a
deps: update libripgrep crate versions
This prepares them for an initial 0.1.0 release.
2018-08-20 17:34:45 -04:00
Andrew Gallant
eb184d7711 tests: re-tool integration tests
This basically rewrites every integration test. We reduce the amount of
magic involved here in terms of which arguments are being passed to
ripgrep processes. To make up for the boilerplate saved by the magic,
we make the Dir (formerly WorkDir) type a bit nicer to use, along with a
new TestCommand that wraps a std::process::Command. In exchange, we get
tests that are easier to read and write.

We also run every test with the `--pcre2` flag to make sure that works,
when PCRE2 is available.
2018-08-20 07:10:19 -04:00
Andrew Gallant
bb110c1ebe ripgrep: migrate to libripgrep
This commit does the work to delete the old `grep` crate and effectively
rewrite most of ripgrep core to use the new libripgrep crates. The new
`grep` crate is now a facade that collects the various crates that make
up libripgrep.

The most complex part of ripgrep core is now arguably the translation
between command line parameters and the library options, which is
ultimately where we want to be.
2018-08-20 07:10:19 -04:00
Andrew Gallant
d9ca529356 libripgrep: initial commit introducing libripgrep
libripgrep is not any one library, but rather, a collection of libraries
that roughly separate the following key distinct phases in a grep
implementation:

  1. Pattern matching (e.g., by a regex engine).
  2. Searching a file using a pattern matcher.
  3. Printing results.

Ultimately, both (1) and (3) are defined by de-coupled interfaces, of
which there may be multiple implementations. Namely, (1) is satisfied by
the `Matcher` trait in the `grep-matcher` crate and (3) is satisfied by
the `Sink` trait in the `grep2` crate. The searcher (2) ties everything
together and finds results using a matcher and reports those results
using a `Sink` implementation.

Closes #162
2018-08-20 07:10:19 -04:00
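The three-phase split described in d9ca529356 can be sketched with two small traits and a function that ties them together. Everything below is a deliberately simplified assumption: the trait methods, `Literal`, `Collect`, and `search` are hypothetical and much narrower than the actual `Matcher` and `Sink` traits in the libripgrep crates.

```rust
/// Phase 1 (simplified): something that can find a match in a line.
trait Matcher {
    /// Returns the byte range of the first match in `line`, if any.
    fn find(&self, line: &str) -> Option<(usize, usize)>;
}

/// Phase 3 (simplified): something that receives search results.
trait Sink {
    fn matched(&mut self, line_number: usize, line: &str);
}

/// A toy matcher that looks for a literal substring.
struct Literal(String);

impl Matcher for Literal {
    fn find(&self, line: &str) -> Option<(usize, usize)> {
        line.find(self.0.as_str())
            .map(|start| (start, start + self.0.len()))
    }
}

/// A toy sink that collects matching lines with their 1-based line numbers.
struct Collect(Vec<(usize, String)>);

impl Sink for Collect {
    fn matched(&mut self, line_number: usize, line: &str) {
        self.0.push((line_number, line.to_string()));
    }
}

/// Phase 2: the searcher drives the matcher over each line and reports to the sink.
fn search<M: Matcher, S: Sink>(matcher: &M, haystack: &str, sink: &mut S) {
    for (index, line) in haystack.lines().enumerate() {
        if matcher.find(line).is_some() {
            sink.matched(index + 1, line);
        }
    }
}
```

Because `search` depends only on the two traits, a different regex engine or a different output format can be swapped in without touching the search loop.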
llogiq
ad9befbc1d deps: update bytecount to 0.3.2
PR #1003
2018-08-06 06:44:16 -04:00
Andrew Gallant
6799dcfc0e
release: 0.9.0 2018-08-03 16:13:31 -04:00
Andrew Gallant
0fdab0ec5e
grep-0.1.9 2018-08-03 16:12:08 -04:00
Andrew Gallant
74ec5b8932
deps: update termcolor and encoding_rs_io 2018-08-03 16:08:57 -04:00
Andrew Gallant
d94d99f657
ignore-0.4.3 2018-07-28 11:05:27 -04:00
Andrew Gallant
84585908ac
globset-0.4.1 2018-07-28 10:59:54 -04:00
Andrew Gallant
4dd2f8e40e
deps: update atty and winapi
This updates atty and winapi to their latest versions, including the bug
fix in atty that allows it to work with winapi 0.3.5.
2018-07-22 13:07:48 -04:00
Andrew Gallant
7a44cad599
deps: pin winapi to 0.3.4
winapi 0.3.5 changed how it represents some of its structs, which caused
a bug to surface in atty that prevents tty detection on Windows. atty
has an open PR to fix this: https://github.com/softprops/atty/pull/28

Until a new release of atty, we pin winapi to a version that works.
2018-07-22 09:31:22 -04:00
Andrew Gallant
209a125ea2
ripgrep: replace decoder with encoding_rs_io
This commit mostly moves the transcoder implementation to its own
crate: https://github.com/BurntSushi/encoding_rs_io

The new crate adds clear documentation and cleans up the implementation
to fully implement the contract of io::Read.
2018-07-21 20:36:32 -04:00
Andrew Gallant
7b6af5a177
deps: update regex to 1.0.2
And also update to regex-syntax 0.6.2.
2018-07-18 09:29:04 -04:00
Andrew Gallant
1393ce4b6b
deps: update all transitive dependencies
This updates all remaining transitive dependencies. Most changes appear
minor and there appear to be no minimum Rust version conflicts. Yay!
2018-07-17 20:34:03 -04:00
Andrew Gallant
8e358ee056
deps: bump various dependencies
Nothing major here. All patch releases. This should bring us completely
up to date with all direct dependencies.
2018-07-17 20:33:13 -04:00
Andrew Gallant
5b5f4e74d9
deps: bump encoding_rs to 0.8
This brings in performance improvements.
2018-07-17 20:29:41 -04:00
Andrew Gallant
d17ca45063
deps: update termcolor to 1.0.0 2018-07-17 18:37:02 -04:00
Andrew Gallant
5e85f2577b
deps: update to regex 1.0.1
This causes SIMD to kick in automatically when compiling with stable
Rust 1.27+.

We also update the README to describe the current state of things.

Thanks to @hartley for pointing this out:
https://twitter.com/hartley/status/1009950392862453760
2018-06-21 20:14:23 -04:00
Bastien Orivel
49f36c7dcd deps: update regex to 1.0
We retain the `simd-accel` feature on globset for backwards
compatibility, but will remove it in the next semver release.
2018-05-07 13:07:30 -04:00
Andrew Gallant
ed059559cd
deps: update to atty 0.2.9
https://github.com/softprops/atty/pull/25 was merged, so we can upgrade.
2018-04-23 19:32:39 -04:00
Andrew Gallant
6b15ce2342
deps: update remove_dir_all 2018-04-21 12:13:16 -04:00
Andrew Gallant
4c0b0c6c9d
ignore: release 0.4.2 2018-04-21 12:10:16 -04:00
Andrew Gallant
6c8b1e93d5
globset: release 0.4.0 2018-04-21 12:09:15 -04:00
Andrew Gallant
58bd0c67da deps: pin to atty 0.2.6
atty 0.2.7 (and 0.2.8) contain a regression in cygwin terminals that
prevents basic use of ripgrep, and is also the cause of the Windows CI
test failures. For now, we pin to 0.2.6, but a patch has been submitted
upstream: https://github.com/softprops/atty/pull/25
2018-04-21 12:01:11 -04:00
Andrew Gallant
0345e089aa
deps: update regex-syntax 2018-04-15 08:45:05 -04:00
Andrew Gallant
34abed597f
deps: update all dependencies
In particular, we can now drop rand 0.3.
2018-04-01 10:59:44 -04:00
Andrew Gallant
835600794f
termcolor: release 0.3.6 2018-03-26 17:28:21 -04:00
Dezhi “Andy” Fang
d7c9323a68 deps: update regex
This fixes build failures on latest nightly with SIMD features.
2018-03-17 19:33:34 -04:00
Andrew Gallant
b7d29d126f deps: update clap, atty, libc
Nothing to see here.

Note that we continue to refrain from updating tempdir, which means we are
still bringing in rand 0.4 and rand 0.3. Updating tempdir brings in an
old version of remove_dir_all, which in turn brings in winapi 0.2. No
thanks.
2018-03-13 22:55:39 -04:00
Andrew Gallant
cd08707c7c grep: upgrade to regex-syntax 0.5
This update brings with it many bug fixes:

  * Better error messages are printed overall. We also include
    an explicit call out for unsupported features like backreferences
    and look-around.
  * Regexes like `\s*{` no longer emit incomprehensible errors.
  * Unicode escape sequences, such as `\u{..}` are now supported.

For the most part, this upgrade was done in a straight-forward way. We
resist the urge to refactor the `grep` crate, in anticipation of it
being rewritten anyway.

Note that we removed the `--fixed-strings` suggestion whenever a regex
syntax error occurs. In practice, I've found that it results in a lot of
false positives, and I believe that its use is not as paramount now that
regex parse errors are much more readable.

Closes #268, Closes #395, Closes #702, Closes #853
2018-03-13 22:55:39 -04:00
Andrew Gallant
1f70e9187c deps: update regex crate
This update brings with it a new feature of the regex crate which will
now use SIMD optimizations automatically at runtime with no necessary
compile time flags. All that's needed is to enable the `unstable` feature.

Other crates, such as bytecount and encoding_rs, are still using the
old-style SIMD support, so we leave the simd-accel and avx-accel features.
However, the binaries we distribute on Github no longer have those
features enabled, which makes them truly portable.

Fixes #135
2018-03-12 23:21:42 -04:00
Andrew Gallant
9c216ad9a4
release: 0.8.1 2018-02-20 20:19:03 -05:00
Andrew Gallant
a6d09b2d42
deps: update to clap 2.30.0 2018-02-20 20:16:57 -05:00
Andrew Gallant
ab1b877c20
termcolor: release 0.3.5 2018-02-20 20:15:08 -05:00
Andrew Gallant
2b5c488814
ignore: release 0.4.1 2018-02-20 20:13:56 -05:00
Andrew Gallant
d65966efbc ignore: fix performance regression on Windows
This commit fixes a performance regression in Windows that resulted from
fallout from fixing #705. In particular, we introduced an additional
stat call for every single directory entry, which can be quite
disastrous for performance.

There is a corresponding companion PR that fixes the same bug in
walkdir: https://github.com/BurntSushi/walkdir/pull/96

Fixes #820
2018-02-20 19:50:52 -05:00
Andrew Gallant
23d1b91ead
release: 0.8.0 2018-02-11 20:22:22 -05:00
Andrew Gallant
56341973ee
ignore: release 0.4.0 2018-02-11 13:42:59 -05:00
Andrew Gallant
a431160d4c
globset: release 0.3.0 2018-02-11 13:41:36 -05:00
Andrew Gallant
5d15f49f0c
termcolor: release 0.3.4 2018-02-11 13:39:12 -05:00
Andrew Gallant
7718ee362e
wincolor: release 0.1.6 2018-02-11 13:38:00 -05:00
Andrew Gallant
739f8f596b
grep: release 0.1.8 2018-02-11 13:35:54 -05:00
Andrew Gallant
e818d7529b
deps: update several dependencies
We specifically avoid updating tempdir since it seems to have grown a
dependency on `remove_dir_all`, which in turn still uses winapi 0.2.
2018-02-11 13:31:41 -05:00
Andrew Gallant
8e93fa0e7f
deps: update regex to 0.2.6
This regex update disabled the Tuned Boyer-Moore literal searcher which
has a bug in it that isn't straight-forward to fix. We bring that update
into ripgrep with this commit.

Fixes #780, Fixes #781
2018-02-08 18:25:55 -05:00
Andrew Gallant
8cb5833ef9 argv: update clap to 2.29.4
We use the new AppSettings::AllArgsOverrideSelf to permit all flags to
be specified multiple times. This removes the need for our previous
work-around where we would enable `multiple` for every flag and then
just extract the last value when consuming clap's matches.

We also add a couple regression tests that ensure repeated switches and
flags work as expected.
2018-02-06 12:07:59 -05:00
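The "last value wins" behavior that 8cb5833ef9 obtains from clap can be illustrated without clap at all. The sketch below is not clap's implementation and does not use its API; `last_value` is a hypothetical helper that mimics the earlier work-around of accepting every occurrence of a flag and keeping only the final one.

```rust
/// Hypothetical helper: return the value that follows the last occurrence
/// of `flag` in `args`, mimicking "later flags override earlier ones".
fn last_value<'a>(args: &[&'a str], flag: &str) -> Option<&'a str> {
    let mut found = None;
    let mut iter = args.iter();
    while let Some(arg) = iter.next() {
        if *arg == flag {
            // Consume the next argument as this flag's value, if there is one.
            if let Some(value) = iter.next() {
                found = Some(*value);
            }
        }
    }
    found
}
```

For example, given `--color never --color always`, the helper yields `always`, which is the behavior users expect when a shell alias sets a default that they later override on the command line.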
Andrew Gallant
c8e755f11f deps: remove vec-map feature from clap
This removes the vec-map feature from clap. clap's README claims that
vec-map provides a small performance benefit, but I could not observe any in
ripgrep workloads.

The benefit here is that it drops a dependency.

Amazingly, this drops whole release build times for ripgrep from 68s to
33s, and debug build time also decreases from 22s to 15.5s. This was
entirely unintentional but a welcome surprise.
2018-02-04 10:40:20 -05:00
Andrew Gallant
3535047094 logger: drop env_logger
This commit updates the `log` crate to 0.4 and drops the dependency on
env_logger. In particular, the latest version of env_logger brings in
additional non-optional dependencies such as chrono that I don't think are
worth including in ripgrep.

It turns out ripgrep doesn't need any fancy logging. We just need a concept
of log levels and the ability to print to stderr. Therefore, we just roll
our own super simple logger.

This update is motivated by the persistent configuration task. In
particular, we need the ability to toggle the global log level more than
once, and this doesn't appear to be possible with older versions of the
log crate.
2018-02-04 10:40:20 -05:00
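The two requirements named in 3535047094, a notion of log levels and printing to stderr, fit in a few lines. The sketch below is an assumption-laden stand-in that uses only the standard library: ripgrep's logger is built on the `log` crate, whereas `Level`, `set_max_level`, `enabled`, and `log` here are hypothetical names chosen for illustration.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Log levels, ordered from least to most verbose.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Level {
    Error = 1,
    Warn,
    Info,
    Debug,
}

/// The global maximum level; an atomic lets it be changed more than once.
static MAX_LEVEL: AtomicUsize = AtomicUsize::new(Level::Warn as usize);

/// Change the global log level at any time.
fn set_max_level(level: Level) {
    MAX_LEVEL.store(level as usize, Ordering::Relaxed);
}

/// True if messages at `level` are currently emitted.
fn enabled(level: Level) -> bool {
    (level as usize) <= MAX_LEVEL.load(Ordering::Relaxed)
}

/// Print `msg` to stderr if `level` is enabled; returns whether it was printed.
fn log(level: Level, msg: &str) -> bool {
    if enabled(level) {
        eprintln!("{:?}: {}", level, msg);
        true
    } else {
        false
    }
}
```

Storing the level in an atomic is what allows it to be toggled repeatedly, for instance once from a configuration file and again from command line flags.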
Andrew Gallant
fe00255494
deps: bump wincolor 2018-02-03 20:38:03 -05:00
Andrew Gallant
c7fc916e6b
deps: bump walkdir (again)
walkdir 2.1.2 introduced a subtle bug on Windows when dealing with
symlinks. We update to the latest to get the fix.
2018-02-01 22:51:34 -05:00
Andrew Gallant
e36b65a11a
windows: fix OneDrive traversals
This commit fixes a bug on Windows where directory traversals were
completely broken when attempting to scan OneDrive directories that use
the "file on demand" strategy.

The specific problem was that Rust's standard library treats OneDrive
directories as reparse points instead of directories, which causes
methods like `FileType::is_file` and `FileType::is_dir` to always return
false, even when retrieved via methods like `metadata` that purport to
follow symbolic links.

We fix this by peppering our code with checks on the underlying file
attributes exposed by Windows. We consider an entry a directory if and
only if the directory bit is set on the attributes. We are careful to
make sure that the code remains the same on non-Windows platforms.

Note that we also bump the dependency on `walkdir`, which contains a
similar fix for its traversals.

This bug is recorded upstream:
https://github.com/rust-lang/rust/issues/46484

Upstream also has a pending PR:
https://github.com/rust-lang/rust/pull/47956

Fixes #705
2018-02-01 21:11:02 -05:00
Andrew Gallant
11ad7ab204
ignore/deps: update walkdir
This commit updates to the latest walkdir release, which fixes a bug on
Windows where ripgrep would panic if it was told to traverse a directory
while following symlinks *and* if opening one of those symlinks failed.

Fixes #633
2018-02-01 17:09:06 -05:00
llogiq
e05023b406 deps: update bytecount
This improves performance with current nightly rustc.
2018-01-30 16:34:30 -05:00
Balaji Sivaraman
f007f940c5 search: add support for searching compressed files
This commit adds opt-in support for searching compressed files during
recursive search. This behavior is only enabled when the
`-z/--search-zip` flag is passed to ripgrep. When enabled, a limited set
of common compression formats are recognized via file extension, and a
new process is spawned to perform the decompression. ripgrep then
searches the stdout of that spawned process.

Closes #539
2018-01-30 09:13:53 -05:00
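The approach described in f007f940c5 (pick a decompressor by file extension, spawn it, search its stdout) might look roughly like the sketch below. The extension table and the `-d -c` arguments are assumptions made for illustration, not a copy of ripgrep's list, and `decompression_command` and `spawn_decompressor` are hypothetical names.

```rust
use std::io;
use std::path::Path;
use std::process::{Child, Command, Stdio};

/// Arguments asking the tool to decompress to stdout (assumed to be
/// accepted by gzip, bzip2 and xz alike).
const DECOMPRESS_TO_STDOUT: &[&str] = &["-d", "-c"];

/// Hypothetical table: map a file extension to a decompression program.
fn decompression_command(path: &Path) -> Option<(&'static str, &'static [&'static str])> {
    match path.extension()?.to_str()? {
        "gz" => Some(("gzip", DECOMPRESS_TO_STDOUT)),
        "bz2" => Some(("bzip2", DECOMPRESS_TO_STDOUT)),
        "xz" => Some(("xz", DECOMPRESS_TO_STDOUT)),
        _ => None,
    }
}

/// Spawn the decompressor with a piped stdout that the caller can search.
/// Returns `None` when the extension is not recognized.
fn spawn_decompressor(path: &Path) -> Option<io::Result<Child>> {
    let (program, args) = decompression_command(path)?;
    Some(
        Command::new(program)
            .args(args)
            .arg(path)
            .stdout(Stdio::piped())
            .spawn(),
    )
}
```

A caller would read from the child's `stdout` handle exactly as it would from a file, then wait on the child to collect its exit status.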
Andrew Gallant
ef9e17d28a
deps: bump memmap to 0.6.2
This removes the last dependency that required winapi 0.2. ripgrep now
only depends on winapi 0.3.
2018-01-29 17:09:01 -05:00
Igor Gnatenko
50616935a9 deps: update bytecount to 0.3
Signed-off-by: Igor Gnatenko <i.gnatenko.brain@gmail.com>
2018-01-09 07:09:29 -05:00
Stjepan Glavina
fbc1e7fa18 Update crossbeam to 0.3.2 2018-01-06 09:05:51 -05:00
Igor Gnatenko
5e73075ef5 deps: bump lazy_static to 1 2017-12-30 16:50:18 -05:00
Andrew Gallant
1b42c02489 deps: update same-file dep
The same-file update includes a migration to winapi 0.3.
2017-12-30 16:50:18 -05:00
Steffen Butzer
0d03145293 wincolor: migrate to winapi 0.3 2017-12-30 16:50:18 -05:00
Andrew Gallant
f8162d2707 deps: update all deps 2017-12-30 16:50:18 -05:00
Andrew Gallant
e044cfb33f deps: update to latest clap release
This also bumps the minimum Rust version required to 1.20.
2017-12-30 16:50:18 -05:00
Andrew Gallant
7dd1194a97 deps: update to latest regex crate
The regex update fixes the Rust nightly build failure by in turn updating
its simd dependency to 2.x.

The regex update also includes a literal optimization that uses Tuned
Boyer Moore.

Fixes #617
2017-12-30 16:50:18 -05:00
Igor Gnatenko
373e0595e6 bump bytecount to 0.2
Signed-off-by: Igor Gnatenko <i.gnatenko.brain@gmail.com>
2017-11-22 09:52:47 -05:00
Dan Burkert
263e8f92b9 Update to memmap 0.6
`memmap` 0.6.0 introduces major API changes in anticipation of a 1.0
release. See https://github.com/danburkert/memmap-rs/releases/tag/0.6.0
for more information. CC danburkert/memmap-rs#33.
2017-11-22 06:57:15 -05:00
Andrew Gallant
c4e1945384
cargo: bump to 0.7.1 2017-10-22 10:33:09 -04:00
Andrew Gallant
8c8c83a1f8
deps: bump ignore to 0.3.1 2017-10-22 10:31:42 -04:00
Andrew Gallant
efa4de8126
cargo: bump to 0.7.0 2017-10-21 22:40:10 -04:00
Andrew Gallant
08060a2105
deps: update everything 2017-10-21 22:40:09 -04:00
Andrew Gallant
cd575d99f8
ignore: upgrade to walkdir 2
The uninteresting bits of this commit involve mechanical changes for
updates to walkdir 2. The more interesting bits of this commit are the
breaking changes, although none of them should require any significant
change on users of this library. The breaking changes are as follows:

* `DirEntry::path_is_symbolic_link` has been renamed to
  `DirEntry::path_is_symlink`. This matches the conventions in the
  standard library, and also the corresponding name change in walkdir.
* Removed the `From<walkdir::Error> for ignore::Error` impl. This was
  intended to only be used internally, but was the only thing that
  made `walkdir` a public dependency of `ignore`. Therefore, we remove
  it since it seems unnecessary.
* Renamed `WalkBuilder::sort_by` to `WalkBuilder::sort_by_file_name`,
  and changed the type of the comparator from

    Fn(&OsString, &OsString) -> cmp::Ordering + 'static

  to

    Fn(&OsStr, &OsStr) -> cmp::Ordering + Send + Sync + 'static

  The corresponding change in `walkdir` retains the `sort_by` name, but
  gives the comparator a pair of `&DirEntry` values instead of a pair
  of `&OsStr` values. Ideally, `ignore` would hand off its own pair of
  `&ignore::DirEntry` values, but this requires more design work. So for
  now, we retain previous functionality, but leave room to make a proper
  `sort_by` method.

[breaking-change]
2017-10-21 22:40:09 -04:00
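The new comparator type from cd575d99f8 can be exercised with a small stand-in. `Walker` below is hypothetical and is not `ignore::WalkBuilder`; it only reuses the bound `Fn(&OsStr, &OsStr) -> cmp::Ordering + Send + Sync + 'static` quoted in the commit to show what a caller's closure has to satisfy. The commit does not say why `Send + Sync` was added; presumably it lets the comparator be shared across threads.

```rust
use std::cmp::Ordering;
use std::ffi::{OsStr, OsString};

/// Hypothetical stand-in for a walker that can sort entries by file name.
struct Walker {
    cmp: Option<Box<dyn Fn(&OsStr, &OsStr) -> Ordering + Send + Sync + 'static>>,
}

impl Walker {
    fn new() -> Walker {
        Walker { cmp: None }
    }

    /// Accepts a comparator with the same bound as the one in the commit.
    fn sort_by_file_name<F>(mut self, cmp: F) -> Walker
    where
        F: Fn(&OsStr, &OsStr) -> Ordering + Send + Sync + 'static,
    {
        self.cmp = Some(Box::new(cmp));
        self
    }

    /// Sort the given names with the configured comparator, if any.
    fn sorted(&self, mut names: Vec<OsString>) -> Vec<OsString> {
        if let Some(cmp) = &self.cmp {
            names.sort_by(|a, b| cmp(a.as_os_str(), b.as_os_str()));
        }
        names
    }
}
```

A closure such as `|a, b| b.cmp(a)` (reverse order) satisfies the bound because it captures nothing, so it is trivially `Send`, `Sync`, and `'static`.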
Andrew Gallant
1267f01c24
deps: upgrade to memchr 2 2017-10-21 22:40:09 -04:00
Henri Sivonen
af77dd55a2 Update encoding_rs to 0.7.0 2017-08-28 08:14:33 -04:00
Jack O'Connor
3065a8c9c8 restore the default SIGPIPE behavior as a temporary workaround
See https://github.com/BurntSushi/ripgrep/issues/200.
2017-08-27 15:01:05 -04:00
Andrew Gallant
208c11af56
deps: bump termcolor in lock file 2017-08-27 11:34:58 -04:00
Andrew Gallant
e10544f819
0.6.0 2017-08-23 19:54:50 -04:00
Andrew Gallant
36c16eb00c
bump deps 2017-08-23 19:52:13 -04:00
Henri Sivonen
fe7fe74b0a Pass the simd-accel feature to encoding_rs 2017-08-20 08:42:31 -04:00
Vurich
b3a9c34515 Remove unused libc dependency 2017-08-08 07:03:58 -04:00
Andrew Gallant
972ec1adc6
bump clap to 2.26
Fixes #482
2017-07-30 18:04:49 -04:00
Igor Gnatenko
a2d4c03c71 bump encoding_rs to 0.6 2017-07-30 18:00:50 -04:00
Andrew Gallant
92e5fad27d
ignore-0.2.2 2017-07-17 17:56:33 -04:00
Andrew Gallant
1c03298903
bump ignore version, take 2 2017-07-12 22:26:59 -04:00
Andrew Gallant
9e51b18ac7
bump wincolor dep 2017-06-19 13:46:43 -04:00
Andrew Gallant
44c03f58bc
bump deps, redux
This only bumps the regex dependency. The new clap version causes a bump
in unicode-segmentation, which in turn requires Rust 1.15, which is
above ripgrep's currently supported minimum Rust version of 1.12.
2017-05-21 15:56:56 -04:00
Andrew Gallant
d1a6ab922e
Revert "bump deps"
This reverts commit b860fa3acd.
2017-05-21 15:52:58 -04:00
Andrew Gallant
b860fa3acd
bump deps 2017-05-21 12:33:13 -04:00
Andrew Gallant
5a666b042d
bump ripgrep, ignore, globset
The `ignore` and `globset` crates both got breaking changes in the
course of fixing #444, so increase 0.x to 0.(x+1).
2017-05-11 19:12:20 -04:00
Andrew Gallant
0b685c8429 deps: update clap to 2.24
Fixes #442
2017-05-08 19:24:11 -04:00
Andrew Gallant
ac1c95a6d9 0.5.1 2017-04-09 09:47:00 -04:00
Andrew Gallant
685b431d80 bump deps 2017-04-09 09:46:37 -04:00
Andrew Gallant
487713aa34 bump ignore 2017-04-09 09:45:00 -04:00
Kevin K
0c298f60a6 updates clap and removes home rolled -h/--help distinction
This commit updates clap to v2.23.0

The update contained a bug fix in clap that results in broken code in
ripgrep. ripgrep was relying on the bug, but this commit fixes that
issue. The bug centered around not being able to override the
auto-generated help message by supplying a flag with a long of `help`.

Normally, supplying a flag with a long of `help` means whenever the user
passes `--help`, the consuming code (e.g. ripgrep) is responsible for
displaying the help message. However, due to the bug in clap, this wasn't
necessary for ripgrep to do unless the user passed `-h`. With the bug
fixed, when the user passed `--help`, clap expected ripgrep to display
the help, yet ripgrep expected clap to display it. This has been fixed
in this commit of ripgrep.

All well now!

v2.23.0 also brings the ability to use `Arg::help` or `Arg::long_help`,
allowing one to distinguish between `-h` and `--help`. This commit
leaves all doc strings in the `lazy_static!` hashmap, however, only for
aesthetic reasons.

This means all home rolled handling of `-h`/`--help` has been removed
from ripgrep, yet functionality *and* appearances are 100% the same.
2017-04-05 11:38:58 -04:00
Roman Proskuryakov
1425d6735e Bump clap to 2.22.2
Fixes #426, #418
2017-03-31 15:56:10 -04:00
Andrew Gallant
33c95d2919 bump deps 2017-03-30 12:33:31 -04:00
Andrew Gallant
08c017330f bump termcolor dep 2017-03-15 07:15:39 -04:00
Andrew Gallant
5cb4bb9ea0 bump ripgrep version in Cargo.lock 2017-03-14 15:09:24 -04:00
Andrew Gallant
c648eadbaa Bump and update deps. 2017-03-12 21:33:13 -04:00
Andrew Gallant
8bbe58d623 Add support for additional text encodings.
This includes, but is not limited to, UTF-16, latin-1, GBK, EUC-JP and
Shift_JIS. (Courtesy of the `encoding_rs` crate.)

Specifically, this feature enables ripgrep to search files that are
encoded in an encoding other than UTF-8. The list of available encodings
is tied directly to what the `encoding_rs` crate supports, which is in
turn tied to the Encoding Standard. The full list of available encodings
can be found here: https://encoding.spec.whatwg.org/#concept-encoding-get

This pull request also introduces the notion that text encodings can be
automatically detected on a best effort basis. Currently, the only
support for this is checking for a UTF-16 BOM. In all other cases, a
text encoding of `auto` (the default) implies a UTF-8 or ASCII
compatible source encoding. When a text encoding is otherwise specified,
it is unconditionally used for all files searched.

Since ripgrep's regex engine is fundamentally built on top of UTF-8,
this feature works by transcoding the files to be searched from their
source encoding to UTF-8. This transcoding only happens when:

1. `auto` is specified and a non-UTF-8 encoding is detected.
2. A specific encoding is given by end users (including UTF-8).

When transcoding occurs, errors are handled by automatically inserting
the Unicode replacement character. In this case, ripgrep's output is
guaranteed to be valid UTF-8 (excluding non-UTF-8 file paths, if they
are printed).

In all other cases, the source text is searched directly, which implies
an assumption that it is at least ASCII compatible, but where UTF-8 is
most useful. In this scenario, encoding errors are not detected. In this
case, ripgrep's output will match the input exactly, byte-for-byte.

This design may not be optimal in all cases, but it has some advantages:

1. The happy path ("UTF-8 everywhere") remains happy. I have not been
   able to witness any performance regressions.
2. In the non-UTF-8 path, implementation complexity is kept relatively
   low. The cost here is transcoding itself. A potentially superior
   implementation might build decoding of any encoding into the regex
   engine itself. In particular, the fundamental problem with
   transcoding everything first is that literal optimizations are nearly
   negated.

Future work should entail improving the user experience. For example, we
might want to auto-detect more text encodings. A more elaborate UX
might permit end users to specify multiple text encodings,
although this seems hard to pull off in an ergonomic way.

Fixes #1
2017-03-12 19:54:48 -04:00
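The `auto` behavior described in 8bbe58d623 (detect a UTF-16 BOM, transcode to UTF-8, substitute the Unicode replacement character on errors) can be sketched with the standard library alone. This is an illustration under assumptions: ripgrep relies on `encoding_rs` for the real work, and `transcode_if_utf16` is a hypothetical helper that covers only the BOM-detected case, not an explicitly requested encoding.

```rust
/// Hypothetical helper: if `bytes` begins with a UTF-16 BOM, transcode the
/// rest to UTF-8, replacing malformed sequences with U+FFFD. Returns `None`
/// when no UTF-16 BOM is found, meaning the raw bytes are searched as-is.
fn transcode_if_utf16(bytes: &[u8]) -> Option<String> {
    let (body, little_endian) = if bytes.starts_with(&[0xFF, 0xFE]) {
        (&bytes[2..], true)
    } else if bytes.starts_with(&[0xFE, 0xFF]) {
        (&bytes[2..], false)
    } else {
        return None;
    };

    // Assemble 16-bit code units. A trailing odd byte is mapped to a lone
    // surrogate so that decoding reports it as an error.
    let units = body.chunks(2).map(|pair| match pair {
        &[a, b] if little_endian => u16::from_le_bytes([a, b]),
        &[a, b] => u16::from_be_bytes([a, b]),
        _ => 0xDC00,
    });

    let text: String = char::decode_utf16(units)
        .map(|unit| unit.unwrap_or(char::REPLACEMENT_CHARACTER))
        .collect();
    Some(text)
}
```

Returning `None` for input without a BOM mirrors the commit's "happy path": UTF-8 and ASCII-compatible files skip transcoding entirely, so they pay no extra cost.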