1
0
mirror of https://github.com/BurntSushi/ripgrep.git synced 2024-12-12 19:18:24 +02:00
Commit Graph

326 Commits

Author SHA1 Message Date
Ed Page
9f7c2ebc09 ignore: allow parallel walker to borrow data
This makes it so the caller can more easily refactor from
single-threaded to multi-threaded walking. If they want to support both,
this makes it easier to do so with a single initialization code-path. In
particular, it side-steps the need to put everything into an `Arc`.

This is not a breaking change because it strictly increases the number
of allowed inputs to `WalkParallel::run`.

Closes #1410, Closes #1432
2020-02-17 17:16:28 -05:00
Andrew Gallant
cd8ec38a68 grep-regex: add fast path for -w/--word-regexp
Previously, ripgrep would always defer to the regex engine's capturing
matches in order to implement word matching. Namely, ripgrep would
determine the correct match offsets via a capturing group, since the
word regex is itself generated from the user supplied regex.

Unfortunately, the regex engine's capturing mode is still fairly slow,
so this commit adds a fast path to avoid capturing mode in the vast
majority of cases. See comments in the code for details.
2020-02-17 17:16:28 -05:00
Ximin Luo
f8418c6a52 explicitly declare lazy_static dependency
`benches/bench.rs` uses lazy_static but Cargo.toml does not declare a
dependency on it. This causes rustc to use its own internal private
copy instead. Sometimes this causes unintuitive errors like this Debian
bug: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=942243

The underlying issue is https://github.com/rust-lang/rust#27812 but it
can be avoided by explicitly declaring the dependency, which you are
supposed to do anyways.

Closes #1435
2020-02-17 17:16:28 -05:00
Andrew Gallant
74d1fe59e9
deps: update everything 2020-01-30 18:33:40 -05:00
Andrew Gallant
9fd1e202e0
deps: update regex, regex-syntax and aho-corasick
Notably, this brings in a bug fix reported by @okdana:
https://github.com/rust-lang/regex/issues/640
2020-01-30 18:32:56 -05:00
Andrew Gallant
8bdf84e3a8
deps: update everything 2020-01-16 19:47:23 -05:00
Andrew Gallant
5a6e17fcc1
deps: various updates
Most of these updates (sans thread_local) are from crates I maintain
that have seen updates recently.

Notably, this includes a bump to `termcolor 1.1.0` which includes
support for respecting `NO_COLOR`. This commit therefore means that
ripgrep now supports `NO_COLOR`.

As an added bonus, we drop a dependency on Windows. (Although the total
amount of code compiled remains the same.)

Closes #1186
2020-01-11 10:09:10 -05:00
Andrew Gallant
00bfcd14a6
ignore-0.4.11 2020-01-10 15:08:27 -05:00
Andrew Gallant
837fb5e21f deps: update to crossbeam-channel 0.4
Closes #1427
2020-01-10 15:07:47 -05:00
Andrew Gallant
2e1815606e deps: update to bytecount 0.6
Looks like there aren't any major changes other than dependency updates.
2020-01-10 15:07:47 -05:00
Andrew Gallant
cb2f6ddc61 deps: update to thread_local 1.0
We also update the pcre2 and regex dependencies, which removes any other
lingering uses of thread_local 0.3.
2020-01-10 15:07:47 -05:00
Andrew Gallant
bd7a42602f deps: bump to base64 0.11 2020-01-10 15:07:47 -05:00
Andrew Gallant
528ce56e1b deps: run cargo update
The only new dependency is an unused target specific dependency hermit
via the atty crate.
2020-01-10 15:07:47 -05:00
Andrew Gallant
e14f9195e5
deps: update everything 2019-08-28 20:18:47 -04:00
Andrew Gallant
ef0e7af56a
deps: update bstr to 0.2.7
The new bstr release contains a small performance bug fix where some
trivial methods weren't being inlined.
2019-08-11 10:41:05 -04:00
Andrew Gallant
5c4584aa7c
grep-regex-0.1.5 2019-08-06 09:51:13 -04:00
Andrew Gallant
0972c6e7c7
grep-searcher-0.1.6 2019-08-06 09:50:52 -04:00
Andrew Gallant
0a372bf2e4
deps: update ignore 2019-08-06 09:50:35 -04:00
Andrew Gallant
31807f805a
deps: drop tempfile
We were only using it to create temporary directories for `ignore`
tests, but it pulls in a bunch of dependencies and we don't really need
randomness. So just use our own simple wrapper instead.
2019-08-06 09:46:05 -04:00
Andrew Gallant
4de227fd9a
deps: update everything
Mostly this just updates regex and its assorted dependencies. This does
drop utf8-ranges and ucd-util, in accordance with changes to
regex-syntax and regex.
2019-08-05 13:50:55 -04:00
Andrew Gallant
e402d6c260
ripgrep: release 11.0.2 2019-08-01 18:02:15 -04:00
Andrew Gallant
709ca91f50
ignore: release 0.4.9 2019-08-01 17:48:37 -04:00
Andrew Gallant
9c220f9a9b
grep-regex: release 0.1.4 2019-08-01 17:47:45 -04:00
Andrew Gallant
9085bed139
grep-matcher: release 0.1.3 2019-08-01 17:46:59 -04:00
Andrew Gallant
b5e5979ff1
deps: update everything
This drops `spin` and `autocfg`, yay.
2019-08-01 17:42:38 -04:00
Andrew Gallant
08ae4da2b7
deps: update them
There are some nice removals. It looks like rand has slimmed down, and
smallvec is gone now as well.
2019-07-25 07:52:33 -04:00
Andrew Gallant
7ac95c1f50
deps: bump ignore 2019-07-24 12:56:47 -04:00
Andrew Gallant
785c1f1766
release: globset, grep-cli, grep-printer, grep-searcher 2019-06-26 16:53:30 -04:00
Andrew Gallant
8b734cb490
deps: update everything 2019-06-26 16:51:06 -04:00
Andrew Gallant
b93762ea7a
bstr: update everything to bstr 0.2 2019-06-26 16:47:33 -04:00
Andrew Gallant
50bcb7409e
deps: update everything 2019-06-16 18:38:45 -04:00
Andrew Gallant
9dcfd9a205 deps: bump pcre2-sys to 0.2.1
This brings in a bug fix that no longer tries to run `git` to update the
submodule if the `git` command doesn't exist.

This is useful is more restricted build contexts where `git` isn't
installed. Such as in the docker image used for running `cross`.
2019-04-25 11:12:14 -04:00
Andrew Gallant
03bf37ff4a
alloc: use jemalloc when building with musl
It turns out that musl's allocator is slow enough to cause a fairly
noticeable performance regression when ripgrep is built as a static
binary with musl. We fix this by using jemalloc when building with musl.

We continue to use the default system allocator in all other scenarios.
Namely, glibc's allocator doesn't noticeably regress performance compared
to jemalloc. But we could add more targets to this logic if other
system allocators (macOS, Windows) prove to be slow.

This wasn't necessary before because rustc recently stopped using jemalloc
by default.

Fixes #1268
2019-04-24 17:21:38 -04:00
Andrew Gallant
973de50c9e
ripgrep: release 11.0.1, take 2 2019-04-16 13:11:28 -04:00
Andrew Gallant
fdde2bcd38
deps: update regex to 1.1.6
This brings in a fix for a regression introduced in ripgrep 11.

Fixes #1247
2019-04-16 08:34:30 -04:00
Andrew Gallant
d7f57d9aab
ripgrep: release 11.0.0 2019-04-15 18:09:40 -04:00
Andrew Gallant
1a2a24ea74
grep: release 0.2.4 2019-04-15 18:03:46 -04:00
Andrew Gallant
d66610b295
grep-cli: release 0.1.2 2019-04-15 18:02:44 -04:00
Andrew Gallant
019ae1989b
grep-printer: release 0.1.2 2019-04-15 18:00:49 -04:00
Andrew Gallant
36d3f235dc
grep-searcher: release 0.1.4 2019-04-15 17:59:22 -04:00
Andrew Gallant
79018eb693
grep-pcre2: release 0.1.3 2019-04-15 17:57:03 -04:00
Andrew Gallant
44cd344438
grep-regex: release 0.1.3 2019-04-15 17:56:04 -04:00
Andrew Gallant
e493e54b9b
grep-matcher: release 0.1.2 2019-04-15 17:53:29 -04:00
Andrew Gallant
8e8215aa65
ignore: release 0.4.7 2019-04-15 17:50:37 -04:00
Andrew Gallant
e79085e9e4
release: globset 0.4.3 2019-04-15 14:07:03 -04:00
Andrew Gallant
9952ba2068 deps: update glob dev-dependency 2019-04-14 19:29:27 -04:00
Andrew Gallant
b751758d60 deps: update everything 2019-04-14 19:29:27 -04:00
Andrew Gallant
f3646242cc deps: use pcre2 0.2.0
This comes with PCRE 10.32 and a few new options we'll use in subsequent
commits.
2019-04-14 19:29:27 -04:00
Andrew Gallant
09108b7fda regex: make multi-literal searcher faster
This makes the case of searching for a dictionary of a very large number
of literals much much faster. (~10x or so.) In particular, we achieve this
by short-circuiting the construction of a full regex when we know we have
a simple alternation of literals. Building the regex for a large dictionary
(>100,000 literals) turns out to be quite slow, even if it internally will
dispatch to Aho-Corasick.

Even that isn't quite enough. It turns out that even *parsing* such a regex
is quite slow. So when the -F/--fixed-strings flag is set, we short
circuit regex parsing completely and jump straight to Aho-Corasick.

We aren't quite as fast as GNU grep here, but it's much closer (less than
2x slower).

In general, this is somewhat of a hack. In particular, it seems plausible
that this optimization could be implemented entirely in the regex engine.
Unfortunately, the regex engine's internals are just not amenable to this
at all, so it would require a larger refactoring effort. For now, it's
good enough to add this fairly simple hack at a higher level.

Unfortunately, if you don't pass -F/--fixed-strings, then ripgrep will
be slower, because of the aforementioned missing optimization. Moreover,
passing flags like `-i` or `-S` will cause ripgrep to abandon this
optimization and fall back to something potentially much slower. Again,
this fix really needs to happen inside the regex engine, although we
might be able to special case -i when the input literals are pure ASCII
via Aho-Corasick's `ascii_case_insensitive`.

Fixes #497, Fixes #838
2019-04-07 19:11:03 -04:00
Andrew Gallant
743d64f2e4 deps: update to clap 2.33 2019-04-06 10:35:08 -04:00
lesnyrumcajs
5962abc465 searcher: add option to disable BOM sniffing
This commit adds a new encoding feature where the -E/--encoding flag
will now accept a value of 'none'. When given this value, all encoding
related machinery is disabled and ripgrep will search the raw bytes of
the file, including the BOM if it's present.

Closes #1207, Closes #1208
2019-04-06 10:35:08 -04:00
Andrew Gallant
77439f99a4 deps: add bstr to Cargo.lock 2019-04-05 23:24:08 -04:00
Andrew Gallant
cd9815cb37
deps: update to aho-corasick 0.7
We do the simplest possible change to migrate to the new version.

Fixes #1228
2019-04-03 13:51:26 -04:00
Andrew Gallant
3f22c3a658
deps: update everything
This updates all dependencies to their latest versions.

We tolerate a duplicative aho-corasick for now, which we will fix in the
next commit.
2019-04-03 13:07:26 -04:00
Andrew Gallant
0913972104
deps: bump encoding_rs_io
This brings in a new API for disabling BOM sniffing.

This is part of the work toward completing
https://github.com/BurntSushi/ripgrep/issues/1207
2019-03-03 16:36:34 -05:00
Andrew Gallant
f19b84fb23
regex: bump regex dep to fix match bug
See

* 661bf53d5b
* edf45e6f5f

for details on the bug fix, which was in the regex engine.

Fixes #1203
2019-02-27 17:42:14 -05:00
Andrew Gallant
1c7c4e6640
deps: update tempfile 2019-02-21 16:32:17 -05:00
Andrew Gallant
69c5e3938d
deps: bump smallvec
This gets rid of the unmaintained crates `unreachable` and `void`. Yay!
2019-02-21 16:31:48 -05:00
Andrew Gallant
d9cf05ad50
deps: update to aho-corasick 0.6.10
This brings in a fix for this bug:
https://github.com/BurntSushi/aho-corasick/issues/37

Fixes #1079
2019-02-16 11:39:33 -05:00
Andrew Gallant
af8b6caebb
deps: update various dependencies 2019-02-16 09:39:42 -05:00
Andrew Gallant
c84cfb6756
grep-regex-0.1.2 2019-02-16 09:30:06 -05:00
Andrew Gallant
8c95290ff6
deps: miscellaneous updates 2019-02-10 07:45:08 -05:00
Andrew Gallant
d6feeb7ff2
grep-searcher-0.1.3 2019-02-10 07:42:37 -05:00
Andrew Gallant
626ed00c19
searcher: revert big-endian patch
This undoes the patch to stop using bytecount on big-endian
architectures. In particular, we bump our bytecount dependency to the
latest release, which has a fix.

This reverts commit a4868b8835.

Fixes #1144 (again), Closes #1194
2019-02-10 07:40:32 -05:00
Andrew Gallant
fc3cf41247
grep-searcher-0.1.2 2019-02-09 16:13:07 -05:00
Andrew Gallant
de0bc78982
deps: bump encoding_rs to 0.8.16
This brings in an updated `encoding_rs` crate that uses `packed_simd`,
which compiles on the latest nightly. Compilation times do appear to be
impacted significantly though.

Fixes #1175 (again)
2019-02-07 17:05:14 -05:00
Andrew Gallant
f768796e4f
deps: update other deps 2019-01-29 13:08:56 -05:00
Andrew Gallant
da0c0c4705
deps: update to crossbeam-channel 0.3.8
This drops dependencies on parking_lot and rand from ripgrep.

(rand is still used for tests.)
2019-01-29 13:07:37 -05:00
Andrew Gallant
cc93db3b18
cargo: include auto-generated message
This is going to be annoying for a while if one switches between the
latest nightly compiler and older compilers. Sigh.
2019-01-29 13:04:40 -05:00
Andrew Gallant
f158a42a71
ignore: correctly detect hidden files on Windows
This commit fixes a bug where ripgrep only treated files beginning with
a `.` as hidden. On Windows, we continue this tradition, but
additionally check whether a file has the special Windows "hidden"
attribute set. If so, we treat it as a hidden file.

In order to make this work without an additional stat call, we had to
rearrange some of the plumbing from the directory traverser.

Fixes #1154
2019-01-27 12:11:52 -05:00
Andrew Gallant
e99b6bda0e
deps: bump regex-syntax to 0.6.5
This is necessary for the use of the new is_line_anchored_{start,end}
APIs.
2019-01-26 12:20:02 -05:00
Andrew Gallant
276e2c9b9a
searcher: always strip BOM
This fixes a bug where a BOM prefix was included. While this was somewhat
intentional in order to have a faithful "UTF8 passthru" option, in
practice, this causes problems such as breaking patterns like `^` in a
really non-obvious way.

The actual fix was to add a new API to encoding_rs_io, which this commit
brings in.

Fixes #1163
2019-01-25 17:18:57 -05:00
Andrew Gallant
47833b9ce7
deps: update removal of grep devdeps 2019-01-23 20:14:37 -05:00
Andrew Gallant
1e9ee2cc85 deps: update memmap 2019-01-19 10:44:30 -05:00
Andrew Gallant
968491f8e9 deps: update to bytecount 0.5
bytecount now uses runtime dispatch for enabling SIMD, which means we can
no longer need the avx-accel features. We remove it from ripgrep since the
next release will be a minor version bump, but leave them as no-ops for
the crates that previously used it.
2019-01-19 10:44:30 -05:00
Andrew Gallant
63b0f31a22 deps: update various dependencies
We also increase the MSRV to 1.32, the current stable release, which sets
the stage for migrating to Rust 2018.
2019-01-19 10:44:30 -05:00
Andrew Gallant
17ef4c40f3
ignore-0.4.6 2018-12-30 08:46:09 -05:00
Andrew Gallant
b3c5773266
deps: bump ignore 2018-12-30 08:43:18 -05:00
Andrew Gallant
b45b2f58ea
deps: update most other dependencies
This commit is the result of doing:

  $ cargo update
  $ cargo update -p encoding_rs --precise 0.8.10

where the latter line prevents encoding_rs from updating to 0.8.11 (or
newer). In particular, the 0.8.11 release increased the minimum Rust
version to 1.29, where as ripgrep 0.10.x is still on 1.28. We stay on an
older version for now until ripgrep is ready to move to 0.11.x.
2018-12-15 08:42:14 -05:00
Andrew Gallant
662a9bc73d
deps: update to crossbeam-channel 0.3
This also requires corresponding updates to both rand and rand_core. Doing
an update of rand without doing an update of rand_core results in
compilation errors because two distinct versions of rand_core are included
in the build, and the traits they expose are distinct and incompatible.

We also switch over to using tempfile instead of tempdir, which drops the
last remaining thing keeping rand 0.4 in the build.

Fixes #1141, Fixes #1142
2018-12-15 08:40:04 -05:00
Andrew Gallant
401add0a99
deps: update regex and regex-syntax
This brings in some new Unicode properties, such as \p{Emoji}.

It is now also technically possible construct a regex that recognizes
grapheme clusters.
2018-12-09 16:33:37 -05:00
Andrew Gallant
fb62266620
deps: update encoding_rs
This commit bumps the version of encoding_rs to use the latest release.
This appears to fix a panic in UTF-16 decoding.

Fixes #1089
2018-10-22 06:50:35 -04:00
Andrew Gallant
ba533f390e grep-searcher: update to encoding_rs_io 0.1.3
This update includes a work-around for a presumed bug in encoding_rs
that causes a panic:
https://github.com/hsivonen/encoding_rs/issues/34

Specifically, to reproduce this in ripgrep, one can run the following:

    $ curl -LO https://cache.ruby-lang.org/pub/ruby/2.5/ruby-2.5.1.tar.gz
    $ tar xf ruby-2.5.1.tar.gz
    $ rg ZZZZZ ruby-2.5.1/test/rexml/data/t63-2.svg
    thread 'main' panicked at 'index out of bounds: the len is 1 but the index is 1'

Fixes #1052
2018-09-25 16:56:04 -04:00
Andrew Gallant
eb18da0450
pcre2: use jit_if_available
This will allow PCRE2 to fall back to non-JIT matching when running on
platforms without JIT support.

ref https://github.com/BurntSushi/rust-pcre2/issues/3
2018-09-08 17:12:14 -04:00
Andrew Gallant
d14f0b37d6
deps: update versions for all crates
I don't think every change here is needed, but this ensures we're using
the latest version of every direct dependency.
2018-09-07 14:00:22 -04:00
Andrew Gallant
3ddc3c040f
deps: minor updates 2018-09-07 13:03:01 -04:00
Andrew Gallant
0e2f8f7b47
grep: add clap and regex dev dependencies to grep
These are (or will be) used in grep's examples.
2018-09-07 12:06:05 -04:00
Andrew Gallant
83dff33326
deps: update various deps 2018-09-04 23:29:22 -04:00
Andrew Gallant
003c3695f4
deps: update grep version 2018-09-04 23:29:05 -04:00
Andrew Gallant
4846d63539 grep-cli: introduce new grep-cli crate
This commit moves a lot of "utility" code from ripgrep core into
grep-cli. Any one of these things might not be worth creating a new
crate, but combining everything together results in a fair number of a
convenience routines that make up a decent sized crate.

There is potentially more we could move into the crate, but much of what
remains in ripgrep core is almost entirely dealing with the number of
flags we support.

In the course of doing moving things to the grep-cli crate, we clean up
a lot of gunk and improve failure modes in a number of cases. In
particular, we've fixed a bug where other processes could deadlock if
they write too much to stderr.

Fixes #990
2018-09-04 23:18:55 -04:00
Andrew Gallant
04518e32e7
deps: update other crates 2018-08-30 23:03:07 -04:00
Andrew Gallant
f2eaf5b977
deps: update termcolor for perf tweaks 2018-08-30 22:57:01 -04:00
Andrew Gallant
f9ce7a84a8 ignore: add 'same_file_system' option
This commit adds a 'same_file_system' option to the walk builder. For
single threaded walking, it defers to the walkdir crate, which has the
same option. The bulk of this commit implements this flag for the parallel
walker. We add one very feeble test for this.

The parallel walker is now officially a complete mess.

Closes #321
2018-08-26 18:42:25 -04:00
Andrew Gallant
1b6089674e deps: more updates 2018-08-26 18:42:25 -04:00
Andrew Gallant
05a0389555
ripgrep: use winapi-util for stdin_is_readable 2018-08-25 00:30:15 -04:00
Andrew Gallant
16353bad6e
deps: update various deps
This includes a new crate, winapi-util, that is now used in wincolor,
walkdir and same-file.
2018-08-25 00:19:40 -04:00
Andrew Gallant
f1e025873f
deps: update dependencies
This includes an update to walkdir 2.2.2, which includes a
`same_file_system` option.
2018-08-22 20:50:24 -04:00
Andrew Gallant
033ad2b8e4
deps: update clap
Update clap to the latest version.

Also, drop the ansi_term dependency by disabling color output in clap's
error messages.
2018-08-21 23:10:34 -04:00
Andrew Gallant
098a8ee843 deps: various patch upgrades 2018-08-21 23:05:52 -04:00
Andrew Gallant
2f3dbf5fee ignore: fix false positive in path_is_symlink
This commit fixes a bug where the first path always reported itself as
as symlink via `path_is_symlink`.

Part of this fix includes updating walkdir to 2.2.1, which also includes
a corresponding bug fix.

Fixes #984
2018-08-21 23:05:52 -04:00