Previously, the 'fut' type only matches files called '.fut', while in
reality we want to match all files with the '.fut' extension. This
commit fixes that issue.
PR #2027
It sounds like Projectfile is no longer being used,
but we should keep it around in case folks are
still using it. It's unlikely that its presence will
do much if any harm.
PR #1904
This seems like an obvious optimization but becomes critical when
filesystem operations even as simple as stat can result in significant
overheads; an example of this was a bespoke filesystem layer in Windows
that hosted files remotely and would download them on-demand when
particular filesystem operations occurred. Users of this system who
ensured correct file-type fileters were being used could still get
unnecessary file access resulting in large downloads.
Fixes#1657, Closes#1660
This also replaces '--all' in Cargo commands with '--workspace'. The
former has apparently been deprecated.
We also fix a couple warnings that this new step detected.
Closes#1848
`ignore::Error` wraps `std::io::Error` with additional information
(as well as expose non-IO errors). For people wanting to inspect what
the error is, they have to recursively match the Enum. This provides
`io_error` and `into_io_error` helpers to do this for the user.
PR #1740
This updates encoding_rs, crossbeam-utils and crossbeam-channel. This
serves two purposes. The encoding_rs update fixes a compilation failure
on the latest nightly. The crossbeam updates are good sense and to
reduce duplicate dependencies such as cfg-if. (Although, we note that
the log crate still pulls in cfg-if 0.1, so ripgrep has a duplicate
dependency there for now. But it's very small.)
Fixes#1721, Closes#1705
Bazel supports `BUILD.bazel` as well as `WORKSPACE.bazel`. In
addition, it is common to ship BUILD/WORKSPACE templates for
external repositories suffixed with .bazel for easier tool
recognition.
Co-authored-by: Brandon Adams <brandon.adams@imc.com>
PR #1716
Adds `WalkBuilder::filter_entry` that takes a predicate to be applied to
all entries. If the predicate returns `false` on a given entry, that
entry and all children will be skipped.
Fixes#1555, Closes#1557
While Linux distributions (at least Arch Linux, RHEL, Debian) do not support
compressing files with compress(1), macOS & AIX do (the utility is part of
POSIX). Additionally, gzip is able to uncompress such compressed files and
provides an `uncompress` binary.
Closes#1547
This replaces the use of channels in the parallel directory traversal
with a simple stack. The primary motivation for this change is to reduce
peak memory usage. In particular, when using a channel (which is a
queue), we wind up visiting files in a breadth first fashion. Using a
stack switches us to a depth first traversal. While there are no real
intrinsic differences, depth first traversal generally tends to use less
memory because directory trees are more commonly wide than they are
deep.
In particular, the queue/stack size itself is not the only concern. In
one recent case documented in #1550, a user wanted to search all Rust
crates. The directory structure was shallow but extremely wide, with a
single directory containing all crates. This in turn results is in
descending into each of those directories and building a gitignore
matcher for each (since most crates have `.gitignore` files) before ever
searching a single file. This means that ripgrep has all such matchers
in memory simultaneously, which winds up using quite a bit of memory.
In a depth first traversal, peak memory usage is much lower because
gitignore matches are built and discarded more quickly. In the case of
searching all crates, the peak memory usage decrease is dramatic. On my
system, it shrinks by an order magnitude, from almost 1GB to 50MB. The
decline in peak memory usage is consistent across other use cases as
well, but is typically more modest. For example, searching the Linux
repo has a 50% decrease in peak memory usage and searching the Chromium
repo has a 25% decrease in peak memory usage.
Search times generally remain unchanged, although some ad hoc benchmarks
that I typically run have gotten a bit slower. As far as I can tell,
this appears to be result of scheduling changes. Namely, the depth first
traversal seems to result in searching some very large files towards the
end of the search, which reduces the effectiveness of parallelism and
makes the overall search take longer. This seems to suggest that a stack
isn't optimal. It would instead perhaps be better to prioritize
searching larger files first, but it's not quite clear how to do this
without introducing more overhead (getting the file size for each file
requires a stat call).
Fixes#1550
We should not assume that the commondir file actually exists. If it
doesn't, then just move on. This otherwise emits an error message when
searching normal submodules, which is not OK.
This regression was introduced in #1446.
Fixes#1520