mirror of https://github.com/BurntSushi/ripgrep.git synced 2025-11-29 05:57:07 +02:00

ignore: fix global gitignore bug that arises with absolute paths

The `ignore` crate currently handles two different kinds of "global"
gitignore files: gitignores from `~/.gitconfig`'s `core.excludesFile`
and gitignores passed in via `WalkBuilder::add_ignore` (corresponding to
ripgrep's `--ignore-file` flag).

In contrast to any other kind of gitignore file, these gitignore files
should have their patterns interpreted relative to the current working
directory. (Arguably there are other choices we could make here, e.g.,
based on the paths given. But the `ignore` infrastructure can't handle
that, and it's not clearly correct to me.) Normally, a gitignore file
has its patterns interpreted relative to where the gitignore file is.
This relative interpretation matters for patterns like `/foo`, which are
anchored to _some_ directory.
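
To make the role of that directory concrete, here is a minimal sketch. It is not the `ignore` crate's matcher: `anchored_match` is a hypothetical helper that only handles a literal pattern with a leading `/`, and the paths are made up for illustration.

```rust
use std::path::Path;

/// Hypothetical helper (not part of `ignore`): does the anchored pattern,
/// e.g. "/foo", match `path` when interpreted relative to `root`?
/// Only a literal, single-name pattern is handled here.
fn anchored_match(root: &Path, pattern: &str, path: &Path) -> bool {
    // Only patterns with a leading slash are anchored.
    let Some(name) = pattern.strip_prefix('/') else {
        return false;
    };
    // The candidate must live under `root`, and what remains after
    // stripping `root` must be exactly the anchored name.
    match path.strip_prefix(root) {
        Ok(rel) => rel == Path::new(name),
        Err(_) => false,
    }
}

fn main() {
    let path = Path::new("/repo/sub/foo");
    // Root is the directory the pattern is anchored to: match.
    assert!(anchored_match(Path::new("/repo/sub"), "/foo", path));
    // Same pattern and path, but anchored one level up: no match.
    assert!(!anchored_match(Path::new("/repo"), "/foo", path));
}
```

The same pattern and the same path give different answers depending only on the root, which is why choosing the root for global gitignores matters.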

Previously, we would generally get the global gitignores correct because
it's most common to use ripgrep without providing a path. Thus, it
searches the current working directory. In this case, no stripping of
the paths is needed in order for the gitignore patterns to be applied
directly.

But if one provides an absolute path (or something else) to ripgrep to
search, the paths aren't stripped correctly. Indeed, in the core, I had
just given up and not provided a "root" path to these global gitignores.
So it had no hope of getting this correct.
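
The stripping problem can be sketched with nothing but `std::path`. `strip_root` is a hypothetical stand-in for the stripping step, and the `/tmp/t/...` paths are assumptions. An empty root models the old behavior of giving global gitignores no root at all.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical stand-in for the stripping step: return `candidate`
/// relative to `root`, or unchanged when `root` is not a prefix of it.
fn strip_root(root: &Path, candidate: &Path) -> PathBuf {
    candidate.strip_prefix(root).unwrap_or(candidate).to_path_buf()
}

fn main() {
    let no_root = Path::new("");
    let cwd = Path::new("/tmp/t/a/b/c");
    let absolute = Path::new("/tmp/t/a/b/c/haystack");

    // Searching the CWD implicitly: the candidate is already relative,
    // so an empty root happens to leave it in the right form.
    assert_eq!(
        strip_root(no_root, Path::new("haystack")),
        PathBuf::from("haystack")
    );

    // Searching an absolute path with an empty root: nothing is stripped,
    // so an anchored pattern like `/haystack` has nothing to line up with.
    assert_eq!(strip_root(no_root, absolute), absolute.to_path_buf());

    // With the CWD as the root, the absolute candidate strips down to the
    // same relative form as in the first case.
    assert_eq!(strip_root(cwd, absolute), PathBuf::from("haystack"));
}
```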

We fix this by assigning the CWD to the `Gitignore` values created from
global gitignore files. This was a painful thing to do because we'd
ideally:

1. Call `std::env::current_dir()` at most once for each traversal.
2. Provide a way to avoid the library calling `std::env::current_dir()`
   at all. (Since this is global process state and folks might want to
   set it to different values for $reasons.)

The `ignore` crate's internals are a total mess. But I think I've
addressed the above 2 points in a semver compatible manner.
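
One way to satisfy both points is sketched below. This is only an illustration of the shape, not the `ignore` crate's actual API: `TraversalBuilder`, `Traversal`, `current_dir` and `global_gitignore_root` are all hypothetical names.

```rust
use std::env;
use std::io;
use std::path::{Path, PathBuf};

/// Hypothetical builder; not the `ignore` crate's API.
#[derive(Default)]
struct TraversalBuilder {
    /// When set, the library never consults the process-wide CWD.
    cwd_override: Option<PathBuf>,
}

impl TraversalBuilder {
    /// Point 2: let the caller supply the CWD explicitly.
    fn current_dir(mut self, cwd: impl Into<PathBuf>) -> Self {
        self.cwd_override = Some(cwd.into());
        self
    }

    /// Point 1: resolve the CWD at most once, when the traversal is built.
    fn build(self) -> io::Result<Traversal> {
        let cwd = match self.cwd_override {
            Some(cwd) => cwd,
            None => env::current_dir()?,
        };
        Ok(Traversal { cwd })
    }
}

/// Hypothetical traversal that carries the resolved CWD with it.
struct Traversal {
    cwd: PathBuf,
}

impl Traversal {
    /// Root used to interpret patterns from global gitignore files.
    fn global_gitignore_root(&self) -> &Path {
        &self.cwd
    }
}

fn main() -> io::Result<()> {
    // Explicit CWD: `std::env::current_dir()` is never called.
    let t = TraversalBuilder::default().current_dir("/srv/data").build()?;
    assert_eq!(t.global_gitignore_root(), Path::new("/srv/data"));

    // No override: the process CWD is read exactly once, in `build`.
    let t = TraversalBuilder::default().build()?;
    assert_eq!(t.global_gitignore_root(), env::current_dir()?.as_path());
    Ok(())
}
```

Adding an optional setter to a builder is an additive change, which is what makes this kind of design possible without breaking semver.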

Fixes #3179
Andrew Gallant
2025-10-15 18:08:30 -04:00
parent 9ec08522be
commit b610d1cb15
6 changed files with 213 additions and 11 deletions


@@ -1655,6 +1655,56 @@ rgtest!(r3173_hidden_whitelist_only_dot, |dir: Dir, _: TestCommand| {
    eqnice!(cmd().args(&["--files", "./"]).stdout(), "./.foo.txt\n");
});

// See: https://github.com/BurntSushi/ripgrep/issues/3179
rgtest!(r3179_global_gitignore_cwd, |dir: Dir, mut cmd: TestCommand| {
    dir.create_dir("a/b/c");
    dir.create("a/b/c/haystack", "");
    dir.create(".test.gitignore", "/haystack");

    // I'm not sure in which cases this can fail. If it
    // does and it's unavoidable, feel free to submit a
    // patch that skips this test when this canonicalization
    // fails.
    //
    // The reason we canonicalize here is strange, and it is
    // perhaps papering over a bug in ripgrep. But on macOS,
    // `TMPDIR` is set to `/var/blah/blah`. However, `/var`
    // is symlinked to `/private/var`. So the CWD detected by
    // the process is `/private/var`. So it turns out that the
    // CWD is not a proper prefix of `dir.path()` here. So we
    // cheat around this by forcing our path to be canonicalized
    // so it's `/private/var` everywhere.
    //
    // Arguably, ripgrep should still work here without
    // canonicalization. But it's not actually quite clear
    // to me how to do it. I *believe* the solution here is
    // that gitignore matching should be relative to the directory
    // path given to `WalkBuilder::{add,new}`, and *not* to the
    // CWD. But this is a very big change to how `ignore` works
    // I think. At least conceptually. So that will need to be
    // something we do when we rewrite `ignore`. Sigh.
    //
    // ... but, on Windows, path canonicalization seems to
    // totally fuck things up, so skip it there. HEAVY sigh.
    let dir_path = if cfg!(windows) {
        dir.path().to_path_buf()
    } else {
        dir.path().canonicalize().unwrap()
    };
    let ignore_file_path = dir_path.join(".test.gitignore");
    cmd.current_dir("a/b/c")
        .arg("--files")
        .arg("--ignore-file")
        .arg(ignore_file_path.display().to_string())
        // This is a key part of the reproduction. When just providing `.`
        // to ignore's walker (as ripgrep does when a path to search isn't
        // provided), then everything works as one expects. Because there's
        // nothing to strip off of the paths being searched. But when one
        // provides an absolute path, the stripping didn't work.
        .arg(&dir_path)
        .assert_err();
});

// See: https://github.com/BurntSushi/ripgrep/issues/3180
rgtest!(r3180_look_around_panic, |dir: Dir, mut cmd: TestCommand| {
    dir.create("haystack", " b b b b b b b b\nc\n");