It looks like a new dependency on `getrandom` was added (which brings in
a few more dependencies itself) because of `jobserver`. Thankfully,
`jobserver` is only used when ripgrep's `pcre2` feature is enabled, so
this still keeps the default set of dependencies very small.
This removes `once_cell` (a dependency of `cc`) but adds `shlex` (also a
dependency of `cc`). AFAIK, ripgrep does not use anything in `cc` that
requires `shlex`, so it is pretty unfortunate that we have to spend time
compiling it. (We use `cc` only when the `pcre2` feature is enabled.)
Notably, this removes `winapi` in favor of `windows-sys`, as a result of
`winapi-util` switching over to `windows-sys`[1].
Annoyingly, when PCRE2 is enabled, this brings in a dependency on
`once_cell`[2]. I had worked to remove it from my dependencies and now
it's back. Gah. I suppose I could disable the `parallel` feature of
`cc`, but that doesn't seem like a good trade-off.
[1]: https://github.com/BurntSushi/winapi-util/pull/13
[2]: https://github.com/rust-lang/cc-rs/pull/1037
This feature causes nothing but problems and is frequently broken. The
only optimizations it enabled were SIMD optimizations for transcoding,
in particular UTF-16 transcoding. Transcoding is performed by
the [`encoding_rs`](https://github.com/hsivonen/encoding_rs) crate,
which specifically uses unstable portable SIMD APIs instead of the
stable non-portable SIMD APIs.
SIMD optimizations that apply to search have long been making use of
stable APIs, and are automatically enabled when your target supports
them. This is, IMO, the correct user experience and one that
`encoding_rs` refuses to support. I'm done dealing with it, so
transcoding will only use scalar code until the SIMD optimizations in
`encoding_rs` work on stable. (This doesn't mean that `encoding_rs` has
to change. This could also be fixed by stabilizing `std::simd`.)
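
To illustrate what "stable APIs, automatically enabled when your target
supports them" can look like, here is a minimal sketch of runtime feature
detection using only stable `std`. It is not code from ripgrep, `memchr`
or `encoding_rs`; the function names (`count_byte` and friends) and the
byte-counting task are hypothetical and chosen only to keep the example
short.

```rust
/// Count occurrences of `needle` in `haystack`, picking a SIMD routine at
/// runtime when the CPU supports it. (Hypothetical example function.)
pub fn count_byte(haystack: &[u8], needle: u8) -> usize {
    #[cfg(target_arch = "x86_64")]
    {
        // Stable runtime check: no nightly toolchain or cargo feature needed.
        if is_x86_feature_detected!("avx2") {
            // SAFETY: we just verified that AVX2 is available on this CPU.
            return unsafe { count_byte_avx2(haystack, needle) };
        }
    }
    count_byte_scalar(haystack, needle)
}

/// Portable fallback used on every other target or older CPU.
fn count_byte_scalar(haystack: &[u8], needle: u8) -> usize {
    haystack.iter().filter(|&&b| b == needle).count()
}

/// AVX2 routine built on the stable, non-portable `std::arch` intrinsics.
#[cfg(target_arch = "x86_64")]
#[target_feature(enable = "avx2")]
unsafe fn count_byte_avx2(haystack: &[u8], needle: u8) -> usize {
    use std::arch::x86_64::*;

    let needles = _mm256_set1_epi8(needle as i8);
    let mut count = 0;
    let mut chunks = haystack.chunks_exact(32);
    for chunk in &mut chunks {
        // Unaligned load of 32 bytes, then a lane-wise equality comparison.
        let v = _mm256_loadu_si256(chunk.as_ptr() as *const __m256i);
        let eq = _mm256_cmpeq_epi8(v, needles);
        // One mask bit per matching byte.
        count += (_mm256_movemask_epi8(eq) as u32).count_ones() as usize;
    }
    // Handle the final (fewer than 32) bytes with the scalar code.
    count + count_byte_scalar(chunks.remainder(), needle)
}
```

Callers just use `count_byte(haystack, b'x')` and never have to opt in to
anything at build time, which is the user experience described above.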
Fixes #2748
Instead, we just roll our own. A slow version of this is pretty simple
to do, and that's what we write here. The `base64` crate supports a lot
more functionality and is quite fast, but we care about neither of those
things for this particular aspect of ripgrep. (base64 is only used for
non-UTF-8 data or file paths, which are both quite rare.)
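
For a sense of how little code a slow version needs, here is a minimal
sketch of a padded, standard-alphabet (RFC 4648) base64 encoder. It is an
illustration only and not necessarily the code added in this change; the
name `encode_base64` is hypothetical.

```rust
/// The standard base64 alphabet from RFC 4648.
const ALPHABET: &[u8; 64] =
    b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/// Encode `input` as base64 with `=` padding. Deliberately simple: it
/// processes three bytes at a time and makes no attempt to be fast.
/// (Hypothetical helper name.)
pub fn encode_base64(input: &[u8]) -> String {
    let mut out = String::with_capacity((input.len() + 2) / 3 * 4);
    for chunk in input.chunks(3) {
        // Pack up to three bytes into the low 24 bits of a u32. Missing
        // bytes in a short final chunk are treated as zero.
        let b0 = chunk[0] as u32;
        let b1 = chunk.get(1).copied().unwrap_or(0) as u32;
        let b2 = chunk.get(2).copied().unwrap_or(0) as u32;
        let group = (b0 << 16) | (b1 << 8) | b2;

        // Split the 24 bits into four 6-bit indices into the alphabet.
        out.push(ALPHABET[((group >> 18) & 0x3F) as usize] as char);
        out.push(ALPHABET[((group >> 12) & 0x3F) as usize] as char);
        // Emit padding in place of sextets that carry no input bits.
        if chunk.len() > 1 {
            out.push(ALPHABET[((group >> 6) & 0x3F) as usize] as char);
        } else {
            out.push('=');
        }
        if chunk.len() > 2 {
            out.push(ALPHABET[(group & 0x3F) as usize] as char);
        } else {
            out.push('=');
        }
    }
    out
}
```

Encoding only is shown here, on the assumption that the output side is
all that is needed.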
As suggested by @epage[1].
Ad hoc timings on my i7-12900K:
    before cargo build:         4.91s
    before cargo build release: 8.05s
    after  cargo build:         4.69s
    after  cargo build release: 7.83s
... pretty underwhelming if you ask me. Ah well. And on my M2 Mac mini:
    before cargo build:         6.18s
    before cargo build release: 14.50s
    after  cargo build:         5.52s
    after  cargo build release: 13.44s
Still kind of underwhelming, but definitely better. It shaves a full
second off of compile times in release mode. I went back to my
i7-12900K, but passed `-j1` to `cargo build` to force single-threaded
mode:
    before cargo build:         19.44s
    before cargo build release: 50.64s
    after  cargo build:         16.76s
    after  cargo build release: 48.00s
Which seems pretty consistent with the modest improvements above.
Looking at `cargo build --timings`, the beefiest chunk of time is spent
in compiling `regex-automata`, by far. This is fine because it's core
functionality. I wish a fast general purpose regex engine with its
internals exposed as a separately versioned library didn't require so
much code... Blech.
[1]: https://old.reddit.com/r/rust/comments/17rd8ww/faster_compilation_with_the_parallel_frontend_in/k8igjlg/
The idea is that bringing in derives via serde's optional `derive`
feature was inhibiting compilation speed[1]. We try to fix that by
depending on `serde_derive` as a distinct dependency.
It does seem to improve overall compilation time, but only by about 0.5
seconds. With that said, my machine has a lot of cores, so it's possible
this will help more on less powerful CPUs.
[1]: https://old.reddit.com/r/rust/comments/17rd8ww/faster_compilation_with_the_parallel_frontend_in/k8igjlg/