mirror of
https://github.com/BurntSushi/ripgrep.git
synced 2024-12-12 19:18:24 +02:00
c21302b409
Previously, we had logic to skip our own inner literal optimization if the regex itself was already (likely) accelerated. It turns out that the presence of a Unicode word boundary can defeat acceleration to a point. It's likely enough that, even if the underlying regex is accelerated, it would be prudent to do our own inner literal optimization whenever the pattern has a Unicode word boundary. Normally a Unicode word boundary doesn't defeat literal optimizations, since even the slower engines can make use of *prefix* literal optimizations. But a regex can be accelerated via its own inner or suffix literal optimizations, and those require the use of a DFA (or lazy DFA). Since DFAs crap out on haystacks that contain a non-ASCII Unicode scalar value when the regex contains a Unicode word boundary, it follows that an "accelerated" regex can still wind up being quite slow. (An "accelerated" regex can also slow down because of restrictions on avoiding quadratic behavior, but I believe this happens less frequently and is not as severe as the slowdown that results from Unicode word boundaries. Namely, avoiding quadratic behavior just means giving up on the inner literal optimization for a single search, in which case the regex engine can still fall back to a normal forward DFA. That will definitely be slower than an inner literal optimization done by ripgrep, but not quite as dramatic as it would be when DFAs can't be used at all.) |
||
---|---|---|
src | ||
Cargo.toml | ||
LICENSE-MIT | ||
README.md | ||
UNLICENSE |
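The commit message above describes when an inner literal optimization is applied. A minimal sketch of the idea follows, using only the standard library. Every name in it (`PatternInfo`, `use_inner_literal`, `search_lines`, `contains_word`) is hypothetical and is not part of ripgrep or grep-regex. `contains_word` is a simplified stand-in for a regex such as `\bfoo\b`, and it approximates Unicode word characters with `char::is_alphanumeric` plus `_`. The sketch also assumes the literal contains no line terminator.

```rust
/// Hypothetical summary of a compiled pattern. These flags are illustrative
/// assumptions, not fields exposed by ripgrep or the regex crate.
struct PatternInfo {
    accelerated: bool,
    has_unicode_word_boundary: bool,
}

/// The decision described in the commit message: skip our own inner literal
/// optimization only when the regex is (likely) accelerated AND has no
/// Unicode word boundary that could force it off the DFA.
fn use_inner_literal(info: &PatternInfo) -> bool {
    !info.accelerated || info.has_unicode_word_boundary
}

/// Inner literal search: find candidate lines with a fast substring search,
/// then run the slower `verify` matcher only on those lines.
fn search_lines<'h>(
    haystack: &'h str,
    literal: &str,
    verify: impl Fn(&str) -> bool,
) -> Vec<&'h str> {
    let mut matches = Vec::new();
    let mut pos = 0;
    while let Some(i) = haystack[pos..].find(literal) {
        let hit = pos + i;
        // Expand the candidate to the boundaries of its enclosing line.
        let start = haystack[..hit].rfind('\n').map_or(0, |n| n + 1);
        let end = haystack[hit..]
            .find('\n')
            .map_or(haystack.len(), |n| hit + n);
        let line = &haystack[start..end];
        if verify(line) {
            matches.push(line);
        }
        if end == haystack.len() {
            break;
        }
        // Resume after this line so each line is verified at most once.
        pos = end + 1;
    }
    matches
}

/// Simplified stand-in for `\bword\b`: `word` must not be adjacent to a
/// word character on either side.
fn contains_word(line: &str, word: &str) -> bool {
    let is_word = |c: Option<char>| c.map_or(false, |c| c.is_alphanumeric() || c == '_');
    line.match_indices(word).any(|(i, w)| {
        let before = line[..i].chars().next_back();
        let after = line[i + w.len()..].chars().next();
        !is_word(before) && !is_word(after)
    })
}
```

Calling `search_lines(haystack, "foo", |line| contains_word(line, "foo"))` runs the expensive check only on lines that contain the literal, so a haystack with non-ASCII text pays the slow path per candidate line rather than for the whole input.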
grep-regex
The grep-regex crate provides an implementation of the Matcher trait from the grep-matcher crate. This implementation permits Rust's regex engine to be used in the grep crate for fast line-oriented searching.
Dual-licensed under MIT or the UNLICENSE.
Documentation
NOTE: You probably don't want to use this crate directly. Instead, you should prefer the facade defined in the grep crate.
Usage
Add this to your Cargo.toml:
[dependencies]
grep-regex = "0.1"