mirror of
https://github.com/BurntSushi/ripgrep.git
synced 2025-03-17 20:28:03 +02:00
Now that Rust's regex crate finally supports a CRLF mode, we can remove the giant hack ripgrep used to enable it (which assuredly did not work in all cases).

The way this works in the regex engine is subtly different from what ripgrep previously did. Namely, --crlf would previously treat either \r\n or \n as a line terminator. But now it treats \r\n, \n and \r as line terminators. In effect, it is implemented by treating \r and \n as line terminators, except that ^ and $ will never match at a position between a \r and a \n. So basically this means that $ will end up matching in more cases than might be intended (for example, before a lone \r), but I don't expect this to be a big problem in practice.

Note that passing --crlf to ripgrep and enabling CRLF mode in the regex via the `R` inline flag (e.g., `(?R:$)`) are subtly different. The `R` flag just controls the regex engine, but --crlf instructs all of ripgrep to use \r\n as a line terminator. There are likely some inconsistencies or corner cases that are wrong as a result of this mismatch, but we choose to leave well enough alone for now.

Fixing this for real will probably require re-thinking how line terminators are handled in ripgrep. For example, one "problem" with how they're handled now is that ripgrep will re-insert its own line terminators when printing output instead of copying the input. This is maybe not so great and perhaps unexpected. (ripgrep probably can't get away with not inserting any line terminators. When a file doesn't end with a line terminator and its last line matches, users probably expect a line terminator to be inserted.)
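
To make the CRLF-mode rule described above concrete, here is a minimal sketch that models where ^ and $ may match. It follows only the description in this message (\r and \n both terminate a line, but no match between a \r and a \n). It is not the regex crate's actual code, and the names `caret_crlf`, `dollar_crlf` and `positions` are hypothetical helpers invented for illustration.

```rust
/// Hypothetical model of `$` in CRLF mode: it matches at the end of the
/// haystack, before a `\r`, or before a `\n` that is not preceded by `\r`.
fn dollar_crlf(haystack: &[u8], at: usize) -> bool {
    if at == haystack.len() {
        return true;
    }
    match haystack[at] {
        b'\r' => true,
        // A `\n` directly after `\r` is the second half of `\r\n`, so the
        // position between the two bytes is excluded.
        b'\n' => at == 0 || haystack[at - 1] != b'\r',
        _ => false,
    }
}

/// Hypothetical model of `^` in CRLF mode: it matches at the start of the
/// haystack, after a `\n`, or after a `\r` that is not followed by `\n`.
fn caret_crlf(haystack: &[u8], at: usize) -> bool {
    if at == 0 {
        return true;
    }
    match haystack[at - 1] {
        b'\n' => true,
        // Again, never match between `\r` and `\n`.
        b'\r' => at == haystack.len() || haystack[at] != b'\n',
        _ => false,
    }
}

/// Collects every offset in `0..=haystack.len()` accepted by `pred`.
fn positions(haystack: &[u8], pred: fn(&[u8], usize) -> bool) -> Vec<usize> {
    (0..=haystack.len())
        .filter(|&at| pred(haystack, at))
        .collect()
}
```

For the input `a\r\nb\rc\nd`, `positions(b"a\r\nb\rc\nd", dollar_crlf)` yields offsets 1, 4, 6 and 8. Offset 4 sits before a lone \r, which is one of the extra places where $ now matches, while offset 2 (between the \r and the \n) is excluded.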