Mirror of https://github.com/BurntSushi/ripgrep.git (synced 2025-03-03 14:32:22 +02:00)
It looks like ugrep incorrectly treats a file that is purely valid UTF-8 as a binary file, which in turn effectively renders all of the Russian subtitle benchmarks moot for ugrep. So we pass '-a' to force ugrep to treat the file as text.

This technically gives ugrep an edge, because it no longer needs to check whether the haystack is binary. In practice, binary detection is usually implemented using highly optimized SIMD routines (e.g., 'memchr'), so it tends not to matter much.

We might also consider passing '-a' to all grep commands. But I think using '-a' is the less common case, and we should try to benchmark the common case.
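
For illustration only, here is a minimal sketch of the kind of binary check discussed above. It assumes the common heuristic of treating a haystack as binary when it contains a NUL byte; whether ugrep or any other tool uses exactly this rule is an assumption, and 'looks_binary' is a hypothetical helper name, not part of any tool. The point is that valid UTF-8 Russian text contains no NUL bytes, so a check of this kind should classify it as text.

```python
from typing import Optional


def looks_binary(haystack: bytes, limit: Optional[int] = None) -> bool:
    """Hypothetical helper: return True if a NUL byte occurs in the haystack.

    This mirrors the common NUL-byte heuristic. It is an assumption, not the
    documented behavior of any particular grep implementation.
    """
    # Optionally inspect only a prefix, as some tools do to bound the cost.
    window = haystack if limit is None else haystack[:limit]
    # A single-byte scan: the same shape of work a memchr routine performs.
    return window.find(b"\x00") != -1


# UTF-8 only emits the byte 0x00 for U+0000, so Cyrillic text has none.
russian = "Шерлок Холмс".encode("utf-8")
is_russian_binary = looks_binary(russian)
is_nul_binary = looks_binary(b"abc\x00def")
```

Because the check is one pass looking for one byte, skipping it (as '-a' does) saves little when the scan is vectorized, which is consistent with the reasoning in the message above.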