For example, when only a single file (or stdin) is being searched, then we
should be able to print directly to the terminal instead of intermediate
buffers. (The buffers are only necessary for parallelism.)
Closes#4.
I though plain `read` had usurped them, but when searching a very small
number of files, mmaps can be around 20% faster on Linux. It'd be really
unfortunate to leave that on the table.
Mmap searching doesn't support contexts yet, but we probably don't really
care. And duplicating that logic doesn't sound fun. Without contexts, mmap
searching is delightfully simple.
- Refactored interaction between CLI args and rest of xrep.
- Filling in a lot more options, including file type filtering.
- Fixing some bugs in globbing/ignoring.
- More documentation.
Memory maps appear to degrade quite a bit in the presence of multithreading.
Also, switch to lock free data structures for synchronization. Give each
worker an input and output buffer which require no synchronization.
I'm pretty disappointed by the performance of regex sets. They are
apparently spending a lot of their time in construction of the DFA,
which probably means that the DFA is just too big.
It turns out that it's actually faster to build an *additional* normal
regex with the alternation of every glob and use it as a first-pass
filter over every file path. If there's a match, only then do we try the
more expensive RegexSet.