From 8db24e135375a2510e3eca85c72005172788471e Mon Sep 17 00:00:00 2001
From: Andrew Gallant
Date: Sun, 12 Mar 2017 20:21:22 -0400
Subject: [PATCH] Stop aggressive inlining.

It's not clear what exactly is happening here, but the Read implementation
for text decoding appears a bit sensitive. Small perturbations in the code
appear to have a nearly 100% impact on the overall speed of ripgrep when
searching UTF-16 files.

I haven't had the time to examine the generated code in detail, but
`perf stat` seems to think that the instruction cache is performing a lot
worse when the code slows down. This might mean that excessive inlining
causes a different code structure that leads to less-than-optimal icache
usage, but it's at best a guess.

Explicitly disabling the inline for the cold path seems to help the
optimizer figure out the right thing.
---
 src/decoder.rs | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/decoder.rs b/src/decoder.rs
index d43cbdbb..345389a5 100644
--- a/src/decoder.rs
+++ b/src/decoder.rs
@@ -251,6 +251,7 @@ impl> DecodeReader {
         Ok(nwrite)
     }
 
+    #[inline(never)] // impacts perf...
     fn detect(&mut self) -> io::Result<()> {
         let bom = try!(self.rdr.peek_bom());
         self.decoder = bom.decoder();