mirror of https://github.com/facebook/zstd.git synced 2025-03-06 16:56:49 +02:00
Han Zhu e6dccbf482 Inline BIT_reloadDStream
Inlining `BIT_reloadDStream` improves decompression speed by more than 3% for a
clang PGO-optimized zstd binary, measured on the Silesia corpus at compression
level 1. The win comes from improved register allocation, which leads to fewer
spills and reloads. Compare the profile-annotated hot assembly before and after
this change: https://www.diffchecker.com/UjDGIyLz/. The diff is a bit messy, but
note that there are three fewer moves after inlining.

In general, LLVM's register allocator works better when it can see more code.
For example, when the register allocator sees a call instruction, it has to
partition the registers into caller-saved and callee-saved registers, so it is
not free to do whatever it wants with all the registers of the current
function. Inlining the callee gives the register allocator access to all
registers and lets it use them more flexibly.
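
To illustrate the pattern, here is a minimal sketch of a small reload helper marked for forced inlining so that the hot loop contains no call instruction. This is not zstd's actual code: `SKETCH_FORCE_INLINE`, `sketch_bitreader`, `sketch_reload`, and `sketch_sum_nibbles` are hypothetical names invented for this example, and the bit reader is a simplified forward reader rather than the real `BIT_DStream_t`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical macro for this sketch; zstd defines its own portability macros. */
#if defined(__GNUC__) || defined(__clang__)
#  define SKETCH_FORCE_INLINE static inline __attribute__((always_inline))
#elif defined(_MSC_VER)
#  define SKETCH_FORCE_INLINE static __forceinline
#else
#  define SKETCH_FORCE_INLINE static inline
#endif

/* Simplified bit reader (an assumption, not zstd's BIT_DStream_t). */
typedef struct {
    const uint8_t *ptr;  /* next unread input byte */
    const uint8_t *end;  /* one past the last input byte */
    uint64_t bits;       /* bit container, consumed from the low bits */
    unsigned count;      /* number of valid bits in the container */
} sketch_bitreader;

/* Refill the container one byte at a time while there is room and input.
 * Because this is forced inline, the caller's loop has no call instruction,
 * so the allocator need not keep live values out of caller-saved registers. */
SKETCH_FORCE_INLINE void sketch_reload(sketch_bitreader *r)
{
    while (r->count <= 56 && r->ptr < r->end) {
        r->bits |= (uint64_t)(*r->ptr++) << r->count;
        r->count += 8;
    }
}

/* Hot loop: sums every 4-bit nibble in the input, low nibble first. */
uint32_t sketch_sum_nibbles(const uint8_t *src, size_t len)
{
    assert(src != NULL);
    sketch_bitreader r = { src, src + len, 0, 0 };
    uint32_t sum = 0;
    for (;;) {
        sketch_reload(&r);          /* expanded in place, not called */
        if (r.count < 4) break;     /* input exhausted */
        sum += (uint32_t)(r.bits & 0xF);
        r.bits >>= 4;
        r.count -= 4;
    }
    return sum;
}
```

Whether forcing the inline actually helps depends on the compiler, the target, and the surrounding code, so a change like this should be validated with a benchmark and a look at the generated assembly, as was done for this commit.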
2023-03-28 15:36:02 -07:00