1
0
mirror of https://github.com/facebook/zstd.git synced 2025-03-06 16:56:49 +02:00

10783 Commits

Author SHA1 Message Date
Yann Collet
7d3e5e3ba1 split all full 128 KB blocks
this helps make the streaming behavior more consistent,
since it does no longer depend on having more data presented on the input.

suggested by @terrelln
2024-10-23 14:18:48 -07:00
Yann Collet
b68ddce818 rewrite fingerprint storage to no longer need 64-bit members
so that it can be stored using standard alignment requirement (sizeof(void*)).

Distance function still requires 64-bit signed multiplication though,
so it won't change the issue regarding the bug in ubsan for clang 32-bit on github ci.
2024-10-23 11:50:57 -07:00
Yann Collet
57239c4d3b fixed minor strict pedantic C90 issue 2024-10-23 11:50:57 -07:00
Yann Collet
18b1e67223 fixed extraneous return
strict C90 compliance test
2024-10-23 11:50:57 -07:00
Yann Collet
d2eeed53dc updated compression results
due to integration of `sample5` strategy, leading to better compression ratios on a range of levels
2024-10-23 11:50:57 -07:00
Yann Collet
0be334d208 fixes static state allocation check
detected by @felixhandte
2024-10-23 11:50:57 -07:00
Yann Collet
06b7cfabf8 rewrote ZSTD_cwksp_initialAllocStart() to be easier to read
following a discussion with @felixhandte
2024-10-23 11:50:57 -07:00
Yann Collet
16450d0732 rewrite penalty update
suggested by @terrelln
2024-10-23 11:50:57 -07:00
Yann Collet
1ec5f9f1f6 changed loop exit condition so that there is no need to assert() within the loop. 2024-10-23 11:50:57 -07:00
Yann Collet
4662f6e646 renamed: FingerPrint => Fingerprint
suggested by @terrelln
2024-10-23 11:50:57 -07:00
Yann Collet
ea85dc7af6 conservatively estimate over-splitting in presence of incompressible loss
ensure data can never be expanded by more than 3 bytes per full block.
2024-10-23 11:50:57 -07:00
Yann Collet
5ae34e4c96 ensure lastBlock is correctly determined
reported by @terrelln
2024-10-23 11:50:57 -07:00
Yann Collet
7bad787d8b made ZSTD_isPower2() an inline function 2024-10-23 11:50:57 -07:00
Yann Collet
a167571db5 added a faster block splitter variant
that samples 1 in 5 positions.

This variant is fast enough for lazy2 and btlazy2,
but it's less good in combination with post-splitter at higher levels (>= btopt).
2024-10-23 11:50:57 -07:00
Yann Collet
1c62e714ab minor split optimization
let's fill the initial stats directly into target fingerprint
2024-10-23 11:50:57 -07:00
Yann Collet
dac26eaeac updated regression test results 2024-10-23 11:50:57 -07:00
Yann Collet
4ce91cbf2b fixed workspace alignment on non 64-bit systems 2024-10-23 11:50:57 -07:00
Yann Collet
cae8d13294 splitter workspace is now provided by ZSTD_CCtx* 2024-10-23 11:50:56 -07:00
Yann Collet
4685eafa81 fix alignment test
for non 64-bit systems
2024-10-23 11:50:56 -07:00
Yann Collet
433f4598ad fixed minor conversion warnings on Visual 2024-10-23 11:50:56 -07:00
Yann Collet
73a6653653 ZSTD_splitBlock_4k() uses externally provided workspace
ideally, this workspace would be provided from the ZSTD_CCtx* state
2024-10-23 11:50:56 -07:00
Yann Collet
7f015c2fd7 replaced uasan32 test by asan32 test 2024-10-23 11:50:56 -07:00
Yann Collet
31d48e9ffa fixing minor formatting issue in 32-bit mode with logs enabled 2024-10-23 11:50:56 -07:00
Yann Collet
76ad1d6903 fixed VS2010 solution 2024-10-23 11:50:56 -07:00
Yann Collet
cdddcaaec9 new Makefile target mesonbuild
for easier local testing
2024-10-23 11:50:56 -07:00
Yann Collet
6939235f01 fixed meson build 2024-10-23 11:50:56 -07:00
Yann Collet
80a912dec1 fixed zstreamtest 2024-10-23 11:50:56 -07:00
Yann Collet
6dc52122e6 fixed c90 comment style 2024-10-23 11:50:56 -07:00
Yann Collet
20c3d176cd fix assert 2024-10-23 11:50:56 -07:00
Yann Collet
0d4b520657 only split full blocks
short term simplification
2024-10-23 11:50:56 -07:00
Yann Collet
dd38c677eb fixed single-library build 2024-10-23 11:50:56 -07:00
Yann Collet
8b3887f579 fixed kernel build 2024-10-23 11:50:56 -07:00
Yann Collet
f83ed087f6 fixed RLE detection test 2024-10-23 11:50:56 -07:00
Yann Collet
83a3402a92 fix overlap write scenario in presence of incompressible data 2024-10-23 11:50:56 -07:00
Yann Collet
fa147cbb4d more ZSTD_memset() to apply 2024-10-23 11:50:56 -07:00
Yann Collet
6021b6663a minor C++-ism
though I really wonder if this is a property worth maintaining.
2024-10-23 11:50:56 -07:00
Yann Collet
e2d7d08888 use ZSTD_memset()
for better portability on Linux kernel
2024-10-23 11:50:56 -07:00
Yann Collet
586ca96fec do not use new as variable name 2024-10-23 11:50:56 -07:00
Yann Collet
9e52789962 fixed strict C90 semantic 2024-10-23 11:50:56 -07:00
Yann Collet
a5bce4ae84 XP: add a pre-splitter
instead of ingesting only full blocks, make an analysis of data, and infer where to split.
2024-10-23 11:50:56 -07:00
Yann Collet
dfaf5fafb9
Merge pull request #4174 from facebook/bench_loadOnce
Modify benchmark to load sources only once
2024-10-23 11:14:05 -07:00
Yann Collet
f34bc9cee6 improve man page on benchmark mode
update the man page in troff format,
and the README with latest `--help` content and complementary details about benchmark mode.

also: display level 0 when doing decompression benchmark
2024-10-23 00:16:13 -07:00
Yann Collet
0079d515b1 Modify benchmark to only load sources once
After a regrettable update,
the benchmark module ended up reloading sources for every compression level.

While the delay itself is likely torelable,
the main issue is that the `--quiet` mode now also displays a loading summary between each compression line.
This wasn't the original intention, which is to produce a compact view of all compressions.

This is fixed in this version,
where sources are loaded only once, for all compression levels,
and loading summary is only displayed once.
2024-10-22 02:18:48 -07:00
Yann Collet
b880f20d52
Merge pull request #4171 from facebook/lvl3_ratio+
Improve compression ratio of levels 3 & 4
2024-10-17 11:39:41 -07:00
Yann Collet
41d870fbbf updated regression tests results 2024-10-17 11:06:26 -07:00
Yann Collet
ff8e98bebe enable regression tests at pull request time
was transferred from circleci,
but was only triggered on push into dev,
i.e. after pull request is merged.
2024-10-17 09:45:16 -07:00
Yann Collet
47d4f5662d rewrite code in the manner suggested by @terrelln 2024-10-17 09:37:23 -07:00
Yann Collet
61d08b0e42 fix test
a margin of 4 is insufficient to guarantee compression success.
2024-10-17 09:37:23 -07:00
Yann Collet
6326775166 slightly improved compression ratio at levels 3 & 4
The compression ratio benefits are small but consistent, i.e. always positive.
On `silesia.tar` corpus, this modification saves ~75 KB at level 3.
The measured speed cost is negligible, i.e. below noise level, between 0 and -1%.
2024-10-17 09:37:23 -07:00
Yann Collet
18a42190c2
Merge pull request #4170 from facebook/dict_cSpeed
Improve dictionary compression speed
2024-10-16 17:36:49 -07:00