Yann Collet
ff8e98bebe
enable regression tests at pull request time
...
The test was transferred from circleci,
but was only triggered on push into dev,
i.e. after the pull request is merged.
2024-10-17 09:45:16 -07:00
Yann Collet
47d4f5662d
rewrite code in the manner suggested by @terrelln
2024-10-17 09:37:23 -07:00
Yann Collet
61d08b0e42
fix test
...
a margin of 4 is insufficient to guarantee compression success.
2024-10-17 09:37:23 -07:00
Yann Collet
6326775166
slightly improved compression ratio at levels 3 & 4
...
The compression ratio benefits are small but consistent, i.e. always positive.
On `silesia.tar` corpus, this modification saves ~75 KB at level 3.
The measured speed cost is negligible, i.e. below noise level, between 0 and -1%.
2024-10-17 09:37:23 -07:00
Yann Collet
18a42190c2
Merge pull request #4170 from facebook/dict_cSpeed
...
Improve dictionary compression speed
2024-10-16 17:36:49 -07:00
Yann Collet
730d2dce41
fix test
2024-10-15 18:44:40 -07:00
Yann Collet
c2abfc5ba4
minor improvement to level 3 dictionary compression ratio
2024-10-15 17:58:33 -07:00
Yann Collet
e63896eb58
small dictionary compression speed improvement
...
not as good as the small-blocks improvement,
but generally positive.
2024-10-15 17:48:35 -07:00
Yann Collet
def3ee9548
Merge pull request #4167 from facebook/ci_m32test_faster
...
attempt to make 32-bit tests faster
2024-10-12 01:57:55 -07:00
Yann Collet
e6740355e3
attempt parallel test running with -j
2024-10-11 18:01:28 -07:00
Yann Collet
6f2e29a234
measure whether -O2 makes the test complete faster
2024-10-11 17:30:55 -07:00
Yann Collet
1024aa9252
attempt to make 32-bit tests faster
...
this is the longest CI test, reaching ~40 min on the last PR
2024-10-11 16:24:25 -07:00
Yann Collet
8c38bda935
Merge pull request #4165 from facebook/cspeed_cmov
...
Improve compression speed on small blocks
2024-10-11 16:20:19 -07:00
Yann Collet
8e5823b65c
rename variable
...
findMatch -> matchFound
since it's a test, as opposed to an active search operation.
suggested by @terrelln
2024-10-11 15:38:12 -07:00
Yann Collet
83de00316c
fixed parameter ordering in dfast
...
noticed by @terrelln
2024-10-11 15:36:15 -07:00
Yann Collet
7ba43091b8
Merge pull request #4164 from facebook/spec_043
...
spec update: huffman prefix code paragraph
2024-10-10 16:56:02 -07:00
Yann Collet
fa1fcb08ab
minor: better variable naming
2024-10-10 16:07:20 -07:00
Yann Collet
3e7c66acd1
added ascending order example
2024-10-09 01:06:24 -07:00
Yann Collet
d45aee43f4
make __asm__ usage __GNUC__-specific
2024-10-08 16:38:35 -07:00
Yann Collet
741b860fc1
store dummy bytes within ZSTD_match4Found_cmov()
...
feels more logical, better contained
2024-10-08 16:34:40 -07:00
Yann Collet
197c258a79
introduce memory barrier to force test order
...
suggested by @terrelln
2024-10-08 15:54:48 -07:00
Yann Collet
186b132495
made search strategy switchable
...
between cmov and branch
and use a simple heuristic based on wlog to select between them.
note: performance is not good on clang (yet)
2024-10-08 13:52:56 -07:00
Yann Collet
2cc600bab2
refactor search into an inline function
...
for easier swapping with a parameter
2024-10-08 11:10:48 -07:00
Yann Collet
3b343dcfb1
refactor huffman prefix code paragraph
2024-10-07 17:15:07 -07:00
Yann Collet
1e7fa242f4
minor refactor zstd_fast
...
make hot variables more local
2024-10-07 11:22:40 -07:00
Yann Collet
da23998e9a
Merge pull request #4160 from facebook/fix_nightly
...
fix dependency for nightly github actions tests
2024-10-03 21:02:39 -07:00
Yann Collet
b84653fc83
fix dependency for nightly github actions tests
2024-10-03 15:10:16 -07:00
Yann Collet
b7e1eef048
Merge pull request #4159 from facebook/spec_refactor_fse
...
specification update
2024-10-03 14:54:16 -07:00
Yann Collet
a8b86d024a
refactor documentation of the FSE decoding table build process
2024-10-02 23:09:06 -07:00
Yann Collet
75b0f5f4f5
Merge pull request #4153 from artem/fix-meson-includes
...
meson: Do not export private headers in libzstd_dep to avoid name clash
2024-10-02 16:51:44 -07:00
Yann Collet
dda3cdfdec
Merge pull request #4156 from facebook/rm_circleci
...
removing nightly tests built on circleci
2024-10-02 16:51:15 -07:00
Yann Collet
751bf1ffd8
Merge pull request #4157 from facebook/fix_result_c
...
fix incorrect pointer manipulation
2024-10-02 16:50:45 -07:00
Yann Collet
dcc8fd0472
Merge pull request #4158 from facebook/benchzstd_fclose
...
fix missing fclose()
2024-10-02 16:49:43 -07:00
Yann Collet
8edd147686
fix missing fclose()
...
fix #4151
2024-10-01 09:52:45 -07:00
Yann Collet
de6cc98e07
fix incorrect pointer manipulation
...
fix #4155
2024-10-01 09:25:26 -07:00
Yann Collet
3d5d3f5630
removing nightly tests built on circleci
2024-09-30 21:38:29 -07:00
Yann Collet
27bf1362fe
Merge pull request #4154 from dearblue/freebsd-14.1
...
Update FreeBSD VM image to 14.1
2024-09-30 11:54:32 -07:00
Artem Labazov
ccc02a9a77
meson: Fix contrib and tests build
2024-09-30 18:05:57 +03:00
Artem Labazov
d2d49a1161
meson: Do not export private headers in libzstd_dep to avoid name clash
...
This way libzstd_dep does not override, for instance, <xxhash.h>
2024-09-30 17:03:42 +03:00
dearblue
a3b5c4521c
Update FreeBSD VM image to 14.1
...
FreeBSD 14.0 will reach end of life on 2024-09-30.
The updated 14.1 is scheduled to reach end of life on 2025-03-31.
ref. https://www.freebsd.org/releases/14.2R/schedule/
2024-09-30 22:45:17 +09:00
Yann Collet
984d11a4d1
Merge pull request #4146 from facebook/dictBench_Doc
...
update documentation: specify that Dictionary can be used for benchmark
2024-09-27 13:44:42 -07:00
Yann Collet
d2212c680a
Merge pull request #4013 from elasota/spec-clarify-offset-code-overflow
...
Specify that decoders may reject non-zero probabilities for larger offset codes than implementation supports
2024-09-27 13:42:32 -07:00
Yann Collet
039f404faa
update documentation to specify that Dictionary can be used for benchmark
...
fix #4139
2024-09-25 16:56:01 -07:00
inventor500
9215de52c7
Included suggestion from @neheb
2024-09-25 09:51:05 -07:00
inventor500
a8b544d460
Fixed warning when compiling pzstd with CPPFLAGS=-Wunused-result and CXXFLAGS=-std=c++17
2024-09-25 09:51:05 -07:00
Yann Collet
bc96d4b077
Merge pull request #4119 from xionghul/dev
...
Fix zstd-pgo run error
2024-09-24 17:55:43 -07:00
Yann Collet
d27a4cd4ac
Merge pull request #4143 from facebook/fix_dictsizemin_dic
...
fix doc nit: ZDICT_DICTSIZE_MIN
2024-09-24 17:55:25 -07:00
Ilya Tokar
e8fce38954
Optimize compression by avoiding unpredictable branches
...
Avoid unpredictable branch. Use conditional move to generate the address
that is guaranteed to be safe and compare unconditionally.
Instead of
    if (idx < limit && x[idx] == val) // mispredicted idx < limit branch
Do
    addr = cmov(safe, x+idx)
    if (*addr == val && idx < limit) // almost always false so well predicted
Using microbenchmarks from https://github.com/google/fleetbench ,
I get a ~10% speed-up:
name                                                       old cpu/op   new cpu/op   delta
BM_ZSTD_COMPRESS_Fleet/compression_level:-7/window_log:15  1.46ns ± 3%  1.31ns ± 7%   -9.88%  (p=0.000 n=35+38)
BM_ZSTD_COMPRESS_Fleet/compression_level:-7/window_log:16  1.41ns ± 3%  1.28ns ± 3%   -9.56%  (p=0.000 n=36+39)
BM_ZSTD_COMPRESS_Fleet/compression_level:-5/window_log:15  1.61ns ± 1%  1.43ns ± 3%  -10.70%  (p=0.000 n=30+39)
BM_ZSTD_COMPRESS_Fleet/compression_level:-5/window_log:16  1.54ns ± 2%  1.39ns ± 3%   -9.21%  (p=0.000 n=37+39)
BM_ZSTD_COMPRESS_Fleet/compression_level:-3/window_log:15  1.82ns ± 2%  1.61ns ± 3%  -11.31%  (p=0.000 n=37+40)
BM_ZSTD_COMPRESS_Fleet/compression_level:-3/window_log:16  1.73ns ± 3%  1.56ns ± 3%   -9.50%  (p=0.000 n=38+39)
BM_ZSTD_COMPRESS_Fleet/compression_level:-1/window_log:15  2.12ns ± 2%  1.79ns ± 3%  -15.55%  (p=0.000 n=34+39)
BM_ZSTD_COMPRESS_Fleet/compression_level:-1/window_log:16  1.99ns ± 3%  1.72ns ± 3%  -13.70%  (p=0.000 n=38+38)
BM_ZSTD_COMPRESS_Fleet/compression_level:0/window_log:15   3.22ns ± 3%  2.94ns ± 3%   -8.67%  (p=0.000 n=38+40)
BM_ZSTD_COMPRESS_Fleet/compression_level:0/window_log:16   3.19ns ± 4%  2.86ns ± 4%  -10.55%  (p=0.000 n=40+38)
BM_ZSTD_COMPRESS_Fleet/compression_level:1/window_log:15   2.60ns ± 3%  2.22ns ± 3%  -14.53%  (p=0.000 n=40+39)
BM_ZSTD_COMPRESS_Fleet/compression_level:1/window_log:16   2.46ns ± 3%  2.13ns ± 2%  -13.67%  (p=0.000 n=39+36)
BM_ZSTD_COMPRESS_Fleet/compression_level:2/window_log:15   2.69ns ± 3%  2.46ns ± 3%   -8.63%  (p=0.000 n=37+39)
BM_ZSTD_COMPRESS_Fleet/compression_level:2/window_log:16   2.63ns ± 3%  2.36ns ± 3%  -10.47%  (p=0.000 n=40+40)
BM_ZSTD_COMPRESS_Fleet/compression_level:3/window_log:15   3.20ns ± 2%  2.95ns ± 3%   -7.94%  (p=0.000 n=35+40)
BM_ZSTD_COMPRESS_Fleet/compression_level:3/window_log:16   3.20ns ± 4%  2.87ns ± 4%  -10.33%  (p=0.000 n=40+40)
I've also measured the impact on internal workloads and saw a similar
~10% improvement in performance, measured by cpu usage per byte of data.
2024-09-20 16:07:01 -04:00
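The branch-avoidance idea described in commit e8fce38954 above can be sketched in plain C as follows. This is only an illustration under stated assumptions, not the code that was merged: the names `read32`, `match4_branch`, `match4_select`, and `dummy` are hypothetical, the caller is assumed to guarantee that `idx < limit` implies 4 readable bytes at `base + idx`, and a ternary is merely a hint that the compiler may or may not lower to a conditional move.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Read 4 bytes without making alignment assumptions. */
static uint32_t read32(const unsigned char* p)
{
    uint32_t v;
    memcpy(&v, p, sizeof v);
    return v;
}

/* Branchy form: the bounds test comes first and can be hard to predict. */
static int match4_branch(const unsigned char* base, size_t idx,
                         size_t limit, uint32_t val)
{
    return (idx < limit) && (read32(base + idx) == val);
}

/* Select-based form (hypothetical sketch): always load from an address
 * that is safe to read, compare unconditionally, and test the bounds last.
 * Assumption: idx < limit implies base[idx..idx+3] is readable. */
static int match4_select(const unsigned char* base, size_t idx,
                         size_t limit, uint32_t val)
{
    static const unsigned char dummy[4] = { 0x12, 0x34, 0x56, 0x78 };
    /* The compiler is free to turn this select into a cmov, or not. */
    const unsigned char* addr = (idx < limit) ? base + idx : dummy;
    if (read32(addr) != val) return 0;  /* common case: no match */
    return idx < limit;                 /* rejects a chance match against dummy */
}
```

Both functions return the same result for every input that satisfies the assumption above; the second form only changes which comparison the processor has to predict. Later entries in this log (the memory barrier and the cmov/branch switch) suggest that getting a compiler to actually keep this shape took additional work.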
Yann Collet
7a48dc230c
fix doc nit: ZDICT_DICTSIZE_MIN
...
fix #4142
2024-09-19 09:50:30 -07:00
Yann Collet
20707e3718
Merge pull request #4129 from facebook/mitigate_32bit
...
Limit range of operations on Indexes in 32-bit mode
2024-08-22 11:00:50 -07:00