Yann Collet
|
18a42190c2
|
Merge pull request #4170 from facebook/dict_cSpeed
Improve dictionary compression speed
|
2024-10-16 17:36:49 -07:00 |
|
Yann Collet
|
730d2dce41
|
fix test
|
2024-10-15 18:44:40 -07:00 |
|
Yann Collet
|
c2abfc5ba4
|
minor improvement to level 3 dictionary compression ratio
|
2024-10-15 17:58:33 -07:00 |
|
Yann Collet
|
e63896eb58
|
small dictionary compression speed improvement
not as good as small-blocks improvement,
but generally positive.
|
2024-10-15 17:48:35 -07:00 |
|
Yann Collet
|
def3ee9548
|
Merge pull request #4167 from facebook/ci_m32test_faster
attempt to make 32-bit tests faster
|
2024-10-12 01:57:55 -07:00 |
|
Yann Collet
|
e6740355e3
|
attempt parallel test running with -j
|
2024-10-11 18:01:28 -07:00 |
|
Yann Collet
|
6f2e29a234
|
measure if -O2 makes the test complete faster
|
2024-10-11 17:30:55 -07:00 |
|
Yann Collet
|
1024aa9252
|
attempt to make 32-bit tests faster
this is the longest CI test, reaching ~40 min on the last PR
|
2024-10-11 16:24:25 -07:00 |
|
Yann Collet
|
8c38bda935
|
Merge pull request #4165 from facebook/cspeed_cmov
Improve compression speed on small blocks
|
2024-10-11 16:20:19 -07:00 |
|
Yann Collet
|
8e5823b65c
|
rename variable
findMatch -> matchFound
since it's a test, as opposed to an active search operation.
suggested by @terrelln
|
2024-10-11 15:38:12 -07:00 |
|
Yann Collet
|
83de00316c
|
fixed parameter ordering in dfast
noticed by @terrelln
|
2024-10-11 15:36:15 -07:00 |
|
Yann Collet
|
7ba43091b8
|
Merge pull request #4164 from facebook/spec_043
spec update: huffman prefix code paragraph
|
2024-10-10 16:56:02 -07:00 |
|
Yann Collet
|
fa1fcb08ab
|
minor: better variable naming
|
2024-10-10 16:07:20 -07:00 |
|
Yann Collet
|
3e7c66acd1
|
added ascending order example
|
2024-10-09 01:06:24 -07:00 |
|
Yann Collet
|
d45aee43f4
|
make __asm__ __GNUC__-specific
|
2024-10-08 16:38:35 -07:00 |
|
Yann Collet
|
741b860fc1
|
store dummy bytes within ZSTD_match4Found_cmov()
feels more logical, better contained
|
2024-10-08 16:34:40 -07:00 |
|
Yann Collet
|
197c258a79
|
introduce memory barrier to force test order
suggested by @terrelln
|
2024-10-08 15:54:48 -07:00 |
|
Yann Collet
|
186b132495
|
made search strategy switchable
between cmov and branch
and use a simple heuristic based on wlog to select between them.
note: performance is not good on clang (yet)
|
2024-10-08 13:52:56 -07:00 |
|
Yann Collet
|
2cc600bab2
|
refactor search into an inline function
for easier swapping with a parameter
|
2024-10-08 11:10:48 -07:00 |
|
Yann Collet
|
3b343dcfb1
|
refactor huffman prefix code paragraph
|
2024-10-07 17:15:07 -07:00 |
|
Yann Collet
|
1e7fa242f4
|
minor refactor zstd_fast
make hot variables more local
|
2024-10-07 11:22:40 -07:00 |
|
Yann Collet
|
da23998e9a
|
Merge pull request #4160 from facebook/fix_nightly
fix dependency for nightly github actions tests
|
2024-10-03 21:02:39 -07:00 |
|
Yann Collet
|
b84653fc83
|
fix dependency for nightly github actions tests
|
2024-10-03 15:10:16 -07:00 |
|
Yann Collet
|
b7e1eef048
|
Merge pull request #4159 from facebook/spec_refactor_fse
specification update
|
2024-10-03 14:54:16 -07:00 |
|
Yann Collet
|
a8b86d024a
|
refactor documentation of the FSE decoding table build process
|
2024-10-02 23:09:06 -07:00 |
|
Yann Collet
|
75b0f5f4f5
|
Merge pull request #4153 from artem/fix-meson-includes
meson: Do not export private headers in libzstd_dep to avoid name clash
|
2024-10-02 16:51:44 -07:00 |
|
Yann Collet
|
dda3cdfdec
|
Merge pull request #4156 from facebook/rm_circleci
removing nightly tests built on circleci
|
2024-10-02 16:51:15 -07:00 |
|
Yann Collet
|
751bf1ffd8
|
Merge pull request #4157 from facebook/fix_result_c
fix incorrect pointer manipulation
|
2024-10-02 16:50:45 -07:00 |
|
Yann Collet
|
dcc8fd0472
|
Merge pull request #4158 from facebook/benchzstd_fclose
fix missing fclose()
|
2024-10-02 16:49:43 -07:00 |
|
Yann Collet
|
8edd147686
|
fix missing fclose()
fix #4151
|
2024-10-01 09:52:45 -07:00 |
|
Yann Collet
|
de6cc98e07
|
fix incorrect pointer manipulation
fix #4155
|
2024-10-01 09:25:26 -07:00 |
|
Yann Collet
|
3d5d3f5630
|
removing nightly tests built on circleci
|
2024-09-30 21:38:29 -07:00 |
|
Yann Collet
|
27bf1362fe
|
Merge pull request #4154 from dearblue/freebsd-14.1
Update FreeBSD VM image to 14.1
|
2024-09-30 11:54:32 -07:00 |
|
Artem Labazov
|
ccc02a9a77
|
meson: Fix contrib and tests build
|
2024-09-30 18:05:57 +03:00 |
|
Artem Labazov
|
d2d49a1161
|
meson: Do not export private headers in libzstd_dep to avoid name clash
This way libzstd_dep does not override, for instance, <xxhash.h>
|
2024-09-30 17:03:42 +03:00 |
|
dearblue
|
a3b5c4521c
|
Update FreeBSD VM image to 14.1
FreeBSD 14.0 will reach the end of life on 2024-09-30.
The updated 14.1 is scheduled to end-of-life on 2025-03-31.
ref. https://www.freebsd.org/releases/14.2R/schedule/
|
2024-09-30 22:45:17 +09:00 |
|
Yann Collet
|
984d11a4d1
|
Merge pull request #4146 from facebook/dictBench_Doc
update documentation: specify that Dictionary can be used for benchmark
|
2024-09-27 13:44:42 -07:00 |
|
Yann Collet
|
d2212c680a
|
Merge pull request #4013 from elasota/spec-clarify-offset-code-overflow
Specify that decoders may reject non-zero probabilities for larger offset codes than implementation supports
|
2024-09-27 13:42:32 -07:00 |
|
Yann Collet
|
039f404faa
|
update documentation to specify that Dictionary can be used for benchmark
fix #4139
|
2024-09-25 16:56:01 -07:00 |
|
inventor500
|
9215de52c7
|
Included suggestion from @neheb
|
2024-09-25 09:51:05 -07:00 |
|
inventor500
|
a8b544d460
|
Fixed warning when compiling pzstd with CPPFLAGS=-Wunused-result and CXXFLAGS=-std=c++17
|
2024-09-25 09:51:05 -07:00 |
|
Yann Collet
|
bc96d4b077
|
Merge pull request #4119 from xionghul/dev
Fix zstd-pgo run error
|
2024-09-24 17:55:43 -07:00 |
|
Yann Collet
|
d27a4cd4ac
|
Merge pull request #4143 from facebook/fix_dictsizemin_dic
fix doc nit: ZDICT_DICTSIZE_MIN
|
2024-09-24 17:55:25 -07:00 |
|
Ilya Tokar
|
e8fce38954
|
Optimize compression by avoiding unpredictable branches
Avoid unpredictable branch. Use conditional move to generate the address
that is guaranteed to be safe and compare unconditionally.
Instead of
    if (idx < limit && x[idx] == val)  // mispredicted idx < limit branch
do
    addr = cmov(safe, x+idx)
    if (*addr == val && idx < limit)   // almost always false so well predicted
Using microbenchmarks from https://github.com/google/fleetbench,
I get a speed-up of about 10%:
name                                                        old cpu/op   new cpu/op   delta
BM_ZSTD_COMPRESS_Fleet/compression_level:-7/window_log:15   1.46ns ± 3%  1.31ns ± 7%   -9.88%  (p=0.000 n=35+38)
BM_ZSTD_COMPRESS_Fleet/compression_level:-7/window_log:16   1.41ns ± 3%  1.28ns ± 3%   -9.56%  (p=0.000 n=36+39)
BM_ZSTD_COMPRESS_Fleet/compression_level:-5/window_log:15   1.61ns ± 1%  1.43ns ± 3%  -10.70%  (p=0.000 n=30+39)
BM_ZSTD_COMPRESS_Fleet/compression_level:-5/window_log:16   1.54ns ± 2%  1.39ns ± 3%   -9.21%  (p=0.000 n=37+39)
BM_ZSTD_COMPRESS_Fleet/compression_level:-3/window_log:15   1.82ns ± 2%  1.61ns ± 3%  -11.31%  (p=0.000 n=37+40)
BM_ZSTD_COMPRESS_Fleet/compression_level:-3/window_log:16   1.73ns ± 3%  1.56ns ± 3%   -9.50%  (p=0.000 n=38+39)
BM_ZSTD_COMPRESS_Fleet/compression_level:-1/window_log:15   2.12ns ± 2%  1.79ns ± 3%  -15.55%  (p=0.000 n=34+39)
BM_ZSTD_COMPRESS_Fleet/compression_level:-1/window_log:16   1.99ns ± 3%  1.72ns ± 3%  -13.70%  (p=0.000 n=38+38)
BM_ZSTD_COMPRESS_Fleet/compression_level:0/window_log:15    3.22ns ± 3%  2.94ns ± 3%   -8.67%  (p=0.000 n=38+40)
BM_ZSTD_COMPRESS_Fleet/compression_level:0/window_log:16    3.19ns ± 4%  2.86ns ± 4%  -10.55%  (p=0.000 n=40+38)
BM_ZSTD_COMPRESS_Fleet/compression_level:1/window_log:15    2.60ns ± 3%  2.22ns ± 3%  -14.53%  (p=0.000 n=40+39)
BM_ZSTD_COMPRESS_Fleet/compression_level:1/window_log:16    2.46ns ± 3%  2.13ns ± 2%  -13.67%  (p=0.000 n=39+36)
BM_ZSTD_COMPRESS_Fleet/compression_level:2/window_log:15    2.69ns ± 3%  2.46ns ± 3%   -8.63%  (p=0.000 n=37+39)
BM_ZSTD_COMPRESS_Fleet/compression_level:2/window_log:16    2.63ns ± 3%  2.36ns ± 3%  -10.47%  (p=0.000 n=40+40)
BM_ZSTD_COMPRESS_Fleet/compression_level:3/window_log:15    3.20ns ± 2%  2.95ns ± 3%   -7.94%  (p=0.000 n=35+40)
BM_ZSTD_COMPRESS_Fleet/compression_level:3/window_log:16    3.20ns ± 4%  2.87ns ± 4%  -10.33%  (p=0.000 n=40+40)
I've also measured the impact on internal workloads and saw a similar
~10% improvement in performance, measured as CPU usage per byte of data.
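The pattern described in this message can be sketched in plain C as below. This is only an illustration under stated assumptions, not the code from the commit: `read32`, `match4_branch`, `match4_select`, and `dummy` are hypothetical names, `limit` is assumed to be the number of positions from which 4 bytes can safely be read, and a ternary is used in the hope that the compiler emits a conditional move, which is not guaranteed (the compiler is also free to reorder the two tests).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: read 4 bytes without any alignment assumption. */
static uint32_t read32(const uint8_t* p)
{
    uint32_t v;
    memcpy(&v, p, sizeof v);
    return v;
}

/* Always-readable fallback location (hypothetical, arbitrary contents). */
static const uint8_t dummy[4] = { 0x12, 0x34, 0x56, 0x78 };

/* Branchy form: the range test guards the load,
 * so a hard-to-predict `idx < limit` costs a misprediction. */
static int match4_branch(const uint8_t* base, size_t idx, size_t limit, uint32_t val)
{
    assert(base != NULL);
    return (idx < limit) && (read32(base + idx) == val);
}

/* Select form: pick an address that is always safe to read,
 * compare unconditionally, and confirm the range only on a hit.
 * The value comparison is false most of the time, so it predicts well. */
static int match4_select(const uint8_t* base, size_t idx, size_t limit, uint32_t val)
{
    assert(base != NULL);
    const uint8_t* addr = (idx < limit) ? base + idx : dummy;
    if (read32(addr) != val) return 0;
    return idx < limit;   /* rejects an accidental hit on the dummy bytes */
}
```

Both functions return the same result for every input. The second form only moves the range test off the hot path, which is why the final `idx < limit` check is still required.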
|
2024-09-20 16:07:01 -04:00 |
|
Yann Collet
|
7a48dc230c
|
fix doc nit: ZDICT_DICTSIZE_MIN
fix #4142
|
2024-09-19 09:50:30 -07:00 |
|
Yann Collet
|
20707e3718
|
Merge pull request #4129 from facebook/mitigate_32bit
Limit range of operations on Indexes in 32-bit mode
|
2024-08-22 11:00:50 -07:00 |
|
Yann Collet
|
09cb37cbb1
|
Limit range of operations on Indexes in 32-bit mode
and use unsigned type.
This reduces the risk that an operation produces a negative number when crossing the 2 GB limit.
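As a rough illustration of why an unsigned type helps here, the sketch below computes the distance between two 32-bit indexes that straddle the 2 GB mark. It is not the change made in this commit: `index_distance` and `is_within_window` are hypothetical helpers introduced only for this example.

```c
#include <stdint.h>

/* Hypothetical helper: distance between two 32-bit indexes.
 * Unsigned subtraction is defined modulo 2^32, so the result stays
 * correct when the operands sit on either side of 0x80000000 (2 GB). */
static uint32_t index_distance(uint32_t current, uint32_t match)
{
    return current - match;
}

/* Hypothetical helper: is `match` no further than `windowSize` behind `current`? */
static int is_within_window(uint32_t current, uint32_t match, uint32_t windowSize)
{
    return index_distance(current, match) <= windowSize;
}
```

With `current = 0x80000010` and `match = 0x7FFFFFF0`, the unsigned distance is `0x20`. If the same indexes were held in a signed 32-bit type, `current` would already be outside the representable positive range.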
|
2024-08-21 11:03:43 -07:00 |
|
Yann Collet
|
ad038d8768
|
Merge pull request #4128 from facebook/dependabot/github_actions/github/codeql-action-3.26.2
Bump github/codeql-action from 3.25.1 to 3.26.2
|
2024-08-19 13:54:06 -07:00 |
|
dependabot[bot]
|
ec0c41414d
|
Bump github/codeql-action from 3.25.1 to 3.26.2
Bumps [github/codeql-action](https://github.com/github/codeql-action) from 3.25.1 to 3.26.2.
- [Release notes](https://github.com/github/codeql-action/releases)
- [Changelog](https://github.com/github/codeql-action/blob/main/CHANGELOG.md)
- [Commits](c7f9125735...429e197704 )
---
updated-dependencies:
- dependency-name: github/codeql-action
dependency-type: direct:production
update-type: version-update:semver-minor
...
Signed-off-by: dependabot[bot] <support@github.com>
|
2024-08-19 05:37:33 +00:00 |
|
Yann Collet
|
a761013b03
|
Merge pull request #4122 from facebook/dependabot/github_actions/actions/setup-java-4
Bump actions/setup-java from 3 to 4
|
2024-08-11 23:07:43 -07:00 |
|