krak/zstd

mirror of https://github.com/facebook/zstd.git synced 2025-03-06 16:56:49 +02:00

Author	SHA1	Message	Date
Yann Collet	18b1e67223	fixed extraneous return; strict C90 compliance test	2024-10-23 11:50:57 -07:00
Yann Collet	0be334d208	fixes static state allocation check; detected by @felixhandte	2024-10-23 11:50:57 -07:00
Yann Collet	06b7cfabf8	rewrote ZSTD_cwksp_initialAllocStart() to be easier to read, following a discussion with @felixhandte	2024-10-23 11:50:57 -07:00
Yann Collet	16450d0732	rewrite penalty update; suggested by @terrelln	2024-10-23 11:50:57 -07:00
Yann Collet	1ec5f9f1f6	changed loop exit condition so that there is no need to assert() within the loop.	2024-10-23 11:50:57 -07:00
Yann Collet	4662f6e646	renamed: FingerPrint => Fingerprint; suggested by @terrelln	2024-10-23 11:50:57 -07:00
Yann Collet	ea85dc7af6	conservatively estimate over-splitting in presence of incompressible loss; ensure data can never be expanded by more than 3 bytes per full block.	2024-10-23 11:50:57 -07:00
Yann Collet	5ae34e4c96	ensure `lastBlock` is correctly determined; reported by @terrelln	2024-10-23 11:50:57 -07:00
Yann Collet	7bad787d8b	made ZSTD_isPower2() an inline function	2024-10-23 11:50:57 -07:00
Yann Collet	a167571db5	added a faster block splitter variant that samples 1 in 5 positions. This variant is fast enough for lazy2 and btlazy2, but it's less good in combination with post-splitter at higher levels (>= btopt).	2024-10-23 11:50:57 -07:00
Yann Collet	1c62e714ab	minor split optimization: let's fill the initial stats directly into target fingerprint	2024-10-23 11:50:57 -07:00
Yann Collet	4ce91cbf2b	fixed workspace alignment on non 64-bit systems	2024-10-23 11:50:57 -07:00
Yann Collet	cae8d13294	splitter workspace is now provided by ZSTD_CCtx*	2024-10-23 11:50:56 -07:00
Yann Collet	4685eafa81	fix alignment test for non 64-bit systems	2024-10-23 11:50:56 -07:00
Yann Collet	73a6653653	ZSTD_splitBlock_4k() uses externally provided workspace; ideally, this workspace would be provided from the ZSTD_CCtx* state	2024-10-23 11:50:56 -07:00
Yann Collet	31d48e9ffa	fixing minor formatting issue in 32-bit mode with logs enabled	2024-10-23 11:50:56 -07:00
Yann Collet	6dc52122e6	fixed c90 comment style	2024-10-23 11:50:56 -07:00
Yann Collet	20c3d176cd	fix assert	2024-10-23 11:50:56 -07:00
Yann Collet	0d4b520657	only split full blocks; short term simplification	2024-10-23 11:50:56 -07:00
Yann Collet	8b3887f579	fixed kernel build	2024-10-23 11:50:56 -07:00
Yann Collet	f83ed087f6	fixed RLE detection test	2024-10-23 11:50:56 -07:00
Yann Collet	83a3402a92	fix overlap write scenario in presence of incompressible data	2024-10-23 11:50:56 -07:00
Yann Collet	fa147cbb4d	more ZSTD_memset() to apply	2024-10-23 11:50:56 -07:00
Yann Collet	6021b6663a	minor C++-ism, though I really wonder if this is a property worth maintaining.	2024-10-23 11:50:56 -07:00
Yann Collet	e2d7d08888	use ZSTD_memset() for better portability on Linux kernel	2024-10-23 11:50:56 -07:00
Yann Collet	586ca96fec	do not use `new` as variable name	2024-10-23 11:50:56 -07:00
Yann Collet	9e52789962	fixed strict C90 semantic	2024-10-23 11:50:56 -07:00
Yann Collet	a5bce4ae84	XP: add a pre-splitter; instead of ingesting only full blocks, make an analysis of data, and infer where to split.	2024-10-23 11:50:56 -07:00
Yann Collet	47d4f5662d	rewrite code in the manner suggested by @terrelln	2024-10-17 09:37:23 -07:00
Yann Collet	6326775166	slightly improved compression ratio at levels 3 & 4. The compression ratio benefits are small but consistent, i.e. always positive. On `silesia.tar` corpus, this modification saves ~75 KB at level 3. The measured speed cost is negligible, i.e. below noise level, between 0 and -1%.	2024-10-17 09:37:23 -07:00
Yann Collet	c2abfc5ba4	minor improvement to level 3 dictionary compression ratio	2024-10-15 17:58:33 -07:00
Yann Collet	e63896eb58	small dictionary compression speed improvement; not as good as small-blocks improvement, but generally positive.	2024-10-15 17:48:35 -07:00
Yann Collet	8c38bda935	Merge pull request #4165 from facebook/cspeed_cmov: Improve compression speed on small blocks	2024-10-11 16:20:19 -07:00
Yann Collet	8e5823b65c	rename variable name findMatch -> matchFound, since it's a test, as opposed to an active search operation; suggested by @terrelln	2024-10-11 15:38:12 -07:00
Yann Collet	83de00316c	fixed parameter ordering in `dfast`; noticed by @terrelln	2024-10-11 15:36:15 -07:00
Yann Collet	fa1fcb08ab	minor: better variable naming	2024-10-10 16:07:20 -07:00
Yann Collet	d45aee43f4	make __asm__ a __GNUC__ specific	2024-10-08 16:38:35 -07:00
Yann Collet	741b860fc1	store dummy bytes within ZSTD_match4Found_cmov(); feels more logical, better contained	2024-10-08 16:34:40 -07:00
Yann Collet	197c258a79	introduce memory barrier to force test order; suggested by @terrelln	2024-10-08 15:54:48 -07:00
Yann Collet	186b132495	made search strategy switchable between cmov and branch and use a simple heuristic based on wlog to select between them. note: performance is not good on clang (yet)	2024-10-08 13:52:56 -07:00
Yann Collet	2cc600bab2	refactor search into an inline function for easier swapping with a parameter	2024-10-08 11:10:48 -07:00
Yann Collet	1e7fa242f4	minor refactor zstd_fast: make hot variables more local	2024-10-07 11:22:40 -07:00
Ilya Tokar	e8fce38954	Optimize compression by avoiding unpredictable branches. Avoid unpredictable branch. Use conditional move to generate the address that is guaranteed to be safe and compare unconditionally. Instead of `if (idx < limit && x[idx] == val)` (mispredicted `idx < limit` branch), do `addr = cmov(safe, x+idx)` followed by `if (*addr == val && idx < limit)` (almost always false, so well predicted). Using microbenchmarks from https://github.com/google/fleetbench, I get about ~10% speed-up (columns: name, old cpu/op, new cpu/op, delta): BM_ZSTD_COMPRESS_Fleet/compression_level:-7/window_log:15 1.46ns ± 3% 1.31ns ± 7% -9.88% (p=0.000 n=35+38); BM_ZSTD_COMPRESS_Fleet/compression_level:-7/window_log:16 1.41ns ± 3% 1.28ns ± 3% -9.56% (p=0.000 n=36+39); BM_ZSTD_COMPRESS_Fleet/compression_level:-5/window_log:15 1.61ns ± 1% 1.43ns ± 3% -10.70% (p=0.000 n=30+39); BM_ZSTD_COMPRESS_Fleet/compression_level:-5/window_log:16 1.54ns ± 2% 1.39ns ± 3% -9.21% (p=0.000 n=37+39); BM_ZSTD_COMPRESS_Fleet/compression_level:-3/window_log:15 1.82ns ± 2% 1.61ns ± 3% -11.31% (p=0.000 n=37+40); BM_ZSTD_COMPRESS_Fleet/compression_level:-3/window_log:16 1.73ns ± 3% 1.56ns ± 3% -9.50% (p=0.000 n=38+39); BM_ZSTD_COMPRESS_Fleet/compression_level:-1/window_log:15 2.12ns ± 2% 1.79ns ± 3% -15.55% (p=0.000 n=34+39); BM_ZSTD_COMPRESS_Fleet/compression_level:-1/window_log:16 1.99ns ± 3% 1.72ns ± 3% -13.70% (p=0.000 n=38+38); BM_ZSTD_COMPRESS_Fleet/compression_level:0/window_log:15 3.22ns ± 3% 2.94ns ± 3% -8.67% (p=0.000 n=38+40); BM_ZSTD_COMPRESS_Fleet/compression_level:0/window_log:16 3.19ns ± 4% 2.86ns ± 4% -10.55% (p=0.000 n=40+38); BM_ZSTD_COMPRESS_Fleet/compression_level:1/window_log:15 2.60ns ± 3% 2.22ns ± 3% -14.53% (p=0.000 n=40+39); BM_ZSTD_COMPRESS_Fleet/compression_level:1/window_log:16 2.46ns ± 3% 2.13ns ± 2% -13.67% (p=0.000 n=39+36); BM_ZSTD_COMPRESS_Fleet/compression_level:2/window_log:15 2.69ns ± 3% 2.46ns ± 3% -8.63% (p=0.000 n=37+39); BM_ZSTD_COMPRESS_Fleet/compression_level:2/window_log:16 2.63ns ± 3% 2.36ns ± 3% -10.47% (p=0.000 n=40+40); BM_ZSTD_COMPRESS_Fleet/compression_level:3/window_log:15 3.20ns ± 2% 2.95ns ± 3% -7.94% (p=0.000 n=35+40); BM_ZSTD_COMPRESS_Fleet/compression_level:3/window_log:16 3.20ns ± 4% 2.87ns ± 4% -10.33% (p=0.000 n=40+40). I've also measured the impact on internal workloads and saw similar ~10% improvement in performance, measured by cpu usage/byte of data.	2024-09-20 16:07:01 -04:00
Yann Collet	7a48dc230c	fix doc nit: ZDICT_DICTSIZE_MIN; fix #4142	2024-09-19 09:50:30 -07:00
Yann Collet	09cb37cbb1	Limit range of operations on Indexes in 32-bit mode and use unsigned type. This reduces the risk that an operation produces a negative number when crossing the 2 GB limit.	2024-08-21 11:03:43 -07:00
Yann Collet	1eb32ff594	Merge pull request #4115 from Adenilson/leak01: [zstd][leak] Avoid memory leak on early return of ZSTD_generateSequence	2024-08-09 14:09:17 -07:00
Yann Collet	ee1fc7ee5c	Merge pull request #4114 from Adenilson/trace01: [riscv] Enable support for weak symbols	2024-08-09 14:08:57 -07:00
Adenilson Cavalcanti	a40bad8ec0	[zstd][leak] Avoid memory leak on early return of ZSTD_generateSequence. Sanity checks on a few of the context parameters (i.e. workers and block size) may prompt an early return on ZSTD_generateSequences. Allocating the destination buffer past those return points avoids a potential memory leak. This patch should fix issue #4112.	2024-08-06 18:01:20 -07:00
Adenilson Cavalcanti	6dbd49bcd0	[riscv] Enable support for weak symbols. Both gcc and clang support weak symbols on RISC-V, therefore let's enable it. This should fix issue #4069.	2024-08-06 16:55:32 -07:00
Yann Collet	cb784edf5d	added android-ndk-build	2024-07-30 11:34:49 -07:00
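
The branch-avoidance idea described in commit e8fce38954 can be illustrated with a minimal, standalone sketch. This is not the code from that commit: the function names `match_branch` and `match_select` and the `safe` dummy byte are hypothetical, chosen only for illustration. The sketch assumes the compiler lowers the ternary to a conditional move, which is common but not guaranteed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Branchy form: the bounds test comes first, so a hard-to-predict
 * `idx < limit` condition can cause frequent mispredictions. */
static int match_branch(const uint8_t* x, size_t idx, size_t limit, uint8_t val)
{
    assert(x != NULL);
    return (idx < limit) && (x[idx] == val);
}

/* Select-based form: pick an address that is always safe to read,
 * load from it unconditionally, and test the bounds last.
 * `safe` is a hypothetical dummy byte used when idx is out of range. */
static int match_select(const uint8_t* x, size_t idx, size_t limit, uint8_t val)
{
    static const uint8_t safe[1] = { 0 };
    const uint8_t* addr;
    assert(x != NULL);
    /* A compiler may turn this ternary into a conditional move;
     * that is an assumption, not something the C language guarantees. */
    addr = (idx < limit) ? x + idx : safe;
    /* The value comparison is usually false, hence well predicted;
     * the bounds test only matters when the values happen to match. */
    return (*addr == val) && (idx < limit);
}
```

Both functions return the same result for every input; only the order of the tests differs. The bounds test is still required in the second form, because the dummy byte can accidentally equal `val`.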

4633 Commits