
krak/zstd

mirror of https://github.com/facebook/zstd.git synced 2025-07-16 20:24:32 +02:00

Author	SHA1	Message	Date
Yann Collet	7306832e8a	Merge pull request #3570 from facebook/rsync_doc [easy] minor doc update for --rsyncable	2023-03-27 09:07:13 -07:00
W. Felix Handte	1b8bddc41e	[contrib/pzstd] Detect and Select Maximum Available C++ Standard Rather than remove the flag entirely, as proposed in #3499, this commit uses the newest C++ standard the compiler supports. This retains the selection of using only standardized features (excluding GNU extensions) and keeps the recency requirements of the codebase explicit. Tested with various versions of `g++` and `clang++`.	2023-03-27 11:24:47 -04:00
Yann Collet	167157dd74	Merge pull request #3572 from facebook/dependabot/github_actions/actions/checkout-3.5.0 Bump actions/checkout from 3.3.0 to 3.5.0	2023-03-27 07:07:55 -07:00
dependabot[bot]	191d22994f	Bump github/codeql-action from 2.2.6 to 2.2.8 Bumps [github/codeql-action](https://github.com/github/codeql-action) from 2.2.6 to 2.2.8. - [Release notes](https://github.com/github/codeql-action/releases) - [Changelog](https://github.com/github/codeql-action/blob/main/CHANGELOG.md) - [Commits](`16964e90ba...67a35a0858`) --- updated-dependencies: - dependency-name: github/codeql-action dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com>	2023-03-27 06:06:11 +00:00
dependabot[bot]	4cf9c7e098	Bump actions/checkout from 3.3.0 to 3.5.0 Bumps [actions/checkout](https://github.com/actions/checkout) from 3.3.0 to 3.5.0. - [Release notes](https://github.com/actions/checkout/releases) - [Changelog](https://github.com/actions/checkout/blob/main/CHANGELOG.md) - [Commits](`ac59398561...8f4b7f8486`) --- updated-dependencies: - dependency-name: actions/checkout dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <support@github.com>	2023-03-27 06:06:05 +00:00
Yann Collet	35c0c2075e	minor doc update on --rsyncable as requested by @devZer0. fix #3567	2023-03-23 15:42:27 -06:00
Rick Mark	ca799f84ca	Merge branch 'readme_cmake_fat' of github.com:facebook/zstd into readme_cmake_fat	2023-03-23 09:44:06 -07:00
Rick Mark	408bd1e9fe	Add instructions for building Universal2 on macOS via CMake	2023-03-23 09:41:31 -07:00
Tobias Hieta	979b047114	Disable linker flag detection on MSVC/ClangCL. This fixes compilation with clang-cl on Windows. There is a bug in cmake so that check_linker_flag() doesn't give the correct result when using link.exe/lld-link.exe. Details in CMake's gitlab: https://gitlab.kitware.com/cmake/cmake/-/issues/22023 Fixes #3522	2023-03-22 22:13:57 +01:00
Rick Mark	82cf6037ac	Add instructions for building Universal2 on macOS via CMake	2023-03-22 11:28:03 -07:00
daniellerozenblit	3e0550ee52	fix window update (#3556)	2023-03-21 13:28:26 -04:00
Nick Terrell	a3c3a38b9b	[lazy] Skip over incompressible data

Every 256 bytes the lazy match finders process without finding a match, they will increase their step size by 1. So for bytes [0, 256) they search every position, for bytes [256, 512) they search every other position, and so on. However, they currently still insert every position into their hash tables. This is different from fast & dfast, which only insert the positions they search. This PR changes that, so now after we've searched 2KB without finding any matches, at which point we'll only be searching one in 9 positions, we'll stop inserting every position, and only insert the positions we search. The exact cutoff of 2KB isn't terribly important, I've just selected a cutoff that is reasonably large, to minimize the impact on "normal" data. This PR only adds skipping to greedy, lazy, and lazy2, but does not touch btlazy2.

| Dataset | Level | Compiler     | CSize ∆ | Speed ∆ |
|---------|-------|--------------|---------|---------|
| Random  | 5     | clang-14.0.6 | 0.0%    | +704%   |
| Random  | 5     | gcc-12.2.0   | 0.0%    | +670%   |
| Random  | 7     | clang-14.0.6 | 0.0%    | +679%   |
| Random  | 7     | gcc-12.2.0   | 0.0%    | +657%   |
| Random  | 12    | clang-14.0.6 | 0.0%    | +1355%  |
| Random  | 12    | gcc-12.2.0   | 0.0%    | +1331%  |
| Silesia | 5     | clang-14.0.6 | +0.002% | +0.35%  |
| Silesia | 5     | gcc-12.2.0   | +0.002% | +2.45%  |
| Silesia | 7     | clang-14.0.6 | +0.001% | -1.40%  |
| Silesia | 7     | gcc-12.2.0   | +0.007% | +0.13%  |
| Silesia | 12    | clang-14.0.6 | +0.011% | +22.70% |
| Silesia | 12    | gcc-12.2.0   | +0.011% | -6.68%  |
| Enwik8  | 5     | clang-14.0.6 | 0.0%    | -1.02%  |
| Enwik8  | 5     | gcc-12.2.0   | 0.0%    | +0.34%  |
| Enwik8  | 7     | clang-14.0.6 | 0.0%    | -1.22%  |
| Enwik8  | 7     | gcc-12.2.0   | 0.0%    | -0.72%  |
| Enwik8  | 12    | clang-14.0.6 | 0.0%    | +26.19% |
| Enwik8  | 12    | gcc-12.2.0   | 0.0%    | -5.70%  |

The speed difference for clang at level 12 is real, but is probably caused by some sort of alignment or codegen issues. clang is significantly slower than gcc before this PR, but gets up to parity with it. I also measured the ratio difference for the HC match finder, and it looks basically the same as the row-based match finder. The speedup on random data looks similar. And performance is about neutral, without the big difference at level 12 for either clang or gcc.	2023-03-20 11:18:29 -07:00
Peter Pentchev	3b001a38fe	Simplify line splitting in the CLI tests	2023-03-20 11:17:43 -07:00
Peter Pentchev	29b8a3d8f2	Fix a Python bytes/int mismatch in CLI tests In Python 3.x, a single element of a bytes array is returned as an integer number. Thus, NEWLINE is an int variable, and attempting to add it to the line array will fail with a type mismatch error that may be demonstrated as follows: [roam@straylight ~]$ python3 -c 'b"hello" + b"\n"[0]' Traceback (most recent call last): File "<string>", line 1, in <module> TypeError: can't concat int to bytes [roam@straylight ~]$	2023-03-20 11:17:43 -07:00
Yann Collet	e2208242ac	Merge pull request #3553 from facebook/ldm_dict added documentation for LDM + dictionary compatibility	2023-03-16 11:20:32 -07:00
Nick Terrell	fbd97f305a	Deprecated bufferless and block level APIs * Mark all bufferless and block level functions as deprecated * Update documentation to suggest not using these functions * Add `_deprecated()` wrappers for functions that we use internally and call those instead	2023-03-16 10:04:15 -07:00
daniellerozenblit	53bad103ce	patch-from speed optimization (#3545 ) * patch-from speed optimization: only load portion of dictionary into normal matchfinders * test regression for x8 multiplier * fix off-by-one error for bit shift bound * restrict patchfrom speed optimization to strategy < ZSTD_btultra * update results.csv * update regression test	2023-03-14 20:36:56 -04:00
Yann Collet	f4563d87b9	added documentation for LDM + dictionary compatibility	2023-03-14 17:17:21 -07:00
Yann Collet	488e45f38b	Merge pull request #3547 from facebook/seekable_doc added documentation for the seekable format	2023-03-13 20:25:58 -07:00
Yonatan Komornik	91f4c23e63	Add salt into row hash (#3528 part 2) (#3533 ) Part 2 of #3528 Adds hash salt that helps to avoid regressions where consecutive compressions use the same tag space with similar data (running zstd -b5e7 enwik8 -B128K reproduces this regression).	2023-03-13 15:34:13 -07:00
Yonatan Komornik	9420bce8a4	Add init once memory (#3528 ) (#3529 ) - Adds memory type that is guaranteed to have been initialized at least once in the workspace's lifetime. - Changes tag space in row hash to be based on init once memory.	2023-03-13 13:20:49 -07:00
dependabot[bot]	e2965edd10	Bump github/codeql-action from 2.2.5 to 2.2.6 (#3549 ) Bumps [github/codeql-action](https://github.com/github/codeql-action) from 2.2.5 to 2.2.6. - [Release notes](https://github.com/github/codeql-action/releases) - [Changelog](https://github.com/github/codeql-action/blob/main/CHANGELOG.md) - [Commits](`32dc499307...16964e90ba`) --- updated-dependencies: - dependency-name: github/codeql-action dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2023-03-13 10:07:20 -07:00
Yonatan Komornik	a91e91d614	[Bugfix] row hash tries to match position 0 (#3548 ) #3543 decreases the size of the tagTable by a factor of 2, which requires using the first tag position in each row for head position instead of a tag. Although position 0 stopped being a valid match, it still persisted in mask calculation resulting in the matches loops possibly terminating before it should have. The fix skips position 0 to solve this problem.	2023-03-13 10:00:03 -07:00
Yann Collet	dd8cb5a0f1	added documentation for the seekable format and notably provide additional context for the Maximum Frame Size parameter. requested by @P-E-Meunier at `1df9f36c6c (commitcomment-103856979)`.	2023-03-10 15:54:31 -08:00
Yonatan Komornik	33e39094e7	Reduce RowHash's tag space size by x2 (#3543) Allocate half the memory for tag space, which means that we get one less slot for an actual tag (needs to be used for next position index). The result is a slight loss in compression ratio (up to 0.2%) and some regressions/improvements to speed depending on level and sample. In turn, we get to save 16% of the hash table's space (5 bytes per entry instead of 6 bytes per entry).	2023-03-10 14:15:04 -08:00
Yann Collet	134d332b10	Merge pull request #3544 from facebook/seek_faster Improved seekable format ingestion speed for small frame size	2023-03-10 12:33:33 -08:00
Yann Collet	1df9f36c6c	Improved seekable format ingestion speed for small frame size As reported by @P-E-Meunier in https://github.com/facebook/zstd/issues/2662#issuecomment-1443836186, seekable format ingestion speed can be particularly slow when selected `FRAME_SIZE` is very small, especially in combination with the recent row_hash compression mode. The specific scenario mentioned was `pijul`, using frame sizes of 256 bytes and level 10. This is improved in this PR, by providing approximate parameter adaptation to the compression process. Tested locally on a M1 laptop, ingestion of `enwik8` using `pijul` parameters went from 35sec. (before this PR) to 2.5sec (with this PR). For the specific corner case of a file full of zeroes, this is even more pronounced, going from 45sec. to 0.5sec. These benefits are unrelated to (and come on top of) other improvement efforts currently being made by @yoniko for the row_hash compression method specifically. The `seekable_compress` test program has been updated to allow setting compression level, in order to produce these performance results.	2023-03-09 18:00:30 -08:00
Felix Handte	d55a6483d7	Merge pull request #3542 from felixhandte/pin-moar-action-deps Pin Moar Action Dependencies	2023-03-09 16:22:11 -08:00
W. Felix Handte	cd9486031d	Also Pin Dockerfile Dependency Hashes	2023-03-09 17:01:22 -05:00
Felix Handte	283c228abe	Merge pull request #3541 from felixhandte/fix-setvbuf-segfault Avoid Segfault Caused by Calling `setvbuf()` on Null File Pointer	2023-03-09 13:54:11 -08:00
Yann Collet	e769da1645	Merge pull request #3526 from facebook/bench_zstd_api Simplify benchmark unit invocation API from CLI	2023-03-09 13:11:11 -08:00
Yann Collet	6bedef8095	Merge pull request #3538 from facebook/doc_huffman added clarifications for sizes of compressed huffman blocks and streams.	2023-03-09 13:09:42 -08:00
daniellerozenblit	e0fc9fd90b	Merge pull request #3486 from daniellerozenblit/patch-from-low-memory-mode Mmap large dictionaries in patch-from mode	2023-03-09 15:30:09 -05:00
Nick Terrell	c40c7378c6	Clarify dstCapacity requirements Clarify `dstCapacity` requirements for single-pass functions. Fixes #3524.	2023-03-09 10:18:30 -08:00
W. Felix Handte	1ec556238e	Pin Moar Action Dependencies An offering to the Scorecard gods, may they have mercy on our souls.	2023-03-09 12:54:07 -05:00
W. Felix Handte	957a0ae52d	Add CLI Test	2023-03-09 12:48:11 -05:00
W. Felix Handte	c4c3e11958	Avoid Calling `setvbuf()` on Null File Pointer	2023-03-09 12:47:40 -05:00
W. Felix Handte	50e8f55e7d	Fix Python 3.6 Incompatibility in CLI Tests	2023-03-09 12:46:37 -05:00
Dmitriy Voropaev	b7080f4c67	Increase tests timeout Current timeout is too small for some slower machines, e.g. most modern riscv64 boards, where tests fail with the following diagnostics: Traceback (most recent call last): File "/usr/src/RPM/BUILD/zstd-1.5.4-alt2/tests/./cli-tests/run.py", line 734, in <module> success = run_tests(tests, opts) File "/usr/src/RPM/BUILD/zstd-1.5.4-alt2/tests/./cli-tests/run.py", line 601, in run_tests tests[test_case.name] = test_case.run() File "/usr/src/RPM/BUILD/zstd-1.5.4-alt2/tests/./cli-tests/run.py", line 285, in run return self.analyze() File "/usr/src/RPM/BUILD/zstd-1.5.4-alt2/tests/./cli-tests/run.py", line 275, in analyze self._join_test() File "/usr/src/RPM/BUILD/zstd-1.5.4-alt2/tests/./cli-tests/run.py", line 330, in _join_test (stdout, stderr) = self._test_process.communicate(timeout=self._opts.timeout) File "/usr/lib64/python3.10/subprocess.py", line 1154, in communicate stdout, stderr = self._communicate(input, endtime, timeout) File "/usr/lib64/python3.10/subprocess.py", line 2006, in _communicate self._check_timeout(endtime, orig_timeout, stdout, stderr) File "/usr/lib64/python3.10/subprocess.py", line 1198, in _check_timeout raise TimeoutExpired( subprocess.TimeoutExpired: Command '['/usr/src/RPM/BUILD/zstd-1.5.4-alt2/tests/cli-tests/compression/window-resize.sh']' timed out after 60 seconds	2023-03-09 16:31:05 +04:00
Danielle Rozenblit	70850eb72b	assert to ensure that dict buffer type is valid	2023-03-08 16:54:57 -08:00
Yann Collet	64e8511b26	added clarifications for sizes of compressed huffman blocks and streams.	2023-03-08 15:31:36 -08:00
Nick Terrell	07a2a33135	Add ZSTD_set{C,F,}Params() helper functions * Add ZSTD_setFParams() and ZSTD_setParams() * Modify ZSTD_setCParams() to use ZSTD_setParameter() to avoid a second path setting parameters * Add unit tests * Update documentation to suggest using them to replace deprecated functions Fixes #3396.	2023-03-08 09:57:35 -08:00
Danielle Rozenblit	96e55c14f2	ability to disable mmap + struct to manage FIO dictionary	2023-03-08 08:06:10 -08:00
Nick Terrell	6313a58e45	[linux-kernel] Fix assert definition Backport upstream fix of the assert definition. This code is currently unused, and can be enabled for testing, which is why it wasn't caught. https://lore.kernel.org/lkml/20230129131436.1343228-1-j.neuschaefer@gmx.net/	2023-03-07 16:53:36 -08:00
Yonatan Komornik	988ce61a0c	Adds initialization of clevel to static cdict (#3525 ) (#3527 ) - Initializes clevel in `ZSTD_CCtxParams_init` - Adds CI workflow for msan fuzzers runs without optimization (`-O0`) - Fixes Makefile to correctly pass on user defined `MOREFLAGS` and `FUZZER_FLAGS` in cases they have been overwritten	2023-03-06 18:05:12 -08:00
Yann Collet	1e38e07b3d	simplified BMK_benchFilesAdvanced()	2023-03-06 12:34:13 -08:00
Yann Collet	9efc14804e	minor: fixed zlib wrapper internal benchmark another possibility could be to link it to programs/benchfn . Not worth the effort.	2023-03-06 12:20:06 -08:00
Yann Collet	db79219f70	simplify BMK_syntheticTest()	2023-03-06 12:15:22 -08:00
Yann Collet	db7d7b6974	Merge pull request #3516 from dloidolt/fullbench_2_files fullbench with two files	2023-03-06 11:56:30 -08:00
dependabot[bot]	1be95291a8	Bump github/codeql-action from 2.2.4 to 2.2.5 (#3518 ) Bumps [github/codeql-action](https://github.com/github/codeql-action) from 2.2.4 to 2.2.5. - [Release notes](https://github.com/github/codeql-action/releases) - [Changelog](https://github.com/github/codeql-action/blob/main/CHANGELOG.md) - [Commits](`17573ee1cc...32dc499307`) --- updated-dependencies: - dependency-name: github/codeql-action dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2023-03-02 10:44:06 -08:00
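
The skipping rule described in commit a3c3a38b9b ("[lazy] Skip over incompressible data") can be sketched in a few lines of Python. This is only an illustration of the arithmetic stated in the commit message, not the zstd implementation. The names `SEARCH_WINDOW`, `INSERT_CUTOFF`, `step_size`, `inserts_every_position`, and `searched_positions` are hypothetical, and the exact boundary at which insertion stops is an assumption.

```python
# Illustrative sketch only; names and the exact cutoff boundary are assumptions.
SEARCH_WINDOW = 256    # bytes processed without a match before the step grows by 1
INSERT_CUTOFF = 2048   # the "2KB" cutoff mentioned in the commit message


def step_size(bytes_since_match: int) -> int:
    """Step between searched positions after a run with no match."""
    return bytes_since_match // SEARCH_WINDOW + 1


def inserts_every_position(bytes_since_match: int) -> bool:
    """Before the cutoff every position is inserted; afterwards only searched ones."""
    return bytes_since_match < INSERT_CUTOFF


def searched_positions(length: int) -> list[int]:
    """Positions searched in a run of `length` bytes, assuming no match is ever found."""
    positions = []
    pos = 0
    while pos < length:
        positions.append(pos)
        pos += step_size(pos)
    return positions
```

Under these assumptions, `step_size(2048)` evaluates to 9, which is consistent with the commit's statement that one in 9 positions is searched at the 2KB mark.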

10235 Commits
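
The bytes/int mismatch described in commit 29b8a3d8f2 can be reproduced in isolation. The sketch below shows the behavior and one common way to avoid it (slicing instead of indexing). It is not necessarily the change made in that commit, and the helper names are hypothetical.

```python
# Illustrative sketch only; helper names are hypothetical.
NEWLINE_INT = b"\n"[0]      # indexing a bytes object yields an int (10)
NEWLINE_BYTES = b"\n"[0:1]  # slicing yields a bytes object of length 1


def concat_with_int_fails() -> bool:
    """Return True if bytes + int raises TypeError, as the commit message shows."""
    try:
        b"hello" + NEWLINE_INT
    except TypeError:
        return True
    return False


def append_newline(line: bytes) -> bytes:
    """Append a newline using a bytes value, so the types match."""
    return line + NEWLINE_BYTES
```

Here `append_newline(b"hello")` returns `b"hello\n"`, while the indexed form raises the `TypeError` quoted in the commit message.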