krak/zstd

mirror of https://github.com/facebook/zstd.git synced 2025-07-03 22:30:29 +02:00

Author	SHA1	Message	Date
W. Felix Handte	d09f195ceb	Remove blockCompressor NULL Checks	2023-05-04 12:18:58 -04:00
W. Felix Handte	b7add1dd67	Abort if Unsupported Parameters Used	2023-05-04 12:18:58 -04:00
W. Felix Handte	50cdf84f58	Macro-Exclude Block Compressors from Declaration/Definition	2023-05-04 12:18:58 -04:00
W. Felix Handte	cbf3e26316	Allow `ZSTD_selectBlockCompressor()` to Return NULL Return an error rather than segfaulting.	2023-05-04 12:18:58 -04:00
W. Felix Handte	5d693cc38c	Coalesce Almost All Copyright Notices to Standard Phrasing ``` for f in $(find . \( -path ./.git -o -path ./tests/fuzz/corpora -o -path ./tests/regression/data-cache -o -path ./tests/regression/cache \) -prune -o -type f); do sed -i '/Copyright .* \(Yann Collet\)\|\(Meta Platforms\)/ s/Copyright .*/Copyright (c) Meta Platforms, Inc. and affiliates./' $f; done git checkout HEAD -- build/VS2010/libzstd-dll/libzstd-dll.rc build/VS2010/zstd/zstd.rc tests/test-license.py contrib/linux-kernel/test/include/linux/xxhash.h examples/streaming_compression_thread_pool.c lib/legacy/zstd_v0.c lib/legacy/zstd_v0*.h nano ./programs/windres/zstd.rc nano ./build/VS2010/zstd/zstd.rc nano ./build/VS2010/libzstd-dll/libzstd-dll.rc ```	2022-12-20 12:52:34 -05:00
W. Felix Handte	8927f985ff	Update Copyright Headers 'Facebook' -> 'Meta Platforms' ``` for f in $(find . \( -path ./.git -o -path ./tests/fuzz/corpora \) -prune -o -type f); do sed -i 's/Facebook, Inc\./Meta Platforms, Inc. and affiliates./' $f; done ```	2022-12-20 12:37:57 -05:00
Elliot Gorokhovsky	f6ef14329f	"Short cache" optimization for level 1-4 DMS (+5-30% compression speed) (#3152) * first attempt at fast DMS short cache * significant wins for some scenarios * fix all clang regressions * nits * fix 1.5% gcc11 regression on hot 110Kdict scenario * fix CI * nit * Add tags to doublefast hash table * use tags in doublefast DMS * Fix CI * Clean up some hardcoded logic / constants * Switch forCCtx to an enum * nit * add short cache to ip+1 long search * Move tag size into hashLog * Minor nits * Truncate dictionaries greater than 16MB in short cache mode * Helper function for tag comparison * Cap short cache hashLog at 24 to prevent overflow * size_t dictTagsMatch -> int dictTagsMatch * nit * Clean up and comment dictionary truncation * Move ZSTD_tableFillPurpose_e next to ZSTD_dictTableLoadMethod_e * Comment and expand helper functions * Asserts and documentation * nit	2022-06-21 17:27:19 -04:00
Dominique Pelle	b772f53952	Typo and grammar fixes	2022-03-12 08:58:04 +01:00
Yann Collet	7a18d709ae	updated all names to offBase convention	2021-12-29 17:30:43 -08:00
Yann Collet	435f5a2e6d	fixed regression test assert optLdm->offset might be == 0 in invalid case. Only use STORE_OFFSET() after validating it's a correct case.	2021-12-28 09:55:31 -08:00
Yann Collet	1aed962216	introduce macros STORE_OFFSET() and STORE_REPCODE() this is meant to abstract the sumtype representation required to transfer `offcode` to `ZSTD_storeSeq()`. Unfortunately, the sumtype numeric representation is currently a leaky abstraction that has permeated many other parts of the code, especially within `zstd_lazy.c` and also within `zstd_opt.c` and `zstd_compress.c`. While this PR does a good job of transferring a large number of call sites to the new macros, there are still a few sites where this transformation is more complex, or where the numeric representation itself is used "as is". One of the problematic areas is the decision to use the numeric format of the sumtype within the match finders of `zstd_lazy`. This commit doesn't change the behavior: it only introduces and employs the macros, and in the end the resulting code remains identical. Ultimately, if the numeric representation of the sumtype can be completely abstracted and no other part of the code depends on it, it will be possible to move it towards something slightly more efficient.	2021-12-23 22:03:30 -08:00
Yann Collet	b77fcac61f	change ZSTD_storeSeq() interface to accept matchLength instead of mlBase. This removes the need to do `- MINMATCH` at every call site. The new interface contract is checked with an `assert()`.	2021-12-23 12:03:33 -08:00
Dimitris Apostolou	ebbd675998	Fix typos	2021-11-13 10:04:04 +02:00
senhuang42	06f42c3bfd	Use new paramSwitch enum for LDM	2021-09-21 14:22:09 -04:00
senhuang42	b5c35d7ea3	Use new paramSwitch enum for LCM, row matchfinder, and block splitter	2021-09-21 14:22:02 -04:00
Nick Terrell	8389a5122b	Merge pull request #2602 from terrelln/ldm-opt [LDM] Speed optimization on repetitive data	2021-05-04 23:13:09 -07:00
Nick Terrell	32823bc150	[LDM] Speed optimization on repetitive data LDM does especially poorly on repetitive data when that data's hash happens to have `(hash & stopMask) == 0`. Either because the `stopMask == 0` or random chance. Optimize this case by skipping over repetitive patterns. The detection is very simplistic, but should catch most of the offending cases. ``` head -c 1G /dev/zero | perf stat -- ./zstd -1 -o /dev/null -v --zstd=ldmHashRateLog=1 --long 21.187881087 seconds time elapsed head -c 1G /dev/zero | perf stat -- ./zstd -1 -o /dev/null -v --zstd=ldmHashRateLog=1 --long 1.149707921 seconds time elapsed ```	2021-05-04 10:57:42 -07:00
Nick Terrell	34aff7ea06	Bug fix & run overflow correction much more frequently in tests * Fix overflow correction when `windowLog < cycleLog`. Previously, we got the correction wrong in this case, and our chain tables and binary trees would be corrupted. Now, we work as long as `maxDist` is a power of two, by adding `MAX(maxDist, cycleSize)` to our indices. * When `ZSTD_WINDOW_OVERFLOW_CORRECT_FREQUENTLY` is defined to non-zero run overflow correction as frequently as allowed without impacting compression ratio. * Enable `ZSTD_WINDOW_OVERFLOW_CORRECT_FREQUENTLY` in `fuzzer` and `zstreamtest` as well as all the OSS-Fuzz fuzzers. This has a 5-10% speed penalty at most, which seems reasonable.	2021-05-03 15:21:47 -07:00
Nick Terrell	4694423c4f	Add and integrate lazy row hash strategy	2021-04-07 09:53:34 -07:00
Nick Terrell	a494308ae9	[copyright][license] Switch to yearless copyright and some cleanup in the linux-kernel files * Switch to yearless copyright per FB policy * Fix up SPDX-License-Identifier lines in `contrib/linux-kernel` sources * Add zstd copyright/license header to the `contrib/linux-kernel` sources * Update the `tests/test-license.py` to check for yearless copyright * Improvements to `tests/test-license.py` * Check `contrib/linux-kernel` in `tests/test-license.py`	2021-03-30 10:30:43 -07:00
Quentin Carbonneaux	552efcac2d	relocate large arrays from the stack to ldmState_t	2021-02-10 16:16:54 +01:00
Quentin Carbonneaux	e2ad174d73	fix some compiler warnings	2021-02-08 20:19:16 +01:00
Quentin Carbonneaux	874a590e5c	deal safely with short inputs in ZSTD_ldm_generateSequences The fuzzer CI found this bug.	2021-02-04 11:15:24 +01:00
Quentin Carbonneaux	9f327c02fd	new core ldm algorithm	2021-02-03 22:24:07 +01:00
Quentin Carbonneaux	aee3dc877f	fix a variable name to reflect its nature	2021-01-22 02:24:19 -08:00
Quentin Carbonneaux	d6e3de77dc	fix warning and remove one more occurrence of makeEntryAndInsertByTag	2021-01-20 01:39:16 -08:00
Quentin Carbonneaux	e0d5eca8fa	fix forgotten numTagBits in getTagMask	2021-01-20 00:54:20 -08:00
Quentin Carbonneaux	1e65711ca5	a couple performance improvement changes for ldm	2021-01-20 00:54:20 -08:00
Thomas Waldmann	92a2b5ccc9	fixup: lits means literals	2021-01-07 23:30:42 +01:00
Thomas Waldmann	f9802d80a0	fix typos (work done by Andrea Gelmini)	2021-01-07 18:47:23 +01:00
Nick Terrell	66e811d782	[license] Update year to 2021	2021-01-04 17:53:52 -05:00
Nick Terrell	0953645837	Merge pull request #2362 from senhuang42/fix_ldm_fuzz_issue Fix long distance matcher OSS-fuzz issue	2020-10-27 11:13:03 -07:00
senhuang42	4d01979b62	Expose and call ZSTD_ldm_skipRawSeqStoreBytes()	2020-10-16 20:30:00 -04:00
senhuang42	d0550bb18f	Clarify argument names, fix DEBUGLOG() statements	2020-10-14 15:45:43 -04:00
senhuang42	3f99c9b38d	Adjust match backwards count args	2020-10-14 15:23:03 -04:00
senhuang42	bf0d559449	Introduce, implement, and call ZSTD_ldm_countBackwardsMatch_2segments()	2020-10-14 12:58:06 -04:00
senhuang42	a6165c1b28	Change matchState_t::ldmSeqStore to pointer	2020-10-07 14:13:57 -04:00
senhuang42	abce708a56	Move posInSequence correction to correct location	2020-10-07 13:56:25 -04:00
senhuang42	0fac8e07e1	Refactor usage of ms->ldmSeqStore so that it is not modified during compressBlock(), and simplify skipRawSeqStoreBytes	2020-10-07 13:56:25 -04:00
senhuang42	a5500cf2af	Refactor separate ldm variables all into one struct	2020-10-07 13:56:25 -04:00
senhuang42	031b7ec15f	Disable LDM minMatch adjustment when using opt parser	2020-10-07 13:56:25 -04:00
senhuang42	b8bfc4e63d	Add cSize regression test to fuzzer.c	2020-10-07 13:56:25 -04:00
senhuang42	10647924f1	Make function descriptions more accurate	2020-10-07 13:56:25 -04:00
senhuang42	7dee62c287	Reset ldmSeqStore after initStats_ultra() pass for btultra2	2020-10-07 13:56:25 -04:00
senhuang42	ea92fb3a68	Cleanups, add comments and explanations	2020-10-07 13:56:25 -04:00
senhuang42	6ccd97fc96	Fixed end of match boundary update issues	2020-10-07 13:56:25 -04:00
senhuang42	28394b64f2	Add proper bounds check on adding ldms	2020-10-07 13:56:25 -04:00
senhuang42	f57c7e6bbf	Add base adjustment correction	2020-10-07 13:56:25 -04:00
senhuang42	84009a076a	Add re-copying of ldmSeqStore after processing	2020-10-07 13:56:25 -04:00
senhuang42	35d9f488f5	Modify codepath to use opt parser exclusively if the compression level is high enough	2020-10-07 13:56:24 -04:00

94 Commits