krak/zstd - zstd - Gitea: Git with a cup of tea

krak/zstd

mirror of https://github.com/facebook/zstd.git synced 2025-03-06 16:56:49 +02:00

Author	SHA1	Message	Date
Nick Terrell	395a2c5462	[bug-fix] Fix rare corruption bug affecting the block splitter The block splitter confuses sequences with literal length == 65536 that use a repeat offset code. It interprets this as literal length == 0 when deciding the meaning of the repeat offset, and corrupts the repeat offset history. This is benign, merely causing suboptimal compression performance, if the confused history is flushed before the end of the block, e.g. if there are 3 consecutive non-repeat code sequences after the mistake. It also is only triggered if the block splitter decided to split the block. All that to say: This is a rare bug, and requires quite a few conditions to trigger. However, the good news is that if you have a way to validate that the decompressed data is correct, e.g. you've enabled zstd's checksum or have a checksum elsewhere, the original data is very likely recoverable. So if you were affected by this bug please reach out. The fix is to remind the block splitter that the literal length is actually 64K. The test case is a bit tricky to set up, but I've managed to reproduce the issue. Thanks to @danlark1 for alerting us to the issue and providing us a reproducer!	2023-02-23 10:54:31 -08:00
Nick Terrell	a70ca2bd7d	Fix off-by-one error in superblock mode (#3221 ) Fixes #3212. Long literal and match lengths had an off-by-one error in ZSTD_getSequenceLength. Fix the off-by-one error, and add a golden compression test that catches the bug. Also run all the golden tests in the cli-tests framework.	2022-08-03 11:28:39 -07:00
Nick Terrell	08981d2638	[lib] Allow compression dictionaries with missing symbols Allow compression to use dictionaries with missing symbols in their entropy tables. We set the FSE repeat mode to check when there are missing symbols, and set the FSE repeat mode to valid when all symbols are present. Note that when not all symbols are present, the heuristics which favor dictionary tables for lower compression levels won't activate. Tested by manually creating a dictionary with missing symbols of every type, and validing that the compressor rejects it before this change, and accepts it after this change. Also, I ran the `dictionary_loader` fuzzer for >1 hour of CPU time without running into cases where compression succeeds, but decompression fails. Fixes #2174.	2020-06-12 17:57:19 -07:00
Bimba Shrestha	e6be4cf4eb	Changing test file directory names to be more descriptive	2019-09-09 12:08:33 -07:00

4 Commits