1
0
mirror of https://github.com/facebook/zstd.git synced 2025-07-03 22:30:29 +02:00
Commit Graph

4440 Commits

Author SHA1 Message Date
23a356cdde Merge pull request #3424 from felixhandte/disable-asan-msan-poison-mingw
Disable Custom ASAN/MSAN Poisoning on MinGW Builds
2023-01-17 12:41:41 -05:00
5d8cfa6b96 Deprecate advanced streaming functions (#3408)
* deprecate advanced streaming functions

* remove internal usage of the deprecated functions

* nit

* suppress warnings in tests/zstreamtest.c

* purge ZSTD_initDStream_usingDict

* nits

* c90 compat

* zstreamtest.c already disables deprecation warnings!

* fix initDStream() return value

* fix typo

* wasn't able to import private symbol properly, this commit works around that

* new strategy for zbuff

* undo zbuff deprecation warning changes

* move ZSTD_DISABLE_DEPRECATE_WARNINGS from .h to .c
2023-01-13 14:51:47 -05:00
d78fbedd96 Don't Even Declare Poisoning Functions if Poisoning is Disabled
This guarantees that we won't accidentally forget to check the macro somewhere
where we use these functions.
2023-01-13 11:56:48 -05:00
f10922a8fa Disable Custom ASAN/MSAN Poisoning on MinGW Builds
Addresses #3240.
2023-01-13 11:53:09 -05:00
14b8defb86 move ZSTD_BLOCKSIZE_MAX_MIN to static linking only section 2023-01-13 07:00:50 -08:00
d5509080bc Merge pull request #3419 from facebook/fix3416
fix root cause of #3416
2023-01-13 00:21:08 -08:00
5b266196a4 Add support for in-place decompression
* Add a function and macro ZSTD_decompressionMargin() that computes the
  decompression margin for in-place decompression. The function computes
  a tight margin that works in all cases, and the macro computes an upper
  bound that will only work if flush isn't used.
* When doing in-place decompression, make sure that our output buffer
  doesn't overlap with the input buffer. This ensures that we don't
  decide to use the portion of the output buffer that overlaps the input
  buffer for temporary memory, like for literals.
* Add a simple unit test.
* Add in-place decompression to the simple_round_trip and
  stream_round_trip fuzzers. This should help verify that our margin stays
  correct.
2023-01-12 16:28:08 -08:00
ac45e078a5 add explanation about new test
as requested by @terrelln
2023-01-12 15:49:01 -08:00
796699c0bc fix root cause of #3416
A minor change in 5434de0 changed a `<=` into a `<`,
and as an indirect consequence allowed compression attempt of literals when there are only 6 literals to compress
(previous limit was effectively 7 literals).

This is not in itself a problem, as the threshold is merely an heuristic,
but it emerged a bug that has always been there, and was just never triggered so far due to the previous limit.
This bug would make the literal compressor believes that all literals are the same symbol,
but for the exact case where nbLiterals==6, plus a pretty wild combination of other limit conditions,
this outcome could be false, resulting in data corruption.

Replaced the blind heuristic by an actual test for all limit cases,
so that even if the threshold is changed again in the future,
the detection of RLE mode will remain reliable.
2023-01-12 15:41:08 -08:00
06b096db47 additional tests and documentation updates + allow maxBlockSize to be set to 0 (goes to default) 2023-01-12 13:41:50 -08:00
53eb5a758c add simple test for maxBlockSize expected functionality 2023-01-12 08:55:39 -08:00
1fffcfe01d update minimum threshold for max block size 2023-01-11 11:09:57 -08:00
fe08137d9a resolve max block value in cctx and use when calculating the max block size 2023-01-09 07:53:53 -08:00
71dbe8f9d4 minor: fix conversion warnings 2023-01-04 20:00:04 -08:00
d913417f72 Merge branch 'dev' into fuzz-max-block-size 2023-01-04 16:34:07 -05:00
908e812733 initial commit 2023-01-04 13:01:54 -08:00
ebba9ff425 update regression results 2023-01-03 14:04:23 -08:00
5434de01e2 improve compression ratio of small alphabets
fix #3328

In situations where the alphabet size is very small,
the evaluation of literal costs from the Optimal Parser is initially incorrect.
It takes some time to converge, during which compression is less efficient.
This is especially important for small files,
because there will not be enough data to converge,
so most of the parsing is selected based on incorrect metrics.

After this patch, the scenario ##3328 gets fixed,
delivering the expected 29 bytes compressed size (smallest known compressed size).
2023-01-03 12:22:37 -08:00
1c818e3a0a Merge pull request #3302 from daniellerozenblit/optimal-huff-depth-speed
Optimal huff depth speed improvements
2023-01-03 12:51:51 -05:00
df714ddb0f implement suggestions 2023-01-03 07:20:21 -08:00
d07e72bb13 fixed incorrect assert
commented Fweight instead
2022-12-28 17:23:40 -08:00
4a1a79a512 just add some comments to zstd_opt for improved clarity 2022-12-28 16:24:12 -08:00
9fbbd74871 Merge pull request #3400 from danlark1/dev
Move deprecated annotation before static to allow C++ compilation for clang
2022-12-28 15:50:26 -08:00
00c85b28e7 update ZSTD_CCts_setCParams() inline documentation
specify behavior when changing compression parameters during MT compression,
reported by @embg
2022-12-28 15:08:18 -08:00
481a2e1010 Merge pull request #3403 from facebook/setCParams
ZSTD_CCtx_setCParams
2022-12-28 14:07:13 -08:00
2a402626dd External matchfinder API (#3333)
* First building commit with sample matchfinder

* Set up ZSTD_externalMatchCtx struct

* move seqBuffer to ZSTD_Sequence*

* support non-contiguous dictionary

* clean up parens

* add clearExternalMatchfinder, handle allocation errors

* Add useExternalMatchfinder cParam

* validate useExternalMatchfinder cParam

* Disable LDM + external matchfinder

* Check for static CCtx

* Validate mState and mStateDestructor

* Improve LDM check to cover both branches

* Error API with optional fallback

* handle RLE properly for external matchfinder

* nit

* Move to a CDict-like model for resource ownership

* Add hidden useExternalMatchfinder bool to CCtx_params_s

* Eliminate malloc, move to cwksp allocation

* Handle CCtx reset properly

* Ensure seqStore has enough space for external sequences

* fix capitalization

* Add DEBUGLOG statements

* Add compressionLevel param to matchfinder API

* fix c99 issues and add a param combination error code

* nits

* Test external matchfinder API

* C90 compat for simpleExternalMatchFinder

* Fix some @nocommits and an ASAN bug

* nit

* nit

* nits

* forward declare copySequencesToSeqStore functions in zstd_compress_internal.h

* nit

* nit

* nits

* Update copyright headers

* Fix CMake zstreamtest build

* Fix copyright headers (again)

* typo

* Add externalMatchfinder demo program to make contrib

* Reduce memory consumption for small blockSize

* ZSTD_postProcessExternalMatchFinderResult nits

* test sum(matchlen) + sum(litlen) == srcSize in debug builds

* refExternalMatchFinder -> registerExternalMatchFinder

* C90 nit

* zstreamtest nits

* contrib nits

* contrib nits

* allow block splitter + external matchfinder, refactor

* add windowSize param

* add contrib/externalMatchfinder/README.md

* docs

* go back to old RLE heuristic because of the first block issue

* fix initializer element is not a constant expression

* ref contrib from zstd.h

* extremely pedantic compiler warning fix, meson fix, typo fix

* Additional docs on API limitations

* minor nits

* Refactor maxNbSeq calculation into a helper function

* Fix copyright
2022-12-28 16:45:14 -05:00
b17743e41b Signal parameter change during MT compression 2022-12-28 13:14:58 -08:00
89342d1e07 New xp library symbol : ZSTD_CCtx_setCParams()
Inspired by #3395,
offer a new capability to set all parameters defined in a ZSTD_compressionParameters structure
with a single symbol invocation
to improve user code brevity.
2022-12-27 23:49:22 -08:00
48f4aa7307 Move deprecated annotation before static to allow C++ compilation for clang
This fixes last 2 instances of https://github.com/facebook/zstd/issues/3250
2022-12-23 12:07:31 +00:00
089b2797e3 Merge pull request #3398 from facebook/fix3316
spec update : require minimum nb of literals for 4-streams mode
2022-12-22 16:57:05 -08:00
6a9c525903 spec update : require minimum nb of literals for 4-streams mode
Reported by @shulib :
the specification for 4-streams mode
doesn't work when the amount of literals to compress is 5 bytes.
Extending it, it also doesn't work for sizes 1 or 2.

This patch updates the specification and the implementation
to require a minimum of 6 literals to trigger or accept the 4-streams mode.

The impact is expected to be a no-op :
the 4-streams mode is never triggered for such small quantity of literals anyway,
since it would be wasteful (it costs ~7.3 bytes more than single-stream mode).
An informal lower limit is set at ~256 bytes,
so the technical minimum is very far from this limit.

This is just meant for completeness of the specification.
2022-12-22 16:14:34 -08:00
ea2895cef4 Support decompression of compressed blocks of size ZSTD_BLOCKSIZE_MAX exactly 2022-12-22 12:40:27 -08:00
40a7188130 Fix make clangbuild & add CI
Fix the errors for:
* `-Wdocumentation`
* `-Wconversion` except `-Wsign-conversion`
2022-12-21 17:31:04 -08:00
c26f348dc8 fix CI errors 2022-12-20 12:43:46 -08:00
482689b995 huf log speed optimization: unidirectional scan of logs + break when regressing 2022-12-20 12:27:38 -08:00
e4018c4e7f [docs] Clarify dictionary loading documentation
Reinforce that loading a new dictionary clears the current dictionary.
Except for the multiple-ddict mode.
2022-12-20 11:10:49 -08:00
5d693cc38c Coalesce Almost All Copyright Notices to Standard Phrasing
```
for f in $(find . \( -path ./.git -o -path ./tests/fuzz/corpora -o -path ./tests/regression/data-cache -o -path ./tests/regression/cache \) -prune -o -type f); do sed -i '/Copyright .* \(Yann Collet\)\|\(Meta Platforms\)/ s/Copyright .*/Copyright (c) Meta Platforms, Inc. and affiliates./' $f; done

git checkout HEAD -- build/VS2010/libzstd-dll/libzstd-dll.rc build/VS2010/zstd/zstd.rc tests/test-license.py contrib/linux-kernel/test/include/linux/xxhash.h examples/streaming_compression_thread_pool.c lib/legacy/zstd_v0*.c lib/legacy/zstd_v0*.h
nano ./programs/windres/zstd.rc
nano ./build/VS2010/zstd/zstd.rc
nano ./build/VS2010/libzstd-dll/libzstd-dll.rc
```
2022-12-20 12:52:34 -05:00
8927f985ff Update Copyright Headers 'Facebook' -> 'Meta Platforms'
```
for f in $(find . \( -path ./.git -o -path ./tests/fuzz/corpora \) -prune -o -type f);
do
  sed -i 's/Facebook, Inc\./Meta Platforms, Inc. and affiliates./' $f;
done
```
2022-12-20 12:37:57 -05:00
a8add436ce Merge pull request #3364 from yoniko/fix-windows-mt-thread-resize-bug
Windows MT layer bug fixes
2022-12-19 15:54:01 -08:00
26f1bf7d70 CR fixes 2022-12-19 15:13:43 -08:00
832c1a6a1c minor reformatting
and minor reliability and maintenance changes
2022-12-18 11:26:57 -08:00
ec42c92aaa Fix race condition in the Windows thread / pthread translation layer
When spawning a Windows thread we have small worker wrapper function that translates
between the interfaces of Windows and POSIX threads.
This wrapper is given a pointer that might get stale before the worker starts running,
resulting in UB and crashes.
This commit adds synchronization so that we know the wrapper has finished reading the data
it needs before we allow the main thread to resume execution.
2022-12-17 13:38:02 -08:00
500f02eb66 Fixes two bugs in the Windows thread / pthread translation layer
1. If threads are resized the threads' `ZSTD_pthread_t` might move
while the worker still holds a pointer into it (see more details in #3120).
2. The join operation was waiting for a thread and then return its `thread.arg`
as a return value, but since the `ZSTD_pthread_t thread` was passed by value it
would have a stale `arg` that wouldn't match the thread's actual return value.

This fix changes the `ZSTD_pthread_join` API and removes support for returning
a value. This means that we are diverging from the `pthread_join` API and this
is no longer just an alias.
In the future, if needed, we could return a Windows thread's return value using
`GetExitCodeThread`, but as this path wouldn't be excised in any case, it's
preferable to not add it right now.
2022-12-17 13:38:02 -08:00
2f4238e47a make ZSTD_DECOMPRESSBOUND() compatible with input size 0
for environments with stringent compilation warnings.
2022-12-16 16:05:39 -08:00
ea24b88667 decompressBound() tests
fixed an overflow in an intermediate result on 32-bit platform.
Checked that the new test catch this bug in 32-bit mode.
2022-12-16 15:43:26 -08:00
51355e1f70 Merge pull request #3362 from facebook/compressBound
check potential overflow of compressBound()
2022-12-16 14:22:22 -08:00
2f7b8d47fb [zdict] Fix static linking only include guards
Fix `zdict.h` static linking only section so if you include it twice it
still exposes the static linking only symbols. E.g. this pattern:

```
```

This can easily happen when a header you include includes `zdict.h`.
2022-12-16 12:55:20 -08:00
0c42424a1e [build] Fix ZSTD_LIB_MINIFY build option
`ZSTD_LIB_MINIFY` broke in 8bf699aa59.

This commit fixes the macro and the static library shrinks from ~600K to 324K
with ZSTD_LIB_MINIFY set.

Fixes #3066.
2022-12-16 12:55:05 -08:00
358a237484 [api][visibility] Make the visibility macros more consistent
1. Follow the scheme introduced in PR #2501 for both `zdict.h` and `zstd_errors.h`.
2. If the `*_VISIBLE` macro isn't set, but the `*_VISIBILITY` macro is, use that.
   Also make this change for `zstd.h`, since we probably shouldn't have changed
   that macro name without backward compatibility in the first place.
3. Change all references to `*_VISIBILITY` to `*_VISIBLE`.

Fixes #3359.
2022-12-16 12:54:45 -08:00
97f63ce2b5 added unit tests for compressBound()
and rephrased the code documentation, as suggested by @terrelln
2022-12-16 12:35:14 -08:00