1
0
mirror of https://github.com/facebook/zstd.git synced 2025-03-06 16:56:49 +02:00

4522 Commits

Author SHA1 Message Date
daniellerozenblit
1c818e3a0a
Merge pull request #3302 from daniellerozenblit/optimal-huff-depth-speed
Optimal huff depth speed improvements
2023-01-03 12:51:51 -05:00
Danielle Rozenblit
df714ddb0f implement suggestions 2023-01-03 07:20:21 -08:00
Yann Collet
d07e72bb13 fixed incorrect assert
commented Fweight instead
2022-12-28 17:23:40 -08:00
Yann Collet
4a1a79a512 just add some comments to zstd_opt for improved clarity 2022-12-28 16:24:12 -08:00
Yann Collet
9fbbd74871
Merge pull request #3400 from danlark1/dev
Move deprecated annotation before static to allow C++ compilation for clang
2022-12-28 15:50:26 -08:00
Yann Collet
00c85b28e7 update ZSTD_CCts_setCParams() inline documentation
specify behavior when changing compression parameters during MT compression,
reported by @embg
2022-12-28 15:08:18 -08:00
Yann Collet
481a2e1010
Merge pull request #3403 from facebook/setCParams
ZSTD_CCtx_setCParams
2022-12-28 14:07:13 -08:00
Elliot Gorokhovsky
2a402626dd
External matchfinder API (#3333)
* First building commit with sample matchfinder

* Set up ZSTD_externalMatchCtx struct

* move seqBuffer to ZSTD_Sequence*

* support non-contiguous dictionary

* clean up parens

* add clearExternalMatchfinder, handle allocation errors

* Add useExternalMatchfinder cParam

* validate useExternalMatchfinder cParam

* Disable LDM + external matchfinder

* Check for static CCtx

* Validate mState and mStateDestructor

* Improve LDM check to cover both branches

* Error API with optional fallback

* handle RLE properly for external matchfinder

* nit

* Move to a CDict-like model for resource ownership

* Add hidden useExternalMatchfinder bool to CCtx_params_s

* Eliminate malloc, move to cwksp allocation

* Handle CCtx reset properly

* Ensure seqStore has enough space for external sequences

* fix capitalization

* Add DEBUGLOG statements

* Add compressionLevel param to matchfinder API

* fix c99 issues and add a param combination error code

* nits

* Test external matchfinder API

* C90 compat for simpleExternalMatchFinder

* Fix some @nocommits and an ASAN bug

* nit

* nit

* nits

* forward declare copySequencesToSeqStore functions in zstd_compress_internal.h

* nit

* nit

* nits

* Update copyright headers

* Fix CMake zstreamtest build

* Fix copyright headers (again)

* typo

* Add externalMatchfinder demo program to make contrib

* Reduce memory consumption for small blockSize

* ZSTD_postProcessExternalMatchFinderResult nits

* test sum(matchlen) + sum(litlen) == srcSize in debug builds

* refExternalMatchFinder -> registerExternalMatchFinder

* C90 nit

* zstreamtest nits

* contrib nits

* contrib nits

* allow block splitter + external matchfinder, refactor

* add windowSize param

* add contrib/externalMatchfinder/README.md

* docs

* go back to old RLE heuristic because of the first block issue

* fix initializer element is not a constant expression

* ref contrib from zstd.h

* extremely pedantic compiler warning fix, meson fix, typo fix

* Additional docs on API limitations

* minor nits

* Refactor maxNbSeq calculation into a helper function

* Fix copyright
2022-12-28 16:45:14 -05:00
Yann Collet
b17743e41b Signal parameter change during MT compression 2022-12-28 13:14:58 -08:00
Yann Collet
89342d1e07 New xp library symbol : ZSTD_CCtx_setCParams()
Inspired by #3395,
offer a new capability to set all parameters defined in a ZSTD_compressionParameters structure
with a single symbol invocation
to improve user code brevity.
2022-12-27 23:49:22 -08:00
Daniel Kutenin
48f4aa7307
Move deprecated annotation before static to allow C++ compilation for clang
This fixes last 2 instances of https://github.com/facebook/zstd/issues/3250
2022-12-23 12:07:31 +00:00
Yann Collet
089b2797e3
Merge pull request #3398 from facebook/fix3316
spec update : require minimum nb of literals for 4-streams mode
2022-12-22 16:57:05 -08:00
Yann Collet
6a9c525903 spec update : require minimum nb of literals for 4-streams mode
Reported by @shulib :
the specification for 4-streams mode
doesn't work when the amount of literals to compress is 5 bytes.
Extending it, it also doesn't work for sizes 1 or 2.

This patch updates the specification and the implementation
to require a minimum of 6 literals to trigger or accept the 4-streams mode.

The impact is expected to be a no-op :
the 4-streams mode is never triggered for such small quantity of literals anyway,
since it would be wasteful (it costs ~7.3 bytes more than single-stream mode).
An informal lower limit is set at ~256 bytes,
so the technical minimum is very far from this limit.

This is just meant for completeness of the specification.
2022-12-22 16:14:34 -08:00
Yann Collet
ea2895cef4 Support decompression of compressed blocks of size ZSTD_BLOCKSIZE_MAX exactly 2022-12-22 12:40:27 -08:00
Nick Terrell
40a7188130 Fix make clangbuild & add CI
Fix the errors for:
* `-Wdocumentation`
* `-Wconversion` except `-Wsign-conversion`
2022-12-21 17:31:04 -08:00
Danielle Rozenblit
c26f348dc8 fix CI errors 2022-12-20 12:43:46 -08:00
Danielle Rozenblit
482689b995 huf log speed optimization: unidirectional scan of logs + break when regressing 2022-12-20 12:27:38 -08:00
Nick Terrell
e4018c4e7f [docs] Clarify dictionary loading documentation
Reinforce that loading a new dictionary clears the current dictionary.
Except for the multiple-ddict mode.
2022-12-20 11:10:49 -08:00
W. Felix Handte
5d693cc38c Coalesce Almost All Copyright Notices to Standard Phrasing
```
for f in $(find . \( -path ./.git -o -path ./tests/fuzz/corpora -o -path ./tests/regression/data-cache -o -path ./tests/regression/cache \) -prune -o -type f); do sed -i '/Copyright .* \(Yann Collet\)\|\(Meta Platforms\)/ s/Copyright .*/Copyright (c) Meta Platforms, Inc. and affiliates./' $f; done

git checkout HEAD -- build/VS2010/libzstd-dll/libzstd-dll.rc build/VS2010/zstd/zstd.rc tests/test-license.py contrib/linux-kernel/test/include/linux/xxhash.h examples/streaming_compression_thread_pool.c lib/legacy/zstd_v0*.c lib/legacy/zstd_v0*.h
nano ./programs/windres/zstd.rc
nano ./build/VS2010/zstd/zstd.rc
nano ./build/VS2010/libzstd-dll/libzstd-dll.rc
```
2022-12-20 12:52:34 -05:00
W. Felix Handte
8927f985ff Update Copyright Headers 'Facebook' -> 'Meta Platforms'
```
for f in $(find . \( -path ./.git -o -path ./tests/fuzz/corpora \) -prune -o -type f);
do
  sed -i 's/Facebook, Inc\./Meta Platforms, Inc. and affiliates./' $f;
done
```
2022-12-20 12:37:57 -05:00
Yonatan Komornik
a8add436ce
Merge pull request #3364 from yoniko/fix-windows-mt-thread-resize-bug
Windows MT layer bug fixes
2022-12-19 15:54:01 -08:00
Yonatan Komornik
26f1bf7d70 CR fixes 2022-12-19 15:13:43 -08:00
Yann Collet
832c1a6a1c minor reformatting
and minor reliability and maintenance changes
2022-12-18 11:26:57 -08:00
Yonatan Komornik
ec42c92aaa Fix race condition in the Windows thread / pthread translation layer
When spawning a Windows thread we have small worker wrapper function that translates
between the interfaces of Windows and POSIX threads.
This wrapper is given a pointer that might get stale before the worker starts running,
resulting in UB and crashes.
This commit adds synchronization so that we know the wrapper has finished reading the data
it needs before we allow the main thread to resume execution.
2022-12-17 13:38:02 -08:00
Yonatan Komornik
500f02eb66 Fixes two bugs in the Windows thread / pthread translation layer
1. If threads are resized the threads' `ZSTD_pthread_t` might move
while the worker still holds a pointer into it (see more details in #3120).
2. The join operation was waiting for a thread and then return its `thread.arg`
as a return value, but since the `ZSTD_pthread_t thread` was passed by value it
would have a stale `arg` that wouldn't match the thread's actual return value.

This fix changes the `ZSTD_pthread_join` API and removes support for returning
a value. This means that we are diverging from the `pthread_join` API and this
is no longer just an alias.
In the future, if needed, we could return a Windows thread's return value using
`GetExitCodeThread`, but as this path wouldn't be excised in any case, it's
preferable to not add it right now.
2022-12-17 13:38:02 -08:00
Yann Collet
2f4238e47a make ZSTD_DECOMPRESSBOUND() compatible with input size 0
for environments with stringent compilation warnings.
2022-12-16 16:05:39 -08:00
Yann Collet
ea24b88667 decompressBound() tests
fixed an overflow in an intermediate result on 32-bit platform.
Checked that the new test catch this bug in 32-bit mode.
2022-12-16 15:43:26 -08:00
Yann Collet
51355e1f70
Merge pull request #3362 from facebook/compressBound
check potential overflow of compressBound()
2022-12-16 14:22:22 -08:00
Nick Terrell
2f7b8d47fb [zdict] Fix static linking only include guards
Fix `zdict.h` static linking only section so if you include it twice it
still exposes the static linking only symbols. E.g. this pattern:

```
```

This can easily happen when a header you include includes `zdict.h`.
2022-12-16 12:55:20 -08:00
Nick Terrell
0c42424a1e [build] Fix ZSTD_LIB_MINIFY build option
`ZSTD_LIB_MINIFY` broke in 8bf699aa59372d7c2bb4216bcf8037cab7dae51e.

This commit fixes the macro and the static library shrinks from ~600K to 324K
with ZSTD_LIB_MINIFY set.

Fixes #3066.
2022-12-16 12:55:05 -08:00
Nick Terrell
358a237484 [api][visibility] Make the visibility macros more consistent
1. Follow the scheme introduced in PR #2501 for both `zdict.h` and `zstd_errors.h`.
2. If the `*_VISIBLE` macro isn't set, but the `*_VISIBILITY` macro is, use that.
   Also make this change for `zstd.h`, since we probably shouldn't have changed
   that macro name without backward compatibility in the first place.
3. Change all references to `*_VISIBILITY` to `*_VISIBLE`.

Fixes #3359.
2022-12-16 12:54:45 -08:00
Yann Collet
97f63ce2b5 added unit tests for compressBound()
and rephrased the code documentation, as suggested by @terrelln
2022-12-16 12:35:14 -08:00
Nick Terrell
ee6475cbbd Add missing parens around macro definition
Fixes #3301.
2022-12-15 17:18:23 -08:00
Yann Collet
45ed0df18a check potential overflow of compressBound()
fixed #3323, reported by @nigeltao

Completed documentation around this risk
(which is largely theoretical,
I can't see that happening in any "real world" scenario,
but an erroneous @srcSize value could indeed trigger it).
2022-12-15 15:23:15 -08:00
Nick Terrell
a91e7ec175 Fix corruption that rarely occurs in 32-bit mode with wlog=25
Fix an off-by-one error in the compressor that emits corrupt blocks if:

* Zstd is compiled in 32-bit mode
* The windowLog == 25 exactly
* An offset of 2^25-3, 2^25-2, 2^25-1, or 2^25 is emitted
* The bitstream had 7 bits leftover before writing the offset

This bug has been present since before v1.0, but wasn't able to easily
be triggered, since until somewhat recently zstd wasn't able to find
matches that were within 128KB of the window size.

Add a test case, and fix 2 bugs in `ZSTD_compressSequences()`:
* The `ZSTD_isRLE()` check was incorrect. It wouldn't produce
  corruption, but it could waste CPU and not emit RLE even if the block
  was RLE
* One windowSize was `1 << windowLog`, not `1u << windowLog`

Thanks to @tansy for finding the issue, and giving us a reproducer!

Fixes Issue #3350.
2022-12-15 14:41:50 -08:00
daniellerozenblit
e2fc93340f
Merge branch 'dev' into http-to-https 2022-12-15 10:46:13 -05:00
Nick Terrell
728e73ebb4 [legacy] Remove FORCE_MEMORY_ACCESS and only use memcpy
Delete unaligned memory access code from the legacy codebase by removing all the
non-memcpy functions. We don't care about speed at all for this codebase, only
simplicity.
2022-12-14 17:54:35 -08:00
Nick Terrell
f31b83ff34 [decompress] Fix nullptr addition & improve fuzzer
Fix an instance of `NULL + 0` in `ZSTD_decompressStream()`. Also, improve our
`stream_decompress` fuzzer to pass `NULL` in/out buffers to
`ZSTD_decompressStream()`, and fix 2 issues that were immediately surfaced.

Fixes #3351
2022-12-14 17:54:22 -08:00
Alex Xu (Hello71)
a78c91ae59 Use proper unaligned access attributes
Instead of using packed attribute hack, just use aligned attribute. It
improves code generation on armv6 and armv7, and slightly improves code
generation on aarch64. GCC generates identical code to regular aligned
access on ARMv6 for all versions between 4.5 and trunk, except GCC 5
which is buggy and generates the same (bad) code as packed access:
https://gcc.godbolt.org/z/hq37rz7sb
2022-12-14 16:00:37 -08:00
Danielle Rozenblit
4dffc35f2e Convert references to https from http 2022-12-14 06:58:35 -08:00
Elliot Gorokhovsky
c43da3d605 Fix C90 compat 2022-12-13 18:01:32 -08:00
Elliot Gorokhovsky
e1e82f74f1 Reserve two fields in ZSTD_frameHeader 2022-12-13 17:45:05 -08:00
Danielle Rozenblit
69ec75f0d5 fix window resizing edge case 2022-12-13 08:35:20 -08:00
Yann Collet
80cf73fd66
Merge pull request #3320 from cwoffenden/msvc-arm64-size_t
Fix for MSVC C4267 warning on ARM64 (which becomes error C2220 with /WX)
2022-12-04 18:55:30 -08:00
Yann Collet
ecd7601c36 minor: proper pledgedSrcSize trace 2022-11-22 06:00:45 -08:00
Elliot Gorokhovsky
c8d870fe52 Improve LDM cparam validation logic 2022-11-21 15:39:18 -05:00
Carl Woffenden
0547c3d3f8 Random edit to re-run the CI
I don't believe the (x64) Mac failure is related to error since it would take the SSE path.
2022-11-19 19:04:08 +01:00
Carl Woffenden
0168914490 Fix for MSVC C4267 error 2022-11-18 11:31:17 +01:00
Danielle Rozenblit
c2638212af Change threshold for benchmarking 2022-10-27 13:13:17 -07:00
Danielle Rozenblit
db74d043d6 Speed optimizations with macro 2022-10-27 10:20:44 -07:00