mirror of https://github.com/facebook/zstd.git synced 2026-06-11 03:34:24 +02:00

830 Commits

Author SHA1 Message Date
NiDU-NINJA 3f8f9b3f89 Fix potential NULL pointer dereference in ZSTD_customCalloc when custom allocator fails 2026-03-07 08:27:15 +00:00
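The kind of guard this commit describes can be sketched as follows. This is a minimal illustration, not the library's code: `demo_customMem`, `demo_customCalloc`, and the helper allocators are hypothetical names, and the struct layout is an assumption modeled on an alloc/free/opaque triple.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical allocator bundle for illustration (names are assumptions). */
typedef struct {
    void* (*customAlloc)(void* opaque, size_t size);
    void  (*customFree)(void* opaque, void* address);
    void* opaque;
} demo_customMem;

/* Zero-initialized allocation that tolerates a failing custom allocator. */
static void* demo_customCalloc(size_t size, demo_customMem mem)
{
    if (mem.customAlloc != NULL) {
        void* ptr;
        assert(mem.customFree != NULL);   /* alloc and free are expected as a pair */
        ptr = mem.customAlloc(mem.opaque, size);
        if (ptr == NULL) return NULL;     /* the guard: never memset a failed allocation */
        memset(ptr, 0, size);
        return ptr;
    }
    return calloc(1, size);
}

/* Test helpers: one allocator that always fails, one backed by malloc. */
static void* demo_failingAlloc(void* opaque, size_t size)
{
    (void)opaque; (void)size;
    return NULL;
}

static void* demo_mallocAlloc(void* opaque, size_t size)
{
    (void)opaque;
    return malloc(size);
}

static void demo_free(void* opaque, void* address)
{
    (void)opaque;
    free(address);
}
```

With the failing allocator the function returns NULL instead of writing through a null pointer; with the malloc-backed one it returns zeroed memory.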
Sergey G. Brester (sebres) 6e1e545916 avoid potential RC on ctx->threadLimit, code review;
closes gh-4547; replaces gh-4558
2026-03-02 14:39:17 -05:00
Alexander Moch 2107c8f189 bitstream: fix BIT_readBits and BIT_reloadDStream prototypes
Align the declarations of BIT_readBits() and BIT_reloadDStream() in
bitstream.h with their FORCE_INLINE_TEMPLATE definitions.

The previous MEM_STATIC declarations caused an attribute mismatch
between the header and the definitions, which can lead to incorrect
compiler assumptions under certain toolchains and optimization levels.

Signed-off-by: Alexander Moch <mail@alexmoch.com>
2026-02-27 16:37:40 -05:00
ZijianLi 87cc127705 - Modify the GCC version used for CI testing of the RISCV architecture
- Fix a bug in the ZSTD_row_getRVVMask function
- Improve some performance for ZSTD_copy16()
2025-09-26 22:34:57 +08:00
Ryan Lefkowitz c59812e558 🔧 Fix memory leak in pthread init functions on failure
When pthread_mutex_init() or pthread_cond_init() fails in the debug
implementation (DEBUGLEVEL >= 1), the previously allocated memory was
not freed, causing a memory leak.

This fix ensures that allocated memory is properly freed when pthread
initialization functions fail, preventing resource leaks in error
conditions.

The issue affects:
- ZSTD_pthread_mutex_init() at lib/common/threading.c:146
- ZSTD_pthread_cond_init() at lib/common/threading.c:167

This is particularly important for long-running applications or
scenarios with resource constraints where pthread initialization
might fail due to system limits.
2025-09-15 18:20:01 -04:00
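The leak pattern described above, and its fix, can be sketched as below. This assumes a debug-style wrapper that heap-allocates the mutex before initializing it; `demo_mutex_init` and `demo_mutex_destroy` are hypothetical names, not the functions in lib/common/threading.c.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Allocates and initializes a mutex; releases the allocation if init fails. */
static int demo_mutex_init(pthread_mutex_t** mutex, const pthread_mutexattr_t* attr)
{
    assert(mutex != NULL);
    *mutex = (pthread_mutex_t*)malloc(sizeof(pthread_mutex_t));
    if (*mutex == NULL) return 1;
    {
        int const ret = pthread_mutex_init(*mutex, attr);
        if (ret != 0) {
            free(*mutex);     /* without this, the allocation leaks on failure */
            *mutex = NULL;
        }
        return ret;
    }
}

/* Destroys the mutex and frees its storage; safe to call on a NULL handle. */
static int demo_mutex_destroy(pthread_mutex_t** mutex)
{
    assert(mutex != NULL);
    if (*mutex == NULL) return 0;
    {
        int const ret = pthread_mutex_destroy(*mutex);
        free(*mutex);
        *mutex = NULL;
        return ret;
    }
}
```

Resetting the handle to NULL on failure also keeps a later destroy call from touching freed memory.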
ZijianLi d04e7944dd add compiler version check. 2025-07-07 23:07:39 +08:00
Arpad Panyik 1e9d2006ae AArch64: Use better block copy8
The vector copy is only necessary for 16-byte blocks on AArch64.

Decompression uplifts on a Neoverse V2 system, using Zstd-1.5.8
compiled with "-O3 -march=armv8.2-a+sve2":

                 Clang-19  Clang-20    GCC-14    GCC-15
 1#silesia.tar:   +0.316%   +0.865%   +0.025%   +0.096%
 2#silesia.tar:   +0.689%   +1.374%   +0.027%   +0.065%
 3#silesia.tar:   +0.811%   +1.654%   +0.034%   +0.033%
 4#silesia.tar:   +0.912%   +1.755%   +0.027%   +0.042%
 5#silesia.tar:   +0.995%   +1.826%   +0.062%   +0.094%
 6#silesia.tar:   +0.976%   +1.777%   +0.065%   +0.104%
 7#silesia.tar:   +0.910%   +1.738%   +0.077%   +0.110%
2025-06-20 17:05:41 +00:00
Arpad Panyik 7e4937bc75 AArch64: Add SVE2 implementation of histogram computation
The existing scalar implementation uses a 4-way pipelined histogram
calculation which is very efficient on out-of-order CPUs. However,
this can be further accelerated using the SVE2 HISTSEG instructions -
which compute a histogram for 16 byte chunks in a vector register.

On a system with 128-bit vectors (VL128) we need 16 HISTSEG executions
to compute the histogram for the whole symbol space (0..255) of 16
bytes input. However we can only accumulate 15 of such 16 byte strips
before possible overflow. So we need to extend and save the 8-bit
histogram accumulators to 16-bit after every 240-byte chunk of input.
To store all in registers we would need 32 128-bit registers. Longer
SVE2 vectors could help here, if such machines become available.

The maximum input block size in Zstd is 128 KiB, so 16-bit accumulators
would not be enough. However, an LZ pass precedes the histogram
calculation, so it should be impossible (my assumption) to overflow the
16-bit accumulators.

The symbol distribution is also not uniform: lower values are more
common, so we used a 3-pass algorithm to prevent stack spilling. In the
first pass we only compute histograms for 64 symbols (4-way SIMD) while
also computing the maximum symbol value. If we have symbol values
larger than 64 we start the second pass to compute the next 96 elements
of the histogram. The final pass calculates the remaining part of the
histogram (256 symbols in total) if needed. This split of histogram
generation gave the best overall results for performance.

This implementation is the best performing of a number of different
cache blocking schemes tested.

Compression uplifts on a Neoverse V2 system, using Zstd-1.5.8
(e26dde3d) as a baseline, compiled with "-O3 -march=armv8.2-a+sve2":

                 Clang-20    GCC-14
 1#silesia.tar:   +6.173%   +5.987%
 2#silesia.tar:   +5.200%   +5.011%
 3#silesia.tar:   +4.332%   +5.031%
 4#silesia.tar:   +2.789%   +3.064%
 5#silesia.tar:   +2.028%   +1.838%
 6#silesia.tar:   +1.562%   +1.340%
 7#silesia.tar:   +1.160%   +0.959%
2025-06-11 12:14:22 +00:00
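The overflow arithmetic in this message (15 strips of 16 bytes, so at most 240 increments per 8-bit counter before widening) can be illustrated with a scalar emulation. This is only a sketch of the flush-every-240-bytes idea: it uses no SVE2 instructions, `demo_histogram` is a hypothetical name, and it widens into 32-bit counters rather than the 16-bit ones the commit uses, so it does not rely on the LZ-pass assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_FLUSH_INTERVAL 240   /* 15 strips of 16 bytes */

/* Byte histogram using narrow 8-bit counters that are flushed into wide
 * counters every 240 input bytes, before any narrow counter can wrap. */
static void demo_histogram(const uint8_t* src, size_t srcSize, uint32_t counts[256])
{
    uint8_t narrow[256];
    size_t pos = 0;
    memset(counts, 0, 256 * sizeof(counts[0]));
    while (pos < srcSize) {
        size_t const remaining = srcSize - pos;
        size_t const chunk = remaining < DEMO_FLUSH_INTERVAL ? remaining : DEMO_FLUSH_INTERVAL;
        size_t i;
        int s;
        memset(narrow, 0, sizeof(narrow));
        for (i = 0; i < chunk; i++) {
            /* worst case: one symbol repeated 240 times, still below 255 */
            assert(narrow[src[pos + i]] < 255);
            narrow[src[pos + i]]++;
        }
        for (s = 0; s < 256; s++) counts[s] += narrow[s];   /* widen and accumulate */
        pos += chunk;
    }
}
```

A 16th strip would allow 256 increments of the same counter, which no longer fits in 8 bits; that is why the interval stops at 240 bytes.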
李子建 d95123f2e6 Improve speed of ZSTD_compressSequencesAndLiterals() using RVV 2025-06-02 17:21:02 +08:00
Nick Terrell 0de4991942 Add a method for checking if ZSTD was compiled with flags that impact determinism 2025-03-07 10:31:19 -05:00
Yann Collet db2d205ada fixed -Wconversion for lib/decompress/zstd_decompress_block.c 2025-02-26 10:01:05 -08:00
Yann Collet 30281d889f fix conversion warning 2025-02-26 07:41:34 -08:00
Yann Collet 54e9d46db4 added __clang__ to compiler-specific alignment attribute
when clang is used within msvc, `__GNUC__` isn't defined,
so testing `__clang__` explicitly is required.
2025-02-05 13:48:24 -08:00
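A compiler-specific alignment macro of the kind discussed here might look like the sketch below. `DEMO_ALIGNED` and `demo_isAligned` are hypothetical names; the point is only that `__clang__` is tested explicitly, because clang running in MSVC-compatible mode may not define `__GNUC__`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__GNUC__) || defined(__clang__)
#  define DEMO_ALIGNED(n) __attribute__((aligned(n)))
#elif defined(_MSC_VER)
#  define DEMO_ALIGNED(n) __declspec(align(n))
#else
#  define DEMO_ALIGNED(n)   /* unknown compiler: no alignment guarantee */
#endif

/* A buffer requested on a 64-byte boundary. */
static DEMO_ALIGNED(64) unsigned char demo_buffer[64];

/* Returns 1 if ptr is a multiple of alignment (which must be a power of 2). */
static int demo_isAligned(const void* ptr, size_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return ((uintptr_t)ptr & (alignment - 1)) == 0;
}
```

On a compiler that matches neither branch the macro expands to nothing, so callers should not assume the alignment holds there.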
Yann Collet bcf404c0ab changed C11 keyword to _Alignas
so that it doesn't depend on #include
2025-02-05 13:25:14 -08:00
Yann Collet 26a2b5d5df Merge pull request #4265 from pps83/static-bmi2-check
Check `STATIC_BMI2` instead of `STATIC_BMI2 == 1`
2025-01-31 14:39:20 -08:00
Pavel P 0cda0100ea fix formatting 2025-01-24 03:03:22 +02:00
Pavel P f7e8fc339b Check STATIC_BMI2 instead of STATIC_BMI2 == 1 2025-01-24 03:03:21 +02:00
Pavel P 0a183620a3 Reorder __BMI2__ check
+ if `__BMI2__` defined, then set STATIC_BMI2 for all compilers
 + use `defined(_MSC_VER) && defined(__AVX2__)` as fallback for ms compiler
2025-01-24 03:02:47 +02:00
Pavel P d486ccc9e9 Update comment for STATIC_BMI2 macro 2025-01-24 03:02:47 +02:00
Pavel P 1b15e888fc Move STATIC_BMI2 block as-is to portability_macros.h 2025-01-24 03:02:46 +02:00
Yann Collet a7b59bcb7f Merge pull request #4257 from pps83/dev-x64test
Use _M_X64 only without mixing with _M_AMD64
2025-01-23 12:50:27 -08:00
Yann Collet 55c0c5bdca Merge pull request #4258 from pps83/dev-ZSTD_ALIGNED
Implement ZSTD_ALIGNED for ms compiler
2025-01-22 15:09:35 -08:00
Pavel P a0872a8372 Implement ZSTD_ALIGNED for ms compiler 2025-01-21 02:33:25 +02:00
Pavel P 6c1d1cc600 Use _M_X64 only without mixing with _M_AMD64 2025-01-21 02:27:39 +02:00
Yann Collet 48b186f76b Merge pull request #4253 from facebook/BitContainerType
minor: use BitContainerType when appropriate
2025-01-19 18:35:36 -08:00
Yann Collet 82346b92bb minor: generalize BitContainerType
technically equivalent to `size_t`,
but it's the proper type for underlying register representation.

This makes it possible to control register type, and therefore size, independently from `size_t`,
which can be useful on systems where `size_t` is 32-bit, while the architecture supports 64-bit registers.
2025-01-19 18:05:57 -08:00
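What a dedicated container type makes possible can be sketched as follows. This is an assumption-laden illustration, not the library's definition: `demo_BitContainerType`, the architecture list, and `demo_addBits` are all hypothetical, and the commit itself states the type is currently equivalent to `size_t`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical selection: use a 64-bit container where 64-bit registers are
 * known to exist, even if size_t is only 32 bits wide on that target. */
#if defined(__x86_64__) || defined(_M_X64) || defined(__aarch64__)
typedef uint64_t demo_BitContainerType;
#else
typedef size_t demo_BitContainerType;
#endif

/* Width of the container in bits. */
static unsigned demo_containerBits(void)
{
    return (unsigned)(sizeof(demo_BitContainerType) * 8);
}

/* ORs the low nbBits of value into container at bit position bitPos. */
static demo_BitContainerType demo_addBits(demo_BitContainerType container,
                                          unsigned bitPos,
                                          demo_BitContainerType value,
                                          unsigned nbBits)
{
    demo_BitContainerType mask;
    assert(nbBits < 32);
    assert(bitPos + nbBits <= demo_containerBits());
    mask = ((demo_BitContainerType)1 << nbBits) - 1;
    return container | ((value & mask) << bitPos);
}
```

Code written against the typedef keeps working when the container width changes, which is the decoupling from `size_t` that the message describes.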
Yann Collet 4bbf4a285d enable DYNAMIC_BMI2 by default on x86 (32-bit mode)
so far was only enabled for x64 (64-bit mode)
2025-01-19 08:11:59 -08:00
Yann Collet a556559841 no longer limit automated BMI2 detection to x64
this was previously not triggered in x86 32-bit mode,
due to a limitation in `bitstream.h`, that was fixed in #4248.

Now, `bmi2` will be automatically detected and triggered
at compilation time, if the corresponding instruction set is enabled,
even in 32-bit mode.

Also: updated library documentation, to feature STATIC_BMI2 build variable
2025-01-19 00:08:57 -08:00
Yann Collet 27d7940631 minor: cosmetic, indentation 2025-01-18 22:49:16 -08:00
Yann Collet 9efb09749b added a CI test for x86 32-bit + avx2 combination
which is expected to be quite rare, but nonetheless possible.

This test is initially expected to fail, before integration of #4248 fix
2025-01-18 22:49:16 -08:00
Yann Collet a469e7c083 Merge pull request #4248 from pps83/dev-bzhi32
Use _bzhi_u32 for 32-bit builds when building with STATIC_BMI2
2025-01-18 22:48:24 -08:00
Pavel P fcd684b9b4 update sizeof check 2025-01-19 02:37:35 +02:00
Pavel P d60c4d75e9 remove unrelated changes 2025-01-19 02:36:00 +02:00
Pavel P 462484d5dc change to BitContainerType 2025-01-19 02:34:41 +02:00
Pavel P 26e5fb3614 handle 32bit size_t when building for x64 2025-01-18 23:37:50 +02:00
Pavel P 936927a427 handle 32bit size_t when building for x64 2025-01-18 23:30:55 +02:00
Pavel P ee17f4c6d2 Use _bzhi_u32 for 32-bit builds when building with STATIC_BMI2
`_bzhi_u64` is available only for 64-bit builds, while `BIT_getLowerBits` expects `nbBits` to be less than `BIT_MASK_SIZE` (`BIT_MASK_SIZE` is 32)
2025-01-18 21:33:04 +02:00
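The reasoning here is that, because `nbBits` stays below 32, the 32-bit intrinsic is sufficient wherever the 64-bit one is unavailable. A sketch under stated assumptions: `demo_getLowerBits`, `DEMO_STATIC_BMI2`, and `DEMO_MASK_SIZE` are hypothetical names, the detection rule follows the `__BMI2__` / `_MSC_VER && __AVX2__` conditions mentioned elsewhere in this log, and a plain mask is used when BMI2 is not enabled at compile time.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Compile-time BMI2 detection (assumed rule, see lead-in). */
#if defined(__BMI2__) || (defined(_MSC_VER) && defined(__AVX2__))
#  define DEMO_STATIC_BMI2 1
#  include <immintrin.h>
#else
#  define DEMO_STATIC_BMI2 0
#endif

#define DEMO_MASK_SIZE 32

/* Returns the nbBits lowest bits of bitContainer; nbBits must be < 32. */
static size_t demo_getLowerBits(size_t bitContainer, uint32_t nbBits)
{
    assert(nbBits < DEMO_MASK_SIZE);
#if DEMO_STATIC_BMI2 && (defined(__x86_64__) || defined(_M_X64))
    return (size_t)_bzhi_u64(bitContainer, nbBits);
#elif DEMO_STATIC_BMI2
    /* 32-bit build: _bzhi_u64 does not exist, and the result fits in 32 bits anyway */
    return (size_t)_bzhi_u32((uint32_t)bitContainer, nbBits);
#else
    return bitContainer & (((size_t)1 << nbBits) - 1);
#endif
}
```

All three branches return the same value for any `nbBits` below 32, so the choice only affects which instruction is emitted.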
Pavel P 46e17b805b [asm] Enable x86_64 asm for windows builds 2025-01-18 05:33:08 +02:00
Yann Collet 8bff69af86 Alignment instruction ZSTD_ALIGNED() in common/compiler.h 2025-01-15 17:11:27 -08:00
Yann Collet 6f8e6f3c97 create new compilation macro ZSTD_ARCH_X86_AVX2 2025-01-15 17:11:27 -08:00
MessyHack 42d704ad5e should check defined(_M_X64) not defined(_M_X86) when building with MSVC.
_M_X86 is only defined under MSVC 32Bit
_M_X64 is only defined under MSVC 64Bit
2025-01-10 22:47:48 -08:00
Victor Zhang a610550e2c Merge pull request #4218 from facebook/externC
Move #includes out of `extern "C"` blocks
2025-01-07 10:06:08 -08:00
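The header layout this pull request title refers to can be sketched as below: system includes come first, and only the project's own declarations sit inside the `extern "C"` block. `demo_sum` is a hypothetical function used purely for illustration.

```c
#include <assert.h>
#include <stddef.h>   /* includes stay outside the extern "C" block */

#if defined(__cplusplus)
extern "C" {
#endif

/* Only this header's own declarations receive C linkage. */
size_t demo_sum(const unsigned char* src, size_t srcSize);

#if defined(__cplusplus)
}
#endif

/* Sums the bytes of src; an empty input yields 0. */
size_t demo_sum(const unsigned char* src, size_t srcSize)
{
    size_t total = 0;
    size_t i;
    assert(src != NULL || srcSize == 0);
    for (i = 0; i < srcSize; i++) total += src[i];
    return total;
}
```

Keeping includes outside the block avoids forcing C linkage onto whatever a system header declares when the file is compiled as C++.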
Yann Collet a2ff6ea784 improve ZSTD_getFrameHeader on skippable frames
now reports:
- the header size
- the magic variant (within @dictID field)
2024-12-29 12:26:04 -08:00
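For context, a skippable frame starts with a 4-byte little-endian magic number in the range 0x184D2A50 to 0x184D2A5F, followed by a 4-byte little-endian size of the user data, so its header is 8 bytes. The standalone parser below sketches how the two reported values could be derived; `demo_skippableHeader` and `demo_parseSkippable` are hypothetical names and do not reproduce the `ZSTD_getFrameHeader` interface.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical result type for illustration. */
typedef struct {
    uint32_t magicVariant;   /* low 4 bits of the magic number, 0..15 */
    uint32_t contentSize;    /* size of the user data after the header */
    uint32_t headerSize;     /* 8 for a skippable frame */
} demo_skippableHeader;

/* Reads a 32-bit little-endian value regardless of host endianness. */
static uint32_t demo_readLE32(const uint8_t* p)
{
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}

/* Returns 0 and fills out on success; returns -1 if src is too short
 * or does not start with a skippable-frame magic number. */
static int demo_parseSkippable(const void* src, size_t srcSize, demo_skippableHeader* out)
{
    const uint8_t* const p = (const uint8_t*)src;
    uint32_t magic;
    assert(out != NULL);
    if (src == NULL || srcSize < 8) return -1;
    magic = demo_readLE32(p);
    if ((magic & 0xFFFFFFF0U) != 0x184D2A50U) return -1;
    out->magicVariant = magic & 0xFU;
    out->contentSize = demo_readLE32(p + 4);
    out->headerSize = 8;
    return 0;
}
```

A regular frame (magic 0xFD2FB528) is rejected by this parser, since it only handles the skippable case.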
Yann Collet b339efff2b add dedicated error code for special case
ZSTD_compressSequencesAndLiterals() cannot produce an uncompressed block
2024-12-20 10:37:00 -08:00
Yann Collet 0a5c0807af minor conversion warning fix 2024-12-20 10:36:59 -08:00
Yann Collet 477a01067f codemod: symbolEncodingType_e -> SymbolEncodingType_e 2024-12-20 10:36:56 -08:00
Yann Collet b4a40a845f move Sequences definition to zstd_compress_internal.h
they should not be in common/zstd_internal.h,
since these definitions are not shared beyond lib/compress/.
2024-12-20 10:36:55 -08:00
Victor Zhang 8f49db5a02 Revert "Remove unnecessary extern C declarations from xxhash.h"
This reverts commit 10b9d81909.
2024-12-19 17:54:41 -08:00
Victor Zhang 10b9d81909 Remove unnecessary extern C declarations from xxhash.h 2024-12-19 16:54:32 -08:00
Victor Zhang d0d5ce4c00 Remove extern C blocks from lib/* internal APIs (except xxhash.h) 2024-12-19 16:00:11 -08:00