1
0
mirror of https://github.com/facebook/zstd.git synced 2025-07-03 22:30:29 +02:00
Commit Graph

100 Commits

Author SHA1 Message Date
a7de1d9f49 Fix all MSVC warnings (#3495)
* fix and test MSVC AVX2 build

* treat msbuild warnings as errors

* fix incorrect MSVC 2019 compiler warning

* fix MSVC error D9035: option 'Gm' has been deprecated and will be removed in a future release
2023-02-11 10:56:59 -05:00
bda947e17a [huf] Fix bug in fast C decoders
The input bounds checks were buggy because they were only breaking from
the inner loop, not the outer loop. The fuzzers found this immediately.
The fix is to use `goto _out` instead of `break`.

This condition can happen on corrupted inputs.

I've benchmarked before and after on x86-64 and there were small changes
in performance, some positive, and some negative, and they end up about
balacing out.

Credit to  OSS-Fuzz
2023-01-26 14:39:13 -08:00
8957fef554 [huf] Add generic C versions of the fast decoding loops
Add generic C versions of the fast decoding loops to serve architectures
that don't have an assembly implementation. Also allow selecting the C
decoding loop over the assembly decoding loop through a zstd
decompression parameter `ZSTD_d_disableHuffmanAssembly`.

I benchmarked on my Intel i9-9900K and my Macbook Air with an M1 processor.
The benchmark command forces zstd to compress without any matches, using
only literals compression, and measures only Huffman decompression speed:

```
zstd -b1e1 --compress-literals --zstd=tlen=131072 silesia.tar
```

The new fast decoding loops outperform the previous implementation uniformly,
but don't beat the x86-64 assembly. Additionally, the fast C decoding loops suffer
from the same stability problems that we've seen in the past, where the assembly
version doesn't. So even though clang gets close to assembly on x86-64, it still
has stability issues.

| Arch    | Function       | Compiler     | Default (MB/s) | Assembly (MB/s) | Fast (MB/s) |
|---------|----------------|--------------|----------------|-----------------|-------------|
| x86-64  | decompress 4X1 | gcc-12.2.0   |         1029.6 |          1308.1 |      1208.1 |
| x86-64  | decompress 4X1 | clang-14.0.6 |         1019.3 |          1305.6 |      1276.3 |
| x86-64  | decompress 4X2 | gcc-12.2.0   |         1348.5 |          1657.0 |      1374.1 |
| x86-64  | decompress 4X2 | clang-14.0.6 |         1027.6 |          1659.9 |      1468.1 |
| aarch64 | decompress 4X1 | clang-12.0.5 |         1081.0 |             N/A |      1234.9 |
| aarch64 | decompress 4X2 | clang-12.0.5 |         1270.0 |             N/A |      1516.6 |
2023-01-25 13:47:51 -08:00
dc2b3e8876 Fix -Wstringop-overflow warning
Backported from kernel patch [0].

I wasn't able to reproduce the warning locally, but could repro it in
the kernel.

[0] https://lore.kernel.org/lkml/20220330193352.GA119296@embeddedor/
2023-01-23 10:12:25 -08:00
329169189c Replace Huffman boolean args with flags bit set 2023-01-20 14:12:53 -08:00
0cc1b0cb22 Delete unused Huffman functions
Remove all Huffman functions that aren't used by zstd.
2023-01-20 14:12:53 -08:00
6a9c525903 spec update : require minimum nb of literals for 4-streams mode
Reported by @shulib :
the specification for 4-streams mode
doesn't work when the amount of literals to compress is 5 bytes.
Extending it, it also doesn't work for sizes 1 or 2.

This patch updates the specification and the implementation
to require a minimum of 6 literals to trigger or accept the 4-streams mode.

The impact is expected to be a no-op :
the 4-streams mode is never triggered for such small quantity of literals anyway,
since it would be wasteful (it costs ~7.3 bytes more than single-stream mode).
An informal lower limit is set at ~256 bytes,
so the technical minimum is very far from this limit.

This is just meant for completeness of the specification.
2022-12-22 16:14:34 -08:00
5d693cc38c Coalesce Almost All Copyright Notices to Standard Phrasing
```
for f in $(find . \( -path ./.git -o -path ./tests/fuzz/corpora -o -path ./tests/regression/data-cache -o -path ./tests/regression/cache \) -prune -o -type f); do sed -i '/Copyright .* \(Yann Collet\)\|\(Meta Platforms\)/ s/Copyright .*/Copyright (c) Meta Platforms, Inc. and affiliates./' $f; done

git checkout HEAD -- build/VS2010/libzstd-dll/libzstd-dll.rc build/VS2010/zstd/zstd.rc tests/test-license.py contrib/linux-kernel/test/include/linux/xxhash.h examples/streaming_compression_thread_pool.c lib/legacy/zstd_v0*.c lib/legacy/zstd_v0*.h
nano ./programs/windres/zstd.rc
nano ./build/VS2010/zstd/zstd.rc
nano ./build/VS2010/libzstd-dll/libzstd-dll.rc
```
2022-12-20 12:52:34 -05:00
8927f985ff Update Copyright Headers 'Facebook' -> 'Meta Platforms'
```
for f in $(find . \( -path ./.git -o -path ./tests/fuzz/corpora \) -prune -o -type f);
do
  sed -i 's/Facebook, Inc\./Meta Platforms, Inc. and affiliates./' $f;
done
```
2022-12-20 12:37:57 -05:00
529cd7b821 Fix nits 2022-02-14 14:24:50 -05:00
db2f4a6532 Move bitwise builtins into bits.h 2022-02-14 11:16:03 -05:00
568c69a4eb x86-64: Hide internal assembly functions
Hide x86-64 internal assembly functions. Before

$ nm -D lib/libzstd.so.1 | grep usingDTable_internal_bmi2_asm_loop
00000000000c23c0 T _HUF_decompress4X1_usingDTable_internal_bmi2_asm_loop
00000000000c23c0 T HUF_decompress4X1_usingDTable_internal_bmi2_asm_loop
00000000000c283d T _HUF_decompress4X2_usingDTable_internal_bmi2_asm_loop
00000000000c283d T HUF_decompress4X2_usingDTable_internal_bmi2_asm_loop
$

After

$ nm -D lib/libzstd.so.1 | grep usingDTable_internal_bmi2_asm_loop
$

This fixes issue #2990.
2022-01-11 10:12:24 -08:00
e1ab2200ff fixed x32 compatibility 2021-12-10 21:02:17 -08:00
c284569457 [asm] Share portability macros and restrict ASM further
Move portability macros to `lib/common/portability_macros.h`. This file
only contains platform/feature detection (e.g. 0/1 macros). This file is
shared between C and ASM code, so it cannot include any C code.

Rename `HUF_` ASM macros to be `ZSTD_` prefixed, and move to the new
header.

Restrict `ZSTD_ASM_SUPPORTED` to `__GNUC__`, because we need the GAS
assembler.

Finally, only include the ASM code if we are actually going to use it.
This disables it on all Windows platforms, which should resolve the
problem brought up in Issue #2789.
2021-12-02 16:58:04 -08:00
5414dd7978 [bmi2] Add lzcnt and bmi target attributes
* When dynamic dispatching to bmi2 add lzcnt and bmi to the
  TARGET_ATTRIBUTE.
* Centralize the bmi2 TARGET_ATTRIBUTE definition to
  BMI2_TARGET_ATTRIBUTE so we can change it in the future.
* Only enable bmi2 when both bmi1 & bmi2 are supported. There shouldn't
  be any cases where bmi2 is supported but bmi1 isn't. But, since we are
  using the instruction we should check bmi1 as well.
2021-11-30 17:54:56 -08:00
ebbd675998 Fix typos 2021-11-13 10:04:04 +02:00
d46995efeb Backport zstd patch from LKML
Credit to Nathan Chancellor for the bug fix and Nick Desaulniers for the
bug report.

Link: https://github.com/ClangBuiltLinux/linux/issues/1486
Link: https://lore.kernel.org/all/20211021202353.2356400-1-nathan@kernel.org/
2021-11-05 14:09:49 -07:00
3a4d421c0f Merge pull request #2802 from solbjorn/fix-kernel-wundef
[contrib][linux] Fix -Wundef inside Linux kernel tree
2021-09-29 09:48:17 -07:00
a07ddb47f7 [huf] Fix OSS-Fuzz assert
PR #2784 introduced a bug in the decompressor that caused some valid
inputs to fail to decompress. The bitstream isn't reloaded after the 4X*
loop if the number of elements remaining is small enough, causing us to
read more bits than are available in the bitcontainer.

This was caught by the MSAN fuzzer in OSS-Fuzz because the assembly
implementation isn't used in the MSAN build.

Credit to OSS-Fuzz.
2021-09-27 13:56:07 -07:00
71526e6f29 [contrib][linux] Fix -Wundef inside Linux kernel tree
Commit d7ef97a013
("[build] Fix oss-fuzz build with the dataflow sanitizer") broke
build inside Linux-kernel after 'import', as it no longer can
conditionally remove ZSTD_MEMORY_SANITIZER definition from
the #if DEF_A || DEF_B block. This emits -Wundef warning which
can be treated as error.
Split this preprocessor condition into two separate conditions
to fix this.

Fixes: d7ef97a013 ("[build] Fix oss-fuzz build with the dataflow sanitizer")
Signed-off-by: Alexander Lobakin <alobakin@pm.me>
2021-09-25 13:35:25 +02:00
d7ef97a013 [build] Fix oss-fuzz build with the dataflow sanitizer
The dataflow sanitizer requires all code to be instrumented. We can't
instrument the ASM function, so we have to disable it.
2021-09-23 11:48:39 -07:00
9450876a9d [huf] Fix compilation when DYNAMIC_BMI2=0 && BMI2 is supported
* Fix compilation issues pointed out in PR #2790.
* Add test cases to GitHub actions that test all combinations of
  `DYNAMIC_BMI2` BMI2 support.
2021-09-21 16:49:13 -07:00
a5f2c45528 Huffman ASM 2021-09-20 14:46:43 -07:00
d7542aacd9 [fuzzer] Add huf_decompress fuzzer
Add a fuzzer for Huffman decompression. Fix several bugs in Huffman
decompression, mostly related to `op == NULL` and pointer underflow.
2021-09-17 15:00:49 -07:00
a494308ae9 [copyright][license] Switch to yearless copyright and some cleanup in the linux-kernel files
* Switch to yearless copyright per FB policy
* Fix up SPDX-License-Identifier lines in `contrib/linux-kernel` sources
* Add zstd copyright/license header to the `contrib/linux-kernel` sources
* Update the `tests/test-license.py` to check for yearless copyright
* Improvements to `tests/test-license.py`
* Check `contrib/linux-kernel` in `tests/test-license.py`
2021-03-30 10:30:43 -07:00
756bd59322 [huf][fse] Clean up workspaces
* Move `counting` to a struct in `FSE_decompress_wksp_body()`
* Fix error code in `FSE_decompress_wksp_body()`
* Rename a variable in `HUF_ReadDTableX2_Workspace`
2021-03-17 16:50:37 -07:00
0f18059a4e [huf] Reduce stack usage of HUF_readDTableX2 by ~460 bytes
* Use `HUF_readStats_wksp()`
* Use workspace in `HUF_fillDTableX2*()`
* Clean up workspace usage to use a workspace struct
2021-03-05 12:39:46 -08:00
66e811d782 [license] Update year to 2021 2021-01-04 17:53:52 -05:00
79ded1b4a9 [lib] Add ZSTD_NO_UNUSED_FUNCTIONS macro to hide unused functions
The unused function definitions are hidden behind a
`#ifndef ZSTD_NO_UNUSED_FUNCTIONS` check.

Initially hiding all functions which are unused and take up more than
2KB of stack space, because these will show up as warnings in the
Linux Kernel build system.
2020-09-09 14:35:39 -07:00
f91ed5c766 [lib] s/current/curr because it collides with Linux Kernel macro 2020-09-09 14:35:39 -07:00
c465f24457 ZSTD_ prefix mem{cpy,move,set},malloc,calloc,free 2020-08-26 12:26:03 -07:00
80f577baa2 Move standard includes to zstd_deps.h 2020-08-26 12:25:08 -07:00
8f8bd2d1ac [regression] Update results.csv 2020-08-20 12:41:35 -07:00
612e947c5e wire up bmi2 support 2020-08-17 16:35:28 -07:00
ba1fd17a9f speed up literal header decoding 2020-08-17 12:17:53 -07:00
6028827fee Rewrite Include Paths to be Relative
Addresses #1998.
2020-05-04 15:20:26 -04:00
a93fadfcd9 Further replication removed
`CHECK_F` is now in `error_private.h`. Minor tidy.
2020-04-07 11:25:16 +02:00
7202184ee0 Fixes decompressor when using -Wshorten-64-to-32 (#2062)
Spotted on iOS when building with `-Wshorten-64-to-32` (since `__builtin_expect` returns a `long`).
2020-04-03 02:55:29 -07:00
ac58c8d720 Fix copyright and license lines
* All copyright lines now have -2020 instead of -present
* All copyright lines include "Facebook, Inc"
* All licenses are now standardized

The copyright in `threading.{h,c}` is not changed because it comes from
zstdmt.

The copyright and license of `divsufsort.{h,c}` is not changed.
2020-03-26 17:02:06 -07:00
cb2abc3dbe Fix performance regression on aarch64 with clang 2020-01-23 17:31:14 -08:00
718f00ff6f Optimize decompression speed for gcc and clang (#1892)
* Optimize `ZSTD_decodeSequence()`
* Optimize Huffman decoding
* Optimize `ZSTD_decompressSequences()`
* Delete `ZSTD_decodeSequenceLong()`
2019-11-25 18:26:19 -08:00
e0d6daabac Fix Appveyor failure 2019-11-19 11:12:26 -08:00
b3c9fc27b4 Optimized loop bounds to allow the compiler to unroll the loop.
This has no measurable impact on large files but improves small file
decompression by ~1-2% for 10kB, benchmarked with:

head -c 10000 silesia.tar > /tmp/test
make CC=/usr/local/bin/clang-9 BUILD_STATIC=1 && ./lzbench -ezstd -t1,5 /tmp/test
2019-11-15 08:27:05 +01:00
901ea61f83 Tweaks to create a single-file decoder
The CHECK_F macros differ slightly (but eventually do the same thing). Older GCC needs to fallback on the old-style pragma optimisation flags.
2019-08-21 17:49:17 +02:00
0d606ee3db Fix Incorrect assert() 2018-12-18 13:36:39 -08:00
c2d51637d9 Add Mutual-Exclusion Error 2018-12-18 13:36:39 -08:00
c560e34c86 Add HUF_FORCE_DECOMPRESS_X2 2018-12-18 13:36:39 -08:00
abd1567d3c Move HUF_DGEN Up Out of X1 Definitions 2018-12-18 13:36:39 -08:00
432314b58a Rename HUF_DECOMPRESS_MINIMAL -> HUF_FORCE_DECOMPRESS_X1 2018-12-18 13:36:39 -08:00
f45c9df42e Totally Hide/Disable X2 Variants when HUF_DECOMPRESS_MINIMAL is Defined 2018-12-18 13:36:39 -08:00