1
0
mirror of https://github.com/facebook/zstd.git synced 2025-03-06 16:56:49 +02:00

10295 Commits

Author SHA1 Message Date
Yonatan Komornik
6422d1d7a8 Bugfix: --[no-]row-match-finder do the opposite of what they are supposed to 2023-01-25 17:59:35 -08:00
Yann Collet
a82e0aac44
Merge pull request #3450 from facebook/no_rm_on_o
disable --rm on -o command
2023-01-25 17:51:53 -08:00
Yann Collet
02434e0867 enforce a hard fail when input files are set to be erased
in scenarios where it's supposed to not be possible.

suggested by @terrelln
2023-01-25 16:18:20 -08:00
Yann Collet
8c85b29e32 disable --rm on -o command
make it more similar to -c (aka `stdout`) convention.
2023-01-25 16:09:25 -08:00
Yann Collet
efc9ae3480
Merge pull request #3455 from facebook/fix3454
Provide more accurate error codes for busy-loop scenarios
2023-01-25 15:22:51 -08:00
Nick Terrell
321490cd5b [version-test] Work around bugs in v0.7.3 dict builder
Before calling a dictionary good, make sure that it can compress an
input. If v0.7.3 rejects v0.7.3's dictionary, fall back to the v1.0
dictionary. This is not the job of the verison test to test it, because
we cannot fix this code.
2023-01-25 13:47:51 -08:00
Nick Terrell
8957fef554 [huf] Add generic C versions of the fast decoding loops
Add generic C versions of the fast decoding loops to serve architectures
that don't have an assembly implementation. Also allow selecting the C
decoding loop over the assembly decoding loop through a zstd
decompression parameter `ZSTD_d_disableHuffmanAssembly`.

I benchmarked on my Intel i9-9900K and my Macbook Air with an M1 processor.
The benchmark command forces zstd to compress without any matches, using
only literals compression, and measures only Huffman decompression speed:

```
zstd -b1e1 --compress-literals --zstd=tlen=131072 silesia.tar
```

The new fast decoding loops outperform the previous implementation uniformly,
but don't beat the x86-64 assembly. Additionally, the fast C decoding loops suffer
from the same stability problems that we've seen in the past, where the assembly
version doesn't. So even though clang gets close to assembly on x86-64, it still
has stability issues.

| Arch    | Function       | Compiler     | Default (MB/s) | Assembly (MB/s) | Fast (MB/s) |
|---------|----------------|--------------|----------------|-----------------|-------------|
| x86-64  | decompress 4X1 | gcc-12.2.0   |         1029.6 |          1308.1 |      1208.1 |
| x86-64  | decompress 4X1 | clang-14.0.6 |         1019.3 |          1305.6 |      1276.3 |
| x86-64  | decompress 4X2 | gcc-12.2.0   |         1348.5 |          1657.0 |      1374.1 |
| x86-64  | decompress 4X2 | clang-14.0.6 |         1027.6 |          1659.9 |      1468.1 |
| aarch64 | decompress 4X1 | clang-12.0.5 |         1081.0 |             N/A |      1234.9 |
| aarch64 | decompress 4X2 | clang-12.0.5 |         1270.0 |             N/A |      1516.6 |
2023-01-25 13:47:51 -08:00
Yann Collet
db18a62f89 Provide more accurate error codes for busy-loop scenarios
fixes #3454
2023-01-25 13:07:53 -08:00
daniellerozenblit
f3255bfeff
Merge pull request #3447 from daniellerozenblit/fuzz-sequence-compression
Fuzz large offsets through sequence compression api
2023-01-25 09:27:34 -05:00
daniellerozenblit
29a4c8cc4a
Merge pull request #3452 from daniellerozenblit/fix-seekable-32bit
Fix 32-bit build errors in zstd seekable format
2023-01-25 09:23:34 -05:00
Danielle Rozenblit
63042f1f11 fix 32bit build errors in zstd seekable 2023-01-24 15:53:59 -08:00
Yonatan Komornik
2baac04110
Merge pull request #3451 from yoniko/red-zones-bugfix
Bugfix redzone unpoisoning
2023-01-24 14:32:56 -08:00
Yonatan Komornik
1d636b4ba0 Bug fix redzones by unpoisoning only the intended buffer and not the followup redzone. 2023-01-24 12:54:43 -08:00
Danielle Rozenblit
7d600c628a fix bound check for ZSTD_copySequencesToSeqStoreNoBlockDelim() 2023-01-24 06:40:40 -08:00
Elliot Gorokhovsky
41682e6293
Merge pull request #3448 from facebook/embg-doc-fix
Fix ZSTD_estimate* and ZSTD_initCStream() docs
2023-01-23 15:04:53 -05:00
Danielle Rozenblit
0a91b31b17 Merge branch 'dev' into fuzz-sequence-compression
for testing
2023-01-23 11:11:33 -08:00
daniellerozenblit
9116000be6
Merge pull request #3439 from daniellerozenblit/sequence-validation-bug-fix
Fix sequence validation and seqStore bounds check
2023-01-23 13:50:37 -05:00
Danielle Rozenblit
7fc00c18b8 calloc dictionary in sequence compression fuzzer rather than generating a random buffer 2023-01-23 10:42:09 -08:00
Elliot Gorokhovsky
3bfd3be5fb
Fix ZSTD_estimate* and ZSTD_initCStream() docs
Fix the following documentation bugs:
* Note that `ZSTD_estimate*` functions are not compatible with the external matchfinder API
* Note that `ZSTD_estimateCStreamSize_usingCCtxParams()` is not compatible with `nbWorkers >= 1`
* Remove incorrect warning that the legacy streaming API is incompatible with advanced parameters and/or dictionary compression
* Note that `ZSTD_initCStream()` is incompatible with dictionary compression
* Warn that
2023-01-23 13:28:36 -05:00
Yann Collet
ced0882e45
Merge pull request #3443 from facebook/no_rm_w_stdout
refactor : --rm ignored with stdout
2023-01-23 10:22:11 -08:00
Nick Terrell
dc2b3e8876 Fix -Wstringop-overflow warning
Backported from kernel patch [0].

I wasn't able to reproduce the warning locally, but could repro it in
the kernel.

[0] https://lore.kernel.org/lkml/20220330193352.GA119296@embeddedor/
2023-01-23 10:12:25 -08:00
Danielle Rozenblit
815d1d4eda update external sequence error to fit error naming scheme 2023-01-23 09:58:34 -08:00
Elliot Gorokhovsky
6aee603d0e
Merge pull request #3446 from facebook/dependabot/github_actions/github/codeql-action-2.1.39
Bump github/codeql-action from 2.1.38 to 2.1.39
2023-01-23 11:55:13 -05:00
Danielle Rozenblit
f75afb613f merge dev 2023-01-23 08:12:19 -08:00
Danielle Rozenblit
1b65727e74 fix nits and add new error code for invalid external sequences 2023-01-23 07:59:02 -08:00
Danielle Rozenblit
638d502002 modify sequence compression api fuzzer 2023-01-23 07:55:11 -08:00
dependabot[bot]
3663faa05a
Bump github/codeql-action from 2.1.38 to 2.1.39
Bumps [github/codeql-action](https://github.com/github/codeql-action) from 2.1.38 to 2.1.39.
- [Release notes](https://github.com/github/codeql-action/releases)
- [Changelog](https://github.com/github/codeql-action/blob/main/CHANGELOG.md)
- [Commits](515828d974...a34ca99b46)

---
updated-dependencies:
- dependency-name: github/codeql-action
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2023-01-23 05:10:26 +00:00
Yann Collet
b6fd91ba84 update man 2023-01-20 18:07:55 -08:00
Yann Collet
cee6bec9fa refactor : --rm is ignored with stdout
`zstd` CLI has progressively moved to the policy of
ignoring `--rm` command when the output is `stdout`.
The primary drive is to feature a behavior more consistent with `gzip`,
when `--rm` is the default, but is also ignored when output is `stdout`.
Other policies are certainly possible, but would break from this `gzip` convention.

The new policy was inconsistenly enforced, depending on the exact list of commands.
For example, it was possible to circumvent it by using `-c --rm` in this order,
which would re-establish source removal.

- Update the CLI so that it necessarily catch these situations and ensure that `--rm` is always disabled when output is `stdout`.
- Added a warning message in this case (for verbosity 3 `-v`).
- Added an `assert()`, which controls that `--rm` is no longer active with `stdout`
- Added tests, which control the behavior, even when `--rm` is added after `-c`
- Removed some legacy code which where trying to apply a specific policy for the `stdout` + `--rm` case, which is no longer possible
2023-01-20 18:04:55 -08:00
Yann Collet
d9280afb7d fixed minor c89 warning
introduced due to parallel merges
2023-01-20 18:04:20 -08:00
Felix Handte
3d25502c2d
Merge pull request #3432 from felixhandte/fix-perms
Fix CLI Handling of Permissions and Ownership (Again)
2023-01-20 19:19:05 -05:00
Felix Handte
772229afd5
Merge pull request #3442 from felixhandte/pgo-tests
Test PGO Builds
2023-01-20 19:18:51 -05:00
Nick Terrell
b4467c1061 Fix bufferless API with attached dictionary
Fixes #3102.
2023-01-20 16:15:16 -08:00
W. Felix Handte
aab3dd4312 Add PGO Build Jobs to CI 2023-01-20 18:37:04 -05:00
W. Felix Handte
87e169d05d Add Additional Flags to PGO Build
In GCC, we can add a couple more flags to give us confidence that the profile
data is actually being found and used.

Also, my system for example doesn't have a binary installed under the name
`llvm-profdata`, but it does have, e.g., `llvm-profdata-13`, etc. So this
commit adds a variable that can be overridden.
2023-01-20 17:32:49 -05:00
Nick Terrell
329169189c Replace Huffman boolean args with flags bit set 2023-01-20 14:12:53 -08:00
Nick Terrell
0cc1b0cb22 Delete unused Huffman functions
Remove all Huffman functions that aren't used by zstd.
2023-01-20 14:12:53 -08:00
Yann Collet
bb9b9bc7be
Merge pull request #3431 from facebook/cygwin
added cygwin tests to Github Actions
2023-01-20 14:07:25 -08:00
Yann Collet
6742f20a7f
Merge pull request #3435 from facebook/c89build
added c89 build test to CI
2023-01-20 14:07:12 -08:00
Nick Terrell
667eb6d4fd [versions-test] Work around bug in dictionary builder for older versions
Older versions of zstandard have a bug in the dictionary builder, that
can cause dictionary building to fail. The process still exits 0, but
the dictionary is not created.

For reference, the bug is that it creates a dictionary that starts with
the zstd dictionary magic, in the process of writing the dictionary header,
but the header isn't fully written yet, and zstd fails compressions in
this case, because the dictionary is malformated. We fixed this later on
by trying to load the dictionary as a zstd dictionary, but if that fails
we fallback to content only (by default).

The fix is to:
1. Make the dictionary determinsitic by sorting the input files.
   Previously the bug would only sometimes occur, when the input files
   were in a particular order.
2. If dictionary creation fails, fallback to the `head` dictionary.
2023-01-20 14:05:36 -08:00
Nick Terrell
666944fbe6 Cap hashLog & chainLog to ensure that we only use 32 bits of hash
* Cap shortCache chainLog to 24
* Cap row match finder hashLog so that rowLog <= 24
* Add unit tests to expose all cases. The row match finder unit tests
  are only run in 64-bit mode, because they allocate ~1GB.

Fixes #3336
2023-01-20 14:05:26 -08:00
Yann Collet
abf965c64a
Merge pull request #3415 from facebook/dependabot/github_actions/actions/upload-artifact-3.1.2
Bump actions/upload-artifact from 3.1.1 to 3.1.2
2023-01-20 14:00:28 -08:00
Danielle Rozenblit
aa385ece13 fix sequence validation and bounds check in ZSTD_copySequencesToSeqStore() 2023-01-20 10:32:35 -08:00
Elliot Gorokhovsky
64963dcbd6
Merge pull request #3437 from embg/fuzz_emf
Fuzz the External Matchfinder API
2023-01-20 11:42:59 -05:00
Elliot Gorokhovsky
f593e54ee1
Enable if == 1 rather than if == 0
Co-authored-by: Nick Terrell <nickrterrell@gmail.com>
2023-01-20 11:41:53 -05:00
Yann Collet
cd272d7a2d added cygwin tests to github actions 2023-01-19 17:43:43 -08:00
Yann Collet
ea684c335a added c89 build test to CI 2023-01-19 14:59:30 -08:00
Elliot Gorokhovsky
3f9f568aa6 Fuzz the external matchfinder API 2023-01-19 13:33:25 -08:00
Elliot Gorokhovsky
bce0382c82
Bugfixes for the External Matchfinder API (#3433)
* external matchfinder bugfixes + tests

* small doc fix
2023-01-19 10:41:24 -05:00
daniellerozenblit
dc1c6cc5df
Merge pull request #3418 from daniellerozenblit/fuzz-max-block-size
Fuzz on maxBlockSize
2023-01-19 08:18:04 -05:00