1
0
mirror of https://github.com/facebook/zstd.git synced 2025-07-03 22:30:29 +02:00
Commit Graph

4440 Commits

Author SHA1 Message Date
ee6475cbbd Add missing parens around macro definition
Fixes #3301.
2022-12-15 17:18:23 -08:00
45ed0df18a check potential overflow of compressBound()
fixed #3323, reported by @nigeltao

Completed documentation around this risk
(which is largely theoretical,
I can't see that happening in any "real world" scenario,
but an erroneous @srcSize value could indeed trigger it).
2022-12-15 15:23:15 -08:00
a91e7ec175 Fix corruption that rarely occurs in 32-bit mode with wlog=25
Fix an off-by-one error in the compressor that emits corrupt blocks if:

* Zstd is compiled in 32-bit mode
* The windowLog == 25 exactly
* An offset of 2^25-3, 2^25-2, 2^25-1, or 2^25 is emitted
* The bitstream had 7 bits leftover before writing the offset

This bug has been present since before v1.0, but wasn't able to easily
be triggered, since until somewhat recently zstd wasn't able to find
matches that were within 128KB of the window size.

Add a test case, and fix 2 bugs in `ZSTD_compressSequences()`:
* The `ZSTD_isRLE()` check was incorrect. It wouldn't produce
  corruption, but it could waste CPU and not emit RLE even if the block
  was RLE
* One windowSize was `1 << windowLog`, not `1u << windowLog`

Thanks to @tansy for finding the issue, and giving us a reproducer!

Fixes Issue #3350.
2022-12-15 14:41:50 -08:00
e2fc93340f Merge branch 'dev' into http-to-https 2022-12-15 10:46:13 -05:00
728e73ebb4 [legacy] Remove FORCE_MEMORY_ACCESS and only use memcpy
Delete unaligned memory access code from the legacy codebase by removing all the
non-memcpy functions. We don't care about speed at all for this codebase, only
simplicity.
2022-12-14 17:54:35 -08:00
f31b83ff34 [decompress] Fix nullptr addition & improve fuzzer
Fix an instance of `NULL + 0` in `ZSTD_decompressStream()`. Also, improve our
`stream_decompress` fuzzer to pass `NULL` in/out buffers to
`ZSTD_decompressStream()`, and fix 2 issues that were immediately surfaced.

Fixes #3351
2022-12-14 17:54:22 -08:00
a78c91ae59 Use proper unaligned access attributes
Instead of using packed attribute hack, just use aligned attribute. It
improves code generation on armv6 and armv7, and slightly improves code
generation on aarch64. GCC generates identical code to regular aligned
access on ARMv6 for all versions between 4.5 and trunk, except GCC 5
which is buggy and generates the same (bad) code as packed access:
https://gcc.godbolt.org/z/hq37rz7sb
2022-12-14 16:00:37 -08:00
4dffc35f2e Convert references to https from http 2022-12-14 06:58:35 -08:00
c43da3d605 Fix C90 compat 2022-12-13 18:01:32 -08:00
e1e82f74f1 Reserve two fields in ZSTD_frameHeader 2022-12-13 17:45:05 -08:00
69ec75f0d5 fix window resizing edge case 2022-12-13 08:35:20 -08:00
80cf73fd66 Merge pull request #3320 from cwoffenden/msvc-arm64-size_t
Fix for MSVC C4267 warning on ARM64 (which becomes error C2220 with /WX)
2022-12-04 18:55:30 -08:00
ecd7601c36 minor: proper pledgedSrcSize trace 2022-11-22 06:00:45 -08:00
c8d870fe52 Improve LDM cparam validation logic 2022-11-21 15:39:18 -05:00
0547c3d3f8 Random edit to re-run the CI
I don't believe the (x64) Mac failure is related to error since it would take the SSE path.
2022-11-19 19:04:08 +01:00
0168914490 Fix for MSVC C4267 error 2022-11-18 11:31:17 +01:00
c2638212af Change threshold for benchmarking 2022-10-27 13:13:17 -07:00
db74d043d6 Speed optimizations with macro 2022-10-27 10:20:44 -07:00
401331909e Commit for benchmarking 2022-10-24 12:35:16 -07:00
dcc7228de9 [lazy] Use switch instead of indirect function calls. (#3295)
Use a switch statement to select the search function instead of an
indirect function call. This results in a sizable performance win.

This PR is a modification of the approach taken in PR #2828.
When I measured performance for that commit, it was neutral.
However, I now see a performance regression on gcc, but still
neutral on clang. I'm measuring on the same platform, but with
newer compilers. The new approach beats both the current dev
branch and the baseline before PR #2828 was merged.

This PR is necessary for Issue #3275, to update zstd in the kernel.
Without this PR there is a large regression in greedy - btlazy2
compression speed. With this PR it is about neutral.

gcc version: 12.2.0
clang version: 14.0.6
dataset: silesia.tar

| Compiler | Level | Dev Speed (MB/s) | PR Speed (MB/s) | Delta  |
|----------|-------|------------------|-----------------|--------|
| gcc      |     5 |            102.6 |           113.7 | +10.8% |
| gcc      |     7 |             66.6 |            74.8 | +12.3% |
| gcc      |     9 |             51.5 |            58.9 | +14.3% |
| gcc      |    13 |             14.3 |            14.3 |  +0.0% |
| clang    |     5 |            108.1 |           114.8 |  +6.2% |
| clang    |     7 |             68.5 |            72.3 |  +5.5% |
| clang    |     9 |             53.2 |            56.2 |  +5.6% |
| clang    |    13 |             14.3 |            14.7 |  +2.8% |

The binary size stays just about the same for clang and gcc, measured
using the `size` command:

| Compiler | Branch | Text    | Data | BSS | Total   |
|----------|--------|---------|------|-----|---------|
| gcc      | dev    | 1127950 | 3312 | 280 | 1131542 |
| gcc      | PR     | 1123422 | 2512 | 280 | 1126214 |
| clang    | dev    | 1046254 | 3256 | 216 | 1049726 |
| clang    | PR     | 1048198 | 2296 | 216 | 1050710 |
2022-10-21 17:14:02 -07:00
99d239de32 Merge pull request #3290 from felixhandte/ddict-dict-id-from-ddict
Make ZSTD_getDictID_fromDDict() Read DictID from DDict
2022-10-18 13:33:32 -04:00
0d5d571080 Merge pull request #3285 from daniellerozenblit/optimal-huff-depth
Optimal huf depth
2022-10-18 10:31:44 -04:00
b4f0d364af Merge 2022-10-17 11:24:24 -07:00
a08fabd51a Rough draft speed optimization 2022-10-17 10:24:29 -07:00
a910489ff5 No longer pass srcSize to minTableLog 2022-10-17 08:03:44 -07:00
b34729018c Minor simplication: no longer need to check src size if using cardinality for minTableLog 2022-10-17 07:55:07 -07:00
d7841d150b Make ZSTD_getDictID_fromDDict() Read DictID from DDict
Currently this function actually reads the dict ID from the dictionary's
header, via `ZSTD_getDictID_fromDict()`. But during decompression the decomp-
ressor actually compares the dict ID in the frame header with the dict ID in
the DDict. Now of course the dict ID in the dictionary contents and the dict
ID in the DDict struct *should* be the same. But in cases of memory corrupt-
ion, where they can drift out of sync, it's misleading for this function to
read it again from the dict buffer rather then return the dict ID that will
actually be used.

Also doing it this way avoids rechecking the magic and so on and so it is a
tiny bit more efficient.
2022-10-14 22:53:03 -04:00
75cd42afd7 Update regression results and better variable naming for HUF_cardinality 2022-10-14 13:37:19 -07:00
c4853e1553 Update threshold to use optimal depth 2022-10-14 11:29:32 -07:00
e60cae33cf Additional ratio optimizations 2022-10-14 10:37:35 -07:00
b7d55cfa0d fix issue #3119
fix segfault error when running zstreamtest with MALLOC_PERTURB_
2022-10-12 23:04:23 -07:00
fa7d9c1139 Set threshold to use optimal table log 2022-10-11 14:33:25 -07:00
8888a2ddcc CI failure fixes 2022-10-11 13:12:19 -07:00
5635827ede Move ZSTD_DEPRECATED before ZSTDLIB_API/ZSTDLIB_STATIC_API
Clang doesn't allow [[deprecated(...)]] attribute after __attribute__.
Move [[deprecated(...)]] before __attribute__ to fix C++14/C++17 uses
with Clang.

Fix #3250
2022-09-22 12:30:44 -07:00
434ffe979c minor: refactor publication of ZSTD_copyCCtx()
for improved clarity
2022-09-22 11:14:21 -07:00
97c23cf615 Merge pull request #3199 from JunHe77/comp
compress:check more bytes to reduce ZSTD_count call
2022-09-19 10:49:10 -07:00
e9e88753d5 Merge pull request #3245 from haampie/fix/SED_ERE_OPT
drop -E flag in sed
2022-09-19 10:48:11 -07:00
f7251f88b9 Merge pull request #3247 from haampie/fix/grep
Fix make variable
2022-09-19 10:47:38 -07:00
ce52acd7dc compress:check more bytes to reduce ZSTD_count call
Comparing 4B instead of comparing 1B in ZSTD_noDict
mode, thus it can avoid cases like match in match[ml]
but mismatch in match[ml-3]..match[ml-1]. So the call
count of ZSTD_count can be reduced.

Signed-off-by: Jun He <jun.he@arm.com>
Change-Id: I3449ea423d5c8e8344f75341f19a2d1643c703f6
2022-09-18 14:45:41 +08:00
8bb833bb5a Merge branch 'null-buffer-decompress' of github.com:daniellerozenblit/zstd into null-buffer-decompress 2022-09-12 18:57:53 -07:00
e46b12e1b4 fix indentation 2022-09-12 18:56:59 -07:00
f59f797aa8 Merge branch 'facebook:dev' into null-buffer-decompress 2022-09-12 14:54:36 -04:00
a1d89424c2 fuzzer error fix 2022-09-12 11:53:37 -07:00
aa82998821 add sequence bound function 2022-09-09 12:34:25 -07:00
6600a05949 Merge pull request #3259 from DimitriPapadopoulos/codespell
Fix typos found by codespell
2022-09-09 11:15:05 -04:00
a06e953db9 some additional comments, remove apt-get from clang jobs, better test titles 2022-09-08 18:30:07 -07:00
0015308c0f Fix typos found by codespell 2022-09-08 23:17:00 +02:00
3d7f9a90df skip flush operation in case where op is NULL 2022-09-08 13:53:13 -07:00
f3ddaaddd6 ternary operator instead of if statement 2022-09-08 12:59:49 -07:00
028842788b fix zero offset to nullpointer errors 2022-09-07 17:52:26 -07:00