mirror of https://github.com/facebook/zstd.git synced 2025-07-06 23:56:40 +02:00
Commit Graph

2243 Commits

Author SHA1 Message Date
e3e0775cc8 [API] Add ZSTD_c_stable{In,Out}Buffer parameters
This commit adds the parameters and sets the value in the CCtxParams
but it does not do anything with the value.
2020-10-30 10:54:39 -07:00
e2581d9572 [lib] Set appliedParams in zstdmt mode
Previously only `nbWorkers` was set. Set all parameters, because that is
what is expected. This is needed for the `ZSTD_c_stable{In,Out}Buffer`
parameters.
2020-10-30 10:54:38 -07:00
536e89c723 Sequence extractor should update CBlockState 2020-10-30 12:13:19 -04:00
32cac2627a Emit last literals of 0 size as well, to indicate block boundary 2020-10-29 16:41:17 -04:00
69bd5f0654 Correct literalsRead calculation to include longLength 2020-10-29 14:49:37 -04:00
59624f3163 Remove implicit typecast to appease AppVeyor Windows build 2020-10-28 16:25:09 -04:00
3ed5d053d8 Clarify comments in zstd.h some more 2020-10-28 09:53:09 -04:00
599ff58e08 Merge pull request #2339 from terrelln/zstdmt-stability
Fix zstdmt stability issues and clean up the zstdmt code
2020-10-27 19:43:13 -07:00
17b700d78a Merge pull request #2366 from senhuang42/enable_ldm_by_default
Enable LDM by default if window size >= 128MB and strategy uses opt parser
2020-10-27 14:59:28 -04:00
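
The condition named in the merge title above can be read as a small predicate. The following is a minimal sketch, not the code from the pull request: the enum and function names are hypothetical stand-ins, and it assumes that "uses opt parser" means the btopt strategy and stronger and that 128MB corresponds to a window log of 27 (2^27 bytes).

```c
#include <stdbool.h>

/* Hypothetical stand-in for zstd's strategy ordering; only the relative
 * order of the values matters for this sketch. */
typedef enum {
    TOY_FAST = 1, TOY_DFAST, TOY_GREEDY, TOY_LAZY, TOY_LAZY2,
    TOY_BTLAZY2, TOY_BTOPT, TOY_BTULTRA, TOY_BTULTRA2
} toy_strategy;

/* 128 MB == 1 << 27 bytes, so the window-size test becomes a windowLog test. */
#define TOY_LDM_DEFAULT_WINDOW_LOG 27u

/* Returns true when long distance matching would be switched on by default:
 * the window is at least 128 MB and the strategy is an optimal-parser one. */
static bool toy_ldm_enabled_by_default(unsigned windowLog, toy_strategy strategy)
{
    return windowLog >= TOY_LDM_DEFAULT_WINDOW_LOG && strategy >= TOY_BTOPT;
}
```

For example, `toy_ldm_enabled_by_default(27, TOY_BTOPT)` is true, while a 64MB window (`windowLog` 26) or a lazy strategy leaves LDM off unless the user asks for it.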
0953645837 Merge pull request #2362 from senhuang42/fix_ldm_fuzz_issue
Fix long distance matcher OSS-fuzz issue
2020-10-27 11:13:03 -07:00
3163909d14 Remove unused variable position 2020-10-27 12:58:12 -04:00
dc448563e9 Add test compatibility with last literals in sequences 2020-10-27 12:35:28 -04:00
1d221ecc03 Add support for representing last literals in the extracted seqs 2020-10-27 11:19:48 -04:00
9171f920cd Improve documentation of seqStore_t 2020-10-27 10:50:22 -04:00
96b0ff7886 Improve documentation regarding various operations in copyBlockSequences 2020-10-27 10:36:06 -04:00
3a11c7eb03 Modify ZSTD_copyBlockSequences to agree with new API 2020-10-27 10:31:40 -04:00
8bdb32aebe Add a function for LDM enable check 2020-10-20 13:46:02 -04:00
578e889ec1 Move ldm enable to compressStream2() 2020-10-20 13:04:45 -04:00
d28d8a1d72 Include LDM tables size for CCtx size estimation where relevant 2020-10-20 09:21:30 -04:00
b1c7fc5768 Add compatibility for multithreading 2020-10-19 12:07:06 -04:00
590f7f55f0 Add ldm enable condition in ZSTD_resetCCtx_internal 2020-10-19 10:26:17 -04:00
4d01979b62 Expose and call ZSTD_ldm_skipRawSeqStoreBytes() 2020-10-16 20:30:00 -04:00
a0ec50c2dc Merge pull request #2355 from senhuang42/change_ldm_mt_config
Reduce --long mode MT jobsize at higher levels
2020-10-16 13:35:50 -07:00
f49926edf4 Change cycleLog adjustment to +3 from +4 2020-10-15 09:56:05 -04:00
ee84817fe7 Reset posInSequence when using ZSTD_referenceExternalSequences() 2020-10-14 22:06:08 -04:00
d0550bb18f Clarify argument names, fix DEBUGLOG() statements 2020-10-14 15:45:43 -04:00
3f99c9b38d Adjust match backwards count args 2020-10-14 15:23:03 -04:00
bf0d559449 Introduce, implement, and call ZSTD_ldm_countBackwardsMatch_2segments() 2020-10-14 12:58:06 -04:00
467e4383b0 Merge branch 'dev' of github.com:senhuang42/zstd into change_ldm_mt_config 2020-10-14 10:17:50 -04:00
f5d5cd3b40 Merge pull request #2341 from senhuang42/ldm_optimized_for_opt_parser
Integrate long distance matches into optimal parser
2020-10-13 13:09:07 -07:00
7e6f91ed84 [minor] Improve docs and add an assert in response to review 2020-10-12 16:43:17 -07:00
354b5f1c0a Use cycleLog instead of chainLog to determine LDM jobLog 2020-10-12 16:09:59 -04:00
441ce4178f [zstdmt] Clarify a comment 2020-10-12 12:58:13 -07:00
efff5d8b2d [zstdmt] Fix determinism issue with rsyncable mode
The problem occurs in this scenario:
1. We find a synchronization point.
2. We attempt to create the job.
3. We fail because the job table is full: `mtctx->nextJobID > mtctx->doneJobID + mtctx->jobIDMask`.
4. We call `ZSTDMT_compressStream_generic` again.
5. We forget that we're at a sync point already, and we continue looking
   for the next sync point.

The fix is to detect whether we're currently paused at a sync point, and
if we are, not to load any more input.

Caught by zstreamtest. I modified it to make the bug occur more often
(~1/100K -> ~1/200) and verified that it is fixed after. I then ran a
few hundred thousand unmodified zstreamtest iterations to verify.
2020-10-12 12:55:17 -07:00
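
The scenario in the commit message above can be modeled with a toy job table. This is a minimal sketch under stated assumptions, not the zstdmt implementation: `toy_mtctx`, `pausedAtSyncPoint`, and the helper functions are hypothetical names, and only the full-table test `nextJobID > doneJobID + jobIDMask` is taken from the message.

```c
#include <stdbool.h>

/* Toy model of the fields named in the commit message. */
typedef struct {
    unsigned nextJobID;      /* ID the next created job would get */
    unsigned doneJobID;      /* ID of the oldest job not yet flushed */
    unsigned jobIDMask;      /* job table size minus one */
    bool pausedAtSyncPoint;  /* hypothetical flag: a sync point is pending */
} toy_mtctx;

/* Same test as in step 3 of the scenario. */
static bool toy_job_table_full(const toy_mtctx* ctx)
{
    return ctx->nextJobID > ctx->doneJobID + ctx->jobIDMask;
}

/* Tries to create a job. If the table is full while we sit on a sync
 * point, remember that fact instead of forgetting it. */
static bool toy_try_create_job(toy_mtctx* ctx, bool atSyncPoint)
{
    if (toy_job_table_full(ctx)) {
        if (atSyncPoint) ctx->pausedAtSyncPoint = true;
        return false;
    }
    ctx->nextJobID++;
    ctx->pausedAtSyncPoint = false;
    return true;
}

/* The fix in one line: never load more input while paused at a sync point. */
static bool toy_may_load_input(const toy_mtctx* ctx)
{
    return !ctx->pausedAtSyncPoint;
}
```

Because the pause is recorded in the context, a second call after the failed job creation sees the same sync point and cuts the job at the same position, which is what keeps the output deterministic in this model.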
ede4f97153 [zstdmt] Fix bug where extra empty blocks are emitted
When zstdmt cannot get a buffer and `ZSTD_e_end` is passed, an empty
compression job can be created. Additionally, `mtctx->frameEnded` can be
set to 1, which could potentially cause problems like unterminated blocks.

The fix is to adjust to `ZSTD_e_flush` even when we can't get a buffer.
2020-10-12 12:55:17 -07:00
c51a9e79b9 [zstdmt] Rip out the zstdmt API
This commit leaves only the functions used by zstd_compress.c. All other
functions have been removed from the API. The ZSTDMT unit tests in
fuzzer.c and zstreamtest.c have been rewritten to use the ZSTD API. And
the --mt zstreamtest tests have been ripped out.
2020-10-12 12:55:16 -07:00
1784c4b4ab [zstdmt] Remove single-pass shortcut
Simplifies the code and removes blocking from zstdmt.

At this point we could completely delete
`ZSTDMT_compress_advanced_internal()`. However I'm leaving it in because
I think we want to do that in the zstd-1.5.0 release, in case anyone is
still using the ZSTDMT API, even though it is not installed by default.

Fixes #2327.
2020-10-12 12:53:26 -07:00
b55ae009ac [zstdmt] Remove singleBlockingThread mode
This is already handled by zstd, so this logic is never used.
2020-10-12 12:53:26 -07:00
d5c688e8ae Fix ZSTD_adjustCParams_internal() to handle dictionary logic
Pass in the `ZSTD_cParamMode_e` to select how we define our cparams.
Based on the mode we either take the `dictSize` into account or we set
it to `0`. See the documentation for `ZSTD_cParamMode_e`.

Some of the modes currently share the same behavior. But they have
distinct modes because they are drastically different cases. E.g.
compression + reprocessing the dictionary and creating a cdict.

Additionally, when downsizing the hashLog and chainLog take the
(adjusted) dictionary size into account, since the size of the
dictionary gets added onto the window size.

Adds a simple test to ensure that we aren't downsizing too far.
2020-10-12 12:50:04 -07:00
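
The downsizing rule described in the commit message above can be sketched as follows. This is an illustration under assumptions, not `ZSTD_adjustCParams_internal()` itself: the two-value `toy_mode` enum, the helper names, and the cap of "log2 of window plus dictionary, plus one" are all hypothetical simplifications chosen for the example.

```c
#include <stddef.h>

/* Hypothetical two-way mode: either the dictionary size counts, or it is
 * treated as 0 (the commit message describes exactly this choice). */
typedef enum { TOY_USE_DICT_SIZE, TOY_IGNORE_DICT_SIZE } toy_mode;

/* Smallest n such that (1 << n) >= v, for v >= 1. */
static unsigned toy_ceil_log2(unsigned long long v)
{
    unsigned n = 0;
    while (n < 63 && (1ULL << n) < v) n++;
    return n;
}

/* Caps hashLog so the table is not larger than the data it can index.
 * The dictionary is added onto the window before taking the log, so a
 * large dictionary prevents the table from being downsized too far.
 * Assumes windowLog < 63. */
static unsigned toy_adjust_hash_log(unsigned hashLog, unsigned windowLog,
                                    size_t dictSize, toy_mode mode)
{
    unsigned long long const dict =
        (mode == TOY_IGNORE_DICT_SIZE) ? 0ULL : (unsigned long long)dictSize;
    unsigned const dictAndWindowLog = toy_ceil_log2((1ULL << windowLog) + dict);
    unsigned const cap = dictAndWindowLog + 1;  /* assumed slack of one bit */
    return hashLog > cap ? cap : hashLog;
}
```

With a 1 KB window (`windowLog` 10) and a 1 KB dictionary, a requested `hashLog` of 17 is capped at 12 when the dictionary counts and at 11 when it is ignored.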
fadaab8c7c [minor improvement] Pass 0 as the content size in the DDS
The DDS structure can't be copied into the working tables like the DMS.
So it doesn't need to account for the source size when sizing its
parameters, just the dictionary size.
2020-10-12 12:47:21 -07:00
48ef15fb47 [minor improvement] Pass dictSize when selecting parameters
When selecting parameters in streaming compression with a dictionary use
the dictionary size to select the parameters.
2020-10-12 12:47:19 -07:00
012818df99 [refactor] Remove ZSTD_resetCStream_internal()
This function is only called in one place. It isn't a logical separation
of duties, and it was only obfuscating the code, so inline it.
2020-10-12 12:46:10 -07:00
7083f79008 [bug] Fix dictContentType when reprocessing cdict
Conditions to trigger:
* CDict is loaded as raw content.
* CDict starts with the zstd dictionary magic number.
* The CDict is reprocessed (not attached or copied).
* The new API is used (streaming or `ZSTD_compress2()`).

Bug: The dictionary is loaded as a zstd dictionary, not a raw content
dictionary, because the dict content type is set to `ZSTD_dct_auto`.

Fix: Pass in the dictionary content type from cdict creation to the call
to `ZSTD_compress_insertDictionary()`.

Test: Added a test case that exposes the bug, and fixed the raw
content tests so they no longer modify the `dictBuffer`. Modifying it
turned every later test that uses the `dictBuffer` into a raw content
test, which doesn't seem intentional.
2020-10-12 12:46:10 -07:00
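
The bug described above comes down to how the content type decides the interpretation of the first bytes. The following is a minimal sketch, not the library's loading code: the `toy_` names are hypothetical, the enum mirrors only the `auto` and raw content cases, and the sketch assumes auto-detection simply checks for the dictionary magic number `0xEC30A437` stored little-endian at the start of the buffer.

```c
#include <stddef.h>
#include <stdint.h>

#define TOY_MAGIC_DICTIONARY 0xEC30A437u  /* zstd dictionary magic number */

/* Hypothetical mirror of the two content types relevant to the bug. */
typedef enum { TOY_DCT_AUTO, TOY_DCT_RAW_CONTENT } toy_dict_content_type;
typedef enum { TOY_LOADED_AS_RAW, TOY_LOADED_AS_ZSTD_DICT } toy_load_result;

/* Reads a 32-bit little-endian value regardless of host endianness. */
static uint32_t toy_read_le32(const unsigned char* p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Decides how a dictionary buffer is interpreted. With an explicit raw
 * content type the magic number is never inspected; with auto it is. */
static toy_load_result toy_load_dictionary(const void* dict, size_t size,
                                           toy_dict_content_type type)
{
    if (type == TOY_DCT_RAW_CONTENT) return TOY_LOADED_AS_RAW;
    if (size >= 4
        && toy_read_le32((const unsigned char*)dict) == TOY_MAGIC_DICTIONARY)
        return TOY_LOADED_AS_ZSTD_DICT;
    return TOY_LOADED_AS_RAW;
}
```

In this model, reprocessing a raw content CDict with `TOY_DCT_AUTO` reproduces the bug when the buffer happens to start with the magic number, and passing the type recorded at CDict creation (`TOY_DCT_RAW_CONTENT`) avoids it.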
d6911b86be Require LDM matches to be strictly greater in length 2020-10-09 12:56:18 -04:00
12541931fa Merge pull request #2328 from marxin/zstd-pool-api
Allow external creation of POOLs that can be shared.
2020-10-09 01:00:50 -07:00
6fdb0cb8d9 Merge pull request #2303 from senhuang42/let_cdict_take_clevel_priority
For ZSTD_compressStream2(), let cdict take compression level priority
2020-10-09 00:48:30 -07:00
b9c8033cde Define kNullRawSeqStore for every file 2020-10-07 19:02:41 -04:00
a6165c1b28 Change matchState_t::ldmSeqStore to pointer 2020-10-07 14:13:57 -04:00
abce708a56 Move posInSequence correction to correct location 2020-10-07 13:56:25 -04:00
0c515590d8 Replace offCode of largest match if ldm's offCode is superior 2020-10-07 13:56:25 -04:00