1
0
mirror of https://github.com/facebook/zstd.git synced 2025-07-05 23:27:28 +02:00
Commit Graph

206 Commits

Author SHA1 Message Date
a146ee04ae added negative compression levels
negative compression level trade compression ratio for more compression speed.
They turn off huffman compression of literals,
and use row 0 as baseline with a stepSize = -cLevel.

added associated test in fuzzer

also added : new advanced parameter ZSTD_p_literalCompression
2018-03-11 05:21:53 -07:00
9b184359e2 pretify last unit test output 2018-02-13 10:09:01 -08:00
4b525af53a zstdmt: applies new parameters on the fly
when invoked from ZSTD_compress_generic()
2018-02-02 15:58:13 -08:00
209df52ba2 Changed nbThreads for nbWorkers
This makes it easier to explain that nbWorkers=0 --> single-threaded mode,
while nbWorkers=1 --> asynchronous mode (one mode thread on top of the "main" caller thread).
No need for an additional asynchronous mode flag.
nbWorkers>=2 works the same as nbThreads>=2 previously.
2018-02-01 19:29:30 -08:00
9c8c69e41a [fuzzer] Check ZSTD_initStaticCDict() for every level 2018-01-31 11:12:54 -08:00
f3b8f90b6d changed initStatic?Dict() return type to const ZSTD_?Dict*
ZSTD_create?Dict() is required to produce a ?Dict* return type
because `free()` does not accept a `const type*` argument.
If it wasn't for this restriction, I would have preferred to create a `const ?Dict*` object
to emphasize the fact that, once created, a dictionary never changes
(hence can be shared concurrently until the end of its lifetime).

There is no such limitation with initStatic?Dict() :
as stated in the doc, there is no corresponding free() function,
since `workspace` is provided, hence allocated, externally,
it can only be free() externally.

Which means, ZSTD_initStatic?Dict() can return a `const ZSTD_?Dict*` pointer.

Tested with `make all`, to catch initStatic's users,
which, incidentally, also updated zstd.h documentation.
2018-01-17 14:08:48 -08:00
863b2f8db4 Merge pull request #983 from terrelln/dict-wlog
Increase windowLog from CDict based on the srcSize when known
2018-01-12 07:47:43 -08:00
4b7c4e5f41 Add test for cdict window log adjustment 2018-01-11 16:45:16 -08:00
cacf47cbee Merge branch 'dev' into dubtlazy
and fixed conflicts
2018-01-11 13:25:08 -08:00
04c00f9388 Merge pull request #982 from facebook/fix304
Fix for #304 and #977 : error during dictionary creation
2018-01-11 13:20:59 -08:00
1d623e60a1 Merge pull request #981 from facebook/fix976
fixed bug #976, reported by @indygreg
2018-01-11 11:40:07 -08:00
e8093dde09 fixed #304
Pathological samples may result in literal section being incompressible.
This case is now detected,
and literal distribution is replaced by one that can be written into the dictionary.
2018-01-11 11:16:32 -08:00
218e9fe0fc added a test case for dictBuilder failure
cyclic data set makes the entropy stage fails
now, onto a fix for #304 ...
2018-01-11 09:42:38 -08:00
2103a62b3d fixed minor warning on prototype definition 2018-01-11 04:49:19 -08:00
ff795580f2 fixed bug #976, reported by @indygreg
constants in zstd.h should not depend on MIN() macro which existence is not guaranteed.

Added a test to check the specific constants.
The test is a bit too specific.
But I have found no way to control a more generic "are all macro already defined" condition,
especially as this is a valid construction (the missing macro might be defined later, intentionnally).
2018-01-10 20:33:45 -08:00
0e88f6e97b Fix break condition in decompression noise test
The bug prevents noise being added
2018-01-11 11:42:58 +10:00
02f64ef955 btlazy2: fixed interaction between unsortedMark and reduceTable 2017-12-29 19:08:51 +01:00
64482c2c97 fixed bug in dubt
the chain of unsorted candidates could grow beyond lowLimit.
2017-12-29 17:04:37 +01:00
574e75354b fuzzer: ensure existence of CHECK_Z macro beyond OS-X systems 2017-12-19 11:24:14 +01:00
d88c671663 added test case for "wrong blockSize in continue mode" 2017-12-19 10:16:09 +01:00
22727a7467 Fix cdict compressor repcodes 2017-12-13 11:31:20 -08:00
dab8cfa3c7 Combine definitions of SEC_TO_MICRO 2017-11-30 19:40:53 -08:00
9a2f6f477b Use util.h for timing 2017-11-30 14:57:25 -08:00
e19b0822bc Test large skippable frames 2017-11-01 13:10:03 -07:00
86b8134cad [libzstd] Fix parameter selection for empty input
ZSTD_compress() and friends would treat an empty input as an unknown size
when selecting parameters. Thus, they would drastically overallocate the
context. Tell ZSTD_getParams() that the source size is 1 when it is empty.
2017-10-25 17:24:15 -07:00
e963800e27 zstdmt : fixed : buffer dst0 wasn't properly set to null after usage
now it's possible to unconditionnally invoke ZSTD_releaseAllJobRessources()
wether previous compression was completed correctly or not.
2017-09-28 23:01:31 -07:00
df4e9bba25 fixed constant errors for gcc in c99 mode
C standard does not consider a `static const int` as a constant.
This is a problem for initializer, and ZSTD_STATIC_ASSERT().
Replaced by macro values
2017-09-26 14:31:06 -07:00
52a1d1c6dc added ZSTD_DCtx_reset() 2017-09-25 16:56:48 -07:00
62568c9a42 added capability to generate magic-less frames
decoder not implemented yet
2017-09-25 14:26:26 -07:00
cd3115b284 added control from frame content size at end of decompression
adding check at end of single-pass ZSTD_decompressFrame().
Check within ZSTD_decompressContinue() was already added in a previous patch : b3f33ccfb3
2017-09-21 16:21:10 -07:00
058ed2ad33 ZSTD_decodingBufferSize_min()
supporting function for bufferless streaming API (ZSTD_decompressContinue())
makes it possible to correctly size a round buffer for decoding using this API.

also : added field blockSizeMax within ZSTD_frameHeader,
as it's a necessary information to know when to restart at beginning of decoding buffer.
2017-09-09 01:03:29 -07:00
3128e03be6 updated license header
to clarify dual-license meaning as "or"
2017-09-08 00:09:23 -07:00
d7ad99b2ab Merge branch 'longRangeMatcher' into dev 2017-08-31 18:08:37 -07:00
ee65701720 Minor fixes; remove formatting only changes 2017-08-29 20:27:35 -07:00
a6e20e1bd7 Add test for raw content starting with dict header 2017-08-29 18:36:18 -07:00
c88fb9267f Replace 'byReference' with enum 2017-08-29 11:55:02 -07:00
18224608ff Remove ZSTD_setCCtxParameter() 2017-08-25 13:58:41 -07:00
32fb407c9d updated a bunch of headers
for the new license
2017-08-18 16:52:05 -07:00
38ba7002f2 fixed minor warning on unused variable in shell function 2017-07-20 18:39:04 -07:00
5e6c5203f3 fixed fuzzer test for non OS-X platforms 2017-07-20 15:11:56 -07:00
1ca1288689 added --memtest=# command to fuzzer
to jump directly to relevant test section
2017-07-19 16:01:16 -07:00
cc1522351f [libzstd] Fix bug in Huffman encoding
Summary:
Huffman encoding with a bad dictionary can encode worse than the
HUF_BLOCKBOUND(srcSize), since we don't filter out incompressible
input, and even if we did, the dictionaries Huffman table could be
ill suited to compressing actual data.

The fast optimization doesn't seem to improve compression speed,
even when I hard coded fast = 1, the speed didn't improve over hard coding
it to 0.

Benchmarks:
$ ./zstd.dev -b1e5
Benchmarking levels from 1 to 5
 1#Synthetic 50%     :  10000000 ->   3139163 (3.186), 524.8 MB/s ,1890.0 MB/s
 2#Synthetic 50%     :  10000000 ->   3115138 (3.210), 372.6 MB/s ,1830.2 MB/s
 3#Synthetic 50%     :  10000000 ->   3222672 (3.103), 223.3 MB/s ,1400.2 MB/s
 4#Synthetic 50%     :  10000000 ->   3276678 (3.052), 198.0 MB/s ,1280.1 MB/s
 5#Synthetic 50%     :  10000000 ->   3271570 (3.057), 107.8 MB/s ,1200.0 MB/s
$ ./zstd -b1e5
Benchmarking levels from 1 to 5
 1#Synthetic 50%     :  10000000 ->   3139163 (3.186), 524.8 MB/s ,1870.2 MB/s
 2#Synthetic 50%     :  10000000 ->   3115138 (3.210), 370.0 MB/s ,1810.3 MB/s
 3#Synthetic 50%     :  10000000 ->   3222672 (3.103), 223.3 MB/s ,1380.1 MB/s
 4#Synthetic 50%     :  10000000 ->   3276678 (3.052), 196.1 MB/s ,1270.0 MB/s
 5#Synthetic 50%     :  10000000 ->   3271570 (3.057), 106.8 MB/s ,1180.1 MB/s
$ ./zstd.dev -b1e5 ../silesia.tar
Benchmarking levels from 1 to 5
 1#silesia.tar       : 211988480 ->  73651685 (2.878), 429.7 MB/s ,1096.5 MB/s
 2#silesia.tar       : 211988480 ->  70158785 (3.022), 321.2 MB/s ,1029.1 MB/s
 3#silesia.tar       : 211988480 ->  66993813 (3.164), 243.7 MB/s , 981.4 MB/s
 4#silesia.tar       : 211988480 ->  66306481 (3.197), 226.7 MB/s , 972.4 MB/s
 5#silesia.tar       : 211988480 ->  64757852 (3.274), 150.3 MB/s , 963.6 MB/s
$ ./zstd -b1e5 ../silesia.tar
Benchmarking levels from 1 to 5
 1#silesia.tar       : 211988480 ->  73651685 (2.878), 429.7 MB/s ,1087.1 MB/s
 2#silesia.tar       : 211988480 ->  70158785 (3.022), 318.8 MB/s ,1029.1 MB/s
 3#silesia.tar       : 211988480 ->  66993813 (3.164), 246.5 MB/s , 981.4 MB/s
 4#silesia.tar       : 211988480 ->  66306481 (3.197), 229.2 MB/s , 972.4 MB/s
 5#silesia.tar       : 211988480 ->  64757852 (3.274), 149.3 MB/s , 963.6 MB/s

Test Plan:
I added a test case to the fuzzer which crashed with ASAN before the patch
and succeeded after.
2017-07-18 13:20:40 -07:00
052a95f77c fix : ZSTDMT_compress_advanced() correctly generates checksum
when params.fParams.checksumFlag==1.
This use case used to be impossible when only ZSTD_compress() was available
2017-07-11 17:18:26 -07:00
ef0ff7fe7f zstdmt: removed margin for improved memory usage 2017-07-11 08:54:29 -07:00
4616fad18b improved ZSTDMT_compress() memory usage
does not need the input buffer for streaming operations

also : reduced a few tests time length
2017-07-10 17:16:41 -07:00
670b1fc547 optimized memory usage for ZSTDMT_compress()
Previously, each job would reserve a CCtx right before being posted.
The CCtx would be "part of the job description",
and only released when the job is completed (aka flushed).
For ZSTDMT_compress(), which creates all jobs first and only join at the end,
that meant one CCtx per job.
The nb of jobs used to be == nb of threads,
but since latest modification,
which reduces the size of jobs in order to spread the load of difficult areas,
it also increases the nb of jobs for large sources / small compression level.
This resulted in many more CCtx being created.

In this new version, CCtx are reserved within the worker thread.
It guaranteea there cannot be more CCtx reserved than workers (<= nb threads).

To do that, it required to make the CCtx Pool multi-threading-safe :
it can now be called from multiple threads in parallel.
2017-07-10 16:30:55 -07:00
3510efb02d fix : custom allocator correctly propagated to child contexts 2017-07-10 14:21:40 -07:00
ee3423d709 extended fuzzer MT memory tests 2017-07-10 14:09:16 -07:00
f9524cf366 added --memtest to fuzzer 2017-07-10 13:48:41 -07:00
e32fb0c1fe added ZSTD_sizeof_CCtx() test 2017-07-10 12:29:57 -07:00