krak/zstd

mirror of https://github.com/facebook/zstd.git synced 2025-07-05 15:09:03 +02:00

Author	SHA1	Message	Date
Yann Collet	ededcfca57	fix confusion between unsigned <-> U32 as suggested in #1441. generally U32 and unsigned are the same thing, except when they are not ... case : 32-bit compilation for MIPS (uint32_t == unsigned long) A vast majority of transformation consists in transforming U32 into unsigned. In rare cases, it's the other way around (typically for internal code, such as seeds). Among a few issues this patch solves : - some parameters were declared with type `unsigned` in .h, but with type `U32` in their implementation .c . - some parameters have type unsigned*, but the caller uses a pointer to U32 instead. These fixes are useful. However, the bulk of changes is about %u formatting, which requires unsigned type, but generally receives U32 values instead, often just for brevity (U32 is shorter than unsigned). These changes are generally minor, or even annoying. As a consequence, the amount of code changed is larger than I would expect for such a patch. Testing is also a pain : it requires manually modifying `mem.h`, in order to lie about `U32` and force it to be an `unsigned long` typically. On a 64-bit system, this will break the equivalence unsigned == U32. Unfortunately, it will also break a few static_assert(), controlling structure sizes. So it also requires modifying `debug.h` to make `static_assert()` a noop. And then reverting these changes. So it's inconvenient, and as a consequence, this property is currently not checked during CI tests. Therefore, these problems can emerge again in the future. I wonder if it is worth ensuring proper distinction of U32 != unsigned in CI tests. It's another restriction for coding, adding more frustration during merge tests, since most platforms don't need this distinction (hence contributor will not see it), and while this can matter in theory, the number of platforms impacted seems minimal. Thoughts ?	2018-12-21 18:09:41 -08:00
W. Felix Handte	c560e34c86	Add HUF_FORCE_DECOMPRESS_X2	2018-12-18 13:36:39 -08:00
W. Felix Handte	432314b58a	Rename HUF_DECOMPRESS_MINIMAL -> HUF_FORCE_DECOMPRESS_X1	2018-12-18 13:36:39 -08:00
W. Felix Handte	f45c9df42e	Totally Hide/Disable X2 Variants when HUF_DECOMPRESS_MINIMAL is Defined	2018-12-18 13:36:39 -08:00
Yann Collet	1adf84ccb7	renamed all HUF_decompressX4() functions into X2 to underline they generate up to 2 symbols per decoding, in preparation for a future X3 variant.	2018-06-14 15:17:03 -04:00
Yann Collet	a09af5eb6b	renamed all HUF_decompressX2() functions into X1 to underline they generate one symbol per decoding operation. The new naming scheme will make it easier to introduce an X3 variant.	2018-06-14 15:08:43 -04:00
Yann Collet	338f738c24	pass entropy tables to optimal parser for proper estimation of symbol's weights when using dictionary compression. Note : using only huffman costs is not good enough, presumably because sequence symbol costs are incorrect.	2018-05-08 15:37:06 -07:00
Yann Collet	a95a88af57	removed huf_compress_impl.h re-imported all functions inside huf_compress.c for easier source editing. Also updated a bunch of code comments for clarification.	2018-03-13 14:14:05 -07:00
Yann Collet	6cdf690441	minor cleaning of huff0 Update code documentation, and properly names a few "magic constants". Also, HUF_compress_internal() gets a cleaner way to determine size of tables inside workspace.	2018-02-26 14:52:23 -08:00
Nick Terrell	b58f01537e	[compress] Support BMI2	2018-02-14 19:20:32 -08:00
Nick Terrell	4319132312	[decompress] Support BMI2	2018-02-13 17:00:15 -08:00
Yann Collet	e8093dde09	fixed #304 Pathological samples may result in literal section being incompressible. This case is now detected, and literal distribution is replaced by one that can be written into the dictionary.	2018-01-11 11:16:32 -08:00
Nick Terrell	a86a7097ec	Ensure dictionary Huff table can encode any symbol * Ensure that the dictionary Huffman CTable has maxSymbolValue 255. * Fix a stack buffer overflow during compression dictionary loading.	2017-10-03 13:22:13 -07:00
Nick Terrell	de0414b736	[libzstd] Pull CTables into sub-structure	2017-07-12 19:49:19 -07:00
Stella Lau	32df49e9f8	Fix typo	2017-06-30 12:56:24 -07:00
Stella Lau	b0513b519c	Add comment to HUF_DECOMPRESS_WORKSPACE_SIZE	2017-06-30 12:53:56 -07:00
Stella Lau	99e315999c	Reduce stack usage of HUF_readDTableX4 and HUF_readDTableX2	2017-06-29 11:49:59 -07:00
Yann Collet	68a7d3d49a	added HUF_PUBLIC_API macro to huf.h to make it possible to control symbol visibility. Also : better separation and comments between "public" and "static" sections	2017-04-28 12:46:48 -07:00
Yann Collet	c47c68f6ca	proper evaluation of Huffman CTable size	2017-04-17 16:14:21 -07:00
Yann Collet	1f2c95c5f3	minor code refactor in HUF module	2017-03-05 21:07:20 -08:00
Nick Terrell	54c4babd8f	Always check Huffman tables for ZSTD_lazy+ The compressor always reuses the existing Huffman table if the literals size is at most 1 KiB. If the compression strategy is `ZSTD_lazy` or stronger always check to see if reusing the previous table or creating a new table is better. This doesn't yet weigh in decompression speed. I don't want to add any heuristics there until I have real data to work with to ensure that the heuristic works for at least one use case, preferably more.	2017-03-03 16:49:38 -08:00
Nick Terrell	d051cd5b43	Use workspace for count and CTable	2017-03-02 16:38:07 -08:00
Nick Terrell	a419777eb1	Allow compressor to repeat Huffman tables * Compressor saves most recently used Huffman table and reuses it if it produces better results. * I attempted to preserve CPU usage profile. I intentionally left all of the existing heuristics in place. There is only a speed difference on the second block and later. When compressing large enough blocks (say >= 4 KiB) there is no significant difference in compression speed. Dictionary compression of one block is the same speed for blocks with literals <= 1 KiB, and after that the difference is not very significant. * In the synthetic data, with blocks 10 KB or smaller, most blocks can't use repeated tables because the previous block did not contain a symbol that the current block contains. Once blocks are about 12 KB or more, most previous blocks have valid Huffman tables for the current block, and the compression ratio and decompression speed jumped. * In silesia blocks as small as 4KB can frequently reuse the previous Huffman table (85%), but it isn't as profitable, and the previous Huffman table only gets used about 3% of the time. * Microbenchmarks show that `HUF_validateCTable()` takes ~55 ns and `HUF_estimateCompressedSize()` takes ~35 ns. They are decently well optimized, the first versions took 90 ns and 120 ns respectively. `HUF_validateCTable()` could be twice as fast, if we cast the `HUF_CElt` to a `U32` and compare to 0. However, `U32` has an alignment of 4 instead of 2, so I think that might be undefined behavior. * I've run `zstreamtest` compiled normally, with UASAN and with MSAN for 4 hours each. The worst case for the speed difference is a bunch of small blocks in the same frame. I modified `bench.c` to compress the input in a single frame but with blocks of the given block size, set by `-B`. 
Benchmarks on level 1:

| Program   | Block size | Corpus    | Ratio | Compression MB/s | Decompression MB/s |
|-----------|------------|-----------|-------|------------------|--------------------|
| zstd.base | 256        | synthetic | 2.364 | 110.0            | 297.0              |
| zstd      | 256        | synthetic | 2.367 | 108.9            | 297.0              |
| zstd.base | 256        | silesia   | 2.204 | 93.8             | 415.7              |
| zstd      | 256        | silesia   | 2.204 | 93.4             | 415.7              |
| zstd.base | 512        | synthetic | 2.594 | 144.2            | 420.0              |
| zstd      | 512        | synthetic | 2.599 | 141.5            | 425.7              |
| zstd.base | 512        | silesia   | 2.358 | 118.4            | 432.6              |
| zstd      | 512        | silesia   | 2.358 | 119.8            | 432.6              |
| zstd.base | 1024       | synthetic | 2.790 | 192.3            | 594.1              |
| zstd      | 1024       | synthetic | 2.794 | 192.3            | 600.0              |
| zstd.base | 1024       | silesia   | 2.524 | 148.2            | 464.2              |
| zstd      | 1024       | silesia   | 2.525 | 148.2            | 467.6              |
| zstd.base | 4096       | synthetic | 3.023 | 300.0            | 1000.0             |
| zstd      | 4096       | synthetic | 3.024 | 300.0            | 1010.1             |
| zstd.base | 4096       | silesia   | 2.779 | 223.1            | 623.5              |
| zstd      | 4096       | silesia   | 2.779 | 223.1            | 636.0              |
| zstd.base | 16384      | synthetic | 3.131 | 350.0            | 1150.1             |
| zstd      | 16384      | synthetic | 3.152 | 350.0            | 1630.3             |
| zstd.base | 16384      | silesia   | 2.871 | 296.5            | 883.3              |
| zstd      | 16384      | silesia   | 2.872 | 294.4            | 898.3              |
	2017-03-02 13:27:52 -08:00
Przemyslaw Skibinski	821bf1febc	fixed Doxygen trailing comment	2016-12-02 16:13:41 +01:00
Yann Collet	b89af20353	reduced table sizes for HUF_readDTableX4	2016-12-01 18:24:59 -08:00
Yann Collet	a0d742b1e4	introduced HUF_buildCTable_wksp(), to reduce stack memory usage	2016-12-01 17:47:30 -08:00
Yann Collet	e928f7e16d	introduced ext_wksp variants of count to reduce stack memory usage	2016-12-01 16:13:35 -08:00
Yann Collet	e557fd5e92	minor compression level corrections	2016-07-17 16:21:37 +02:00
Yann Collet	3755eb8fea	fixed strict-aliasing warning on gcc6	2016-06-22 13:15:53 +02:00
Yann Collet	cd98f93cff	Fixed decompression issue with invalid data	2016-06-11 23:26:22 +02:00
Yann Collet	237ad4beb3	Added single-stream decompression variant using external DTable	2016-06-11 01:46:03 +02:00
Yann Collet	289bbd52e5	Updated huff0	2016-06-11 01:31:54 +02:00
Yann Collet	662a541431	updated huff0 - now generates a common HUF_DTable type for all decoding tables	2016-06-08 11:11:02 +02:00
Yann Collet	89703d20fb	reduced dependencies	2016-06-05 01:50:33 +02:00
Yann Collet	a91ca620cf	removed `HUF_readStats()` from public space	2016-06-05 01:33:55 +02:00
Yann Collet	130fe11394	merged `huf_static.h` into `huf.h` . Requires `HUF_STATIC_LINKING_ONLY` macro.	2016-06-05 00:42:28 +02:00
Yann Collet	f22a0d653d	huff0 dynamic reduction	2016-05-20 14:36:36 +02:00
inikep	23a0889301	separation of lib/ into common/, compress/, decompress/, dictBuilder/, legacy/	2016-04-22 12:43:18 +02:00

38 Commits
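
The `U32` versus `unsigned` mismatch discussed in commit ededcfca57 can be illustrated with a small C sketch. It is not taken from the zstd sources: the `U32` typedef below only mimics the style of `mem.h`, and the helper names `format_u32` and `format_u32_portable` are hypothetical. On a target where `uint32_t` is `unsigned long`, passing a `U32` directly to `%u` mismatches the format, so the value is either converted to `unsigned` first or printed with the `PRIu32` macro from `<inttypes.h>`.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Assumption: stands in for the U32 typedef in mem.h. */
typedef uint32_t U32;

/* Hypothetical helper: formats a U32 with %u.
 * The explicit conversion makes the argument type match %u even when
 * uint32_t is defined as unsigned long.
 * Assumes unsigned is at least 32 bits wide on the target. */
int format_u32(char* dst, size_t dstSize, U32 value)
{
    return snprintf(dst, dstSize, "%u", (unsigned)value);
}

/* Hypothetical alternative: PRIu32 expands to the conversion specifier
 * that matches uint32_t on the current platform, so no cast is needed. */
int format_u32_portable(char* dst, size_t dstSize, U32 value)
{
    return snprintf(dst, dstSize, "%" PRIu32, value);
}
```

Both helpers return the number of characters written, as `snprintf` does. The cast keeps format strings short, which matches the direction the commit message describes (most changes turn `U32` into `unsigned`), while `PRIu32` avoids the conversion at the cost of a noisier format string.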