mirror of https://github.com/facebook/zstd.git synced 2025-07-05 15:09:03 +02:00
Commit Graph

38 Commits

Author SHA1 Message Date
ededcfca57 fix confusion between unsigned <-> U32
as suggested in #1441.

generally U32 and unsigned are the same thing,
except when they are not ...

case : 32-bit compilation for MIPS (uint32_t == unsigned long)

The vast majority of the changes consist of converting U32 into unsigned.
In rare cases, it's the other way around (typically for internal code, such as seeds).

Among the few issues this patch solves :
- some parameters were declared with type `unsigned` in *.h,
  but with type `U32` in their implementation *.c .
- some parameters have type unsigned*,
  but the caller uses a pointer to U32 instead.

These fixes are useful.

However, the bulk of the changes is about %u formatting,
which requires unsigned type,
but generally receives U32 values instead,
often just for brevity (U32 is shorter than unsigned).
These changes are generally minor, or even annoying.

As a consequence, the amount of code changed is larger than I would expect for such a patch.

Testing is also a pain :
it requires manually modifying `mem.h`,
in order to lie about `U32`
and force it to be an `unsigned long` typically.
On a 64-bit system, this will break the equivalence unsigned == U32.
Unfortunately, it will also break a few static_assert(), controlling structure sizes.
So it also requires modifying `debug.h` to make `static_assert()` a noop.
And then reverting these changes.

So it's inconvenient, and as a consequence,
this property is currently not checked during CI tests.
Therefore, these problems can emerge again in the future.

I wonder if it is worth ensuring proper distinction of U32 != unsigned in CI tests.
It's another restriction for coding, adding more frustration during merge tests,
since most platforms don't need this distinction (hence contributors will not see it),
and while this can matter in theory, the number of platforms impacted seems minimal.

Thoughts ?
2018-12-21 18:09:41 -08:00
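
The two categories of fix described in this commit message can be illustrated with a minimal sketch. It is not code from the patch: `format_u32`, `get_symbol_limit` and `read_symbol_limit` are hypothetical names, and the `U32` typedef below is only a stand-in for the one in `mem.h`. The assumption is that `uint32_t` may be `unsigned long` on some 32-bit targets, so a `U32` value is cast before being passed to `%u`, and a `U32*` is never handed to a function expecting `unsigned*`.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

typedef uint32_t U32;   /* stand-in for the typedef in mem.h (assumption) */

/* %u expects an `unsigned` argument. The explicit cast keeps the call
 * well-defined even where uint32_t is `unsigned long`. */
static int format_u32(char* dst, size_t dstCapacity, U32 value)
{
    return snprintf(dst, dstCapacity, "%u", (unsigned)value);
}

/* Hypothetical API declared with an `unsigned*` out-parameter. */
static void get_symbol_limit(unsigned* maxSymbolValuePtr)
{
    *maxSymbolValuePtr = 255;
}

/* Caller that stores the result as U32: it goes through a temporary of the
 * parameter's exact type instead of passing a U32* directly. */
static U32 read_symbol_limit(void)
{
    unsigned tmp = 0;
    get_symbol_limit(&tmp);
    return (U32)tmp;
}
```

On platforms where `U32` and `unsigned` are the same type, the cast and the temporary cost nothing; they only matter on the rare targets mentioned above.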
c560e34c86 Add HUF_FORCE_DECOMPRESS_X2 2018-12-18 13:36:39 -08:00
432314b58a Rename HUF_DECOMPRESS_MINIMAL -> HUF_FORCE_DECOMPRESS_X1 2018-12-18 13:36:39 -08:00
f45c9df42e Totally Hide/Disable X2 Variants when HUF_DECOMPRESS_MINIMAL is Defined 2018-12-18 13:36:39 -08:00
1adf84ccb7 renamed all HUF_decompress*X4*() functions into *X2
to underline they generate up to 2 symbols per decoding,
in preparation for a future *X3 variant.
2018-06-14 15:17:03 -04:00
a09af5eb6b renamed all HUF_decompress*X2*() functions into *X1
to underline they generate one symbol per decoding operation.

The new naming scheme will make it easier to introduce an *X3 variant.
2018-06-14 15:08:43 -04:00
338f738c24 pass entropy tables to optimal parser
for proper estimation of symbol's weights
when using dictionary compression.

Note : using only huffman costs is not good enough,
presumably because sequence symbol costs are incorrect.
2018-05-08 15:37:06 -07:00
a95a88af57 removed huf_compress_impl.h
re-imported all functions inside huf_compress.c
for easier source editing.

Also updated a bunch of code comments
for clarification.
2018-03-13 14:14:05 -07:00
6cdf690441 minor cleaning of huff0
Updates code documentation, and properly names a few "magic constants".
Also, HUF_compress_internal() gets a cleaner way
to determine size of tables inside workspace.
2018-02-26 14:52:23 -08:00
b58f01537e [compress] Support BMI2 2018-02-14 19:20:32 -08:00
4319132312 [decompress] Support BMI2 2018-02-13 17:00:15 -08:00
e8093dde09 fixed #304
Pathological samples may result in literal section being incompressible.
This case is now detected,
and literal distribution is replaced by one that can be written into the dictionary.
2018-01-11 11:16:32 -08:00
a86a7097ec Ensure dictionary Huff table can encode any symbol
* Ensure that the dictionary Huffman CTable has maxSymbolValue 255.
* Fix a stack buffer overflow during compression dictionary loading.
2017-10-03 13:22:13 -07:00
de0414b736 [libzstd] Pull CTables into sub-structure 2017-07-12 19:49:19 -07:00
32df49e9f8 Fix typo 2017-06-30 12:56:24 -07:00
b0513b519c Add comment to HUF_DECOMPRESS_WORKSPACE_SIZE 2017-06-30 12:53:56 -07:00
99e315999c Reduce stack usage of HUF_readDTableX4 and HUF_readDTableX2 2017-06-29 11:49:59 -07:00
68a7d3d49a added HUF_PUBLIC_API macro to huf.h
to make it possible to control symbol visibility.
Also : better separation and comments between "public" and "static" sections
2017-04-28 12:46:48 -07:00
c47c68f6ca proper evaluation of Huffman CTable size 2017-04-17 16:14:21 -07:00
1f2c95c5f3 minor code refactor in HUF module 2017-03-05 21:07:20 -08:00
54c4babd8f Always check Huffman tables for ZSTD_lazy+
The compressor always reuses the existing Huffman table if the literals
size is at most 1 KiB. If the compression strategy is `ZSTD_lazy` or
stronger always check to see if reusing the previous table or creating
a new table is better.

This doesn't yet weigh in decompression speed. I don't want to add any
heuristics there until I have real data to work with to ensure that the
heuristic works for at least one use case, preferably more.
2017-03-03 16:49:38 -08:00
d051cd5b43 Use workspace for count and CTable 2017-03-02 16:38:07 -08:00
a419777eb1 Allow compressor to repeat Huffman tables
* Compressor saves most recently used Huffman table and reuses it
  if it produces better results.
* I attempted to preserve CPU usage profile.
  I intentionally left all of the existing heuristics in place.
  There is only a speed difference on the second block and later.
  When compressing large enough blocks (say >= 4 KiB) there is
  no significant difference in compression speed.
  Dictionary compression of one block is the same speed for blocks
  with literals <= 1 KiB, and after that the difference is not
  very significant.
* In the synthetic data, with blocks 10 KB or smaller, most blocks
  can't use repeated tables because the previous block did not
  contain a symbol that the current block contains.
  Once blocks are about 12 KB or more, most previous blocks have
  valid Huffman tables for the current block, and the compression
  ratio and decompression speed jumped.
* In silesia blocks as small as 4KB can frequently reuse the
  previous Huffman table (85%), but it isn't as profitable, and
  the previous Huffman table only gets used about 3% of the time.
* Microbenchmarks show that `HUF_validateCTable()` takes ~55 ns
  and `HUF_estimateCompressedSize()` takes ~35 ns.
  They are decently well optimized; the first versions took 90 ns
  and 120 ns respectively. `HUF_validateCTable()` could be twice as
  fast, if we cast the `HUF_CElt*` to a `U32*` and compare to 0.
  However, `U32` has an alignment of 4 instead of 2, so I think that
  might be undefined behavior.
* I've run `zstreamtest` compiled normally, with UASAN and with MSAN
  for 4 hours each.

The worst case for the speed difference is a bunch of small blocks
in the same frame. I modified `bench.c` to compress the input in a
single frame but with blocks of the given block size, set by `-B`.
Benchmarks on level 1:

|  Program  | Block size |   Corpus  | Ratio | Compression MB/s | Decompression MB/s |
|-----------|------------|-----------|-------|------------------|--------------------|
| zstd.base |        256 | synthetic | 2.364 |            110.0 |              297.0 |
|      zstd |        256 | synthetic | 2.367 |            108.9 |              297.0 |
| zstd.base |        256 | silesia   | 2.204 |             93.8 |              415.7 |
|      zstd |        256 | silesia   | 2.204 |             93.4 |              415.7 |
| zstd.base |        512 | synthetic | 2.594 |            144.2 |              420.0 |
|      zstd |        512 | synthetic | 2.599 |            141.5 |              425.7 |
| zstd.base |        512 | silesia   | 2.358 |            118.4 |              432.6 |
|      zstd |        512 | silesia   | 2.358 |            119.8 |              432.6 |
| zstd.base |       1024 | synthetic | 2.790 |            192.3 |              594.1 |
|      zstd |       1024 | synthetic | 2.794 |            192.3 |              600.0 |
| zstd.base |       1024 | silesia   | 2.524 |            148.2 |              464.2 |
|      zstd |       1024 | silesia   | 2.525 |            148.2 |              467.6 |
| zstd.base |       4096 | synthetic | 3.023 |            300.0 |             1000.0 |
|      zstd |       4096 | synthetic | 3.024 |            300.0 |             1010.1 |
| zstd.base |       4096 | silesia   | 2.779 |            223.1 |              623.5 |
|      zstd |       4096 | silesia   | 2.779 |            223.1 |              636.0 |
| zstd.base |      16384 | synthetic | 3.131 |            350.0 |             1150.1 |
|      zstd |      16384 | synthetic | 3.152 |            350.0 |             1630.3 |
| zstd.base |      16384 | silesia   | 2.871 |            296.5 |              883.3 |
|      zstd |      16384 | silesia   | 2.872 |            294.4 |              898.3 |
2017-03-02 13:27:52 -08:00
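
The reuse decision described in this commit message can be sketched roughly as follows. This is a simplified illustration, not the actual huff0 code: the `SketchCTable` layout and every `sketch_*` function are hypothetical, and the real `HUF_validateCTable()` and `HUF_estimateCompressedSize()` may differ in signature and detail. The assumption is that a previous table is usable only if it assigns a code to every symbol present in the current block, and that it is preferred when its estimated output is no larger than a fresh table's output plus the cost of writing the new table header.

```c
#include <assert.h>
#include <stddef.h>

#define SYMBOL_COUNT 256

/* Hypothetical simplified table: code length in bits per symbol,
 * 0 meaning the symbol has no code. */
typedef struct { unsigned char nbBits[SYMBOL_COUNT]; } SketchCTable;

/* Returns 1 if every symbol that occurs in `count` has a code in `ct`. */
static int sketch_validateCTable(const SketchCTable* ct, const unsigned* count)
{
    int bad = 0;
    for (int s = 0; s < SYMBOL_COUNT; s++)
        bad |= (count[s] != 0) & (ct->nbBits[s] == 0);
    return !bad;
}

/* Estimated compressed size in bytes: sum of (code length * occurrences). */
static size_t sketch_estimateCompressedSize(const SketchCTable* ct, const unsigned* count)
{
    size_t totalBits = 0;
    for (int s = 0; s < SYMBOL_COUNT; s++)
        totalBits += (size_t)ct->nbBits[s] * count[s];
    return totalBits >> 3;
}

/* Returns 1 if the previous table should be reused for this block. */
static int sketch_shouldReuse(const SketchCTable* prev, const SketchCTable* fresh,
                              size_t freshHeaderSize, const unsigned* count)
{
    if (!sketch_validateCTable(prev, count)) return 0;   /* missing symbol */
    size_t const reusedSize  = sketch_estimateCompressedSize(prev, count);
    size_t const rebuiltSize = sketch_estimateCompressedSize(fresh, count) + freshHeaderSize;
    return reusedSize <= rebuiltSize;
}
```

The validation step corresponds to the observation above that small synthetic blocks often cannot reuse a table because the previous block lacked one of the current block's symbols.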
821bf1febc fixed Doxygen trailing comment 2016-12-02 16:13:41 +01:00
b89af20353 reduced table sizes for HUF_readDTableX4 2016-12-01 18:24:59 -08:00
a0d742b1e4 introduced HUF_buildCTable_wksp(), to reduce stack memory usage 2016-12-01 17:47:30 -08:00
e928f7e16d introduced ext_wksp variants of count to reduce stack memory usage 2016-12-01 16:13:35 -08:00
e557fd5e92 minor compression level corrections 2016-07-17 16:21:37 +02:00
3755eb8fea fixed strict-aliasing warning on gcc6 2016-06-22 13:15:53 +02:00
cd98f93cff Fixed decompression issue with invalid data 2016-06-11 23:26:22 +02:00
237ad4beb3 Added single-stream decompression variant using external DTable 2016-06-11 01:46:03 +02:00
289bbd52e5 Updated huff0 2016-06-11 01:31:54 +02:00
662a541431 updated huff0 - now generates a common HUF_DTable type for all decoding tables 2016-06-08 11:11:02 +02:00
89703d20fb reduced dependencies 2016-06-05 01:50:33 +02:00
a91ca620cf removed HUF_readStats() from public space 2016-06-05 01:33:55 +02:00
130fe11394 merged huf_static.h into huf.h . Requires HUF_STATIC_LINKING_ONLY macro. 2016-06-05 00:42:28 +02:00
f22a0d653d huff0 dynamic reduction 2016-05-20 14:36:36 +02:00
23a0889301 separation of lib/ into common/, compress/, decompress/, dictBuilder/, legacy/ 2016-04-22 12:43:18 +02:00