Yann Collet
04a2a0219c
update type names
...
naming convention: Type names should start with a Capital letter (after the prefix)
2024-12-29 14:25:33 -08:00
Yann Collet
b7a9e69d8d
added parameter litCapacity
...
to ZSTD_compressSequencesAndLiterals()
to enforce the litCapacity >= litSize+8 condition.
2024-12-20 10:37:01 -08:00
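The `litCapacity >= litSize+8` condition above can be met on the caller side by over-allocating the literals buffer. This is a minimal sketch under stated assumptions: `LIT_SLACK` and `allocLiteralsBuffer()` are hypothetical names used for illustration only, and are not part of the zstd API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define LIT_SLACK 8  /* hypothetical name for the 8 extra bytes in the condition */

/* Hypothetical helper: copies litSize literals into a fresh zeroed buffer
 * that has LIT_SLACK spare bytes, and stores the resulting capacity
 * (the value a caller would pass as litCapacity) into *litCapacity.
 * Returns NULL on allocation failure or size overflow. */
static void* allocLiteralsBuffer(const void* lits, size_t litSize, size_t* litCapacity)
{
    void* buf;
    assert(litCapacity != NULL);
    if (litSize > (size_t)-1 - LIT_SLACK) return NULL;  /* litSize + 8 would overflow */
    *litCapacity = litSize + LIT_SLACK;
    buf = calloc(1, *litCapacity);
    if (buf == NULL) return NULL;
    if (litSize > 0) memcpy(buf, lits, litSize);
    return buf;
}
```

The caller owns the returned buffer and releases it with `free()`.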
Yann Collet
76445bb379
add a check, to return an error if Sequence validation is enabled
...
since ZSTD_compressSequencesAndLiterals() doesn't support it.
2024-12-20 10:37:01 -08:00
Yann Collet
b339efff2b
add dedicated error code for special case
...
ZSTD_compressSequencesAndLiterals() cannot produce an uncompressed block
2024-12-20 10:37:00 -08:00
Yann Collet
0a54f6f288
ZSTD_compressSequencesAndLiterals requires srcSize as parameter
...
this makes it possible to adjust windowSize to its tightest.
2024-12-20 10:37:00 -08:00
Yann Collet
12c47d3262
improved speed of the Sequences converter
2024-12-20 10:37:00 -08:00
Yann Collet
0b013b2688
added unit tests to ZSTD_compressSequencesAndLiterals()
...
seems to work as expected,
correctly checks that `litSize` and `srcSize` are exactly correct.
2024-12-20 10:36:58 -08:00
Yann Collet
14a21e43b3
produced ZSTD_compressSequencesAndLiterals() as a separate pipeline
...
only supports explicit delimiter mode, at least for the time being
2024-12-20 10:36:58 -08:00
Yann Collet
047db4f1f8
ZSTD_SequenceCopier_f now returns the nb of bytes consumed from input
...
which feels much more natural
2024-12-20 10:36:58 -08:00
Yann Collet
4ef9d7d585
codemod: ZSTD_cParamMode_e -> ZSTD_CParamMode_e
2024-12-20 10:36:58 -08:00
Yann Collet
56cfb7816a
codemod: ZSTD_paramSwitch_e -> ZSTD_ParamSwitch_e
2024-12-20 10:36:58 -08:00
Yann Collet
13b9296d79
minor simplification
2024-12-20 10:36:58 -08:00
Yann Collet
c97522f7fb
codemod: ZSTD_sequenceFormat_e -> ZSTD_SequenceFormat_e
...
since it's a type name.
Note: in contrast with previous names, this one is on the Public API side.
So there is a #define, so that existing programs using ZSTD_sequenceFormat_e still work.
2024-12-20 10:36:56 -08:00
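The `#define` approach described above keeps old source code compiling after a public type is renamed. The sketch below illustrates the pattern with hypothetical `DEMO_` names; it is not copied from `zstd.h`, and the enum values are placeholders.

```c
#include <assert.h>

/* Hypothetical public header after the rename: the type name is now capitalized */
typedef enum {
    DEMO_sf_modeA = 0,
    DEMO_sf_modeB = 1
} DEMO_SequenceFormat_e;

/* Compatibility alias: the old lowercase spelling resolves to the new type */
#define DEMO_sequenceFormat_e DEMO_SequenceFormat_e

/* Existing user code, written against the old name, compiles unchanged */
static int isModeB(DEMO_sequenceFormat_e f)
{
    return f == DEMO_sf_modeB;
}

/* Old and new spellings name the same type, so values pass freely between them */
static void demoUsage(void)
{
    DEMO_sequenceFormat_e oldName = DEMO_sf_modeB;
    DEMO_SequenceFormat_e newName = oldName;
    assert(isModeB(newName));
}
```

A macro alias costs nothing at run time, but it does occupy the old identifier in the preprocessor namespace of every program that includes the header.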
Yann Collet
b4a40a845f
move Sequences definition to zstd_compress_internal.h
...
they should not be in common/zstd_internal.h,
since these definitions are not shared beyond lib/compress/.
2024-12-20 10:36:55 -08:00
Dimitri Papadopoulos
fcf88ae39b
Fix new typos found by codespell
2024-11-26 11:15:39 +01:00
Yann Collet
2e02cd330d
inform manual users that it's automatically generated
...
suggested by @Eugeny1
2024-10-31 15:06:48 -07:00
Yann Collet
d9553fd218
elevated ZSTD_getErrorCode() to stable status
...
answering #4183
2024-10-31 14:15:50 -07:00
Yann Collet
3e7c66acd1
added ascending order example
2024-10-09 01:06:24 -07:00
Yann Collet
3b343dcfb1
refactor huffman prefix code paragraph
2024-10-07 17:15:07 -07:00
Yann Collet
a8b86d024a
refactor documentation of the FSE decoding table build process
2024-10-02 23:09:06 -07:00
Yann Collet
d2212c680a
Merge pull request #4013 from elasota/spec-clarify-offset-code-overflow
...
Specify that decoders may reject non-zero probabilities for larger offset codes than implementation supports
2024-09-27 13:42:32 -07:00
Yann Collet
3de0541aef
Merge pull request #4079 from elasota/truncated-huff-state-error
...
Throw error if Huffman weight initial states are truncated
2024-06-30 16:17:03 -07:00
elasota
0938308ff6
Throw error if Huffman weight initial states are truncated
2024-06-20 17:46:16 -04:00
Dimitri Papadopoulos
2d736d9c50
Fix new typos found by codespell
2024-06-20 20:12:16 +02:00
Quentin Boswank
f19c98228f
Fix $filter and Msys/Cygwin
...
- switched the pattern and input of $filter into the right places
- added pattern wildcard to MSYS_NT & CYGWIN_NT as they change with windows versions
- correctly identify MSYS2, even in an env like MINGW64
2024-06-05 18:37:27 +02:00
elasota
c54f4783d0
Specify that decoders may reject non-zero probabilities for larger offset codes than supported by the implementation
2024-04-01 20:13:48 -04:00
elasota
8cff66f2f5
Remove text specifying probability overflow as invalid, the variable-size value encoding scheme makes this impossible.
2024-04-01 20:08:42 -04:00
Yann Collet
c6e5257240
Merge pull request #3977 from facebook/doc_advanced
...
Doc update
2024-03-21 12:33:15 -07:00
Elliot Gorokhovsky
741b87bbe1
Fuzzing and bugfixes for magicless-format decoding ( #3976 )
...
* fuzzing and bugfixes for magicless format
* reset dctx before each decompression
* do not memcmp empty buffers
* nit: decompressor errata
2024-03-20 19:22:34 -04:00
Yann Collet
c5da438dc0
fix typo
2024-03-18 12:33:22 -07:00
Yann Collet
3d18d9a9ce
updated API manual
2024-03-18 12:30:54 -07:00
Yann Collet
686e7e4b4b
updated version to v1.5.6
2024-03-14 15:38:14 -07:00
Yann Collet
eb5f7a7fa2
produced golden sample for the offset==0 decoder test
...
is correctly detected as corrupted by new version,
and is accepted (changed into offset==1) by older version.
updated documentation accordingly, with a hexadecimal representation.
2024-03-09 00:33:44 -08:00
Yann Collet
d2f56ba442
update documentation
2024-03-08 15:55:30 -08:00
Yann Collet
e127139ceb
Merge pull request #3824 from elasota/specify-zero-offset
...
Specify offset 0 as invalid and specify required fixup behavior
2024-03-08 15:25:48 -08:00
Yann Collet
478e5fedf9
Merge pull request #3816 from elasota/fix-state-table
...
Fix state table formatting
2024-03-08 15:02:00 -08:00
Yann Collet
f77f634d41
update API documentation
2024-02-24 01:28:17 -08:00
Yann Collet
7971fd16f7
Merge pull request #3817 from elasota/oversized-probs-clarification
...
Clarify that probability tables must not contain non-zero probabilities for invalid values
2024-01-13 11:37:54 -08:00
elasota
f06b18b3ff
Specify offset 0 as invalid
2023-12-28 16:47:09 -05:00
elasota
05059e5a48
Clarify that there must be at least 2 weights, i.e. encoding all weights as 0 is invalid
2023-11-24 16:49:40 -05:00
elasota
dc84e35138
Clarify that the presence of a value with weight 1 is required
2023-11-24 16:49:40 -05:00
elasota
c5bf96fb74
Clarify that a non-zero probability for an invalid symbol is invalid
2023-11-13 00:03:56 -05:00
elasota
52e41b9ac8
Fix malformed state table
2023-11-09 12:28:21 -05:00
elasota
e61e3ff152
Clarify that decoding too many Huffman weights is a failure condition
2023-11-08 20:06:58 -05:00
elasota
324cce4996
Add definition of "log2sup" function
2023-10-31 11:45:10 -04:00
elasota
b38d87b476
Clarify that the log2 of the largest possible symbol is the maximum number of bits consumed
2023-10-31 01:17:23 -04:00
Dimitri Papadopoulos
fe34776c20
Fix new typos found by codespell
2023-09-23 18:56:01 +02:00
Yann Collet
3732a08f5b
fixed decoder behavior when nbSeqs==0 is encoded using 2 bytes
...
The sequence section starts with a number, which tells how many sequences are present in the section.
If this number is 0, the section automatically ends.
The number 0 can be represented using the 1-byte or the 2-byte format.
That's because the 2-byte format fully overlaps the 1-byte format.
However, when 0 is represented using the 2-byte format,
the decoder was expecting the sequence section to continue,
and was looking for FSE tables, which is incorrect.
Fixed this behavior, in both the reference decoder and the educational decoder.
In practice, this behavior never happens,
because the encoder will always select the 1-byte format to represent 0,
since this is more efficient.
Completed the fix with a new golden sample for tests,
a clarification of the specification,
and a decoder errata paragraph.
2023-06-05 16:03:00 -07:00
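The behavior described in this commit can be sketched as a small header reader. This is an illustration written from the format rules stated in the commit message and the Zstandard format specification, not the reference decoder's code; `readNbSeq()` and `sectionEndsAfterCount()` are hypothetical names. The key point is that the end-of-section decision is taken on the decoded value, not on which encoding was used.

```c
#include <assert.h>
#include <stddef.h>

/* Reads the sequence count at the start of a Sequences section.
 * Returns the number of header bytes consumed (1, 2 or 3),
 * or 0 if src is too short to hold the header. */
static size_t readNbSeq(const unsigned char* src, size_t srcSize, unsigned* nbSeq)
{
    assert(src != NULL && nbSeq != NULL);
    if (srcSize < 1) return 0;
    if (src[0] < 128) {                 /* 1-byte format: values 0..127 */
        *nbSeq = src[0];
        return 1;
    }
    if (src[0] < 255) {                 /* 2-byte format: overlaps the 1-byte range */
        if (srcSize < 2) return 0;
        *nbSeq = ((unsigned)(src[0] - 128) << 8) + src[1];
        return 2;
    }
    if (srcSize < 3) return 0;          /* 3-byte format: first byte is 255 */
    *nbSeq = src[1] + ((unsigned)src[2] << 8) + 0x7F00;
    return 3;
}

/* Returns 1 when the section ends right after the count,
 * i.e. no FSE table descriptions follow, whatever format encoded the 0. */
static int sectionEndsAfterCount(unsigned nbSeq)
{
    return nbSeq == 0;
}
```

With this reader, the two bytes `0x80 0x00` decode to a count of 0 and end the section, exactly like the single byte `0x00`.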
Yann Collet
8030342eea
Merge pull request #3659 from facebook/fixHarness
...
Fixed a bug in the educational decoder
2023-06-05 15:03:14 -04:00
Yann Collet
1f83b7cfc4
fix a minor inefficiency in compress_superblock
...
and in `decodecorpus`:
the specific case `nbSeq=127` can be represented using the 1-byte format.
Note that both the 1-byte and the 2-byte formats are valid to represent this case,
so there was no "error", produced data remains valid,
it's just that the 1-byte format is more efficient.
fix #3667
Credit to @ip7z for finding this issue.
2023-06-05 09:51:52 -07:00
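The inefficiency described in this commit concerns which header format an encoder picks for a given sequence count. The following is a minimal sketch of a writer that always selects the shortest format, based on the Zstandard format specification; `writeNbSeq()` is a hypothetical name and this is not the code from `compress_superblock` or `decodecorpus`. The boundary of interest is `nbSeq < 128`, which places 127 in the 1-byte format.

```c
#include <assert.h>
#include <stddef.h>

/* Writes the sequence count using the shortest valid format.
 * Returns the number of bytes written (1, 2 or 3),
 * or 0 if dstCapacity is too small. */
static size_t writeNbSeq(unsigned char* dst, size_t dstCapacity, unsigned nbSeq)
{
    assert(dst != NULL);
    assert(nbSeq <= 0x7F00u + 0xFFFFu);   /* largest count the 3-byte format can hold */
    if (nbSeq < 128) {                    /* 1-byte format, includes nbSeq == 127 */
        if (dstCapacity < 1) return 0;
        dst[0] = (unsigned char)nbSeq;
        return 1;
    }
    if (nbSeq < 0x7F00) {                 /* 2-byte format */
        if (dstCapacity < 2) return 0;
        dst[0] = (unsigned char)((nbSeq >> 8) + 0x80);
        dst[1] = (unsigned char)(nbSeq & 0xFF);
        return 2;
    }
    if (dstCapacity < 3) return 0;        /* 3-byte format */
    dst[0] = 0xFF;
    dst[1] = (unsigned char)((nbSeq - 0x7F00) & 0xFF);
    dst[2] = (unsigned char)((nbSeq - 0x7F00) >> 8);
    return 3;
}
```

Writing the comparison as `nbSeq < 127` instead would still produce valid data, since the 2-byte format can also represent 127, but it would spend one extra byte on that case.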