mirror of https://github.com/facebook/zstd.git synced 2025-07-05 15:09:03 +02:00

39 Commits

Author SHA1 Message Date
618bf84e0d seekable_format: Prevent rereading frame when seeking forward
When decompressing a seekable file, if seeking forward within
a frame (by issuing multiple ZSTD_seekable_decompress calls
with a small gap between them), the frame will be unnecessarily
reread from the beginning. This patch makes it continue using
the current frame data and simply skip over the unneeded bytes.
2023-03-29 21:24:12 -07:00
63042f1f11 fix 32bit build errors in zstd seekable 2023-01-24 15:53:59 -08:00
5d693cc38c Coalesce Almost All Copyright Notices to Standard Phrasing
```
for f in $(find . \( -path ./.git -o -path ./tests/fuzz/corpora -o -path ./tests/regression/data-cache -o -path ./tests/regression/cache \) -prune -o -type f); do sed -i '/Copyright .* \(Yann Collet\)\|\(Meta Platforms\)/ s/Copyright .*/Copyright (c) Meta Platforms, Inc. and affiliates./' $f; done

git checkout HEAD -- build/VS2010/libzstd-dll/libzstd-dll.rc build/VS2010/zstd/zstd.rc tests/test-license.py contrib/linux-kernel/test/include/linux/xxhash.h examples/streaming_compression_thread_pool.c lib/legacy/zstd_v0*.c lib/legacy/zstd_v0*.h
nano ./programs/windres/zstd.rc
nano ./build/VS2010/zstd/zstd.rc
nano ./build/VS2010/libzstd-dll/libzstd-dll.rc
```
2022-12-20 12:52:34 -05:00
7f12f24cf4 Rewrite Copyright Date Ranges from -present to -2022
Apparently it's better. Somehow.

```
for f in $(find . \( -path ./.git -o -path ./tests/fuzz/corpora -o -path ./tests/regression/data-cache -o -path ./tests/regression/cache \) -prune -o -type f); do echo $f; sed -i 's/\-present/-2022/' $f; done

git checkout HEAD -- build/meson/
```
2022-12-20 12:44:56 -05:00
8927f985ff Update Copyright Headers 'Facebook' -> 'Meta Platforms'
```
for f in $(find . \( -path ./.git -o -path ./tests/fuzz/corpora \) -prune -o -type f);
do
  sed -i 's/Facebook, Inc\./Meta Platforms, Inc. and affiliates./' $f;
done
```
2022-12-20 12:37:57 -05:00
4dffc35f2e Convert references to https from http 2022-12-14 06:58:35 -08:00
03903f5701 fixed minor compression difference in btlazy2
subtle dependency on sumtype numeric representation
2021-12-29 18:51:03 -08:00
d6be7659b0 Add seekable roundtrip fuzzer (#2617) 2021-05-06 10:08:21 -04:00
53a60e98de seekable decompression fixes (#2594)
* seekable_format: fix from-file reading (not in-memory)

It tries to check the buffer boundary, but there is no buffer for
from-file reading.

* seekable_decompression: break when ZSTD_seekable_decompress() returns zero

* seekable_decompression_mem: break when ZSTD_seekable_decompress() returns zero

* seekable_format: cap the offset+len up to the last dOffset

This allows reading the whole file without getting a corruption error if
the offset is more than the data left in the file, i.e.:

    $ ./seekable_compression seekable_compression.c 8192 | head
    $ zstd -cdq seekable_compression.c.zst | wc -c
    4737

Before this patch:

    $ ./seekable_decompression seekable_compression.c.zst 0 10000000 | wc -c
    ZSTD_seekable_decompress() error : Corrupted block detected
    0

After:

    $ ./seekable_decompression seekable_compression.c.zst 0 10000000 | wc -c
    4737
2021-05-05 10:05:41 -04:00
21697b9c9e Fix seek table descriptor check when loading 2021-03-12 23:07:15 +02:00
6c0bfc468c fixed wrong assert condition 2021-03-03 15:30:55 -08:00
24d59a655d Merge branch 'dev' into seekTable 2021-03-03 15:08:40 -08:00
ac95a30455 various minor style fixes 2021-03-02 16:03:18 -08:00
029f974ddc strengthen compilation flags 2021-03-02 15:43:52 -08:00
c7e42e147b fixed const guarantees
read-only objects are properly const-ified in parameters
2021-03-02 15:24:30 -08:00
a80b10f5e6 fix potential leak on exit 2021-03-02 15:03:37 -08:00
527a20c3cd Fix seekable decompress hanging 2021-03-02 14:30:03 -08:00
ce6d1b9376 Merge pull request #2113 from mdittmer/expose-seek-table
[contrib] Support seek table-only API
2021-03-02 10:50:47 -08:00
adb54293ab Stop using deprecated reset?Stream functions
These are replaced by the corresponding context resets. When
converting resetCStream, CCtx_setPledgedSrcSize isn't called if the
source size is "unknown".

This helps reduce the reliance on "static only" symbols, as well as
reducing the use of deprecated functions.

Signed-off-by: Stephen Kitt <steve@sk2.org>
2021-02-23 21:56:01 +01:00
152b55879c Add unit tests to seekable 2020-12-02 15:33:12 -05:00
9db49a3989 Add a forward progress requirement bound to seekable streaming decompression 2020-12-02 12:24:16 -05:00
9b8f337357 [contrib] Support seek table-only API
Memory constrained use cases that manage multiple archives benefit from
retaining multiple archive seek tables without retaining a ZSTD_seekable
instance for each.

* New opaque type for seek table: ZSTD_seekTable.
* ZSTD_seekable_copySeekTable() supports copying seek table out of a
  ZSTD_seekable.
* ZSTD_seekTable_[eachSeekTableOp]() defines seek table API that mirrors
  existing seek table operations.
* Existing ZSTD_seekable_[eachSeekTableOp]() retained; they delegate to
  the ZSTD_seekTable variant.

These changes allow the above-mentioned use cases to initialize a
ZSTD_seekable, extract its ZSTD_seekTable, then throw the ZSTD_seekable
away to save memory. Standard ZSTD operations can then be used to
decompress frames based on seek table offsets.

The copy and delegate patterns are intended to minimize impact on
existing code and clients. Using copy instead of move for the infrequent
operation extracting a seek table ensures that the extraction does not
render the ZSTD_seekable useless. Delegating to *new* seek
table-oriented APIs ensures that this is not a breaking change for
existing clients while supporting all meaningful operations that depend
only on seek table data.
2020-05-07 09:31:43 -04:00
671d533ea7 Fix seekable decompression in-memory api 2019-07-21 23:22:25 -04:00
ededcfca57 fix confusion between unsigned <-> U32
as suggested in #1441.

generally U32 and unsigned are the same thing,
except when they are not ...

case : 32-bit compilation for MIPS (uint32_t == unsigned long)

The vast majority of the changes convert U32 into unsigned.
In rare cases, it's the other way around (typically for internal code, such as seeds).

Among the issues this patch solves :
- some parameters were declared with type `unsigned` in *.h,
  but with type `U32` in their implementation *.c .
- some parameters have type unsigned*,
  but the caller uses a pointer to U32 instead.

These fixes are useful.

However, the bulk of changes is about %u formatting,
which requires unsigned type,
but generally receives U32 values instead,
often just for brevity (U32 is shorter than unsigned).
These changes are generally minor, or even annoying.

As a consequence, the amount of code changed is larger than I would expect for such a patch.

Testing is also a pain :
it requires manually modifying `mem.h`,
in order to lie about `U32`
and force it to be an `unsigned long` typically.
On a 64-bit system, this will break the equivalence unsigned == U32.
Unfortunately, it will also break a few static_assert(), controlling structure sizes.
So it also requires modifying `debug.h` to make `static_assert()` a noop.
And then reverting these changes.

So it's inconvenient, and as a consequence,
this property is currently not checked during CI tests.
Therefore, these problems can emerge again in the future.

I wonder if it is worth ensuring proper distinction of U32 != unsigned in CI tests.
It's another restriction for coding, adding more frustration during merge tests,
since most platforms don't need this distinction (hence contributors will not see it),
and while this can matter in theory, the number of platforms impacted seems minimal.

Thoughts ?
2018-12-21 18:09:41 -08:00
34f01e600f fixed multiple conversions
from 64-bit to 32-bit
2018-12-13 14:02:22 -08:00
b83d1e7714 removed some static const variables
and replaced by traditional macro constants.

Unfortunately, C doesn't consider `static const` to mean "constant"
2018-11-13 16:56:32 -08:00
b52867a97f zstdseek_decompress: fix decompression with data left in input buffer 2018-09-16 18:05:32 +03:00
42a02ab745 fixed minor warnings issued by scan-build 2018-08-15 14:36:02 -07:00
b567ce9d68 Fix name of macOS 2018-06-09 14:31:17 -05:00
97c60cdf36 fixed seekable_format type mismatch
and some minor "unused variable" warnings.
Also : zstd_seekable.h is actually depending on zstd.h for ZSTDLIB_API
2018-06-06 13:10:29 -07:00
355cb645bf fixed seekable format example 2018-03-15 16:29:28 -07:00
0d58aaf6f0 /contrib: fixed license header
removed last reference to PATENTS file
2017-10-02 02:07:17 -07:00
11dc940e72 Add parallel processing example for seekable API 2017-04-21 12:23:06 -07:00
35186e65b0 Address comments and make sure all prototypes are rendered by gen_html 2017-04-20 16:48:54 -07:00
0f7bd772e6 Update seekable API to simplify IO 2017-04-18 16:48:30 -07:00
9626cf1ac6 Address @terrelln's comments 2017-04-13 17:48:35 -07:00
5ee1135f30 s/chunk/frame/ 2017-04-12 11:15:50 -07:00
e80f1d74b3 Address PR comments and minor fixes 2017-04-12 11:15:46 -07:00
d048fefef7 Move seekable format content to /contrib 2017-04-11 14:38:56 -07:00