krak/zstd

mirror of https://github.com/facebook/zstd.git synced 2025-03-06 16:56:49 +02:00

Author	SHA1	Message	Date
Victor Zhang	a610550e2c	Merge pull request #4218 from facebook/externC Move #includes out of `extern "C"` blocks	2025-01-07 10:06:08 -08:00
Victor Zhang	6b046f5841	PR feedback	2025-01-02 15:05:58 -08:00
Victor Zhang	54c3d998a0	Support for libc variants without fseeko/ftello Some older Android libc implementations don't support `fseeko` or `ftello`. This commit adds a new compile-time macro `LIBC_NO_FSEEKO` as well as a usage in CMake for old Android APIs.	2025-01-02 14:02:10 -08:00
Victor Zhang	fc726da774	Move #includes out of `extern "C"` blocks Do some include shuffling for `**.h` files within lib, programs, tests, and zlibWrapper. `lib/legacy` and `lib/deprecated` are untouched. `#include`s within `extern "C"` blocks in .cpp files are untouched. todo: shuffling for `xxhash.h`	2024-12-17 17:55:07 -08:00
Robert Rose	b683c0dbe2	prevent possible segfault when creating seek table Add a check whether the seek table of a `ZSTD_seekable` is initialized before creating a new seek table from it. Return `NULL`, if the check fails.	2024-11-25 08:57:25 +01:00
Dimitri Papadopoulos	585aaa0ed3	Do not test WIN32, instead test _WIN32 To the best of my knowledge: * `_WIN32` and `_WIN64` are defined by the compiler, * `WIN32` and `WIN64` are defined by the user, to indicate whatever the user chooses them to indicate. They mean 32-bit and 64-bit Windows compilation by convention only. See: https://accu.org/journals/overload/24/132/wilson_2223/ Windows compilers in general, and MSVC in particular, have been defining `_WIN32` and `_WIN64` for a long time, provably at least since Visual Studio 2015, and in practice as early as in the days of 16-bit Windows. See: https://learn.microsoft.com/en-us/cpp/preprocessor/predefined-macros?view=msvc-140 https://learn.microsoft.com/en-us/windows/win32/winprog64/the-tools Tests used to be inconsistent, sometimes testing `_WIN32`, sometimes `_WIN32` and `WIN32`. This brings consistency to Windows detection.	2023-09-23 19:03:18 +02:00
Yoni Gilad	649a9c85c3	seekable_format: Add unit test for multiple decompress calls This does the following: 1. Compress test data into multiple frames 2. Perform a series of small decompressions and seeks forward, checking that compressed data wasn't reread unnecessarily. 3. Perform some seeks forward and backward to ensure correctness.	2023-03-29 21:35:52 -07:00
Yoni Gilad	618bf84e0d	seekable_format: Prevent rereading frame when seeking forward When decompressing a seekable file, if seeking forward within a frame (by issuing multiple ZSTD_seekable_decompress calls with a small gap between them), the frame will be unnecessarily reread from the beginning. This patch makes it continue using the current frame data and simply skip over the unneeded bytes.	2023-03-29 21:24:12 -07:00
Yann Collet	dd8cb5a0f1	added documentation for the seekable format and notably provide additional context for the Maximum Frame Size parameter. requested by @P-E-Meunier at `1df9f36c6c (commitcomment-103856979)`.	2023-03-10 15:54:31 -08:00
Yann Collet	1df9f36c6c	Improved seekable format ingestion speed for small frame size As reported by @P-E-Meunier in https://github.com/facebook/zstd/issues/2662#issuecomment-1443836186, seekable format ingestion speed can be particularly slow when selected `FRAME_SIZE` is very small, especially in combination with the recent row_hash compression mode. The specific scenario mentioned was `pijul`, using frame sizes of 256 bytes and level 10. This is improved in this PR, by providing approximate parameter adaptation to the compression process. Tested locally on a M1 laptop, ingestion of `enwik8` using `pijul` parameters went from 35sec. (before this PR) to 2.5sec (with this PR). For the specific corner case of a file full of zeroes, this is even more pronounced, going from 45sec. to 0.5sec. These benefits are unrelated to (and come on top of) other improvement efforts currently being made by @yoniko for the row_hash compression method specifically. The `seekable_compress` test program has been updated to allow setting compression level, in order to produce these performance results.	2023-03-09 18:00:30 -08:00
Danielle Rozenblit	63042f1f11	fix 32bit build errors in zstd seekable	2023-01-24 15:53:59 -08:00
W. Felix Handte	5d693cc38c	Coalesce Almost All Copyright Notices to Standard Phrasing ``` for f in $(find . \( -path ./.git -o -path ./tests/fuzz/corpora -o -path ./tests/regression/data-cache -o -path ./tests/regression/cache \) -prune -o -type f); do sed -i '/Copyright .* \(Yann Collet\)\|\(Meta Platforms\)/ s/Copyright ./Copyright (c) Meta Platforms, Inc. and affiliates./' $f; done git checkout HEAD -- build/VS2010/libzstd-dll/libzstd-dll.rc build/VS2010/zstd/zstd.rc tests/test-license.py contrib/linux-kernel/test/include/linux/xxhash.h examples/streaming_compression_thread_pool.c lib/legacy/zstd_v0.c lib/legacy/zstd_v0*.h nano ./programs/windres/zstd.rc nano ./build/VS2010/zstd/zstd.rc nano ./build/VS2010/libzstd-dll/libzstd-dll.rc ```	2022-12-20 12:52:34 -05:00
W. Felix Handte	7f12f24cf4	Rewrite Copyright Date Ranges from `-present` to `-2022` Apparently it's better. Somehow. ``` for f in $(find . \( -path ./.git -o -path ./tests/fuzz/corpora -o -path ./tests/regression/data-cache -o -path ./tests/regression/cache \) -prune -o -type f); do echo $f; sed -i 's/\-present/-2022/' $f; done g co HEAD -- build/meson/ ```	2022-12-20 12:44:56 -05:00
W. Felix Handte	8927f985ff	Update Copyright Headers 'Facebook' -> 'Meta Platforms' ``` for f in $(find . \( -path ./.git -o -path ./tests/fuzz/corpora \) -prune -o -type f); do sed -i 's/Facebook, Inc\./Meta Platforms, Inc. and affiliates./' $f; done ```	2022-12-20 12:37:57 -05:00
daniellerozenblit	e2fc93340f	Merge branch 'dev' into http-to-https	2022-12-15 10:46:13 -05:00
Danielle Rozenblit	4dffc35f2e	Convert references to https from http	2022-12-14 06:58:35 -08:00
Danielle Rozenblit	aece0f258a	free memory in test case	2022-12-13 08:15:16 -08:00
yhoogstrate	f17652931c	seekable_format no header when compressing empty string to stream	2022-02-08 14:06:00 +01:00
Yann Collet	03903f5701	fixed minor compression difference in btlazy2 subtle dependency on sumtype numeric representation	2021-12-29 18:51:03 -08:00
sen	d6be7659b0	Add seekable roundtrip fuzzer (#2617)	2021-05-06 10:08:21 -04:00
Azat Khuzhin	53a60e98de	seekable decompression fixes (#2594) * seekable_format: fix from-file reading (not in-memory) It tries to check the buffer boundary, but there is no buffer for from-file reading. * seekable_decompression: break when ZSTD_seekable_decompress() returns zero * seekable_decompression_mem: break when ZSTD_seekable_decompress() returns zero * seekable_format: cap the offset+len up to the last dOffset This allows reading the whole file without getting a corruption error if the offset is more than the data left in the file, i.e.: $ ./seekable_compression seekable_compression.c 8192 | head $ zstd -cdq seekable_compression.c.zst | wc -c 4737 Before this patch: $ ./seekable_decompression seekable_compression.c.zst 0 10000000 | wc -c ZSTD_seekable_decompress() error : Corrupted block detected 0 After: $ ./seekable_decompression seekable_compression.c.zst 0 10000000 | wc -c 4737	2021-05-05 10:05:41 -04:00
Fotis Xenakis	3c6f5d5eca	Fix seekable test to provide valid descriptor	2021-03-13 00:00:08 +02:00
Fotis Xenakis	21697b9c9e	Fix seek table descriptor check when loading	2021-03-12 23:07:15 +02:00
Yann Collet	2fa4c8c405	added code comments for new API ZSTD_seekTable	2021-03-03 22:54:04 -08:00
Yann Collet	6e390ced1f	Merge branch 'seekTable' of github.com:facebook/zstd into seekTable	2021-03-03 22:44:38 -08:00
Yann Collet	16ec1cf355	added test case for seekTable API and simple roundtrip test	2021-03-03 18:56:23 -08:00
Yann Collet	713d4953f7	fixed gcc-7 conversion warning	2021-03-03 18:00:41 -08:00
Yann Collet	6c0bfc468c	fixed wrong assert condition	2021-03-03 15:30:55 -08:00
Yann Collet	a1d7b9d654	fixed gcc conversion warnings	2021-03-03 15:17:12 -08:00
Yann Collet	24d59a655d	Merge branch 'dev' into seekTable	2021-03-03 15:08:40 -08:00
Yann Collet	ac95a30455	various minor style fixes	2021-03-02 16:03:18 -08:00
Yann Collet	029f974ddc	strengthen compilation flags	2021-03-02 15:43:52 -08:00
Yann Collet	c7e42e147b	fixed const guarantees read-only objects are properly const-ified in parameters	2021-03-02 15:24:30 -08:00
Yann Collet	a80b10f5e6	fix potential leak on exit	2021-03-02 15:03:37 -08:00
Sen Huang	527a20c3cd	Fix seekable decompress hanging	2021-03-02 14:30:03 -08:00
Martin Lindsay	3cbdbb888b	ZSTD_seekable_decompress() example that hangs.	2021-03-02 14:25:17 -08:00
Yann Collet	ce6d1b9376	Merge pull request #2113 from mdittmer/expose-seek-table [contrib] Support seek table-only API	2021-03-02 10:50:47 -08:00
Stephen Kitt	adb54293ab	Stop using deprecated reset?Stream functions These are replaced by the corresponding context resets. When converting resetCStream, CCtx_setPledgedSrcSize isn't called if the source size is "unknown". This helps reduce the reliance on "static only" symbols, as well as reducing the use of deprecated functions. Signed-off-by: Stephen Kitt <steve@sk2.org>	2021-02-23 21:56:01 +01:00
Yann Collet	0b39531d75	moving all references to `release` branch was previously `master`	2020-12-16 23:00:35 -08:00
senhuang42	26f89d47aa	Clean up makefile for seekable tests	2020-12-03 09:25:04 -05:00
senhuang42	152b55879c	Add unit tests to seekable	2020-12-02 15:33:12 -05:00
senhuang42	9db49a3989	Add a forward progress requirement bound to seekable streaming decompression	2020-12-02 12:24:16 -05:00
Mark Dittmer	9b8f337357	[contrib] Support seek table-only API Memory constrained use cases that manage multiple archives benefit from retaining multiple archive seek tables without retaining a ZSTD_seekable instance for each. * New opaque type for seek table: ZSTD_seekTable. * ZSTD_seekable_copySeekTable() supports copying seek table out of a ZSTD_seekable. * ZSTD_seekTable_[eachSeekTableOp]() defines seek table API that mirrors existing seek table operations. * Existing ZSTD_seekable_[eachSeekTableOp]() retained; they delegate to the ZSTD_seekTable variant. These changes allow the above-mentioned use cases to initialize a ZSTD_seekable, extract its ZSTD_seekTable, then throw the ZSTD_seekable away to save memory. Standard ZSTD operations can then be used to decompress frames based on seek table offsets. The copy and delegate patterns are intended to minimize impact on existing code and clients. Using copy instead of move for the infrequent operation of extracting a seek table ensures that the extraction does not render the ZSTD_seekable useless. Delegating to new seek table-oriented APIs ensures that this is not a breaking change for existing clients while supporting all meaningful operations that depend only on seek table data.	2020-05-07 09:31:43 -04:00
W. Felix Handte	15da57820d	Add New Seekable Compression Example to .gitignore	2019-07-24 18:22:20 -04:00
Sean Purcell	671d533ea7	Fix seekable decompression in-memory api	2019-07-21 23:22:25 -04:00
Yann Collet	ededcfca57	fix confusion between unsigned <-> U32 as suggested in #1441. generally U32 and unsigned are the same thing, except when they are not ... case : 32-bit compilation for MIPS (uint32_t == unsigned long) A vast majority of transformation consists in transforming U32 into unsigned. In rare cases, it's the other way around (typically for internal code, such as seeds). Among a few issues this patch solves : - some parameters were declared with type `unsigned` in .h, but with type `U32` in their implementation .c . - some parameters have type unsigned*, but the caller uses a pointer to U32 instead. These fixes are useful. However, the bulk of changes is about %u formatting, which requires unsigned type, but generally receives U32 values instead, often just for brevity (U32 is shorter than unsigned). These changes are generally minor, or even annoying. As a consequence, the amount of code changed is larger than I would expect for such a patch. Testing is also a pain : it requires manually modifying `mem.h`, in order to lie about `U32` and force it to be an `unsigned long` typically. On a 64-bit system, this will break the equivalence unsigned == U32. Unfortunately, it will also break a few static_assert(), controlling structure sizes. So it also requires modifying `debug.h` to make `static_assert()` a noop. And then reverting these changes. So it's inconvenient, and as a consequence, this property is currently not checked during CI tests. Therefore, these problems can emerge again in the future. I wonder if it is worth ensuring proper distinction of U32 != unsigned in CI tests. It's another restriction for coding, adding more frustration during merge tests, since most platforms don't need this distinction (hence contributor will not see it), and while this can matter in theory, the number of platforms impacted seems minimal. Thoughts ?	2018-12-21 18:09:41 -08:00
Yann Collet	34f01e600f	fixed multiple conversions from 64-bit to 32-bit	2018-12-13 14:02:22 -08:00
Yann Collet	b83d1e7714	removed some `static const` variables and replaced by traditional macro constants. Unfortunately, C doesn't consider `static const` to mean "constant"	2018-11-13 16:56:32 -08:00
Azat Khuzhin	d707692e05	seekable_decompression: support offset greater than UINT_MAX	2018-09-16 18:05:32 +03:00
Azat Khuzhin	b52867a97f	zstdseek_decompress: fix decompression with data left in input buffer	2018-09-16 18:05:32 +03:00

69 Commits