krak/zstd

mirror of https://github.com/facebook/zstd.git synced 2025-03-07 01:10:04 +02:00

Author	SHA1	Message	Date
Nick Terrell	8957fef554	[huf] Add generic C versions of the fast decoding loops Add generic C versions of the fast decoding loops to serve architectures that don't have an assembly implementation. Also allow selecting the C decoding loop over the assembly decoding loop through a zstd decompression parameter `ZSTD_d_disableHuffmanAssembly`. I benchmarked on my Intel i9-9900K and my Macbook Air with an M1 processor. The benchmark command forces zstd to compress without any matches, using only literals compression, and measures only Huffman decompression speed: ``` zstd -b1e1 --compress-literals --zstd=tlen=131072 silesia.tar ``` The new fast decoding loops outperform the previous implementation uniformly, but don't beat the x86-64 assembly. Additionally, the fast C decoding loops suffer from the same stability problems that we've seen in the past, where the assembly version doesn't. So even though clang gets close to assembly on x86-64, it still has stability issues. \| Arch \| Function \| Compiler \| Default (MB/s) \| Assembly (MB/s) \| Fast (MB/s) \| \|---------\|----------------\|--------------\|----------------\|-----------------\|-------------\| \| x86-64 \| decompress 4X1 \| gcc-12.2.0 \| 1029.6 \| 1308.1 \| 1208.1 \| \| x86-64 \| decompress 4X1 \| clang-14.0.6 \| 1019.3 \| 1305.6 \| 1276.3 \| \| x86-64 \| decompress 4X2 \| gcc-12.2.0 \| 1348.5 \| 1657.0 \| 1374.1 \| \| x86-64 \| decompress 4X2 \| clang-14.0.6 \| 1027.6 \| 1659.9 \| 1468.1 \| \| aarch64 \| decompress 4X1 \| clang-12.0.5 \| 1081.0 \| N/A \| 1234.9 \| \| aarch64 \| decompress 4X2 \| clang-12.0.5 \| 1270.0 \| N/A \| 1516.6 \|	2023-01-25 13:47:51 -08:00
Nick Terrell	0cc1b0cb22	Delete unused Huffman functions Remove all Huffman functions that aren't used by zstd.	2023-01-20 14:12:53 -08:00
W. Felix Handte	5d693cc38c	Coalesce Almost All Copyright Notices to Standard Phrasing ``` for f in $(find . \( -path ./.git -o -path ./tests/fuzz/corpora -o -path ./tests/regression/data-cache -o -path ./tests/regression/cache \) -prune -o -type f); do sed -i '/Copyright .* \(Yann Collet\)\\|\(Meta Platforms\)/ s/Copyright ./Copyright (c) Meta Platforms, Inc. and affiliates./' $f; done git checkout HEAD -- build/VS2010/libzstd-dll/libzstd-dll.rc build/VS2010/zstd/zstd.rc tests/test-license.py contrib/linux-kernel/test/include/linux/xxhash.h examples/streaming_compression_thread_pool.c lib/legacy/zstd_v0.c lib/legacy/zstd_v0*.h nano ./programs/windres/zstd.rc nano ./build/VS2010/zstd/zstd.rc nano ./build/VS2010/libzstd-dll/libzstd-dll.rc ```	2022-12-20 12:52:34 -05:00
W. Felix Handte	8927f985ff	Update Copyright Headers 'Facebook' -> 'Meta Platforms' ``` for f in $(find . \( -path ./.git -o -path ./tests/fuzz/corpora \) -prune -o -type f); do sed -i 's/Facebook, Inc\./Meta Platforms, Inc. and affiliates./' $f; done ```	2022-12-20 12:37:57 -05:00
Elliot Gorokhovsky	db2f4a6532	Move bitwise builtins into bits.h	2022-02-14 11:16:03 -05:00
Nick Terrell	5414dd7978	[bmi2] Add lzcnt and bmi target attributes * When dynamic dispatching to bmi2 add lzcnt and bmi to the TARGET_ATTRIBUTE. * Centralize the bmi2 TARGET_ATTRIBUTE definition to BMI2_TARGET_ATTRIBUTE so we can change it in the future. * Only enable bmi2 when both bmi1 & bmi2 are supported. There shouldn't be any cases where bmi2 is supported but bmi1 isn't. But, since we are using the instruction we should check bmi1 as well.	2021-11-30 17:54:56 -08:00
Ma Lin	ae986fcdb8	Use __assume(0) for unreachable code path in msvc msvc will optimize away the condition check.	2021-09-27 19:23:57 +08:00
Ma Lin	e5ba858270	Don't initialize the first parameter of _BitScanForward* functions Like the document example, no need to initialize `r` to 0. https://docs.microsoft.com/en-us/cpp/intrinsics/bitscanforward-bitscanforward64	2021-09-25 16:36:53 +08:00
Nick Terrell	d8a0797268	[fuzz] Add Huffman round trip fuzzer * Add a Huffman round trip fuzzer * Fix two minor bugs in Huffman that aren't exposed in zstd - Incorrect weight comparison (weights are allowed to be equal to table log). - HUF_compress1X_usingCTable_internal() can return compressed size >= source size, so the assert that `cSize <= 65535` isn't correct, and it needs to be checked instead.	2021-08-03 08:10:06 -07:00
Nick Terrell	a494308ae9	[copyright][license] Switch to yearless copyright and some cleanup in the linux-kernel files * Switch to yearless copyright per FB policy * Fix up SPDX-License-Identifier lines in `contrib/linux-kernel` sources * Add zstd copyright/license header to the `contrib/linux-kernel` sources * Update the `tests/test-license.py` to check for yearless copyright * Improvements to `tests/test-license.py` * Check `contrib/linux-kernel` in `tests/test-license.py`	2021-03-30 10:30:43 -07:00
Nick Terrell	66e811d782	[license] Update year to 2021	2021-01-04 17:53:52 -05:00
Yann Collet	68c14bdff2	minor speed improvement to HUF_readCTable() faster by ~+1-2%	2020-12-04 16:33:39 -08:00
Nick Terrell	c465f24457	ZSTD_ prefix mem{cpy,move,set},malloc,calloc,free	2020-08-26 12:26:03 -07:00
Nick Terrell	4193638996	[bug] Fix FSE_readNCount() * Fix bug introduced in PR #2271 * Fix long-standing bug that is impossible to trigger inside of zstd * Add a fuzzer that makes sure the normalized count always round trips correctly	2020-08-25 15:42:41 -07:00
Nick Terrell	6d2f750b37	Document the BMI2 default() functions	2020-08-24 14:44:33 -07:00
Nick Terrell	8def0e5fd3	Fix up code after reading through	2020-08-24 12:24:45 -07:00
Nick Terrell	8f8bd2d1ac	[regression] Update results.csv	2020-08-20 12:41:35 -07:00
Nick Terrell	612e947c5e	wire up bmi2 support	2020-08-17 16:35:28 -07:00
Nick Terrell	ba1fd17a9f	speed up literal header decoding	2020-08-17 12:17:53 -07:00
Nick Terrell	6004c1117f	speed up small blocks	2020-08-16 23:03:38 -07:00
Yann Collet	21c273da84	import some minor fixes from FSE project	2020-07-16 20:25:15 -07:00
Nick Terrell	ac58c8d720	Fix copyright and license lines * All copyright lines now have -2020 instead of -present * All copyright lines include "Facebook, Inc" * All licenses are now standardized The copyright in `threading.{h,c}` is not changed because it comes from zstdmt. The copyright and license of `divsufsort.{h,c}` is not changed.	2020-03-26 17:02:06 -07:00
Yann Collet	ff773bfcde	zeroise freq table with memset() improves decoding speed by ~5% in github_users sample set	2018-06-26 17:24:41 -07:00
Nick Terrell	f2d0924b87	Variable declarations	2018-05-23 14:58:58 -07:00
Nick Terrell	c92dd11940	Error if reported size is too large in edge case	2018-05-23 14:47:20 -07:00
Nick Terrell	a97e9a627a	[zstd] Fix decompression edge case This edge case is only possible with the new optimal encoding selector, since before zstd would always choose `set_basic` for small numbers of sequences. Fix `FSE_readNCount()` to support buffers < 4 bytes. Credit to OSS-Fuzz	2018-05-23 12:16:00 -07:00
Yann Collet	1a26ec6e8d	opt: init statistics from dictionary instead of starting from fake "default" statistics.	2018-05-10 17:59:12 -07:00
Yann Collet	1f2c95c5f3	minor code refactor in HUF module	2017-03-05 21:07:20 -08:00
Yann Collet	4596037042	updated fse version feature minor refactoring (removing FSE_abs()) also : fix a few minor issues recently introduced in examples	2017-02-15 12:00:03 -08:00
Yann Collet	b89af20353	reduced table sizes for HUF_readDTableX4	2016-12-01 18:24:59 -08:00
Yann Collet	766431909f	introduced FSE_decompress_wksp(), to use less stack space	2016-11-30 12:36:45 -08:00
Nick Terrell	d760529a05	Fix stack buffer overrun when weightTotal == 0 If `weightTotal == 0`, then `BIT_highbit32(weightTotal)` is undefined behavior in the case that it calls `__builtin_clz()`. If `tableLog == HUF_TABLELOG_ABSOLUTEMAX` then we will access one byte beyond the end of the buffer.	2016-10-19 11:39:11 -07:00
Nick Terrell	ccfcc643da	Check if dict is empty before reading first byte	2016-10-17 11:46:03 -07:00
Yann Collet	cbc5e9dc19	fixes oob read	2016-07-24 18:02:04 +02:00
Yann Collet	38b75ddeb2	removed special case all-1 huffman distribution	2016-07-24 15:35:59 +02:00
Yann Collet	7ed5e33b89	minor comment changes	2016-07-24 14:26:11 +02:00
Yann Collet	d5c5a77990	minor comments clarifications	2016-07-20 13:35:14 +02:00
Yann Collet	a91ca620cf	removed `HUF_readStats()` from public space	2016-06-05 01:33:55 +02:00
Yann Collet	d0e2cd15cb	Merged `fse_static` into `fse.h` . Now requires `FSE_STATIC_LINKING_ONLY` macro.	2016-06-05 00:58:01 +02:00
inikep	63ecd747de	added common/entropy_common.c	2016-05-13 11:27:56 +02:00

40 Commits