Doing this check with a direct c++ snippet is prone to portability problems:
- \043 is not portable between shells: dash expands it to #,
bash does not;
- using # directly works with make 4.3 but does not with make 4.2.
Let's just use the c++ version that covers both the code and the gtest.
ZSTD_resetDStream() is deprecated and replaced by ZSTD_DCtx_reset().
This removes deprecation warnings from the kernel build.
This change is a no-op, see the docs suggesting this replacement.
fcbf2fde9a/lib/zstd.h (L2655-L2663)
To the best of my knowledge:
* `_WIN32` and `_WIN64` are defined by the compiler,
* `WIN32` and `WIN64` are defined by the user, to indicate whatever
the user chooses them to indicate. They mean 32-bit and 64-bit Windows
compilation by convention only.
See:
https://accu.org/journals/overload/24/132/wilson_2223/
Windows compilers in general, and MSVC in particular, have been defining
`_WIN32` and `_WIN64` for a long time, provably at least since Visual Studio
2015, and in practice as early as in the days of 16-bit Windows.
See:
https://learn.microsoft.com/en-us/cpp/preprocessor/predefined-macros?view=msvc-140https://learn.microsoft.com/en-us/windows/win32/winprog64/the-tools
Tests used to be inconsistent, sometimes testing `_WIN32`, sometimes
`_WIN32` and `WIN32`. This brings consistency to Windows detection.
This does the following:
1. Compress test data into multiple frames
2. Perform a series of small decompressions and seeks forward, checking
that compressed data wasn't reread unnecessarily.
3. Perform some seeks forward and backward to ensure correctness.
When decompressing a seekable file, if seeking forward within
a frame (by issuing multiple ZSTD_seekable_decompress calls
with a small gap between them), the frame will be unnecessarily
reread from the beginning. This patch makes it continue using
the current frame data and simply skip over the unneeded bytes.
Rather than remove the flag entirely, as proposed in #3499, this commit uses
the newest C++ standard the compiler supports. This retains the selection of
using only standardized features (excluding GNU extensions) and keeps the
recency requirements of the codebase explicit.
Tested with various versions of `g++` and `clang++`.
As reported by @P-E-Meunier in https://github.com/facebook/zstd/issues/2662#issuecomment-1443836186,
seekable format ingestion speed can be particularly slow
when selected `FRAME_SIZE` is very small,
especially in combination with the recent row_hash compression mode.
The specific scenario mentioned was `pijul`,
using frame sizes of 256 bytes and level 10.
This is improved in this PR,
by providing approximate parameter adaptation to the compression process.
Tested locally on a M1 laptop,
ingestion of `enwik8` using `pijul` parameters
went from 35sec. (before this PR) to 2.5sec (with this PR).
For the specific corner case of a file full of zeroes,
this is even more pronounced, going from 45sec. to 0.5sec.
These benefits are unrelated to (and come on top of) other improvement efforts currently being made by @yoniko for the row_hash compression method specifically.
The `seekable_compress` test program has been updated to allows setting compression level,
in order to produce these performance results.
* Fixes zstd-dll build (https://github.com/facebook/zstd/issues/3492):
- Adds pool.o and threading.o dependency to the zstd-dll target
- Moves custom allocation functions into header to avoid needing to add dependency on common.o
- Adds test target for zstd-dll
- Adds github workflow that buildis zstd-dll
```
for f in $(find . \( -path ./.git -o -path ./tests/fuzz/corpora -o -path ./tests/regression/data-cache -o -path ./tests/regression/cache \) -prune -o -type f);
do
sed -i 's/\-2021/-present/' $f;
done
g co HEAD -- .github/workflows/dev-short-tests.yml # fix bad match
```
```
for f in $(find . \( -path ./.git -o -path ./tests/fuzz/corpora \) -prune -o -type f);
do
sed -i 's/Facebook, Inc\./Meta Platforms, Inc. and affiliates./' $f;
done
```
* Add `Portability.h` to fix min/max issues.
* Fix conversion warnings
* Assert that windowLog <= 23, which is currently always the case.
This could be loosened, but we aren't looking to add new functionality.
Fixes on top of PR #3375 by @eli-schwartz, which added Windows CI for contrib & programs.
1. Follow the scheme introduced in PR #2501 for both `zdict.h` and `zstd_errors.h`.
2. If the `*_VISIBLE` macro isn't set, but the `*_VISIBILITY` macro is, use that.
Also make this change for `zstd.h`, since we probably shouldn't have changed
that macro name without backward compatibility in the first place.
3. Change all references to `*_VISIBILITY` to `*_VISIBLE`.
Fixes#3359.
Instead of using packed attribute hack, just use aligned attribute. It
improves code generation on armv6 and armv7, and slightly improves code
generation on aarch64. GCC generates identical code to regular aligned
access on ARMv6 for all versions between 4.5 and trunk, except GCC 5
which is buggy and generates the same (bad) code as packed access:
https://gcc.godbolt.org/z/hq37rz7sb
Newer gcc versions were getting smart and omitting the `memset()`.
Get around this issue by outlining the `memset()` into a different
function. This test is still hacky, but it works...
Disable ASM in the kernel for now. It requires a few changes & setup to
get working. Instead of doing it in a zstd version update, I'd prefer to
package that change as a single patch, and propose it separately from
the version update. This makes the version update easier, and reduces
some risk.
The zstd_common module was added upstream in commit
637a642f5c.
But the kernel specific code was inlined into the library. This commit
switches it to use the out of line method that we use for the other
modules.
Add a `--spdx` option to the freestanding script to prefix
files with a line like (for `.c` files):
// SPDX-License-Identifier: GPL-2.0+ OR BSD-3-Clause
or (for `.h` and `.S` files):
/* SPDX-License-Identifier: GPL-2.0+ OR BSD-3-Clause */
Given the style of the line to be used depends on the extension,
a simple `sed` insert command would not work.
It also skips the file if an existing SPDX line is there,
as well as raising an error if an unexpected SPDX line appears
anywhere else in the file, as well as for unexpected
file extensions.
I double-checked that all currently generated files appear
to be license as expected with:
grep -LRF 'This source code is licensed under both the BSD-style license (found in the' linux/lib/zstd
grep -LRF 'LICENSE file in the root directory of this source tree) and the GPLv2 (found' linux/lib/zstd
but somebody knowledgable on the licensing of the project should
double-check this is the intended case.
Fixes: https://github.com/facebook/zstd/issues/3293
Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
Summary:
Freeing an uninitialized pointer is undefined behavior. This caused a segfault
when compiling the benchmark with Clang -O3 and benching decompression.
V2: always create compressInstructions but check if cctxParams is NULL before
setting CCtx params to avoid segfault.
Test Plan:
make and run
Summary:
Added an option -p# where -p0 (default) sets the aggregation method to fastest
speed while -p1 sets the aggregation method to median. Also added a new column
in the csv file to report this option's value.
Test Plan:
``
$ ./largeNbDicts -1 --nbDicts=1 -D ~/benchmarks/html/html_8_16K.32K.dict
~/benchmarks/html/html_8_16K/*
loading 7450 files...
created src buffer of size 83.4 MB
split input into 7450 blocks
loading dictionary /home/zhuhan/benchmarks/html/html_8_16K.32K.dict
compressing at level 1 without dictionary : Ratio=3.03 (28827863 bytes)
compressed using a 32768 bytes dictionary : Ratio=4.28 (20410262 bytes)
generating 1 dictionaries, using 0.1 MB of memory
Compression Speed : 306.0 MB/s
Fastest Speed : 310.6 MB/s
$ ./largeNbDicts -1 --nbDicts=1 -p1 -D ~/benchmarks/html/html_8_16K.32K.dict
~/benchmarks/html/html_8_16K/*
loading 7450 files...
created src buffer of size 83.4 MB
split input into 7450 blocks
loading dictionary /home/zhuhan/benchmarks/html/html_8_16K.32K.dict
compressing at level 1 without dictionary : Ratio=3.03 (28827863 bytes)
compressed using a 32768 bytes dictionary : Ratio=4.28 (20410262 bytes)
generating 1 dictionaries, using 0.1 MB of memory
Compression Speed : 306.9 MB/s
Median Speed : 298.4 MB/s
```