This feature has proven useful in web compression, and we want to encourage
other folks to use it. But we currently make that difficult,
since it's locked away in the experimental API.
The API itself is straightforward, and I think it's fine to commit to
maintaining support and compatibility for it even if the underlying
implementation continues to evolve.
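As a quick illustration, the parameter is set through the regular `ZSTD_CCtx_setParameter()` entry point like any other compression parameter (a minimal sketch; the 1340-byte target is just an arbitrary example value):

```
#include <zstd.h>

/* Minimal sketch: once stable, the parameter is set like any other
 * compression parameter. 1340 bytes is an arbitrary example target. */
static size_t set_target_cblock_size(ZSTD_CCtx* cctx)
{
    return ZSTD_CCtx_setParameter(cctx, ZSTD_c_targetCBlockSize, 1340);
}
```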
Note that this commit changes both its enum name and its numeric value. Users
who followed the instructions for using the experimental API should be fine
with both of these changes, since they should only have referred to it by the
name `ZSTD_c_targetCBlockSize`.
Conceivably, someone could have done bad feature detection of this capability
with `#ifdef ZSTD_c_targetCBlockSize`, which will now evaluate to false since
the name is no longer a macro... but I think that's an acceptable hypothetical
breakage.
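For reference, a more robust detection approach that keeps working after the stabilization is to query the parameter at runtime rather than through the preprocessor (a sketch; it assumes headers recent enough to declare `ZSTD_c_targetCBlockSize` at all):

```
#include <zstd.h>

/* Enum members are invisible to #ifdef, so query support at runtime:
 * ZSTD_cParam_getBounds() reports an error for unsupported parameters. */
static int supports_targetCBlockSize(void)
{
    ZSTD_bounds const bounds = ZSTD_cParam_getBounds(ZSTD_c_targetCBlockSize);
    return !ZSTD_isError(bounds.error);
}
```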
Mark that `huf_decompress_amd64.S` supports BTI and PAC, which it trivially does because it is empty for aarch64.
The issue only requested BTI markings, but it also makes sense to mark PAC, which is the only other such feature flag.
Also add a test for this mode to the ARM64 QEMU tests. Before this PR it warns on `huf_decompress_amd64.S`; after, it doesn't.
Fixes Issue #3841.
In this new method, when an `offset==0` is detected,
it's converted into `(size_t)(-1)`, instead of 1.
The logic is that `(size_t)(-1)` is effectively an extremely large positive number,
which will not pass the offset distance test at the next stage (`execSequence()`).
I checked the source code, and the offset is always validated (as it should be),
using a formula that is not vulnerable to arithmetic overflow:
```
RETURN_ERROR_IF(sequence.offset > (size_t)(oLitEnd - virtualStart),
```
The benefit is that such a case (`offset==0`) is always detected as corrupted data,
rather than relying on the checksum to catch the error.
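The conversion itself can be done branchlessly; a simplified sketch of the idea (not the exact decoder source):

```
#include <stddef.h>

/* A decoded offset of 0 can only come from corrupted input. Mapping it to
 * (size_t)(-1) guarantees the distance test in execSequence() rejects it,
 * whereas mapping it to 1 would let the sequence pass silently. */
static size_t sanitize_offset(size_t offset)
{
    offset -= !offset;   /* 0 -> (size_t)(-1); any nonzero value unchanged */
    return offset;
}
```

Since `oLitEnd >= virtualStart`, the subtraction in the test above cannot underflow, and `(size_t)(-1)` exceeds any valid distance, so the corruption is flagged deterministically.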
The new method only defines a sub-block boundary when
it believes that the sub-block is compressible.
It's effectively an optimization,
avoiding a full compression cycle just to reach the same conclusion.
Note that the size of individual compressed blocks will vary more wildly with this modification.
But it seems good enough for a first iteration, and it fixes the speed regression issue.
Further refinements can be attempted later.
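To make the idea concrete, here is a hypothetical sketch of such a gate (invented names, not the actual zstd heuristic): a cheap entropy-based size estimate replaces the full compression pass when deciding whether a sub-block boundary is worth defining.

```
#include <math.h>
#include <stddef.h>

/* Hypothetical estimator: Shannon entropy of the byte histogram gives a
 * lower bound on the compressed payload size. */
static size_t estimate_compressed_size(const unsigned char* src, size_t srcSize)
{
    size_t count[256] = {0};
    double bits = 0.0;
    size_t i;
    for (i = 0; i < srcSize; i++) count[src[i]]++;
    for (i = 0; i < 256; i++) {
        if (count[i]) {
            double const p = (double)count[i] / (double)srcSize;
            bits -= (double)count[i] * log2(p);
        }
    }
    return (size_t)(bits / 8) + 8;  /* payload estimate + rough header cost */
}

/* Define a boundary only when the estimate predicts actual compression. */
static int should_emit_subblock_boundary(const unsigned char* src, size_t srcSize)
{
    return srcSize > 0 && estimate_compressed_size(src, srcSize) < srcSize;
}
```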
While it's not always strictly a win,
it is a win for files that see a noticeable compression ratio increase,
and it amounts to very small noise for other files.
The downside is that this patch is less efficient for arrays of 32-bit integers
than the previous patch (which, however, introduced losses for other files),
but it's still a net improvement in that scenario.