Reduces memory consumption by ~4MB for 1080p video with a maximum delay
of 16 frames by packing various information related to MIP:
* intra_mip_flag, 1 bit
* intra_mip_transposed_flag, 1 bit
* intra_mip_mode, 4 bits
Into a single byte.
Co-authored-by: Nuo Mi <nuomi2021@gmail.com>
Signed-off-by: Frank Plowman <post@frankplowman.com>
There is an issue with the constants used in YUV to YUV range conversion,
where the upper bound is not respected when converting to mpeg range.
With this commit, the constants are calculated at runtime, depending on
the bit depth. This approach also allows us to more easily understand how
the constants are derived.
For bit depths <= 14, the number of fixed point bits has been set to 14
for all conversions, to simplify the code.
For bit depths > 14, the number of fixed points bits has been raised and
set to 18, to allow for the conversion to be accurate enough for the mpeg
range to be respected.
The convert functions now take the conversion constants (coeff and offset)
as function arguments.
For bit depths <= 14, coeff is unsigned 16-bit and offset is 32-bit.
For bit depths > 14, coeff is unsigned 32-bit and offset is 64-bit.
x86_64:
chrRangeFromJpeg8_1920_c: 2127.4 2125.0 (1.00x)
chrRangeFromJpeg16_1920_c: 2325.2 2127.2 (1.09x)
chrRangeToJpeg8_1920_c: 3166.9 3168.7 (1.00x)
chrRangeToJpeg16_1920_c: 2152.4 3164.8 (0.68x)
lumRangeFromJpeg8_1920_c: 1263.0 1302.5 (0.97x)
lumRangeFromJpeg16_1920_c: 1080.5 1299.2 (0.83x)
lumRangeToJpeg8_1920_c: 1886.8 2112.2 (0.89x)
lumRangeToJpeg16_1920_c: 1077.0 1906.5 (0.56x)
aarch64 A55:
chrRangeFromJpeg8_1920_c: 28835.2 28835.6 (1.00x)
chrRangeFromJpeg16_1920_c: 28839.8 32680.8 (0.88x)
chrRangeToJpeg8_1920_c: 23074.7 23075.4 (1.00x)
chrRangeToJpeg16_1920_c: 17318.9 24996.0 (0.69x)
lumRangeFromJpeg8_1920_c: 15389.7 15384.5 (1.00x)
lumRangeFromJpeg16_1920_c: 15388.2 17306.7 (0.89x)
lumRangeToJpeg8_1920_c: 19227.8 19226.6 (1.00x)
lumRangeToJpeg16_1920_c: 15387.0 21146.3 (0.73x)
aarch64 A76:
chrRangeFromJpeg8_1920_c: 6324.4 6268.1 (1.01x)
chrRangeFromJpeg16_1920_c: 6339.9 11521.5 (0.55x)
chrRangeToJpeg8_1920_c: 9656.0 9612.8 (1.00x)
chrRangeToJpeg16_1920_c: 6340.4 11651.8 (0.54x)
lumRangeFromJpeg8_1920_c: 4422.0 4420.8 (1.00x)
lumRangeFromJpeg16_1920_c: 4420.9 5762.0 (0.77x)
lumRangeToJpeg8_1920_c: 5949.1 5977.5 (1.00x)
lumRangeToJpeg16_1920_c: 4446.8 5946.2 (0.75x)
NOTE: all simd optimizations for range_convert have been disabled.
they will be re-enabled when they are fixed for each architecture.
NOTE2: the same issue still exists in rgb2yuv conversions, which is not
addressed in this commit.
The existing av_csp_trc_func_from_id() mostly implements the OETF, except for
PQ. As such, we are currently missing a precise definition of an ITU-R EOTF.
Introduce the new functions av_csp_itu_eotf() and av_csp_itu_eotf_inv(), to fill
this void. Note that this is not possible in all cases, e.g. AVCOL_TRC_LOG which
has no corresponding EOTF definition in any ITU-R standard.
Note that we cannot implement the proper HLG and SMPTE 428 OOTFs without access
to all three color channels, because they are not independent per channel. As a
result, we need to define them on double[3] instead of double (*func)(double).
This explanation was inaccurate and highly misleading. The new wording is taken
more or less directly from ITU-T H.273, and also matches my understanding of
these functions.
The basic problem here is that the rgb*ToUV_half_* functions hard-code a
bilinear downsample from src[i] + src[i+1], with no bounds check on the i+1
access.
Due to the signature of the function, we cannot easily plumb the "true" width
into the function body to perform a bounds check. Similarly, we cannot easily
pre-pad the input because it is typically reading from the (const) input
frame, which would require a full memcpy to pad. Either of these solutions are
more trouble than the feature is worth, so just disable it on odd input sizes.
Fixes: use of uninitialized value
Fixes: ticket #11265
Signed-off-by: Niklas Haas <git@haasn.dev>
Sponsored-by: Sovereign Tech Fund
WASI mssing signal and siglongjmp support. This patch workaround
build error and add simd128 flag. Please note that many tests use
large array on stack, so you need to increase the stack size when
build checkasm, e.g., --extra-ldflags='-Wl,-z,stack-size=10485760'
Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
And add wasm simd128 flag, so we can add simd128 optimizations.
It can be enabled by put -msimd128 to extra cflags. There is
no runtime detection on simd128 support yet. I think that needs to
be done with JavaScript.
Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
Fixes: 70991/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_WEBP_fuzzer-5544067620995072
Fixes: use of uninintailized value
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Fixes: Use of uninitialized value
Fixes: 71350/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_ILBC_fuzzer-6322020827070464
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
The rv60_qp_to_idx table only supports qp up to 31 on intra
Fixes: global-buffer-overflow
Fixes: 377543818/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_RV60_fuzzer-5160167345291264
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Reviewed-by: Peter Ross <pross@xvid.org>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Note ISO/IEC 13818-1 defines an Extension_descriptor with descriptor_tag value
0x3f (63), so I kept the DVB comment.
I don't know what defines stream_type value 0x8a as DTS.
I don't have any Blu-ray standards so I don't know where those stream_type
values are defined.
Signed-off-by: Marton Balint <cus@passwd.hu>
H.266 (V3) section 7.4.12.8: "The value of lMvd[ compIdx ] shall be in
the range of −2^{17} to 2^{17} − 1, inclusive."
Signed-off-by: Frank Plowman <post@frankplowman.com>
Per H.266 (V3) section 7.4.3.19, LmcsMaxBinIdx is set equal to
15 - lmcs_delta_max_bin_idx. The previous code instead had it equal to
15 - lmcs_min_bin_idx. This could cause decoder mismatches.
Signed-off-by: Frank Plowman <post@frankplowman.com>
Previously, the code only stored the MIP mode and transpose flag in the
relevant tables at the top-left corner of the CU. This information ends
up being retrieved in ff_vvc_intra_pred_* not based on the CU position
but instead the transform unit position (specifically, using the x0 and
y0 from get_luma_predict_unit). There might be multiple transform units
in a CU, hence the top-left corner of the transform unit might not
coincide with the top-left corner of the CU. Consequently, we need to
store the MIP information at all positions in the CU, not only its
top-left corner, as we already do for the MIP flag.
Signed-off-by: Frank Plowman <post@frankplowman.com>
The final parameter of check_available determines whether the motion
estimation region constraints imposed in section 8.5.2.3 of H.266 (V3)
on MVP candidates apply to the current candidate or not. In the case of
IBC spatial merge candidates they do not, as their availability is
dependent only on the criteria described in sections 8.6.2.3 and 6.4.4,
which do not include this constraint on the motion estimation region.
Signed-off-by: Frank Plowman <post@frankplowman.com>
MinQtLog2SizeIntraC is usually (eq. (51) from VVCv3) defined as
sps_log2_diff_min_qt_min_cb_intra_slice_chroma + MinCbLog2SizeY
However, in the case ph_log2_diff_min_qt_min_cb_intra_slice_chroma is
present, it is instead (eq. (83) from VVCv3) defined as
ph_log2_diff_min_qt_min_cb_intra_slice_chroma + MinCbLog2SizeY
When ph_log2_diff_max_bt_min_qt_intra_slice_chroma and
ph_log2_diff_max_tt_min_qt_intra_slice_chroma are present, so is
ph_log2_diff_min_qt_min_cb_intra_slice_chroma, and so we should use the
second definition of MinQtLog2SizeIntraC, rather than the first, when
calculating the bounds for these syntax elements.
Signed-off-by: Frank Plowman <post@frankplowman.com>
This does not replicate on my setup, thus this is a blind fix based on ossfuzz trace
Fixes: use of uninitialized value
Fixes: 71747/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_HEVC_fuzzer-5427736120721408
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Fixes: Use of uninitialized value
Fixes: 71551/clusterfuzz-testcase-minimized-ffmpeg_dem_QCP_fuzzer-4647386712965120
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Fixes: Use of uninitialized memory
Fixes: 71546/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_EATGQ_fuzzer-5607656650244096
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Fixes: integer overflow: -2147483648 - 1 cannot be represented in type 'int'
Fixes: 373971762/clusterfuzz-testcase-minimized-ffmpeg_dem_DXA_fuzzer-4880491112103936
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Fixes: use of uninitialized memory in hScale16To15_c()
Fixes: 373924007/clusterfuzz-testcase-minimized-ffmpeg_SWS_fuzzer-5841199968092160
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
If the function is not inlined, this is more efficient. Also
it allows calling refill() without the check
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>