This patch implements ff_hscale_8_to_15_neon with NEON fused multiply accumulate
and bumps the vectorization factor from 2 to 4.
The speedup is of 25% on Graviton1 A1 instances based on A-72 cpus:
$ ffmpeg -nostats -f lavfi -i testsrc2=4k:d=2 -vf bench=start,scale=1024x1024,bench=stop -f null -
before: t:0.040303 avg:0.040287 max:0.040371 min:0.039214
after: t:0.032168 avg:0.032215 max:0.033081 min:0.032146
The speedup is of 39% on Graviton2 m6g instances based on Neoverse-N1 cpus:
$ ffmpeg -nostats -f lavfi -i testsrc2=4k:d=2 -vf bench=start,scale=1024x1024,bench=stop -f null -
before: t:0.019446 avg:0.019423 max:0.019493 min:0.019181
after: t:0.014015 avg:0.014096 max:0.015018 min:0.013971
Tested with `make check` on aarch64-linux.
Signed-off-by: Sebastian Pop <spop@amazon.com>
Reviewed-by: Jean-Baptiste Kempf <jb@videolan.org>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Once removed in 4a9bab3db0.
Introduced again in b25cd7540e.
Signed-off-by: Linjie Fu <fulinjie@zju.edu.cn>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
5 cabac states for cbf_cb and cbf_cr are supported according to
Table 9-4.
Add a test for 64x64 4:4:4 8bit HEVC clips with TUDepth = 4, cbf_cr > 0.
Signed-off-by: Xu Guangxin <guangxin.xu@intel.com>
Signed-off-by: Linjie Fu <linjie.fu@intel.com>
Signed-off-by: James Almer <jamrial@gmail.com>
The max transform depth is 5(from 0 to 4), so we need 5 cabac states for
cbf_cb and cbf_cr.
See Table 9-4 for details.
Signed-off-by: Xu Guangxin <guangxin.xu@intel.com>
Signed-off-by: Linjie Fu <linjie.fu@intel.com>
Signed-off-by: James Almer <jamrial@gmail.com>
Wavelet types with large amounts of overreading/writing like 9_7 would
write into the padding at high wavelet depths, which would remain and be
read by the next frame's transform and quickly cause artifacts to appear
for subsequent frames.
This fix affects all frames encoded with a non-power-of-two width, with
the artifacts varying between non-observable to very noticeable,
depending on encoder settings, so reencoding is advisable.
When testing on a memory limited system, these tests consume a
significant amount of memory and can often fail if testing by running
multiple processes in parallel.
Signed-off-by: Martin Storsjö <martin@martin.st>
Currently, assigning new buffer for pkt when multiple buffers were returned
from vaMapBuffer will overwrite the previous encoded pkt data and lead
to encode issues.
Iterate through the buf_list first to find out the total buffer size
needed for the pkt, allocate the whole pkt to avoid repeated reallocation
and memcpy, then copy data from each buf to pkt.
Signed-off-by: Linjie Fu <linjie.fu@intel.com>
It performs HDR(High Dynamic Range) to SDR(Standard Dynamic Range) conversion
with tone-mapping. It only supports HDR10 as input temporarily.
An example command to use this filter with vaapi codecs:
FFMPEG -hwaccel vaapi -vaapi_device /dev/dri/renderD128 -hwaccel_output_format vaapi \
-i INPUT -vf 'tonemap_vaapi=format=p010' -c:v hevc_vaapi -profile 2 OUTPUT
Signed-off-by: Xinpeng Sun <xinpeng.sun@intel.com>
Signed-off-by: Zachary Zhou <zachary.zhou@intel.com>
Signed-off-by: Ruiling Song <ruiling.song@intel.com>
payload_count is used to track the number of SEI payloads. It is also
used to free the SEIs in cbs_h264_free_sei()/cbs_h265_free_sei().
Currently, payload_count is set after for loop is completed. Hence if
there is an error and the function exits, the payload remains zero
causing a memleak.
This commit keeps track of payload_count inside the for loop to fix the
issue. Note that that the contents of current are initialized with
av_mallocz() so there is no need to zero initialize payload_count.
Found-by: libFuzzer
Reviewed-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
Signed-off-by: Andriy Gelman <andriy.gelman@gmail.com>
There can be at most 31 SPS and 255 PPS in the mp4/Matroska extradata.
Given that each has a size of at most 2^16-1, the length of the output
derived from these parameter sets can never overflow an ordinary 32 bit
integer. So use a simple uint32_t instead of uint64_t and replace the
unnecessary check with an av_assert1.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Fixes: signed integer overflow: -2147483648 - 13 cannot be represented in type 'int'
Fixes: 18893/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_ADPCM_IMA_APC_fuzzer-5630760442920960
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Fixes: signed integer overflow: 2038337026 + 109343477 cannot be represented in type 'int'
Fixes: 18886/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_WMALOSSLESS_fuzzer-5673660505653248
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Fixes: signed integer overflow: 1677721600 * 32 cannot be represented in type 'int'
Fixes: 18885/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_FFWAVESYNTH_fuzzer-5741242185154560
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Fixes: left shift of 1 by 31 places cannot be represented in type 'int'
Fixes: 18860/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_WMAPRO_fuzzer-5755223125786624
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Fixes: left shift of negative value -1
Fixes: 18859/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_ADPCM_XA_fuzzer-5748474213040128
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Fixes: left shift of 1 by 31 places cannot be represented in type 'int'
Fixes: 18817/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_WMALOSSLESS_fuzzer-5713317180211200
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Fixes: out of array read
Fixes: 19309/clusterfuzz-testcase-minimized-ffmpeg_BSF_MP3_HEADER_DECOMPRESS_fuzzer-5651002950942720
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Reviewed-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
The documentation still mentions numerical constants in addition to textual
ones. It is also wrong to use distinct modes as flags and it disallows us to
actually use the flags field for real flags in the future.
Signed-off-by: Marton Balint <cus@passwd.hu>
"It is a requirement of bitstream conformance that num_y_points is less than or equal to 14."
Fixes: index 24 out of bounds for type 'uint8_t [24]'
Fixes: 19282/clusterfuzz-testcase-minimized-ffmpeg_BSF_AV1_FRAME_MERGE_fuzzer-5747424845103104
Note, also needs a23dd33606
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Reviewed-by: jamrial
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
The IVF muxer autoinserts the av1_metadata filter unconditionally, which is
not desirable for these tests.
Signed-off-by: James Almer <jamrial@gmail.com>
These functions aren't available when building for the restricted
UWP/WinRT/WinStore API subsets.
Normally when building in this mode, one is probably only building
the libraries, but being able to build ffmpeg.exe still is useful
(and a ffmpeg.exe targeting these API subsets still can be run
e.g. in wine, for testing).
Signed-off-by: Martin Storsjö <martin@martin.st>
As the values generated by av_bmg_get can be arbitrarily large
(only the stddev is specified), we can't use a fixed tolerance.
Calculate a dynamic tolerance (like in float_dsp from 38f966b222),
based on the individual steps of the calculation.
This fixes running this test with certain seeds, when built with
clang for mingw/x86_32.
Signed-off-by: Martin Storsjö <martin@martin.st>
Fixes: out of array read
Fixes: 19129/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_V210_fuzzer-5068171023482880
Maybe fixes: 19130/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_V210_fuzzer-5637264407527424
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Reviewed-by: Limin Wang <lance.lmwang@gmail.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
These functions already free it themselves before they allocate the new
extradata.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
It is not uncommon to find code where the caller thinks to know better
what the return value should be than the callee. E.g. something like
"if (av_new_packet(pkt, size) < 0) return AVERROR(ENOMEM);". This commit
changes several instances of this to instead forward the actual error.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>