checkasm benchmarks on 1.5 GHz Cortex-A72 are as follows. Note that the C
version can still outperform the NEON version in specific cases. The balance
between different code paths is stream-dependent, but in practice the best
case happens about 5% of the time, the worst case happens about 40% of the
time, and the complexity of the remaining cases fall somewhere in between.
Therefore, taking the average of the best and worst case timings is
probably a conservative estimate of the degree by which the NEON code
improves performance.
vc1dsp.vc1_h_loop_filter4_bestcase_c: 10.7
vc1dsp.vc1_h_loop_filter4_bestcase_neon: 43.5
vc1dsp.vc1_h_loop_filter4_worstcase_c: 184.5
vc1dsp.vc1_h_loop_filter4_worstcase_neon: 73.7
vc1dsp.vc1_h_loop_filter8_bestcase_c: 31.2
vc1dsp.vc1_h_loop_filter8_bestcase_neon: 62.2
vc1dsp.vc1_h_loop_filter8_worstcase_c: 358.2
vc1dsp.vc1_h_loop_filter8_worstcase_neon: 88.2
vc1dsp.vc1_h_loop_filter16_bestcase_c: 51.0
vc1dsp.vc1_h_loop_filter16_bestcase_neon: 107.7
vc1dsp.vc1_h_loop_filter16_worstcase_c: 722.7
vc1dsp.vc1_h_loop_filter16_worstcase_neon: 140.5
vc1dsp.vc1_v_loop_filter4_bestcase_c: 9.7
vc1dsp.vc1_v_loop_filter4_bestcase_neon: 43.0
vc1dsp.vc1_v_loop_filter4_worstcase_c: 178.7
vc1dsp.vc1_v_loop_filter4_worstcase_neon: 69.0
vc1dsp.vc1_v_loop_filter8_bestcase_c: 30.2
vc1dsp.vc1_v_loop_filter8_bestcase_neon: 50.7
vc1dsp.vc1_v_loop_filter8_worstcase_c: 353.0
vc1dsp.vc1_v_loop_filter8_worstcase_neon: 69.2
vc1dsp.vc1_v_loop_filter16_bestcase_c: 60.0
vc1dsp.vc1_v_loop_filter16_bestcase_neon: 90.0
vc1dsp.vc1_v_loop_filter16_worstcase_c: 714.2
vc1dsp.vc1_v_loop_filter16_worstcase_neon: 97.2
Signed-off-by: Ben Avison <bavison@riscosopen.org>
Signed-off-by: Martin Storsjö <martin@martin.st>
Fixes: Out of array read
Fixes: 45137/clusterfuzz-testcase-minimized-ffmpeg_BSF_VP9_SUPERFRAME_SPLIT_fuzzer-4984270639202304
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
They are invalid in VP9. If any of the frames inside a superframe
had a size of zero, the code would either read into the next frame
or into the superframe index; so check for the length to stop this.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Packets without data need to be handled specially in order to avoid
undefined reads. Pass these packets through unchanged in case there
are no cached packets* and error out in case there are cached packets:
Returning the packet would mess with the order of the packets;
if one returned the zero-sized packet before the superframe that will
be created from the packets in the cache, the zero-sized packet would
overtake the packets in the cache; if one returned the packet later,
the packets that complete the superframe will overtake the zero-sized
packet.
*: This case e.g. encompasses the scenario of updated extradata
side-data at the end.
Fixes: Out of array read
Fixes: 45722/clusterfuzz-testcase-minimized-ffmpeg_BSF_VP9_SUPERFRAME_fuzzer-5173378975137792
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
The existing x86 assembly for loop filters uses the stride as a
full register without clearing/sign extending the upper half
of the registers on x86_64.
This avoids crashes if the caller would have passed nonzero bits
in the previously undefined upper 32 bits of the parameters.
Signed-off-by: Martin Storsjö <martin@martin.st>
Fixes: Out of array write
Fixes: 45613/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_WMALOSSLESS_fuzzer-4539073606320128
Fixes: 46008/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_WMALOSSLESS_fuzzer-4681245747970048
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: James Almer <jamrial@gmail.com>
Fixes: NULL pointer dereference
Fixes: 45955/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_BINKAUDIO_DCT_fuzzer-4842044192849920
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Reviewed-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Fixes: division by zero
Fixes: 45811/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_VMDAUDIO_fuzzer-6412592581574656
Fixes: 45979/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_VMDAUDIO_fuzzer-5362043060879360
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Modifying the main context by a slice thread is racy;
so constify the pointer to it in H264SliceContext.
The code itself was already compatible with this change.
Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Since 7be2d2a70c only one context
is used. Moving it to H264Context from H264SliceContext is natural.
One could access the ERContext from H264SliceContext
via H264SliceContext.h264->er; yet H264SliceContext.h264 should
naturally be const-qualified, because slice threads should not
modify the main context. The ERContext is an exception
to this, as ff_er_add_slice() is intended to be called simultaneously
by multiple threads. And for this one needs a pointer whose
pointed-to-type is not const-qualified.
Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
ff_er_frame_start() initializes ERContext.error_count
to three times the number of macroblocks to decode.
Later ff_er_add_slice() reduces this number by the amount
of macroblocks whose AC resp. DC resp. MV have been finished
(so every correctly decoded MB counts three times).
So the frame has been decoded correctly if error_count is zero
at the end.
The H.264 decoder uses multiple ERContexts when using
slice threading and therefore combines these error counts:
The first slice's ERContext is intended to be initialized
by ff_er_frame_start(), error_count of all the other
slice contexts is intended to be zeroed initially and
all afterwards all the error_counts are summed.
Yet commit 43b434210e
(probably unintentionally) changed the code to set
the first slice's error_count to zero as well.
This leads to bogus error messages in case one decodes
an input video using multiple slices with slice threading
with error concealment enabled (which is not the default)
("concealing 0 DC, 0 AC, 0 MV errors in [IPB] frame");
furthermore the returned frame is marked as corrupt as well
(ffmpeg reports "corrupt decoded frame in stream %d" for this).
This can be fixed easily given that only the first ERContext
is really used since 7be2d2a70c:
Don't reset the error_count; and don't sum the error counts as well.
Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
This patch is analogous to 20f9727018:
It hides the internal part of AVBitStreamFilter by adding a new
internal structure FFBitStreamFilter (declared in bsf_internal.h)
that has an AVBitStreamFilter as its first member; the internal
part of AVBitStreamFilter is moved to this new structure.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Otherwise get_pixel_format() will not be called when parsing a subsequent Sequence
Header in non hwaccel enabled scenarios, allowing frame parsing when it shouldn't.
This prevents the scenario seqhdr -> frame_hdr/redundant_frame_hdr -> seqhdr ->
redundant_frame_hdr from having the latter redundant frame header parsed as if it
was a frame header by the decoder because the former was discarded.
Since CBS did not discard it, the latter redundant frame header is output with a
zeroed AV1RawFrameHeader struct, which can have undesired results, like division
by zero with fields normally guaranteed to be anything else.
Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: James Almer <jamrial@gmail.com>
It is a more fitting place for them.
Also move the definition of ff_log2_run to mathtables.c.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
bitstream.c is currently the disjoint union of three parts:
The first part is ff_log2_run, the second part are some auxiliary
functions for the PutBits-API; and the third part is the code
for creating VLCs. This commit moves the latter into a file of its own.
This has the advantage of making one of the hacks in tableprint_vlc.h
redundant as vlc.c does not include config.h (whereas the PutBits-API
part does).
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Effectively reverts eaff1aa09e
given that bitswap_32 is no longer used outside of bitstream.c
since 03008c2811.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Fixes: member access within null pointer of type 'const FFCodec' (aka 'const struct FFCodec')
Fixes: 45726/clusterfuzz-testcase-minimized-ffmpeg_DEMUXER_fuzzer-6554445419249664
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Reviewed-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Fixes: division by zero
Fixes: 43769/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_AV1_fuzzer-5392562205097984
Fixes: 43950/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_AV1_fuzzer-5769210217758720
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
avctx->ch_layout will be reinitialized using channel_mask later in the
function.
Fixes: 45736/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_WMAPRO_fuzzer-5769886813519872
Signed-off-by: James Almer <jamrial@gmail.com>
Fixes: shift exponent 33 is too large for 32-bit type 'int'
Fixes: 45645/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_TRUEHD_fuzzer-5651350182035456
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Reviewed-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Fixes: signed integer overflow: 11494 * 1073741824000000 cannot be represented in type 'long'
Fixes: 26586/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_PIXLET_fuzzer-5752633970917376
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
This structure is no longer declared in a public header,
so using an FF-prefix is more appropriate.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Up until now, codec.h contains both public and private parts
of AVCodec. This exposes the internals of AVCodec to users
and leads them into the temptation of actually using them
and forces us to forward-declare structures and types that
users can't use at all.
This commit changes this by adding a new structure FFCodec to
codec_internal.h that extends AVCodec, i.e. contains the public
AVCodec as first member; the private fields of AVCodec are moved
to this structure, leaving codec.h clean.
Reviewed-by: Anton Khirnov <anton@khirnov.net>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Also move FF_CODEC_TAGS_END as well as struct AVCodecDefault.
This reduces the amount of files that have to include internal.h
(which comes with quite a lot of indirect inclusions), as e.g.
most encoders don't need it. It is furthemore in preparation
for moving the private part of AVCodec out of the public codec.h.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
No need to use a Custom layout when the non diegetic channels can be
described as a standard mask.
This fixes:
45684/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_LIBOPUS_fuzzer-5039410989629440
Signed-off-by: James Almer <jamrial@gmail.com>
They return nicer error messages on error; furthermore,
they also use our allocation functions. It also stops
calling deflateEnd() on a z_stream that might not have been
successfully initialized.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
They emit better error messages (it does not claim that inflateInit
failed upon an error from deflateInit!) and uses our allocation functions.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
The rationale is the same as for the wrappers for inflateInit(),
although the case for it is admittedly not so strong because
there are less users of deflateInit().
Also remove an unnecessary inclusion of config.h in
libavformat/protocols.c in order to trigger a request for reconfigure
(which is needed for CONFIG_DEFLATE_WRAPPER to take effect).
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Instead reuse and reset a single z_stream.
Also use FFZStream in decode_zbuf(), because it has nicer error
messages.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
This fixes the problem of potentially closing a z_stream
that has never been successfully initialized.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Returns better error messages in case of error and deduplicates
the inflateInit() code and also allows to cleanup generically
in case of errors as it is save to call ff_inflate_end() if
ff_inflate_init() has not been called successfully.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
This fixes the problem of potentially closing a z_stream
that has never been successfully initialized.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>