The guess_palette() implementation is questionable in itself
as its results don't match those from other DVD subtitle decoders.
This commit starts cleanup by fixing an obvious bug which has made
certain DVD subs appear yellow instead of white or grey for more than
10 years..
Signed-off-by: softworkz <softworkz@hotmail.com>
Signed-off-by: rcombs <rcombs@rcombs.me>
A bug was found in dav1d <= 1.0.0 where the event flag New Sequence Header would
not be signaled for some samples using delayed random access points.
It has since been fixed, but nonetheless it's best to ensure the AVCodecContext
is filled with parameters when parsing the first frame, regardless of what events
were signaled.
Fixes ticket #9694.
Signed-off-by: James Almer <jamrial@gmail.com>
If the svt equivalent option to an avctx AVOption is passed by the user
then it should have priority. The exception are fields like dimensions, bitdepth
and pixel format, which must match what lavc will feed the encoder after init.
This addresses libsvt-av1 issue #1858.
Signed-off-by: James Almer <jamrial@gmail.com>
Qsv encoder only support input P010 and nv12 format directly from system
memory. For other format, we need to upload frame to device memory and
input qsv format to encoder. Now add other system memory format support
to qsv encoder.
Signed-off-by: Wenbin Chen <wenbin.chen@intel.com>
Qsv decoder only supports directly output nv12 and p010 to system
memory. For other format, we need to download frame from qsv format
to system memory. Now add other supported format to qsvdec.
Signed-off-by: Wenbin Chen <wenbin.chen@intel.com>
The init_pool_size is set to be 64 and it is too many.
Use IOSurfQuery to get NumFrameSuggest which is the suggested
number of frame that needed to be allocated when initializing the decoder.
Considering that the hevc_qsv encoder uses the most frame buffer,
async is 4 (default) and max_b_frames is 8 (default) and decoder
may followed by VPP, use NumFrameSuggest + 16 to set init_pool_size.
Signed-off-by: Wenbin Chen <wenbin.chen@intel.com>
Signed-off-by: Guangxin Xu <guangxin.xu@intel.com>
Since ffmpeg-qsv uses return value to reinit decoder, it doesn't need to
decode header each time. Move qsv_decode_header's position so that
it will be called only if codec needed to be reinitialized.
Rearrange the code of flushing decoder and re-init decoder operation.
Remove the buffer_count and use the got_frame to decide whether the
decoder is drain.
Signed-off-by: Wenbin Chen <wenbin.chen@intel.com>
Signed-off-by: Guangxin Xu <guangxin.xu@intel.com>
FFmpeg-qsv decoder reinit codec when width and height change, but there
are not only resolution change need to reinit codec. I change it to use
return value from DecodeFrameAsync() to decide whether decoder need to
be reinitialized.
Signed-off-by: Wenbin Chen <wenbin.chen@intel.com>
Signed-off-by: Guangxin Xu <guangxin.xu@intel.com>
In case the BSF has not been drained before flushing/closing,
the context's next_frame might be set; yet it is not freed
in flush or close. The former only zeroes it (which automatically
causes a leak in case it was set). So do this when closing
and flushing.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
This is possible, because every given FFCodec has to implement
exactly one of these. Doing so decreases sizeof(FFCodec) and
therefore decreases the size of the binary.
Notice that in case of position-independent code the decrease
is in .data.rel.ro, so that this translates to decreased
memory consumption.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
This increases type-safety by avoiding conversions from/through void*.
It also avoids the boilerplate "AVFrame *frame = data;" line
for non-subtitle decoders.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
This increases type-safety by avoiding conversions from/through void*.
It also avoids the boilerplate "AVSubtitle *sub = data;" line
for subtitle decoders. Its only downside is that it increases
sizeof(FFCodec), yet this can be more than offset lateron.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
The inputs are unused except for this computation so wraparound
does not give an attacker any extra values as they are already fully
controlled
Fixes: signed integer overflow: 0 - -2147483648 cannot be represented in type 'int'
Fixes: 45820/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_EXR_fuzzer-5766159019933696
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Reviewed-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Fixes: signed integer overflow: -128275513086 * -76056576 cannot be represented in type 'long'
Fixes: 45818/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_DIRAC_fuzzer-5129799149944832
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Fixes: signed integer overflow: -101 * 71041254 cannot be represented in type 'int'
Fixes: 45938/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_TAK_fuzzer-4687974320701440
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Fixes: signed integer overflow: -2146549696 - 3923884 cannot be represented in type 'int'
Fixes: 45907/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_APE_fuzzer-5992380584558592
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
It is currently a "Picture", an mpegvideo-specific type
that has a lot of baggage, all of which is unnecessary
for new_picture, because only its embedded AVFrame
is ever used. So just use an ordinary AVFrame.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
In the aforementioned case mpegvideo_enc.c calls
ff_mjpeg_encode_stuffing() at the end of every line which
pads the output to byte-alignment and escapes it;
yet it does not write the restart-markers (and also not
the DRI marker when writing the header) and so the output files
are broken.
Fix this by writing these markers depending upon the number of
slices and not the number of threads in use; this also makes
the output of the encoder reproducible given a slice count
and is therefore important if encoder tests that actually use
-threads auto are added in the future.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Our code for writing optimal huffman tables is incompatible
with using multiple slices and hence commit
884506dfe2e29a5b2bd2905ca4f17e277e32acb1 that implemented this
also added an assert that slice_context_count is always 1.
Yet this was always wrong: a) The MJPEG-encoder has (and had)
the AV_CODEC_CAP_SLICE_THREADS capability, so asserting that
it always uses one slice context is incorrect.
b) This commit did not add any proper checks that ensured that
optimal huffman tables are never used together with multiple slices.
This only happened with 03eb0515c12637dbd20c2e3ca8503d7b47cf583a.
c) This assert is at the wrong place: ff_mjpeg_encode_init() is
called before the actual slice_context_count is set. This is
the reason why this assert was never triggered.
Therefore this commit removes this assert.
Also remove an assert from the SpeedHQ encoder sharing b) and c).
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
One can use slices without slice-threading. The results for
mpegvideo-encoders are abysmal: AMV, SpeedHQ, H.263, RV10, RV20,
MSMPEG4v2, MSMPEG4v3 and WMV1 produce broken files.
WMV2 meanwhile expects the MpegEncContext given to ff_wmv2_encode_mb()
to be at the beginning of a Wmv2Context (a structure that this encoder
shares with the WMV2 decoder), yet this is only true for the
main context and not for the slice contexts, leading to segfaults.
SpeedHQ additionally triggers an av_assert2, because it is not
byte-aligned at a position where it ought to be byte-aligned.
Given that no codec not supporting slice threading works this commit
disallows using slices unless the encoder supports slice threading.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
checkasm benchmarks on 1.5 GHz Cortex-A72 are as follows.
vc1dsp.vc1_unescape_buffer_c: 918624.7
vc1dsp.vc1_unescape_buffer_neon: 142958.0
Signed-off-by: Ben Avison <bavison@riscosopen.org>
Signed-off-by: Martin Storsjö <martin@martin.st>
checkasm benchmarks on 1.5 GHz Cortex-A72 are as follows.
vc1dsp.vc1_unescape_buffer_c: 655617.7
vc1dsp.vc1_unescape_buffer_neon: 118237.0
Signed-off-by: Ben Avison <bavison@riscosopen.org>
Signed-off-by: Martin Storsjö <martin@martin.st>
checkasm benchmarks on 1.5 GHz Cortex-A72 are as follows. Note that the C
version can still outperform the NEON version in specific cases. The balance
between different code paths is stream-dependent, but in practice the best
case happens about 5% of the time, the worst case happens about 40% of the
time, and the complexity of the remaining cases fall somewhere in between.
Therefore, taking the average of the best and worst case timings is
probably a conservative estimate of the degree by which the NEON code
improves performance.
vc1dsp.vc1_h_loop_filter4_bestcase_c: 19.0
vc1dsp.vc1_h_loop_filter4_bestcase_neon: 48.5
vc1dsp.vc1_h_loop_filter4_worstcase_c: 144.7
vc1dsp.vc1_h_loop_filter4_worstcase_neon: 76.2
vc1dsp.vc1_h_loop_filter8_bestcase_c: 41.0
vc1dsp.vc1_h_loop_filter8_bestcase_neon: 75.0
vc1dsp.vc1_h_loop_filter8_worstcase_c: 294.0
vc1dsp.vc1_h_loop_filter8_worstcase_neon: 102.7
vc1dsp.vc1_h_loop_filter16_bestcase_c: 54.7
vc1dsp.vc1_h_loop_filter16_bestcase_neon: 130.0
vc1dsp.vc1_h_loop_filter16_worstcase_c: 569.7
vc1dsp.vc1_h_loop_filter16_worstcase_neon: 186.7
vc1dsp.vc1_v_loop_filter4_bestcase_c: 20.2
vc1dsp.vc1_v_loop_filter4_bestcase_neon: 47.2
vc1dsp.vc1_v_loop_filter4_worstcase_c: 164.2
vc1dsp.vc1_v_loop_filter4_worstcase_neon: 68.5
vc1dsp.vc1_v_loop_filter8_bestcase_c: 43.5
vc1dsp.vc1_v_loop_filter8_bestcase_neon: 55.2
vc1dsp.vc1_v_loop_filter8_worstcase_c: 316.2
vc1dsp.vc1_v_loop_filter8_worstcase_neon: 72.7
vc1dsp.vc1_v_loop_filter16_bestcase_c: 62.2
vc1dsp.vc1_v_loop_filter16_bestcase_neon: 103.7
vc1dsp.vc1_v_loop_filter16_worstcase_c: 646.5
vc1dsp.vc1_v_loop_filter16_worstcase_neon: 110.7
Signed-off-by: Ben Avison <bavison@riscosopen.org>
Signed-off-by: Martin Storsjö <martin@martin.st>
checkasm benchmarks on 1.5 GHz Cortex-A72 are as follows. Note that the C
version can still outperform the NEON version in specific cases. The balance
between different code paths is stream-dependent, but in practice the best
case happens about 5% of the time, the worst case happens about 40% of the
time, and the complexity of the remaining cases fall somewhere in between.
Therefore, taking the average of the best and worst case timings is
probably a conservative estimate of the degree by which the NEON code
improves performance.
vc1dsp.vc1_h_loop_filter4_bestcase_c: 10.7
vc1dsp.vc1_h_loop_filter4_bestcase_neon: 43.5
vc1dsp.vc1_h_loop_filter4_worstcase_c: 184.5
vc1dsp.vc1_h_loop_filter4_worstcase_neon: 73.7
vc1dsp.vc1_h_loop_filter8_bestcase_c: 31.2
vc1dsp.vc1_h_loop_filter8_bestcase_neon: 62.2
vc1dsp.vc1_h_loop_filter8_worstcase_c: 358.2
vc1dsp.vc1_h_loop_filter8_worstcase_neon: 88.2
vc1dsp.vc1_h_loop_filter16_bestcase_c: 51.0
vc1dsp.vc1_h_loop_filter16_bestcase_neon: 107.7
vc1dsp.vc1_h_loop_filter16_worstcase_c: 722.7
vc1dsp.vc1_h_loop_filter16_worstcase_neon: 140.5
vc1dsp.vc1_v_loop_filter4_bestcase_c: 9.7
vc1dsp.vc1_v_loop_filter4_bestcase_neon: 43.0
vc1dsp.vc1_v_loop_filter4_worstcase_c: 178.7
vc1dsp.vc1_v_loop_filter4_worstcase_neon: 69.0
vc1dsp.vc1_v_loop_filter8_bestcase_c: 30.2
vc1dsp.vc1_v_loop_filter8_bestcase_neon: 50.7
vc1dsp.vc1_v_loop_filter8_worstcase_c: 353.0
vc1dsp.vc1_v_loop_filter8_worstcase_neon: 69.2
vc1dsp.vc1_v_loop_filter16_bestcase_c: 60.0
vc1dsp.vc1_v_loop_filter16_bestcase_neon: 90.0
vc1dsp.vc1_v_loop_filter16_worstcase_c: 714.2
vc1dsp.vc1_v_loop_filter16_worstcase_neon: 97.2
Signed-off-by: Ben Avison <bavison@riscosopen.org>
Signed-off-by: Martin Storsjö <martin@martin.st>
Fixes: Out of array read
Fixes: 45137/clusterfuzz-testcase-minimized-ffmpeg_BSF_VP9_SUPERFRAME_SPLIT_fuzzer-4984270639202304
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
They are invalid in VP9. If any of the frames inside a superframe
had a size of zero, the code would either read into the next frame
or into the superframe index; so check for the length to stop this.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Packets without data need to be handled specially in order to avoid
undefined reads. Pass these packets through unchanged in case there
are no cached packets* and error out in case there are cached packets:
Returning the packet would mess with the order of the packets;
if one returned the zero-sized packet before the superframe that will
be created from the packets in the cache, the zero-sized packet would
overtake the packets in the cache; if one returned the packet later,
the packets that complete the superframe will overtake the zero-sized
packet.
*: This case e.g. encompasses the scenario of updated extradata
side-data at the end.
Fixes: Out of array read
Fixes: 45722/clusterfuzz-testcase-minimized-ffmpeg_BSF_VP9_SUPERFRAME_fuzzer-5173378975137792
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
The existing x86 assembly for loop filters uses the stride as a
full register without clearing/sign extending the upper half
of the registers on x86_64.
This avoids crashes if the caller would have passed nonzero bits
in the previously undefined upper 32 bits of the parameters.
Signed-off-by: Martin Storsjö <martin@martin.st>
Fixes: Out of array write
Fixes: 45613/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_WMALOSSLESS_fuzzer-4539073606320128
Fixes: 46008/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_WMALOSSLESS_fuzzer-4681245747970048
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: James Almer <jamrial@gmail.com>
Fixes: NULL pointer dereference
Fixes: 45955/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_BINKAUDIO_DCT_fuzzer-4842044192849920
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Reviewed-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Fixes: division by zero
Fixes: 45811/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_VMDAUDIO_fuzzer-6412592581574656
Fixes: 45979/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_VMDAUDIO_fuzzer-5362043060879360
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Modifying the main context by a slice thread is racy;
so constify the pointer to it in H264SliceContext.
The code itself was already compatible with this change.
Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>