The sample mpeg4/mpeg4_sstp_dpcm.m4v existed in the FATE-suite,
but it was surprisingly unused.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
In this case the macroblocks written to are smaller, yet
the MPEG-4 Simple Studio Profile code for 10bit DPCM ignored this;
e.g. in case of lowres = 2 or = 3, the sample mpeg4_sstp_dpcm.m4v
from the FATE-suite reads beyond the end of the buffer.
This commit fixes this by taking lowres into account.
The DPCM macroblocks of the aforementioned sample look
as good as can be expected after this patch; yet the non-DPCM
coded macroblocks are simply corrupt.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
jpeg2000_decode_tile() (which is run concurrently by several threads
when using slice threading) currently modifies some joint values
before doing its actual work. This is a data race that happens to work
because all threads set the same values; but it is nevertheless
undefined behaviour.
Fix this by performing said preparatory work in the main thread instead.
This fixes the vsynth(1|2|_lena)-jpeg2000(-97)? FATE-tests when using
TSAN and slice threading.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
When AV_CODEC_EXPORT_DATA_FILM_GRAIN is present, AV1 decoder should
disable film grain application and export the corresponding side data
Signed-off-by: Haihao Xiang <haihao.xiang@intel.com>
For vaapi if the init_pool_size is not zero, the pool size is fixed.
This means max surfaces is init_pool_size, but when mapping vaapi
frame to qsv frame, the init_pool_size < nb_surface. The cause is that
vaapi_decode_make_config() config the init_pool_size and it is called
twice. The first time is to init frame_context and the second time is to
init codec. On the second time the init_pool_size is changed to original
value so the init_pool_size is lower than the reall size because
pool_size used to initialize frame_context need to plus thread_count and
3 (guarantee 4 base work surfaces). Now add code to make sure
init_pool_size is only set once. Now the following commandline works:
ffmpeg -hwaccel vaapi -hwaccel_device /dev/dri/renderD128 \
-hwaccel_output_format vaapi -i input.264 \
-vf "hwmap=derive_device=qsv,format=qsv" \
-c:v h264_qsv output.264
Signed-off-by: Wenbin Chen <wenbin.chen@intel.com>
Makes Bulldozer prefer AVX functions rather than AVX2,
which are 64% slower:
AVX: 117653 decicycles in av_tx (fft), 1048535 runs, 41 skips
AVX2: 193385 decicycles in av_tx (fft), 1048561 runs, 15 skips
The only difference between both is that vgatherdpd is used in
the former. We don't want to mark them with the new SLOW_GATHER
flag however, since gathers are still faster on Haswell/Zen 2/3
than plain loads.
If a codelet initializes 2 subtransforms, and the second one fails,
the failure would free all subcontexts.
Instead, if there are subcontexts still left, don't free the array.
If all initializations fail, the init() function will return,
and reset_ctx() from the previous step will clean up all contained
subtransforms.
Fix CID: 1497864
The control flow should return ENOSYS if nb_cd_matches is 0 at before
and the ret equal AVERROR(ENOMEM) or goto end label, so remove the last
control flow if (ret >= 0) before end label.
Signed-off-by: Steven Liu <liuqi05@kuaishou.com>
Add intra refresh support to hevc_qsv as well.
Add an new intra refresh type: "horizontal", and an new param
ref_cycle_dist. This param specify the distance between the
beginnings of the intra-refresh cycles in frames.
Signed-off-by: Wenbin Chen <wenbin.chen@intel.com>
Signed-off-by: Haihao Xiang <haihao.xiang@intel.com>
Add b_strategy option to hevc_qsv. By enabling this option, encoder can
use b frames as reference.
Signed-off-by: Wenbin Chen <wenbin.chen@intel.com>
Signed-off-by: Haihao Xiang <haihao.xiang@intel.com>