Similarly to the previous changes, we don't need to synchronise
after a memcpy to device memory. On the other hand, we need to
keep synchronising after a copy to host memory, otherwise there's
no guarantee that subsequent host reads will return valid data.
We're also doing a sync here after copying the frame to be passed
on down the pipleine. And it is also unnecessary.
I was able to demonstrate a 33% speedup removing the sync from
an example transcode pipeline.
I put this call in by habit, rather than because there was any
actual need. The filter is simply processing frames one after
the other and has no need to synchronise.
malakudi on the devtalk forums noticed a slowdown when using nvenc
with temporal/spatial aq and that the slowdown went away if the
sync call was removed. I also verified that in the basic encoding
case there's an observable speedup.
I also verified that we aren't doing unnecessary sync calls in any
other filter.
The following are the newly added options:
arnr_max_frames, arnr_strength, aq_mode, denoise_noise_level, denoise_block_size,
rc_undershoot_pct, rc_overshoot_pct, minsection_pct, maxsection_pct, frame_parallel,
enable_cdef, enable_global_motion, and intrabc.
Also added macros for compiling for aom 1.0.0 and fixed the default values.
Signed-off-by: James Almer <jamrial@gmail.com>
cbs trace qsv vps header failed due to some reasons:
1. vps_temporal_id_nesting_flag is not set but spec required it must to
be 1 when vps_max_sub_layers_minus1 is equal to 0.
2. vps_num_hrd_parameters is not set and written.
3. other issues in ff_hevc_encode_nal_vps() (fixed in pervious commit_id: 520226b683).
Reproduce: ffmpeg -hwaccel qsv -v verbose -c:v h264_qsv -i bbb_sunflower_1080p_30fps_normal.mp4 -vframes 1
-c:v hevc_qsv -bsf:v trace_headers -f null -
Signed-off-by: Zhong Li <zhong.li@intel.com>
Right now, the code check for no filter description, but if we use a
filter_complex, the code will use the AVFrame.duration which could be
wrong in case of using fps filter.
How to reproduce the problem:
ffmpeg -f lavfi -i testsrc=duration=1 -vf fps=fps=50 -vsync 1 -f null -
output 50 frames
ffmpeg -f lavfi -i testsrc=duration=1 -filter_complex fps=fps=50 -vsync 1 -f null -
output 51 frames
With this commit, the same command will always output 50 frames.
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: Dong, Jerry <jerry.dong@intel.com>
Signed-off-by: Decai Lin <decai.lin@intel.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
There are many problems of current qsv trellis option:
1. Duplicated with AVCodecContext definition
2. MFX_TRELLIS_XXX is introduced by MSDK API 1.17
Currently Without MSDK API checking thus may cause compilation issue.
3. user is inclined to enable trellis when set "-trellis 1", but
actually it is to disable since MFX_TRELLIS_OFF is equal to 1.
4. It is too complex for user to enable trellis for every frame(I/P/B).
Just simply remove the private option, and switch to the AVCodecContext
definition. Compatibility should not a big problem (except can't exact map)
since the option name is same as AVCodecContext.
Signed-off-by: Zhong Li <zhong.li@intel.com>
Reviewed-by: Carl Eugen Hoyos <ceffmpeg@gmail.com>
Reviewed-by: Moritz Barsnick <barsnick@gmx.net>
Currectly just standard header path can be found,
check_type/struct will fail if vaapi is installed somewhere else.
Move them followed "check_pkg_config"
Reviewed-by: Mark Thompson <sw@jkqxz.net>
Reviewed-by: Timo Rothenpieler <timo@rothenpieler.org>
Signed-off-by: Zhong Li <zhong.li@intel.com>
The fields are deprecated in current vaapi,
setting them to 0 in old versions is fine
as FMO is not implemented.
Fixes the following warnings:
libavcodec/vaapi_h264.c:259:10: warning: 'num_slice_groups_minus1' is deprecated [-Wdeprecated-declarations]
.num_slice_groups_minus1 = pps->slice_group_count - 1,
^
libavcodec/vaapi_h264.c:260:10: warning: 'slice_group_map_type' is deprecated [-Wdeprecated-declarations]
.slice_group_map_type = pps->mb_slice_group_map_type,
^
libavcodec/vaapi_h264.c:261:10: warning: 'slice_group_change_rate_minus1' is deprecated [-Wdeprecated-declarations]
.slice_group_change_rate_minus1 = 0, /* FMO is not implemented */
^
Reviewed-by: Mark Thompson
Fixes the following compilation warnings:
libavcodec/vaapi_hevc.c:155:21: warning: initializer overrides prior initialization of this subobject [-Winitializer-overrides]
.pic_fields.bits = {
~^~~~
libavcodec/vaapi_hevc.c:125:57: note: previous initialization is here
.pic_fields.value = 0,
^
libavcodec/vaapi_hevc.c:175:31: warning: initializer overrides prior initialization of this subobject [-Winitializer-overrides]
.slice_parsing_fields.bits = {
~^~~~
libavcodec/vaapi_hevc.c:126:57: note: previous initialization is here
.slice_parsing_fields.value = 0,
Reviewed-by: Mark Thompson
Fixes: NULL pointer dereference and out of array access
Fixes: 13871/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_HEVC_fuzzer-5746167087890432
Fixes: 13845/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_HEVC_fuzzer-5650370728034304
This also fixes the return code for explode mode
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Reviewed-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
./ffmpeg -f lavfi -i yuvtestsrc=duration=1:size=1200x1440 \
-s 1200x1440 -f null -vframes 100 -pix_fmt $i -nostats \
-cpuflags 0 -v error -
This uses 32-bit mul, so POWER8 only.
The following output formats get about 4.5x speedup:
rgb24
39980 UNITS in yuv2packed1, 32768 runs, 0 skips
8774 UNITS in yuv2packed1, 32768 runs, 0 skips
bgr24
40069 UNITS in yuv2packed1, 32768 runs, 0 skips
8772 UNITS in yuv2packed1, 32766 runs, 2 skips
rgba
39759 UNITS in yuv2packed1, 32768 runs, 0 skips
8681 UNITS in yuv2packed1, 32767 runs, 1 skips
bgra
39729 UNITS in yuv2packed1, 32768 runs, 0 skips
8696 UNITS in yuv2packed1, 32766 runs, 2 skips
argb
39766 UNITS in yuv2packed1, 32768 runs, 0 skips
8672 UNITS in yuv2packed1, 32766 runs, 2 skips
bgra
39784 UNITS in yuv2packed1, 32768 runs, 0 skips
8659 UNITS in yuv2packed1, 32767 runs, 1 skips
AVFMT_NOTIMESTAMPS may be using in AVInputFormat.flags for demuxing
Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: Jun Zhao <barryjzhao@tencent.com>
The lensfun filter wraps the lensfun library which performs
transformations on videos to correct for lens distortion. Often this
results in areas in the input being mapped to areas that fall outside
the boundaries of the output. The library has a parameter called scale
which is a scale factor applied to the output video. By decreasing it it
is possible to regain the areas of the video which would otherwise have
been lost. There is a special value of 0 which indicates that the
library should automatically determine a scale factor that results in
the output frame being filled (i.e. little or no black/unmapped areas).
This patch adds a corresponding scale option to the lensfun filter which
is passed through to the library. The existing behaviour of using the
automatic value of 0 is retained as the default behaviour, while other
values will be passed through to the library.
Signed-off-by: Daniel Playfair Cal <daniel.playfair.cal@gmail.com>
Fixes: Timeout (longer than i had patience for -> 2sec)
Fixes: 13205/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_PROSUMER_fuzzer-5105644481282048
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Fixes: signed integer overflow: 1111638592 - -2122219136 cannot be represented in type 'int'
Fixes: 13441/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_TRUEMOTION2_fuzzer-5732769815068672
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
The first frame contains the sequence header, which is needed to parse every
following frame.
This fixes parsing streams with broken extradata but correct packet data.
Signed-off-by: James Almer <jamrial@gmail.com>
* commit '0676de935b1e81bc5b5698fef3e7d48ff2ea77ff':
arm: Implement a NEON version of 422 h264_h_loop_filter_chroma
Merged-by: James Almer <jamrial@gmail.com>