low_power mode uses a fixed-function HW engine (SFC), which offloads work from the EUs.
High quality mode uses the EUs (AVS sampler).
Performance and EU usage (Render usage) comparison on Intel(R) Xeon(R) CPU E3-1225 v5 @ 3.30GHz:
High quality mode : ffmpeg -hwaccel qsv -c:v h264_qsv -i bbb_sunflower_1080p_30fps_normal_2000frames.h264 \
-vf scale_qsv=w=1280:h=736:mode=hq -f null -
fps=389
RENDER usage: 28.10 (provided by MSDK metrics_monitor)
Low Power mode: ffmpeg -hwaccel qsv -c:v h264_qsv -i ~/bbb_sunflower_1080p_30fps_normal_2000frames.h264 \
-vf scale_qsv=w=1280:h=736:mode=low_power -f null -
fps=343
RENDER usage: 0.00
Low power mode (SFC) may be disabled if it is not supported by
MSDK/Driver/HW, and is then replaced by AVS mode internally.
Signed-off-by: Zhong Li <zhong.li@intel.com>
Remove redundant condition: '!A || B' is equivalent to '!A || (A && B)' but
clearer.
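The equivalence can be checked exhaustively over the four input combinations.
A minimal sketch; cond_redundant(), cond_simplified() and check_equivalence()
are hypothetical names used only for illustration, not code from the patch:

```c
#include <assert.h>
#include <stdbool.h>

/* Form with the redundant 'a &&' term. */
bool cond_redundant(bool a, bool b)
{
    return !a || (a && b);
}

/* Simplified form: the right-hand side is only evaluated when 'a' is true. */
bool cond_simplified(bool a, bool b)
{
    return !a || b;
}

/* Assert that both forms agree on every combination of inputs. */
void check_equivalence(void)
{
    for (int a = 0; a <= 1; a++)
        for (int b = 0; b <= 1; b++)
            assert(cond_redundant(a, b) == cond_simplified(a, b));
}
```

Short-circuit evaluation is what makes the extra term redundant: 'b' is only
reached once '!a' is false.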
Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: Jun Zhao <barryjzhao@tencent.com>
1. Currently the output format is hard-coded as NV12, which means
CSC is always done for non-NV12 input such as P010.
Use the input format as the default output format instead.
2. Add an option to specify output format.
Signed-off-by: Zhong Li <zhong.li@intel.com>
The horizontal pass gets ~2x performance with the patch
when run single-threaded.
Tested overall performance using the commands (AVX2 enabled):
./ffmpeg -i 1080p.mp4 -vf gblur -f null /dev/null
./ffmpeg -i 1080p.mp4 -vf gblur=threads=1 -f null /dev/null
For single thread, the fps improves from 43 to 60, about 40%.
For multi-thread, the fps improves from 110 to 130, about 20%.
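The percentages follow directly from the fps figures above. A small sketch of
the arithmetic; fps_gain_percent() is a hypothetical helper, and the rounding
is only meant for non-negative gains:

```c
#include <assert.h>

/* Relative fps gain in percent, rounded to the nearest integer.
 * Assumes fps_after >= fps_before, so adding 0.5 rounds correctly. */
int fps_gain_percent(int fps_before, int fps_after)
{
    assert(fps_before > 0);
    return (int)(100.0 * (fps_after - fps_before) / fps_before + 0.5);
}
```

With the numbers above, 43 to 60 fps gives 40 and 110 to 130 fps gives 18,
which matches the rounded "about 40%" and "about 20%".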
Signed-off-by: Ruiling Song <ruiling.song@intel.com>
Remove the rain in the input image/video by applying the derain
methods based on convolutional neural networks. Training scripts
as well as scripts for model generation are provided in the
repository at https://github.com/XueweiMeng/derain_filter.git.
Signed-off-by: Xuewei Meng <xwmeng96@gmail.com>
We prefer a coding style like:
    /* some stuff */
    if (error) {
        /* error handling */
        return -(errorcode);
    }
    /* normal actions */
    do_something();
Signed-off-by: Jun Zhao <barryjzhao@tencent.com>
Benchmarked with a simple command:
ffmpeg -i 1080p.mp4 -vf unsharp=la=3:ca=3 -an -f null /dev/null
With the patch, the fps increases from 50 to 120 on my local machine (i7-6770HQ).
Signed-off-by: Ruiling Song <ruiling.song@intel.com>
Tested with a 1080p h264 clip using the following commands:
a). ffmpeg -i input -vf lutyuv="u=128:v=128" -f null /dev/null
b). ffmpeg -i input -vf lutrgb="g=0:b=0" -f null /dev/null
After enabling slice threading, the fps changes from:
a). 144fps to 258fps (lutyuv)
b). 94fps to 153fps (lutrgb)
on Intel(R) Core(TM) i5-8265U CPU @ 1.60GHz
Reviewed-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: Jun Zhao <barryjzhao@tencent.com>
Add slice threading support. Tested with a command like:
./ffmpeg -i input -vf colorlevels -f null /dev/null
With a 1080p h264 clip, the fps goes from 39 fps to 79 fps
on the local machine (Intel(R) Core(TM) i5-8265U CPU @ 1.60GHz)
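Slice threading splits the frame rows across jobs so each job processes a
disjoint band. A minimal sketch of the row-splitting arithmetic commonly seen
in slice-threaded filters; SliceRange, slice_range() and rows_covered() are
hypothetical names, and it is an assumption that this filter partitions rows
in exactly this way:

```c
#include <assert.h>

/* Half-open row range [start, end) handled by one job. */
typedef struct SliceRange {
    int start;
    int end;
} SliceRange;

/* Split 'height' rows across 'nb_jobs' jobs; jobnr runs from 0 to nb_jobs - 1. */
SliceRange slice_range(int height, int jobnr, int nb_jobs)
{
    SliceRange r;

    assert(nb_jobs > 0 && jobnr >= 0 && jobnr < nb_jobs);
    r.start = (height * jobnr) / nb_jobs;
    r.end   = (height * (jobnr + 1)) / nb_jobs;
    return r;
}

/* Sum the rows of all jobs; the ranges are contiguous, so this equals 'height'. */
int rows_covered(int height, int nb_jobs)
{
    int total = 0;

    for (int job = 0; job < nb_jobs; job++) {
        SliceRange r = slice_range(height, job, nb_jobs);
        total += r.end - r.start;
    }
    return total;
}
```

Because the end of one job's range is computed with the same expression as
the start of the next, no row is skipped or processed twice even when the
height is not a multiple of the job count.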
Reviewed-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: Jun Zhao <barryjzhao@tencent.com>
Attempts to pick the set of supported colour properties best matching the
input. Output is then set with the same values, except for the colour
matrix which may change when converting between RGB and YUV.
Fixes two warnings:
libavfilter/avf_showspatial.c:157:26: warning: variable ‘w’ set but not used
libavfilter/avf_showspatial.c:157:23: warning: variable ‘h’ set but not used
Currently, picref is freed by calling av_frame_free(&picref) in
submit_frame() in qsvvpp.c when working in system memory mode, while it is
normally freed in filter_frame() in vf_vpp_qsv.c when working in other modes.
A double free therefore happens in system memory mode; remove the redundant
free to fix the memory issue.
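The ownership problem can be sketched in isolation. This is a simplified
model, not the code in qsvvpp.c: Frame, frame_release(), submit_frame_buggy(),
submit_frame_fixed() and run_filter_frame() are hypothetical names, and
releases are counted rather than performed so that the double release is
observable without undefined behaviour. Dropping the callee-side release is
shown only as one way to restore single ownership:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a frame; only records how often it was released. */
typedef struct Frame {
    int release_count;
} Frame;

/* Mimics the av_frame_free() contract: release, then NULL the given pointer. */
void frame_release(Frame **f)
{
    if (!f || !*f)
        return;
    (*f)->release_count++;
    *f = NULL;
}

/* Callee gets the pointer by value: clearing its local copy
 * does not clear the caller's pointer. */
void submit_frame_buggy(Frame *picref)
{
    frame_release(&picref);
}

/* Callee leaves ownership with the caller. */
void submit_frame_fixed(Frame *picref)
{
    (void)picref;
}

/* Caller submits the frame, then releases it; returns the release count. */
int run_filter_frame(void (*submit)(Frame *))
{
    Frame storage = { 0 };
    Frame *picref = &storage;

    submit(picref);
    frame_release(&picref);   /* caller-side release */
    assert(picref == NULL);
    return storage.release_count;
}
```

run_filter_frame(submit_frame_buggy) reports two releases of the same frame,
while run_filter_frame(submit_frame_fixed) reports one.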
Reproduce:
ffmpeg -init_hw_device qsv=foo -filter_hw_device foo -f rawvideo -pix_fmt nv12 -s:v 852x480 \
-i 852x480.nv12 -vf 'vpp_qsv=w=500:h=400' -f rawvideo -pix_fmt nv12 qsv.nv12
Signed-off-by: Linjie Fu <linjie.fu@intel.com>
Signed-off-by: Zhong Li <zhong.li@intel.com>
Fixes infinite loop with -vf loop=loop=1 and also fixes looping when the input
has fewer frames than the specified loop size.
Possible regressions since ef1aadffc7.
Signed-off-by: Marton Balint <cus@passwd.hu>