Martin Storsjö
fd3bd5c492
aarch64: h264qpel: Do vertical filtering without transposing
This gives rather big speedups on these functions:
Before:
put_h264_qpel_8_mc01_8_neon: 241.0 131.5 138.7
put_h264_qpel_8_mc02_8_neon: 214.7 121.2 127.5
put_h264_qpel_8_mc03_8_neon: 242.5 131.2 135.7
put_h264_qpel_8_mc11_8_neon: 421.2 218.7 251.0
put_h264_qpel_8_mc12_8_neon: 878.0 509.5 537.5
put_h264_qpel_8_mc13_8_neon: 423.7 217.0 252.0
put_h264_qpel_8_mc21_8_neon: 858.2 479.5 514.0
put_h264_qpel_8_mc22_8_neon: 649.7 385.2 403.0
put_h264_qpel_8_mc23_8_neon: 860.2 476.5 517.7
put_h264_qpel_8_mc31_8_neon: 437.2 219.5 252.5
put_h264_qpel_8_mc32_8_neon: 892.5 510.5 546.0
put_h264_qpel_8_mc33_8_neon: 438.2 218.5 257.0
put_h264_qpel_16_mc01_8_neon: 944.2 509.7 546.7
put_h264_qpel_16_mc02_8_neon: 878.7 469.5 509.7
put_h264_qpel_16_mc03_8_neon: 945.7 510.7 557.0
put_h264_qpel_16_mc11_8_neon: 1663.2 858.5 979.5
put_h264_qpel_16_mc12_8_neon: 3510.2 2027.7 2112.7
put_h264_qpel_16_mc13_8_neon: 1664.7 857.5 980.5
put_h264_qpel_16_mc21_8_neon: 3366.2 1928.5 2030.5
put_h264_qpel_16_mc22_8_neon: 2584.7 1514.7 1590.2
put_h264_qpel_16_mc23_8_neon: 3367.7 1927.7 2035.0
put_h264_qpel_16_mc31_8_neon: 1716.7 849.7 997.0
put_h264_qpel_16_mc32_8_neon: 3564.0 2044.2 3835.2
put_h264_qpel_16_mc33_8_neon: 1717.7 863.0 989.5
After:
put_h264_qpel_8_mc01_8_neon: 136.0 73.7 76.0
put_h264_qpel_8_mc02_8_neon: 108.7 65.0 64.0
put_h264_qpel_8_mc03_8_neon: 137.5 72.7 73.0
put_h264_qpel_8_mc11_8_neon: 316.2 159.0 188.5
put_h264_qpel_8_mc12_8_neon: 653.0 375.5 384.7
put_h264_qpel_8_mc13_8_neon: 318.7 165.5 189.5
put_h264_qpel_8_mc21_8_neon: 739.2 385.7 432.5
put_h264_qpel_8_mc22_8_neon: 530.7 295.5 309.5
put_h264_qpel_8_mc23_8_neon: 741.2 393.7 421.0
put_h264_qpel_8_mc31_8_neon: 332.2 162.5 190.0
put_h264_qpel_8_mc32_8_neon: 667.5 378.2 390.5
put_h264_qpel_8_mc33_8_neon: 332.7 166.5 195.5
put_h264_qpel_16_mc01_8_neon: 524.2 285.2 294.0
put_h264_qpel_16_mc02_8_neon: 454.7 252.2 250.2
put_h264_qpel_16_mc03_8_neon: 525.7 286.0 283.0
put_h264_qpel_16_mc11_8_neon: 1243.2 630.7 726.7
put_h264_qpel_16_mc12_8_neon: 2610.2 1479.7 1481.2
put_h264_qpel_16_mc13_8_neon: 1250.5 631.7 727.7
put_h264_qpel_16_mc21_8_neon: 2890.2 1571.2 1679.7
put_h264_qpel_16_mc22_8_neon: 2108.7 1177.5 1223.5
put_h264_qpel_16_mc23_8_neon: 2891.7 1578.7 1667.7
put_h264_qpel_16_mc31_8_neon: 1296.7 630.5 752.5
put_h264_qpel_16_mc32_8_neon: 2664.0 1483.2 1503.5
put_h264_qpel_16_mc33_8_neon: 1297.7 632.5 747.2
I.e. overall a 20%-60% reduction in runtime of these
functions.
Signed-off-by: Martin Storsjö <martin@martin.st>
2021-10-18 14:27:58 +03:00
Martin Storsjö
2d5a7f6d00
arm/aarch64: Improve scheduling in the avg form of h264_qpel
Don't use the loaded registers directly, avoiding stalls on
in-order cores. Use vrhadd.u8 with q registers where easily possible.
Signed-off-by: Martin Storsjö <martin@martin.st>
2021-10-18 14:27:36 +03:00
Gyan Doshi
d04c005021
doc/filters: correct description of select filter variables
2021-10-18 14:28:04 +05:30
Paul B Mahol
bbbf95848b
avfilter/vf_w3fdif: do not output extra frame at start with deint=interlaced
2021-10-18 09:29:41 +02:00
Michael Niedermayer
85c169f6a6
avcodec/speexdec: Seed should be unsigned otherwise the operations done on it are undefined
Fixes: signed integer overflow: 1664525000 + 1013904223 cannot be represented in type 'int'
Fixes: 39865/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_SPEEX_fuzzer-4979694508834816
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Reviewed-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2021-10-17 22:20:07 +02:00
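The overflow quoted in the speexdec entry above can be illustrated with a minimal sketch. It assumes the seed is advanced by a linear congruential step with the two constants that appear in the sanitizer message; `lcg_next` is a hypothetical name and this is not the decoder's actual code. With a signed `int` seed the multiply-add can overflow, which is undefined in C, whereas unsigned arithmetic wraps modulo 2^32.

```c
#include <stdint.h>

/* Hypothetical helper, not the speexdec source: one step of a linear
 * congruential generator using the constants from the sanitizer report. */
static inline uint32_t lcg_next(uint32_t seed)
{
    /* Unsigned arithmetic wraps modulo 2^32, which C defines precisely;
     * the same expression on a signed int could overflow (undefined). */
    return 1664525u * seed + 1013904223u;
}
```

For example, `lcg_next(1000)` evaluates 1664525000 + 1013904223 = 2678429223, which does not fit in a 32-bit signed `int` but is an ordinary `uint32_t` value.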
Limin Wang
77970abb71
avcodec/hevc_filter: Correct indention
Signed-off-by: Limin Wang <lance.lmwang@gmail.com>
2021-10-17 16:57:53 +08:00
Limin Wang
5a91850b61
avcodec/hevc_filter: remove unneeded headers
Signed-off-by: Limin Wang <lance.lmwang@gmail.com>
2021-10-17 16:57:47 +08:00
Limin Wang
06548e6045
avcodec/hevcdec: remove unused code
Signed-off-by: Limin Wang <lance.lmwang@gmail.com>
2021-10-17 16:57:43 +08:00
Limin Wang
fb4f9a2043
avformat/rtpdec_rfc4175: add support for RANGE
Signed-off-by: Limin Wang <lance.lmwang@gmail.com>
2021-10-17 16:54:03 +08:00
Limin Wang
bad48dfe9a
avformat/rtpdec_rfc4175: add support for colorimetry
Signed-off-by: Limin Wang <lance.lmwang@gmail.com>
2021-10-17 16:54:03 +08:00
Limin Wang
ca56fedab5
avformat/rtpdec_rfc4175: add support for TCS
Signed-off-by: Limin Wang <lance.lmwang@gmail.com>
2021-10-17 16:54:03 +08:00
Limin Wang
b07437f956
avformat/rtpdec_rfc4175: add support for exactframerate
Signed-off-by: Limin Wang <lance.lmwang@gmail.com>
2021-10-17 16:54:03 +08:00
Fei Wang
84c73102d9
avcodec/av1_vaapi: improve decode quality
- quantizer delta and matrix level specific.
- support loop filter delta.
- support use superres.
Signed-off-by: Fei Wang <fei.w.wang@intel.com>
2021-10-16 19:00:44 -03:00
Fei Wang
dc94f2eaaf
avcodec/av1_vaapi: enable segmentation features
Signed-off-by: Fei Wang <fei.w.wang@intel.com>
2021-10-16 19:00:44 -03:00
Fei Wang
7871144cf8
avcodec/av1_vaapi: setting 2 output surface for film grain
VAAPI needs 2 output surfaces for a film grain frame: one used for
reference and the other used for applying film grain and pushing
downstream.
Signed-off-by: Fei Wang <fei.w.wang@intel.com>
2021-10-16 19:00:44 -03:00
Fei Wang
53403158cc
avcodec/vaapi: increase av1 decode pool size
For film grain clips, the vaapi_av1 decoder will cache 8 additional
surfaces that are used to store frames with film grain applied.
So increase the pool size by 8 to avoid leak of surface.
Signed-off-by: Fei Wang <fei.w.wang@intel.com>
2021-10-16 19:00:44 -03:00
Tong Wu
4e7a7d75e3
avcodec/dxva2_av1: fix global motion params
Defined in spec 5.9.24/5.9.25. Since the function void
global_motion_params(AV1DecContext *s) already updates the
gm type/params, the wminvalid parameter only needs to get
its value from cur_frame.gm_invalid.
Signed-off-by: Tong Wu <tong1.wu@intel.com>
2021-10-16 19:00:44 -03:00
Fei Wang
0d0ea70e7b
avcodec/av1_vaapi: add gm params valid check
Signed-off-by: Fei Wang <fei.w.wang@intel.com>
2021-10-16 19:00:44 -03:00
Fei Wang
de7475b111
avcodec/av1dec: support setup shear process
Defined in spec 7.11.3.6/7.11.3.7.
Signed-off-by: Fei Wang <fei.w.wang@intel.com>
2021-10-16 19:00:44 -03:00
Fei Wang
75de7fe262
avcodec/av1: extend some definitions in spec section 3
Signed-off-by: Fei Wang <fei.w.wang@intel.com>
2021-10-16 19:00:44 -03:00
Fei Wang
e7ff5722b1
cbs_av1: fix incorrect data type
Since the order_hint_bits_minus_1 range is 0~7, cur_frame_hint can be
at most 128, and similarly for the return value of
cbs_av1_get_relative_dist. So adding them and using int8_t for the
result may lose precision.
Signed-off-by: Fei Wang <fei.w.wang@intel.com>
2021-10-16 19:00:43 -03:00
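The data type problem described in the cbs_av1 entry above comes down to a sum of two small terms exceeding the int8_t range. The following is a minimal sketch of that arithmetic only; `fits_in_int8` and `add_wide` are hypothetical helpers, and the operand values used in the note below are chosen for illustration rather than taken from the AV1 code.

```c
#include <stdint.h>

/* Hypothetical helper, not the cbs_av1 source: report whether a value
 * can be stored in int8_t without being changed. */
static int fits_in_int8(int value)
{
    return value >= INT8_MIN && value <= INT8_MAX;
}

/* Hypothetical helper: add two small terms in a type that is wide
 * enough to hold the result. */
static int add_wide(int a, int b)
{
    return a + b;
}
```

With illustrative operands, `add_wide(128, 127)` is 255, and `fits_in_int8(255)` is 0, so keeping such a result in an int8_t would not preserve its value.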
Tsutomu Seki
9b445663a5
avfilter/opencl: Fix program_opencl for source code larger than 64kB
The condition (pos < len) is always true, so the rest of the OpenCL
program code would never be read: the maximum value of "rb" is
"len - pos - 1", and therefore the maximum value of "pos" is
"len - 1".
Fixes: trac.ffmpeg.org/ticket/9217
2021-10-16 12:17:23 +02:00
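The reasoning in the program_opencl entry above can be sketched with a generic read loop. This is an illustration under assumptions, not the libavfilter code: `read_all` is a hypothetical function, it uses plain `malloc`/`realloc`, and it expects an initial `len` of at least 2. Because each read requests at most `len - pos - 1` bytes, `pos` never reaches `len`, so a test of `pos < len` would always end the loop after the first read; the sketch detects a short read with `pos + 1 < len` instead.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: read a whole stream into a NUL-terminated buffer,
 * doubling the buffer whenever it fills up. Assumes len >= 2. */
static char *read_all(FILE *file, size_t len, size_t *size)
{
    char *src = malloc(len);
    size_t pos = 0;

    if (!src)
        return NULL;
    for (;;) {
        /* Keep one byte free for the terminator: rb <= len - pos - 1. */
        size_t rb = fread(src + pos, 1, len - pos - 1, file);
        char *tmp;

        if (rb == 0 && ferror(file)) {
            free(src);
            return NULL;
        }
        pos += rb;
        /* pos is at most len - 1 here, so only a short read satisfies
         * this test; a full read falls through and grows the buffer. */
        if (pos + 1 < len)
            break;
        len *= 2;
        tmp = realloc(src, len);
        if (!tmp) {
            free(src);
            return NULL;
        }
        src = tmp;
    }
    src[pos] = '\0';
    *size = pos;
    return src;
}
```

Starting from a 64 kB buffer, a larger input makes the first read fill the buffer completely, the buffer is doubled, and reading continues until a read comes back short.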
Paul B Mahol
5bcc61ce87
avfilter/vf_v360: add reset_rot option
2021-10-16 11:39:15 +02:00
Niklas Haas
3cc3f5de2a
avcodec/hevcdec: apply H.274 film grain
Similar in spirit and design to 66845cffc3, but slightly simpler due
to the lack of interlaced frames in HEVC. See that commit for more
details.
For the seed value, since no specification for this appears to exist, I
semi-arbitrarily decided to base it off the POC id alone, since there's
no analog of the idr_pic_id in HEVC's I-frames. This design is stable
across remuxes and seeks, but changes for adjacent frames with a period
that's typically long enough not to be noticeable, which makes it
satisfy all of the requirements that a film grain seed should have.
Tested with and without threading, using a patch to insert film grain
metadata artificially (for lack of real files containing film grain).
2021-10-15 11:55:45 -03:00
Zane van Iperen
5d16660598
avformat/argo_asf: use title metadata when muxing
Signed-off-by: Zane van Iperen <zane@zanevaniperen.com>
2021-10-15 23:40:15 +10:00
Zane van Iperen
9a2b9aafba
avformat/argo_asf: pass name through as metadata
Signed-off-by: Zane van Iperen <zane@zanevaniperen.com>
2021-10-15 23:40:15 +10:00
Zane van Iperen
20fa838da5
avformat/argo_asf: cleanup and NULL-terminate name field in header
Preparation for metadata changes in the following patches. Saves
having to create an extra buffer.
Signed-off-by: Zane van Iperen <zane@zanevaniperen.com>
2021-10-15 23:39:47 +10:00
Wu Jianhua
2c734a8496
libswscale/x86/rgb2rgb: add shuffle_bytes avx2
Performance data (less is better):
shuffle_bytes_ssse3 3.64654
shuffle_bytes_avx2 0.94288
Signed-off-by: Wu Jianhua <jianhua.wu@intel.com>
2021-10-15 10:59:20 +02:00
Paul B Mahol
767f162432
avfilter/window_func: unify all filters win_func option that use this header
2021-10-15 10:45:50 +02:00
James Almer
39f3c98bb1
x86/vf_lut3d: use three operand form for some instructions
Fixes compilation with old yasm.
Signed-off-by: James Almer <jamrial@gmail.com>
2021-10-14 18:09:38 -03:00
Paul B Mahol
890cef1ff6
avfilter/vf_fftfilt: export FFT arrays size
2021-10-14 20:26:23 +02:00
Paul B Mahol
e1b820fa33
avfilter/vf_overlay: unbreak alpha composition with negative y and threads > 1
2021-10-14 20:05:39 +02:00
Martin Storsjö
bb10f8d802
avfilter/vf_fftfilt: Use av_clip_uint8
The refactoring in 844890b1bc caused
fate-source to point out that this could be av_clip_uintp2 (or
rather av_clip_uint8).
Signed-off-by: Martin Storsjö <martin@martin.st>
2021-10-14 14:05:39 +03:00
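The relationship between the two clipping helpers named in the vf_fftfilt entry above can be sketched as follows. These are hypothetical stand-ins written with plain comparisons, not the libavutil implementations: `clip_uintp2_sketch` clamps to the range 0 to 2^p - 1, and `clip_uint8_sketch` is the special case p = 8.

```c
#include <stdint.h>

/* Hypothetical stand-in: clamp a signed value to the range 0..255. */
static uint8_t clip_uint8_sketch(int a)
{
    if (a < 0)
        return 0;
    if (a > 255)
        return 255;
    return (uint8_t)a;
}

/* Hypothetical stand-in: clamp a signed value to 0..2^p - 1.
 * Assumes 0 < p < 31 so the shift stays within int. */
static unsigned clip_uintp2_sketch(int a, int p)
{
    int max = (1 << p) - 1;

    if (a < 0)
        return 0;
    if (a > max)
        return (unsigned)max;
    return (unsigned)a;
}
```

For p = 8 the two sketches agree on every input, which is why the dedicated 8-bit form is the more natural choice there.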
Paul B Mahol
c336c7a9d7
fate: update histogram test results
2021-10-14 12:22:38 +02:00
Paul B Mahol
df05603291
avfilter/vf_histogram: add colors_mode option
2021-10-14 12:16:30 +02:00
Paul B Mahol
7d3a9bb54b
avfilter/vf_fftfilt: add gray formats >8 depth support
2021-10-14 10:08:59 +02:00
Pekka Väänänen
4d52e36bd0
avformat/westwood_vqa: Store VQFL codebook chunks
High color 15-bit VQA3 video streams contain high level chunks with
only codebook updates that shouldn't be considered new frames. Now
the demuxer stores a reference to such VQFL chunks and returns them
later along with a VQFR chunk with full frame data.
2021-10-14 09:59:52 +02:00
Paul B Mahol
844890b1bc
avfilter/vf_fftfilt: add slice threading support
2021-10-14 01:27:16 +02:00
Paul B Mahol
8add1b39e2
avfilter/vf_fftfilt: simplify bits/len calculation
2021-10-14 01:27:16 +02:00
Paul B Mahol
933765aa0e
avfilter: add xcorrelate video filter
2021-10-13 19:09:21 +02:00
Paul B Mahol
32eaf4069e
avfilter: add limitdiff video filter
2021-10-13 19:02:34 +02:00
Soft Works
73fe19f09c
avfilter/vf_palettegen: cosmetic changes
Signed-off-by: softworkz <softworkz@hotmail.com>
2021-10-13 18:52:14 +02:00
Soft Works
dea673d0d5
avfilter/vf_palette(gen|use): support palettes with alpha
2021-10-13 18:52:14 +02:00
Mark Reid
3ee7250116
avfilter/vf_lut3d: fix building with --disable-optimizations
2021-10-13 18:01:21 +02:00
Limin Wang
871fee82e1
avcodec/videotoolboxenc: use goto end for memory cleanup
Signed-off-by: Limin Wang <lance.lmwang@gmail.com>
2021-10-13 20:12:30 +08:00
Limin Wang
f25871d790
avcodec/avs3_parser: Fix usage of init_get_bits() and use init_get_bits8()
Signed-off-by: Limin Wang <lance.lmwang@gmail.com>
2021-10-13 20:12:30 +08:00
Limin Wang
ba03e4ed33
avcodec/audiotoolboxdec: Fix usage of init_get_bits() and use init_get_bits8()
Signed-off-by: Limin Wang <lance.lmwang@gmail.com>
2021-10-13 20:12:30 +08:00
Paul B Mahol
13141339c1
avformat/dhav: make duration extraction more robust
2021-10-13 12:14:39 +02:00
Paul B Mahol
6384175d8c
avformat/dhav: check if timestamp matches when seeking
2021-10-13 12:14:39 +02:00
Nachiket Tarate
f14adb0516
libavformat/hls: correct indentation
Signed-off-by: Nachiket Tarate <nachiket.programmer@gmail.com>
Signed-off-by: Steven Liu <lq@chinaffmpeg.org>
2021-10-13 11:24:02 +08:00