mirror of https://github.com/FFmpeg/FFmpeg.git synced 2024-12-18 03:19:31 +02:00
FFmpeg/libavcodec/x86
Stone Chen 0e52a4e434 libavcodec/x86/vvc: Add AVX2 DMVR SAD functions for VVC
Implements AVX2 DMVR (decoder-side motion vector refinement) SAD functions. DMVR SAD is only calculated if w >= 8, h >= 8, and w * h > 128. To reduce complexity, SAD is only calculated on even rows. This applies to all video bit depths, but the values passed to the function are always 16-bit (even if the original video bit depth is 8). The AVX2 implementation uses min/max/sub to compute the absolute differences.
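
As a rough illustration of the conditions above, here is a minimal scalar C sketch. It is not the AVX2 code from this commit. The names `dmvr_sad_applies`, `abs_diff_minmax`, and `sad_even_rows`, as well as the `stride` parameter, are hypothetical and exist only for this example.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: block-size conditions quoted in the commit message. */
static inline int dmvr_sad_applies(int w, int h)
{
    return w >= 8 && h >= 8 && w * h > 128;
}

/* Hypothetical helper: |a - b| written as max(a, b) - min(a, b),
 * mirroring the min/max/sub idea in scalar form. */
static inline int abs_diff_minmax(int16_t a, int16_t b)
{
    int hi = a > b ? a : b;
    int lo = a > b ? b : a;
    return hi - lo;
}

/* Sum of absolute differences over even rows only.
 * Samples are 16-bit regardless of the video bit depth;
 * `stride` is the row pitch in samples (an assumption of this sketch). */
static int sad_even_rows(const int16_t *src0, const int16_t *src1,
                         ptrdiff_t stride, int w, int h)
{
    int sad = 0;

    for (int y = 0; y < h; y += 2) {
        for (int x = 0; x < w; x++)
            sad += abs_diff_minmax(src0[x], src1[x]);
        /* Skip the odd row that follows. */
        src0 += 2 * stride;
        src1 += 2 * stride;
    }
    return sad;
}
```

For a 16x16 block this visits 8 of the 16 rows. An 8x16 block has w * h equal to 128, so it fails the w * h > 128 check and no SAD would be calculated for it.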

Additionally, this changes the parameters dx and dy from int to intptr_t. This allows dx and dy to be used as pointer offsets without first sign-extending them with movsxd.
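
A small C sketch of why the wider type helps, assuming an x86-64 target. There, the calling convention can leave the upper 32 bits of a register holding an int argument unspecified, so hand-written assembly has to sign-extend the value before adding it to a pointer. An intptr_t argument is already pointer-width. The function name `offset_block` and the `stride` parameter are hypothetical, and the offset formula is illustrative rather than the one the decoder uses.

```c
#include <stdint.h>

/* Hypothetical helper: step a sample pointer by (dx, dy) in a buffer with
 * `stride` samples per row. Because dx and dy are intptr_t, they are
 * already pointer-width and can be added to the pointer directly. */
static const int16_t *offset_block(const int16_t *src,
                                   intptr_t dx, intptr_t dy, intptr_t stride)
{
    return src + dy * stride + dx;
}
```

Negative offsets work the same way, for example `offset_block(p, -2, -1, stride)` moves one row up and two samples left.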

Benchmarks (AMD 7940HS)
Before:
BQTerrace_1920x1080_60_10_420_22_RA.vvc | 106.0 |
Chimera_8bit_1080P_1000_frames.vvc | 204.3 |
NovosobornayaSquare_1920x1080.bin | 197.3 |
RitualDance_1920x1080_60_10_420_37_RA.266 | 174.0 |

After:
BQTerrace_1920x1080_60_10_420_22_RA.vvc | 109.3 |
Chimera_8bit_1080P_1000_frames.vvc | 216.0 |
NovosobornayaSquare_1920x1080.bin | 204.0 |
RitualDance_1920x1080_60_10_420_37_RA.266 | 181.7 |
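
To put the two tables side by side, the relative change per clip can be computed as below. This is only a sketch: the commit message does not state the unit, so it assumes the figures are a throughput measure where higher is better, and the function name `relative_gain_percent` is hypothetical.

```c
/* Hypothetical helper: percentage change from `before` to `after`.
 * Positive means `after` is larger. */
static double relative_gain_percent(double before, double after)
{
    return (after - before) / before * 100.0;
}
```

Under that assumption, `relative_gain_percent(106.0, 109.3)` is about 3.1 for BQTerrace and `relative_gain_percent(204.3, 216.0)` is about 5.7 for Chimera.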

Reviewed-by: Ronald S. Bultje <rsbultje@gmail.com>
Signed-off-by: James Almer <jamrial@gmail.com>
2024-05-22 20:36:21 -03:00
..
h26x
vvc libavcodec/x86/vvc: Add AVX2 DMVR SAD functions for VVC 2024-05-22 20:36:21 -03:00
aacencdsp_init.c
aacencdsp.asm
aacpsdsp_init.c
aacpsdsp.asm
ac3dsp_downmix.asm
ac3dsp_init.c
ac3dsp.asm x86/ac3dsp: clear the upper 32 bits for input arguments where needed 2024-04-08 13:45:58 -03:00
alacdsp_init.c
alacdsp.asm
audiodsp_init.c
audiodsp.asm
blockdsp_init.c x86/blockdsp: add sse2 and avx2 versions of fill_block_tab 2024-05-08 21:13:23 -03:00
blockdsp.asm x86/blockdsp: add sse2 and avx2 versions of fill_block_tab 2024-05-08 21:13:23 -03:00
bswapdsp_init.c
bswapdsp.asm
cabac.h
cavsdsp.c
cavsidct.asm
celt_pvq_init.c
celt_pvq_search.asm x86: Update x86inc.asm 2024-03-24 14:53:57 +01:00
cfhddsp_init.c
cfhddsp.asm
cfhdencdsp_init.c
cfhdencdsp.asm
constants.c
constants.h
dcadsp_init.c
dcadsp.asm
dct32.asm
dirac_dwt_init.c
dirac_dwt.asm
diracdsp_init.c
diracdsp.asm
dnxhdenc_init.c
dnxhdenc.asm
exrdsp_init.c
exrdsp.asm
fdct.c
fdct.h
fdctdsp_init.c
flac_dsp_gpl.asm
flacdsp_init.c x86/flacdsp: add an SSE4 version of wasted33 2024-05-13 12:18:10 -03:00
flacdsp.asm x86/flacdsp: remove unused parameters to pmacsdql macro 2024-05-13 12:18:38 -03:00
flacencdsp_init.c
fmtconvert_init.c
fmtconvert.asm
fpel.asm avcodec/x86/fpel: Remove remnants of MMX 2024-03-03 19:48:41 +01:00
fpel.h
g722dsp_init.c
g722dsp.asm
h263_loopfilter.asm
h263dsp_init.c
h264_cabac.c
h264_chromamc_10bit.asm
h264_chromamc.asm x86: Update x86inc.asm 2024-03-24 14:53:57 +01:00
h264_deblock_10bit.asm
h264_deblock.asm
h264_idct_10bit.asm
h264_idct.asm avcodec/x86/h264_idct: Fix incorrect xmm spilling on win64 2024-03-25 21:17:47 +01:00
h264_intrapred_10bit.asm
h264_intrapred_init.c
h264_intrapred.asm x86: Update x86inc.asm 2024-03-24 14:53:57 +01:00
h264_qpel_8bit.asm
h264_qpel_10bit.asm
h264_qpel.c
h264_weight_10bit.asm
h264_weight.asm
h264chroma_init.c
h264dsp_init.c
hevc_add_res.asm
hevc_deblock.asm
hevc_idct.asm
hevc_mc.asm x86: Update x86inc.asm 2024-03-24 14:53:57 +01:00
hevc_sao_10bit.asm
hevc_sao.asm
hevcdsp_init.c
hevcdsp.h
hpeldsp_init.c
hpeldsp_rnd_template.c
hpeldsp.asm
hpeldsp.h
huffyuvdsp_init.c
huffyuvdsp_template.asm
huffyuvdsp.asm
huffyuvencdsp_init.c
huffyuvencdsp.asm
idctdsp_init.c
idctdsp.asm
idctdsp.h
imdct36.asm
inline_asm.h
jpeg2000dsp_init.c
jpeg2000dsp.asm
lossless_audiodsp_init.c
lossless_audiodsp.asm
lossless_videodsp_init.c
lossless_videodsp.asm
lossless_videoencdsp_init.c
lossless_videoencdsp.asm
lpc_init.c
lpc.asm
Makefile Remove remnants of prores_lgpl decoder 2024-05-07 23:53:26 +02:00
mathops.h
me_cmp_init.c
me_cmp.asm
mlpdsp_init.c
mlpdsp.asm
mpeg4videodsp.c
mpegaudiodsp.c
mpegvideo.c
mpegvideoenc_qns_template.c
mpegvideoenc_template.c
mpegvideoenc.c
mpegvideoencdsp_init.c
mpegvideoencdsp.asm
opusdsp_init.c opusdsp: add ability to modify deemphasis constant 2024-04-27 11:12:07 +02:00
opusdsp.asm opusdsp: add ability to modify deemphasis constant 2024-04-27 11:12:07 +02:00
pixblockdsp_init.c
pixblockdsp.asm
pngdsp_init.c
pngdsp.asm
proresdsp_init.c
proresdsp.asm
qpel.asm
qpeldsp_init.c
qpeldsp.asm
rnd_template.c
rv34dsp_init.c
rv34dsp.asm x86: Avoid using 'd' as an argument name 2024-03-24 14:53:57 +01:00
rv40dsp_init.c
rv40dsp.asm x86: Update x86inc.asm 2024-03-24 14:53:57 +01:00
sbcdsp_init.c
sbcdsp.asm
sbrdsp_init.c
sbrdsp.asm x86: Update x86inc.asm 2024-03-24 14:53:57 +01:00
simple_idct10_template.asm
simple_idct10.asm
simple_idct.asm avcodec/x86/rv40dsp, simple_idct: Remove remnants of MMX 2024-03-02 02:54:12 +01:00
simple_idct.h
snowdsp.c
svq1enc_init.c
svq1enc.asm
synth_filter_init.c
synth_filter.asm
takdsp_init.c
takdsp.asm
ttadsp_init.c
ttadsp.asm
ttaencdsp_init.c
ttaencdsp.asm
utvideodsp_init.c
utvideodsp.asm
v210-init.c
v210.asm
v210enc_init.c
v210enc.asm
vc1dsp_init.c
vc1dsp_loopfilter.asm
vc1dsp_mc.asm
vc1dsp_mmx.c
vc1dsp.h
videodsp_init.c
videodsp.asm
vorbisdsp_init.c
vorbisdsp.asm
vp3dsp_init.c avcodec/x86/vp3dsp_init: Set correct function pointer, fix crash 2024-05-02 23:38:15 +02:00
vp3dsp.asm
vp6dsp_init.c
vp6dsp.asm
vp8dsp_init.c
vp8dsp_loopfilter.asm
vp8dsp.asm x86: Update x86inc.asm 2024-03-24 14:53:57 +01:00
vp9dsp_init_10bpp.c
vp9dsp_init_12bpp.c
vp9dsp_init_16bpp_template.c
vp9dsp_init_16bpp.c
vp9dsp_init.c
vp9dsp_init.h
vp9intrapred_16bpp.asm
vp9intrapred.asm
vp9itxfm_16bpp.asm x86: Update x86inc.asm 2024-03-24 14:53:57 +01:00
vp9itxfm_template.asm
vp9itxfm.asm x86: Update x86inc.asm 2024-03-24 14:53:57 +01:00
vp9lpf_16bpp.asm
vp9lpf.asm
vp9mc_16bpp.asm
vp9mc.asm
vpx_arith.h
w64xmmtest.c
xvididct_init.c
xvididct.asm
xvididct.h