mirror of https://github.com/FFmpeg/FFmpeg.git synced 2025-01-08 13:22:53 +02:00
FFmpeg/libavcodec/aarch64
Martin Storsjö 68a03f6424 aarch64: me_cmp: Switch from uabd to uabal in ff_pix_abs16_xy2_neon
Using absolute-difference-accumulate does use twice the amount of
absolute-difference instructions, but avoids the need for the
uaddl and add instructions, reducing the total number of instructions
by 3.

These can be interleaved in the rest of the calculation, to avoid
tight dependencies at the end. Unfortunately, this is marginally
slower on Cortex A53, but faster on A72 and A73.

Before:            Cortex A53     A72     A73   Graviton 3
pix_abs_0_3_neon:       175.7   109.2    92.0         41.2
After:
pix_abs_0_3_neon:       179.7    96.7    87.5         41.2

Signed-off-by: Martin Storsjö <martin@martin.st>
2022-07-16 17:25:54 +03:00
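
To make the commit message above more concrete, here is a minimal scalar sketch in C of the quantity such a function computes: the sum of absolute differences between a 16-pixel-wide block and the xy half-pel interpolation of a reference block. This is not the NEON code from me_cmp_neon.S. The names sad16_xy2_ref and sad16_xy2_lanes are hypothetical, and the rounding formula in avg4 is assumed to match the C reference. The second function only models the accumulation pattern the message describes, assuming uabal and uabal2 add the widened absolute differences of the lower and upper 8 bytes directly into eight 16-bit lanes, so no separate widening add is needed.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Rounded average of four pixels (assumed half-pel xy interpolation). */
static inline int avg4(int a, int b, int c, int d)
{
    return (a + b + c + d + 2) >> 2;
}

/* Hypothetical scalar reference: SAD of a 16-wide block in pix1 against
 * the xy-interpolated block in pix2. Reads 17 columns and h + 1 rows
 * of pix2. */
static int sad16_xy2_ref(const uint8_t *pix1, const uint8_t *pix2,
                         ptrdiff_t stride, int h)
{
    int sum = 0;
    for (int y = 0; y < h; y++) {
        const uint8_t *p2 = pix2 + stride; /* row below the current one */
        for (int x = 0; x < 16; x++)
            sum += abs(pix1[x] - avg4(pix2[x], pix2[x + 1],
                                      p2[x], p2[x + 1]));
        pix1 += stride;
        pix2 += stride;
    }
    return sum;
}

/* Scalar model of the accumulate form: eight 16-bit lanes each receive
 * the absolute difference of one lower-half pixel (modelling uabal) and
 * one upper-half pixel (modelling uabal2) per row. With h <= 16 a lane
 * holds at most 16 * 2 * 255 = 8160, so 16 bits cannot overflow. */
static int sad16_xy2_lanes(const uint8_t *pix1, const uint8_t *pix2,
                           ptrdiff_t stride, int h)
{
    uint16_t acc[8] = { 0 };
    for (int y = 0; y < h; y++) {
        const uint8_t *p2 = pix2 + stride;
        for (int x = 0; x < 8; x++) {
            /* lower 8 pixels of the row */
            acc[x] += abs(pix1[x] - avg4(pix2[x], pix2[x + 1],
                                         p2[x], p2[x + 1]));
            /* upper 8 pixels of the row */
            acc[x] += abs(pix1[x + 8] - avg4(pix2[x + 8], pix2[x + 9],
                                             p2[x + 8], p2[x + 9]));
        }
        pix1 += stride;
        pix2 += stride;
    }
    /* Horizontal sum of the lanes, done once at the end. */
    int sum = 0;
    for (int i = 0; i < 8; i++)
        sum += acc[i];
    return sum;
}
```

Both functions return the same value for any input with h up to 16. The difference the commit is concerned with is only how many instructions the accumulation takes and how they can be scheduled.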
aacpsdsp_init_aarch64.c
aacpsdsp_neon.S
asm-offsets.h
cabac.h
fft_init_aarch64.c
fft_neon.S arm64: Fix wrong BTI landing pad 2022-04-26 10:26:49 +03:00
fmtconvert_init.c
fmtconvert_neon.S
h264chroma_init_aarch64.c
h264cmc_neon.S configure: Use a separate config_components.h header for $ALL_COMPONENTS 2022-03-16 14:12:49 +02:00
h264dsp_init_aarch64.c
h264dsp_neon.S aarch64: h264dsp: Fix incorrectly indented code 2022-02-11 10:49:12 +02:00
h264idct_neon.S
h264pred_init.c
h264pred_neon.S
h264qpel_init_aarch64.c
h264qpel_neon.S
hevcdsp_idct_neon.S
hevcdsp_init_aarch64.c lavc/aarch64: add hevc sao edge 8x8 2022-05-25 08:04:46 +02:00
hevcdsp_sao_neon.S lavc/aarch64: hevc_sao reschedule slightly 2022-05-26 08:10:41 +02:00
hpeldsp_init_aarch64.c
hpeldsp_neon.S
idct.h avcodec/aarch64/idct: Add missing stddef 2022-02-21 13:10:04 +01:00
idctdsp_init_aarch64.c avcodec/idctdsp: Arm 64-bit NEON block add and clamp fast paths 2022-04-01 10:03:34 +03:00
idctdsp_neon.S avcodec/idctdsp: Arm 64-bit NEON block add and clamp fast paths 2022-04-01 10:03:34 +03:00
Makefile lavc/aarch64: motion estimation functions in neon 2022-06-28 00:51:39 +03:00
mdct_neon.S arm64: Add Armv8.3-A PAC support to assembly files 2022-03-09 15:04:25 +02:00
me_cmp_init_aarch64.c lavc/aarch64: Add pix_abs16_x2 neon implementation 2022-07-13 23:25:22 +03:00
me_cmp_neon.S aarch64: me_cmp: Switch from uabd to uabal in ff_pix_abs16_xy2_neon 2022-07-16 17:25:54 +03:00
mpegaudiodsp_init.c
mpegaudiodsp_neon.S
neon.S
neontest.c
opusdsp_init.c
opusdsp_neon.S
pixblockdsp_init_aarch64.c
pixblockdsp_neon.S
rv40dsp_init_aarch64.c
sbrdsp_init_aarch64.c
sbrdsp_neon.S
simple_idct_neon.S
synth_filter_init.c
synth_filter_neon.S arm64: Add Armv8.3-A PAC support to assembly files 2022-03-09 15:04:25 +02:00
vc1dsp_init_aarch64.c avcodec/vc1: Arm 64-bit NEON unescape fast path 2022-04-01 10:03:34 +03:00
vc1dsp_neon.S avcodec/vc1: Arm 64-bit NEON unescape fast path 2022-04-01 10:03:34 +03:00
videodsp_init.c
videodsp.S
vorbisdsp_init.c
vorbisdsp_neon.S
vp8dsp_init_aarch64.c
vp8dsp_neon.S
vp8dsp.h
vp9dsp_init_10bpp_aarch64.c
vp9dsp_init_12bpp_aarch64.c
vp9dsp_init_16bpp_aarch64_template.c
vp9dsp_init_aarch64.c
vp9dsp_init.h
vp9itxfm_16bpp_neon.S
vp9itxfm_neon.S
vp9lpf_16bpp_neon.S
vp9lpf_neon.S
vp9mc_16bpp_neon.S
vp9mc_aarch64.S
vp9mc_neon.S