FFmpeg

mirror of https://github.com/FFmpeg/FFmpeg.git synced 2024-12-23 12:43:46 +02:00

Martin Storsjö 2905657b90 aarch64: vp9itxfm: Avoid reloading the idct32 coefficients	2017-03-11 13:14:51 +02:00

The idct32x32 function actually pushed d8-d15 onto the stack even though it didn't clobber them; there are plenty of registers that can be used to allow keeping all the idct coefficients in registers without having to reload different subsets of them at different stages in the transform.

After this, we still can skip pushing d12-d15.

Before:
vp9_inv_dct_dct_32x32_sub32_add_neon: 8128.3
After:
vp9_inv_dct_dct_32x32_sub32_add_neon: 8053.3

This is cherrypicked from libav commit `65aa002d54`.

Signed-off-by: Martin Storsjö <martin@martin.st>
..
asm-offsets.h
cabac.h
fft_init_aarch64.c
fft_neon.S
fmtconvert_init.c
fmtconvert_neon.S
h264chroma_init_aarch64.c
h264cmc_neon.S
h264dsp_init_aarch64.c
h264dsp_neon.S
h264idct_neon.S	aarch64: h264idct: Use the offset parameter to movrel	2016-12-08 18:11:07 +01:00
h264pred_init.c
h264pred_neon.S
h264qpel_init_aarch64.c
h264qpel_neon.S
hpeldsp_init_aarch64.c
hpeldsp_neon.S
Makefile	aarch64: Add NEON optimizations for 10 and 12 bit vp9 loop filter	2017-01-24 22:36:11 +02:00
mdct_neon.S
mpegaudiodsp_init.c
mpegaudiodsp_neon.S
neon.S
neontest.c
rv40dsp_init_aarch64.c
synth_filter_init.c
synth_filter_neon.S
vc1dsp_init_aarch64.c
videodsp_init.c
videodsp.S
vorbisdsp_init.c
vorbisdsp_neon.S
vp9dsp_init_10bpp_aarch64.c	aarch64: Add NEON optimizations for 10 and 12 bit vp9 MC	2017-01-24 22:36:05 +02:00
vp9dsp_init_12bpp_aarch64.c	aarch64: Add NEON optimizations for 10 and 12 bit vp9 MC	2017-01-24 22:36:05 +02:00
vp9dsp_init_16bpp_aarch64_template.c	aarch64: Add NEON optimizations for 10 and 12 bit vp9 loop filter	2017-01-24 22:36:11 +02:00
vp9dsp_init_aarch64.c	aarch64: Add NEON optimizations for 10 and 12 bit vp9 MC	2017-01-24 22:36:05 +02:00
vp9dsp_init.h	aarch64: Add NEON optimizations for 10 and 12 bit vp9 MC	2017-01-24 22:36:05 +02:00
vp9itxfm_16bpp_neon.S	aarch64: Add NEON optimizations for 10 and 12 bit vp9 itxfm	2017-01-24 22:36:08 +02:00
vp9itxfm_neon.S	aarch64: vp9itxfm: Avoid reloading the idct32 coefficients	2017-03-11 13:14:51 +02:00
vp9lpf_16bpp_neon.S	aarch64: Add NEON optimizations for 10 and 12 bit vp9 loop filter	2017-01-24 22:36:11 +02:00
vp9lpf_neon.S	aarch64: vp9lpf: Use dup+rev16+uzp1 instead of dup+lsr+dup+trn1	2017-03-11 13:14:50 +02:00
vp9mc_16bpp_neon.S	aarch64: Add NEON optimizations for 10 and 12 bit vp9 MC	2017-01-24 22:36:05 +02:00
vp9mc_neon.S	aarch64: vp9mc: Calculate less unused data in the 4 pixel wide horizontal filter	2017-03-11 13:14:48 +02:00