mirror of https://github.com/FFmpeg/FFmpeg.git synced 2024-12-23 12:43:46 +02:00
FFmpeg/libavcodec/aarch64
Martin Storsjö dc47bf3872 aarch64: vp9itxfm: Make the larger core transforms standalone functions
This work is sponsored by, and copyright, Google.

This reduces the code size of libavcodec/aarch64/vp9itxfm_neon.o from
19496 to 14740 bytes.

This gives a small slowdown of a couple of tens of cycles, but makes
it more feasible to add more optimized versions of these transforms.

Before:
vp9_inv_dct_dct_16x16_sub4_add_neon:    1036.7
vp9_inv_dct_dct_16x16_sub16_add_neon:   1372.2
vp9_inv_dct_dct_32x32_sub4_add_neon:    5180.0
vp9_inv_dct_dct_32x32_sub32_add_neon:   8095.7

After:
vp9_inv_dct_dct_16x16_sub4_add_neon:    1051.0
vp9_inv_dct_dct_16x16_sub16_add_neon:   1390.1
vp9_inv_dct_dct_32x32_sub4_add_neon:    5199.9
vp9_inv_dct_dct_32x32_sub32_add_neon:   8125.8
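
The trade-off described above can be checked with a little arithmetic. The following is a minimal sketch in Python, not part of the commit: it copies the object sizes and cycle counts from this message and computes the size saving and the per-function slowdown. The names `before`, `after`, `slowdown`, `size_saved` and `size_saved_pct` are made up for this illustration.

```python
# Cycle counts copied from the Before/After tables in the commit message.
before = {
    "vp9_inv_dct_dct_16x16_sub4_add_neon": 1036.7,
    "vp9_inv_dct_dct_16x16_sub16_add_neon": 1372.2,
    "vp9_inv_dct_dct_32x32_sub4_add_neon": 5180.0,
    "vp9_inv_dct_dct_32x32_sub32_add_neon": 8095.7,
}
after = {
    "vp9_inv_dct_dct_16x16_sub4_add_neon": 1051.0,
    "vp9_inv_dct_dct_16x16_sub16_add_neon": 1390.1,
    "vp9_inv_dct_dct_32x32_sub4_add_neon": 5199.9,
    "vp9_inv_dct_dct_32x32_sub32_add_neon": 8125.8,
}


def slowdown(before, after):
    """Return {name: (extra_cycles, percent_increase)} for each benchmark."""
    result = {}
    for name, old in before.items():
        extra = after[name] - old
        result[name] = (extra, 100.0 * extra / old)
    return result


# Size of vp9itxfm_neon.o in bytes, before and after, from the commit message.
size_before, size_after = 19496, 14740
size_saved = size_before - size_after
size_saved_pct = 100.0 * size_saved / size_before

print(f"object size: -{size_saved} bytes ({size_saved_pct:.1f}% smaller)")
for name, (extra, pct) in slowdown(before, after).items():
    print(f"{name}: +{extra:.1f} cycles ({pct:.2f}%)")
```

The relative cost is largest for the 16x16 transforms, because the extra cycles are spread over a shorter total runtime there than in the 32x32 cases.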

This is cherrypicked from libav commit
115476018d.

Signed-off-by: Martin Storsjö <martin@martin.st>
2017-03-11 13:14:22 +02:00
asm-offsets.h
cabac.h
fft_init_aarch64.c Merge commit '97aec6e75ef36ed0402653519daa8e1fc8ddb555' 2016-04-12 15:43:09 +01:00
fft_neon.S
fmtconvert_init.c
fmtconvert_neon.S
h264chroma_init_aarch64.c
h264cmc_neon.S avcodec: fix vc1dsp dependencies 2016-09-25 13:11:45 +02:00
h264dsp_init_aarch64.c
h264dsp_neon.S
h264idct_neon.S aarch64: h264idct: Use the offset parameter to movrel 2016-12-08 18:11:07 +01:00
h264pred_init.c
h264pred_neon.S
h264qpel_init_aarch64.c
h264qpel_neon.S
hpeldsp_init_aarch64.c
hpeldsp_neon.S
Makefile aarch64: Add NEON optimizations for 10 and 12 bit vp9 loop filter 2017-01-24 22:36:11 +02:00
mdct_neon.S
mpegaudiodsp_init.c
mpegaudiodsp_neon.S Merge commit '41ed7ab45fc693f7d7fc35664c0233f4c32d69bb' 2016-06-21 21:55:34 +02:00
neon.S Merge commit 'cdb1665f70def544ddab3e3ed3763ef99c8b3873' 2016-04-24 12:51:42 +01:00
neontest.c avcodec: fix arguments on xmm/neon clobber test wrappers 2016-10-02 02:15:47 -03:00
rv40dsp_init_aarch64.c
synth_filter_init.c avcodec/synth_filter: split off remaining code from dcadec files 2016-01-25 14:57:38 -03:00
synth_filter_neon.S
vc1dsp_init_aarch64.c
videodsp_init.c
videodsp.S
vorbisdsp_init.c
vorbisdsp_neon.S
vp9dsp_init_10bpp_aarch64.c aarch64: Add NEON optimizations for 10 and 12 bit vp9 MC 2017-01-24 22:36:05 +02:00
vp9dsp_init_12bpp_aarch64.c aarch64: Add NEON optimizations for 10 and 12 bit vp9 MC 2017-01-24 22:36:05 +02:00
vp9dsp_init_16bpp_aarch64_template.c aarch64: Add NEON optimizations for 10 and 12 bit vp9 loop filter 2017-01-24 22:36:11 +02:00
vp9dsp_init_aarch64.c aarch64: Add NEON optimizations for 10 and 12 bit vp9 MC 2017-01-24 22:36:05 +02:00
vp9dsp_init.h aarch64: Add NEON optimizations for 10 and 12 bit vp9 MC 2017-01-24 22:36:05 +02:00
vp9itxfm_16bpp_neon.S aarch64: Add NEON optimizations for 10 and 12 bit vp9 itxfm 2017-01-24 22:36:08 +02:00
vp9itxfm_neon.S aarch64: vp9itxfm: Make the larger core transforms standalone functions 2017-03-11 13:14:22 +02:00
vp9lpf_16bpp_neon.S aarch64: Add NEON optimizations for 10 and 12 bit vp9 loop filter 2017-01-24 22:36:11 +02:00
vp9lpf_neon.S aarch64: vp9: loop filter: replace 'orr; cbn?z' with 'adds; b.{eq,ne}; 2017-01-14 21:13:10 +01:00
vp9mc_16bpp_neon.S aarch64: Add NEON optimizations for 10 and 12 bit vp9 MC 2017-01-24 22:36:05 +02:00
vp9mc_neon.S aarch64: vp9mc: Fix a comment to refer to a register with the right name 2017-01-14 21:13:43 +01:00