mirror of https://github.com/FFmpeg/FFmpeg.git synced 2024-12-18 03:19:31 +02:00
FFmpeg/libavcodec/arm
Martin Storsjö 3fbbad2984 arm/aarch64: vp9lpf: Keep the comparison to E within 8 bit
The theoretical maximum value of E is 193, so we can just
saturate the addition to 255.

Before:                     Cortex A7      A8      A9     A53  A53/AArch64
vp9_loop_filter_v_4_8_neon:     143.0   127.7   114.8    88.0         87.7
vp9_loop_filter_v_8_8_neon:     241.0   197.2   173.7   140.0        136.7
vp9_loop_filter_v_16_8_neon:    497.0   419.5   379.7   293.0        275.7
vp9_loop_filter_v_16_16_neon:   965.2   818.7   731.4   579.0        452.0
After:
vp9_loop_filter_v_4_8_neon:     136.0   125.7   112.6    84.0         83.0
vp9_loop_filter_v_8_8_neon:     234.0   195.5   171.5   136.0        133.7
vp9_loop_filter_v_16_8_neon:    490.0   417.5   377.7   289.0        271.0
vp9_loop_filter_v_16_16_neon:   951.2   814.7   732.3   571.0        446.7

This is cherrypicked from libav commit
c582cb8537.

Signed-off-by: Martin Storsjö <martin@martin.st>
2017-03-11 13:14:50 +02:00
aac.h
aacpsdsp_init_arm.c
aacpsdsp_neon.S
ac3dsp_arm.S
ac3dsp_armv6.S
ac3dsp_init_arm.c
ac3dsp_neon.S
asm-offsets.h
audiodsp_arm.h
audiodsp_init_arm.c
audiodsp_init_neon.c
audiodsp_neon.S
blockdsp_arm.h
blockdsp_init_arm.c
blockdsp_init_neon.c
blockdsp_neon.S
cabac.h
dca.h
fft_fixed_init_arm.c
fft_fixed_neon.S
fft_init_arm.c
fft_neon.S
fft_vfp.S
flacdsp_arm.S
flacdsp_init_arm.c
fmtconvert_init_arm.c
fmtconvert_neon.S
fmtconvert_vfp.S
g722dsp_init_arm.c
g722dsp_neon.S
h264chroma_init_arm.c
h264cmc_neon.S avcodec: fix vc1dsp dependencies 2016-09-25 13:11:45 +02:00
h264dsp_init_arm.c
h264dsp_neon.S
h264idct_neon.S
h264pred_init_arm.c
h264pred_neon.S
h264qpel_init_arm.c
h264qpel_neon.S
hevcdsp_arm.h
hevcdsp_deblock_neon.S
hevcdsp_idct_neon.S Merge commit '1bd890ad173d79e7906c5e1d06bf0a06cca4519d' 2017-01-31 15:31:34 +01:00
hevcdsp_init_arm.c
hevcdsp_init_neon.c Merge commit '1bd890ad173d79e7906c5e1d06bf0a06cca4519d' 2017-01-31 15:31:34 +01:00
hevcdsp_qpel_neon.S
hpeldsp_arm.h
hpeldsp_arm.S
hpeldsp_armv6.S
hpeldsp_init_arm.c
hpeldsp_init_armv6.c
hpeldsp_init_neon.c
hpeldsp_neon.S
idct.h
idctdsp_arm.h
idctdsp_arm.S
idctdsp_armv6.S
idctdsp_init_arm.c
idctdsp_init_armv5te.c
idctdsp_init_armv6.c
idctdsp_init_neon.c
idctdsp_neon.S
int_neon.S
jrevdct_arm.S
lossless_audiodsp_init_arm.c
lossless_audiodsp_neon.S
Makefile arm: Add NEON optimizations for 10 and 12 bit vp9 loop filter 2017-01-24 22:35:59 +02:00
mathops.h
mdct_fixed_neon.S
mdct_neon.S
mdct_vfp.S
me_cmp_armv6.S
me_cmp_init_arm.c
mlpdsp_armv5te.S
mlpdsp_armv6.S Merge commit '41ed7ab45fc693f7d7fc35664c0233f4c32d69bb' 2016-06-21 21:55:34 +02:00
mlpdsp_init_arm.c
mpegaudiodsp_fixed_armv6.S
mpegaudiodsp_init_arm.c
mpegvideo_arm.c
mpegvideo_arm.h
mpegvideo_armv5te_s.S
mpegvideo_armv5te.c Merge commit '41ed7ab45fc693f7d7fc35664c0233f4c32d69bb' 2016-06-21 21:55:34 +02:00
mpegvideo_neon.S
mpegvideoencdsp_armv6.S
mpegvideoencdsp_init_arm.c
neon.S
neontest.c avcodec: fix arguments on xmm/neon clobber test wrappers 2016-10-02 02:15:47 -03:00
pixblockdsp_armv6.S
pixblockdsp_init_arm.c
rdft_init_arm.c
rdft_neon.S
rv34dsp_init_arm.c
rv34dsp_neon.S
rv40dsp_init_arm.c
rv40dsp_neon.S
sbrdsp_init_arm.c
sbrdsp_neon.S
simple_idct_arm.S Merge commit '41ed7ab45fc693f7d7fc35664c0233f4c32d69bb' 2016-06-21 21:55:34 +02:00
simple_idct_armv5te.S
simple_idct_armv6.S
simple_idct_neon.S
startcode_armv6.S
startcode.h
synth_filter_init_arm.c
synth_filter_neon.S
synth_filter_vfp.S
vc1dsp_init_arm.c
vc1dsp_init_neon.c
vc1dsp_neon.S
vc1dsp.h
videodsp_arm.h
videodsp_armv5te.S
videodsp_init_arm.c
videodsp_init_armv5te.c
vorbisdsp_init_arm.c
vorbisdsp_neon.S
vp3dsp_init_arm.c
vp3dsp_neon.S
vp6dsp_init_arm.c
vp6dsp_neon.S
vp8_armv6.S
vp8.h
vp8dsp_armv6.S Merge commit '5f74bd31a9bd1ac7655103b11743c12d38e0419f' 2016-11-17 15:05:07 +01:00
vp8dsp_init_arm.c
vp8dsp_init_armv6.c
vp8dsp_init_neon.c
vp8dsp_neon.S Merge commit 'e8b96a77010dd62624c3c65c357d7ae3b397ceaa' 2016-11-14 15:21:49 +01:00
vp8dsp.h
vp9dsp_init_10bpp_arm.c arm: Add NEON optimizations for 10 and 12 bit vp9 MC 2017-01-24 22:35:50 +02:00
vp9dsp_init_12bpp_arm.c arm: Add NEON optimizations for 10 and 12 bit vp9 MC 2017-01-24 22:35:50 +02:00
vp9dsp_init_16bpp_arm_template.c arm: Add NEON optimizations for 10 and 12 bit vp9 loop filter 2017-01-24 22:35:59 +02:00
vp9dsp_init_arm.c arm: Add NEON optimizations for 10 and 12 bit vp9 MC 2017-01-24 22:35:50 +02:00
vp9dsp_init.h arm: Add NEON optimizations for 10 and 12 bit vp9 MC 2017-01-24 22:35:50 +02:00
vp9itxfm_16bpp_neon.S arm: Add NEON optimizations for 10 and 12 bit vp9 itxfm 2017-01-24 22:35:56 +02:00
vp9itxfm_neon.S arm: vp9itxfm: Optimize 16x16 and 32x32 idct dc by unrolling 2017-03-11 13:14:48 +02:00
vp9lpf_16bpp_neon.S arm: Add NEON optimizations for 10 and 12 bit vp9 loop filter 2017-01-24 22:35:59 +02:00
vp9lpf_neon.S arm/aarch64: vp9lpf: Keep the comparison to E within 8 bit 2017-03-11 13:14:50 +02:00
vp9mc_16bpp_neon.S arm: Add NEON optimizations for 10 and 12 bit vp9 MC 2017-01-24 22:35:50 +02:00
vp9mc_neon.S arm: vp9mc: Calculate less unused data in the 4 pixel wide horizontal filter 2017-03-11 13:14:47 +02:00
vp56_arith.h