mirror of https://github.com/FFmpeg/FFmpeg.git synced 2025-02-14 22:22:59 +02:00
FFmpeg/libavcodec
Martin Storsjö 575e31e931 arm: vp9lpf: Implement the mix2_44 function with one single filter pass
For this case, with 8 inputs but only changing 4 of them, we can fit
all 16 input pixels into a q register, and still have enough temporary
registers for doing the loop filter.

The wd=8 filters would require too many temporary registers for
processing all 16 pixels at once though.
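
As an illustration of the single-pass idea, here is a minimal scalar sketch in C. It is not the NEON code from this commit. The function names (filter4_col, sketch_mix2_v_44_16, sketch_v_4_8) are hypothetical. The packing of the two sets of thresholds into the low and high bytes of E, I and H is an assumption, and the filter arithmetic is a simplified 8-bit model of a wd=4 loop filter. Each column reads 8 pixels (p3..q3) and writes at most 4 (p1, p0, q0, q1), and all 16 columns are handled in one loop instead of two 8-column calls.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

static int clip_s8(int v) { return v < -128 ? -128 : (v > 127 ? 127 : v); }
static uint8_t clip_u8(int v) { return (uint8_t)(v < 0 ? 0 : (v > 255 ? 255 : v)); }
static int min_int(int a, int b) { return a < b ? a : b; }

/* Scalar model of a wd=4 filter for one column across a horizontal edge.
 * dst points at q0; the rows above it hold p0..p3.
 * Reads 8 pixels, modifies at most p1, p0, q0, q1. */
static void filter4_col(uint8_t *dst, ptrdiff_t stride, int E, int I, int H)
{
    int p3 = dst[-4 * stride], p2 = dst[-3 * stride];
    int p1 = dst[-2 * stride], p0 = dst[-1 * stride];
    int q0 = dst[0],           q1 = dst[1 * stride];
    int q2 = dst[2 * stride],  q3 = dst[3 * stride];

    /* Filter mask: skip the column if any neighbour difference is too large. */
    int fm = abs(p3 - p2) <= I && abs(p2 - p1) <= I && abs(p1 - p0) <= I &&
             abs(q1 - q0) <= I && abs(q2 - q1) <= I && abs(q3 - q2) <= I &&
             abs(p0 - q0) * 2 + (abs(p1 - q1) >> 1) <= E;
    if (!fm)
        return;

    /* High edge variance: only p0/q0 are adjusted in that case. */
    int hev = abs(p1 - p0) > H || abs(q1 - q0) > H;
    int f   = hev ? clip_s8(p1 - q1) : 0;
    f = clip_s8(3 * (q0 - p0) + f);
    /* Note: >> on a negative int is assumed to be an arithmetic shift. */
    int f1 = min_int(f + 4, 127) >> 3;
    int f2 = min_int(f + 3, 127) >> 3;

    dst[-1 * stride] = clip_u8(p0 + f2);
    dst[0]           = clip_u8(q0 - f1);
    if (!hev) {
        f = (f1 + 1) >> 1;
        dst[-2 * stride] = clip_u8(p1 + f);
        dst[1 * stride]  = clip_u8(q1 - f);
    }
}

/* Reference shape: one call filters 8 columns with a single set of thresholds. */
static void sketch_v_4_8(uint8_t *dst, ptrdiff_t stride, int E, int I, int H)
{
    for (int x = 0; x < 8; x++)
        filter4_col(dst + x, stride, E, I, H);
}

/* Single pass over all 16 columns. Assumption: the low byte of E/I/H holds
 * the thresholds for columns 0..7 and the high byte those for columns 8..15. */
static void sketch_mix2_v_44_16(uint8_t *dst, ptrdiff_t stride, int E, int I, int H)
{
    for (int x = 0; x < 16; x++) {
        int shift = x >= 8 ? 8 : 0;
        filter4_col(dst + x, stride,
                    (E >> shift) & 0xff, (I >> shift) & 0xff, (H >> shift) & 0xff);
    }
}
```

In the vectorized version the loop over x disappears: 16 pixels of 8 bits each fill one 128-bit q register, so each of the 8 input rows occupies a single register and the per-column thresholds become per-lane values.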

Before:                          Cortex A7      A8     A9     A53
vp9_loop_filter_mix2_v_44_16_neon:   289.7   256.2  237.5   181.2
After:
vp9_loop_filter_mix2_v_44_16_neon:   221.2   150.5  177.7   138.0

Signed-off-by: Martin Storsjö <martin@martin.st>
2017-02-24 00:03:09 +02:00