mirror of https://github.com/FFmpeg/FFmpeg.git synced 2024-11-26 19:01:44 +02:00
FFmpeg/libswscale/aarch64
Swinney, Jonathan 0d7caa5b09 swscale/aarch64: add vscale specializations
This commit adds new code paths for vscale when filterSize is 2, 4, or
8. By using specialized code with unrolling to match the filterSize we
can improve performance.
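
The idea can be sketched in plain C. This is only an illustration under stated assumptions, not the NEON assembly from the commit: the names `vscale_fn`, `vscale_generic`, `vscale_2`, `select_vscale`, and `clip_u8` are hypothetical, the signature is simplified, and dithering is replaced by a fixed rounding term. It assumes 12-bit filter coefficients applied to 15-bit intermediate samples, so the result is shifted right by 19.

```c
#include <stdint.h>

/* Hypothetical, simplified signature; the real swscale function also
 * takes dither and offset arguments. */
typedef void (*vscale_fn)(const int16_t *filter, int filter_size,
                          const int16_t **src, uint8_t *dest, int dst_w);

static uint8_t clip_u8(int v)
{
    return v < 0 ? 0 : v > 255 ? 255 : (uint8_t)v;
}

/* Generic path: the inner loop runs filter_size times per output pixel. */
static void vscale_generic(const int16_t *filter, int filter_size,
                           const int16_t **src, uint8_t *dest, int dst_w)
{
    for (int i = 0; i < dst_w; i++) {
        int val = 1 << 18; /* rounding term (this sketch omits dithering) */
        for (int j = 0; j < filter_size; j++)
            val += src[j][i] * filter[j];
        dest[i] = clip_u8(val >> 19);
    }
}

/* Specialized path for filter_size == 2: the inner loop is fully unrolled
 * and the coefficients and row pointers are loaded once, outside the loop. */
static void vscale_2(const int16_t *filter, int filter_size,
                     const int16_t **src, uint8_t *dest, int dst_w)
{
    (void)filter_size; /* known to be 2 on this path */
    const int16_t f0 = filter[0], f1 = filter[1];
    const int16_t *s0 = src[0], *s1 = src[1];
    for (int i = 0; i < dst_w; i++) {
        int val = (1 << 18) + s0[i] * f0 + s1[i] * f1;
        dest[i] = clip_u8(val >> 19);
    }
}

/* Pick a specialization when one exists, else fall back to the generic path.
 * Cases for 4 and 8 would follow the same pattern. */
static vscale_fn select_vscale(int filter_size)
{
    switch (filter_size) {
    case 2:  return vscale_2;
    default: return vscale_generic;
    }
}
```

A caller would select the function once per context, for example `vscale_fn fn = select_vscale(filterSize);`, and then call `fn` for every output line, so the choice costs nothing inside the per-pixel loop.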

On AWS c7g (Graviton 3, Neoverse V1) instances:
                                 before   after
yuv2yuvX_2_0_512_accurate_neon:  558.8    268.9
yuv2yuvX_4_0_512_accurate_neon:  637.5    434.9
yuv2yuvX_8_0_512_accurate_neon:  1144.8   806.2
yuv2yuvX_16_0_512_accurate_neon: 2080.5   1853.7

Signed-off-by: Jonathan Swinney <jswinney@amazon.com>
Signed-off-by: Martin Storsjö <martin@martin.st>
2022-08-16 13:40:42 +03:00
hscale.S            libswscale/aarch64: add another hscale specialization           2022-08-16 12:08:38 +03:00
Makefile            swscale: aarch64: Add a NEON implementation of interleaveBytes  2020-05-15 23:38:17 +03:00
output.S            swscale/aarch64: add vscale specializations                     2022-08-16 13:40:42 +03:00
rgb2rgb_neon.S      swscale: aarch64: Add a NEON implementation of interleaveBytes  2020-05-15 23:38:17 +03:00
rgb2rgb.c           swscale: aarch64: Add a NEON implementation of interleaveBytes  2020-05-15 23:38:17 +03:00
swscale_unscaled.c  sws: rename SwsContext.swscale to convert_unscaled              2021-07-03 15:57:53 +02:00
swscale.c           swscale/aarch64: add vscale specializations                     2022-08-16 13:40:42 +03:00
yuv2rgb_neon.S      aarch64/yuv2rgb_neon: fix return value                          2020-07-09 10:33:14 +01:00