Ramiro Polla
|
c0666d8bed
|
swscale/aarch64/rgb2rgb: add neon implementation for rgb24toyv12
                            A55                 A76
rgb24toyv12_16_200_c:       36890.6             17275.5
rgb24toyv12_16_200_neon:    12460.1 ( 2.96x)     5360.8 ( 3.22x)
rgb24toyv12_128_60_c:       83205.1             39884.8
rgb24toyv12_128_60_neon:    27468.4 ( 3.03x)    13552.5 ( 2.94x)
rgb24toyv12_512_16_c:       88111.6             42346.8
rgb24toyv12_512_16_neon:    29126.6 ( 3.03x)    14411.2 ( 2.94x)
rgb24toyv12_1920_4_c:       82068.1             39620.0
rgb24toyv12_1920_4_neon:    27011.6 ( 3.04x)    13492.2 ( 2.94x)
|
2024-09-06 23:11:13 +02:00 |
|
Ramiro Polla
|
d8848325a6
|
swscale/aarch64/rgb2rgb: add deinterleaveBytes neon implementation
                                    A55                 A76
deinterleave_bytes_c:               70342.0             34497.5
deinterleave_bytes_neon:            21594.5 ( 3.26x)     5535.2 ( 6.23x)
deinterleave_bytes_aligned_c:       71340.8             34651.2
deinterleave_bytes_aligned_neon:     8616.8 ( 8.28x)     3996.2 ( 8.67x)
|
2024-09-06 23:05:09 +02:00 |
|
Martin Storsjö
|
e0604d508e
|
swscale: aarch64: Add a NEON implementation of interleaveBytes
This allows speeding up format conversions from yuv420 to nv12.
Cortex                             A53        A72        A73
interleave_bytes_c:                86077.5    51433.0    66972.0
interleave_bytes_neon:             19701.7    23019.2    15859.2
interleave_bytes_aligned_c:        86603.0    52017.2    67484.2
interleave_bytes_aligned_neon:      9061.0     7623.0     6309.0
Signed-off-by: Martin Storsjö <martin@martin.st>
|
2020-05-15 23:38:17 +03:00 |
|
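For readers unfamiliar with the operations benchmarked above, here is a minimal scalar sketch of what byte interleaving and deinterleaving do, for example merging separate U and V planes into the single UV plane of nv12 and splitting it back. The names `interleave_bytes_ref` and `deinterleave_bytes_ref`, the parameter order, and the stride handling are illustrative assumptions; this is not the swscale code, and it says nothing about how the NEON versions are written.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical reference routine (not the swscale implementation):
 * merge two planes into one, alternating bytes from src1 and src2.
 * Strides are in bytes and may be larger than the row width. */
void interleave_bytes_ref(const uint8_t *src1, const uint8_t *src2,
                          uint8_t *dst, size_t width, size_t height,
                          ptrdiff_t src1_stride, ptrdiff_t src2_stride,
                          ptrdiff_t dst_stride)
{
    for (size_t y = 0; y < height; y++) {
        for (size_t x = 0; x < width; x++) {
            dst[2 * x]     = src1[x];  /* even bytes come from plane 1 */
            dst[2 * x + 1] = src2[x];  /* odd bytes come from plane 2 */
        }
        src1 += src1_stride;
        src2 += src2_stride;
        dst  += dst_stride;
    }
}

/* Hypothetical inverse: split one interleaved plane into two planes.
 * width is the number of byte pairs per row. */
void deinterleave_bytes_ref(const uint8_t *src, uint8_t *dst1, uint8_t *dst2,
                            size_t width, size_t height,
                            ptrdiff_t src_stride, ptrdiff_t dst1_stride,
                            ptrdiff_t dst2_stride)
{
    for (size_t y = 0; y < height; y++) {
        for (size_t x = 0; x < width; x++) {
            dst1[x] = src[2 * x];      /* even bytes go to plane 1 */
            dst2[x] = src[2 * x + 1];  /* odd bytes go to plane 2 */
        }
        src  += src_stride;
        dst1 += dst1_stride;
        dst2 += dst2_stride;
    }
}
```

Each output byte depends on one input byte and nothing else, so a loop of this shape is a natural candidate for SIMD loads and stores that handle many bytes per instruction.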