mirror of https://github.com/FFmpeg/FFmpeg.git synced 2025-01-24 13:56:33 +02:00
FFmpeg/libavcodec
Martin Storsjö 68a03f6424 aarch64: me_cmp: Switch from uabd to uabal in ff_pix_abs16_xy2_neon
Using absolute-difference-accumulate (uabal) requires twice as many
absolute-difference instructions, but avoids the need for the
uaddl and add instructions, reducing the total number of instructions
by 3.

The accumulate instructions can be interleaved with the rest of the
calculation, to avoid tight dependencies at the end. Unfortunately,
this is marginally slower on Cortex A53, but faster on A72 and A73,
and unchanged on Graviton 3.
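
As a rough illustration of why the two forms are interchangeable, here
is a scalar C model of one 16-pixel row. It is a sketch, not the NEON
code from this commit: the function names are hypothetical, the lane
layout (8 accumulator lanes of 16 bits) is an assumption, and the
xy2 half-pel averaging step is left out entirely.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define LANES 8 /* assumed: eight 16-bit accumulator lanes */

/* Models "absolute difference, then widen-add, then add":
 * byte-wise |a-b|, pairwise widening sum of the low and high halves,
 * and a separate add into the accumulator. */
static void row_abd_widen_add(uint16_t acc[LANES],
                              const uint8_t a[16], const uint8_t b[16])
{
    uint8_t d[16];
    for (int i = 0; i < 16; i++)
        d[i] = (uint8_t)abs(a[i] - b[i]);
    for (int i = 0; i < LANES; i++)
        acc[i] = (uint16_t)(acc[i] + d[i] + d[i + LANES]);
}

/* Models "absolute difference and accumulate": two passes (low half,
 * high half), each widening |a-b| straight into the accumulator. */
static void row_abd_accumulate(uint16_t acc[LANES],
                               const uint8_t a[16], const uint8_t b[16])
{
    for (int i = 0; i < LANES; i++)
        acc[i] = (uint16_t)(acc[i] + abs(a[i] - b[i]));
    for (int i = 0; i < LANES; i++)
        acc[i] = (uint16_t)(acc[i] + abs(a[i + LANES] - b[i + LANES]));
}

/* Horizontal reduction of the accumulator lanes. */
static unsigned sum_lanes(const uint16_t acc[LANES])
{
    unsigned s = 0;
    for (int i = 0; i < LANES; i++)
        s += acc[i];
    return s;
}

/* Runs both models on one row, checks they agree, returns the SAD. */
static unsigned sad16_checked(const uint8_t a[16], const uint8_t b[16])
{
    uint16_t acc1[LANES] = { 0 }, acc2[LANES] = { 0 };
    row_abd_widen_add(acc1, a, b);
    row_abd_accumulate(acc2, a, b);
    assert(sum_lanes(acc1) == sum_lanes(acc2));
    return sum_lanes(acc2);
}
```

In this model the accumulate form does the absolute difference twice
per row but needs no separate widening or add step, which mirrors the
trade-off described above.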

                   Cortex A53    A72    A73   Graviton 3
Before:
pix_abs_0_3_neon:       175.7  109.2   92.0         41.2
After:
pix_abs_0_3_neon:       179.7   96.7   87.5         41.2

Signed-off-by: Martin Storsjö <martin@martin.st>
2022-07-16 17:25:54 +03:00