1
0
mirror of https://github.com/FFmpeg/FFmpeg.git synced 2024-12-12 19:18:44 +02:00
Commit Graph

8 Commits

Author SHA1 Message Date
Hubert Mazur
a2e45ad407 lavc/aarch64: Add neon implementation for pix_abs16_y2
Provide optimized implementation of pix_abs16_y2 function for arm64.

Performance comparison tests are shown below.
pix_abs_0_2_c: 317.2
pix_abs_0_2_neon: 37.5

Benchmarks and tests run with checkasm tool on AWS Graviton 3.

Signed-off-by: Hubert Mazur <hum@semihalf.com>
Signed-off-by: Martin Storsjö <martin@martin.st>
2022-08-18 12:07:26 +03:00
Hubert Mazur
d7abb7d143 lavc/aarch64: Add neon implementation for sse4
Provide neon implementation for sse4 function.

Performance comparison tests are shown below.
- sse_2_c: 80.7
- sse_2_neon: 31.0

Benchmarks and tests are run with checkasm tool on AWS Graviton 3.

Signed-off-by: Hubert Mazur <hum@semihalf.com>
Signed-off-by: Martin Storsjö <martin@martin.st>
2022-08-18 12:07:26 +03:00
Hubert Mazur
ad251fd262 lavc/aarch64: Add neon implementation for sse16
Provide neon implementation for sse16 function.

Performance comparison tests are shown below.
- sse_0_c: 268.2
- sse_0_neon: 43.5

Benchmarks and tests run with checkasm tool on AWS Graviton 3.

Signed-off-by: Hubert Mazur <hum@semihalf.com>
Signed-off-by: Martin Storsjö <martin@martin.st>
2022-08-18 12:07:26 +03:00
Martin Storsjö
60109d5b3d aarch64: me_cmp: Fix the indentation of function declarations
Signed-off-by: Martin Storsjö <martin@martin.st>
2022-08-18 12:07:26 +03:00
Andreas Rheinhardt
abb85429f3 avcodec/me_cmp: Constify me_cmp_func buffer parameters
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2022-07-31 03:31:53 +02:00
Hubert Mazur
01e190dc99 lavc/aarch64: Add pix_abs16_x2 neon implementation
Provide neon implementation for pix_abs16_x2 function.

Performance tests of implementation are below.
 - pix_abs_0_1_c: 283.5
 - pix_abs_0_1_neon: 39.0

Benchmarks and tests run with checkasm tool on AWS Graviton 3.

Signed-off-by: Hubert Mazur <hum@semihalf.com>
Signed-off-by: Martin Storsjö <martin@martin.st>
2022-07-13 23:25:22 +03:00
Hubert Mazur
eb7ab3928f lavc/aarch64: Hook up the existing ff_pix_abs16_neon to the sad[0] function pointer
Signed-off-by: Hubert Mazur <hum@semihalf.com>
Signed-off-by: Martin Storsjö <martin@martin.st>
2022-07-11 23:58:28 +03:00
Swinney, Jonathan
c471cc7474 lavc/aarch64: motion estimation functions in neon
- ff_pix_abs16_neon
 - ff_pix_abs16_xy2_neon

In direct micro benchmarks of these ff functions verses their C implementations,
these functions performed as follows on AWS Graviton 3.

ff_pix_abs16_neon:
pix_abs_0_0_c: 141.1
pix_abs_0_0_neon: 19.6

ff_pix_abs16_xy2_neon:
pix_abs_0_3_c: 269.1
pix_abs_0_3_neon: 39.3

Tested with:
./tests/checkasm/checkasm --test=motion --bench --disable-linux-perf

Signed-off-by: Jonathan Swinney <jswinney@amazon.com>
Signed-off-by: Martin Storsjö <martin@martin.st>
2022-06-28 00:51:39 +03:00