FFmpeg

mirror of https://github.com/FFmpeg/FFmpeg.git synced 2024-12-12 19:18:44 +02:00

Author	SHA1	Message	Date
Andreas Rheinhardt	333b32af8e	avcodec/h264chroma: Constify src in h264_chroma_mc_func Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2022-08-05 03:02:13 +02:00
Ben Avison	6eee650289	avcodec/vc1: Arm 64-bit NEON unescape fast path checkasm benchmarks on 1.5 GHz Cortex-A72 are as follows. vc1dsp.vc1_unescape_buffer_c: 655617.7 vc1dsp.vc1_unescape_buffer_neon: 118237.0 Signed-off-by: Ben Avison <bavison@riscosopen.org> Signed-off-by: Martin Storsjö <martin@martin.st>	2022-04-01 10:03:34 +03:00
Ben Avison	501fdc017d	avcodec/vc1: Arm 64-bit NEON inverse transform fast paths checkasm benchmarks on 1.5 GHz Cortex-A72 are as follows. vc1dsp.vc1_inv_trans_4x4_c: 158.2 vc1dsp.vc1_inv_trans_4x4_neon: 65.7 vc1dsp.vc1_inv_trans_4x4_dc_c: 86.5 vc1dsp.vc1_inv_trans_4x4_dc_neon: 26.5 vc1dsp.vc1_inv_trans_4x8_c: 335.2 vc1dsp.vc1_inv_trans_4x8_neon: 106.2 vc1dsp.vc1_inv_trans_4x8_dc_c: 151.2 vc1dsp.vc1_inv_trans_4x8_dc_neon: 25.5 vc1dsp.vc1_inv_trans_8x4_c: 365.7 vc1dsp.vc1_inv_trans_8x4_neon: 97.2 vc1dsp.vc1_inv_trans_8x4_dc_c: 139.7 vc1dsp.vc1_inv_trans_8x4_dc_neon: 16.5 vc1dsp.vc1_inv_trans_8x8_c: 547.7 vc1dsp.vc1_inv_trans_8x8_neon: 137.0 vc1dsp.vc1_inv_trans_8x8_dc_c: 268.2 vc1dsp.vc1_inv_trans_8x8_dc_neon: 30.5 Signed-off-by: Ben Avison <bavison@riscosopen.org> Signed-off-by: Martin Storsjö <martin@martin.st>	2022-04-01 10:03:34 +03:00
Ben Avison	c62bbd4d20	avcodec/vc1: Arm 64-bit NEON deblocking filter fast paths checkasm benchmarks on 1.5 GHz Cortex-A72 are as follows. Note that the C version can still outperform the NEON version in specific cases. The balance between different code paths is stream-dependent, but in practice the best case happens about 5% of the time, the worst case happens about 40% of the time, and the complexity of the remaining cases fall somewhere in between. Therefore, taking the average of the best and worst case timings is probably a conservative estimate of the degree by which the NEON code improves performance. vc1dsp.vc1_h_loop_filter4_bestcase_c: 10.7 vc1dsp.vc1_h_loop_filter4_bestcase_neon: 43.5 vc1dsp.vc1_h_loop_filter4_worstcase_c: 184.5 vc1dsp.vc1_h_loop_filter4_worstcase_neon: 73.7 vc1dsp.vc1_h_loop_filter8_bestcase_c: 31.2 vc1dsp.vc1_h_loop_filter8_bestcase_neon: 62.2 vc1dsp.vc1_h_loop_filter8_worstcase_c: 358.2 vc1dsp.vc1_h_loop_filter8_worstcase_neon: 88.2 vc1dsp.vc1_h_loop_filter16_bestcase_c: 51.0 vc1dsp.vc1_h_loop_filter16_bestcase_neon: 107.7 vc1dsp.vc1_h_loop_filter16_worstcase_c: 722.7 vc1dsp.vc1_h_loop_filter16_worstcase_neon: 140.5 vc1dsp.vc1_v_loop_filter4_bestcase_c: 9.7 vc1dsp.vc1_v_loop_filter4_bestcase_neon: 43.0 vc1dsp.vc1_v_loop_filter4_worstcase_c: 178.7 vc1dsp.vc1_v_loop_filter4_worstcase_neon: 69.0 vc1dsp.vc1_v_loop_filter8_bestcase_c: 30.2 vc1dsp.vc1_v_loop_filter8_bestcase_neon: 50.7 vc1dsp.vc1_v_loop_filter8_worstcase_c: 353.0 vc1dsp.vc1_v_loop_filter8_worstcase_neon: 69.2 vc1dsp.vc1_v_loop_filter16_bestcase_c: 60.0 vc1dsp.vc1_v_loop_filter16_bestcase_neon: 90.0 vc1dsp.vc1_v_loop_filter16_worstcase_c: 714.2 vc1dsp.vc1_v_loop_filter16_worstcase_neon: 97.2 Signed-off-by: Ben Avison <bavison@riscosopen.org> Signed-off-by: Martin Storsjö <martin@martin.st>	2022-04-01 10:03:33 +03:00
James Almer	a8474df944	Merge commit 'e4a94d8b36c48d95a7d412c40d7b558422ff659c' * commit 'e4a94d8b36c48d95a7d412c40d7b558422ff659c': h264chroma: Change type of stride parameters to ptrdiff_t Merged-by: James Almer <jamrial@gmail.com>	2017-03-21 15:20:45 -03:00
Diego Biurrun	e4a94d8b36	h264chroma: Change type of stride parameters to ptrdiff_t This avoids SIMD-optimized functions having to sign-extend their stride argument manually to be able to do pointer arithmetic.	2016-09-29 14:48:04 +02:00
Michael Niedermayer	6f001d87ff	Merge commit '71617884a2a673908bd5c0f73d4f91fdca3da82a' * commit '71617884a2a673908bd5c0f73d4f91fdca3da82a': aarch64: h264 chroma motion compensation NEON optimizations Merged-by: Michael Niedermayer <michaelni@gmx.at>	2014-01-15 15:00:06 +01:00
Janne Grunau	71617884a2	aarch64: h264 chroma motion compensation NEON optimizations Since RV40 and VC-1 use almost the same algorithm so optimizations for those two decoders are easy to do and included.	2014-01-15 12:07:18 +01:00

8 Commits