FFmpeg

mirror of https://github.com/FFmpeg/FFmpeg.git synced 2024-12-12 19:18:44 +02:00

Author	SHA1	Message	Date
Lynne	e0661fc805	dca_core: convert to lavu/tx Thanks to Martin Storsjö <martin@martin.st> for fixing and testing the arm32 and aarch64 changes.	2022-11-06 14:39:36 +01:00
Andreas Rheinhardt	76d8f0dd14	avcodec/ac3dsp: Remove unused parameter Forgotten in `fd98594a88`. Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2022-09-29 23:37:13 +02:00
Martin Storsjö	86519234b8	arm: vc1dsp: Canonicalize the syntax for aligned NEON loads/stores This hopefully should fix building with older toolchains, hopefully fixing the fate failures on http://fate.ffmpeg.org/history.cgi?slot=armel5tej-qemu-debian-gcc4.4. Signed-off-by: Martin Storsjö <martin@martin.st>	2022-09-29 10:28:45 +03:00
Andreas Rheinhardt	9beba05311	avcodec/fmtconvert: Remove unused AVCodecContext parameter Unused since `d74a8cb7e4`. Reviewed-by: Rémi Denis-Courmont <remi@remlab.net> Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2022-09-21 20:26:40 +02:00
Rémi Denis-Courmont	b52034270a	lavc/vorbisdsp: use ptrdiff_t rather than intptr_t ... for a difference between pointers.	2022-09-19 13:51:00 -03:00
James Cowgill	50a4dff69f	avcodec/arm/sbcenc: avoid callee preserved vfp registers When compiling FFmpeg with GCC-9, some very random segfaults were observed in code which had previously called down into the SBC encoder NEON assembly routines. This was caused by these functions clobbering some of the vfp callee saved registers (d8 - d15 aka q4 - q7). GCC was using these registers to save local variables, but after these functions returned, they would contain garbage. Fix by reallocating the registers in the two affected functions in the following way: ff_sbc_analyze_4_neon: q2-q5 => q8-q11, then q1-q4 => q8-q11 ff_sbc_analyze_8_neon: q2-q9 => q8-q15 The reason for using these replacements is to keep closely related sets of registers consecutively numbered which hopefully makes the code more easy to follow. Since this commit only reallocates registers, it should have no performance impact. Signed-off-by: James Cowgill <jcowgill@debian.org> Signed-off-by: Martin Storsjö <martin@martin.st>	2022-09-13 09:51:51 +03:00
Andreas Rheinhardt	a54e53a1c4	avcodec/vp8dsp: Constify src in vp8_mc_func Reviewed-by: Peter Ross <pross@xvid.org> Reviewed-by: Ronald S. Bultje <rsbultje@gmail.com> Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2022-09-11 20:57:51 +02:00
Martin Storsjö	3f456dc245	arm: rv40dsp: Change stride parameters to ptrdiff_t These were missed when h264_chroma_mc_func was changed in `e4a94d8b36`. Signed-off-by: Martin Storsjö <martin@martin.st>	2022-09-02 23:04:58 +03:00
Martin Storsjö	826cd5e098	arm: vc1sdp: Change stride parameters to ptrdiff_t This was missed in `db54426975`. Signed-off-by: Martin Storsjö <martin@martin.st>	2022-09-02 23:04:55 +03:00
Lynne	f99d15cca0	arm/fft: disable NEON optimizations for 131072pt transforms This has been broken since the start, and it was only discovered when I started testing my replacement for the FFT. Disable it, since there's no point in fixing slower code that's about to be removed anyway. The vfp version is not affected.	2022-08-29 07:13:43 +02:00
Andreas Rheinhardt	6c4595190e	avcodec/flacdsp: Split encoder-only parts into a ctx of its own Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2022-08-05 03:28:45 +02:00
Andreas Rheinhardt	3a869cd5cd	avcodec/flacdsp: Remove unused function parameter Forgotten in `e609cfd697`. Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2022-08-05 03:28:45 +02:00
Andreas Rheinhardt	333b32af8e	avcodec/h264chroma: Constify src in h264_chroma_mc_func Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2022-08-05 03:02:13 +02:00
Andreas Rheinhardt	b3bbbb14d0	avcodec/hevcdsp: Constify src pointers Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2022-08-05 02:54:04 +02:00
Andreas Rheinhardt	966fc1230a	avcodec/mpegvideoencdsp: Allow pointers to const where possible Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2022-07-31 03:32:40 +02:00
Andreas Rheinhardt	abb85429f3	avcodec/me_cmp: Constify me_cmp_func buffer parameters Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2022-07-31 03:31:53 +02:00
Andreas Rheinhardt	af43da3e4d	avcodec/videodsp: Constify buf in VideoDSPContext.prefetch Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2022-07-31 03:14:34 +02:00
Andreas Rheinhardt	7ab9b30800	avcodec/vp56: Move VP5-9 range coder functions to a header of their own Also use a vpx prefix for them. Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2022-07-28 03:49:54 +02:00
Ben Avison	23c92e14f5	avcodec/vc1: Arm 32-bit NEON unescape fast path checkasm benchmarks on 1.5 GHz Cortex-A72 are as follows. vc1dsp.vc1_unescape_buffer_c: 918624.7 vc1dsp.vc1_unescape_buffer_neon: 142958.0 Signed-off-by: Ben Avison <bavison@riscosopen.org> Signed-off-by: Martin Storsjö <martin@martin.st>	2022-04-01 10:03:34 +03:00
Ben Avison	c07de58a72	avcodec/vc1: Arm 32-bit NEON deblocking filter fast paths checkasm benchmarks on 1.5 GHz Cortex-A72 are as follows. Note that the C version can still outperform the NEON version in specific cases. The balance between different code paths is stream-dependent, but in practice the best case happens about 5% of the time, the worst case happens about 40% of the time, and the complexity of the remaining cases fall somewhere in between. Therefore, taking the average of the best and worst case timings is probably a conservative estimate of the degree by which the NEON code improves performance. vc1dsp.vc1_h_loop_filter4_bestcase_c: 19.0 vc1dsp.vc1_h_loop_filter4_bestcase_neon: 48.5 vc1dsp.vc1_h_loop_filter4_worstcase_c: 144.7 vc1dsp.vc1_h_loop_filter4_worstcase_neon: 76.2 vc1dsp.vc1_h_loop_filter8_bestcase_c: 41.0 vc1dsp.vc1_h_loop_filter8_bestcase_neon: 75.0 vc1dsp.vc1_h_loop_filter8_worstcase_c: 294.0 vc1dsp.vc1_h_loop_filter8_worstcase_neon: 102.7 vc1dsp.vc1_h_loop_filter16_bestcase_c: 54.7 vc1dsp.vc1_h_loop_filter16_bestcase_neon: 130.0 vc1dsp.vc1_h_loop_filter16_worstcase_c: 569.7 vc1dsp.vc1_h_loop_filter16_worstcase_neon: 186.7 vc1dsp.vc1_v_loop_filter4_bestcase_c: 20.2 vc1dsp.vc1_v_loop_filter4_bestcase_neon: 47.2 vc1dsp.vc1_v_loop_filter4_worstcase_c: 164.2 vc1dsp.vc1_v_loop_filter4_worstcase_neon: 68.5 vc1dsp.vc1_v_loop_filter8_bestcase_c: 43.5 vc1dsp.vc1_v_loop_filter8_bestcase_neon: 55.2 vc1dsp.vc1_v_loop_filter8_worstcase_c: 316.2 vc1dsp.vc1_v_loop_filter8_worstcase_neon: 72.7 vc1dsp.vc1_v_loop_filter16_bestcase_c: 62.2 vc1dsp.vc1_v_loop_filter16_bestcase_neon: 103.7 vc1dsp.vc1_v_loop_filter16_worstcase_c: 646.5 vc1dsp.vc1_v_loop_filter16_worstcase_neon: 110.7 Signed-off-by: Ben Avison <bavison@riscosopen.org> Signed-off-by: Martin Storsjö <martin@martin.st>	2022-04-01 10:03:33 +03:00
Martin Storsjö	a78f136f3f	configure: Use a separate config_components.h header for $ALL_COMPONENTS This avoids unnecessary rebuilds of most source files if only the list of enabled components has changed, but not the other properties of the build, set in config.h. Signed-off-by: Martin Storsjö <martin@martin.st>	2022-03-16 14:12:49 +02:00
J. Dekker	7fc6015de9	Revert "arm: hevc_qpel: Fix the assembly to work with non-multiple of 8 widths" This reverts commit `2589060b92` which was originally to fix the FATE test. The real cause of the test breakage was fixed in `22b7c37275`. Signed-off-by: J. Dekker <jdek@itanimul.li>	2022-01-04 14:31:48 +01:00
J. Dekker	22b7c37275	lavc/arm: dont assign hevc_qpel functions for non-multiple of 8 widths The assembly is written assuming that the width is a multiple of 8. However the real issue is the functions were errorneously assigned to the 2, 4, 6 & 12 widths. This behaviour never broke the decoder as samples which trigger the functions for these widths have not been found in the wild. This relies on the mappings in ff_hevc_pel_weight[]. Signed-off-by: J. Dekker <jdek@itanimul.li>	2022-01-04 14:31:32 +01:00
Martin Storsjö	2d5a7f6d00	arm/aarch64: Improve scheduling in the avg form of h264_qpel Don't use the loaded registers directly, avoiding stalls on in order cores. Use vrhadd.u8 with q registers where easily possible. Signed-off-by: Martin Storsjö <martin@martin.st>	2021-10-18 14:27:36 +03:00
Martin Storsjö	2589060b92	arm: hevc_qpel: Fix the assembly to work with non-multiple of 8 widths This unbreaks the fate-checkasm-hevc_pel test on arm targets. The assembly assumed that the width passed to the DSP functions is a multiple of 8, while the checkasm test used other widths too. This wasn't noticed before, because the hevc_pel checkasm tests (that were added in `9c513edb79` in January) weren't run as part of fate until in `b492cacffd` in August. As this hasn't been an issue in practice with actual full decoding tests, it seems like the actual decoder doesn't call these functions with such widths. Therefore, we could alternatively fix the test to only test things that the real decoder does, and this modification could be reverted. Signed-off-by: Martin Storsjö <martin@martin.st>	2021-08-25 23:24:49 +03:00
Andreas Rheinhardt	afc95a10ac	avcodec/h264dsp, h264idct: Fix lengths of array parameters Fixes many -Warray-parameter warnings from GCC 11. Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2021-08-08 17:44:57 +02:00
Andreas Rheinhardt	7c1f347b18	avcodec: Remove deprecated old encode/decode APIs Deprecated in commits `7fc329e2dd` and `31f6a4b4b8`. Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com> Signed-off-by: James Almer <jamrial@gmail.com>	2021-04-27 10:43:12 -03:00
Andreas Rheinhardt	f3c197b129	Include attributes.h directly Some files currently rely on libavutil/cpu.h to include it for them; yet said file won't use include it any more after the currently deprecated functions are removed, so include attributes.h directly. Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2021-04-19 14:34:10 +02:00
James Almer	f1a894f9d3	avcodec: add missing FF_API_OLD_ENCDEC wrappers to xmm clobber functions Signed-off-by: James Almer <jamrial@gmail.com>	2021-02-26 19:26:31 -03:00
Lynne	151b41c8cc	fft: remove 16-bit FFT and MDCT code No longer used by anything. Unfortunately the old FFT_FLOAT/FFT_FIXED_32 is left as-is. It's simply too much work for code meant to be all removed anyway.	2021-01-14 01:44:21 +01:00
Lynne	9e05421dbe	ac3enc_fixed: drop unnecessary fixed-point DSP code	2021-01-14 01:44:20 +01:00
Anton Khirnov	e15371061d	lavu/mem: move the DECLARE_ALIGNED macro family to mem_internal on next+1 bump They are not properly namespaced and not intended for public use.	2021-01-01 14:14:57 +01:00
Anton Khirnov	c8c2dfbc37	lavu: move LOCAL_ALIGNED from internal.h to mem_internal.h That is a more appropriate place for it.	2021-01-01 14:11:01 +01:00
Martin Storsjö	b252178321	libavcodec: arm: Add a NEON implementation of pixblockdsp Cortex A7 A8 A9 A53 A72 get_pixels_c: 144.7 146.0 143.0 137.7 69.0 get_pixels_armv6: 112.0 106.7 90.2 95.0 72.5 get_pixels_neon: 69.0 29.7 68.7 40.2 19.0 get_pixels_unaligned_c: 144.7 146.2 143.0 137.7 69.0 get_pixels_unaligned_neon: 77.0 36.5 72.5 48.5 19.0 diff_pixels_c: 376.7 319.7 265.5 307.7 148.0 diff_pixels_armv6: 179.0 159.5 205.5 139.0 142.0 diff_pixels_neon: 69.0 40.2 77.5 53.2 26.0 diff_pixels_unaligned_c: 376.7 319.7 265.5 307.7 148.0 diff_pixels_unaligned_neon: 85.0 54.5 93.5 66.7 26.0 Signed-off-by: Martin Storsjö <martin@martin.st>	2020-05-15 23:37:43 +03:00
qoroliang	cacdac819f	lavc/hevcdec: fix the HEVC decoder crash when memory over-read Fix an occasional crash for hevc decoder in ARM 32 platform, the root cause is the memory over read(read cross the memory boundary) in SAO NENO functions ff_hevc_sao_band_filter_neon_8 and ff_hevc_sao_edge_filter_neon_8. After this fix, the crash disapper in the massive Android phone test. Signed-off-by: qoroliang <qoroliang@tencent.com>	2020-04-20 10:28:04 +08:00
Aman Gupta	0e49560806	avcodec/arm/mlpdsp: add missing dependency for truehd Signed-off-by: Aman Gupta <aman@tmm1.net>	2019-11-11 11:29:55 -08:00
James Almer	47e12966b7	Merge commit '0676de935b1e81bc5b5698fef3e7d48ff2ea77ff' * commit '0676de935b1e81bc5b5698fef3e7d48ff2ea77ff': arm: Implement a NEON version of 422 h264_h_loop_filter_chroma Merged-by: James Almer <jamrial@gmail.com>	2019-03-22 16:06:04 -03:00
Martin Storsjö	0676de935b	arm: Implement a NEON version of 422 h264_h_loop_filter_chroma Previously, the 420 version was used even for 422. This fixes occasional checkasm failures. Signed-off-by: Martin Storsjö <martin@martin.st>	2019-03-21 22:03:46 +02:00
James Almer	d6b62ce1ac	Merge commit 'cef914e08310166112ac09567e66452a7679bfc8' * commit 'cef914e08310166112ac09567e66452a7679bfc8': arm: vp8: Optimize put_epel16_h6v6 with vp8_epel8_v6_y2 Merged-by: James Almer <jamrial@gmail.com>	2019-03-14 16:19:41 -03:00
James Almer	7b9ca44cbc	arm/h264dsp: change loop filter stride argument to ptrdiff_t This was missed in `d5d699ab6e` Signed-off-by: James Almer <jamrial@gmail.com>	2019-02-20 19:38:55 -03:00
Martin Storsjö	cef914e083	arm: vp8: Optimize put_epel16_h6v6 with vp8_epel8_v6_y2 This makes it similar to put_epel16_v6, and gives a 10-25% speedup of this function. Before: Cortex A7 A8 A9 A53 A72 vp8_put_epel16_h6v6_neon: 3058.0 2218.5 2459.8 2183.0 1572.2 After: vp8_put_epel16_h6v6_neon: 2670.8 1934.2 2244.4 1729.4 1503.9 Signed-off-by: Martin Storsjö <martin@martin.st>	2019-02-19 11:46:18 +02:00
Meng Wang	3b2fd96048	avcodec/arm/hevcdsp_sao : add NEON optimization for sao Signed-off-by: Meng Wang <wangmeng.kids@bytedance.com> Reviewed-by: Shengbin Meng <shengbinmeng@gmail.com> Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	2018-04-09 03:45:15 +02:00
Martin Storsjö	5f83935de4	arm: hevcdsp: Add commas between macro arguments When targeting darwin, clang requires commas between arguments, while the no-comma form is allowed for other targets. Since Xcode 9.3, the bundled clang supports altmacro and doesn't require using gas-preprocessor any longer. Signed-off-by: Martin Storsjö <martin@martin.st>	2018-03-31 21:59:01 +03:00
Martin Storsjö	6660bc034d	arm: hevcdsp: Avoid using macro expansion counters Clang supports the macro expansion counter (used for making unique labels within macro expansions), but not when targeting darwin. Convert uses of the counter into normal local labels, as used elsewhere. Since Xcode 9.3, the bundled clang supports altmacro and doesn't require using gas-preprocessor any longer. Signed-off-by: Martin Storsjö <martin@martin.st>	2018-03-31 21:55:32 +03:00
James Almer	a7109b82c4	Merge commit 'ab05d3934de8e932dbd77979a687e6598e67535c' * commit 'ab05d3934de8e932dbd77979a687e6598e67535c': arm: vc1dsp: Add commas between macro arguments Merged-by: James Almer <jamrial@gmail.com>	2018-03-30 15:47:31 -03:00
Martin Storsjö	ab05d3934d	arm: vc1dsp: Add commas between macro arguments When targeting darwin, clang requires commas between arguments, while the no-comma form is allowed for other targets. Since Xcode 9.3, the bundled clang supports altmacro and doesn't require using gas-preprocessor any longer. Signed-off-by: Martin Storsjö <martin@martin.st>	2018-03-30 15:47:24 +03:00
Aurelien Jacobs	f677718bc8	sbcenc: add armv6 and neon asm optimizations This was originally based on libsbc, and was fully integrated into ffmpeg.	2018-03-07 22:26:53 +01:00
Michael Niedermayer	7dbbb75ee3	avcodec/arm/sbrdsp_neon: Use a free register instead of putting 2 things in one Fixes high pitched shriek Fixes: 25420848_1478428308873746_4255813235963330560_n.mp4 Reported-by: Dale Curtis <dalecurtis@google.com> Reviewed-by: Dale Curtis <dalecurtis@chromium.org> Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	2018-01-12 22:45:02 +01:00
James Almer	36de24d5b7	arm/hevc_idct: fix compilation on Android Compilation error "out of range" fixed for armeabi-v7a. Compilation failed trying to build libvlc.aar for ARM7 android on ubuntu 16.04 host. Error messages is "Offset out of range". The reason of the error is assembler LDR directives in function "ff_hevc_transform_luma_4x4_neon_8" need local storage in range <1k, but no such storage provided. Based on a patch by Ihor Bobalo <bob@eleks.com> Suggested-by: wbs Signed-off-by: James Almer <jamrial@gmail.com>	2017-12-09 21:46:34 +02:00
Alexandra Hájková	7993ec19af	hevc: Add hevc_get_pixel_4/8/12/16/24/32/48/64 Checkasm timings: block size bitdepth C NEON 4 8 bit: 146.7 48.7 10 bit: 146.7 52.7 8 8 bit: 430.3 84.4 10 bit: 430.4 119.5 12 8 bit: 812.8 141.0 10 bit: 812.8 195.0 16 8 bit: 1499.1 268.0 10 bit: 1498.9 368.4 24 8 bit: 4394.2 574.8 10 bit: 3696.3 804.8 32 8 bit: 5108.6 568.9 10 bit: 4249.6 918.8 48 8 bit: 16819.6 2304.9 10 bit: 13882.0 3178.5 64 8 bit: 13490.8 1799.5 10 bit: 11018.5 2519.4 Signed-off-by: Martin Storsjö <martin@martin.st>	2017-12-08 23:41:01 +02:00

956 Commits