FFmpeg

mirror of https://github.com/FFmpeg/FFmpeg.git synced 2024-12-12 19:18:44 +02:00

Author	SHA1	Message	Date
Lynne	783270bfd1	checkasm: add h264chroma tests Checks all variants of put_h264_chroma and avg_h264_chroma.	2023-05-20 20:07:21 +02:00
J. Dekker	68c151cb1b	checkasm: add hevc_deblock chroma test Signed-off-by: J. Dekker <jdek@itanimul.li>	2023-04-06 06:16:57 +02:00
James Darnley	087faf8cac	checkasm: add test for bwdif	2023-03-25 02:38:17 +01:00
bwang30	3ab11dc5bb	libavfilter/x86/vf_convolution: add sobel filter optimization and unit test with intel AVX512 VNNI This commit enables assembly code using Intel AVX512 VNNI and adds a unit test for the sobel filter. sobel_c: 4537 sobel_avx512icl: 2136 Signed-off-by: bwang30 <bin.wang@intel.com> Signed-off-by: Haihao Xiang <haihao.xiang@intel.com>	2022-11-14 10:04:16 +08:00
James Darnley	1936c06f02	checkasm: add a verbose check function for uint32_t data	2022-11-04 19:37:46 +01:00
Rémi Denis-Courmont	c962c78901	checkasm: RISC-V 64-bit assembler test harness	2022-10-10 02:23:18 +02:00
Lynne	3ade6a8644	x86/lpc: implement a new Welch windowing function Old one was written with the assumption only even inputs would be given. This very messy replacement supports even and odd inputs, and supports AVX2 for extra speed. The buffers given are usually quite big (4k samples), so the speedup is worth it. The new SSE version is still faster than the old inline asm version by 33%. Also checkasm is provided to make sure this monstrosity works. This fixes some FATE tests.	2022-09-21 07:12:39 +02:00
James Almer	8f119b501e	tests/checkasm: add a test for VorbisDSPContext Signed-off-by: James Almer <jamrial@gmail.com>	2022-09-19 21:28:23 -03:00
Martin Storsjö	5cdf4c0bed	checkasm: Silence warnings about unused return value from read() This codepath is enabled by default on arm, if the linux perf API is available, unless disabled with --disable-linux-perf. Signed-off-by: Martin Storsjö <martin@martin.st>	2022-08-08 23:39:13 +03:00
Swinney, Jonathan	c471cc7474	lavc/aarch64: motion estimation functions in neon - ff_pix_abs16_neon - ff_pix_abs16_xy2_neon In direct micro benchmarks of these ff functions versus their C implementations, these functions performed as follows on AWS Graviton 3. ff_pix_abs16_neon: pix_abs_0_0_c: 141.1 pix_abs_0_0_neon: 19.6 ff_pix_abs16_xy2_neon: pix_abs_0_3_c: 269.1 pix_abs_0_3_neon: 39.3 Tested with: ./tests/checkasm/checkasm --test=motion --bench --disable-linux-perf Signed-off-by: Jonathan Swinney <jswinney@amazon.com> Signed-off-by: Martin Storsjö <martin@martin.st>	2022-06-28 00:51:39 +03:00
Ben Avison	bd3615a81a	checkasm: Add idctdsp add/put-pixels-clamped tests Signed-off-by: Ben Avison <bavison@riscosopen.org> Signed-off-by: Martin Storsjö <martin@martin.st>	2022-04-01 10:03:33 +03:00
Ben Avison	20cb43ea8b	checkasm: Add vc1dsp in-loop deblocking filter tests Note that the benchmarking results for these functions are highly dependent upon the input data. Therefore, each function is benchmarked twice, corresponding to the best and worst case complexity of the reference C implementation. The performance of a real stream decode will fall somewhere between these two extremes. Signed-off-by: Ben Avison <bavison@riscosopen.org> Signed-off-by: Martin Storsjö <martin@martin.st>	2022-04-01 10:03:33 +03:00
Mark Reid	9e445a5be2	swscale/x86/output.asm: add x86-optimized planar gbr yuv2anyX functions changes since v2: * fixed label changes since v1: * remove vex instruction on sse4 path * some load/pack macros use fewer instructions * fixed some typos yuv2gbrp_full_X_4_512_c: 12757.6 yuv2gbrp_full_X_4_512_sse2: 8946.6 yuv2gbrp_full_X_4_512_sse4: 5138.6 yuv2gbrp_full_X_4_512_avx2: 3889.6 yuv2gbrap_full_X_4_512_c: 15368.6 yuv2gbrap_full_X_4_512_sse2: 11916.1 yuv2gbrap_full_X_4_512_sse4: 6294.6 yuv2gbrap_full_X_4_512_avx2: 3477.1 yuv2gbrp9be_full_X_4_512_c: 14381.6 yuv2gbrp9be_full_X_4_512_sse2: 9139.1 yuv2gbrp9be_full_X_4_512_sse4: 5150.1 yuv2gbrp9be_full_X_4_512_avx2: 2834.6 yuv2gbrp9le_full_X_4_512_c: 12990.1 yuv2gbrp9le_full_X_4_512_sse2: 9118.1 yuv2gbrp9le_full_X_4_512_sse4: 5132.1 yuv2gbrp9le_full_X_4_512_avx2: 2833.1 yuv2gbrp10be_full_X_4_512_c: 14401.6 yuv2gbrp10be_full_X_4_512_sse2: 9133.1 yuv2gbrp10be_full_X_4_512_sse4: 5126.1 yuv2gbrp10be_full_X_4_512_avx2: 2837.6 yuv2gbrp10le_full_X_4_512_c: 12718.1 yuv2gbrp10le_full_X_4_512_sse2: 9106.1 yuv2gbrp10le_full_X_4_512_sse4: 5120.1 yuv2gbrp10le_full_X_4_512_avx2: 2826.1 yuv2gbrap10be_full_X_4_512_c: 18535.6 yuv2gbrap10be_full_X_4_512_sse2: 33617.6 yuv2gbrap10be_full_X_4_512_sse4: 6264.1 yuv2gbrap10be_full_X_4_512_avx2: 3422.1 yuv2gbrap10le_full_X_4_512_c: 16724.1 yuv2gbrap10le_full_X_4_512_sse2: 11787.1 yuv2gbrap10le_full_X_4_512_sse4: 6282.1 yuv2gbrap10le_full_X_4_512_avx2: 3441.6 yuv2gbrp12be_full_X_4_512_c: 13723.6 yuv2gbrp12be_full_X_4_512_sse2: 9128.1 yuv2gbrp12be_full_X_4_512_sse4: 7997.6 yuv2gbrp12be_full_X_4_512_avx2: 2844.1 yuv2gbrp12le_full_X_4_512_c: 12257.1 yuv2gbrp12le_full_X_4_512_sse2: 9107.6 yuv2gbrp12le_full_X_4_512_sse4: 5142.6 yuv2gbrp12le_full_X_4_512_avx2: 2837.6 yuv2gbrap12be_full_X_4_512_c: 18511.1 yuv2gbrap12be_full_X_4_512_sse2: 12156.6 yuv2gbrap12be_full_X_4_512_sse4: 6251.1 yuv2gbrap12be_full_X_4_512_avx2: 3444.6 yuv2gbrap12le_full_X_4_512_c: 16687.1 yuv2gbrap12le_full_X_4_512_sse2: 11785.1 yuv2gbrap12le_full_X_4_512_sse4: 6243.6 yuv2gbrap12le_full_X_4_512_avx2: 3446.1 yuv2gbrp14be_full_X_4_512_c: 13690.6 yuv2gbrp14be_full_X_4_512_sse2: 9120.6 yuv2gbrp14be_full_X_4_512_sse4: 5138.1 yuv2gbrp14be_full_X_4_512_avx2: 2843.1 yuv2gbrp14le_full_X_4_512_c: 14995.6 yuv2gbrp14le_full_X_4_512_sse2: 9119.1 yuv2gbrp14le_full_X_4_512_sse4: 5126.1 yuv2gbrp14le_full_X_4_512_avx2: 2843.1 yuv2gbrp16be_full_X_4_512_c: 12367.1 yuv2gbrp16be_full_X_4_512_sse2: 8233.6 yuv2gbrp16be_full_X_4_512_sse4: 4820.1 yuv2gbrp16be_full_X_4_512_avx2: 2666.6 yuv2gbrp16le_full_X_4_512_c: 10904.1 yuv2gbrp16le_full_X_4_512_sse2: 8214.1 yuv2gbrp16le_full_X_4_512_sse4: 4824.1 yuv2gbrp16le_full_X_4_512_avx2: 2629.1 yuv2gbrap16be_full_X_4_512_c: 26569.6 yuv2gbrap16be_full_X_4_512_sse2: 10884.1 yuv2gbrap16be_full_X_4_512_sse4: 5488.1 yuv2gbrap16be_full_X_4_512_avx2: 3272.1 yuv2gbrap16le_full_X_4_512_c: 14010.1 yuv2gbrap16le_full_X_4_512_sse2: 10562.1 yuv2gbrap16le_full_X_4_512_sse4: 5463.6 yuv2gbrap16le_full_X_4_512_avx2: 3255.1 yuv2gbrpf32be_full_X_4_512_c: 14524.1 yuv2gbrpf32be_full_X_4_512_sse2: 8552.6 yuv2gbrpf32be_full_X_4_512_sse4: 4636.1 yuv2gbrpf32be_full_X_4_512_avx2: 2474.6 yuv2gbrpf32le_full_X_4_512_c: 13060.6 yuv2gbrpf32le_full_X_4_512_sse2: 9682.6 yuv2gbrpf32le_full_X_4_512_sse4: 4298.1 yuv2gbrpf32le_full_X_4_512_avx2: 2453.1 yuv2gbrapf32be_full_X_4_512_c: 18629.6 yuv2gbrapf32be_full_X_4_512_sse2: 11363.1 yuv2gbrapf32be_full_X_4_512_sse4: 15201.6 yuv2gbrapf32be_full_X_4_512_avx2: 3727.1 yuv2gbrapf32le_full_X_4_512_c: 16677.6 yuv2gbrapf32le_full_X_4_512_sse2: 10221.6 yuv2gbrapf32le_full_X_4_512_sse4: 5693.6 yuv2gbrapf32le_full_X_4_512_avx2: 3656.6 Reviewed-by: Paul B Mahol <onemda@gmail.com> Signed-off-by: James Almer <jamrial@gmail.com>	2022-01-11 16:33:17 -03:00
J. Dekker	b492cacffd	checkasm: collapse hevc pel tests Also add to `make fate-checkasm' target. Signed-off-by: J. Dekker <jdek@itanimul.li>	2021-08-24 22:12:06 +02:00
J. Dekker	9a727235fd	lavu/checkasm: add (private) kperf timing for macOS Signed-off-by: J. Dekker <jdek@itanimul.li>	2021-07-20 19:40:03 +02:00
Lynne	1978b143eb	checkasm: add av_tx FFT SIMD testing code This sadly required making changes to the code itself, due to the same context needing to be reused for both versions. The lookup table had to be duplicated for both versions.	2021-04-24 17:19:17 +02:00
Josh Dekker	9c513edb79	checkasm: add hevc_pel tests Co-authored-by: Niklas Haas <git@haasn.xyz> Signed-off-by: Josh Dekker <josh@itanimul.li>	2021-01-25 09:24:11 +01:00
Martin Storsjö	ed7d73355e	checkasm: aarch64: Check for stack overflows Also fill x8-x17 with garbage before calling the function. Figure out the number of stack parameters and make sure that the value on the stack after those is untouched. Signed-off-by: Martin Storsjö <martin@martin.st>	2020-05-15 21:22:36 +03:00
Martin Storsjö	6cb2d4d94b	checkasm: arm: Check for stack overflows Figure out the number of stack parameters and make sure that the value on the stack after those is untouched. Signed-off-by: Martin Storsjö <martin@martin.st>	2020-05-15 21:22:34 +03:00
Josh de Kock	5913cd4e6c	checkasm: add hscale test This tests the hscale 8bpp to 14/18bpp functions with different filter sizes. Signed-off-by: Josh de Kock <josh@itanimul.li>	2020-05-15 10:29:30 +01:00
Martin Storsjö	3ce1b2bf8d	checkasm: add function to check and diff memory This was ported from dav1d (c950e7101bdf5f7117bfca816984a21e550509f0). Signed-off-by: Josh de Kock <josh@itanimul.li>	2020-05-15 10:29:30 +01:00
Ting Fu	9691e2a426	checkasm/vf_eq: add test for vf_eq Signed-off-by: Ting Fu <ting.fu@intel.com> Signed-off-by: Ruiling Song <ruiling.song@intel.com>	2019-09-26 08:10:31 +08:00
Lynne	4ce1e13b54	checkasm: add opusdsp tests	2019-09-11 03:28:22 +01:00
Ruiling Song	8f4963ad25	checkasm/vf_gblur: add test for horiz_slice simd Signed-off-by: Ruiling Song <ruiling.song@intel.com>	2019-06-12 08:54:05 +08:00
James Darnley	76c370af64	checkasm: add test for v210dec	2019-05-02 19:21:37 +02:00
James Almer	ba89dc27b5	checkasm: add an af_afir test Reviewed-by: Paul B Mahol <onemda@gmail.com> Signed-off-by: James Almer <jamrial@gmail.com>	2019-01-03 10:12:18 -03:00
Clément Bœsch	f679711c1b	checkasm: add vf_nlmeans test for ssd_integral_image	2018-05-08 10:28:06 +02:00
Martin Vignali	a9a7ed4f27	checkasm/swscale : add test for rgb shuffle_bytes func	2018-03-24 20:22:12 +01:00
Yingming Fan	80798e3857	checkasm/hevc_sao : add hevc_sao for checkasm Signed-off-by: James Almer <jamrial@gmail.com>	2018-03-07 23:53:32 -03:00
Martin Vignali	78b982d3b9	checkasm : add test for losslessvideoencdsp for diff bytes and sub_left_pred	2018-01-28 20:23:16 +01:00
James Almer	da03242778	Revert "checkasm/vf_interlace : add test for lowpass_line 8 and 16" This reverts commit `adff97be5e`. It currently fails on Windows targets. Signed-off-by: James Almer <jamrial@gmail.com>	2017-12-19 19:07:24 -03:00
Martin Vignali	adff97be5e	checkasm/vf_interlace : add test for lowpass_line 8 and 16	2017-12-19 20:59:51 +01:00
Martin Vignali	cefb7e0060	checkasm/vf_hflip : add test for vf_hflip byte and short simd	2017-12-13 11:34:29 +01:00
Martin Vignali	cfce442750	checkasm/vf_threshold : add checkasm test for threshold8	2017-12-03 19:17:15 +01:00
Martin Vignali	4a6aa6d1b2	checkasm : add test for huffyuvdsp add_int16	2017-11-21 09:41:42 +01:00
Martin Vignali	6a7eb65e1b	checkasm : add utvideodsp test	2017-11-21 09:00:27 +01:00
James Almer	7323c896b2	checkasm: add an exrdsp test Signed-off-by: James Almer <jamrial@gmail.com>	2017-09-17 19:01:40 -03:00
Clément Bœsch	e0d56f097f	checkasm: use perf API on Linux ARM* On ARM platforms, accessing the PMU registers requires special user access permissions. Since there is no other way to get accurate timers, the current implementation of timers in FFmpeg relies on these registers. Unfortunately, enabling user access to these registers on Linux is not trivial, and generally involves compiling a random and unreliable github kernel module, or somehow patching your kernel. Such a module is very unlikely to reach upstream anytime soon. Quoting Robin Murphy from ARM: > Say you do give userspace direct access to the PMU; now run two or more > programs at once that believe they can use the counters for their own > "minimal-overhead" profiling. Have fun interpreting those results... > > And that's not even getting into the implications of scheduling across > different CPUs, CPUidle, etc. where the PMU state is completely beyond > userspace's control. In general, the plan to provide userspace with > something which might happen to just about work in a few corner cases, > but is meaningless, misleading or downright broken in all others, is to > never do so. As a result, the alternative is to use the Performance Monitoring Linux API which makes use of these registers internally (assuming the PMU of your ARM board is supported in the kernel, which is definitely not a given...). While the Linux API is obviously cross platform, it does have a significant overhead which needs to be taken into account. As a result, that mode is only weakly enabled on ARM platforms exclusively. Note on the inflexibility of the implementation: the timers (native FFmpeg vs Linux API) are selected at compilation time to avoid the need for function calls, which would have a negative impact on the cycle counters.	2017-09-08 18:51:05 +02:00
James Almer	823cc7e25f	checkasm: add a g722dsp test Signed-off-by: James Almer <jamrial@gmail.com>	2017-07-13 17:00:19 -03:00
Matthieu Bouron	7864e07f4a	checkasm: add sbrdsp tests	2017-07-03 14:28:17 +02:00
Clément Bœsch	edd041e64c	checkasm: add AAC PS tests This includes various fixes and improvements from James Almer. Signed-off-by: James Almer <jamrial@gmail.com>	2017-06-28 12:22:39 +02:00
Diego Biurrun	fd502f4f5f	build: Generalize yasm/nasm-related variable names None of them are specific to the YASM assembler. (Cherry-picked from libav commit `39e208f4d4`) Signed-off-by: James Almer <jamrial@gmail.com>	2017-06-21 17:00:29 -03:00
James Almer	5b10f484e2	checkasm: add float_dsp tests Ported from libavutil/tests/float_dsp.c Signed-off-by: James Almer <jamrial@gmail.com>	2017-06-14 19:20:10 -03:00
James Almer	37388b119c	checkasm: add a checkasm_checked_call function that doesn't issue emms Meant for DSP functions returning a float or double, as they'd fail if emms is called after every run on x86_32. Signed-off-by: James Almer <jamrial@gmail.com>	2017-06-14 19:18:56 -03:00
James Almer	7b3cb953f7	checkasm: add fixed_dsp tests Tested-by: Michael Niedermayer <michael@niedermayer.cc> Signed-off-by: James Almer <jamrial@gmail.com>	2017-04-11 18:05:13 -03:00
Clément Bœsch	3d4039f964	Merge commit 'ed48a9d8143d2575a4458589cebde69ec326afd8' * commit 'ed48a9d8143d2575a4458589cebde69ec326afd8': checkasm: Add a test for HEVC add_residual Merged-by: Clément Bœsch <u@pkh.me>	2017-03-24 12:37:09 +01:00
James Almer	67b639b496	Merge commit 'c91d6a33f872574c95c8784277cf60ffcf6bff4f' * commit 'c91d6a33f872574c95c8784277cf60ffcf6bff4f': checkasm: aarch64: Add filler args to make sure all parameters are passed on the stack Merged-by: James Almer <jamrial@gmail.com>	2017-03-23 17:38:20 -03:00
James Almer	a2d34cc51b	Merge commit 'f1b3e131385176c3c9d9783b25047856a0dcebf6' * commit 'f1b3e131385176c3c9d9783b25047856a0dcebf6': checkasm: aarch64: Clobber the stack before calling functions Merged-by: James Almer <jamrial@gmail.com>	2017-03-23 17:36:53 -03:00
Clément Bœsch	7c2a7f9c11	Merge commit '22c3ab18646924ce24dc6017a9e882ff69689e40' * commit '22c3ab18646924ce24dc6017a9e882ff69689e40': checkasm: Add test for huffyuvdsp add_bytes huffyuvdsp is renamed to llviddsp to be consistent with our codebase. Note: `af607b7e07` wasn't actually required for this test since this commit is not actually testing huffyuvdsp. Merged-by: Clément Bœsch <u@pkh.me>	2017-03-22 16:31:38 +01:00
Clément Bœsch	8414755486	Merge commit 'e9ef6171396dc4106526aaa86b620c61ca3d1017' * commit 'e9ef6171396dc4106526aaa86b620c61ca3d1017': checkasm: add tests for audiodsp Merged-by: Clément Bœsch <u@pkh.me>	2017-03-20 19:10:56 +01:00

107 Commits