FFmpeg

mirror of https://github.com/FFmpeg/FFmpeg.git synced 2024-12-18 03:19:31 +02:00

Author	SHA1	Message	Date
Martin Storsjö	e00db9f78b	checkasm: hevc: Add a hevc_ prefix to the add_residual functions This makes it easier to group them with the rest when running e.g. --bench=hevc. Signed-off-by: Martin Storsjö <martin@martin.st>	2017-04-21 13:32:44 +03:00
Diego Biurrun	dcc39ee10e	lavc: Remove deprecated XvMC support hacks Deprecated in 11/2013.	2017-03-23 10:09:14 +01:00
Diego Biurrun	39e208f4d4	build: Generalize yasm/nasm-related variable names None of them are specific to the YASM assembler.	2017-03-01 10:18:15 +01:00
Diego Biurrun	7cb1d9e2db	build: Fine-grained link-time dependency settings Previously, all link-time dependencies were added for all libraries, resulting in bogus link-time dependencies since not all dependencies are shared across libraries. Also, in some cases like libavutil, not all dependencies were taken into account, resulting in some cases of underlinking. To address all this mess a machinery is added for tracking which dependency belongs to which library component and then leveraged to determine correct dependencies for all individual libraries.	2017-03-01 09:00:40 +01:00
Diego Biurrun	3794062ab1	Remove Plan 9 support Supporting the system was a nice joke for the 9 release, but it has run its course. Nowadays Plan 9 receives no testing and has no practical usefulness.	2016-12-03 09:15:01 +01:00
Martin Storsjö	9c8bc74c2b	arm: vp9itxfm: Skip empty slices in the first pass of idct_idct 16x16 and 32x32
This work is sponsored by, and copyright, Google.
Previously all subpartitions except the eob=1 (DC) case ran with the same runtime:
                                        Cortex A7       A8       A9      A53
vp9_inv_dct_dct_16x16_sub16_add_neon:      3188.1   2435.4   2499.0   1969.0
vp9_inv_dct_dct_32x32_sub32_add_neon:     18531.7  16582.3  14207.6  12000.3
By skipping individual 4x16 or 4x32 pixel slices in the first pass, we reduce the runtime of these functions like this:
vp9_inv_dct_dct_16x16_sub1_add_neon:        274.6    189.5    211.7    235.8
vp9_inv_dct_dct_16x16_sub2_add_neon:       2064.0   1534.8   1719.4   1248.7
vp9_inv_dct_dct_16x16_sub4_add_neon:       2135.0   1477.2   1736.3   1249.5
vp9_inv_dct_dct_16x16_sub8_add_neon:       2446.7   1828.7   1993.6   1494.7
vp9_inv_dct_dct_16x16_sub12_add_neon:      2832.4   2118.3   2266.5   1735.1
vp9_inv_dct_dct_16x16_sub16_add_neon:      3211.7   2475.3   2523.5   1983.1
vp9_inv_dct_dct_32x32_sub1_add_neon:        756.2    456.7    862.0    553.9
vp9_inv_dct_dct_32x32_sub2_add_neon:      10682.2   8190.4   8539.2   6762.5
vp9_inv_dct_dct_32x32_sub4_add_neon:      10813.5   8014.9   8518.3   6762.8
vp9_inv_dct_dct_32x32_sub8_add_neon:      11859.6   9313.0   9347.4   7514.5
vp9_inv_dct_dct_32x32_sub12_add_neon:     12946.6  10752.4  10192.2   8280.2
vp9_inv_dct_dct_32x32_sub16_add_neon:     14074.6  11946.5  11001.4   9008.6
vp9_inv_dct_dct_32x32_sub20_add_neon:     15269.9  13662.7  11816.1   9762.6
vp9_inv_dct_dct_32x32_sub24_add_neon:     16327.9  14940.1  12626.7  10516.0
vp9_inv_dct_dct_32x32_sub28_add_neon:     17462.7  15776.1  13446.2  11264.7
vp9_inv_dct_dct_32x32_sub32_add_neon:     18575.5  17157.0  14249.3  12015.1
I.e. in general a very minor overhead for the full subpartition case due to the additional loads and cmps, but a significant speedup for the cases when we only need to process a small part of the actual input data.
In common VP9 content in a few inspected clips, 70-90% of the non-dc-only 16x16 and 32x32 IDCTs only have nonzero coefficients in the upper left 8x8 or 16x16 subpartitions respectively.
Signed-off-by: Martin Storsjö <martin@martin.st>	2016-11-30 23:54:07 +02:00
Ronald S. Bultje	06fec74cac	checkasm: vp9dsp: benchmark all sub-IDCTs (but not WHT or ADST). Signed-off-by: Martin Storsjö <martin@martin.st>	2016-11-23 23:55:38 +02:00
Martin Storsjö	effc1430b2	Revert "checkasm: vp9dsp: Benchmark the dc-only version of idct_idct separately" This reverts commit `81d7f0bbca`. Instead of just benchmarking dc separately, test all relevant subparts (in the next commit). Signed-off-by: Martin Storsjö <martin@martin.st>	2016-11-23 23:55:26 +02:00
Martin Storsjö	81d7f0bbca	checkasm: vp9dsp: Benchmark the dc-only version of idct_idct separately The dc-only mode is already checked to work correctly above, but this allows benchmarking this mode for performance tuning, and allows making sure that it actually is correctly hooked up. Signed-off-by: Martin Storsjö <martin@martin.st>	2016-11-16 10:06:32 +02:00
Ronald S. Bultje	0b37cd09a6	checkasm: add vp9dsp.itxfm_add tests. This includes fixes by Henrik Gramner. The forward transforms are derived from the reference encoder. Signed-off-by: Martin Storsjö <martin@martin.st>	2016-11-11 11:09:05 +02:00
Diego Biurrun	9498237049	checkasm: Add --test parameter to check only specific components Inspired by a patch from Martin Storsjö <martin@martin.st>.	2016-11-08 17:32:25 +01:00
Martin Storsjö	2e55e26b40	vp9: Flip the order of arguments in MC functions This makes it match the pattern already used for VP8 MC functions. This also makes the signature match ffmpeg's version of these functions, easing porting of code in both directions. Signed-off-by: Martin Storsjö <martin@martin.st>	2016-11-03 09:12:02 +02:00
Alexandra Hájková	ed48a9d814	checkasm: Add a test for HEVC add_residual	2016-10-22 17:33:35 +02:00
Martin Storsjö	dd5d4a0e1e	checkasm: aarch64: Don't clobber x29 in checkasm_stack_clobber x29 (FP) is a callee saved register and should be restored on return. Instead of backing up x29 and restoring it here, back up sp in a register that we are allowed to overwrite. This fixes crashes in checkasm on aarch64 since `f1b3e13138`. For some reason, gcc builds didn't crash, but clang builds do. Signed-off-by: Martin Storsjö <martin@martin.st>	2016-10-18 16:17:12 +03:00
Diego Biurrun	2816f8a8bb	build: Drop arch-specific checkasm Makefiles They only contain one line and will never contain more.	2016-10-17 16:25:38 +02:00
Diego Biurrun	93d5b022a9	build: Drop duplicate asm recipe And move the asm recipe to the top-level Makefile next to the other local pattern rules for .o files.	2016-10-17 16:25:35 +02:00
Martin Storsjö	c91d6a33f8	checkasm: aarch64: Add filler args to make sure all parameters are passed on the stack This, combined with clobbering the stack space prior to the call, increases the chances of finding cases where 32 bit parameters are erroneously treated as 64 bit. Signed-off-by: Martin Storsjö <martin@martin.st>	2016-10-16 23:26:33 +03:00
Martin Storsjö	f1b3e13138	checkasm: aarch64: Clobber the stack before calling functions Signed-off-by: Martin Storsjö <martin@martin.st>	2016-10-16 23:26:22 +03:00
Martin Storsjö	a05cc56124	checkasm: arm/aarch64: Fix the amount of space reserved for stack parameters Even if MAX_ARGS - 2 (for arm) or MAX_ARGS - 7 (for aarch64) parameters are passed on the stack to checkasm_checked_call, we actually only need to store MAX_ARGS - 4 (for arm) or MAX_ARGS - 8 (for aarch64) parameters on the stack when calling the tested function. Signed-off-by: Martin Storsjö <martin@martin.st>	2016-10-16 23:26:15 +03:00
Alexandra Hájková	e3f941cb03	checkasm: add a test for HEVC IDCT Signed-off-by: Anton Khirnov <anton@khirnov.net>	2016-10-11 18:15:40 +02:00
Ronald S. Bultje	c935b54bd6	checkasm: add VP9 loopfilter tests. The randomize_buffer() implementation assures that "most of the time", we'll do a good mix of wide16/wide8/hev/regular/no filters for complete code coverage. However, this is not mathematically assured because that would make the code either much more complex, or much less random. Some fixes and improvements by Rodger Combs <rodger.combs@gmail.com> Signed-off-by: Anton Khirnov <anton@khirnov.net>	2016-10-04 10:54:07 +02:00
Alexandra Hájková	22c3ab1864	checkasm: Add test for huffyuvdsp add_bytes Signed-off-by: Luca Barbato <lu_zero@gentoo.org>	2016-10-02 17:13:26 +02:00
Diego Biurrun	ba479f3daa	hevc: Change type of array stride parameters to ptrdiff_t ptrdiff_t is the correct type for array strides and similar.	2016-09-29 17:54:23 +02:00
Anton Khirnov	683da86aab	audiodsp: reorder arguments for vector_clipf This will make the x86 asm simpler. ARM conversion by Martin Storsjö <martin@martin.st> and Janne Grunau <janne-libav@jannau.net>	2016-09-22 09:47:52 +02:00
Anton Khirnov	e9ef617139	checkasm: add tests for audiodsp	2016-09-22 09:47:52 +02:00
Anton Khirnov	2eb97af66a	checkasm: add a test for blockdsp	2016-09-22 09:47:52 +02:00
Luca Barbato	e89cef4050	checkasm: Read the unsigned value as it should Reading a value larger than int using atoi() may give the wrong result.	2016-09-11 14:12:18 +02:00
Diego Biurrun	87c6c78604	vp8: Change type of stride parameters to ptrdiff_t ptrdiff_t is the correct type for array strides and similar.	2016-08-26 11:36:53 +02:00
Ronald S. Bultje	e99ecda550	checkasm: add vp9 MC tests. Signed-off-by: Anton Khirnov <anton@khirnov.net>	2016-08-03 11:07:01 +02:00
Luca Barbato	40ad05bab2	checkasm: Cast unsigned to signed Avoid a warning for passing an unsigned value to abs(), some compilers might optimize away abs().	2016-07-23 08:27:32 +02:00
Alexandra Hájková	9064777dbb	checkasm: add HEVC test for testing IDCT DC Signed-off-by: Anton Khirnov <anton@khirnov.net>	2016-07-22 19:08:12 +02:00
Martin Storsjö	6f9e34baea	arm: Check for support for the .fpu directive When targeting COFF (windows), clang doesn't support this directive (while binutils supports it for all targets). Signed-off-by: Martin Storsjö <martin@martin.st>	2016-07-21 12:52:10 +03:00
Martin Storsjö	37961044c6	checkasm: arm: Ignore changes to bits 0-4 and 7 of FPSCR These bits are set by exceptions in NEON instructions. Also print the differing bits when FPSCR is clobbered, and use bic instead of lsl, for clearing the topmost bits. Signed-off-by: Martin Storsjö <martin@martin.st>	2016-07-17 21:48:17 +03:00
Janne Grunau	59aeed93e4	cheackasm/arm: remove NEON instructions from checkasm_checked_call_vfp Fixes AS error on non NEON builds introduced in `71a0472114`. Also set the fpu directly to vfp in checkasm.S to cause build errors on NEON builds.	2016-07-17 11:28:21 +02:00
Martin Storsjö	446353ea18	checkasm: arm: Don't start new const blocks for each string Each const block needs to be terminated by one endconst invocation so either call endconst after each, or just declare plain labels to the later strings. This fixes errors such as this, on some binutils versions: checkasm.S:38: Error: Macro `endconst' was already defined Signed-off-by: Martin Storsjö <martin@martin.st>	2016-07-17 12:21:19 +03:00
Janne Grunau	71a0472114	checkasm: arm: report the first clobbered register in checkasm_checked_call	2016-07-16 12:57:18 +02:00
Janne Grunau	7b1ae0e73a	checkasm/arm: preserve the stack alignment checkasm_checked_call The stack used by checkasm_checked_call_vfp was a multiple of 4 when the checked function is called. AAPCS requires a double word (8 byte) aligned stack public interfaces. Since both calls are public interfaces the stack is misaligned when the checked is called. Might fix the SIGBUS error in the armv7-linux-clang-3.7 fate config.	2016-07-13 22:18:53 +02:00
Janne Grunau	80fbb7beca	checkasm: vp8.mc: initialize the full src buffer after `ec32574209` Fixes "Use of uninitialised value" valgrind warnings in checkasm.	2016-07-13 22:18:52 +02:00
Janne Grunau	8c816c0c9b	checkasm/arm: align the clobber check data properly for ldrd Should fix the SIGBUS in the armv7-linux-clang-3.7 fate target.	2016-07-10 13:35:41 +02:00
Janne Grunau	ec32574209	checkasm: vp8: mc: test unequal width/height for partitions	2016-07-10 13:35:41 +02:00
Martin Storsjö	f8d17d5395	checkasm: Add tests for vp8dsp The tests are inspired by similar tests for vp9 by Ronald Bultje. Signed-off-by: Martin Storsjö <martin@martin.st>	2016-07-08 14:10:46 +03:00
Martin Storsjö	67cb2c0f73	checkasm: hevc: Iterate over features first, then over bitdepths This avoids listing the same feature multiple times in the test output. Previously the output contained something like this: SSE2: - hevc_mc.qpel [OK] - hevc_mc.epel [OK] - hevc_mc.unweighted_pred [OK] - hevc_mc.qpel [OK] - hevc_mc.epel [OK] - hevc_mc.unweighted_pred [OK] Signed-off-by: Martin Storsjö <martin@martin.st>	2016-06-29 21:12:05 +03:00
Martin Storsjö	e48746deec	checkasm: h264dsp: Move the x and y variables into the randomize_buffer macro This avoids the risk of accidentally clobbering such variables outside of the macro if the same variables are used there. Signed-off-by: Martin Storsjö <martin@martin.st>	2016-06-28 14:24:04 +03:00
Martin Storsjö	e57de6faa1	checkasm: h264dsp: Initialize the padding area This fixes valgrind warnings about conditional jumps based on uninitialized data (even though the uninitialized data only ever was compared with a direct copy of the same uninitialized data). Signed-off-by: Martin Storsjö <martin@martin.st>	2016-06-28 14:24:01 +03:00
Martin Storsjö	dc7501e524	checkasm: Issue emms after benchmarking functions The functions may not clean up properly after using MMX registers. For the normal testing calls, the checkasm_checked_call functions will do the cleanup (and check that functions that should clean up do it as well), but when benchmarking functions that don't clean up, we don't currently properly clean up at all. This causes issues if a benchmarked function is followed by testing of a function that is supposed to not clobber the MMX/FPU state but doesn't touch it at all. Signed-off-by: Martin Storsjö <martin@martin.st>	2016-06-21 22:09:29 +03:00
Martin Storsjö	105998fb5c	checkasm: Add tests for h264 idct The tests are inspired by similar tests for vp9 by Ronald Bultje. Signed-off-by: Martin Storsjö <martin@martin.st>	2016-06-17 21:37:56 +03:00
Diego Biurrun	257f00ec1a	Split global .gitignore file into per-directory files	2016-05-13 14:55:56 +02:00
Vittorio Giovara	41ed7ab45f	cosmetics: Fix spelling mistakes Signed-off-by: Diego Biurrun <diego@biurrun.de>	2016-05-04 18:16:21 +02:00
Diego Biurrun	01621202aa	build: miscellaneous cosmetics Restore alphabetical order in lists, break overly long lines, do some prettyprinting, add some explanatory section comments, group parts together that belong together logically.	2016-04-07 15:26:08 +02:00
Diego Biurrun	7c82d31cbe	checkasm: Use standard multiple inclusion guards	2016-02-18 15:35:44 +01:00
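
Commit e89cef4050 above notes that reading a value larger than int with atoi() may give the wrong result. A minimal sketch of the general alternative follows, assuming the value arrives as a decimal string. The helper name parse_unsigned is hypothetical and this is not the code from that commit; it only illustrates parsing with strtoul() and checking for errors.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical helper (not taken from checkasm): parse a decimal string
 * into an unsigned value. Returns 0 on success, -1 on empty input,
 * trailing characters, or a value that does not fit in unsigned. */
static int parse_unsigned(const char *s, unsigned *out)
{
    char *end;
    unsigned long v;

    errno = 0;
    v = strtoul(s, &end, 10);      /* unlike atoi(), reports overflow via errno */
    if (end == s || *end != '\0')  /* no digits consumed, or trailing garbage */
        return -1;
    if (errno == ERANGE || v > UINT_MAX)
        return -1;
    *out = (unsigned)v;
    return 0;
}
```

With a 32-bit int, a string such as "3000000000" is outside the range atoi() can represent, while the sketch above returns it unchanged as an unsigned value.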


83 Commits