FFmpeg

mirror of https://github.com/FFmpeg/FFmpeg.git synced 2025-03-28 12:32:17 +02:00

Author	SHA1	Message	Date
Martin Storsjö	388f6e6715	arm: vp9itxfm: Skip empty slices in the first pass of idct_idct 16x16 and 32x32
This work is sponsored by, and copyright, Google.
Previously all subpartitions except the eob=1 (DC) case ran with the same runtime:
Cortex                                       A7       A8       A9      A53
vp9_inv_dct_dct_16x16_sub16_add_neon:    3188.1   2435.4   2499.0   1969.0
vp9_inv_dct_dct_32x32_sub32_add_neon:   18531.7  16582.3  14207.6  12000.3
By skipping individual 4x16 or 4x32 pixel slices in the first pass, we reduce the runtime of these functions like this:
vp9_inv_dct_dct_16x16_sub1_add_neon:      274.6    189.5    211.7    235.8
vp9_inv_dct_dct_16x16_sub2_add_neon:     2064.0   1534.8   1719.4   1248.7
vp9_inv_dct_dct_16x16_sub4_add_neon:     2135.0   1477.2   1736.3   1249.5
vp9_inv_dct_dct_16x16_sub8_add_neon:     2446.7   1828.7   1993.6   1494.7
vp9_inv_dct_dct_16x16_sub12_add_neon:    2832.4   2118.3   2266.5   1735.1
vp9_inv_dct_dct_16x16_sub16_add_neon:    3211.7   2475.3   2523.5   1983.1
vp9_inv_dct_dct_32x32_sub1_add_neon:      756.2    456.7    862.0    553.9
vp9_inv_dct_dct_32x32_sub2_add_neon:    10682.2   8190.4   8539.2   6762.5
vp9_inv_dct_dct_32x32_sub4_add_neon:    10813.5   8014.9   8518.3   6762.8
vp9_inv_dct_dct_32x32_sub8_add_neon:    11859.6   9313.0   9347.4   7514.5
vp9_inv_dct_dct_32x32_sub12_add_neon:   12946.6  10752.4  10192.2   8280.2
vp9_inv_dct_dct_32x32_sub16_add_neon:   14074.6  11946.5  11001.4   9008.6
vp9_inv_dct_dct_32x32_sub20_add_neon:   15269.9  13662.7  11816.1   9762.6
vp9_inv_dct_dct_32x32_sub24_add_neon:   16327.9  14940.1  12626.7  10516.0
vp9_inv_dct_dct_32x32_sub28_add_neon:   17462.7  15776.1  13446.2  11264.7
vp9_inv_dct_dct_32x32_sub32_add_neon:   18575.5  17157.0  14249.3  12015.1
I.e. in general a very minor overhead for the full subpartition case due to the additional loads and cmps, but a significant speedup for the cases when we only need to process a small part of the actual input data. In common VP9 content in a few inspected clips, 70-90% of the non-dc-only 16x16 and 32x32 IDCTs only have nonzero coefficients in the upper left 8x8 or 16x16 subpartitions respectively.
This is cherrypicked from libav commit 9c8bc74c2b40537b0997f646c87c008042d788c2.
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	2017-01-14 21:13:30 +01:00
Ronald S. Bultje	1c8fbd7b90	checkasm/vp9: benchmark all sub-IDCTs (but not WHT or ADST).	2016-12-27 10:02:33 -05:00
Hendrik Leppkes	286d8bae61	Merge commit '7b1ae0e73ab7f7c5eabc70dbe2e579127c6e154f' * commit '7b1ae0e73ab7f7c5eabc70dbe2e579127c6e154f': checkasm/arm: preserve the stack alignment checkasm_checked_call Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>	2016-11-17 15:21:32 +01:00
Hendrik Leppkes	c0af1ee90d	Merge commit '80fbb7becae530167373fe5178966b7d7604306e' * commit '80fbb7becae530167373fe5178966b7d7604306e': checkasm: vp8.mc: initialize the full src buffer after ec32574209f Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>	2016-11-17 15:20:10 +01:00
Hendrik Leppkes	90b72f6bda	Merge commit '8c816c0c9b12fdefd9046415e97df299880bc9b8' * commit '8c816c0c9b12fdefd9046415e97df299880bc9b8': checkasm/arm: align the clobber check data properly for ldrd Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>	2016-11-17 15:06:10 +01:00
Hendrik Leppkes	4fe013fc70	Merge commit 'ec32574209f36467ef0d22c21a7e811ba98c15b6' * commit 'ec32574209f36467ef0d22c21a7e811ba98c15b6': checkasm: vp8: mc: test unequal width/height for partitions Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>	2016-11-17 15:05:25 +01:00
Hendrik Leppkes	47f75839e4	Merge commit 'f8d17d53957056c053a46f9320fa7ae6fe1479a5' * commit 'f8d17d53957056c053a46f9320fa7ae6fe1479a5': checkasm: Add tests for vp8dsp Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>	2016-11-14 15:29:08 +01:00
Hendrik Leppkes	f75035b06f	Merge commit 'e48746deec48e9ff195841bc3266b4e153a878cd' * commit 'e48746deec48e9ff195841bc3266b4e153a878cd': checkasm: h264dsp: Move the x and y variables into the randomize_buffer macro Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>	2016-11-13 23:02:39 +01:00
Hendrik Leppkes	6fc74934de	Merge commit 'dc7501e524dc3270335749302c7aa449973625f3' * commit 'dc7501e524dc3270335749302c7aa449973625f3': checkasm: Issue emms after benchmarking functions Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>	2016-10-07 13:18:05 +02:00
Martin Storsjö	2e95054ebb	checkasm: h264dsp: Initialize the padding area This fixes valgrind warnings about conditional jumps based on uninitialized data (even though the uninitialized data only ever was compared with a direct copy of the same uninitialized data). Signed-off-by: Martin Storsjö <martin@martin.st> Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	2016-08-11 19:55:16 +02:00
James Almer	54a0a52be1	checkasm/vp9dsp: use declare_func_emms in check_loopfilter Fixes checkasm failures on mmxext functions Signed-off-by: James Almer <jamrial@gmail.com>	2016-07-26 22:16:21 -03:00
Janne Grunau	7b1ae0e73a	checkasm/arm: preserve the stack alignment checkasm_checked_call The stack used by checkasm_checked_call_vfp was a multiple of 4 when the checked function is called. AAPCS requires a double word (8 byte) aligned stack at public interfaces. Since both calls are public interfaces the stack is misaligned when the checked function is called. Might fix the SIGBUS error in the armv7-linux-clang-3.7 fate config.	2016-07-13 22:18:53 +02:00
Janne Grunau	80fbb7beca	checkasm: vp8.mc: initialize the full src buffer after ec32574209f Fixes "Use of uninitialised value" valgrind warnings in checkasm.	2016-07-13 22:18:52 +02:00
Matthieu Bouron	a91c330a29	Merge commit '105998fb5ca3c343f5c8cb39ce3197f87a5e4d36' * commit '105998fb5ca3c343f5c8cb39ce3197f87a5e4d36': checkasm: Add tests for h264 idct Merged-by: Matthieu Bouron <matthieu.bouron@stupeflix.com>	2016-07-13 17:22:29 +02:00
Matthieu Bouron	495a40cecb	tests/checkasm: reduce cosmetic diff with libav Chunk was not merged in ca5ec2bf51d8c4f8bb0a829d0a65c70c968888a3.	2016-07-13 17:11:58 +02:00
Janne Grunau	8c816c0c9b	checkasm/arm: align the clobber check data properly for ldrd Should fix the SIGBUS in the armv7-linux-clang-3.7 fate target.	2016-07-10 13:35:41 +02:00
Janne Grunau	ec32574209	checkasm: vp8: mc: test unequal width/height for partitions	2016-07-10 13:35:41 +02:00
Martin Storsjö	f8d17d5395	checkasm: Add tests for vp8dsp The tests are inspired by similar tests for vp9 by Ronald Bultje. Signed-off-by: Martin Storsjö <martin@martin.st>	2016-07-08 14:10:46 +03:00
Michael Niedermayer	fb6b6b5166	tests/checkasm/pixblockdsp: Test 8 byte aligned positions The code is documented to require 8-byte alignment Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	2016-07-02 22:21:53 +02:00
Martin Storsjö	67cb2c0f73	checkasm: hevc: Iterate over features first, then over bitdepths
This avoids listing the same feature multiple times in the test output. Previously the output contained something like this:
SSE2:
 - hevc_mc.qpel [OK]
 - hevc_mc.epel [OK]
 - hevc_mc.unweighted_pred [OK]
 - hevc_mc.qpel [OK]
 - hevc_mc.epel [OK]
 - hevc_mc.unweighted_pred [OK]
Signed-off-by: Martin Storsjö <martin@martin.st>	2016-06-29 21:12:05 +03:00
Martin Storsjö	e48746deec	checkasm: h264dsp: Move the x and y variables into the randomize_buffer macro This avoids the risk of accidentally clobbering such variables outside of the macro if the same variables are used there. Signed-off-by: Martin Storsjö <martin@martin.st>	2016-06-28 14:24:04 +03:00
Martin Storsjö	e57de6faa1	checkasm: h264dsp: Initialize the padding area This fixes valgrind warnings about conditional jumps based on uninitialized data (even though the uninitialized data only ever was compared with a direct copy of the same uninitialized data). Signed-off-by: Martin Storsjö <martin@martin.st>	2016-06-28 14:24:01 +03:00
Clément Bœsch	5558ff3a9f	Merge commit '257f00ec1ab06a2a161f535036c6512f3fc8e801' * commit '257f00ec1ab06a2a161f535036c6512f3fc8e801': Split global .gitignore file into per-directory files Merged-by: Clément Bœsch <clement@stupeflix.com>	2016-06-22 11:28:51 +02:00
Clément Bœsch	8ef57a0d61	Merge commit '41ed7ab45fc693f7d7fc35664c0233f4c32d69bb' * commit '41ed7ab45fc693f7d7fc35664c0233f4c32d69bb': cosmetics: Fix spelling mistakes Merged-by: Clément Bœsch <u@pkh.me>	2016-06-21 21:55:34 +02:00
Martin Storsjö	dc7501e524	checkasm: Issue emms after benchmarking functions The functions may not clean up properly after using MMX registers. For the normal testing calls, the checkasm_checked_call functions will do the cleanup (and check that functions that should clean up do it as well), but when benchmarking functions that don't clean up, we don't currently properly clean up at all. This causes issues if a benchmarked function is followed by testing of a function that is supposed to not clobber the MMX/FPU state but doesn't touch it at all. Signed-off-by: Martin Storsjö <martin@martin.st>	2016-06-21 22:09:29 +03:00
Martin Storsjö	105998fb5c	checkasm: Add tests for h264 idct The tests are inspired by similar tests for vp9 by Ronald Bultje. Signed-off-by: Martin Storsjö <martin@martin.st>	2016-06-17 21:37:56 +03:00
Diego Biurrun	257f00ec1a	Split global .gitignore file into per-directory files	2016-05-13 14:55:56 +02:00
Derek Buitenhuis	ca5ec2bf51	Merge commit '01621202aad7e27b2a05c71d9ad7a19dfcbe17ec' * commit '01621202aad7e27b2a05c71d9ad7a19dfcbe17ec': build: miscellaneous cosmetics Merged-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>	2016-05-09 16:25:28 +01:00
Vittorio Giovara	41ed7ab45f	cosmetics: Fix spelling mistakes Signed-off-by: Diego Biurrun <diego@biurrun.de>	2016-05-04 18:16:21 +02:00
Michael Niedermayer	3c0511f29e	tests/checkasm/vf_colorspace: Make bpp_mask const Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	2016-04-13 22:39:41 +02:00
Michael Niedermayer	4d59d075a9	tests/checkasm/vf_colorspace: Fix dst array sizes Suggested & Approved by: BBB Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	2016-04-12 23:46:52 +02:00
Ronald S. Bultje	5ce703a6bf	vf_colorspace: x86-64 SIMD (SSE2) optimizations.	2016-04-12 16:42:48 -04:00
Diego Biurrun	01621202aa	build: miscellaneous cosmetics Restore alphabetical order in lists, break overly long lines, do some prettyprinting, add some explanatory section comments, group parts together that belong together logically.	2016-04-07 15:26:08 +02:00
Derek Buitenhuis	ca408cf557	Merge commit '7c82d31cbe9fc5d5a321ad49c14a472bd629b50f' * commit '7c82d31cbe9fc5d5a321ad49c14a472bd629b50f': checkasm: Use standard multiple inclusion guards Merged-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>	2016-02-24 17:36:52 +00:00
James Almer	26034929d5	checkasm: bench each vf_blend mode once Also bench a smaller buffer. This drastically reduces --bench runtime and reports smaller, more readable numbers. Reviewed-by: Michael Niedermayer <michael@niedermayer.cc> Signed-off-by: James Almer <jamrial@gmail.com>	2016-02-22 13:54:07 -03:00
James Almer	76af0c7877	checkasm: fix dependencies for vf_blend tests They will now compile if avcodec is disabled Reviewed-by: Paul B Mahol <onemda@gmail.com> Signed-off-by: James Almer <jamrial@gmail.com>	2016-02-19 16:31:55 -03:00
Diego Biurrun	7c82d31cbe	checkasm: Use standard multiple inclusion guards	2016-02-18 15:35:44 +01:00
Timothy Gu	ebf648d490	checkasm/vf_blend: Decrease iteration count The test is already slow.	2016-02-14 10:48:24 -08:00
Timothy Gu	a953a2991e	checkasm: Add vf_blend tests	2016-02-14 10:46:56 -08:00
Timothy Gu	180f9a0958	all: Make header guard names consistent	2016-01-31 15:44:11 -08:00
foo86	ae5b2c5250	avcodec/dca: add new decoder based on libdcadec	2016-01-31 17:09:38 +01:00
foo86	4608996772	avcodec/dca: remove old decoder Remove all files and functions which are not going to be reused, and disable all functions and FATE tests temporarily which will be.	2016-01-31 17:09:38 +01:00
Geza Lore	cc602061ee	x86inc: Add debug symbols indicating sizes of compiled functions Some debuggers/profilers use this metadata to determine which function a given instruction is in; without it they can get confused by local labels (if you haven't stripped those). On the other hand, some tools are still confused even with this metadata. e.g. this fixes `gdb`, but not `perf`. Currently only implemented for ELF. Signed-off-by: Anton Khirnov <anton@khirnov.net>	2016-01-23 20:46:28 +01:00
Geza Lore	d39c229e54	x86inc: Add debug symbols indicating sizes of compiled functions Some debuggers/profilers use this metadata to determine which function a given instruction is in; without it they can get confused by local labels (if you haven't stripped those). On the other hand, some tools are still confused even with this metadata. e.g. this fixes `gdb`, but not `perf`. Currently only implemented for ELF.	2016-01-21 23:19:46 +01:00
Ronald S. Bultje	8c9103c4af	checkasm: add videodsp emulated_edge_mc test.	2016-01-21 10:25:27 -05:00
Hendrik Leppkes	7e29903526	Merge commit 'fec76cd430f3c865183a6e5b4caec0743e055605' * commit 'fec76cd430f3c865183a6e5b4caec0743e055605': checkasm: Check register clobbering on aarch64 Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>	2016-01-19 08:50:44 +01:00
Hendrik Leppkes	0b40e290e3	Merge commit '26ec75aec3576daea691dee53a78ec67c0dc4040' * commit '26ec75aec3576daea691dee53a78ec67c0dc4040': checkasm: Check register clobbering on arm Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>	2016-01-19 08:49:27 +01:00
Martin Storsjö	fec76cd430	checkasm: Check register clobbering on aarch64 This is disabled on iOS, since iOS uses a slightly different ABI for vararg parameters. Signed-off-by: Martin Storsjö <martin@martin.st>	2016-01-07 09:33:24 +02:00
Martin Storsjö	26ec75aec3	checkasm: Check register clobbering on arm Use two separate functions, depending on whether VFP/NEON is available. This is set to require armv5te - it uses blx, which is only available since armv5t, but we don't have a separate configure item for that. (It also uses ldrd, which requires armv5te, but this could be avoided if necessary.) Signed-off-by: Martin Storsjö <martin@martin.st>	2016-01-07 09:33:24 +02:00
Hendrik Leppkes	0eefc758e2	Merge commit 'f0f54117c8f206e8045d301c2eb975b26e9f263d' * commit 'f0f54117c8f206e8045d301c2eb975b26e9f263d': checkasm: x86: post commit review fixes Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>	2016-01-02 13:26:28 +01:00

135 Commits