FFmpeg

mirror of https://github.com/FFmpeg/FFmpeg.git synced 2024-12-07 11:13:41 +02:00

Author	SHA1	Message	Date
Lynne	bbe95f7353	x86: replace explicit REP_RETs with RETs From x86inc: > On AMD cpus <=K10, an ordinary ret is slow if it immediately follows either > a branch or a branch target. So switch to a 2-byte form of ret in that case. > We can automatically detect "follows a branch", but not a branch target. > (SSSE3 is a sufficient condition to know that your cpu doesn't have this problem.) x86inc can automatically determine whether to use REP_RET rather than RET in most of these cases, so impact is minimal. Additionally, a few REP_RETs were used unnecessarily, despite the return being nowhere near a branch. The only CPUs affected were AMD K10s, made between 2007 and 2011, 16 years ago and 12 years ago, respectively. In the future, everyone involved with x86inc should consider dropping REP_RETs altogether.	2023-02-01 04:23:55 +01:00
James Almer	37388b119c	checkasm: add a checkasm_checked_call function that doesn't issue emms Meant for DSP functions returning a float or double, as they'd fail if emms is called after every run on x86_32. Signed-off-by: James Almer <jamrial@gmail.com>	2017-06-14 19:18:56 -03:00
James Almer	f23078904f	Merge commit '2816f8a8bb33bd67fec5e94f5d357918caf4e055' * commit '2816f8a8bb33bd67fec5e94f5d357918caf4e055': build: Drop arch-specific checkasm Makefiles Merged-by: James Almer <jamrial@gmail.com>	2017-03-23 18:01:47 -03:00
James Almer	3ddae9eee9	Merge commit '93d5b022a9fd3a1a1f9c521a1eac7f0410e05b81' * commit '93d5b022a9fd3a1a1f9c521a1eac7f0410e05b81': build: Drop duplicate asm recipe Merged-by: James Almer <jamrial@gmail.com>	2017-03-23 17:57:35 -03:00
Diego Biurrun	2816f8a8bb	build: Drop arch-specific checkasm Makefiles They only contain one line and will never contain more.	2016-10-17 16:25:38 +02:00
Diego Biurrun	93d5b022a9	build: Drop duplicate asm recipe And move the asm recipe to the top-level Makefile next to the other local pattern rules for .o files.	2016-10-17 16:25:35 +02:00
Geza Lore	cc602061ee	x86inc: Add debug symbols indicating sizes of compiled functions Some debuggers/profilers use this metadata to determine which function a given instruction is in; without it they can get confused by local labels (if you haven't stripped those). On the other hand, some tools are still confused even with this metadata. e.g. this fixes `gdb`, but not `perf`. Currently only implemented for ELF. Signed-off-by: Anton Khirnov <anton@khirnov.net>	2016-01-23 20:46:28 +01:00
Geza Lore	d39c229e54	x86inc: Add debug symbols indicating sizes of compiled functions Some debuggers/profilers use this metadata to determine which function a given instruction is in; without it they can get confused by local labels (if you haven't stripped those). On the other hand, some tools are still confused even with this metadata. e.g. this fixes `gdb`, but not `perf`. Currently only implemented for ELF.	2016-01-21 23:19:46 +01:00
Hendrik Leppkes	0eefc758e2	Merge commit 'f0f54117c8f206e8045d301c2eb975b26e9f263d' * commit 'f0f54117c8f206e8045d301c2eb975b26e9f263d': checkasm: x86: post commit review fixes Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>	2016-01-02 13:26:28 +01:00
Hendrik Leppkes	69ead86027	Merge commit '711781d7a1714ea4eb0217eb1ba04811978c43d1' * commit '711781d7a1714ea4eb0217eb1ba04811978c43d1': x86: checkasm: check for or handle missing cleanup after MMX instructions Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>	2016-01-02 11:55:44 +01:00
Janne Grunau	f0f54117c8	checkasm: x86: post commit review fixes Check the full FPU tag word instead of only the lower half and simplify the comparison. Use upper-case function base name as macro name to instantiate both checked_call variants.	2015-12-29 12:50:38 +01:00
Janne Grunau	711781d7a1	x86: checkasm: check for or handle missing cleanup after MMX instructions Not every asm routine is expected to clear the MMX state after returning. It is however a requisite for testing floating point code in checkasm. Annotate functions requiring cleanup with declare_func_emms() and issue emms after the call. The remaining functions are checked for having a cleared MMX state after return.	2015-12-21 17:40:18 +01:00
Henrik Gramner	cc28552100	checkasm/x86: Correctly handle variadic functions The System V ABI on x86-64 specifies that the al register contains an upper bound of the number of arguments passed in vector registers when calling variadic functions, so we aren't allowed to clobber it. checkasm_fail_func() is a variadic function so also zero al before calling it. Signed-off-by: Anton Khirnov <anton@khirnov.net>	2015-09-28 14:25:59 +02:00
Henrik Gramner	7ca1de5b4f	checkasm/x86: Correctly handle variadic functions The System V ABI on x86-64 specifies that the al register contains an upper bound of the number of arguments passed in vector registers when calling variadic functions, so we aren't allowed to clobber it. checkasm_fail_func() is a variadic function so also zero al before calling it.	2015-09-27 20:21:26 +02:00
Henrik Gramner	c457bdebe7	checkasm: Fix floating point arguments on 64-bit Windows Signed-off-by: Anton Khirnov <anton@khirnov.net>	2015-08-28 09:54:54 +02:00
Henrik Gramner	33a58d7bf4	checkasm: Fix floating point arguments on 64-bit Windows	2015-08-25 19:34:46 +02:00
Henrik Gramner	515b69f8f8	checkasm: Explicitly declare function prototypes Now we no longer have to rely on function pointers intentionally declared without specified argument types. This makes it easier to support functions with floating point parameters or return values as well as functions returning 64-bit values on 32-bit architectures. It also avoids having to explicitly cast strides to ptrdiff_t for example. Signed-off-by: Anton Khirnov <anton@khirnov.net>	2015-08-20 19:22:34 +02:00
Henrik Gramner	e13da244f4	checkasm: x86: properly save rdx/edx in checked_call() If the return value doesn't fit in a single register rdx/edx can in some cases be used in addition to rax/eax. Doesn't affect any of the existing checkasm tests but might be useful later. Also comment the relevant code a bit better. Signed-off-by: Anton Khirnov <anton@khirnov.net>	2015-08-20 19:22:06 +02:00
Henrik Gramner	e6b8797b82	checkasm: x86: properly save rdx/edx in checked_call() If the return value doesn't fit in a single register rdx/edx can in some cases be used in addition to rax/eax. Doesn't affect any of the existing checkasm tests but might be useful later. Also comment the relevant code a bit better.	2015-08-19 16:17:35 +02:00
Henrik Gramner	18b101ff59	checkasm: Explicitly declare function prototypes Now we no longer have to rely on function pointers intentionally declared without specified argument types. This makes it easier to support functions with floating point parameters or return values as well as functions returning 64-bit values on 32-bit architectures. It also avoids having to explicitly cast strides to ptrdiff_t for example.	2015-08-19 16:17:35 +02:00
Michael Niedermayer	7c944b0a36	tests/checkasm/x86/Makefile: Use ASMSTRIPFLAGS for asm Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	2015-07-13 03:15:44 +02:00
Michael Niedermayer	f14fc55969	Merge commit '8bc67ec2c0d2b5444d51a1bed1d50f0e10d92717' * commit '8bc67ec2c0d2b5444d51a1bed1d50f0e10d92717': Checkasm: assembly testing and benchmarking tool Merged-by: Michael Niedermayer <michael@niedermayer.cc>	2015-07-12 21:03:06 +02:00
Henrik Gramner	8bc67ec2c0	Checkasm: assembly testing and benchmarking tool It provides the following features: * verify correctness by comparing output to the C version. * detect failure to save and restore clobbered callee-saved registers. * detect 32-bit parameters being used as if they were 64-bit in x86-64 (the upper halves are not guaranteed to be zero - but in practice they very often are, which makes those bugs hard to spot otherwise). * easy benchmarking. Compile by running 'make checkasm'. Execute by running 'tests/checkasm/checkasm'. Optional arguments are '--bench' to run benchmarks for all functions, '--bench=<pattern>' to run benchmarks for all functions that start with <pattern>, and '<integer>' to seed the PRNG for reproducible results. Contains unit tests for most h264pred functions to get started; more tests can be added afterwards using those as a reference. Loosely based on code from x264. Currently only supports x86 and x86-64, but additional architectures shouldn't be too much of an obstacle to add. Note that functions with floating point parameters or floating point return values are not supported. Some compiler-specific features or preprocessor hacks would likely be required to add support for that. Signed-off-by: Janne Grunau <janne-libav@jannau.net>	2015-07-12 16:39:07 +02:00

23 Commits