mirror of https://github.com/FFmpeg/FFmpeg.git synced 2024-12-07 11:13:41 +02:00
Commit Graph

23 Commits

Author SHA1 Message Date
Lynne
bbe95f7353 x86: replace explicit REP_RETs with RETs
From x86inc:
> On AMD cpus <=K10, an ordinary ret is slow if it immediately follows either
> a branch or a branch target. So switch to a 2-byte form of ret in that case.
> We can automatically detect "follows a branch", but not a branch target.
> (SSSE3 is a sufficient condition to know that your cpu doesn't have this problem.)

x86inc can automatically determine whether to use REP_RET rather than
RET in most of these cases, so impact is minimal. Additionally, a few
REP_RETs were used unnecessarily, despite the return being nowhere near a
branch.

The only CPUs affected were AMD K10s, made between 2007 and 2011, 16
years ago and 12 years ago, respectively.

In the future, everyone involved with x86inc should consider dropping
REP_RETs altogether.
2023-02-01 04:23:55 +01:00
James Almer
37388b119c checkasm: add a checkasm_checked_call function that doesn't issue emms
Meant for DSP functions returning a float or double, as they'd fail if emms
is called after every run on x86_32.

Signed-off-by: James Almer <jamrial@gmail.com>
2017-06-14 19:18:56 -03:00
James Almer
f23078904f Merge commit '2816f8a8bb33bd67fec5e94f5d357918caf4e055'
* commit '2816f8a8bb33bd67fec5e94f5d357918caf4e055':
  build: Drop arch-specific checkasm Makefiles

Merged-by: James Almer <jamrial@gmail.com>
2017-03-23 18:01:47 -03:00
James Almer
3ddae9eee9 Merge commit '93d5b022a9fd3a1a1f9c521a1eac7f0410e05b81'
* commit '93d5b022a9fd3a1a1f9c521a1eac7f0410e05b81':
  build: Drop duplicate asm recipe

Merged-by: James Almer <jamrial@gmail.com>
2017-03-23 17:57:35 -03:00
Diego Biurrun
2816f8a8bb build: Drop arch-specific checkasm Makefiles
They only contain one line and will never contain more.
2016-10-17 16:25:38 +02:00
Diego Biurrun
93d5b022a9 build: Drop duplicate asm recipe
And move the asm recipe to the top-level Makefile next to the other
local pattern rules for .o files.
2016-10-17 16:25:35 +02:00
Geza Lore
cc602061ee x86inc: Add debug symbols indicating sizes of compiled functions
Some debuggers/profilers use this metadata to determine which function a
given instruction is in; without it they can get confused by local labels
(if you haven't stripped those). On the other hand, some tools are still
confused even with this metadata. e.g. this fixes `gdb`, but not `perf`.

Currently only implemented for ELF.

Signed-off-by: Anton Khirnov <anton@khirnov.net>
2016-01-23 20:46:28 +01:00
Geza Lore
d39c229e54 x86inc: Add debug symbols indicating sizes of compiled functions
Some debuggers/profilers use this metadata to determine which function a
given instruction is in; without it they can get confused by local labels
(if you haven't stripped those). On the other hand, some tools are still
confused even with this metadata. e.g. this fixes `gdb`, but not `perf`.

Currently only implemented for ELF.
2016-01-21 23:19:46 +01:00
Hendrik Leppkes
0eefc758e2 Merge commit 'f0f54117c8f206e8045d301c2eb975b26e9f263d'
* commit 'f0f54117c8f206e8045d301c2eb975b26e9f263d':
  checkasm: x86: post commit review fixes

Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>
2016-01-02 13:26:28 +01:00
Hendrik Leppkes
69ead86027 Merge commit '711781d7a1714ea4eb0217eb1ba04811978c43d1'
* commit '711781d7a1714ea4eb0217eb1ba04811978c43d1':
  x86: checkasm: check for or handle missing cleanup after MMX instructions

Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>
2016-01-02 11:55:44 +01:00
Janne Grunau
f0f54117c8 checkasm: x86: post commit review fixes
Check the full FPU tag word instead of only the lower half and simplify
the comparison.
Use upper-case function base name as macro name to instantiate both
checked_call variants.
2015-12-29 12:50:38 +01:00
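The tag-word check described in the commit above can be modelled roughly as follows. This is only an illustrative sketch in Python, not checkasm's assembly: the function names are hypothetical, and reading "lower half" as the low 8 bits of the tag word is an assumption. It relies on the x87 convention that each of the eight registers has a 2-bit tag, with `11` meaning empty, so a fully cleared state reads as `0xFFFF`.

```python
EMPTY = 0b11  # x87 tag value meaning "register is empty"


def tags(tag_word):
    """Split the 16-bit tag word into eight 2-bit tags, register 0 first."""
    return [(tag_word >> (2 * i)) & 0b11 for i in range(8)]


def clean_full(tag_word):
    """True only if all eight registers are tagged empty."""
    return (tag_word & 0xFFFF) == 0xFFFF


def clean_lower_half_only(tag_word):
    """Assumed older behaviour: inspects registers 0-3 only."""
    return (tag_word & 0xFF) == 0xFF
```

With a tag word of `0x3FFF` (register 7 still in use), the lower-half check reports a clean state while the full check does not, which is the kind of gap the commit closes.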
Janne Grunau
711781d7a1 x86: checkasm: check for or handle missing cleanup after MMX instructions
Not every asm routine is expected to clear the MMX state after returning.
It is however a requisite for testing floating point code in checkasm.
Annotate functions requiring cleanup with declare_func_emms() and issue
emms after the call. The remaining functions are checked for having a
cleared MMX state after return.
2015-12-21 17:40:18 +01:00
Henrik Gramner
cc28552100 checkasm/x86: Correctly handle variadic functions
The System V ABI on x86-64 specifies that the al register contains an upper
bound of the number of arguments passed in vector registers when calling
variadic functions, so we aren't allowed to clobber it.

checkasm_fail_func() is a variadic function so also zero al before calling it.

Signed-off-by: Anton Khirnov <anton@khirnov.net>
2015-09-28 14:25:59 +02:00
Henrik Gramner
7ca1de5b4f checkasm/x86: Correctly handle variadic functions
The System V ABI on x86-64 specifies that the al register contains an upper
bound of the number of arguments passed in vector registers when calling
variadic functions, so we aren't allowed to clobber it.

checkasm_fail_func() is a variadic function so also zero al before calling it.
2015-09-27 20:21:26 +02:00
Henrik Gramner
c457bdebe7 checkasm: Fix floating point arguments on 64-bit Windows
Signed-off-by: Anton Khirnov <anton@khirnov.net>
2015-08-28 09:54:54 +02:00
Henrik Gramner
33a58d7bf4 checkasm: Fix floating point arguments on 64-bit Windows
2015-08-25 19:34:46 +02:00
Henrik Gramner
515b69f8f8 checkasm: Explicitly declare function prototypes
Now we no longer have to rely on function pointers intentionally
declared without specified argument types.

This makes it easier to support functions with floating point parameters
or return values as well as functions returning 64-bit values on 32-bit
architectures. It also avoids having to explicitly cast strides to
ptrdiff_t for example.

Signed-off-by: Anton Khirnov <anton@khirnov.net>
2015-08-20 19:22:34 +02:00
Henrik Gramner
e13da244f4 checkasm: x86: properly save rdx/edx in checked_call()
If the return value doesn't fit in a single register rdx/edx can in some
cases be used in addition to rax/eax.

Doesn't affect any of the existing checkasm tests but might be useful later.

Also comment the relevant code a bit better.

Signed-off-by: Anton Khirnov <anton@khirnov.net>
2015-08-20 19:22:06 +02:00
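To see why the second register matters, consider a 64-bit return value on 32-bit x86, where the low half is returned in eax and the high half in edx. The sketch below only models the arithmetic in Python; `split_u64` and `join_u64` are hypothetical helpers, not part of checkasm.

```python
MASK32 = 0xFFFFFFFF


def split_u64(value):
    """Return (eax, edx): the low and high 32-bit halves of a 64-bit value."""
    return value & MASK32, (value >> 32) & MASK32


def join_u64(eax, edx):
    """Rebuild the 64-bit value from its two register halves."""
    return ((edx & MASK32) << 32) | (eax & MASK32)
```

A wrapper that preserves only eax would hand back `join_u64(eax, 0)`, silently dropping the upper half whenever edx was nonzero.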
Henrik Gramner
e6b8797b82 checkasm: x86: properly save rdx/edx in checked_call()
If the return value doesn't fit in a single register rdx/edx can in some
cases be used in addition to rax/eax.

Doesn't affect any of the existing checkasm tests but might be useful later.

Also comment the relevant code a bit better.
2015-08-19 16:17:35 +02:00
Henrik Gramner
18b101ff59 checkasm: Explicitly declare function prototypes
Now we no longer have to rely on function pointers intentionally
declared without specified argument types.

This makes it easier to support functions with floating point parameters
or return values as well as functions returning 64-bit values on 32-bit
architectures. It also avoids having to explicitly cast strides to
ptrdiff_t for example.
2015-08-19 16:17:35 +02:00
Michael Niedermayer
7c944b0a36 tests/checkasm/x86/Makefile: Use ASMSTRIPFLAGS for asm
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2015-07-13 03:15:44 +02:00
Michael Niedermayer
f14fc55969 Merge commit '8bc67ec2c0d2b5444d51a1bed1d50f0e10d92717'
* commit '8bc67ec2c0d2b5444d51a1bed1d50f0e10d92717':
  Checkasm: assembly testing and benchmarking tool

Merged-by: Michael Niedermayer <michael@niedermayer.cc>
2015-07-12 21:03:06 +02:00
Henrik Gramner
8bc67ec2c0 Checkasm: assembly testing and benchmarking tool
It provides the following features:
 * verify correctness by comparing output to the C version.
 * detect failure to save and restore clobbered callee-saved registers.
 * detect 32-bit parameters being used as if they were 64-bit in x86-64
   (the upper halves are not guaranteed to be zero - but in practice
   they very often are, which makes those bugs hard to spot otherwise).
 * easy benchmarking.
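
The third feature can be illustrated with a small model. This is a hedged sketch in Python rather than checkasm's actual mechanism: the function names and the garbage constant are made up for illustration. The idea is to call the function with deliberately dirty upper halves, so that code which wrongly reads the full 64-bit register produces a visibly different result.

```python
MASK32 = 0xFFFFFFFF


def sum_first_n_correct(values, n_reg):
    """Uses only the low 32 bits of the register, as a 32-bit parameter requires."""
    n = n_reg & MASK32
    return sum(values[:n])


def sum_first_n_buggy(values, n_reg):
    """Bug: treats the whole 64-bit register as the count."""
    return sum(values[:n_reg])


def call_with_dirty_upper_half(func, values, n):
    """Pass n in a simulated 64-bit register whose upper half holds garbage."""
    garbage = 0xDEADBEEF << 32  # arbitrary nonzero upper half (assumption)
    return func(values, garbage | (n & MASK32))
```

With a clean register both versions agree, which is why such bugs usually go unnoticed; with the dirty register only the correct version still returns the expected sum.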

Compile by running 'make checkasm'.
Execute by running 'tests/checkasm/checkasm'.

Optional arguments are '--bench' to run benchmarks for all functions,
'--bench=<pattern>' to run benchmarks for all functions whose names start
with <pattern>, and '<integer>' to seed the PRNG for reproducible results.

Contains unit tests for most h264pred functions to get started, more tests
can be added afterwards using those as a reference.

Loosely based on code from x264. Currently only supports x86 and x86-64,
but additional architectures shouldn't be too much of an obstacle to add.

Note that functions with floating point parameters or floating point
return values are not supported. Some compiler-specific features or
preprocessor hacks would likely be required to add support for that.

Signed-off-by: Janne Grunau <janne-libav@jannau.net>
2015-07-12 16:39:07 +02:00
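The core idea of the commit above, comparing an optimized routine against a reference on pseudo-random inputs drawn from a seedable PRNG, can be sketched in a few lines. Python is used here only for brevity; the saturating-add functions and the `check` helper are hypothetical stand-ins, not code from checkasm.

```python
import random


def ref_add_sat(a, b):
    """Reference version: saturating add of two unsigned 8-bit values."""
    return min(a + b, 255)


def opt_add_sat(a, b):
    """Stand-in for an optimized version, using a branch-free bit trick."""
    s = a + b
    return (s | -(s >> 8)) & 0xFF


def check(ref, opt, seed, runs=1000):
    """Return the first mismatching input pair, or None if all runs agree."""
    rng = random.Random(seed)  # a fixed seed makes any failure reproducible
    for _ in range(runs):
        a, b = rng.randrange(256), rng.randrange(256)
        if ref(a, b) != opt(a, b):
            return (a, b)
    return None
```

Calling `check(ref_add_sat, opt_add_sat, seed=1234)` returns `None` when the two versions agree. Rerunning with the same seed replays the same inputs, which is what makes a reported failure easy to reproduce.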