FFmpeg

mirror of https://github.com/FFmpeg/FFmpeg.git synced 2024-12-18 03:19:31 +02:00

Author	SHA1	Message	Date
Henrik Gramner	aad1b6786e	x86inc: Add some additional cpuflag relations Simplifies writing assembly code that depends on available instructions. LZCNT implies SSE2; BMI1 implies AVX+LZCNT; AVX2 implies BMI2.	2017-06-12 11:41:25 +02:00
Anton Mitrofanov	d991b3e8a8	x86inc: Remove argument from WIN64_RESTORE_XMM The use of rsp was pretty much hardcoded there and probably didn't work otherwise with stack_size > 0.	2017-06-09 13:43:01 +02:00
Henrik Gramner	cd4ca82459	x86inc: Prefer r14/r15 over r12/r13 on x86-64 Due to a peculiarity in the ModR/M addressing encoding, the r12 and r13 registers sometimes require an additional byte when used as a base register. r14 and r15 don't have that issue, so prefer using them.	2017-06-09 13:43:00 +02:00
Henrik Gramner	88dcdfad09	x86inc: Make REP_RET identical to RET in SSSE3+ functions There's no point in emitting a rep prefix before ret on modern CPUs.	2017-06-09 13:43:00 +02:00
Henrik Gramner	406e0ddc0b	x86inc: Fix call with memory operands We overload the `call` instruction with a macro, but it would misbehave when the macro argument wasn't a valid identifier. Fix it by explicitly checking if the argument is an identifier.	2017-06-09 13:43:00 +02:00
Henrik Gramner	cd09e3b349	x86inc: Avoid using eax/rax for storing the stack pointer When allocating stack space with an alignment requirement that is larger than the current stack alignment we need to store a copy of the original stack pointer in order to be able to restore it later. If we chose to use another register for this purpose we should not pick eax/rax since it can be overwritten as a return value.	2017-01-09 16:00:29 +01:00
Anton Mitrofanov	e428f3b30c	x86inc: Enable AVX emulation in additional cases Allows emulation to work when dst is equal to src2 as long as the instruction is commutative, e.g. `addps m0, m1, m0`.	2016-04-20 19:16:22 +02:00
Anton Mitrofanov	4bd5583ace	x86inc: Improve handling of %ifid with multi-token parameters The yasm/nasm preprocessor only checks the first token, which means that parameters such as `dword [rax]` are treated as identifiers, which is generally not what we want.	2016-04-20 19:16:22 +02:00
Anton Mitrofanov	42be240ad6	x86inc: Fix AVX emulation of some instructions	2016-04-20 19:16:22 +02:00
Henrik Gramner	8dd3ee9ddd	x86inc: Fix AVX emulation of scalar float instructions Those instructions are not commutative since they only change the first element in the vector and leave the rest unmodified.	2016-04-20 19:16:22 +02:00
Geza Lore	d39c229e54	x86inc: Add debug symbols indicating sizes of compiled functions Some debuggers/profilers use this metadata to determine which function a given instruction is in; without it they can get confused by local labels (if you haven't stripped those). On the other hand, some tools are still confused even with this metadata, e.g. this fixes `gdb`, but not `perf`. Currently only implemented for ELF.	2016-01-21 23:19:46 +01:00
Henrik Gramner	d3662777e0	x86inc: Avoid creating unnecessary local labels The REP_RET workaround is only needed on old AMD cpus, and the labels clutter up the symbol table and confuse debugging/profiling tools, so use EQU to create SHN_ABS symbols instead of creating local labels. Furthermore, skip the workaround completely in functions that definitely won't run on such cpus. Note that EQU is just creating a local label when using nasm instead of yasm. This is probably a bug, but at least it doesn't break anything.	2016-01-21 23:19:46 +01:00
Henrik Gramner	87b587d4fe	x86inc: Simplify AUTO_REP_RET cpuflags is never undefined any more, it's set to 0 instead. Also fix an incorrect comment.	2016-01-21 23:19:46 +01:00
Henrik Gramner	2d60b18cf0	x86inc: Use more consistent indentation	2016-01-21 23:19:46 +01:00
Henrik Gramner	dfe771dc5a	x86inc: Preserve arguments when allocating stack space When allocating stack space with a larger alignment than the known stack alignment a temporary register is used for storing the stack pointer. Ensure that this isn't one of the registers used for passing arguments.	2016-01-21 23:19:46 +01:00
Henrik Gramner	b1496008ee	x86inc: Improve FMA instruction handling * Correctly handle FMA instructions with memory operands. * Print a warning if FMA instructions are used without the correct cpuflag. * Simplify the instantiation code. * Clarify documentation. Only the last operand in FMA3 instructions can be a memory operand. When converting FMA4 instructions to FMA3 instructions we can utilize the fact that multiply is a commutative operation and reorder operands if necessary to ensure that a memory operand is used only as the last operand.	2016-01-21 23:19:46 +01:00
Henrik Gramner	6cbd0fdf28	x86inc: Be more verbose in assertion failures	2016-01-21 23:19:46 +01:00
Rodger Combs	1e477a970f	lavu: add AESNI CPU flag	2015-10-28 04:23:14 -05:00
Henrik Gramner	17710550c4	x86inc: Make cpuflag() and notcpuflag() return 0 or 1 Makes it possible to use them in arithmetic expressions.	2015-10-01 18:14:12 +02:00
Anton Mitrofanov	8db0f71b49	x86inc: warn if XOP integer FMA instruction emulation is impossible Signed-off-by: Henrik Gramner <henrik@gramner.com>	2015-08-05 16:15:40 +02:00
Henrik Gramner	f0b7882ceb	x86inc: Drop SECTION_TEXT macro The .text section is already 16-byte aligned by default on all supported platforms so `SECTION_TEXT` isn't any different from `SECTION .text`.	2015-08-04 20:13:09 +02:00
Henrik Gramner	826790f596	x86inc: Support arbitrary stack alignments Change ALLOC_STACK to always align the stack before allocating stack space for consistency. Previously alignment would occur either before or after allocating stack space depending on whether manual alignment was required or not.	2015-08-04 20:13:09 +02:00
James Almer	5750d6c5e9	x86: move XOP emulation code back to x86inc Only two functions that use xop multiply-accumulate instructions where the first operand is the same as the fourth actually took advantage of the macros. This further reduces differences with x264's x86inc. Reviewed-by: Ronald S. Bultje <rsbultje@gmail.com> Signed-off-by: James Almer <jamrial@gmail.com>	2015-08-03 17:11:13 -03:00
Henrik Gramner	127203ba5a	x86inc: Various minor backports from x264 Reviewed-by: "Ronald S. Bultje" <rsbultje@gmail.com> Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	2015-08-03 04:08:33 +02:00
Henrik Gramner	f151fbd9e5	x86inc: Disable vpbroadcastq workaround in newer yasm versions The bug was fixed in 1.3.0, so only perform the workaround in earlier versions. Reviewed-by: "Ronald S. Bultje" <rsbultje@gmail.com> Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	2015-08-03 03:13:20 +02:00
Timothy Gu	204b228a1d	x86inc: Clear __SECT__ This commit silences warning(s) like: libavcodec/x86/fft.asm:93: warning: section flags ignored on section redeclaration The cause of this warning is that `struc` and `endstruc` attempt to revert to the previous section state [1]. The section state is stored in the macro __SECT__, defined by x86inc.asm to be `.note.GNU-stack ...`, through the `SECTION` directive [2]. Thus, the `.note.GNU-stack` section is defined twice (once in x86inc.asm, once during `endstruc`), causing the warning. That is the first part of the commit: using the primitive `[section]` format for .note.GNU-stack etc., which does not update `__SECT__` [2]. That fixes only half of the problem. Even without any `SECTION` directives, `__SECT__` is predefined as `.text`, which conflicts with the later `SECTION_TEXT` (which expands to `.text align=16`). [1]: http://www.nasm.us/doc/nasmdoc6.html#section-6.4 [2]: http://www.nasm.us/doc/nasmdoc6.html#section-6.3 Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2015-05-28 00:08:37 +02:00
Christophe Gisquet	d9293c776e	x86inc: Correctly warn on use of SSE2 instructions in SSE functions SSE2 instructions that are XMM-implementations of pre-existing MMX/MMX2 instructions did not issue warnings when used in SSE functions. Handle it by also checking the register type when such instructions are used. Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2015-02-17 12:35:58 +01:00
Christophe Gisquet	e93d3a22cb	x86: lavu/x264asm: fix ymm register instantiation This mimics what is done for the other instruction sets. Tested-by: James Almer <jamrial@gmail.com> Tested-by: Mickaël Raulet <mraulet@gmail.com> Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2015-02-04 00:18:29 +01:00
James Darnley	12120174ce	lavu/x86/x86inc: deprecate INIT_AVX The same can be done with INIT_XMM avx Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2015-02-02 01:09:16 +01:00
Anton Mitrofanov	a1684311b3	x264asm: warn when inappropriate instruction used in function with specified cpuflags Requested-by: Christophe Gisquet <christophe.gisquet@gmail.com> Requested-by: "Ronald S. Bultje" <rsbultje@gmail.com>	2015-02-02 00:06:14 +01:00
Henrik Gramner	428aa14a48	x86inc: Make INIT_CPUFLAGS support an arbitrary number of cpuflags Previously there was a limit of two cpuflags. Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2014-09-05 14:06:03 +02:00
Henrik Gramner	720c21d11f	x86inc: Make ym# behave the same way as xm# This makes more sense for future implementations of templates with zmm registers. Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2014-09-05 01:55:28 +02:00
Loren Merritt	a4dbabc8b3	x86inc: free up variable name "n" in global namespace Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2014-09-05 01:41:50 +02:00
Michael Niedermayer	8d0c7031a8	Merge commit '79793f833784121d574454af4871866576c0749d' * commit '79793f833784121d574454af4871866576c0749d': Update Fiona's name in copyright statements. Merged-by: Michael Niedermayer <michaelni@gmx.at>	2014-07-01 15:43:40 +02:00
Diego Biurrun	79793f8337	Update Fiona's name in copyright statements.	2014-07-01 03:26:51 -07:00
James Almer	3f3d748cab	x86: Move XOP emulation to x86util We need the emulation to support the cases where the first argument is the same as the fourth. To achieve this a fifth argument working as a temporary may be needed. Emulation that doesn't obey the original instruction semantics can't be in x86inc. Signed-off-by: James Almer <jamrial@gmail.com> Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2014-02-24 08:30:19 +01:00
James Almer	23a8c63452	x86inc: Extend FMA_INSTR functionality Support the cases where the first and last operand of the XOP instruction are the same. Also add vpmacsdql emulation. Signed-off-by: James Almer <jamrial@gmail.com> Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2014-02-13 22:14:24 +01:00
Loren Merritt	b7d0d10a1d	x86inc: Speed up assembling with Yasm Work around Yasm's inefficiency with handling large numbers of variables in the global scope. Signed-off-by: Diego Biurrun <diego@biurrun.de>	2014-01-26 18:40:08 +01:00
Loren Merritt	4d55fe7204	x86inc: speed up compilation with yasm Work around yasm's inefficiency with handling large numbers of variables in the global scope.	2014-01-18 01:19:16 +01:00
Michael Niedermayer	f9bef2bec9	Merge remote-tracking branch 'qatar/master' * qatar/master: x86: more AVX2 framework Merged-by: Michael Niedermayer <michaelni@gmx.at>	2013-10-14 16:13:57 +02:00
Michael Niedermayer	e3e0e3d0c9	Merge commit 'c6908d6b4b377a04a5d055ba874bdbcf06c80497' * commit 'c6908d6b4b377a04a5d055ba874bdbcf06c80497': x86inc: FMA3/4 Support Merged-by: Michael Niedermayer <michaelni@gmx.at>	2013-10-14 16:06:22 +02:00
Michael Niedermayer	9ac124c889	Merge commit '206895708ea2b464755d340e44501daf9a07c310' * commit '206895708ea2b464755d340e44501daf9a07c310': x86inc: Remove our FMA4 support Merged-by: Michael Niedermayer <michaelni@gmx.at>	2013-10-14 15:54:23 +02:00
Michael Niedermayer	12e4493f9c	Merge commit 'c108ba0175d4fc3a3253a8b0f782fbfb96ba5098' * commit 'c108ba0175d4fc3a3253a8b0f782fbfb96ba5098': x86inc: Use VEX-encoded instructions in AVX functions Merged-by: Michael Niedermayer <michaelni@gmx.at>	2013-10-14 15:48:34 +02:00
Jason Garrett-Glaser	a3fabc6cb3	x86: more AVX2 framework Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>	2013-10-14 12:41:56 +01:00
Jason Garrett-Glaser	c6908d6b4b	x86inc: FMA3/4 Support Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>	2013-10-14 12:41:54 +01:00
Derek Buitenhuis	206895708e	x86inc: Remove our FMA4 support This is so we can sync to x264's version of FMA4 support. This partially reverts commit `79687079a9`. Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>	2013-10-14 12:39:29 +01:00
Henrik Gramner	c108ba0175	x86inc: Use VEX-encoded instructions in AVX functions Automatically use VEX-encoding in AVX/AVX2/XOP/FMA3/FMA4 functions for all instructions that exist in a VEX-encoded version. This change makes it easier to extend existing code to use AVX2. Also add support for AVX emulation of a few instructions that were missing before. Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>	2013-10-14 12:36:11 +01:00
Michael Niedermayer	31d0d35560	Merge remote-tracking branch 'qatar/master' * qatar/master: x86inc: Remove .rodata kludges Conflicts: libavutil/x86/x86inc.asm Merged-by: Michael Niedermayer <michaelni@gmx.at>	2013-10-09 14:29:42 +02:00
Henrik Gramner	ad7d7d4f6a	x86inc: Remove .rodata kludges The Mach-O bug was fixed in yasm 0.8.0 and we don't support versions that old anymore. Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>	2013-10-09 07:44:30 -04:00
Michael Niedermayer	19c3890819	Merge commit '3e2fa991db7ef172579422accd61624d52777e5a' * commit '3e2fa991db7ef172579422accd61624d52777e5a': x86inc: remove misaligned cpu flag Merged-by: Michael Niedermayer <michaelni@gmx.at>	2013-10-08 12:02:21 +02:00
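
The ModR/M peculiarity cited in commit cd4ca82459 (preferring r14/r15 over r12/r13) can be illustrated with a small sketch. It is written in Python rather than assembly, it is not part of x86inc, and the names `REG_NUMBERS` and `addressing_bytes` are hypothetical. It assumes the simplest case, a plain `[base]` memory operand with no index and no displacement, and relies on two x86-64 encoding rules: a base register whose low three bits are 100 (rsp, r12) needs a SIB byte, and one whose low three bits are 101 (rbp, r13) cannot be encoded without a displacement, so a zero disp8 has to be emitted.

```python
# Hypothetical helper: counts the addressing bytes (ModR/M + SIB + displacement)
# needed to encode a plain `[base]` operand on x86-64. Prefixes and opcode
# bytes are deliberately left out, since they do not depend on this quirk.

REG_NUMBERS = {
    "rax": 0, "rcx": 1, "rdx": 2, "rbx": 3,
    "rsp": 4, "rbp": 5, "rsi": 6, "rdi": 7,
    "r8": 8, "r9": 9, "r10": 10, "r11": 11,
    "r12": 12, "r13": 13, "r14": 14, "r15": 15,
}


def addressing_bytes(base: str) -> int:
    """Return the number of addressing bytes for `[base]` with no displacement."""
    low3 = REG_NUMBERS[base] & 7  # only the low 3 bits go into ModR/M.rm
    size = 1                      # the ModR/M byte itself
    if low3 == 0b100:
        size += 1                 # rm=100 selects a SIB byte (rsp, r12)
    elif low3 == 0b101:
        size += 1                 # mod=00, rm=101 means disp32/RIP-relative,
                                  # so mod=01 with a zero disp8 is used (rbp, r13)
    return size


if __name__ == "__main__":
    for reg in ("r12", "r13", "r14", "r15"):
        print(reg, addressing_bytes(reg))
```

Under these assumptions r12 and r13 each cost one byte more than r14 and r15 as a base register, which matches the motivation given in the commit message.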

137 Commits