FFmpeg

mirror of https://github.com/FFmpeg/FFmpeg.git synced 2024-11-26 19:01:44 +02:00

Author	SHA1	Message	Date
Lynne	bbe95f7353	x86: replace explicit REP_RETs with RETs From x86inc: > On AMD cpus <=K10, an ordinary ret is slow if it immediately follows either > a branch or a branch target. So switch to a 2-byte form of ret in that case. > We can automatically detect "follows a branch", but not a branch target. > (SSSE3 is a sufficient condition to know that your cpu doesn't have this problem.) x86inc can automatically determine whether to use REP_RET rather than RET in most of these cases, so impact is minimal. Additionally, a few REP_RETs were used unnecessarily, despite the return being nowhere near a branch. The only CPUs affected were AMD K10s, made between 2007 and 2011, 16 years ago and 12 years ago, respectively. In the future, everyone involved with x86inc should consider dropping REP_RETs altogether.	2023-02-01 04:23:55 +01:00
Alexander Kanavin	91326dc942	libavutil: include assembly with full path from source root Otherwise nasm writes the full host-specific paths into .o output, which breaks binary reproducibility. Signed-off-by: Alexander Kanavin <alex.kanavin@gmail.com> Signed-off-by: Anton Khirnov <anton@khirnov.net>	2022-02-08 10:42:26 +01:00
Ganesh Ajjanagadde	5989add4ab	lavu/x86/lls: add fma3 optimizations for update_lls This improves accuracy (very slightly) and speed for processors having fma3. Sample benchmark (fate flac-16-lpc-cholesky, Haswell): old: 5993610 decicycles in ff_lpc_calc_coefs, 64 runs, 0 skips 5951528 decicycles in ff_lpc_calc_coefs, 128 runs, 0 skips new: 5252410 decicycles in ff_lpc_calc_coefs, 64 runs, 0 skips 5232869 decicycles in ff_lpc_calc_coefs, 128 runs, 0 skips Tested with FATE and --disable-fma3, also examined contents of lavu/lls-test. Reviewed-by: James Almer <jamrial@gmail.com> Reviewed-by: Henrik Gramner <henrik@gramner.com> Signed-off-by: Ganesh Ajjanagadde <gajjanagadde@gmail.com>	2016-01-15 16:46:13 -05:00
Michael Niedermayer	70b8668fb5	drop LLS1, rename LLS2 to LLS Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2014-08-09 23:20:31 +02:00
Michael Niedermayer	c3814ab654	rename new lls code to lls2 to avoid conflict with the old, which has a different ABI; also remove failed attempt at a compatibility layer, the code simply cannot work Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2013-11-17 16:41:08 +01:00
Michael Niedermayer	a478e99a60	avutil/x86: reenable ff_update_lls_avx() The bug has been fixed in `c8b920a9b7` by Loren Merritt Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2013-07-02 12:02:08 +02:00
Michael Niedermayer	d1fa671895	Merge commit 'c8b920a9b7fa534a6141695ace4e8c2dfcd56cee' * commit 'c8b920a9b7fa534a6141695ace4e8c2dfcd56cee': lls/x86: use 3-operator vaddpd in ADDPD_MEM Merged-by: Michael Niedermayer <michaelni@gmx.at>	2013-07-02 11:40:44 +02:00
Loren Merritt	c8b920a9b7	lls/x86: use 3-operator vaddpd in ADDPD_MEM Fixes build with yasm-1.1 Signed-off-by: Anton Khirnov <anton@khirnov.net>	2013-07-02 10:15:09 +02:00
Michael Niedermayer	4e488ac5f5	Merge remote-tracking branch 'qatar/master' * qatar/master: x86: lpc: fix a segfault in av_evaluate_lls_sse2() Merged-by: Michael Niedermayer <michaelni@gmx.at>	2013-07-01 02:26:22 +02:00
Loren Merritt	1221bb6239	x86: lpc: fix a segfault in av_evaluate_lls_sse2()	2013-06-30 23:11:19 +00:00
Michael Niedermayer	6e76e6a05a	Merge commit 'b545179fdff1ccfbbb9d422e4e9720cb6c6d9191' * commit 'b545179fdff1ccfbbb9d422e4e9720cb6c6d9191': x86: lpc: simd av_evaluate_lls Conflicts: libavutil/x86/lls.asm Merged-by: Michael Niedermayer <michaelni@gmx.at>	2013-06-30 12:15:12 +02:00
Michael Niedermayer	a285079bc7	lls.asm: disable ff_update_lls_avx The code doesn't build with yasm from ubuntu 12.04 Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2013-06-30 12:12:11 +02:00
Michael Niedermayer	0b40c50508	lls.asm: put avx code under if HAVE_AVX_EXTERNAL Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2013-06-30 12:12:01 +02:00
Michael Niedermayer	78b5479633	Merge commit '502ab21af0ca68f76d6112722c46d2f35c004053' * commit '502ab21af0ca68f76d6112722c46d2f35c004053': x86: lpc: simd av_update_lls The versions are bumped due to changes in lls.h, which is used across libraries, affecting intra library ABI (This version bump also covers changes to lls.h in the immediately previous commits) Merged-by: Michael Niedermayer <michaelni@gmx.at>	2013-06-30 11:35:52 +02:00
Loren Merritt	b545179fdf	x86: lpc: simd av_evaluate_lls 1.5x-1.8x faster on sandybridge Signed-off-by: Luca Barbato <lu_zero@gentoo.org>	2013-06-29 13:23:57 +02:00
Loren Merritt	502ab21af0	x86: lpc: simd av_update_lls 4x-6x faster on sandybridge Signed-off-by: Luca Barbato <lu_zero@gentoo.org>	2013-06-29 13:23:57 +02:00

16 Commits