FFmpeg

mirror of https://github.com/FFmpeg/FFmpeg.git synced 2024-12-02 03:06:28 +02:00

Author	SHA1	Message	Date
James Almer	e3851169ee	x86/vf_ssim: add ff_ssim_4x4_line_xop ~20% faster than ssse3. Also enabled for x86_32 Reviewed-by: Ronald S. Bultje <rsbultje@gmail.com> Signed-off-by: James Almer <jamrial@gmail.com>	2015-07-20 13:18:05 -03:00
James Almer	e1778fb657	x86/vf_ssim: fix some instruction comments Reviewed-by: Michael Niedermayer <michaelni@gmx.at> Signed-off-by: James Almer <jamrial@gmail.com>	2015-07-20 13:17:58 -03:00
Paul B Mahol	eea08efc0d	avfilter/x86/vf_psnr.asm: split one line of license text into two Signed-off-by: Paul B Mahol <onemda@gmail.com>	2015-07-14 23:54:26 +00:00
James Darnley	bff7242608	avfilter/vf_removegrain: add x86 and x86_64 SSE2 functions Speed of all modes increased by a factor between 7.4 and 19.8 largely depending on whether bytes are unpacked into words. Modes 2, 3, and 4 have been sped-up by a factor of 43 (thanks quick sort!) All modes are available on x86_64 but only modes 1, 10, 11, 12, 13, 14, 19, 20, 21, and 22 are available on x86 due to the number of SIMD registers used. With a contribution from James Almer <jamrial@gmail.com>	2015-07-14 23:50:50 +00:00
Ronald S. Bultje	ae4c9ddebc	vf_psnr: sse2 optimizations for sum-squared-error. The internal line accumulator for 16bit can overflow, so I changed that from int to uint64_t in the C code. The matching assembly looks a little weird but output looks correct. (avx2 should be trivial to add later.) Reviewed-by: Paul B Mahol <onemda@gmail.com> Reviewed-by: James Almer <jamrial@gmail.com> Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	2015-07-14 17:57:14 +02:00
Ronald S. Bultje	dfc58584b4	vf_ssim: x86 simd for ssim_4x4xN and ssim_endN. Both are 2-2.5x faster than their C counterpart. Reviewed-by: Paul B Mahol <onemda@gmail.com> Reviewed-by: James Almer <jamrial@gmail.com> Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	2015-07-14 05:07:07 +02:00
James Almer	c16e99e3b3	x86: check for AV_CPU_FLAG_AVXSLOW where useful Signed-off-by: James Almer <jamrial@gmail.com> Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2015-06-01 00:15:35 +02:00
Michael Niedermayer	52fc3e372f	avfilter/x86/vf_hqdn3d: Fix register types Fixes Ticket4301 Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2015-05-27 05:18:55 +02:00
Michael Niedermayer	5bc2c39527	avfilter/x86/vf_fspp: Fix invalid combination of opcode and operands Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2015-05-26 01:43:47 +02:00
Michael Niedermayer	a6f9a5d0f6	avfilter/x86/vf_fspp: Fix loop condition for column_fidct() Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2015-01-28 17:23:27 +01:00
Michael Niedermayer	f5b3257c50	avfilter/vf_eq: mark src as const Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2015-01-27 01:46:08 +01:00
Michael Niedermayer	530bf8ece6	avfilter/vf_eq: Fix clipping code Found-by: Christophe Gisquet <christophe.gisquet@gmail.com> Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2015-01-26 23:46:44 +01:00
Arwa Arif	4c38e960d0	avfilter: Port mp=eq/eq2 to lavfi Code adapted from James Darnley's port Some fixes from Paul B Mahol <onemda@gmail.com> Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2015-01-26 00:14:04 +01:00
James Almer	da02ee127a	x86/vf_pp7: port dctB_mmx to yasm Reviewed-by: Michael Niedermayer <michaelni@gmx.at> Signed-off-by: James Almer <jamrial@gmail.com>	2015-01-09 20:02:27 -03:00
Arwa Arif	a299cd5ab3	lavfi: port mp=pp7 to libavfilter The only difference with mp=pp7 is that default mode is "medium", as stated in the MPlayer docs, rather than "hard". Signed-off-by: Stefano Sabatini <stefasab@gmail.com>	2015-01-09 17:26:31 +01:00
James Almer	a4f876a1a2	x86/vf_fspp: move pxor in store slice functions out of the loop m7 is not overwritten, so we only need to clear it once. Found by Christophe Gisquet. Signed-off-by: James Almer <jamrial@gmail.com>	2014-12-26 17:15:34 -03:00
James Almer	466e32bf25	x86/vf_fspp: port inline asm to yasm Reviewed-by: Michael Niedermayer <michaelni@gmx.at> Signed-off-by: James Almer <jamrial@gmail.com>	2014-12-26 15:39:51 -03:00
James Almer	b94e85453e	avfilter/vf_fspp: add missing inline asm guards	2014-12-24 15:44:06 -03:00
Arwa Arif	bdc4db0ee3	lavfi: port mp=fspp to a native libavfilter filter Signed-off-by: Stefano Sabatini <stefasab@gmail.com>	2014-12-24 16:29:18 +01:00
Michael Niedermayer	6706a2986c	avfilter/vf_spp: Fix overflow in 8bit store slice Fixes regression with ffplay -f lavfi -i testsrc=640x480 -vf format=gray,boxblur=20:10,geq="'mod(lum(X,Y),16)*15'",boxblur=10,geq="'abs(mod(lum(X,Y),15)-7)*32'",spp=4:40 Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2014-12-21 01:48:19 +01:00
Michael Niedermayer	838aa08d75	avfilter/vf_spp: support 10bit per sample Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2014-12-15 18:49:35 +01:00
Michael Niedermayer	30d2ac4bf9	avfilter/vf_spp: change temporary to unsigned More consistent with uspp and allows for future 10bit support Found-by: ubitux Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2014-12-12 13:34:18 +01:00
Michael Niedermayer	ca59b5b6ec	avfilter/x86/vf_interlace: remove redundant instructions Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2014-11-25 12:37:19 +01:00
Michael Niedermayer	3fe3c8abb1	Merge commit 'ca5c3ff90972a5c97aabda2ace57ba72dcd7d83b' * commit 'ca5c3ff90972a5c97aabda2ace57ba72dcd7d83b': vf_interlace: x86: improve asm performance Conflicts: libavfilter/x86/vf_interlace.asm See: `05e4b25e9b` Merged-by: Michael Niedermayer <michaelni@gmx.at>	2014-11-25 12:31:45 +01:00
Michael Niedermayer	ca5c3ff909	vf_interlace: x86: improve asm performance 4775 decicycles -> 3688 decicycles	2014-11-25 02:00:06 +00:00
Michael Niedermayer	05e4b25e9b	avfilter/x86/vf_interlace: rewrite asm 4775 decicycles -> 3688 decicycles Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2014-11-15 04:09:03 +01:00
Michael Niedermayer	fb3eb57369	avfilter/tinterlace: add Support for ff_lowpass_line_avx() & ff_lowpass_line_sse2() Based-on: `2e1704059a` by Kieran Kunhya Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2014-11-15 04:02:33 +01:00
Michael Niedermayer	6f373d75e8	Merge commit '2e1704059ae8625beda2ffde847ad22c5ba416dc' * commit '2e1704059ae8625beda2ffde847ad22c5ba416dc': vf_interlace: Add SIMD for lowpass filter Conflicts: libavfilter/vf_interlace.c libavfilter/x86/Makefile Merged-by: Michael Niedermayer <michaelni@gmx.at>	2014-11-15 02:39:49 +01:00
Kieran Kunhya	2e1704059a	vf_interlace: Add SIMD for lowpass filter Signed-off-by: Luca Barbato <lu_zero@gentoo.org>	2014-11-15 00:35:31 +01:00
James Almer	864f9326fb	x86/vf_noise: move asm code to a separate file Reviewed-by: Michael Niedermayer <michaelni@gmx.at> Signed-off-by: James Almer <jamrial@gmail.com>	2014-10-17 00:44:35 -03:00
Pascal Massimino	649b7a9946	av_filter/x86/idet: use HADDD where appropriate Signed-off-by: James Almer <jamrial@gmail.com>	2014-09-09 19:02:49 -03:00
Pascal Massimino	e3fd6a3a4e	av_filter/x86/idet: MMX/SSE2 implementation of 16bits filter_line() tested on http://ps-auxw.de/10bit-h264-sample/10bit-eldorado.mkv MMX: ~30% faster decoding overall SSE2:~40% faster Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2014-09-09 16:47:22 +02:00
James Darnley	db8970d7b6	vfi/x86/vf_idet: fix incorrect use of paddq paddq is an SSE2 instruction so it cannot be used for MMX. This was probably just a typo because the sums are dwords anyway. Reviewed-by: Pascal Massimino <pascal.massimino@gmail.com> Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2014-09-05 12:49:34 +02:00
Pascal Massimino	161fc0f463	avfilter/x86/idet: fix license header (GPL -> LGPL) Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2014-09-05 12:22:36 +02:00
skal	406a9ccffe	avfilter/vf_idet: MMX/MMXEXT/SSE2 implementation of idet's filter_line() integration by Neil Birkbeck, with help from Vitor Sessak. core SSE2 loop by Skal (pascal.massimino@gmail.com) Reviewed-by: Clément Bœsch <u@pkh.me> Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2014-09-04 22:19:00 +02:00
Andreas Cadhalpun	39a6e02fd4	fix spelling errors Reviewed-by: Timothy Gu <timothygu99@gmail.com> Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2014-07-12 22:33:27 +02:00
James Almer	ddea3b7106	x86/yadif-10: remove duplicate ABS macro And use the x86util ones instead, which are optimized for mmxext/sse2. About ~1% increase in performance on pre SSSE3 processors. Signed-off-by: James Almer <jamrial@gmail.com> Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2014-07-10 21:06:51 +02:00
Michael Niedermayer	a348f4befe	avfilter/x86/vf_pullup: fix "invalid combination of opcode and operands" with nasm Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2014-06-28 16:06:00 +02:00
Michael Niedermayer	b8255a4c70	avfilter/x86/vf_pullup: fix old typo This makes C and MMX match, no change to fate as the differences were apparently not sufficient to show up in fate Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2014-06-25 18:22:48 +02:00
Michael Niedermayer	6dffc8f5aa	avfilter/vf_pullup: use ptrdiff_t as stride argument for dsp functions This should avoid issues on x86_64 Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2014-06-25 18:22:31 +02:00
Christophe Gisquet	9107612818	x86util: add and use RSHIFT/LSHIFT macros Those macros take a byte number as shift argument, as this argument differs between MMX and SSE2 instructions. Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2014-06-15 13:19:27 +02:00
Michael Niedermayer	ebb21887b8	Merge commit '01c5779f56cf708e6cb88b11cfdc248cae7e2ee8' * commit '01c5779f56cf708e6cb88b11cfdc248cae7e2ee8': x86: Drop some unnecessary YASM ifdefs Conflicts: libavfilter/x86/vf_yadif_init.c Merged-by: Michael Niedermayer <michaelni@gmx.at>	2014-04-05 19:16:39 +02:00
Diego Biurrun	01c5779f56	x86: Drop some unnecessary YASM ifdefs Dead code elimination is enough to avoid undefined references in these cases.	2014-04-04 19:08:05 +02:00
Robert Krüger	194ef56ba7	Change license of yadif from GPL to LGPL Signed-off-by: Robert Krüger <krueger@lesspain.de> Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2014-01-14 14:19:15 +01:00
Robert Krüger	4a38eeec38	Revert "Revert "vf_yadif: move x86 init code to x86/yadif.c"" This reverts commit `975110a85e`. Signed-off-by: Robert Krüger <krueger@lesspain.de> Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2014-01-14 14:19:14 +01:00
Robert Krüger	d8e763fda7	vf_yadif: Relicense from GPL to LGPL All copyright holders have agreed to the relicensing.	2014-01-14 00:04:59 +01:00
Michael Niedermayer	975110a85e	Revert "vf_yadif: move x86 init code to x86/yadif.c" This reverts commit `a87b17f328`. This reduces the amount of non LGPL code, making a relicensing to LGPL easier Conflicts: libavfilter/vf_yadif.c libavfilter/x86/yadif.c libavfilter/x86/yadif_template.c libavfilter/yadif.h Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2013-12-01 20:26:26 +01:00
Clément Bœsch	969329fe11	Revert "Merge commit 'ed1a11ed52bbd1f15bb9b0416d69b7924bee3191'" This reverts commit `fc5fe4804f`, reversing changes made to `ffe3350098`. The factoring is broken; it's not calling the ssse3 code anymore, and calling the mmx2 code with bad alignment. It also broke some FATE instances. Conflicts: libavfilter/x86/vf_gradfun_init.c	2013-11-01 14:28:08 +01:00
Michael Niedermayer	c6125f5e1c	avfilter/x86/vf_gradfun_init: fix some consts & related warnings Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2013-11-01 14:20:10 +01:00
Michael Niedermayer	fc5fe4804f	Merge commit 'ed1a11ed52bbd1f15bb9b0416d69b7924bee3191' * commit 'ed1a11ed52bbd1f15bb9b0416d69b7924bee3191': gradfun: x86: Factor out common code for some gradfun_filter_line() variants Conflicts: libavfilter/x86/vf_gradfun_init.c Merged-by: Michael Niedermayer <michaelni@gmx.at>	2013-11-01 10:26:49 +01:00

154 Commits