FFmpeg

mirror of https://github.com/FFmpeg/FFmpeg.git synced 2025-01-29 22:00:58 +02:00

Author	SHA1	Message	Date
Andreas Rheinhardt	4d7128be9a	avfilter/x86/vf_yadif: Remove obsolete MMXEXT functions The only systems which benefit from these are truly ancient 32-bit x86s, as all other systems use at least the SSE2 versions (this includes all x64 CPUs, which is why this code is restricted to x86-32). Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2022-06-22 13:37:48 +02:00
James Almer	ddea3b7106	x86/yadif-10: remove duplicate ABS macro And use the x86util ones instead, which are optimized for mmxext/sse2. Roughly 1% increase in performance on pre-SSSE3 processors. Signed-off-by: James Almer <jamrial@gmail.com> Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2014-07-10 21:06:51 +02:00
Christophe Gisquet	9107612818	x86util: add and use RSHIFT/LSHIFT macros Those macros take a byte number as shift argument, as this argument differs between MMX and SSE2 instructions. Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2014-06-15 13:19:27 +02:00
Robert Krüger	194ef56ba7	Change license of yadif from GPL to LGPL Signed-off-by: Robert Krüger <krueger@lesspain.de> Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2014-01-14 14:19:15 +01:00
James Darnley	c9a51c29fc	yadif: remove an 'm' from the LOAD macro definition Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2013-03-16 22:33:49 +01:00
James Darnley	1d3b14cac2	yadif: remove repeated check on width The filter already checks that width (and height) are greater than 3. Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2013-03-16 22:33:30 +01:00
James Darnley	0a5814c9ba	yadif: x86 assembly for 9 to 14-bit samples These smaller samples do not need to be unpacked to double words, allowing the code to process more pixels every iteration (still 2 in MMX but 6 in SSE2). It also avoids emulating the missing double word instructions on older instruction sets. As with the previous code for 16-bit samples, this has been tested on an Athlon64 and a Core2Quad. Athlon64: 1809275 decicycles in C, 32718 runs, 50 skips; 911675 decicycles in mmx, 32727 runs, 41 skips, 2.0x faster; 495284 decicycles in sse2, 32747 runs, 21 skips, 3.7x faster. Core2Quad: 921363 decicycles in C, 32756 runs, 12 skips; 486537 decicycles in mmx, 32764 runs, 4 skips, 1.9x faster; 293296 decicycles in sse2, 32759 runs, 9 skips, 3.1x faster; 284910 decicycles in ssse3, 32759 runs, 9 skips, 3.2x faster. Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2013-03-16 22:32:54 +01:00

7 Commits