FFmpeg

mirror of https://github.com/FFmpeg/FFmpeg.git synced 2024-12-23 12:43:46 +02:00

Author	SHA1	Message	Date
James Almer	864f9326fb	x86/vf_noise: move asm code to a separate file Reviewed-by: Michael Niedermayer <michaelni@gmx.at> Signed-off-by: James Almer <jamrial@gmail.com>	2014-10-17 00:44:35 -03:00
Pascal Massimino	649b7a9946	av_filter/x86/idet: use HADDD where appropriate Signed-off-by: James Almer <jamrial@gmail.com>	2014-09-09 19:02:49 -03:00
Pascal Massimino	e3fd6a3a4e	av_filter/x86/idet: MMX/SSE2 implementation of 16bits filter_line() tested on http://ps-auxw.de/10bit-h264-sample/10bit-eldorado.mkv MMX: ~30% faster decoding overall SSE2:~40% faster Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2014-09-09 16:47:22 +02:00
James Darnley	db8970d7b6	vfi/x86/vf_idet: fix incorrect use of paddq paddq is an SSE2 instruction so it cannot be used for MMX. This was probably just a typo because the sums are dwords anyway. Reviewed-by: Pascal Massimino <pascal.massimino@gmail.com> Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2014-09-05 12:49:34 +02:00
Pascal Massimino	161fc0f463	avfilter/x86/idet: fix license header (GPL -> LGPL) Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2014-09-05 12:22:36 +02:00
skal	406a9ccffe	avfilter/vf_idet: MMX/MMXEXT/SSE2 implementation of idet's filter_line() integration by Neil Birkbeck, with help from Vitor Sessak. core SSE2 loop by Skal (pascal.massimino@gmail.com) Reviewed-by: Clément Bœsch <u@pkh.me> Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2014-09-04 22:19:00 +02:00
Andreas Cadhalpun	39a6e02fd4	fix spelling errors Reviewed-by: Timothy Gu <timothygu99@gmail.com> Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2014-07-12 22:33:27 +02:00
James Almer	ddea3b7106	x86/yadif-10: remove duplicate ABS macro And use the x86util ones instead, which are optimized for mmxext/sse2. About ~1% increase in performance on pre SSSE3 processors. Signed-off-by: James Almer <jamrial@gmail.com> Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2014-07-10 21:06:51 +02:00
Michael Niedermayer	a348f4befe	avfilter/x86/vf_pullup: fix "invalid combination of opcode and operands" with nasm Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2014-06-28 16:06:00 +02:00
Michael Niedermayer	b8255a4c70	avfilter/x86/vf_pullup: fix old typo This makes C and MMX match, no change to fate as the differences were apparently not sufficient to show up in fate Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2014-06-25 18:22:48 +02:00
Michael Niedermayer	6dffc8f5aa	avfilter/vf_pullup: use ptrdiff_t as stride argument for dsp functions This should avoid issues on x86_64 Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2014-06-25 18:22:31 +02:00
Christophe Gisquet	9107612818	x86util: add and use RSHIFT/LSHIFT macros Those macros take a byte number as shift argument, as this argument differs between MMX and SSE2 instructions. Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2014-06-15 13:19:27 +02:00
Michael Niedermayer	ebb21887b8	Merge commit '01c5779f56cf708e6cb88b11cfdc248cae7e2ee8' * commit '01c5779f56cf708e6cb88b11cfdc248cae7e2ee8': x86: Drop some unnecessary YASM ifdefs Conflicts: libavfilter/x86/vf_yadif_init.c Merged-by: Michael Niedermayer <michaelni@gmx.at>	2014-04-05 19:16:39 +02:00
Diego Biurrun	01c5779f56	x86: Drop some unnecessary YASM ifdefs Dead code elimination is enough to avoid undefined references in these cases.	2014-04-04 19:08:05 +02:00
Robert Krüger	194ef56ba7	Change license of yadif from GPL to LGPL Signed-off-by: Robert Krüger <krueger@lesspain.de> Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2014-01-14 14:19:15 +01:00
Robert Krüger	4a38eeec38	Revert "Revert "vf_yadif: move x86 init code to x86/yadif.c"" This reverts commit `975110a85e`. Signed-off-by: Robert Krüger <krueger@lesspain.de> Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2014-01-14 14:19:14 +01:00
Robert Krüger	d8e763fda7	vf_yadif: Relicense from GPL to LGPL All copyright holders have agreed to the relicensing.	2014-01-14 00:04:59 +01:00
Michael Niedermayer	975110a85e	Revert "vf_yadif: move x86 init code to x86/yadif.c" This reverts commit `a87b17f328`. This reduces the amount of non LGPL code, making a relicensing to LGPL easier Conflicts: libavfilter/vf_yadif.c libavfilter/x86/yadif.c libavfilter/x86/yadif_template.c libavfilter/yadif.h Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2013-12-01 20:26:26 +01:00
Clément Bœsch	969329fe11	Revert "Merge commit 'ed1a11ed52bbd1f15bb9b0416d69b7924bee3191'" This reverts commit `fc5fe4804f`, reversing changes made to `ffe3350098`. The factoring is broken; it's not calling the ssse3 code anymore, and calling the mmx2 code with bad alignment. It also broke some FATE instances. Conflicts: libavfilter/x86/vf_gradfun_init.c	2013-11-01 14:28:08 +01:00
Michael Niedermayer	c6125f5e1c	avfilter/x86/vf_gradfun_init: fix some consts & related warnings Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2013-11-01 14:20:10 +01:00
Michael Niedermayer	fc5fe4804f	Merge commit 'ed1a11ed52bbd1f15bb9b0416d69b7924bee3191' * commit 'ed1a11ed52bbd1f15bb9b0416d69b7924bee3191': gradfun: x86: Factor out common code for some gradfun_filter_line() variants Conflicts: libavfilter/x86/vf_gradfun_init.c Merged-by: Michael Niedermayer <michaelni@gmx.at>	2013-11-01 10:26:49 +01:00
Michael Niedermayer	ffe3350098	Merge commit 'ee80cf741a44115758e62399b7bde08d33161151' * commit 'ee80cf741a44115758e62399b7bde08d33161151': avfilter: x86: K&R formatting cosmetics Conflicts: libavfilter/x86/vf_gradfun_init.c Merged-by: Michael Niedermayer <michaelni@gmx.at>	2013-11-01 10:20:20 +01:00
Diego Biurrun	ed1a11ed52	gradfun: x86: Factor out common code for some gradfun_filter_line() variants	2013-10-31 16:34:18 +01:00
Diego Biurrun	ee80cf741a	avfilter: x86: K&R formatting cosmetics	2013-10-31 12:15:54 +01:00
Michael Niedermayer	a826efb55a	avfilter/x86/vf_gradfun_init: fix const and related warnings Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2013-10-24 12:12:59 +02:00
Michael Niedermayer	1ea28ffc4d	Merge commit '0e730494160d973400aed8d2addd1f58a0ec883e' * commit '0e730494160d973400aed8d2addd1f58a0ec883e': avfilter: x86: Port gradfun filter optimizations to yasm Conflicts: libavfilter/x86/vf_gradfun_init.c Merged-by: Michael Niedermayer <michaelni@gmx.at>	2013-10-24 10:35:39 +02:00
Daniel Kang	0e73049416	avfilter: x86: Port gradfun filter optimizations to yasm Signed-off-by: Diego Biurrun <diego@biurrun.de>	2013-10-23 14:50:27 +02:00
Michael Niedermayer	f4f8499c19	Merge commit 'f6633c55a3c0e93a5b2bab6aa0692fb608f2a38d' * commit 'f6633c55a3c0e93a5b2bab6aa0692fb608f2a38d': avfilter: Fix typo in Loren's email address Merged-by: Michael Niedermayer <michaelni@gmx.at>	2013-10-23 12:14:49 +02:00
Diego Biurrun	f6633c55a3	avfilter: Fix typo in Loren's email address	2013-10-23 10:25:14 +02:00
Paul B Mahol	112017e990	avfilter/x86/vf_pullup: try to fix build on x64 Signed-off-by: Paul B Mahol <onemda@gmail.com>	2013-09-17 17:20:58 +00:00
Paul B Mahol	9c774459a9	avfilter: port pullup filter from libmpcodecs Signed-off-by: Paul B Mahol <onemda@gmail.com>	2013-09-17 17:03:36 +00:00
Michael Niedermayer	9d01bf7d66	Merge remote-tracking branch 'qatar/master' * qatar/master: Consistently use "cpu_flags" as variable/parameter name for CPU flags Conflicts: libavcodec/x86/dsputil_init.c libavcodec/x86/h264dsp_init.c libavcodec/x86/hpeldsp_init.c libavcodec/x86/motion_est.c libavcodec/x86/mpegvideo.c libavcodec/x86/proresdsp_init.c Merged-by: Michael Niedermayer <michaelni@gmx.at>	2013-07-18 09:53:47 +02:00
Diego Biurrun	3ac7fa81b2	Consistently use "cpu_flags" as variable/parameter name for CPU flags	2013-07-18 00:31:35 +02:00
Clément Bœsch	a2c547ffec	lavfi: add spp filter.	2013-06-14 01:27:22 +02:00
James Darnley	b0ef0ae776	yadif: restore speed of the C filtering code Always use the special filter for the first and last 3 columns (only). Changes made in `64ed397` slowed the filter to just under 3/4 of what it was. This commit restores the speed while maintaining identical output. For reference, on my Athlon64: 1733222 decicycles in old 2358563 decicycles in new 1727558 decicycles in this Signed-off-by: Anton Khirnov <anton@khirnov.net>	2013-05-14 09:23:55 +02:00
Michael Niedermayer	696f5f98e2	Merge commit '6e9f8d6a7d7392a236df19fef6f4eba41f18167e' * commit '6e9f8d6a7d7392a236df19fef6f4eba41f18167e': x86: vf_yadif: Remove stray dsputil_mmx #include Merged-by: Michael Niedermayer <michaelni@gmx.at>	2013-05-09 11:51:40 +02:00
Diego Biurrun	6e9f8d6a7d	x86: vf_yadif: Remove stray dsputil_mmx #include	2013-05-08 18:18:23 +02:00
Michael Niedermayer	a8ff830b79	Merge commit '093804a93cc5da3f95f98265a5df116912443cec' * commit '093804a93cc5da3f95f98265a5df116912443cec': avfilter: Add av_cold attributes to init/uninit functions Conflicts: libavfilter/af_ashowinfo.c libavfilter/af_volume.c libavfilter/src_movie.c libavfilter/vf_lut.c Merged-by: Michael Niedermayer <michaelni@gmx.at>	2013-05-05 11:42:18 +02:00
Diego Biurrun	093804a93c	avfilter: Add av_cold attributes to init/uninit functions	2013-05-04 21:10:05 +02:00
Michael Niedermayer	0a73803c86	Merge remote-tracking branch 'qatar/master' * qatar/master: x86: Move some conditional code around to avoid unused variable warnings Conflicts: libavcodec/x86/dsputil_mmx.c libavfilter/x86/vf_yadif_init.c Merged-by: Michael Niedermayer <michaelni@gmx.at>	2013-04-23 11:01:46 +02:00
Diego Biurrun	c1ad70c3cb	x86: Move some conditional code around to avoid unused variable warnings	2013-04-22 17:50:02 +02:00
Clément Bœsch	1ae44c87c9	lavfi/gradfun: remove rounding to match C and SSE code. There is no noticeable benefit for such precision. Signed-off-by: Anton Khirnov <anton@khirnov.net>	2013-03-28 07:59:29 +01:00
Clément Bœsch	38a2f88d39	lavfi/gradfun: fix dithering in MMX code. Current dithering only uses the first 4 instead of the whole 8 random values. Signed-off-by: Anton Khirnov <anton@khirnov.net>	2013-03-28 07:59:18 +01:00
Clément Bœsch	2d66fc543b	lavfi/gradfun: fix rounding in MMX code. Current code divides before increasing precision. Also reduce upper bound for strength from 255 to 64. This will prevent an overflow in the SSSE3 and MMX filter_line code: delta is expressed as an u16 being shifted by 2 to the left. If it overflows, having a strength not above 64 will make sure that m is set to 0 (making the mmdelta >> 14 expression void). A value above 64 should not make any sense unless gradfun is used as a blur filter. Signed-off-by: Anton Khirnov <anton@khirnov.net>	2013-03-28 07:59:04 +01:00
James Darnley	c9a51c29fc	yadif: remove an 'm' from the LOAD macro definition Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2013-03-16 22:33:49 +01:00
James Darnley	1d3b14cac2	yadif: remove repeated check on width The filter already checks that width (and height) are greater than 3. Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2013-03-16 22:33:30 +01:00
James Darnley	7976d92dac	yadif: cosmetic indentation from previous commits Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2013-03-16 22:33:06 +01:00
James Darnley	0a5814c9ba	yadif: x86 assembly for 9 to 14-bit samples These smaller samples do not need to be unpacked to double words allowing the code to process more pixels every iteration (still 2 in MMX but 6 in SSE2). It also avoids emulating the missing double word instructions on older instruction sets. Like with the previous code for 16-bit samples this has been tested on an Athlon64 and a Core2Quad. Athlon64: 1809275 decicycles in C, 32718 runs, 50 skips 911675 decicycles in mmx, 32727 runs, 41 skips, 2.0x faster 495284 decicycles in sse2, 32747 runs, 21 skips, 3.7x faster Core2Quad: 921363 decicycles in C, 32756 runs, 12 skips 486537 decicycles in mmx, 32764 runs, 4 skips, 1.9x faster 293296 decicycles in sse2, 32759 runs, 9 skips, 3.1x faster 284910 decicycles in ssse3, 32759 runs, 9 skips, 3.2x faster Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2013-03-16 22:32:54 +01:00
James Darnley	17e7b49501	yadif: x86 assembly for 16-bit samples This is a fairly dumb copy of the assembly for 8-bit samples but it works and produces identical output to the C version. The options have been tested on an Athlon64 and a Core2Quad. Athlon64: 1810385 decicycles in C, 32726 runs, 42 skips 1080744 decicycles in mmx, 32744 runs, 24 skips, 1.7x faster 818315 decicycles in sse2, 32735 runs, 33 skips, 2.2x faster Core2Quad: 924025 decicycles in C, 32750 runs, 18 skips 623995 decicycles in mmx, 32767 runs, 1 skips, 1.5x faster 406223 decicycles in sse2, 32764 runs, 4 skips, 2.3x faster 387842 decicycles in ssse3, 32767 runs, 1 skips, 2.4x faster 307726 decicycles in sse4, 32763 runs, 5 skips, 3.0x faster Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2013-03-16 22:32:34 +01:00
James Darnley	0735b50880	yadif: restore speed of the C filtering code Always use the special filter for the first and last 3 columns (only). Changes made in `64ed397` slowed the filter to just under 3/4 of what it was. This commit restores the speed while maintaining identical output. For reference, on my Athlon64: 1733222 decicycles in old 2358563 decicycles in new 1727558 decicycles in this Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2013-03-13 22:07:25 +01:00

125 Commits