FFmpeg

mirror of https://github.com/FFmpeg/FFmpeg.git synced 2024-12-12 19:18:44 +02:00

Author	SHA1	Message	Date
Derek Buitenhuis	2605967f7e	Merge commit '4c297249ac0f513a610a62691ce96d6b62f65b94' * commit '4c297249ac0f513a610a62691ce96d6b62f65b94': rdft: arm: Split RDFT initialization into a separate file Merged-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>	2016-04-12 15:43:34 +01:00
Derek Buitenhuis	197fa698c6	Merge commit '97aec6e75ef36ed0402653519daa8e1fc8ddb555' * commit '97aec6e75ef36ed0402653519daa8e1fc8ddb555': fft: arm: Drop unnecessary #include, add missing ones Merged-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>	2016-04-12 15:43:09 +01:00
Diego Biurrun	4c297249ac	rdft: arm: Split RDFT initialization into a separate file	2016-02-26 14:34:58 +01:00
Diego Biurrun	97aec6e75e	fft: arm: Drop unnecessary #include, add missing ones	2016-02-26 14:34:58 +01:00
Hendrik Leppkes	e754c8e8ca	Merge commit 'e2710e790c09e49e86baa58c6063af0097cc8cb0' * commit 'e2710e790c09e49e86baa58c6063af0097cc8cb0': arm: add a cpu flag for the VFPv2 vector mode Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>	2016-01-02 11:01:29 +01:00
Janne Grunau	e2710e790c	arm: add a cpu flag for the VFPv2 vector mode The vector mode was deprecated in ARMv7-A/VFPv3 and various cpu implementations do not support it in hardware. Depending on the OS, vector mode code will either be emulated in software or result in an illegal instruction on cpus which do not support it. This was not really a problem in practice since NEON implementations of the same functions are preferred. It will however become a problem for checkasm, which tests every cpu flag separately. Since this is a cpu feature that newer cpus no longer support, the behaviour of this flag differs from the other flags. It can only be activated by runtime cpu feature selection.	2015-12-14 16:42:35 +01:00
Michael Niedermayer	c27adb37ef	Merge commit '87552d54d3337c3241e8a9e1a05df16eaa821496' * commit '87552d54d3337c3241e8a9e1a05df16eaa821496': armv6: Accelerate ff_fft_calc for general case (nbits != 4) Merged-by: Michael Niedermayer <michaelni@gmx.at>	2014-07-18 03:12:02 +02:00
Ben Avison	87552d54d3	armv6: Accelerate ff_fft_calc for general case (nbits != 4) The previous implementation targeted DTS Coherent Acoustics, which only requires nbits == 4 (fft16()). This case was (and still is) linked directly rather than being indirected through ff_fft_calc_vfp(), but now the full range from radix-4 up to radix-65536 is available. This benefits other codecs such as AAC and AC3. The implementation is based upon the C version, with each routine larger than radix-16 calling a hierarchy of smaller FFT functions, then performing a post-processing pass. This pass benefits a lot from loop unrolling to counter the long pipelines in the VFP. A relaxed calling standard also reduces the overhead of the call hierarchy, and avoiding the excessive inlining performed by GCC probably helps with I-cache utilisation too. I benchmarked the result by measuring the number of gperftools samples that hit anywhere in the AAC decoder (starting from aac_decode_frame()) or specifically in the FFT routines (fft4() to fft512() and pass()) for the same sample AAC stream. Audio decode: before mean 2245.5 (StdDev 53.1), after mean 1599.6 (StdDev 43.8), confidence 100.0%, change +40.4%. FFT routines: before mean 940.6 (StdDev 22.0), after mean 348.1 (StdDev 20.8), confidence 100.0%, change +170.2%. Signed-off-by: Martin Storsjö <martin@martin.st>	2014-07-18 01:34:23 +03:00
Michael Niedermayer	99e7c702db	Merge commit 'bd549cbaacd33dfb7be81d0619c9b107b8a85be7' * commit 'bd549cbaacd33dfb7be81d0619c9b107b8a85be7': arm: dcadsp: Move synth filter initialization to dcadsp file Merged-by: Michael Niedermayer <michaelni@gmx.at>	2013-08-29 16:00:45 +02:00
Diego Biurrun	bd549cbaac	arm: dcadsp: Move synth filter initialization to dcadsp file	2013-08-29 11:24:14 +02:00
Michael Niedermayer	2305a6775d	Merge commit 'b63bb251ea6d6ba23295294e37a92625c0192206' * commit 'b63bb251ea6d6ba23295294e37a92625c0192206': arm: Add VFP-accelerated version of imdct_half Merged-by: Michael Niedermayer <michaelni@gmx.at>	2013-07-22 11:57:05 +02:00
Michael Niedermayer	0129026c4e	Merge commit '41ef1d360bac65032aa32f6b43ae137666507ae5' * commit '41ef1d360bac65032aa32f6b43ae137666507ae5': arm: Add VFP-accelerated version of synth_filter_float Merged-by: Michael Niedermayer <michaelni@gmx.at>	2013-07-22 11:41:37 +02:00
Martin Storsjö	b63bb251ea	arm: Add VFP-accelerated version of imdct_half This function: before mean 2653.0 (StdDev 28.5), after mean 1108.8 (StdDev 51.4), change +139.3%. Overall: before mean 17049.5 (StdDev 408.2), after mean 15973.0 (StdDev 223.2), change +6.7%. Signed-off-by: Martin Storsjö <martin@martin.st>	2013-07-22 10:15:37 +03:00
Ben Avison	41ef1d360b	arm: Add VFP-accelerated version of synth_filter_float This function: before mean 9295.0 (StdDev 114.9), after mean 4853.2 (StdDev 83.5), change +91.5%. Overall: before mean 23699.8 (StdDev 397.6), after mean 19285.5 (StdDev 292.0), change +22.9%. Signed-off-by: Martin Storsjö <martin@martin.st>	2013-07-22 10:15:17 +03:00
Carl Eugen Hoyos	cf36180143	Only set accelerated arm fft functions if fft is enabled. Fixes lavc compilation (linking) for configurations without fft. Reported-by: tyler wear Tested-by: Gavin Kinsey	2013-02-17 17:29:55 +01:00
Michael Niedermayer	92ef4be4ab	Merge remote-tracking branch 'qatar/master' * qatar/master: ARM: allow runtime masking of CPU features dsputil: remove unused functions mov: Treat keyframe indexes as 1-origin if starting at non-zero. mov: Take stps entries into consideration also about key_off. Remove lowres video decoding Conflicts: ffmpeg.c ffplay.c libavcodec/arm/vp8dsp_init_arm.c libavcodec/libopenjpegdec.c libavcodec/mjpegdec.c libavcodec/mpegvideo.c libavcodec/utils.c libavformat/mov.c Merged-by: Michael Niedermayer <michaelni@gmx.at>	2012-04-22 22:26:42 +02:00
Mans Rullgard	d526c5338d	ARM: allow runtime masking of CPU features This allows masking CPU features with the -cpuflags avconv option which is useful for testing different optimisations without rebuilding. Signed-off-by: Mans Rullgard <mans@mansr.com>	2012-04-22 12:30:45 +01:00
Michael Niedermayer	8f0768cc22	Merge remote-tracking branch 'qatar/master' * qatar/master: Add a tool that uses avio to read and write, doing a plain copy of data ARM: fix build with FFT enabled and MDCT disabled lavf: force single-threaded decoding in avformat_find_stream_info avidec: migrate last of lavf from FF_ER_* to AV_EF_* avserver: fix build after the next bump. Conflicts: libavformat/Makefile libavformat/avidec.c libavformat/utils.c Merged-by: Michael Niedermayer <michaelni@gmx.at>	2012-01-21 01:33:31 +01:00
Felipe Contreras	c3d5e290ca	ARM: fix build with FFT enabled and MDCT disabled Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com> Signed-off-by: Mans Rullgard <mans@mansr.com>	2012-01-20 16:14:01 +00:00
Michael Niedermayer	d4a50a2100	Merge remote-tracking branch 'newdev/master' Merged-by: Michael Niedermayer <michaelni@gmx.at>	2011-03-21 03:33:28 +01:00
Mans Rullgard	0aded9484d	Move dct and rdft definitions to separate files This leaves fft.h with only the core FFT and MDCT definitions thus making it more manageable. Signed-off-by: Mans Rullgard <mans@mansr.com>	2011-03-20 17:15:33 +00:00
Mans Rullgard	2912e87a6c	Replace FFmpeg with Libav in licence headers Signed-off-by: Mans Rullgard <mans@mansr.com>	2011-03-19 13:33:20 +00:00
Loren Merritt	11ab1e409f	FFT: factor a shuffle out of the inner loop and merge it into fft_permute. 6% faster SSE FFT on Conroe, 2.5% on Penryn. Signed-off-by: Janne Grunau <janne-ffmpeg@jannau.net> (cherry picked from commit `e6b1ed693a`)	2011-02-14 23:58:19 +01:00
Loren Merritt	e6b1ed693a	FFT: factor a shuffle out of the inner loop and merge it into fft_permute. 6% faster SSE FFT on Conroe, 2.5% on Penryn. Signed-off-by: Janne Grunau <janne-ffmpeg@jannau.net>	2011-02-13 15:36:39 +01:00
Justin Ruggles	a8ae4e0e7b	Remove unneeded add bias from 3 functions. DSPContext.vector_fmul_window() DCADSPContext.lfe_fir() SynthFilterContext.synth_filter_float() Signed-off-by: Mans Rullgard <mans@mansr.com> (cherry picked from commit `80ba1ddb58`)	2011-02-02 03:40:48 +01:00
Justin Ruggles	80ba1ddb58	Remove unneeded add bias from 3 functions. DSPContext.vector_fmul_window() DCADSPContext.lfe_fir() SynthFilterContext.synth_filter_float() Signed-off-by: Mans Rullgard <mans@mansr.com>	2011-01-31 20:28:42 +00:00
Måns Rullgård	e73d1a5efc	ARM: NEON optimised synth_filter_float 2.7x faster DCA decoding on Cortex-A8 Originally committed as revision 22828 to svn://svn.ffmpeg.org/ffmpeg/trunk	2010-04-10 16:27:56 +00:00
Måns Rullgård	a8bb9ea532	ARM: NEON optimised RDFT Originally committed as revision 22641 to svn://svn.ffmpeg.org/ffmpeg/trunk	2010-03-23 03:35:02 +00:00
Måns Rullgård	1429224b04	Move FFT parts from dsputil.h to fft.h Originally committed as revision 22235 to svn://svn.ffmpeg.org/ffmpeg/trunk	2010-03-06 14:34:46 +00:00
Måns Rullgård	f7a3b6030c	ARM: interleave cos/sin tables for improved NEON MDCT Originally committed as revision 19940 to svn://svn.ffmpeg.org/ffmpeg/trunk	2009-09-21 02:56:09 +00:00
Måns Rullgård	01b2214758	Merge FFTContext and MDCTContext Originally committed as revision 19931 to svn://svn.ffmpeg.org/ffmpeg/trunk	2009-09-20 17:30:20 +00:00
Måns Rullgård	f486321395	Move per-arch fft init bits into the corresponding subdirs Originally committed as revision 19864 to svn://svn.ffmpeg.org/ffmpeg/trunk	2009-09-15 21:14:14 +00:00

32 Commits
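
The "Change" figures quoted in commits 87552d54d3, b63bb251ea and 41ef1d360b appear to be speedups computed as (before mean / after mean - 1), expressed as a percentage. The commit messages do not state the formula, so this is an inference from the numbers, not the authors' documented method. The sketch below recomputes each figure from the quoted means under that assumption; the names `ROWS`, `speedup_percent` and `format_change` are hypothetical and exist only for this illustration.

```python
# Benchmark rows copied from the commit messages above:
# (label, mean before, mean after, change as reported in the commit message)
ROWS = [
    ("87552d54d3 audio decode", 2245.5, 1599.6, "+40.4%"),
    ("87552d54d3 FFT routines", 940.6, 348.1, "+170.2%"),
    ("b63bb251ea this function", 2653.0, 1108.8, "+139.3%"),
    ("b63bb251ea overall", 17049.5, 15973.0, "+6.7%"),
    ("41ef1d360b this function", 9295.0, 4853.2, "+91.5%"),
    ("41ef1d360b overall", 23699.8, 19285.5, "+22.9%"),
]


def speedup_percent(before, after):
    """Assumed formula: how much faster 'after' is, relative to 'after'."""
    return (before / after - 1.0) * 100.0


def format_change(before, after):
    """Format the speedup with an explicit sign and one decimal place."""
    return f"{speedup_percent(before, after):+.1f}%"


if __name__ == "__main__":
    for label, before, after, reported in ROWS:
        computed = format_change(before, after)
        status = "matches" if computed == reported else "differs"
        print(f"{label}: computed {computed}, reported {reported} ({status})")
```

Under this reading, a change of +170.2% means the FFT routines took roughly 2.7 times fewer profiler samples after the patch, not that their cost fell by 170%.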