FFmpeg

mirror of https://github.com/FFmpeg/FFmpeg.git synced 2024-11-26 19:01:44 +02:00

Author	SHA1	Message	Date
James Almer	3cec54b7d7	x86/flacdsp: add SSE2 and AVX decorrelate functions Two to four times faster depending on instruction set, block size and channel count.	2014-11-13 13:47:55 -03:00
James Almer	c99a882814	avcodec/idctdsp: change {put,add}_pixels_clamped to ptrdiff_t line_size Reviewed-by: Michael Niedermayer <michaelni@gmx.at> Signed-off-by: James Almer <jamrial@gmail.com>	2014-09-24 21:43:19 -03:00
Bernd Kuhls	6b733be755	Fix compile error on arm4/arm5 platform Since these commits http://git.videolan.org/?p=ffmpeg.git;a=commitdiff;h=adf8227cf4e7b4fccb2ad88e1e09b6dc00dd00ed http://git.videolan.org/?p=ffmpeg.git;a=commitdiff;h=db7f1c7c5a1d37e7f4da64a79a97bea1c4b6e9f8 compilation on arm4/arm5 fails: libavcodec/libavcodec.so: undefined reference to `ff_startcode_find_candidate_armv6' Because libavcodec/arm/Makefile contains ARMV6-OBJS-$(CONFIG_STARTCODE) += arm/startcode_armv6.o function ff_startcode_find_candidate_armv6 is not included for older ARM archs. The bug was found during automatic buildroot builds: http://autobuild.buildroot.net/results/ec7/ec71e4f16ee9106747dff5f15999cbd17903e76f//build-end.log Quote from configure summary: ARCH arm (armv4t) big-endian no runtime cpu detection yes ARMv5TE enabled no ARMv6 enabled no ARMv6T2 enabled no http://autobuild.buildroot.net/results/be7/be72eb182eaccf0064a32c9dfc2ac1c0d6555506/build-end.log ARCH arm (armv5te) big-endian no runtime cpu detection yes ARMv5TE enabled yes ARMv6 enabled no ARMv6T2 enabled no This patch provides the necessary #if clauses as discussed with Michael: https://ffmpeg.org/pipermail/ffmpeg-devel/2014-September/163329.html Signed-off-by: Bernd Kuhls <bernd.kuhls@t-online.de> Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2014-09-23 21:11:05 +02:00
Michael Niedermayer	5db23c07a3	Merge commit '95c0cec03acec0a80cc1c7db48f3b2355d9e767b' * commit '95c0cec03acec0a80cc1c7db48f3b2355d9e767b': idctdsp: Add global function pointers for {add|put}_pixels_clamped functions Conflicts: libavcodec/arm/idctdsp_init_arm.c libavcodec/dct.h libavcodec/idctdsp.c libavcodec/jrevdct.c Merged-by: Michael Niedermayer <michaelni@gmx.at>	2014-09-03 03:19:40 +02:00
Diego Biurrun	95c0cec03a	idctdsp: Add global function pointers for {add|put}_pixels_clamped functions These function pointers already existed in the ARM code. Adding them globally allows calls to the function pointers to access arch-optimized versions of the functions transparently.	2014-09-02 14:41:13 -07:00
Michael Niedermayer	3bb2297351	Merge commit 'efd26bedec9a345a5960dbfcbaec888418f2d4e6' * commit 'efd26bedec9a345a5960dbfcbaec888418f2d4e6': build: Add explanatory comments to (optimization) blocks in the Makefiles Conflicts: libavcodec/ppc/Makefile libavcodec/x86/Makefile Merged-by: Michael Niedermayer <michaelni@gmx.at>	2014-08-15 20:25:12 +02:00
Michael Niedermayer	c1df467d73	Merge commit '835f798c7d20bca89eb4f3593846251ad0d84e4b' * commit '835f798c7d20bca89eb4f3593846251ad0d84e4b': mpegvideo: cosmetics: Lowercase ugly uppercase MPV_ function name prefixes Conflicts: libavcodec/h261dec.c libavcodec/intrax8.c libavcodec/mjpegenc.c libavcodec/mpeg12dec.c libavcodec/mpeg12enc.c libavcodec/mpeg4videoenc.c libavcodec/mpegvideo.c libavcodec/mpegvideo.h libavcodec/mpegvideo_enc.c libavcodec/rv10.c libavcodec/x86/mpegvideoenc.c Merged-by: Michael Niedermayer <michaelni@gmx.at>	2014-08-15 20:11:56 +02:00
Diego Biurrun	efd26bedec	build: Add explanatory comments to (optimization) blocks in the Makefiles	2014-08-15 02:55:21 -07:00
Diego Biurrun	835f798c7d	mpegvideo: cosmetics: Lowercase ugly uppercase MPV_ function name prefixes	2014-08-15 01:26:33 -07:00
James Almer	a8592db9bb	avcodec/idctdsp: make add/put_pixels_clamped_c internal functions This reduces code duplication and differences with the fork. Signed-off-by: James Almer <jamrial@gmail.com> Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2014-08-13 01:44:41 +02:00
Michael Niedermayer	305f72aee7	avcodec: Change get_pixels() to ptrdiff_t linesize Found-by: ubitux Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2014-08-06 15:50:54 +02:00
Michael Niedermayer	bf7ed956ff	Merge commit 'adf8227cf4e7b4fccb2ad88e1e09b6dc00dd00ed' * commit 'adf8227cf4e7b4fccb2ad88e1e09b6dc00dd00ed': vc-1: Add platform-specific start code search routine to VC1DSPContext. Conflicts: configure libavcodec/arm/vc1dsp_init_arm.c libavcodec/vc1dsp.c libavcodec/vc1dsp.h See: `9d8ecdd8ca` Merged-by: Michael Niedermayer <michaelni@gmx.at>	2014-08-05 13:00:41 +02:00
Michael Niedermayer	77aafadc56	Merge commit 'db7f1c7c5a1d37e7f4da64a79a97bea1c4b6e9f8' * commit 'db7f1c7c5a1d37e7f4da64a79a97bea1c4b6e9f8': h264: Move start code search functions into separate source files. Conflicts: libavcodec/arm/Makefile libavcodec/arm/h264dsp_init_arm.c libavcodec/h264_parser.c libavcodec/h264dsp.c libavcodec/startcode.c libavcodec/startcode.h See: `270cede3f3` Merged-by: Michael Niedermayer <michaelni@gmx.at>	2014-08-05 12:46:10 +02:00
Ben Avison	adf8227cf4	vc-1: Add platform-specific start code search routine to VC1DSPContext. Initialise VC1DSPContext for parser as well as for decoder. Note, the VC-1 code doesn't actually use the function pointer yet. Signed-off-by: Luca Barbato <lu_zero@gentoo.org>	2014-08-04 22:22:54 +02:00
Ben Avison	db7f1c7c5a	h264: Move start code search functions into separate source files. This permits re-use with parsers for codecs which use similar start codes. Signed-off-by: Luca Barbato <lu_zero@gentoo.org>	2014-08-04 22:22:54 +02:00
Michael Niedermayer	b051a1bbb9	avcodec/arm/idctdsp_init_arm*: Only select non bitexact IDCTs by default when bitexact is not set Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2014-07-27 14:21:36 +02:00
Michael Niedermayer	2904d052b7	Merge commit '7fb993d338d88f2f62e0a358b6c9f3eb9a3a08ac' * commit '7fb993d338d88f2f62e0a358b6c9f3eb9a3a08ac': qpeldsp: Mark source pointer in qpel_mc_func function pointer const Conflicts: libavcodec/h264qpel_template.c libavcodec/x86/cavsdsp.c libavcodec/x86/rv40dsp_init.c Merged-by: Michael Niedermayer <michaelni@gmx.at>	2014-07-25 13:05:08 +02:00
Diego Biurrun	7fb993d338	qpeldsp: Mark source pointer in qpel_mc_func function pointer const	2014-07-25 02:52:54 -07:00
Michael Niedermayer	7cdb3b2b79	Merge commit '6869612f5c7d4d2f20f69a5658328a761deadb1c' * commit '6869612f5c7d4d2f20f69a5658328a761deadb1c': arm: Macroize the test for 'setend' CPU instruction support Conflicts: libavcodec/arm/h264dsp_init_arm.c Merged-by: Michael Niedermayer <michaelni@gmx.at>	2014-07-22 12:46:13 +02:00
Ben Avison	6869612f5c	arm: Macroize the test for 'setend' CPU instruction support Signed-off-by: Diego Biurrun <diego@biurrun.de>	2014-07-21 15:08:01 -07:00
Michael Niedermayer	d986c414de	Merge commit '81b9bf319226fe03436c80aaa8a2c91767cab7ce' * commit '81b9bf319226fe03436c80aaa8a2c91767cab7ce': dct-test: Move arch-specific bits into arch-specific subdirectories Conflicts: libavcodec/dct-test.c Merged-by: Michael Niedermayer <michaelni@gmx.at>	2014-07-21 13:33:51 +02:00
Diego Biurrun	81b9bf3192	dct-test: Move arch-specific bits into arch-specific subdirectories	2014-07-21 01:10:11 -07:00
Michael Niedermayer	110420aac0	Merge commit '4de8b60684ce13dff3e3d372dae4f49b9e53f755' * commit '4de8b60684ce13dff3e3d372dae4f49b9e53f755': idct: Move arm-specific declarations to a header in the arm directory Merged-by: Michael Niedermayer <michaelni@gmx.at>	2014-07-21 01:56:22 +02:00
Diego Biurrun	4de8b60684	idct: Move arm-specific declarations to a header in the arm directory	2014-07-20 13:02:17 -07:00
Michael Niedermayer	521f569734	Merge commit '8b0dd4942aac320d1ca3c40fa7ea1be342c71273' * commit '8b0dd4942aac320d1ca3c40fa7ea1be342c71273': idctdsp: prettyprinting cosmetics Conflicts: libavcodec/idctdsp.c libavcodec/ppc/idctdsp.c libavcodec/x86/idctdsp_init.c Merged-by: Michael Niedermayer <michaelni@gmx.at>	2014-07-18 22:16:04 +02:00
Michael Niedermayer	42d326353c	Merge commit 'b4987f72197e0c62cf2633bf835a9c32d2a445ae' * commit 'b4987f72197e0c62cf2633bf835a9c32d2a445ae': idct: Convert IDCT permutation #defines to an enum Conflicts: libavcodec/idctdsp.c libavcodec/x86/cavsdsp.c Merged-by: Michael Niedermayer <michaelni@gmx.at>	2014-07-18 22:01:17 +02:00
Diego Biurrun	8b0dd4942a	idctdsp: prettyprinting cosmetics	2014-07-18 07:51:03 -07:00
Diego Biurrun	b4987f7219	idct: Convert IDCT permutation #defines to an enum Also rename the enum values to be consistent with other DCT permutations.	2014-07-18 07:51:03 -07:00
Michael Niedermayer	d13effb0b4	Merge commit '7e18a727d2c2a19f22fcf68875d1b05fd2eafcef' * commit '7e18a727d2c2a19f22fcf68875d1b05fd2eafcef': arm: cosmetics: Consistently use lowercase for shift operators Merged-by: Michael Niedermayer <michaelni@gmx.at>	2014-07-18 13:17:29 +02:00
Michael Niedermayer	cd4497d8c5	Merge commit 'fe67f3fbb5f9f6a6b60f837f6bc5e087ac11f3bf' * commit 'fe67f3fbb5f9f6a6b60f837f6bc5e087ac11f3bf': arm: cosmetics: Fix a misaligned asm operand Merged-by: Michael Niedermayer <michaelni@gmx.at>	2014-07-18 12:44:03 +02:00
Martin Storsjö	7e18a727d2	arm: cosmetics: Consistently use lowercase for shift operators Signed-off-by: Martin Storsjö <martin@martin.st>	2014-07-18 11:17:40 +03:00
Martin Storsjö	fe67f3fbb5	arm: cosmetics: Fix a misaligned asm operand Signed-off-by: Martin Storsjö <martin@martin.st>	2014-07-18 11:17:35 +03:00
Michael Niedermayer	c27adb37ef	Merge commit '87552d54d3337c3241e8a9e1a05df16eaa821496' * commit '87552d54d3337c3241e8a9e1a05df16eaa821496': armv6: Accelerate ff_fft_calc for general case (nbits != 4) Merged-by: Michael Niedermayer <michaelni@gmx.at>	2014-07-18 03:12:02 +02:00
Ben Avison	87552d54d3	armv6: Accelerate ff_fft_calc for general case (nbits != 4) The previous implementation targeted DTS Coherent Acoustics, which only requires nbits == 4 (fft16()). This case was (and still is) linked directly rather than being indirected through ff_fft_calc_vfp(), but now the full range from radix-4 up to radix-65536 is available. This benefits other codecs such as AAC and AC3. The implementation is based upon the C version, with each routine larger than radix-16 calling a hierarchy of smaller FFT functions, then performing a post-processing pass. This pass benefits a lot from loop unrolling to counter the long pipelines in the VFP. A relaxed calling standard also reduces the overhead of the call hierarchy, and avoiding the excessive inlining performed by GCC probably helps with I-cache utilisation too. I benchmarked the result by measuring the number of gperftools samples that hit anywhere in the AAC decoder (starting from aac_decode_frame()) or specifically in the FFT routines (fft4() to fft512() and pass()) for the same sample AAC stream: Before After Mean StdDev Mean StdDev Confidence Change Audio decode 2245.5 53.1 1599.6 43.8 100.0% +40.4% FFT routines 940.6 22.0 348.1 20.8 100.0% +170.2% Signed-off-by: Martin Storsjö <martin@martin.st>	2014-07-18 01:34:23 +03:00
Ben Avison	5c22e8e4ad	armv6: Accelerate ff_imdct_half for general case (mdct_bits != 6) The previous implementation targeted DTS Coherent Acoustics, which only requires mdct_bits == 6. This relatively small size lent itself to unrolling the loops a small number of times, and encoding offsets calculated at assembly time within the load/store instructions of each iteration. In the more general case (codecs such as AAC and AC3) much larger arrays are used - mdct_bits == [8, 9, 11]. The old method does not scale for these cases, so more integer registers are used with non-unrolled versions of the loops (and with some stack spillage). The postrotation filter loop is still unrolled by a factor of 2 to permit the double-buffering of some VFP registers to facilitate overlap of neighbouring iterations. I benchmarked the result by measuring the number of gperftools samples that hit anywhere in the AAC decoder (starting from aac_decode_frame()) or specifically in ff_imdct_half_c / ff_imdct_half_vfp, for the same example AAC stream: Before After Mean StdDev Mean StdDev Confidence Change aac_decode_frame 2368.1 35.8 2117.2 35.3 100.0% +11.8% ff_imdct_half_* 457.5 22.4 251.2 16.2 100.0% +82.1% Signed-off-by: Martin Storsjö <martin@martin.st>	2014-07-18 01:34:08 +03:00
Michael Niedermayer	3a2d1465c8	Merge commit '2d60444331fca1910510038dd3817bea885c2367' * commit '2d60444331fca1910510038dd3817bea885c2367': dsputil: Split motion estimation compare bits off into their own context Conflicts: configure libavcodec/Makefile libavcodec/arm/Makefile libavcodec/dvenc.c libavcodec/error_resilience.c libavcodec/h264.h libavcodec/h264_slice.c libavcodec/me_cmp.c libavcodec/me_cmp.h libavcodec/motion_est.c libavcodec/motion_est_template.c libavcodec/mpeg4videoenc.c libavcodec/mpegvideo.c libavcodec/mpegvideo_enc.c libavcodec/x86/Makefile libavcodec/x86/me_cmp_init.c Merged-by: Michael Niedermayer <michaelni@gmx.at>	2014-07-17 23:27:40 +02:00
Diego Biurrun	2d60444331	dsputil: Split motion estimation compare bits off into their own context	2014-07-17 09:07:10 -07:00
Michael Niedermayer	21dfabfa64	Merge commit 'adff0a8166345bb9513f0f658043fb6387e90122' * commit 'adff0a8166345bb9513f0f658043fb6387e90122': arm: dsputil: Coalesce all init files Conflicts: libavcodec/arm/Makefile libavcodec/arm/dsputil_arm.h libavcodec/arm/dsputil_init_arm.c libavcodec/arm/dsputil_init_armv6.c Merged-by: Michael Niedermayer <michaelni@gmx.at>	2014-07-16 20:09:25 +02:00
Diego Biurrun	adff0a8166	arm: dsputil: Coalesce all init files	2014-07-16 06:18:23 -07:00
Ben Avison	42c1cc35b7	armv6: Accelerate ff_imdct_half for general case (mdct_bits != 6) The previous implementation targeted DTS Coherent Acoustics, which only requires mdct_bits == 6. This relatively small size lent itself to unrolling the loops a small number of times, and encoding offsets calculated at assembly time within the load/store instructions of each iteration. In the more general case (codecs such as AAC and AC3) much larger arrays are used - mdct_bits == [8, 9, 11]. The old method does not scale for these cases, so more integer registers are used with non-unrolled versions of the loops (and with some stack spillage). The postrotation filter loop is still unrolled by a factor of 2 to permit the double-buffering of some VFP registers to facilitate overlap of neighbouring iterations. I benchmarked the result by measuring the number of gperftools samples that hit anywhere in the AAC decoder (starting from aac_decode_frame()) or specifically in ff_imdct_half_c / ff_imdct_half_vfp, for the same example AAC stream: Before After Mean StdDev Mean StdDev Confidence Change aac_decode_frame 2368.1 35.8 2117.2 35.3 100.0% +11.8% ff_imdct_half_* 457.5 22.4 251.2 16.2 100.0% +82.1% Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2014-07-13 15:17:04 +02:00
Michael Niedermayer	b8cdf04726	Merge commit '1173320249745eab01c901a39054fc0fced33c87' * commit '1173320249745eab01c901a39054fc0fced33c87': dsputil: Drop unused bit_depth parameter from all init functions Conflicts: libavcodec/dsputil.c libavcodec/dsputil.h libavcodec/ppc/dsputil_ppc.c libavcodec/x86/dsputilenc_mmx.c Merged-by: Michael Niedermayer <michaelni@gmx.at>	2014-07-11 20:29:40 +02:00
Diego Biurrun	1173320249	dsputil: Drop unused bit_depth parameter from all init functions	2014-07-11 06:38:26 -07:00
Michael Niedermayer	2d5e9451de	Merge commit 'f46bb608d9d76c543e4929dc8cffe36b84bd789e' * commit 'f46bb608d9d76c543e4929dc8cffe36b84bd789e': dsputil: Split off pixel block routines into their own context Conflicts: configure libavcodec/dsputil.c libavcodec/mpegvideo_enc.c libavcodec/pixblockdsp_template.c libavcodec/x86/dsputilenc.asm libavcodec/x86/dsputilenc_mmx.c Merged-by: Michael Niedermayer <michaelni@gmx.at>	2014-07-10 01:22:14 +02:00
Diego Biurrun	f46bb608d9	dsputil: Split off pixel block routines into their own context	2014-07-09 08:05:26 -07:00
Michael Niedermayer	1f935c3d0b	Merge commit '79fce1ec8abd017593c003917fc123f7119a78d6' * commit '79fce1ec8abd017593c003917fc123f7119a78d6': arm: Avoid using the 'setend' instruction on ARMv7 and newer Conflicts: libavcodec/arm/h264dsp_init_arm.c Merged-by: Michael Niedermayer <michaelni@gmx.at>	2014-07-08 14:44:12 +02:00
Martin Storsjö	79fce1ec8a	arm: Avoid using the 'setend' instruction on ARMv7 and newer This instruction is deprecated on ARMv8, and it is serializing on some ARMv7 cores as well [1]. [1] http://article.gmane.org/gmane.linux.ports.arm.kernel/339293 CC: libav-stable@libav.org Signed-off-by: Martin Storsjö <martin@martin.st>	2014-07-08 12:09:09 +03:00
Michael Niedermayer	020865f557	Merge commit 'c166148409fe8f0dbccef2fe684286a40ba1e37d' * commit 'c166148409fe8f0dbccef2fe684286a40ba1e37d': dsputil: Move pix_sum, pix_norm1, shrink function pointers to mpegvideoenc Conflicts: libavcodec/dsputil.c libavcodec/mpegvideo_enc.c libavcodec/x86/dsputilenc.asm libavcodec/x86/dsputilenc_mmx.c Merged-by: Michael Niedermayer <michaelni@gmx.at>	2014-07-07 15:36:58 +02:00
Diego Biurrun	c166148409	dsputil: Move pix_sum, pix_norm1, shrink function pointers to mpegvideoenc	2014-07-06 14:26:53 -07:00
Michael Niedermayer	581b5f0b9b	Merge commit 'e3fcb14347466095839c2a3c47ebecff02da891e' * commit 'e3fcb14347466095839c2a3c47ebecff02da891e': dsputil: Split off IDCT bits into their own context Conflicts: configure libavcodec/aic.c libavcodec/arm/Makefile libavcodec/arm/dsputil_init_arm.c libavcodec/arm/dsputil_init_armv6.c libavcodec/asvdec.c libavcodec/dnxhdenc.c libavcodec/dsputil.c libavcodec/dvdec.c libavcodec/dxva2_mpeg2.c libavcodec/intrax8.c libavcodec/mdec.c libavcodec/mjpegdec.c libavcodec/mjpegenc_common.h libavcodec/mpegvideo.c libavcodec/ppc/dsputil_altivec.h libavcodec/ppc/dsputil_ppc.c libavcodec/ppc/idctdsp.c libavcodec/x86/Makefile libavcodec/x86/dsputil_init.c libavcodec/x86/dsputil_mmx.c libavcodec/x86/dsputil_x86.h Merged-by: Michael Niedermayer <michaelni@gmx.at>	2014-07-01 15:22:11 +02:00
Diego Biurrun	e3fcb14347	dsputil: Split off IDCT bits into their own context	2014-06-30 07:58:46 -07:00

706 Commits