FFmpeg

mirror of https://github.com/FFmpeg/FFmpeg.git synced 2024-12-07 11:13:41 +02:00

Author	SHA1	Message	Date
Rémi Denis-Courmont	c177108ae1	lavu/riscv: add <intmath.h> optimisations This provides some micro-optimisations for signed integer clipping, and support for bit weight with the Zbb extension.	2022-09-13 16:50:43 -03:00
James Almer	28d5a3a74a	lavu: rename and move ff_parity to av_parity. av_popcount is not defined in intmath.h. Reviewed-by: ubitux Signed-off-by: James Almer <jamrial@gmail.com>	2016-01-07 20:04:24 -03:00
Clément Bœsch	2ce29d1765	lavu: add ff_parity()	2016-01-07 22:51:31 +01:00
Ganesh Ajjanagadde	0dd8a3d71e	lavu/intmath: add faster clz support This should be useful for the sofalizer filter. Reviewed-by: Kieran Kunhya <kierank@ob-encoder.com> Reviewed-by: Clément Bœsch <u@pkh.me> Signed-off-by: Ganesh Ajjanagadde <gajjanagadde@gmail.com>	2015-12-19 09:35:34 -08:00
Michael Niedermayer	00efaa7983	avutil/intmath: fix undefined behavior in ff_ctzll_c() Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	2015-10-22 14:10:42 +02:00
Matt Oliver	b0bb1dc62d	lavu/intmath.h: Move x86 only msvc/icl functions to x86 specific header. Signed-off-by: Matt Oliver <protogonoi@gmail.com>	2015-10-19 13:40:51 +11:00
Ganesh Ajjanagadde	55d3e97970	avutil/intmath: use de Bruijn based ff_ctz It has already been demonstrated that the de Bruijn method has benefits over the current implementation: commit `971d12b7f9`. That commit implemented it for long long, this extends it to the int version. Tested with FATE. Signed-off-by: Ganesh Ajjanagadde <gajjanagadde@gmail.com> Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>	2015-10-14 13:39:42 -04:00
Ronald S. Bultje	93866c2aa2	intmath: remove av_ctz. It's a non-installed header and only used in one place (flacenc). Since ff_ctz is static inline, it's fine to use that instead.	2015-10-11 18:03:10 -04:00
Michael Niedermayer	2a4d1a66e8	avutil/intmath: Change debruijn_ctz64 to use 8bit elements. This reduces the memory & cache need from 256 to 64 bytes; the code also seems faster with this change. Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	2015-10-11 04:21:01 +02:00
Ganesh Ajjanagadde	971d12b7f9	avutil/mathematics: speed up av_gcd by using Stein's binary GCD algorithm This uses Stein's binary GCD algorithm: https://en.wikipedia.org/wiki/Binary_GCD_algorithm to get a roughly 4x speedup over Euclidean GCD on standard architectures with a compiler intrinsic for ctzll, and a roughly 2x speedup otherwise. At the moment, the compiler intrinsic is used on GCC and Clang due to its easy availability. Quick note regarding overflow: yes, subtractions on int64_t can, but the llabs takes care of that. The llabs is also guaranteed to be safe, with no annoying INT64_MIN business since INT64_MIN being a power of 2, is shifted down before being sent to llabs. The binary GCD needs ff_ctzll, an extension of ff_ctz for long long (int64_t). On GCC, this is provided by a built-in. On Microsoft, there is a BitScanForward64 analog of BitScanForward that should work; but I can't confirm. Apparently it is not available on 32 bit builds; so this may or may not work correctly. On Intel, per the documentation there is only an intrinsic for _bit_scan_forward and people have posted on forums regarding _bit_scan_forward64, but often their documentation is woeful. Again, I don't have it, so I can't test. As such, to be safe, for now only the GCC/Clang intrinsic is added, the rest use a compiled version based on the De-Bruijn method of Leiserson et al: http://supertech.csail.mit.edu/papers/debruijn.pdf. Tested with FATE, sample benchmark (x86-64, GCC 5.2.0, Haswell) with a START_TIMER and STOP_TIMER in libavutil/rational.c, followed by a make fate. aac-am00_88.err: builtin: 714 decicycles in av_gcd, 4095 runs, 1 skips de-bruijn: 1440 decicycles in av_gcd, 4096 runs, 0 skips previous: 2889 decicycles in av_gcd, 4096 runs, 0 skips Signed-off-by: Ganesh Ajjanagadde <gajjanagadde@gmail.com> Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	2015-10-11 04:08:41 +02:00
Timothy Gu	c5d9e9b354	doxygen: Remove lavu_internal group There is no use in an internal group for a public API documentation.	2015-08-22 10:07:05 -07:00
James Almer	78347549a4	avutil/intmath: check for ICC before GCC Intel compiler also defines __GNUC__, so the Intel specific intrinsics were not really being used. Reviewed-by: Michael Niedermayer <michaelni@gmx.at> Signed-off-by: James Almer <jamrial@gmail.com>	2015-07-18 19:59:34 -03:00
James Almer	bc65abc8d7	libavutil: add x86 optimized av_popcount Reviewed-by: Ronald S. Bultje <rsbultje@gmail.com> Signed-off-by: James Almer <jamrial@gmail.com>	2015-02-25 19:58:00 -03:00
Michael Niedermayer	f8607cfb0a	avutil/intmath: Add () to protect the ff_log2() argument Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2015-02-17 00:20:49 +01:00
Matthew Oliver	2060f4cbba	avutil/intmath: enable builtin intrinsics for icl and msvc. Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2014-10-26 17:20:55 +01:00
Reimar Döffinger	1a558cec64	intmath.h: Remove duplicated ARM include. Signed-off-by: Reimar Döffinger <Reimar.Doeffinger@gmx.de>	2014-08-31 18:33:27 +02:00
Michael Niedermayer	7d26be63c2	Merge commit '5ff998a233d759d0de83ea6f95c383d03d25d88e' * commit '5ff998a233d759d0de83ea6f95c383d03d25d88e': flacenc: use uint64_t for bit counts flacenc: remove wasted trailing 0 bits lavu: add av_ctz() for trailing zero bit count flacenc: use a separate buffer for byte-swapping for MD5 checksum on big-endian fate: aac: Place LATM tests and general AAC tests in different groups build: The A64 muxer depends on rawenc.o for ff_raw_write_packet() Conflicts: doc/APIchanges libavutil/version.h tests/fate/aac.mak Merged-by: Michael Niedermayer <michaelni@gmx.at>	2012-11-05 22:51:20 +01:00
Justin Ruggles	dfde8a34e5	lavu: add av_ctz() for trailing zero bit count	2012-11-05 15:32:29 -05:00
Michael Niedermayer	aa760b1735	Merge commit '2d09b36c0379fcda8f984bc8ad8816c8326fd7bd' * commit '2d09b36c0379fcda8f984bc8ad8816c8326fd7bd': doc/platform: Add info on shared builds with MSVC doc/platform: Move a caveat down to the notes section ARM: reinstate optimised intmath.h ffv1: update to ffv1 version 3 Conflicts: doc/platform.texi libavcodec/ffv1.c libavcodec/ffv1.h libavcodec/ffv1dec.c libavcodec/ffv1enc.c Merged-by: Michael Niedermayer <michaelni@gmx.at>	2012-10-21 16:13:55 +02:00
Michael Niedermayer	dcbff35199	Merge commit 'd15c21e5fa3961f10026da1a3080a3aa3cf4cec9' * commit 'd15c21e5fa3961f10026da1a3080a3aa3cf4cec9': avutil: Add a copy of ff_sqrt_tab back into avutil to restore ABI compatibility avutil: make some tables visible again avutil: remove inline av_log2 from public API celp_math: rename ff_log2 to ff_log2_q15 Conflicts: libavutil/libavutil.v Merged-by: Michael Niedermayer <michaelni@gmx.at>	2012-10-21 13:35:42 +02:00
Mans Rullgard	ebe46b8063	ARM: reinstate optimised intmath.h Use of the ARM optimised intmath.h was accidentally dropped in `9734b8b`. Signed-off-by: Mans Rullgard <mans@mansr.com>	2012-10-20 17:26:37 +01:00
Mans Rullgard	8c0a3d5fe0	avutil: remove inline av_log2 from public API This removes inline av_log2 and av_log2_16bit from the public API, instead exporting them as regular functions. In-tree code still gets the inline and otherwise optimised variants. Signed-off-by: Mans Rullgard <mans@mansr.com>	2012-10-20 12:28:45 +01:00
Michael Niedermayer	e335658370	Merge commit '9734b8ba56d05e970c353dfd5baafa43fdb08024' * commit '9734b8ba56d05e970c353dfd5baafa43fdb08024': Move avutil tables only used in libavcodec to libavcodec. Conflicts: libavcodec/mathtables.c libavutil/intmath.h Merged-by: Michael Niedermayer <michaelni@gmx.at>	2012-10-12 14:26:46 +02:00
Diego Biurrun	9734b8ba56	Move avutil tables only used in libavcodec to libavcodec.	2012-10-11 18:29:36 +02:00
Mans Rullgard	5b170c0bea	x86: remove FASTDIV inline asm GCC 4.3 and later do the right thing with the plain C code. Earlier versions in 32-bit mode generate one extra instruction, needlessly zeroing what would be the high half of the shifted value. At least two gcc configurations miscompile the inline asm in some situations. In 64-bit mode, all gcc versions generate imul r64, r64 followed by shr. On Intel i7 and later, this imul is faster than 32-bit mul. On older Intel and all AMD, it is slightly slower. On Atom it is much slower. Considering where the FASTDIV macro is used, any overall negative performance impact of this change should be negligible. If anyone cares, they should file a bug against gcc and get the instruction selection fixed. Signed-off-by: Mans Rullgard <mans@mansr.com>	2012-08-22 14:29:10 +01:00
Michael Niedermayer	3699960690	Merge remote-tracking branch 'qatar/master' * qatar/master: build: x86: Only compile mpegvideo optimizations when necessary configure: Drop fastdiv option build: Make the E-AC-3 encoder select the AC-3 encoder fate: flac: Only run tests requiring samples when samples are available Conflicts: configure Merged-by: Michael Niedermayer <michaelni@gmx.at>	2012-08-22 14:37:03 +02:00
Diego Biurrun	66baa45801	configure: Drop fastdiv option There is no point in having the user disable any fastdiv macros. Besides the condition implementation was broken and only disabled the C implementation, but no platform specific assembly versions.	2012-08-22 01:02:18 +02:00
Michael Niedermayer	0b9a69f244	Merge remote-tracking branch 'qatar/master' * qatar/master: (22 commits) aacdec: Fix PS in ADTS. avconv: Consistently use PIX_FMT_NONE. dsputil: use cpuflags in x86 emu_edge_core dsputil: use movups instead of movdqu in ff_emu_edge_core_sse() wma: initialize prev_block_len_bits, next_block_len_bits, and block_len_bits. mov: Remove some redundant and obsolete comments. Add libavutil/mathematics.h #includes for INFINITY doxy: structure libavformat groups doxy: introduce an empty structure in libavcodec doxy: provide a start page and document libavutil doxy: cleanup pixfmt.h regtest: split video encode/decode tests into individual targets ARM: add explicit .arch and .fpu directives to asm.S pthread: do not touch has_b_frames avconv: cleanup the transcoding loop in output_packet(). avconv: split subtitle transcoding out of output_packet(). avconv: split video transcoding out of output_packet(). avconv: split audio transcoding out of output_packet(). avconv: reindent. avconv: move streamcopy-only code out of decoding loop. ... Conflicts: avconv.c libavcodec/aaccoder.c libavcodec/pthread.c libavcodec/version.h libavutil/audioconvert.h libavutil/avutil.h libavutil/mem.h tests/ref/vsynth1/dv tests/ref/vsynth1/mpeg2thread tests/ref/vsynth2/dv tests/ref/vsynth2/mpeg2thread Merged-by: Michael Niedermayer <michaelni@gmx.at>	2011-11-23 04:02:17 +01:00
Luca Barbato	757cd8d876	doxy: provide a start page and document libavutil Introduce a basic layout, the subpages are currently left empty. Split libavutil in multiple groups as example of the structure	2011-11-22 17:16:02 +01:00
Mans Rullgard	2912e87a6c	Replace FFmpeg with Libav in licence headers Signed-off-by: Mans Rullgard <mans@mansr.com>	2011-03-19 13:33:20 +00:00
Måns Rullgård	a955b59658	Remove macro duplication between common.h and intmath.h Originally committed as revision 24086 to svn://svn.ffmpeg.org/ffmpeg/trunk	2010-07-07 17:27:43 +00:00
Måns Rullgård	2e874c7704	intmath: whitespace cosmetics Originally committed as revision 24085 to svn://svn.ffmpeg.org/ffmpeg/trunk	2010-07-07 17:27:39 +00:00
Måns Rullgård	b90b1b4c3c	Fix build on configurations without fast av_log2() This is a bit hackish. I will try to think of something nicer, but this will do for now. Originally committed as revision 22366 to svn://svn.ffmpeg.org/ffmpeg/trunk	2010-03-09 01:19:28 +00:00
Måns Rullgård	94ca624fbc	Move ff_sqrt() to libavutil/intmath.h Originally committed as revision 22345 to svn://svn.ffmpeg.org/ffmpeg/trunk	2010-03-08 21:19:56 +00:00
Måns Rullgård	75fb5c24ed	Move FASTDIV macro to intmath.h Originally committed as revision 21335 to svn://svn.ffmpeg.org/ffmpeg/trunk	2010-01-19 23:25:36 +00:00
Måns Rullgård	544f5a922f	Optimise av_log2 with clz when available 10% faster flac decoding on x86 and ARM. Originally committed as revision 21217 to svn://svn.ffmpeg.org/ffmpeg/trunk	2010-01-14 19:58:12 +00:00

36 Commits
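
Several of the commits above (971d12b7f9, 2a4d1a66e8, 55d3e97970) revolve around two ideas: counting trailing zero bits with a de Bruijn multiply-and-lookup when no compiler intrinsic is available, and using that count to drive Stein's binary GCD. The following is only a minimal sketch of those two ideas, not the code in libavutil. The names `ctz64_debruijn` and `gcd_binary` are hypothetical, the sketch works on unsigned values (so it sidesteps the `llabs` handling the commit message describes), and the constant and table are one commonly published 64-bit de Bruijn pair, assumed here rather than taken from the commits.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed 64-bit de Bruijn constant: each 6-bit window of it is unique. */
#define DEBRUIJN64 UINT64_C(0x022FDD63CC95386D)

/* 64 one-byte entries; entry [w] is the shift at which window w appears. */
static const uint8_t debruijn_ctz64_tab[64] = {
     0,  1,  2, 53,  3,  7, 54, 27,  4, 38, 41,  8, 34, 55, 48, 28,
    62,  5, 39, 46, 44, 42, 22,  9, 24, 35, 59, 56, 49, 18, 29, 11,
    63, 52,  6, 26, 37, 40, 33, 47, 61, 45, 43, 21, 23, 58, 17, 10,
    51, 25, 36, 32, 60, 20, 57, 16, 50, 31, 19, 15, 30, 14, 13, 12
};

/* Hypothetical helper: count trailing zero bits of a nonzero value. */
static int ctz64_debruijn(uint64_t v)
{
    assert(v != 0);                      /* result is undefined for 0 */
    uint64_t lowest = v & (~v + 1);      /* isolate the lowest set bit */
    /* Multiplying by a power of two shifts the constant left; the top
     * six bits of the product then identify the shift amount. */
    return debruijn_ctz64_tab[(lowest * DEBRUIJN64) >> 58];
}

/* Hypothetical helper: Stein's binary GCD on unsigned 64-bit values. */
static uint64_t gcd_binary(uint64_t a, uint64_t b)
{
    if (a == 0)
        return b;
    if (b == 0)
        return a;

    int za = ctz64_debruijn(a);
    int zb = ctz64_debruijn(b);
    int k  = za < zb ? za : zb;          /* shared power of two */

    a >>= za;                            /* both operands are odd now */
    b >>= zb;
    while (a != b) {
        if (a > b) {                     /* keep a as the smaller value */
            uint64_t t = a;
            a = b;
            b = t;
        }
        b -= a;                          /* odd - odd is even and nonzero */
        b >>= ctz64_debruijn(b);         /* strip the factors of two */
    }
    return a << k;                       /* restore the shared power of two */
}
```

With these definitions, `gcd_binary(48, 18)` evaluates to 6. On GCC or Clang, the body of `ctz64_debruijn` could be swapped for `__builtin_ctzll(v)`, which is the intrinsic path the commit message refers to.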