mirror of https://github.com/FFmpeg/FFmpeg.git synced 2024-11-21 10:55:51 +02:00
FFmpeg/libavutil
Ganesh Ajjanagadde 971d12b7f9 avutil/mathematics: speed up av_gcd by using Stein's binary GCD algorithm
This uses Stein's binary GCD algorithm:
https://en.wikipedia.org/wiki/Binary_GCD_algorithm
to get a roughly 4x speedup over Euclidean GCD on standard architectures
with a compiler intrinsic for ctzll, and a roughly 2x speedup otherwise.
At the moment, the compiler intrinsic is used on GCC and Clang due to
its easy availability.

Quick note regarding overflow: yes, subtractions on int64_t can overflow, but
the llabs takes care of that. The llabs is also guaranteed to be safe, with
no annoying INT64_MIN business, since INT64_MIN, being a power of 2, is
shifted down before being sent to llabs.
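
The algorithm described above can be sketched roughly as follows. This is an illustrative sketch, not the committed code: the names ctz64 and binary_gcd are hypothetical, and instead of shifting before llabs it takes the magnitude in unsigned arithmetic, which is a substitution that sidesteps the INT64_MIN concern in a different way.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: count trailing zero bits of a nonzero value.
 * Uses the GCC/Clang builtin when available, else a plain loop. */
static int ctz64(uint64_t v)
{
    assert(v != 0);
#if defined(__GNUC__) || defined(__clang__)
    return __builtin_ctzll(v);
#else
    int n = 0;
    while (!(v & 1)) {
        v >>= 1;
        n++;
    }
    return n;
#endif
}

/* Hypothetical name: Stein's binary GCD on signed 64-bit inputs. */
static int64_t binary_gcd(int64_t a, int64_t b)
{
    uint64_t u, v, t;
    int za, zb, k;

    if (a == 0)
        return b;
    if (b == 0)
        return a;

    /* Magnitudes in unsigned arithmetic, so INT64_MIN cannot overflow. */
    u = a < 0 ? 0 - (uint64_t)a : (uint64_t)a;
    v = b < 0 ? 0 - (uint64_t)b : (uint64_t)b;

    /* Strip the factors of two; the common ones are restored at the end. */
    za = ctz64(u);
    zb = ctz64(v);
    k  = za < zb ? za : zb;
    u >>= za;
    v >>= zb;

    /* Both values are odd here, so v - u is even and nonzero. */
    while (u != v) {
        if (u > v) {
            t = u;
            u = v;
            v = t;
        }
        v -= u;
        v >>= ctz64(v);
    }
    /* For a = b = INT64_MIN the result does not fit in int64_t and the
     * conversion is implementation-defined. */
    return (int64_t)(u << k);
}
```

For example, binary_gcd(48, 18) strips one common factor of two, reduces the odd parts 3 and 9 to 3, and returns 6.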

The binary GCD needs ff_ctzll, an extension of ff_ctz for long long (int64_t). On
GCC, this is provided by a built-in. On Microsoft, there is a
BitScanForward64 analog of BitScanForward that should work; but I can't confirm.
Apparently it is not available on 32 bit builds; so this may or may not
work correctly. On Intel, per the documentation there is only an
intrinsic for _bit_scan_forward and people have posted on forums
regarding _bit_scan_forward64, but often their documentation is
woeful. Again, I don't have it, so I can't test.

As such, to be safe, for now only the GCC/Clang intrinsic is added, the rest
use a compiled version based on the De-Bruijn method of Leiserson et al:
http://supertech.csail.mit.edu/papers/debruijn.pdf.
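
A minimal sketch of the De Bruijn approach, for illustration only. The names DEBRUIJN64, debruijn_index, debruijn_ctz64_init and debruijn_ctz64 are hypothetical, the multiplier is a commonly cited 64-bit De Bruijn constant assumed here, and the lookup table is filled at run time rather than compiled in, so that the mapping is easy to check.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed 64-bit De Bruijn multiplier: every 6-bit window of it is unique. */
#define DEBRUIJN64 UINT64_C(0x022FDD63CC95386D)

static uint8_t debruijn_index[64];

/* Fill the lookup table. Multiplying by an isolated bit i shifts the
 * constant left by i, so the top 6 bits of the product identify i. */
static void debruijn_ctz64_init(void)
{
    int i;
    for (i = 0; i < 64; i++)
        debruijn_index[(DEBRUIJN64 << i) >> 58] = (uint8_t)i;
}

/* Count trailing zeros of a nonzero value without a compiler intrinsic.
 * debruijn_ctz64_init() must have been called first. */
static int debruijn_ctz64(uint64_t v)
{
    uint64_t lowest;

    assert(v != 0);
    lowest = v & (~v + 1); /* isolate the lowest set bit */
    return debruijn_index[(lowest * DEBRUIJN64) >> 58];
}
```

The lookup costs one multiply, one shift and one table load, which is consistent with it landing between the builtin and the Euclidean version in the timings below.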

Tested with FATE, sample benchmark (x86-64, GCC 5.2.0, Haswell)
with a START_TIMER and STOP_TIMER in libavutil/rational.c, followed by a
make fate.

aac-am00_88.err:
builtin:
714 decicycles in av_gcd,    4095 runs,      1 skips

de-bruijn:
1440 decicycles in av_gcd,    4096 runs,      0 skips

previous:
2889 decicycles in av_gcd,    4096 runs,      0 skips

Signed-off-by: Ganesh Ajjanagadde <gajjanagadde@gmail.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2015-10-11 04:08:41 +02:00
aarch64 Merge commit '780cd20b00a69e26bbfffbb8eec16fbe999ea793' 2014-12-09 12:08:29 +01:00
arm avutil/attributes: add AV_GCC_VERSION_AT_MOST 2015-09-18 12:41:29 -03:00
avr32
bfin Merge commit '880e2aa23645ed9871c66ee1cbd00f93c72d2d73' 2014-06-02 19:38:01 +02:00
mips mips: intreadwrite: Only execute that code for mips r1 or r2 2015-09-29 11:10:37 +02:00
ppc avutil/ppc/cpu: add include avassert.h 2015-06-05 19:12:58 +02:00
sh4
tomi
x86 x86inc: Make cpuflag() and notcpuflag() return 0 or 1 2015-10-01 18:14:12 +02:00
adler32.c avutil/adler32: Fix data type in test code 2015-06-19 02:25:48 +02:00
adler32.h adler32: Fix doxy group definition 2014-04-07 01:31:02 +02:00
aes.c lavu: Drop deprecated context size variables 2015-08-28 16:04:27 +02:00
aes.h lavu: Drop deprecated context size variables 2015-08-28 16:04:27 +02:00
atomic_gcc.h lavu/atomic: add support for the new memory model aware gcc built-ins 2014-10-29 14:09:58 -03:00
atomic_suncc.h
atomic_win32.h msvc: Fix compilation errors due to header include order. 2014-11-27 12:40:18 +01:00
atomic.c avutil/atomic: reuse ret to avoid dereferencing twice the same value. 2014-12-27 22:14:23 +01:00
atomic.h Merge remote-tracking branch 'qatar/master' 2013-12-20 13:16:56 +01:00
attributes.h avutil/attributes: add av_warn_unused_result 2015-10-05 19:30:20 +02:00
audio_fifo.c avfilter: add showfreqs filter 2015-08-19 16:15:13 +00:00
audio_fifo.h avfilter: add showfreqs filter 2015-08-19 16:15:13 +00:00
avassert.h
avstring.c avutil/avstring: Inline some tiny functions 2015-09-26 22:08:02 +02:00
avstring.h avutil/avstring: Inline some tiny functions 2015-09-26 22:08:02 +02:00
avutil.h doxygen: Remove lavu_internal group 2015-08-22 10:07:05 -07:00
avutilres.rc Add Windows resource file support for shared libraries 2013-12-05 23:42:07 +01:00
base64.c Merge commit 'fb0c9d41d685abb58575c5482ca33b8cd457c5ec' 2014-01-26 01:54:55 +01:00
base64.h
blowfish.c Merge commit '7a7df34c91e16ea8936f59524145a2cdd6b790f9' 2015-08-02 10:38:12 +02:00
blowfish.h Merge commit '7a7df34c91e16ea8936f59524145a2cdd6b790f9' 2015-08-02 10:38:12 +02:00
bprint.c avutil & avdevice: remove av_bprint_fd_contents() 2014-07-15 21:49:56 +02:00
bprint.h avutil/bprint: C++ compatible AVBPrint definition. 2014-11-29 03:51:35 +01:00
bswap.h Fix compile error on bfin. 2014-08-05 01:54:47 +02:00
buffer_internal.h Merge commit 'fbd6c97f9ca858140df16dd07200ea0d4bdc1a83' 2014-11-27 23:42:16 +01:00
buffer.c avutil/buffer: Avoid moving the AVBufferRef to a new place in memory in av_buffer_make_writable() 2015-03-12 02:15:28 +01:00
buffer.h Revert "lavu/buffer: add release function" 2014-03-06 03:23:40 +01:00
camellia.c libavutil: camellia: remove unwanted memory loads 2015-02-10 17:15:36 +01:00
camellia.h avutil/camellia: fix documentation for av_camellia_crypt() 2015-01-02 21:23:45 +01:00
cast5.c avutil/cast5: Make iv array static 2015-05-02 14:37:48 +02:00
cast5.h libavutil: Added cbc mode to cast128 2014-12-19 14:35:29 +01:00
channel_layout.c lavu: Drop FF_API_GET_CHANNEL_LAYOUT_COMPAT cruft 2015-09-05 20:36:19 +02:00
channel_layout.h Merge commit 'e23f84d9652474353d8bbc42787a56ec1991908f' 2015-08-24 10:40:24 +02:00
color_utils.c avutil/color_utils: Add basic transfer functions for each AVColorTransferCharacteristic 2015-09-10 23:53:05 +02:00
color_utils.h avutil/color_utils: Add basic transfer functions for each AVColorTransferCharacteristic 2015-09-10 23:53:05 +02:00
colorspace.h avutil/colorspace: Remove RGB_TO_Y/U/V 2015-06-06 18:21:01 +02:00
common.h Merge commit 'cdfe45ad371b7a8e6135b6c063b6b2a93152cb3a' 2015-09-05 17:17:15 +02:00
cpu_internal.h Merge commit 'cae39851201b7781f1262e1c23627b45e6e80bb4' 2015-05-31 23:59:48 +02:00
cpu.c x86: add AV_CPU_FLAG_AVXSLOW flag 2015-05-31 12:07:11 +02:00
cpu.h lavu/cpu: remove old cmov cruft 2015-09-05 17:23:28 +02:00
crc.c avutil/crc: Fix type of p table so its content fits without overflwoing 2015-06-19 02:25:48 +02:00
crc.h Merge commit '0983d48111f578e17e8c1967d25ce593fce62b63' 2014-04-17 22:38:51 +02:00
des.c Merge commit 'd9e8b47e3144262d6bc4681740411d4bdafad6ac' 2015-08-02 10:41:16 +02:00
des.h Merge commit 'a686e58165ca0f83966431a9166cb6e17bf6095c' 2015-09-07 12:28:25 +02:00
dict.c avutil/dict: Use size_t for appending strings 2015-05-10 16:09:07 +02:00
dict.h avutil: remove FF_CONST_AVUTIL53, its no longer needed 2014-11-24 02:22:19 +01:00
display.c Merge commit 'e4fe535d12f4f30df2dd672e30304af112a5a827' 2015-03-24 01:14:31 +01:00
display.h Merge commit 'e4fe535d12f4f30df2dd672e30304af112a5a827' 2015-03-24 01:14:31 +01:00
downmix_info.c Merge commit 'c98f3169bfb578c1a4e407b44524f0bfa3b4dc0c' 2014-02-16 02:05:29 +01:00
downmix_info.h fix spelling errors 2014-07-12 22:33:27 +02:00
dynarray.h fix spelling errors 2014-07-12 22:33:27 +02:00
error.c avutil/error: list most common error code in error_entries when strerror_r() is unavailable 2015-02-10 23:02:24 +01:00
error.h avutil/error: Introduce new error codes for 4XX and 5XX replies from remote servers 2014-10-19 22:32:14 +02:00
eval.c avutil: add ff_reverse as av_reverse replacement 2015-08-12 00:14:14 +02:00
eval.h Do not leave positive values undefined when negative are defined as error 2013-10-19 16:42:57 +02:00
fifo.c avfilter: add showfreqs filter 2015-08-19 16:15:13 +00:00
fifo.h avfilter: add showfreqs filter 2015-08-19 16:15:13 +00:00
file_open.c Merge commit '9326d64ed1baadd7af60df6bbcc59cf1fefede48' 2014-11-27 11:10:26 +01:00
file.c Merge commit 'bf704132a51f5d838365158331d4e535e1df4c8e' 2015-02-14 21:27:44 +01:00
file.h avutil/file: fix av_tempfile() documentation 2014-11-24 04:59:02 +01:00
fixed_dsp.c avutil/fixed_dsp: remove ff_ prefix from static function 2015-06-20 03:39:09 -03:00
fixed_dsp.h libavutil: Add new fixed dsp functions. 2015-06-03 22:50:53 +02:00
float_dsp.c avutil/float_dsp: Remove use of deprecated av_set_cpu_flags_mask() 2015-08-07 22:46:27 +02:00
float_dsp.h avutil/float_dsp: Fix ambiguous wording about vector products 2015-06-01 16:22:27 +02:00
frame.c Merge commit '1aa24df74c052a73175c43e57d35b4835e537ec8' 2015-10-03 09:52:39 +02:00
frame.h Merge commit '1aa24df74c052a73175c43e57d35b4835e537ec8' 2015-10-03 09:52:39 +02:00
hash.c lavu/hash.c: Add missing "static const". 2014-08-31 10:33:02 +02:00
hash.h lavu/hash: add hash_final helpers. 2014-04-29 13:24:11 +02:00
hmac.c lavu/hmac: remove deprecated type ids 2015-09-05 18:07:20 +02:00
hmac.h lavu/hmac: remove deprecated type ids 2015-09-05 18:07:20 +02:00
imgutils.c Merge commit '2268db2cd052674fde55c7d48b7a5098ce89b4ba' 2015-09-08 16:35:28 +02:00
imgutils.h Replace a few leftover instances of enum PixelFormat with enum AVPixelFormat 2015-03-17 23:53:33 +02:00
integer.c
integer.h
internal.h lavu: Drop FF_API_GET_CHANNEL_LAYOUT_COMPAT cruft 2015-09-05 20:36:19 +02:00
intfloat.h Reinstate proper FFmpeg license for all files. 2013-08-30 15:47:38 +00:00
intmath.c
intmath.h avutil/mathematics: speed up av_gcd by using Stein's binary GCD algorithm 2015-10-11 04:08:41 +02:00
intreadwrite.h libavutil: document side effects of macros 2014-07-19 14:55:46 +02:00
lfg.c
lfg.h
libavutil.v lavu: stop exporting internal functions 2014-08-12 04:35:52 +02:00
libm.h Remove fminf() emulation. 2014-11-08 11:31:11 +01:00
lls.c lavu: Drop deprecated private lls functions 2015-08-28 16:04:27 +02:00
lls.h lavu: Drop deprecated private lls functions 2015-08-28 16:04:27 +02:00
log2_tab.c
log.c avutil/log: fix zero length gnu_printf format string warning 2015-09-17 18:58:01 +02:00
log.h avutil/log: modify AV_LOG_MAX_OFFSET for AV_LOG_TRACE 2015-06-26 14:02:35 +02:00
lzo.c avutil/lzo: fix resource leak 2014-10-11 12:15:26 +02:00
lzo.h
macros.h Merge remote-tracking branch 'qatar/master' 2013-12-30 11:23:32 +01:00
Makefile Merge commit '2d40968dd3ff17b12f7c80dbfad409b14418e267' 2015-09-05 17:18:05 +02:00
mathematics.c avutil/mathematics: speed up av_gcd by using Stein's binary GCD algorithm 2015-10-11 04:08:41 +02:00
mathematics.h fix various typos 2014-06-03 10:58:19 -08:00
md5.c lavu: Drop deprecated context size variables 2015-08-28 16:04:27 +02:00
md5.h lavu: Drop deprecated context size variables 2015-08-28 16:04:27 +02:00
mem_internal.h avutil/mem_internal: add missing header includes 2015-07-13 21:54:15 -03:00
mem.c Factor duplicated ff_fast_malloc() out into mem_internal.h 2015-07-13 02:41:43 +02:00
mem.h Merge commit '8ddc32629a6d6be77256694c9e322dde134609f3' 2014-08-14 00:29:06 +02:00
motion_vector.h avutil/motion_vector.h: fix coordinate types 2014-08-21 12:27:34 +02:00
murmur3.c avutil/murmur3: Add () to protect the ROT() arguments 2015-02-17 00:18:15 +01:00
murmur3.h
opencl_internal.c lavu: rename ff_opencl_set_parameter() to avpriv_opencl_set_parameter() 2014-08-12 03:49:45 +02:00
opencl_internal.h lavu: rename ff_opencl_set_parameter() to avpriv_opencl_set_parameter() 2014-08-12 03:49:45 +02:00
opencl.c avutil/opencl: Fix volatile pointer 2015-09-26 20:28:29 +02:00
opencl.h OpenCL: Fix ABI incompatibility issues 2015-04-28 12:28:53 +02:00
opt.c lavu/opt: add flag to return NULL when applicable in av_opt_get 2015-10-09 04:12:57 -05:00
opt.h lavu/opt: add flag to return NULL when applicable in av_opt_get 2015-10-09 04:12:57 -05:00
parseutils.c Merge commit '219b39a71a5694b1c14a07b86477f665a5b6849b' 2015-07-21 16:55:39 +02:00
parseutils.h Merge commit '27f274628234c1f934b9a6a6380ed567c1b4ceae' 2015-04-07 20:46:25 +02:00
pca.c avutil/pca: Check for av_malloc* failures 2015-03-30 04:37:42 +02:00
pca.h avutil/pca: Make argument of ff_pca_add() const 2014-09-28 16:17:18 +02:00
pixdesc.c pixfmt: Add new SMPTE color primaries and transfer characteristic values 2015-09-17 10:31:43 +02:00
pixdesc.h Merge commit '7b02cb29d9d60cdd5ef321043d11d02023e7dc8f' 2015-09-12 13:03:04 +02:00
pixelutils.c avutil: check pixdescs in a different place 2015-02-10 15:45:02 +01:00
pixelutils.h avutil: add pixelutils API 2014-08-05 21:05:52 +02:00
pixfmt.h pixfmt: Add new SMPTE color primaries and transfer characteristic values 2015-09-17 10:31:43 +02:00
qsort.h
random_seed.c msvc: fix implicitly declared read/close. 2014-08-02 14:52:17 +02:00
random_seed.h
rational.c avutil: Add av_q2intfloat() 2015-05-26 18:31:53 +02:00
rational.h avutil: Add av_q2intfloat() 2015-05-26 18:31:53 +02:00
rc4.c Merge commit 'ae365453c370c85f278bff7fbf9e20d9d335cb2a' 2015-08-02 10:38:33 +02:00
rc4.h Merge commit 'b469832de993dabbfe037bef59c68e90e82ebca5' 2015-08-02 10:38:53 +02:00
replaygain.h Merge commit '8542f9c4f17125d483c40c0c5723842f1c982f81' 2014-04-04 22:52:12 +02:00
reverse.c avutil: add ff_reverse as av_reverse replacement 2015-08-12 00:14:14 +02:00
ripemd.c ripemd: move ripemd{256, 320} into separate functions 2015-05-07 15:40:56 +02:00
ripemd.h
samplefmt.c avutil: remove obsolete FF_API_SAMPLES_UTILS_RETURN_ZERO cruft 2014-10-05 17:09:56 -03:00
samplefmt.h avutil: remove obsolete FF_API_GET_BITS_PER_SAMPLE_FMT cruft 2014-10-05 17:09:49 -03:00
sha512.c lavu/sha512: Fully unroll the transform function loops 2013-09-11 21:55:59 +02:00
sha512.h
sha.c lavu: Drop deprecated context size variables 2015-08-28 16:04:27 +02:00
sha.h lavu: Drop deprecated context size variables 2015-08-28 16:04:27 +02:00
softfloat_tables.h avutil/softfloat_tables: add missing stdint.h include 2015-04-30 17:38:41 -03:00
softfloat.c avutil/softfloat: Add a test for av_sincos_sf() 2015-07-25 21:42:42 +02:00
softfloat.h avutil/softfloat: move av_sincos_sf() back to header 2015-07-22 23:12:21 -03:00
stereo3d.c Merge commit '159a06dfc83d189f753c4583583ddfb571552ff5' 2014-08-14 00:17:47 +02:00
stereo3d.h Merge commit '440842c4eb1d7709654ec97cd687663d11ef499c' 2014-06-19 23:47:10 +02:00
tea.c Add support for TEA (Tiny Encryption Algorithm) 2015-07-21 23:10:44 +02:00
tea.h Add support for TEA (Tiny Encryption Algorithm) 2015-07-21 23:10:44 +02:00
thread.h thread: use "" instead of <> for including the w32pthreads wrapper 2014-12-14 18:15:57 +01:00
threadmessage.c lavu: add thread message API. 2014-05-26 11:40:15 +02:00
threadmessage.h lavu: add thread message API. 2014-05-26 11:40:15 +02:00
time_internal.h avutil/time_internal: do not attempt to override *time_r() macros 2014-11-05 18:44:15 +01:00
time.c Merge commit '1bd0bdcdc236099d5c0d179696951f35f5310fa5' 2014-10-24 11:06:56 +02:00
time.h Merge commit '1bd0bdcdc236099d5c0d179696951f35f5310fa5' 2014-10-24 11:06:56 +02:00
timecode.c Timecode: Support 48fps 2014-05-28 03:25:41 +02:00
timecode.h
timer.h avutil/timer: give each printed value of STOP_TIMER a fixed length 2015-03-27 04:44:58 +01:00
timestamp.h avutil/timestamp: Warn about missing __STDC_FORMAT_MACROS for C++ use 2014-03-13 17:32:15 +01:00
tree.c lavu: Drop deprecated context size variables 2015-08-28 16:04:27 +02:00
tree.h lavu: Drop deprecated context size variables 2015-08-28 16:04:27 +02:00
twofish.c libavutil: optimize twofish cipher 2015-02-18 00:59:55 +01:00
twofish.h libavutil: Added twofish symmetric block cipher 2015-01-29 01:56:11 +01:00
utf8.c avutil/utf8: put under #ifdef TEST 2013-11-22 17:16:11 +01:00
utils.c lavu: disable wrong value check in get_version() upon api bump. 2015-08-18 15:57:20 -04:00
version.h lavu/opt: add flag to return NULL when applicable in av_opt_get 2015-10-09 04:12:57 -05:00
wchar_filename.h Merge commit '9326d64ed1baadd7af60df6bbcc59cf1fefede48' 2014-11-27 11:10:26 +01:00
x86_cpu.h
xga_font_data.c
xga_font_data.h
xtea.c Merge commit '5d8bea3bb2357bb304f8f771a4107039037c5549' 2015-08-02 10:39:37 +02:00
xtea.h Merge commit '5d8bea3bb2357bb304f8f771a4107039037c5549' 2015-08-02 10:39:37 +02:00