mirror of https://github.com/FFmpeg/FFmpeg.git synced 2024-12-12 19:18:44 +02:00
Commit Graph

39 Commits

Author SHA1 Message Date
Ganesh Ajjanagadde
68e79b27a5 avutil/lls: speed up performance of solve_lls
This is a trivial rewrite of the loops that results in better
prefetching and associated cache efficiency. Essentially, the problem is
that modern prefetching logic is based on finite-state Markov memory, a
reasonable assumption that is also used elsewhere in CPUs, for instance
in branch predictors.

Surrounding loops all iterate forward through the array, making the
predictor think of prefetching in the forward direction, but the
intermediate loop is unnecessarily in the backward direction.
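
As a rough illustration of the loop-direction change described above, the sketch below solves a lower-triangular system L*y = b by forward substitution in two ways that differ only in the direction of the inner loop. This is not the actual solve_lls code: the function names, the row-major layout, and the parameters are assumptions made for illustration. Reversing the loop also changes the floating-point summation order, so results may differ in the last bits in general.

```c
#include <assert.h>

/* Hypothetical helper: solve L*y = b for lower-triangular L (n x n,
 * row-major). The outer loop runs forward, but the inner loop walks k
 * downward, i.e. against the direction of the surrounding loops. */
static void fwd_subst_inner_backward(int n, const double *L,
                                     const double *b, double *y)
{
    for (int i = 0; i < n; i++) {
        double sum = b[i];
        for (int k = i - 1; k >= 0; k--)
            sum -= L[i * n + k] * y[k];
        assert(L[i * n + i] != 0.0);
        y[i] = sum / L[i * n + i];
    }
}

/* Same computation with the inner loop walking k upward, so every loop
 * touches memory in increasing address order. */
static void fwd_subst_inner_forward(int n, const double *L,
                                    const double *b, double *y)
{
    for (int i = 0; i < n; i++) {
        double sum = b[i];
        for (int k = 0; k < i; k++)
            sum -= L[i * n + k] * y[k];
        assert(L[i * n + i] != 0.0);
        y[i] = sum / L[i * n + i];
    }
}
```

Both variants compute the same sums; only the memory access order of the inner loop differs, which is the property the commit message attributes the speedup to.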

Speedup is nontrivial. Benchmarks obtained by 10^6 iterations within
solve_lls, with START/STOP_TIMER. File is tests/data/fate/flac-16-lpc-cholesky.err.
Hardware: x86-64, Haswell, GNU/Linux.

new:
  17291 decicycles in solve_lls, 2096706 runs,    446 skips
  17255 decicycles in solve_lls, 4193657 runs,    647 skips
  17231 decicycles in solve_lls, 8384997 runs,   3611 skips
  17189 decicycles in solve_lls,16771010 runs,   6206 skips
  17132 decicycles in solve_lls,33544757 runs,   9675 skips
  17092 decicycles in solve_lls,67092404 runs,  16460 skips
  17058 decicycles in solve_lls,134188213 runs,  29515 skips

old:
  18009 decicycles in solve_lls, 2096665 runs,    487 skips
  17805 decicycles in solve_lls, 4193320 runs,    984 skips
  17779 decicycles in solve_lls, 8386855 runs,   1753 skips
  18289 decicycles in solve_lls,16774280 runs,   2936 skips
  18158 decicycles in solve_lls,33548104 runs,   6328 skips
  18420 decicycles in solve_lls,67091793 runs,  17071 skips
  18310 decicycles in solve_lls,134187219 runs,  30509 skips

Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: Ganesh Ajjanagadde <gajjanagadde@gmail.com>
2015-11-26 09:20:46 -05:00
Michael Niedermayer
579a0fdc21 avutil/lls: Make unchanged function arguments const
Reviewed-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-09-28 19:32:07 +02:00
Michael Niedermayer
70b8668fb5 drop LLS1, rename LLS2 to LLS
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-08-09 23:20:31 +02:00
Michael Niedermayer
bbe66ef912 avutil: rename lls to lls2
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2013-11-17 16:30:23 +01:00
Michael Niedermayer
78b5479633 Merge commit '502ab21af0ca68f76d6112722c46d2f35c004053'
* commit '502ab21af0ca68f76d6112722c46d2f35c004053':
  x86: lpc: simd av_update_lls

The versions are bumped due to changes in lls.h, which is used across
libraries, affecting intra library ABI.
(This version bump also covers changes to lls.h in the immediately previous
 commits)

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-06-30 11:35:52 +02:00
Michael Niedermayer
c93a424718 Merge commit '41578f70cf8aec8e7565fba1ca7e07f3dc46c3d2'
* commit '41578f70cf8aec8e7565fba1ca7e07f3dc46c3d2':
  lpc: use function pointers, in preparation for asm

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-06-30 11:23:43 +02:00
Michael Niedermayer
d3bd320e63 Merge commit 'cc6714bb16b1f0716ba43701d47273dbe9657b8b'
* commit 'cc6714bb16b1f0716ba43701d47273dbe9657b8b':
  lpc: remove "decay" argument

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-06-30 10:54:20 +02:00
Loren Merritt
502ab21af0 x86: lpc: simd av_update_lls
4x-6x faster on sandybridge

Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
2013-06-29 13:23:57 +02:00
Loren Merritt
41578f70cf lpc: use function pointers, in preparation for asm
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
2013-06-29 13:23:57 +02:00
Loren Merritt
cc6714bb16 lpc: remove "decay" argument
We never used the rolling-average mode, and this makes av_update_lls 15% faster.

Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
2013-06-29 13:23:57 +02:00
Michael Niedermayer
3c200aa693 Merge commit '1fda184a85178cfd7b98d9e308d18e1ded76a511'
* commit '1fda184a85178cfd7b98d9e308d18e1ded76a511':
  avutil: Add av_cold attributes to init functions missing them

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-05-05 12:53:50 +02:00
Diego Biurrun
1fda184a85 avutil: Add av_cold attributes to init functions missing them
2013-05-04 22:48:05 +02:00
Michael Niedermayer
4b335e73da Merge remote-tracking branch 'qatar/master'
* qatar/master:
  lls: Do not return from void functions

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-03-01 13:50:19 +01:00
Michael Niedermayer
7da1b235e8 Merge commit '4da950c0ae224b9b8ef952dadf614be2c050023e'
* commit '4da950c0ae224b9b8ef952dadf614be2c050023e':
  lls: #ifndef --> #if in FF_API_ version guard

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-03-01 13:44:46 +01:00
Michael Niedermayer
28bb17ca36 Merge commit '399663be9d4a839b894c48a21b62926eb8497d72'
* commit '399663be9d4a839b894c48a21b62926eb8497d72':
  lls: mark max_order as unsigned short

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-03-01 13:32:16 +01:00
Michael Niedermayer
76f8d3c8d9 Merge commit '9d4da474f5f40b019cb4cb931c8499deee586174'
* commit '9d4da474f5f40b019cb4cb931c8499deee586174':
  lls: move to the private namespace

Conflicts:
	libavutil/version.h

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-03-01 13:19:00 +01:00
Diego Biurrun
0b31129389 lls: Do not return from void functions
2013-03-01 01:44:35 +01:00
Diego Biurrun
4da950c0ae lls: #ifndef --> #if in FF_API_ version guard
2013-03-01 01:44:35 +01:00
Luca Barbato
399663be9d lls: mark max_order as unsigned short
The value is within 0 and 32.

Remove an `array subscript is below array bounds` warning.
2013-02-28 17:39:24 +01:00
Luca Barbato
9d4da474f5 lls: move to the private namespace
The functions are private.
2013-02-28 17:39:24 +01:00
Michael Niedermayer
e10979ff56 Merge remote-tracking branch 'qatar/master'
* qatar/master:
  changelog: misc typo and wording fixes
  H.264: add filter_mb_fast support for >8-bit decoding
  doc: Remove outdated comments about gcc 2.95 and gcc 3.3 support.
  lls: use av_lfg instead of rand() in test program
  build: remove unnecessary dependency on libs from 'all' target
  H.264: avoid redundant alpha/beta calculations in loopfilter
  H.264: optimize intra/inter loopfilter decision
  mpegts: fix Continuity Counter error detection
  build: remove unnecessary FFLDFLAGS variable
  vp8/mt: flush worker thread, not application thread context, on seek.
  mt: proper locking around release_buffer calls.
  DxVA2: unbreak build after [657ccb5ac7]
  hwaccel: unbreak build
  Eliminate FF_COMMON_FRAME macro.

Conflicts:
	Changelog
	Makefile
	doc/developer.texi
	libavcodec/avcodec.h
	libavcodec/h264.c
	libavcodec/mpeg4videodec.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2011-07-12 01:42:32 +02:00
Mans Rullgard
7ce914fb5a lls: use av_lfg instead of rand() in test program
Signed-off-by: Mans Rullgard <mans@mansr.com>
2011-07-11 21:09:27 +01:00
Michael Niedermayer
58257ea29e Merge remote-tracking branch 'qatar/master'
* qatar/master: (28 commits)
  mp3enc: write a xing frame containing number of frames in the file
  lavf: update AVStream.nb_frames when muxing.
  ffmpeg: remove unused variables from InputStream.
  doc: update ffmpeg -ar and -ac documentation to reflect reality.
  ffmpeg: remove pointless if (nb_input_files)
  ffmpeg: merge input_files_ts_offset into input_files.
  ffmpeg: merge input_codecs into input_streams.
  ffmpeg: drop AV prefixes from struct names.
  ffmpeg: deprecate loop_input and loop_output options
  gif: add loop private option.
  img2: add loop private option.
  AVOptions: in av_opt_find() don't return named constants unless unit is specified.
  x11grab: replace undocumented nomouse hackery with a private option.
  dict: extend documentation.
  lls: whitespace cosmetics
  docs: Use proper markup for a literal command line option
  docs: Remove a remark that isn't relevant any longer
  docs: Explain how to regenerate import libraries with MSVC tools
  docs: Mention that libraries for MSVC can be built with a cross compiler
  docs: Remove old docs that mention setting up a build environment with lib.exe
  ...

Conflicts:
	doc/ffmpeg.texi
	doc/general.texi
	ffmpeg.c
	libavcodec/Makefile
	libavcodec/dnxhddata.c
	libavformat/mp3enc.c
	libavformat/utils.c
	libavutil/Makefile
	tests/copycooker.sh

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2011-07-09 02:06:40 +02:00
Mans Rullgard
fdaf1d0640 lls: whitespace cosmetics
Signed-off-by: Mans Rullgard <mans@mansr.com>
2011-07-08 15:03:40 +01:00
Mans Rullgard
2912e87a6c Replace FFmpeg with Libav in licence headers
Signed-off-by: Mans Rullgard <mans@mansr.com>
2011-03-19 13:33:20 +00:00
Diego Biurrun
ba87f0801d Remove explicit filename from Doxygen @file commands.
Passing an explicit filename to this command is only necessary if the
documentation in the @file block refers to a file different from the
one the block resides in.

Originally committed as revision 22921 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-04-20 14:45:34 +00:00
Diego Biurrun
3d7b15e450 Remove disabled code cruft.
Originally committed as revision 19616 to svn://svn.ffmpeg.org/ffmpeg/trunk
2009-08-10 13:42:16 +00:00
Diego Biurrun
578f90a8d5 Align test program output columns.
Originally committed as revision 18068 to svn://svn.ffmpeg.org/ffmpeg/trunk
2009-03-20 11:40:05 +00:00
Diego Biurrun
bad5537e2c Use full internal pathname in doxygen @file directives.
Otherwise doxygen complains about ambiguous filenames when files exist
under the same name in different subdirectories.

Originally committed as revision 16912 to svn://svn.ffmpeg.org/ffmpeg/trunk
2009-02-01 02:00:19 +00:00
Benoit Fouet
bc02bc8686 Remove unused redefinition of av_log for test.
Originally committed as revision 14657 to svn://svn.ffmpeg.org/ffmpeg/trunk
2008-08-07 07:01:54 +00:00
Michael Niedermayer
b8fe8ab062 Fix the following warnings using void* casts. Proper casts are less readable and
avoiding casts would be even less readable, but other suggestions are welcome.
lls.c:56: warning: initialization from incompatible pointer type
lls.c:57: warning: initialization from incompatible pointer type

Originally committed as revision 11697 to svn://svn.ffmpeg.org/ffmpeg/trunk
2008-01-31 20:52:14 +00:00
Diego Biurrun
38c162c1a9 Remove unused variable variance.
Originally committed as revision 11471 to svn://svn.ffmpeg.org/ffmpeg/trunk
2008-01-08 22:48:26 +00:00
Diego Biurrun
f8a80fd69d main() --> main(void)
Originally committed as revision 11079 to svn://svn.ffmpeg.org/ffmpeg/trunk
2007-11-23 00:52:56 +00:00
Diego Biurrun
e5a389a1b7 license header consistency cosmetics
Originally committed as revision 9484 to svn://svn.ffmpeg.org/ffmpeg/trunk
2007-07-05 10:40:25 +00:00
Diego Biurrun
b78e7197a8 Change license headers to say 'FFmpeg' instead of 'this program/this library'
and fix GPL/LGPL version mismatches.

Originally committed as revision 6577 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-10-07 15:30:46 +00:00
Diego Biurrun
538389c981 Fix FSF postal address.
Originally committed as revision 5829 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-07-26 01:12:26 +00:00
Michael Niedermayer
408ec4e2a6 Calculate all coefficients for several orders during Cholesky factorization. The resulting coefficients are not strictly optimal, though, as there is a small difference in the autocorrelation matrices which is ignored for the smaller orders.
Originally committed as revision 5758 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-07-15 23:43:38 +00:00
Michael Niedermayer
2c779260a9 unneeded #include
Originally committed as revision 5743 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-07-14 12:01:53 +00:00
Michael Niedermayer
82ab5ad7b2 linear least squares solver using cholesky factorization
Originally committed as revision 5740 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-07-14 10:03:09 +00:00