FFmpeg

mirror of https://github.com/FFmpeg/FFmpeg.git synced 2025-08-15 14:13:16 +02:00


Ben Avison a0d7f9ec9a vc-1: Optimise parser (with special attention to ARM)

The previous implementation of the parser made four passes over each input
buffer (reduced to two if the container format already guaranteed the input
buffer corresponded to frames, such as with MKV). But these buffers are
often 200K in size, certainly enough to flush the data out of L1 cache, and
for many CPUs, all the way out to main memory. The passes were:

1) locate frame boundaries (not needed for MKV etc)
2) copy the data into a contiguous block (not needed for MKV etc)
3) locate the start codes within each frame
4) unescape the data between start codes

After this, the unescaped data was parsed to extract certain header fields,
but because the unescape operation was so large, this was usually also
effectively operating on uncached memory. Most of the unescaped data was
simply thrown away and never processed further. Only step 2 - because it
used memcpy - was using prefetch, making things even worse.

This patch reorganises these steps so that, aside from the copying, the
operations are performed in parallel, maximising cache utilisation. No more
than the worst-case number of bytes needed for header parsing is unescaped.
Most of the data is, in practice, only read in order to search for a start
code, for which optimised implementations already existed in the H264 codec
(notably the ARM version uses prefetch, so we end up doing both remaining
passes at maximum speed). For MKV files, we know when we've found the last
start code of interest in a given frame, so we are able to avoid doing even
that one remaining pass for most of the buffer.
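
As a rough illustration of the two operations that remain (searching for a
start code, and unescaping only a bounded number of bytes), here is a minimal
sketch in C. It is not the code from this patch: find_start_code() and
unescape_bounded() are hypothetical names, the search is a naive byte-wise
loop rather than the optimised H264 routine, and the escape rule (drop the
0x03 in 00 00 03 when the next byte is below 4) is assumed from the usual
VC-1 advanced profile convention.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: offset of the first 00 00 01 prefix in buf[0..size),
 * or size if there is none. Naive scan, shown only for clarity. */
static size_t find_start_code(const uint8_t *buf, size_t size)
{
    assert(buf != NULL);
    for (size_t i = 0; i + 3 <= size; i++)
        if (buf[i] == 0 && buf[i + 1] == 0 && buf[i + 2] == 1)
            return i;
    return size;
}

/* Hypothetical helper: copy src to dst, dropping the 0x03 escape byte of each
 * 00 00 03 0x sequence (x < 4), and stop as soon as max_out bytes have been
 * written. Returns the number of bytes written to dst. */
static size_t unescape_bounded(const uint8_t *src, size_t size,
                               uint8_t *dst, size_t max_out)
{
    size_t out = 0;

    assert(src != NULL && dst != NULL);
    for (size_t i = 0; i < size && out < max_out; i++) {
        if (src[i] == 3 && i >= 2 && src[i - 1] == 0 && src[i - 2] == 0 &&
            i + 1 < size && src[i + 1] < 4)
            continue;           /* skip the escape byte itself */
        dst[out++] = src[i];
    }
    return out;
}
```

A caller would locate a start code, then pass the bytes that follow it to
unescape_bounded() with max_out set to the worst-case header size, so the
bulk of a large buffer is only ever scanned, never copied or unescaped.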

In some use-cases (such as the Raspberry Pi) video decode is handled by the
GPU, but the entire elementary stream is still fed through the parser to
pick out certain elements of the header which are necessary to manage the
decode process. As you might expect, in these cases, the performance of the
parser is significant.

To measure parser performance, I used the same VC-1 elementary stream in
either an MPEG-2 transport stream or an MKV file, and fed it through ffmpeg
with -c:v copy -c:a copy -f null. These are the gperftools counts for
those streams, both filtered to only include vc1_parse() and its callees,
and unfiltered (to include the whole binary). Lower numbers are better:

                Before          After
File  Filtered  Mean   StdDev   Mean   StdDev  Confidence  Change
M2TS  No        861.7  8.2      650.5  8.1     100.0%      +32.5%
MKV   No        868.9  7.4      731.7  9.0     100.0%      +18.8%
M2TS  Yes       250.0  11.2     27.2   3.4     100.0%      +817.9%
MKV   Yes       149.0  12.8     1.7    0.8     100.0%      +8526.3%

Yes, that last case shows vc1_parse() running 86 times faster! The M2TS
case does show a larger absolute improvement though, since it was worse
to begin with.
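
For readers checking the Change column, the sketch below shows one way it
could be derived, assuming it is the speed-up expressed as a percentage
(before / after - 1). speedup_percent() is a hypothetical helper, not part of
the patch or of gperftools.

```c
#include <assert.h>

/* Assumed definition of the Change column: how much faster the "after"
 * run is than the "before" run, as a percentage. */
static double speedup_percent(double before, double after)
{
    assert(after > 0.0);
    return (before / after - 1.0) * 100.0;
}
```

Fed with the rounded means from the unfiltered rows, this agrees with the
table to within rounding. For the filtered rows it lands slightly off the
printed figures, presumably because those were computed from unrounded data.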

This patch has been tested with the FATE suite (albeit on x86 for speed).

Signed-off-by: Michael Niedermayer <michaelni@gmx.at>

2014-04-25 02:36:29 +02:00

compat

Merge remote-tracking branch 'qatar/master'

2014-03-10 12:05:02 +01:00

doc

Merge commit 'e2834567d73bd1e46478ba67ac133cb8ef5f50fd'

2014-04-23 20:47:13 +02:00

libavcodec

vc-1: Optimise parser (with special attention to ARM)

2014-04-25 02:36:29 +02:00

libavdevice

avdevice/qtkit: fix include

2014-04-24 05:12:07 +02:00

libavfilter

Fix vf_eq.c and vf_eq2.c compilation with !HAVE_6REGS.

2014-04-24 17:50:27 +02:00

libavformat

Merge remote-tracking branch 'cehoyos/master'

2014-04-25 01:56:23 +02:00

libavresample

Merge commit 'a24a252709dd38f12aa4929ce4981f87091a5113'

2014-04-25 01:19:27 +02:00

libavutil

Merge remote-tracking branch 'cehoyos/master'

2014-04-25 01:56:23 +02:00

libpostproc

Fix libpostproc compilation with !HAVE_6REGS.

2014-04-24 17:50:02 +02:00

libswresample

swresample: fix AV_CH_LAYOUT_STEREO_DOWNMIX input

2014-04-24 01:25:46 +02:00

libswscale

Fix compilation with !HAVE_6REGS.

2014-04-19 09:56:01 +02:00

presets

presets: specify the codecs.

2012-05-04 18:40:36 +02:00

tests

fate: Add fic-in-avi test

2014-04-24 22:01:33 +01:00

tools

tools/uncoded_frame: fix audio codec generation

2014-03-29 09:25:14 +01:00

.gitignore

examples: rename avcodec.c to decoding_encoding.c

2014-04-23 10:32:42 +02:00

arch.mak

Merge commit '8675bcb0addb1c7fb0b04682d1f3f95d5b8dae14'

2014-04-07 02:15:18 +02:00

Changelog

Merge commit 'e2834567d73bd1e46478ba67ac133cb8ef5f50fd'

2014-04-23 20:47:13 +02:00

cmdutils_common_opts.h

Allow hiding the banner.

2013-12-29 22:57:20 +01:00

cmdutils_opencl.c

cmdutils & opencl: add -opencl_bench option to test and show available OpenCL devices

2013-12-09 21:21:36 +01:00

cmdutils.c

cmdutils: use av_mallocz_array()

2014-04-08 15:44:32 +02:00

cmdutils.h

Merge commit '85698be461c07be10d873dd34348bcfe9ffc56e0'

2014-03-29 14:33:39 +01:00

common.mak

lavd: Add QTKit input device.

2014-03-30 20:45:07 +02:00

configure

configure: Fix ld flags when rpath is enabled.

2014-04-24 03:03:45 +02:00

COPYING.GPLv2

…

COPYING.GPLv3

…

COPYING.LGPLv2.1

cosmetics: Delete empty lines at end of file.

2012-02-09 12:26:45 +01:00

COPYING.LGPLv3

…

CREDITS

CREDITS: redirect to Git log, remove current outdated content

2013-01-31 18:02:52 +01:00

ffmpeg_filter.c

Merge remote-tracking branch 'qatar/master'

2013-11-24 05:21:19 +01:00

ffmpeg_opt.c

ffmpeg_opt: check that a subtitle encoder is available before auto mapping streams

2014-03-16 15:15:02 +01:00

ffmpeg_vdpau.c

Merge commit '7671dd7cd7d51bbd637cc46d8f104a141bc355ea'

2013-11-23 14:46:48 +01:00

ffmpeg.c

Merge commit '1ae8198bca749a0cff205196cc83d35b9962849b'

2014-04-22 13:45:34 +02:00

ffmpeg.h

Merge commit '4754345027eb85cfa51aeb88beec68d7b036c11e'

2014-03-24 16:40:35 +01:00

ffplay.c

avformat: add av_format_inject_global_side_data(), and disable it by default

2014-04-15 02:37:40 +02:00

ffprobe.c

ffprobe: fix scaling of vali in value_string() in case -prefix is selected

2014-04-23 10:32:42 +02:00

ffserver.c

ffserver: don't hardcode RTSP status codes

2014-04-07 00:24:00 -03:00

INSTALL

…

library.mak

Merge commit 'b339182eba34f28de5f1a477cdd2c84f1ef35d90'

2014-02-17 02:22:01 +01:00

LICENSE

Add libx265 encoder

2014-02-12 13:13:17 +00:00

MAINTAINERS

MAINTAINERS: Add myself as FIC maintainer

2014-04-21 21:27:32 -04:00

Makefile

Merge commit '8675bcb0addb1c7fb0b04682d1f3f95d5b8dae14'

2014-04-07 02:15:18 +02:00

README

README: be a tiny bit more verbose

2012-04-06 10:23:26 +02:00

RELEASE

Prepare for 11_alpha1 Release

2014-03-13 08:24:11 -04:00

version.sh

version.sh: add preprocessing guards

2013-11-30 21:42:03 +01:00

README

FFmpeg README
-------------

1) Documentation
----------------

* Read the documentation in the doc/ directory in git.
  You can also view it online at http://ffmpeg.org/documentation.html

2) Licensing
------------

* See the LICENSE file.

3) Build and Install
--------------------

* See the INSTALL file.

Languages

C 90.1%

Assembly 7.9%

Makefile 1.3%

C++ 0.2%

Objective-C 0.2%

Other 0.1%