mirror of https://github.com/FFmpeg/FFmpeg.git synced 2024-11-26 19:01:44 +02:00
19334 Commits

Author SHA1 Message Date
Ben Avison
701e8b42e1 vc-1: Optimise parser (with special attention to ARM)
The previous implementation of the parser made four passes over each input
buffer (reduced to two if the container format already guaranteed the input
buffer corresponded to frames, such as with MKV). But these buffers are
often 200K in size, certainly enough to flush the data out of L1 cache, and
for many CPUs, all the way out to main memory. The passes were:

1) locate frame boundaries (not needed for MKV etc)
2) copy the data into a contiguous block (not needed for MKV etc)
3) locate the start codes within each frame
4) unescape the data between start codes

After this, the unescaped data was parsed to extract certain header fields,
but because the unescape operation was so large, this was usually also
effectively operating on uncached memory. Most of the unescaped data was
simply thrown away and never processed further. Only step 2 - because it
used memcpy - was using prefetch, making things even worse.

This patch reorganises these steps so that, aside from the copying, the
operations are performed in parallel, maximising cache utilisation. No more
than the worst-case number of bytes needed for header parsing is unescaped.
Most of the data is, in practice, only read in order to search for a start
code, for which optimised implementations already existed in the H264 codec
(notably the ARM version uses prefetch, so we end up doing both remaining
passes at maximum speed). For MKV files, we know when we've found the last
start code of interest in a given frame, so we are able to avoid doing even
that one remaining pass for most of the buffer.

In some use-cases (such as the Raspberry Pi) video decode is handled by the
GPU, but the entire elementary stream is still fed through the parser to
pick out certain elements of the header which are necessary to manage the
decode process. As you might expect, in these cases, the performance of the
parser is significant.

To measure parser performance, I used the same VC-1 elementary stream in
either an MPEG-2 transport stream or an MKV file, and fed it through avconv
with -c:v copy -c:a copy -f null. These are the gperftools counts for
those streams, both filtered to only include vc1_parse() and its callees,
and unfiltered (to include the whole binary). Lower numbers are better:

                Before          After
File  Filtered  Mean   StdDev   Mean   StdDev  Confidence  Change
M2TS  No        861.7  8.2      650.5  8.1     100.0%      +32.5%
MKV   No        868.9  7.4      731.7  9.0     100.0%      +18.8%
M2TS  Yes       250.0  11.2     27.2   3.4     100.0%      +817.9%
MKV   Yes       149.0  12.8     1.7    0.8     100.0%      +8526.3%

Yes, that last case shows vc1_parse() running 86 times faster! The M2TS
case does show a larger absolute improvement though, since it was worse
to begin with.

This patch has been tested with the FATE suite (albeit on x86 for speed).

Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
2014-08-04 22:22:54 +02:00
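
The idea described in the commit message above (search for start codes, then unescape only as many bytes as header parsing can need) can be sketched roughly as follows. This is a minimal illustration, not the code from the patch: the function names find_start_code() and unescape_bounded() are hypothetical, the scan is a naive byte loop rather than the optimised H264 search routine the message refers to, and the escape rule assumed here is that a 0x03 byte following two zero bytes and preceding a byte below 0x04 is dropped.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: return the offset of the next 00 00 01 prefix at or
 * after pos, or size if no start code prefix is found. */
static size_t find_start_code(const uint8_t *buf, size_t size, size_t pos)
{
    for (; pos + 3 <= size; pos++)
        if (buf[pos] == 0 && buf[pos + 1] == 0 && buf[pos + 2] == 1)
            return pos;
    return size;
}

/* Hypothetical helper: copy src to dst while dropping escape bytes, but stop
 * as soon as max_out bytes have been written. Returns the number of bytes
 * written. The caller picks max_out as the worst-case header size, so the
 * bulk of a large frame is never unescaped at all. */
static size_t unescape_bounded(const uint8_t *src, size_t size,
                               uint8_t *dst, size_t max_out)
{
    size_t out = 0;

    for (size_t i = 0; i < size && out < max_out; i++) {
        /* Assumed escape rule: 00 00 03 xx with xx < 4 loses the 03. */
        if (src[i] == 3 && i >= 2 && src[i - 1] == 0 && src[i - 2] == 0 &&
            i + 1 < size && src[i + 1] < 4)
            continue;
        dst[out++] = src[i];
    }
    return out;
}
```

A caller would locate a start code with find_start_code(), then pass the bytes that follow it to unescape_bounded() with a small max_out, so the cost of unescaping no longer scales with the size of the input buffer.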
Ben Avison
adf8227cf4 vc-1: Add platform-specific start code search routine to VC1DSPContext.
Initialise VC1DSPContext for parser as well as for decoder.
Note, the VC-1 code doesn't actually use the function pointer yet.

Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
2014-08-04 22:22:54 +02:00
Ben Avison
db7f1c7c5a h264: Move start code search functions into separate source files.
This permits re-use with parsers for codecs which use similar start codes.

Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
2014-08-04 22:22:54 +02:00
Diego Biurrun
990e2f3555 avcodec: Suppress deprecation warnings from DTG code scheduled for removal 2014-08-04 11:08:35 -07:00
Carl Eugen Hoyos
60cbd6ad84 tiff: support reading gray+alpha at 8 bits
Signed-off-by: Vittorio Giovara <vittorio.giovara@gmail.com>
2014-08-04 12:57:39 +01:00
Vittorio Giovara
bcc5f69b33 tiff: support reading gray+alpha at 16 bits 2014-08-04 12:57:38 +01:00
Vittorio Giovara
e64f0bf2d2 png: support reading gray+alpha at 16 bits 2014-08-04 12:57:38 +01:00
Vittorio Giovara
2257165bff png: disable broken MMX/SIMD code for bpp <= 2
The decoder was producing different results when ASM was disabled.
Based on a long debug session with Kostya.
2014-08-04 12:57:38 +01:00
Vittorio Giovara
e96c3b81ca avutil: rename AV_PIX_FMT_Y400A to AV_PIX_FMT_YA8
The rationale is that you have a packed format in form
<greyscale sample> <alpha sample> <greyscale sample> <alpha sample>
and shortening greyscale to 'G' might make one think of Greenscale instead.
An alias pixel format and color space name are provided for compatibility.
2014-08-04 12:55:08 +01:00
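
To illustrate the packed layout described in the commit message above, here is a small sketch that splits interleaved <greyscale sample> <alpha sample> pairs into two separate arrays. The function name split_ya8() is hypothetical and not part of libavutil, and the sketch assumes a tightly packed buffer with 8-bit samples and no row padding.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: de-interleave packed grey+alpha pixels.
 * packed holds 2 * pixels bytes laid out as Y0 A0 Y1 A1 ...;
 * gray and alpha each receive pixels bytes. */
static void split_ya8(const uint8_t *packed, size_t pixels,
                      uint8_t *gray, uint8_t *alpha)
{
    for (size_t i = 0; i < pixels; i++) {
        gray[i]  = packed[2 * i];     /* greyscale sample comes first */
        alpha[i] = packed[2 * i + 1]; /* alpha sample follows it */
    }
}
```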
Kieran Kunhya
1ef9e83764 avcodec: Deprecate dtg_active_format field in favor of avframe side-data
Signed-off-by: Diego Biurrun <diego@biurrun.de>
2014-08-03 15:43:02 -07:00
Diego Biurrun
d0393d79bc huffyuv: Check and propagate function return values
Bug-Id: CVE-2013-0868

inspired by a patch from Michael Niedermayer <michaelni@gmx.at>
Found-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind
Signed-off-by: Diego Biurrun <diego@biurrun.de>

CC: libav-stable@libav.org
2014-08-03 15:35:30 -07:00
Diego Biurrun
6234058148 huffyuv: Return proper error codes 2014-08-03 15:18:58 -07:00
Diego Biurrun
3160bdc7f7 huffyuv: Use avpriv_report_missing_feature() where appropriate 2014-08-03 15:18:58 -07:00
Diego Biurrun
b7616f5716 huffyuv: Eliminate some pointless casts 2014-08-03 15:18:58 -07:00
Diego Biurrun
c065f4a0c6 huffyuv: K&R formatting cosmetics 2014-08-03 15:18:58 -07:00
Anton Khirnov
f89d76c103 mpeg4video: Initialize xvididct for all threads
Signed-off-by: Diego Biurrun <diego@biurrun.de>
2014-08-03 15:18:58 -07:00
Janne Grunau
ac6b95dbc0 aarch64: add ',' between assembler macro arguments where missing
llvm's integrated assembler does not accept spaces as a macro argument
delimiter when targeting darwin. Using an explicit delimiter is a good
idea in principle since it makes cases like 'macro 4 -2' vs 'macro 4 - 2'
clear.
2014-08-04 00:17:21 +02:00
Diego Biurrun
9f17685dfb avcodec: Deprecate unused defines and options 2014-08-03 03:24:16 -07:00
Diego Biurrun
bad81800bb avcodec: options: Add missing deprecation ifdefs around emu_edge 2014-08-03 03:24:15 -07:00
Diego Biurrun
c697c590fb lcl: Disentangle pointers to input data and decompression buffer
This is cleaner and avoids a cast plus a related const qualifier warning.
2014-08-03 01:29:43 -07:00
Diego Biurrun
df507d5aa0 tiff: Replace deprecated PIX_FMT names by modern ones 2014-08-02 12:54:37 -07:00
Diego Biurrun
7835c24e19 dv: Update DV-profile-related functions to current public API 2014-08-02 12:54:37 -07:00
Diego Biurrun
ffa4d4ef0b ppc: fft: Build AltiVec optimizations in the standard way 2014-08-02 07:40:37 -07:00
Vittorio Giovara
7ab551f9fd h264: prevent theoretical infinite loop in SEI parsing
Properly address CVE-2011-3946 and parse bitstream as described in the spec.

CC: libav-stable@libav.org
Found-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind
2014-08-01 13:08:32 +01:00
Vittorio Giovara
92a36a6b33 pngdec: correctly indent macros 2014-08-01 13:07:53 +01:00
Diego Biurrun
4da8cdbb91 tscc: Eliminate pointless variable indirections in decode_frame() 2014-08-01 04:08:46 -07:00
Diego Biurrun
5735552f1f pngenc: Drop pointless pointer cast in png_write_row() 2014-08-01 04:08:45 -07:00
Diego Biurrun
a786c8259d idct: Split off Xvid IDCT
The Xvid IDCT is only required to decode some Xvid-encoded MPEG-4 files,
so there is no point in having it as an unconditional part of idctdsp.
2014-08-01 01:25:18 -07:00
Diego Biurrun
03c9f357a4 ppc: idctdsp: Immediately return if no AltiVec is available
This is how all the other init functions operate.
2014-08-01 01:23:11 -07:00
Michael Niedermayer
d98e6c5d5d pgssubdec: Check RLE size before copying
Make sure the buffer size does not exceed the expected
RLE size.

Prevent an out of array bound write.

Found-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
Bug-Id: CVE-2013-0852

Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
2014-08-01 02:13:32 +02:00
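
The kind of check this commit message describes (reject input that would exceed the expected size before copying it) can be sketched as below. This is only an illustration of the pattern: the RleBuffer type and rle_append() are hypothetical names and do not come from pgssubdec.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical accumulation buffer; used never exceeds capacity. */
typedef struct RleBuffer {
    uint8_t *data;
    size_t   capacity; /* expected total RLE size */
    size_t   used;     /* bytes collected so far */
} RleBuffer;

/* Append size bytes, or return -1 without writing anything if they would
 * not fit. Comparing against the remaining space, rather than computing
 * used + size, avoids an overflow in the check itself. */
static int rle_append(RleBuffer *rle, const uint8_t *src, size_t size)
{
    if (size > rle->capacity - rle->used)
        return -1;
    memcpy(rle->data + rle->used, src, size);
    rle->used += size;
    return 0;
}
```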
Nidhi Makhijani
ccbf370f20 mpegvideo: move vol_control_parameters to the only place it is used
Signed-off-by: Diego Biurrun <diego@biurrun.de>
2014-07-29 09:13:18 -07:00
Diego Biurrun
019a28cd63 sanm: Use correct printf conversion specifiers for POSIX int types 2014-07-28 13:19:04 -07:00
Diego Biurrun
4f8cf0dc4e x86: build: Restore ordering of OBJS lines 2014-07-28 13:19:04 -07:00
Anton Khirnov
e76f2d1197 hevc: eliminate the last element from TransformTree
Replace it by passing an additional parameter to transform_unit()
2014-07-28 08:10:35 +00:00
Anton Khirnov
4aa80808bc hevc: eliminate unnecessary cbf_c{b,r} arrays
They are replaced by passing additional parameters to the transform
functions.
2014-07-28 08:09:18 +00:00
Anton Khirnov
0daa255463 hevc: do not store the transform inter_split flag in the context
It does not need to be preserved.
2014-07-28 08:05:47 +00:00
Anton Khirnov
53a11135f2 hevc: simplify splitting the transform tree blocks 2014-07-28 08:04:19 +00:00
Anton Khirnov
e36a2f4c52 hevc: eliminate an unnecessary array
We do not need to store the value of the split flag.
2014-07-28 08:03:53 +00:00
Anton Khirnov
4b169321b8 codec_desc: fix some typos in long codec names
The rv20 typo spotted by Hendrik Leppkes <h.leppkes@gmail.com>
2014-07-28 08:03:13 +00:00
Anton Khirnov
c5fca0174d lavc: add a property for marking codecs that support frame reordering 2014-07-28 08:02:50 +00:00
Anton Khirnov
541427ab4d eamad: use the bytestream2 API instead of AV_RL
This is safer and possibly fixes invalid reads on truncated data.

CC: libav-stable@libav.org
2014-07-27 07:10:54 +00:00
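
As a rough picture of why a bounds-aware reader is safer than raw pointer reads on truncated data, the sketch below shows a reader that tracks the end of its buffer. It is not the bytestream2 API: the Reader type and read_le16() are hypothetical names, and returning 0 when the data runs out is an assumption made for this example.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical reader state: current position and one-past-the-end. */
typedef struct Reader {
    const uint8_t *p;
    const uint8_t *end;
} Reader;

/* Read a 16-bit little-endian value. If fewer than two bytes remain, the
 * reader is moved to the end and 0 is returned, so no byte outside the
 * buffer is ever touched. */
static unsigned read_le16(Reader *r)
{
    unsigned v;

    if (r->end - r->p < 2) {
        r->p = r->end;
        return 0;
    }
    v = r->p[0] | ((unsigned)r->p[1] << 8);
    r->p += 2;
    return v;
}
```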
Diego Biurrun
53abe32409 avcodec: Mark argument in av_{parser|hwaccel|bitstream_filter}_next as const 2014-07-26 14:51:16 -07:00
Pierre Edouard Lepere
1a880b2fb8 hevc: SSE2 and SSSE3 loop filters
Additional contributions by James Almer <jamrial@gmail.com>,
Carl Eugen Hoyos <cehoyos@ag.or.at>, Fiona Glaser <fiona@x264.com> and
Anton Khirnov <anton@khirnov.net>

Signed-off-by: Anton Khirnov <anton@khirnov.net>
2014-07-26 15:01:01 +00:00
Anton Khirnov
73bb8f61d4 hevcdsp: remove an unneeded variable in the loop filter
beta0 and beta1 will always be the same
2014-07-26 15:00:11 +00:00
Diego Biurrun
d8520d3ee0 mpegvideo: Move QMAT_SHIFT* defines to the only place they are used 2014-07-25 12:00:53 -07:00
Diego Biurrun
4fbb62a21b mpegvideo: Move ME_MAP_* defines to the only place they are used 2014-07-25 12:00:53 -07:00
Diego Biurrun
ff85334375 mpegvideo: Drop unused MPEG_BUF_SIZE and CHROMA_444 defines 2014-07-25 12:00:52 -07:00
Diego Biurrun
165e9df195 fft-test: Pass the right struct members instead of casting 2014-07-25 06:54:37 -07:00
Diego Biurrun
58e65e44f4 vc1dsp: Add wrappers for {avg|put}_vc1_mspel_mc00_c
This avoids invoking the wrapped functions with too many arguments.
2014-07-25 02:52:54 -07:00
Diego Biurrun
7fb993d338 qpeldsp: Mark source pointer in qpel_mc_func function pointer const 2014-07-25 02:52:54 -07:00