When turned on, H264/CAVLC gets ~15% (CVPCMNL1_SVA_C.264) slower for
ultra-high-bitrate files, or ~2.5% (CVFI1_SVA_C.264) for lower-bitrate
files. Other codecs are affected to a lesser extent because they are
less optimized; e.g., VC-1 slows down by less than 1% (all on x86).
The patch generated 3 extra instructions (cmp, cmovae and mov) per
call to get_bits().
The performance penalty on ARM is within the error margin for most
files, up to 4% in extreme cases such as CVPCMNL1_SVA_C.264.
Based on work (for GCI) by Aneesh Dogra <lionaneesh@gmail.com>, and
inspired by patch in Chromium by Chris Evans <cevans@chromium.org>.
* qatar/master:
get_bits: remove A32 variant
avconv: support stream specifiers in -metadata and -map_metadata
wavpack: Fix 32-bit clipping
wavpack: Clip samples after shifting
h264: don't drop B-frames after next keyframe on POC reset.
get_bits: remove useless pointer casts
configure: refactor lists of tests and components into variables
rv40: NEON optimised weak loop filter
mpegts: replace some magic numbers with the existing define
swscale: add unscaled packed 16 bit per component endianess conversion
Conflicts:
libavcodec/get_bits.h
libavcodec/h264.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
The A32 bitstream reader variant is only used on ARMv5 and for
Prores due to the larger bit cache this decoder requires.
In benchmarks on ARMv5 (Marvell Sheeva) with gcc 4.6, the only
statistically significant difference between ALT and A32 is
a 4% advantage for ALT in FLAC decoding. There is thus no (longer)
any reason to keep the A32 reader from this point of view.
This patch adds an option to the ALT reader increasing the bit
cache to 32 bits as required by the Prores decoder. Benchmarking
shows no significant change in speed on Intel i7. Again, the
A32 reader fails to justify its existence.
Signed-off-by: Mans Rullgard <mans@mansr.com>
In the case that (frame_flags & 0x03) == 3, hybrid_maxclip
may have had a signed integer overflow.
Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
Signed-off-by: Anton Khirnov <anton@khirnov.net>
It doesn't make much sense to clip pre-shift,
nor is it correct for proper decoding.
Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
Signed-off-by: Anton Khirnov <anton@khirnov.net>
The keyframe after a POC reset may not be the first to be returned to
the user. Therefore, don't reset the expected next POC once we return
a keyframe to the user, but once we know that the next frame in the
return-queue is a keyframe.
width and height might get passed as 0 and would cause floating point
exceptions in decode_frame.
Fixes bugzilla #149
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
This uses the old demuxing code for OP1a and separate demuxing code for OPAtom.
Timestamp output is added to the old demuxing code.
The seeking code is made to seek to the start of the desired EditUnit only,
from which the normal demuxing code takes over (if OP1a). This means we don't
use delta entries or slices, only StreamOffsets.
OPAtom seeking basically works like before.
This also makes D-10 seeking behave the same way as OP1a and OPAtom. In other
words, we allow seeking before the start or past the end for D-10 too.
This fixes ticket #746.
This changes mxf_compute_ptses() to be used for MXFIndexTable, and also adds
code for computing the fake index to it.
This also temporarily disables PTS computation. A future patch will restore it.