Start code emulation prevention is only required in Annex B bytestream
packed NAL units. For other coding formats the size is already known.
Looking for a start code prefix in such data can produce false
positives, as in
http://streams.videolan.org/streams/mp4/Mr_MrsSmith-h264_aac.mp4
which has one in the SPS.
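
The difference between the two packings can be sketched as follows. This
is only an illustration, not the parser in libavcodec: find_start_code()
and read_nal_size() are hypothetical helpers, and a 4-byte big-endian
length prefix is assumed for the non-Annex B case.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: return the offset of the first 00 00 01 start code
 * prefix in buf, or -1 if there is none. */
static long find_start_code(const uint8_t *buf, size_t size)
{
    for (size_t i = 0; i + 3 <= size; i++)
        if (buf[i] == 0 && buf[i + 1] == 0 && buf[i + 2] == 1)
            return (long)i;
    return -1;
}

/* Hypothetical helper: read a 4-byte big-endian NAL size prefix
 * (assumed length-prefixed packing, not Annex B). */
static uint32_t read_nal_size(const uint8_t *buf, size_t size)
{
    assert(size >= 4);
    return (uint32_t)buf[0] << 24 | (uint32_t)buf[1] << 16 |
           (uint32_t)buf[2] <<  8 | (uint32_t)buf[3];
}
```

With a length prefix the NAL unit boundary is known up front. Scanning the
payload for 00 00 01 anyway would cut the unit short wherever those bytes
happen to occur.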
Width and height may be passed as 0, which causes floating point
exceptions in decode_frame.
Fixes bugzilla #149
Signed-off-by: Janne Grunau <janne-libav@jannau.net>
This was intended as an optimisation for skipped blocks in MPEG2
P-frames and never used elsewhere. Removing this "optimisation"
speeds up MPEG2 decoding by 1-2% (ARM Cortex-A9).
Signed-off-by: Mans Rullgard <mans@mansr.com>
Previously the decoder only worked if the user had set avctx->pix_fmt
manually. For some reason the libavformat tmv demuxer sets this, so
the problem was not visible in avplay etc.
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
The buffer splicing relies on the bitstream reader over-reading
the end of the buffer as declared in init_get_bits(), although
more data is actually present. Manually moving the bitstream
boundary after init_get_bits() allows this to work as expected.
Signed-off-by: Mans Rullgard <mans@mansr.com>
When turned on, H264/CAVLC gets ~15% (CVPCMNL1_SVA_C.264) slower for
ultra-high-bitrate files, or ~2.5% (CVFI1_SVA_C.264) for lower-bitrate
files. Other codecs are affected to a lesser extent because they are
less optimized; e.g., VC-1 slows down by less than 1% (all on x86).
The patch generated 3 extra instructions (cmp, cmovae and mov) per
call to get_bits().
The performance penalty on ARM is within the error margin for most
files, up to 4% in extreme cases such as CVPCMNL1_SVA_C.264.
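
The idea behind the check can be sketched as below. This is a simplified
bit-by-bit reader for illustration, not the macro-based reader in
libavcodec; the BitReader struct and get_bits_checked() are assumed names.
The point is the clamp on the read position after each call.

```c
#include <assert.h>
#include <stdint.h>

typedef struct BitReader {
    const uint8_t *buf;
    unsigned index;         /* read position in bits */
    unsigned size_in_bits;  /* declared buffer size in bits */
} BitReader;

/* Read n bits (1..25), MSB first. Bits past the declared end read as zero
 * and the position is clamped, so repeated over-reads cannot run away. */
static unsigned get_bits_checked(BitReader *br, unsigned n)
{
    unsigned val = 0;

    assert(n >= 1 && n <= 25);
    for (unsigned i = 0; i < n; i++) {
        unsigned pos = br->index + i;
        unsigned bit = 0;
        if (pos < br->size_in_bits)
            bit = (br->buf[pos >> 3] >> (7 - (pos & 7))) & 1;
        val = (val << 1) | bit;
    }
    /* The clamp corresponds to the extra compare and conditional move. */
    br->index = br->index + n < br->size_in_bits ? br->index + n
                                                 : br->size_in_bits;
    return val;
}
```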
Based on work (for GCI) by Aneesh Dogra <lionaneesh@gmail.com>, and
inspired by a patch in Chromium by Chris Evans <cevans@chromium.org>.
The A32 bitstream reader variant is only used on ARMv5 and for
Prores due to the larger bit cache this decoder requires.
In benchmarks on ARMv5 (Marvell Sheeva) with gcc 4.6, the only
statistically significant difference between ALT and A32 is
a 4% advantage for ALT in FLAC decoding. There is thus no longer
any reason to keep the A32 reader from this point of view.
This patch adds an option to the ALT reader increasing the bit
cache to 32 bits as required by the Prores decoder. Benchmarking
shows no significant change in speed on Intel i7. Again, the
A32 reader fails to justify its existence.
Signed-off-by: Mans Rullgard <mans@mansr.com>
In the case that (frame_flags & 0x03) == 3, hybrid_maxclip
may have had a signed integer overflow.
Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
Signed-off-by: Anton Khirnov <anton@khirnov.net>
It doesn't make much sense to clip pre-shift,
nor is it correct for proper decoding.
Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
Signed-off-by: Anton Khirnov <anton@khirnov.net>
The keyframe after a POC reset may not be the first to be returned to
the user. Therefore, don't reset the expected next POC once we return
a keyframe to the user, but once we know that the next frame in the
return-queue is a keyframe.
The mode is set in libgsm_decode_init, but the decoder
object is simply destroyed and recreated in the flush
function - therefore the mode has to be set again.
This fixes playback using the libgsm_ms decoder in avplay.
Signed-off-by: Martin Storsjö <martin@martin.st>
v410 is a packed 10-bit 4:4:4 YCbCr format used in
QuickTime.
Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
Signed-off-by: Diego Biurrun <diego@biurrun.de>
This patch is a generalization of what Michael Niedermayer
fixed in a single case.
The wmv8-drm fate test has been updated accordingly.
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
The change in 599b4c6ef didn't turn out to work properly on
i386 on OS X, where it broke building with PIC enabled.
Signed-off-by: Martin Storsjö <martin@martin.st>
Make the function prototype match the argument of
AVCodecContext.execute() and remove the cast hiding
this mismatch.
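
The pattern can be sketched as follows. execute_jobs() and job_func are
hypothetical stand-ins, not the real AVCodecContext.execute() signature.
The worker takes a void * argument and converts it inside the function, so
no function pointer cast is needed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an execute()-style dispatcher: it calls func
 * once per job with a pointer to that job's argument. */
typedef int (*job_func)(void *ctx, void *arg);

static int execute_jobs(void *ctx, job_func func, void *args, int *ret,
                        int count, size_t size)
{
    assert(func && args);
    for (int i = 0; i < count; i++) {
        int r = func(ctx, (char *)args + i * size);
        if (ret)
            ret[i] = r;
    }
    return 0;
}

/* The prototype matches job_func exactly; the conversion from void *
 * happens in the body rather than through a cast at the call site. */
static int double_job(void *ctx, void *arg)
{
    int *value = arg;
    (void)ctx;
    *value *= 2;
    return 0;
}
```

Calling a function through a pointer of an incompatible type is undefined
behaviour in C, and a cast at the call site only silences the diagnostic.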
Signed-off-by: Mans Rullgard <mans@mansr.com>
This replaces the explicit offset(reg) memory references with
"m" operands for the same locations. As a result, one fewer
register operand is needed for these inline asm statements.
Signed-off-by: Mans Rullgard <mans@mansr.com>
When the buf and last pointers are equal, the FFSWAP() results
in an invalid call to memcpy() with same source and destination
on some targets. Although assigning a struct to itself is valid
C99, gcc does not check for this before calling memcpy().
See http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32667
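
One way to avoid the self-assignment is to skip the swap when both
pointers are equal. The sketch below uses a hypothetical Frame struct and
swap_frames() helper; it is not necessarily how the decoder was changed.

```c
#include <assert.h>

typedef struct Frame {
    int data[64];
} Frame;

/* Hypothetical swap helper: skip the exchange when both pointers refer to
 * the same object, so no struct is ever assigned to itself. */
static void swap_frames(Frame *a, Frame *b)
{
    assert(a && b);
    if (a != b) {
        Frame tmp = *a;
        *a = *b;
        *b = tmp;
    }
}
```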
Signed-off-by: Mans Rullgard <mans@mansr.com>
The existing functions defined in intfloat_readwrite.[ch] are
both slow and incorrect (infinities are not handled).
This introduces a new header with fast, inline conversion
functions using direct union punning assuming an IEEE-754
system, an assumption already made throughout the code.
The one use of Intel/Motorola extended 80-bit format is
replaced by simpler code sufficient under the present
constraints (positive normal values).
The old functions are marked deprecated and retained for
compatibility.
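
Union punning of this kind can be sketched as below. The function names
here are illustrative and are not claimed to match the new header; the
code assumes float is a 32-bit IEEE-754 type.

```c
#include <assert.h>
#include <stdint.h>

/* Reinterpret a 32-bit pattern as a float (assumes 32-bit IEEE-754). */
static inline float bits_to_float(uint32_t bits)
{
    union { uint32_t i; float f; } v;
    assert(sizeof(float) == sizeof(uint32_t));
    v.i = bits;
    return v.f;
}

/* Reinterpret a float as its 32-bit pattern. */
static inline uint32_t float_to_bits(float value)
{
    union { uint32_t i; float f; } v;
    assert(sizeof(float) == sizeof(uint32_t));
    v.f = value;
    return v.i;
}
```

Because the bits are copied unchanged, infinities and NaNs pass through
without any special handling.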
Signed-off-by: Mans Rullgard <mans@mansr.com>
Return the whole packet as consumed in this case and not the size the
packet should have had. Move the insufficient data check into the for
condition to fix an ISO C90 error on big-endian.
This groups the encode/decode parts under single ifdefs and
eliminates the encode_init() function as it merely calls
common_init(). Also fix whitespace in moved code.
Signed-off-by: Mans Rullgard <mans@mansr.com>
Earlier, bits per sample was defined as 8, since
bits_per_coded_sample was used to indicate whether to ignore
the lower bits of the codeword, having values 6, 7 or 8.
g722 encodes 2 samples into one byte codeword, therefore the
bits per sample is 4.
This makes timestamp generation for g722 data correct (both when
encoding and when demuxing from raw g722 files).
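
The effect on timestamps can be sketched with a small calculation.
samples_in_packet() is a hypothetical helper, not a libavformat function;
it only shows how the bits per sample value determines the sample count
derived from a packet size.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: number of samples carried by a packet of
 * constant-bitrate audio, given the bits used per sample. */
static int64_t samples_in_packet(int packet_bytes, int bits_per_sample)
{
    assert(packet_bytes >= 0 && bits_per_sample > 0);
    return (int64_t)packet_bytes * 8 / bits_per_sample;
}
```

For an 80-byte packet, 4 bits per sample gives 160 samples, while 8 bits
per sample gives only 80, so timestamps would advance at half the correct
rate.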
Signed-off-by: Martin Storsjö <martin@martin.st>
This avoids using bits_per_coded_sample for this information.
bits_per_coded_sample should be 4 for this codec normally,
since two samples are encoded into one 8 bit codeword.
In principle, this might be info that needs to be passed from
a demuxer, and in that case, a private AVOption isn't the best
choice, but no such samples are currently available, so
that use case is purely theoretical.
Signed-off-by: Martin Storsjö <martin@martin.st>
When decoding lossy WavPack samples, they are supposed
to be clipped, in order to be decoded correctly.
Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
Pass the correct size in bits to mpeg4audio_get_config and add a flag
to disable parsing of the sync extension when the size is not known.
Latm with AudioMuxVersion 0 does not specify the size of the audio
specific config. Data after the audio specific config can be
misinterpreted as sync extension resulting in random and wrong configs.
Add AV_NUM_DATA_POINTERS to simplify the bump transition.
This will allow for supporting more planar audio channels without having to
allocate separate pointer arrays.
There's no reason to use two arrays for this.
Based off commit 2fea60c600
to FFmpeg by Michael Niedermayer.
Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
The next call to decode() will update from an invalid index, which will
either lead to a memcpy() where dest==src (2 threads), or lead to a
crash (>2 threads).
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
Interlaced videos can contain progressive frames too, and currently the
wrong scantable is selected for them.
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
This fixes a compile error on mingw32 when using p->thread
directly (as if it were a pointer) to track thread existence,
because the type is opaque and may be a non-pointer.
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
The way these values are used, they should have an unsigned type.
A similar change was made for mpegvideo in cb66847.
Signed-off-by: Mans Rullgard <mans@mansr.com>
This simplifies the decoder so it doesn't have to process an in-packet header
or handle arbitrary-sized packets. It also fixes decoding of files with large
headers.
Instead of using fixed coefficients, the correct way is to calculate the
coefficients using the highpass cutoff frequency from the ADX stream header
and the sample rate.
This multiplication can overflow the signed range but not the
unsigned. After right-shifting it will thus fit in the signed
range again.
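
A minimal sketch of the technique, assuming both operands are
non-negative and a 32-bit int; mul_shift() is a hypothetical helper, not
the code touched by this change.

```c
#include <assert.h>
#include <stdint.h>

/* Multiply two non-negative values whose product may exceed INT32_MAX but
 * not UINT32_MAX, then shift the result back into the signed range. */
static int32_t mul_shift(int32_t a, int32_t b, unsigned shift)
{
    assert(a >= 0 && b >= 0 && shift >= 1 && shift < 32);
    assert((uint64_t)a * (uint64_t)b <= UINT32_MAX);
    return (int32_t)(((uint32_t)a * (uint32_t)b) >> shift);
}
```

Signed overflow is undefined behaviour in C, whereas the unsigned
multiplication is well defined, which is why the cast comes before the
multiply rather than after it.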
Signed-off-by: Mans Rullgard <mans@mansr.com>
It makes more sense for a bit mask to use an unsigned type.
The change should be source and binary compatible on all
supported systems, hence micro version bump.
Fixes a few invalid shifts.
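
The kind of shift this refers to can be sketched as follows. bit_mask()
is a hypothetical helper assuming a 32-bit mask; it is not taken from the
changed code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: build a mask with bit n set. An unsigned constant
 * keeps n == 31 well defined; 1 << 31 on a 32-bit int is not. */
static uint32_t bit_mask(unsigned n)
{
    assert(n < 32);
    return UINT32_C(1) << n;
}
```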
Signed-off-by: Mans Rullgard <mans@mansr.com>
This is a hand-tuned version of the code with impossible parts of
the FASTDIV function omitted.
2-5% faster overall on Cortex-A8.
Signed-off-by: Mans Rullgard <mans@mansr.com>
The initial values are not checked against the number of block sizes.
Initializing them to frame_len_bits will result in a block size index of 0
in these cases instead of something that might be out-of-range.
Fixes Bug 81.
This prevents build errors when compiler and assembler default
targets differ. Ideally each file would declare the highest
level it requires. This is however not easily possible as it
complicates assembling pre-armv6t2 code in Thumb-2 mode.
HAVE_NEON is used as indicator for ARMv7-A since no other
symbol exists for this and NEON is only available in this
variant.
Signed-off-by: Mans Rullgard <mans@mansr.com>
Adding the thread count in frame level multithreading to has_b_frames
as an additional delay causes more problems than it solves.
One example is inconsistent behaviour during timestamp calculation in
libavformat.
Thread count and frame level multithreading are both set by the user.
If the additional delay caused by frame level multithreading needs
to be considered in the calling code, it has all the information to
take it into account.
Should it become necessary to calculate a maximum delay inside
libavcodec, it should be exported as its own field instead of reusing
an existing field.
Based on a patch by Michael Niedermayer.
Signed-off-by: Janne Grunau <janne-libav@jannau.net>
A new field, AVCodecContext.internal is used to hold a new struct
AVCodecInternal, which has private fields that are not codec-specific and are
used by general libavcodec functions.
Moved internal_buffer, internal_buffer_count, and is_copy.
libavcodec/options.c:583: warning: assignment discards qualifiers from pointer
target type
libavcodec/options.c:589: warning: initialization discards qualifiers from
pointer target type