Use sched_getaffinity to determine the number of logical CPUs.
Limits the number of threads to 16 since slice threading of H.264
seems to be buggy with more than 16 threads.
This file does not use anything from get_bits.h but needs
intreadwrite.h.
Signed-off-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: Janne Grunau <janne-libav@jannau.net>
The 'fiel' atoms can be found in H.264 tracks clobbering the extradata.
MJPEG supports non field based extradata, and this data should be
preserved when copying.
Also define a codec capability for codecs that can handle
parameters changed externally between decoded packets.
Signed-off-by: Martin Storsjö <martin@martin.st>
On 32-bit OS X with gcc 4.0/4.2 and shared libraries enabled, the ebx register
is not available, but required to assemble the functions.
This reverts commit 8742a4f to a simplified version of the original constraints.
This fixes a deadlock VLC triggered with multithreaded decoding. The
wait forces one of the current waiters to wake and not the thread
which calls pthread_cond_signal() itself.
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
Interlaced content for most codec requires it.
This patch is a stop-gap pending a serious rework to support
codecs with non 16 pixel macroblocks.
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
Trailing bits are likely to be non-zero if the NAL unit is truncated.
Clearing the bits make overreads of the bitstream less likely in this
case. Fixes playback of
http://streams.videolan.org/streams/mp4/Mr_MrsSmith-h264_aac.mp4 which
has a forbidden byte sequence of 0x00 0x00 0x00 in it SPS.
Start code emulation prevention is only required in Annex B bytestream
packed NAL units. For other coding formats the size is already known.
Looking for a start code prefix can result in false positives like in
http://streams.videolan.org/streams/mp4/Mr_MrsSmith-h264_aac.mp4
which has a false positive in the SPS.
Width and height might get passed as 0 and would cause floating point
exceptions in decode_frame.
Fixes bugzilla #149
Signed-off-by: Janne Grunau <janne-libav@jannau.net>
This was intended as an optimisation for skipped blocks in MPEG2
P-frames and never used elsewhere. Removing this "optimisation"
speeds up MPEG2 decoding by 1-2% (ARM Cortex-A9).
Signed-off-by: Mans Rullgard <mans@mansr.com>
Previously the decoder only worked if the user had set avctx->pix_fmt
manually. For some reason the libavformat tmv demuxer sets this, so
the problem was not visible in avplay etc.
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
The buffer splicing relies on the bitstream reader over-reading
the end of the buffer as declared in init_get_bits(), although
more data is actually present. Manually moving the bitstream
boundary after init_get_bits() allows this to work as expected.
Signed-off-by: Mans Rullgard <mans@mansr.com>
When turned on, H264/CAVLC gets ~15% (CVPCMNL1_SVA_C.264) slower for
ultra-high-bitrate files, or ~2.5% (CVFI1_SVA_C.264) for lower-bitrate
files. Other codecs are affected to a lesser extent because they are
less optimized; e.g., VC-1 slows down by less than 1% (all on x86).
The patch generated 3 extra instructions (cmp, cmovae and mov) per
call to get_bits().
The performance penalty on ARM is within the error margin for most
files, up to 4% in extreme cases such as CVPCMNL1_SVA_C.264.
Based on work (for GCI) by Aneesh Dogra <lionaneesh@gmail.com>, and
inspired by patch in Chromium by Chris Evans <cevans@chromium.org>.