Quite often, the original weights are multiple of 512. By prescaling them
by 1/512 when they are computed (once per frame), no intermediate shifting
is needed, and no prescaling on each call either.
The x86 code already used that trick.
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
VASliceParameterBufferMPEG2.slice_vertical_position shall express
the slice vertical position from the original bitstream. The HW
decoder will correctly decode to the right line computed from the
appropriate top_field_first and is_first_field flags.
This patch aligns with DXVA's definition, which is what most HW and
drivers expect. In particular, Intel PowerVR (Cedarview et al.) and
NVIDIA (through VA-to-VDPAU layer). Since it looks more complex to fix
binary drivers, I aligned the Intel Gen driver (Sandy Bridge et al.)
to this behaviour, while maintaining compatibility with codec layers
not providing this patch yet.
Signed-off-by: Gwenole Beauchesne <gwenole.beauchesne@intel.com>
Signed-off-by: Martin Storsjö <martin@martin.st>
If user opted to present fields as they come, then the first field
picture needs to be submitted to the HW for decoding. In particular,
this fixes MPEG-2 decoding of interlaced streams.
Tested on Intel Cedar Trail, Sandy Bridge and Ivy Bridge platforms.
Someone reported on the ffmpeg-devel@ list this also works on DXVA
(Windows) and other Linux platforms (NVIDIA, through the VA wrapper).
This also means a similar patch to non-hwaccel VDPAU may be necessary.
Note: I believe the SLICE_FLAG_ALLOW_FIELD is useless since the first
field shall always be submitted to the HW anyway. Nobody uses HW accels
(dxva, vaapi, vdpau, etc.) without that flag though.
Signed-off-by: Gwenole Beauchesne <gwenole.beauchesne@intel.com>
Signed-off-by: Martin Storsjö <martin@martin.st>
Also break some long lines, remove codec function placeholder comments
and add spaces in sample/pixel format lists.
Signed-off-by: Martin Storsjö <martin@martin.st>
This fixes the warning:
libavcodec/aacenc.c:524: warning: passing argument 2 of ‘deinterleave_input_samples’ discards qualifiers from pointer target type
pthread_cond_wait is supposed to return an integer,
and indeed does sometimes. Fix its function declaration
to match its behavior and POSIX.
Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
Fixes a floating-point exception further down.
Found-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind
CC: libav-stable@libav.org
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
The square is always passed as 1 whenever the function is called and
thus the if block never gets executed.
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
SHOW_UBITS() is only defined up to n_bits is 25, therefore forbid
values larger than this in get_vlc2() (max_bits). tokens[][] can be
used as an index in deltas[], which has a size of 64, so ensure the
values are smaller than that.
This prevents crashes on corrupt bitstreams.
Found-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind
CC: libav-stable@libav.org
Looks like some LAME versions produce dual stereo mode MP3s with
flags for intensity and middle stereo set. In this mode those flags
should be ignored like the reference decoder and derived ones do.
A call to decode_packet() does not always decode a complete WMA packet.
Moreover, this is not the correct place to document calls that are part
of the public API.
Signed-off-by: Diego Biurrun <diego@biurrun.de>
Also use correct buffer sizes in calls to tm2_read_stream(). Together,
this prevents overreads.
Found-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind
CC: libav-stable@libav.org
Version from vqa header does not dictate which sound chunks may
appear in file.
Signed-off-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: Justin Ruggles <justin.ruggles@gmail.com>
Also remove some write-only variables or write-only variable
assignments, remove internal colorspace conversion to native
endianness (that can be done by swscale much more efficiently),
and some cosmetics.
Prevents running error resilience on a previous frame which will write
to the pic->mb_type[] array of the previous image. The array might
already be re-used for a new image in a subsequent thread, thus cause
two threads to write to the same pic->mb_type[] array, causing a race
condition which can crash in rv34_decode_cbp(), called by
rv34_decode_inter_mb_header() (which accesses mb_type[] twice,
assuming values are maintained, which the race condition breaks).
Found-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind
CC: libav-stable@libav.org
Before this, they were only added to the delayed release queue and not
freed until later. This could lead to unnecessary memory use or buffer
exhaustion.
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
lf_delta.ref[i] and lf_delta.mode[i] were incorrectly reset to 0 if
specific delta value was not updated. Fixed to keep the previous value
if flag indicates that element in question is not updated.
Signed-off-by: Janne Salonen <jsalonen@google.com>
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
This makes sure the reset flag gets set when SBR gets turned back on
and sets control variables for unguided mode back to their defaults.
Found-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind
CC: libav-stable@libav.org
If the next header frame codes zero envelopes the previous frame's
values will be used. Consequently the invalid values must be cleared.
Found-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind
CC: libav-stable@libav.org
Prevents a signflip in the counter, and a subsequent crash because of
overreads/overwrites.
Found-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind
CC: libav-stable@libav.org
Correct handling of errors to prevent hags or crashes is very complex
otherwise.
The frame initializing is also moved from decode_slice() to
decode_frame() for clarity.
Takes encoder delay into account by comparing first the coded page
duration with the calculated page duration. Handles last packet duration
if needed, also by comparing coded duration with calculated duration.
Also does better handling of timestamp generation for packets in the
first page for streamed ogg files where the start time is not
necessarily zero.
This change avoids accessing the segment map of the previous frame if
segmentation is not enabled for the current frame. The caller of
decode_mb_mode() only calls ff_thread_await_progress() on the reference
segmentation index array if segmentation is enabled, so Chromium's TSAN
will report a race when accessing this data while segmentation is not
enabled.
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
An obscure Japanese lossless video codec, originally intended
for use with a remote desktop application.
Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
Signed-off-by: Kostya Shishkov <kostya.shishkov@gmail.com>
It may have improved cross-platform stability, but wasn't the only place in
the encoder with bitexact issues. It is no longer needed because we have FATE
tests for float encoders using fuzzy comparison.
Calling avcodec_flush_buffers() and then avcodec_decode_video2() with
a 0-sized packet (to get remaining buffered frames) could incorrectly
return an old frame from before the avcodec_flush_buffers() call. Add
a loop in ff_thread_flush() to zero the got_frame field of each thread
to ensure the old frames will not be returned.
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
If the dummy frame are not created from a reference frame they could
be deleted untimely resulting in multithreaded decoder waiting on
the current frame to finish.
Noticed by Ronald S. Bultje in the RV34 decoder with a broken file.
If decoding a second complementary field, and the first was
decoded in our thread, mark decoding of that field as complete.
If decoding fails, mark the decoded field/frame as complete.
Do not allow switching between field modes or field/frame mode
between slices within the same field/frame. Ensure that two
subsequent fields cover top/bottom (rather than top/frame,
bottom/frame or such nonsense situations).
Fixes various deadlocks when decoding samples with errors in
reference frames.
Found-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind
CC: libav-stable@libav.org
Progressive images can have only 16 references, error out if there are
more, since the data is almost certainly corrupt, and the invalid value
will lead to random crashes or invalid writes later on.
Found-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind
CC: libav-stable@libav.org
Interlaced images can have 32 references (16 per field), so limiting the
array size to 16 leads to invalid writes.
Found-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind
CC: libav-stable@libav.org
The safe bitstream reader broke it since the buffer size was specified
in bytes instead of bits.
Signed-off-by: Janne Grunau <janne-libav@jannau.net>
CC: libav-stable@libav.org
Parsing the entire NAL as SPS fixes decoding of some AVC bitstreams
with broken escaping. Since the size of the NAL unit is known and
checked against the buffer end we can parse it entirely without buffer
overreads.
Fixes playback of
http://streams.videolan.org/streams/mp4/Mr_MrsSmith-h264_aac.mp4
Signed-off-by: Janne Grunau <janne-libav@jannau.net>
This reverts commit 729ebb2f18.
There was an off-by-one error in the bit mask calculation clearing
actually the last valid bit and causing
http://bugzilla.libav.org/show_bug.cgi?id=227
The broken sample (Mr_MrsSmith-h264_aac.mp4) the commit was fixing
does not work after correcting the off-by-one error.
CC: libav-stable@libav.org
The were broken since August of 2010 without anyone noticing until
three weeks ago. Nobody cares about it anymore and hopefully Marvell
will support NEON like in the PXA978 from now on.
MPC8 allows indices of mpc_CC up to -1, and mpc_SCF up to -6, thus pad
the tables by that much on the left end.
Found-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind
CC: libav-stable@libav.org
An unpaired SCE preceding a CPE only makes sense for front SCEs
preceding the first CPE.
Split from FFmpeg commit a8d67efa53
Signed-off-by: Alex Converse <alex.converse@gmail.com>
Set the element to channel vector (e2c_vec) size to be the maximum
number of aac channel elements. This makes it slightly larger than it
needs to be because CCEs are never mapped to output channel locations.
Also add a check that all input tags (legal or not) will fit.
Split from FFmpeg commit a8d67efa53
Signed-off-by: Alex Converse <alex.converse@gmail.com>
Matroska demuxer needs to recreate tta header, so just display
crc error without aborting.
Signed-off-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
This fixes some invalid memory access caused later in the function
by res_chan[] not being set for all channels. This happens when a
channel doesn't appear a submap. This change simply returns a
decoder error when this situation is detected.
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
We slightly overread the input buffer, so we require
padding at the end of the buffer, as is documented in the
get_bits API. Without padding, we'll read uninitialized
data or beyond the end of the .rodata, which may crash.
Found-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind
CC: libav-stable@libav.org
The codec would keep returning the last decoded frame if the stream
contains B-frames, since it wouldn't clear that frame from the list of
frames to be returned to the user.
Found-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind
CC: libav-stable@libav.org
Since the values are floats, using the float operations
makes sense, improves performance on some CPUs and
makes the code SSE compatible instead of needing SSE2.
Based on suggestion by Jason.
Signed-off-by: Reimar Döffinger <Reimar.Doeffinger@gmx.de>
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
There is only one caller, which does not need the shifting. Other use cases
are situations where different roundings would be needed.
The x86 and neon versions are modified accordingly.
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
The length is even, so some unrolling can be performed. Timings are for x86:
- 32bits: 102c -> 82c
- 64bits: 82c -> 69c
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
This was an incorrect copy-and-paste to a code not needing the original code.
Spotted by Jason in a previous review but forgotten in the commit.
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>