Altivec can only load naturally aligned vectors. To handle possibly
unaligned data a second vector is loaded from an offset of the original
location and the data is recovered through a vector permutation.
Overreads are minimal if the offset for the second load points to the last
element of data. This is 7 for loading eight 8-bit pixels and overreads
are reduced from 16 bytes to 8 bytes if the pixels are 64-bit aligned.
For unaligned pixels the overread is reduced from 23 bytes to 15 bytes
in the worst case.
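
The byte counts above can be reproduced with plain integer arithmetic.
The following is only a sketch, not the AltiVec code itself: the helper
names are hypothetical, and the previous second-load offset of 15 is an
assumption that happens to match the quoted 16 and 23 byte figures.

```c
#include <assert.h>

/* Bytes read past the end of the wanted data when two aligned 16-byte
 * vector loads fetch `len` bytes that start `misalign` bytes into a
 * vector. `second_offset` is the byte offset given to the second load;
 * the hardware rounds the resulting address down to a multiple of 16. */
static int overread_bytes(int misalign, int len, int second_offset)
{
    assert(misalign >= 0 && misalign < 16 && len > 0 && second_offset >= 0);
    int last_vector = (misalign + second_offset) / 16; /* vector hit by 2nd load */
    int loaded_end  = (last_vector + 1) * 16;          /* first byte not loaded */
    return loaded_end - (misalign + len);
}

/* Worst case over all misalignments that are multiples of `step`
 * (step 8 models 64-bit aligned pixels, step 1 arbitrary alignment). */
static int worst_overread(int len, int second_offset, int step)
{
    int worst = 0;
    for (int m = 0; m < 16; m += step) {
        int o = overread_bytes(m, len, second_offset);
        if (o > worst)
            worst = o;
    }
    return worst;
}
```

With len = 8, comparing worst_overread(8, 15, step) against
worst_overread(8, 7, step) for step 8 and step 1 gives the aligned and
unaligned cases described above.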
The official Ut Video decoder only threads with slices, so until now
files encoded by the libavcodec encoder could only be decoded with a
single thread. The default slice count is now set to
subsampled_height / 120.
Also sets slices to 1 for the Ut Video encoder tests to keep them
green.
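
A minimal sketch of the default described above. The helper name is
hypothetical, and the clamp to the range [1, 256] is an assumption of
this sketch rather than something stated in this message.

```c
#include <assert.h>

/* Hypothetical helper: pick a slice count when the user did not set one.
 * The division by 120 comes from the text above; the lower bound of 1
 * and the upper bound of 256 are assumptions of this sketch. */
static int default_slice_count(int subsampled_height)
{
    assert(subsampled_height > 0);
    int slices = subsampled_height / 120;
    if (slices < 1)
        slices = 1;
    if (slices > 256)
        slices = 256;
    return slices;
}
```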
Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
User data is usually coded before slice data. That means the frame
the user data belongs to is not available while parsing the user data.
The stereo3D side data has to use the same indirection over the private
context as pan scan information and A53 captions.
Bug-Id:632
AVFrame.sample_rate is set in ff_get_buffer, but aacdec calls
ff_get_buffer before the samplerate is known. So it needs to be
set again before returning the frame.
It was done only in check_mvset(), while mv_scale() is called also by
dist_scale().
Sample-Id: 00001579-google
Reported-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind
CC: libav-stable@libav.org
When request_channel_layout is 0,
all substreams should be decoded.
Thanks to Michael Niedermayer for spotting.
Also fix a mismatch between the parser and
decoder when request_channel_layout is a
subset of Stereo.
Don't decode further substreams if request_channel_layout
is a subset of the current substream's channel_layout.
Before, we would only discard further substreams if
request_channel_layout matched the substream's
channel_layout exactly, thus decoding additional
channels which the caller would probably end up downmixing.
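
The subset test can be sketched as a bitmask check. The channel bit
names and the helper below are illustrative only and are not taken from
any header; the "0 means decode everything" rule follows the earlier
message about request_channel_layout.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative channel bits for this sketch only. */
#define CH_FRONT_LEFT   (UINT64_C(1) << 0)
#define CH_FRONT_RIGHT  (UINT64_C(1) << 1)
#define CH_LFE          (UINT64_C(1) << 3)
#define LAYOUT_STEREO   (CH_FRONT_LEFT | CH_FRONT_RIGHT)

/* Returns 1 if decoding can stop after a substream with the given
 * layout: the request must be non-zero and every requested channel
 * must already be present in the substream. */
static int can_stop_after(uint64_t request_layout, uint64_t substream_layout)
{
    assert(substream_layout != 0); /* a substream carries at least one channel */
    if (!request_layout)
        return 0;                  /* no request: decode all substreams */
    return (request_layout & substream_layout) == request_layout;
}
```

An exact-match comparison would return 0 for a request of CH_FRONT_LEFT
against a stereo substream; the subset test returns 1.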
The x86 runs short on registers because numerous elements are not static.
In addition, splitting them allows more optimized code, at least for x86.
Arm asm changes by Janne Grunau.
Signed-off-by: Janne Grunau <janne-libav@jannau.net>
For the callable function (as opposed to the inline one):
         C  SSE  SSE2  SSE4
Win32:  47   42    29    26
Win64:  30   33    25    23
The SSE version is neither compiled nor set for ARCH_X86_64, as the
inlinable function takes over.
Signed-off-by: Janne Grunau <janne-libav@jannau.net>
It is currently declared as a macro that is set to one of several
inlinable functions, among them a NEON and a default C implementation.
Add a DSP parameter to each inline function, unused except by the
default C implementation which calls a function from the DSP context.
On an Arrandale CPU, gain for an inlined SSE2 function vs. a call:
- Win32: 29 to 26 cycles
- Win64: 25 to 23 cycles
Signed-off-by: Janne Grunau <janne-libav@jannau.net>
Also adjust header #include order and some comments.
Signed-off-by: Diego Biurrun <diego@biurrun.de>
Signed-off-by: Vittorio Giovara <vittorio.giovara@gmail.com>
Prevents using GetBitContexts with data from previous calls.
Fixes access to freed memory.
Found-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind
CC:libav-stable@libav.org
With CLI usage the decoder might not have set the colorspace by the
time the encoder is initialized; a manual colorspace override might be
needed in such cases.
Signed-off-by: Anton Khirnov <anton@khirnov.net>
A dependent slice cannot have address 0.
Prevent an out of array bound load in ff_hevc_cabac_init().
Sample-Id: 00001406-google
Reported-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind
CC: libav-stable@libav.org
According to my understanding of T-REC-H.265-2013044 chapter 8.6.1.
Sample-Id: 00001438-google
Reported-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind
CC: libav-stable@libav.org
Makes it easier to recreate an AVCodecContext for ATRAC3+ decoding,
which is needed in multimedia frameworks, as well as in general cases
where demuxing and decoding are separate entities.
Should fix crashes or corrupt output on pre-SSE2 CPUs when they were
using SSE2-code (e.g. AMD Athlon XP 2400+ or Intel Pentium III) in
hfix or hvar single-edge (left/right) extension functions.
Signed-off-by: Janne Grunau <janne-libav@jannau.net>
- The memcpy was completely wrong because
s->prob_ctx[s->framectxid].coef is a [4][2][2][6][6][3] array, whereas
s->prob.coef is a [4][2][2][6][6][11] array.
- The additional check was committed to ffmpeg by Ronald S. Bultje.
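
Because the innermost dimensions differ (3 versus 11), a single flat
memcpy over the whole array cannot line the rows up. A sketch of a
row-by-row copy follows; the type and function names are hypothetical,
and the copy direction (working array into saved context) is an
assumption of this sketch.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

typedef uint8_t SavedCoef[4][2][2][6][6][3];  /* shape of prob_ctx[].coef */
typedef uint8_t WorkCoef[4][2][2][6][6][11];  /* shape of prob.coef */

/* Copy the first 3 entries of every innermost row of `src` into `dst`.
 * Each row is copied separately because the row strides differ. */
static void save_coef(SavedCoef *dst, WorkCoef *src)
{
    assert(dst && src);
    for (int i = 0; i < 4; i++)
        for (int j = 0; j < 2; j++)
            for (int k = 0; k < 2; k++)
                for (int l = 0; l < 6; l++)
                    for (int m = 0; m < 6; m++)
                        memcpy((*dst)[i][j][k][l][m],
                               (*src)[i][j][k][l][m],
                               sizeof((*dst)[i][j][k][l][m]));
}
```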
Signed-off-by: Anton Khirnov <anton@khirnov.net>
This handles macroblock edges for the chroma components in the same way
as for the luma component for 4:4:4 streams. The spec explicitly states
that the deblocking filter is not applied to edges at the boundary of
the picture.
Signed-off-by: Janne Grunau <janne-libav@jannau.net>
The default get_buffer2() implementation (and possibly some
user ones) does not allocate edges when this flag is set, which may
expose bugs in some decoders. Until the 10 release is out, it is safer
to remove this part.
The T-REC-H.265-2013044 page 79 states it has to be in the range
[-s->sps->qp_bd_offset, 51].
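
A sketch of such a range check. The helper name is hypothetical; the
relation qp_bd_offset = 6 * (bit_depth - 8) is how H.265 defines
QpBdOffset, and is computed locally here rather than read from an SPS
structure.

```c
#include <assert.h>

/* Returns 1 if `qp` lies in [-qp_bd_offset, 51] for the given luma
 * bit depth, 0 otherwise. */
static int qp_in_range(int qp, int bit_depth)
{
    assert(bit_depth >= 8);
    int qp_bd_offset = 6 * (bit_depth - 8);
    return qp >= -qp_bd_offset && qp <= 51;
}
```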
Sample-Id: 00001386-google
Reported-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind
CC: libav-stable@libav.org
The src buffer should only contain values in the interval
[0, (1 << BIT_DEPTH) - 1].
Since shift = (BIT_DEPTH - 5), src[x] >> shift must be in
the interval [0, 31], so no clip is needed.
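
The bound can be checked with a few lines of C. The helper name is
hypothetical; it only restates the arithmetic above.

```c
#include <assert.h>

/* Keep the top five bits of a sample that fits in `bit_depth` bits.
 * For any valid sample the result is in [0, 31], so no clip is needed. */
static int top_five_bits(unsigned sample, int bit_depth)
{
    assert(bit_depth >= 5 && bit_depth <= 16);
    assert(sample < (1u << bit_depth));
    return (int)(sample >> (bit_depth - 5));
}
```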
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
The reconstructed picture should always be clipped (see section 8.6.5),
previously we did not clip coding units where
cu_transquant_bypass_flag == 1.
Sample-Id: 00001325-google
Reported-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind
CC: libav-stable@libav.org
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
And use unsigned datatypes.
Otherwise it would overflow.
Sample-Id: 00001315-google
Reported-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind
CC: libav-stable@libav.org
This matches how it is done for SPS/PPS.
Fixes null pointer dereference.
Found-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
Fixes an issue where the B-frame coding mode switches from interlaced
fields to interlaced frames, causing incorrect decisions in the motion
compensation code and resulting in visual artifacts.
CC: libav-stable@libav.org
Signed-off-by: Tim Walker <tdskywalker@gmail.com>
An invalid VUI is not considered a fatal error, so the SPS containing it
may still be used. Leaving an invalid value of num_reorder_frames there
can result in writing over the bounds of H264Context.delayed_pic.
Found-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind
CC:libav-stable@libav.org
Otherwise the ER code might try to use some already freed references.
Fixes possible access to freed memory.
Found-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind
CC:libav-stable@libav.org
Higher modes are not allowed for 16x16/chroma, which is what this
function is used for. Otherwise this function would return 0 (vertical
prediction) for invalid higher modes, which could result in invalid
reads.
Found-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind
CC:libav-stable@libav.org
In this case we may not have a current frame, while first_field being
set implies we do.
Fixes invalid reads.
Found-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind
CC:libav-stable@libav.org
They are not measurably faster on x86. They might be somewhat faster on
other platforms due to missing emu edge SIMD, but the gain is not large
enough to justify the added complexity.
Several decoders disable those anyway and they are not measurably faster
on x86. They might be somewhat faster on other platforms due to missing
emu edge SIMD, but the gain is not large enough (and those decoders
relevant enough) to justify the added complexity.
The function macro always sets .align 2 before declaring the
function label (since 5c5e1ea3) and always sets the section to
.text (since 278caa6a).
The .align 5 directives before certain functions, added in fc252eba,
predate the addition of .text and .align to the function macro and thus
became useless once the function macro got them.
This restores the original intention, to align the loop entry
points.
Signed-off-by: Martin Storsjö <martin@martin.st>
This file no longer uses the pld instruction at all; all such uses
have been split into hpeldsp_arm.S.
Signed-off-by: Martin Storsjö <martin@martin.st>
There is no point in delaying the check and it avoids bugs with a
half-initialized context.
Fixes invalid reads.
Found-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind
CC:libav-stable@libav.org
They end up overwriting past the line end.
Partially based on a patch by Michael Niedermayer <michaelni@gmx.at>
Bug-Id: vlc/9700
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
Signed-off-by: Anton Khirnov <anton@khirnov.net>
The decoder currently sets CODEC_FLAG_EMU_EDGE and relies on
get_buffer2() to always provide buffers with linesize == 2 * width.
This is wrong, since we place no such restriction on get_buffer2()
implementations.
Fix this by decoding into internal buffers and copying them to output
frames. Since this is a very obscure decoder, the performance hit should
not be an issue.
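
The copy step could look roughly like the sketch below. The function
name and parameters are hypothetical; the point is only that each row is
copied using the destination's own linesize, whatever get_buffer2()
chose.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Copy `height` rows of `row_bytes` bytes from an internal buffer to an
 * output plane. The two buffers may have different linesizes; padding
 * bytes in the destination are left untouched. */
static void copy_plane(uint8_t *dst, ptrdiff_t dst_linesize,
                       const uint8_t *src, ptrdiff_t src_linesize,
                       size_t row_bytes, int height)
{
    assert(dst && src && height >= 0);
    for (int y = 0; y < height; y++) {
        memcpy(dst, src, row_bytes);
        dst += dst_linesize;
        src += src_linesize;
    }
}
```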
When downmixing 2.1 to 2-channel, if the 2.0 portion is Lt/Rt,
sum-difference or dual mono, the actual output will be the same (with
the LFE either mixed-in or discarded).
Also, when downmixing an arbitrary layout to 2-channel, if the bitstream
contains custom downmix coefficients targeting Lt/Rt, then the output
will be Lt/Rt rather than regular Stereo.
If it was set before then we can end up trying to decode a slice without
a valid slice header, which can lead to invalid memory access.
Found-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind
CC:libav-stable@libav.org
It has been checking the number of bits in the offset instead of the
actual offset.
Found-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind
CC:libav-stable@libav.org
Otherwise the generic code will unref them, which can then result in
last_picture_ptr == current_picture_ptr, which causes deadlocks at least
in rv40.
Found-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind
CC:libav-stable@libav.org
Introduced by 28243b0d35
Intensity compensation is always used once it has been encountered,
because v->next_use_ic is never set back to zero.
Reset v->next_use_ic when resetting v->next_luty/uv.
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
According to the spec, the value of XXX_reserved_zero_44bits should be
ignored, so don't report an error when it's not zero.
Signed-off-by: Anton Khirnov <anton@khirnov.net>
For:
ff_vc1_inv_trans_{8,4}x{8,4}_{dc_,}neon
ff_put_pixels8x8_neon
ff_put_vc1_mspel_mc{0,1,2,3}{0,1,2,3}_neon (except for 00)
Based on ARM assembly code in libavcodec/arm by Rob Clark and Mans
Rullgard.
Signed-off-by: Martin Storsjö <martin@martin.st>
Since pc.state is populated by shifting in from the end of the
32 bit word, the content within pc.state is already in native endian
and should not be read with the AV_R{L,B} functions.
This was already done correctly for state64 above.
This fixes the fate-corepng test on big endian.
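
A small sketch of why the state needs no byte swapping. The function
names are hypothetical and the "IDAT" tag is just an example value; the
state is built with shifts, so it is an ordinary integer on any host and
can be compared against a constant directly.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Shift one byte into a 32-bit state; the newest byte ends up in the
 * low 8 bits regardless of host endianness. */
static uint32_t push_byte(uint32_t state, uint8_t byte)
{
    return (state << 8) | byte;
}

/* Return the index just past the first occurrence of `tag` in `buf`,
 * or -1 if it does not occur. `tag` is a plain integer, for example
 * 0x49444154 for the bytes 'I','D','A','T' in stream order. */
static int find_tag(const uint8_t *buf, size_t size, uint32_t tag)
{
    assert(buf || size == 0);
    uint32_t state = 0;
    for (size_t i = 0; i < size; i++) {
        state = push_byte(state, buf[i]);
        if (i >= 3 && state == tag)
            return (int)(i + 1);
    }
    return -1;
}
```

Reading the state through a little-endian or big-endian memory accessor
would reorder its bytes on one class of hosts, which is the failure mode
described above.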
Signed-off-by: Martin Storsjö <martin@martin.st>
Tables are always allocated now with sufficient space for either progressive
or interlaced content. The alternative would be to detect a change
and reallocate.
This fixes decoding of a sample.
Signed-off-by: Martin Storsjö <martin@martin.st>
This fixes decoding, broken since 7e35037.
This is similar to what was done for the normal mp3 decoder in
f4a86bc9.
Signed-off-by: Martin Storsjö <martin@martin.st>
This is a temporary workaround to allow deprecating
avcodec_get_frame_defaults(). The proper solution will be using a
properly allocated AVFrame in Picture.
Use only proper AVFrame API (no assigning of whole frames, since that
hardcodes sizeof(AVFrame) into lavc).
Make a copy of the side data, so the caller can use av_frame_unref/free
on non-refcounted frames, eliminating the need for
avcodec_get_frame_defaults()/avcodec_free_frame().
Not just on failure. This is the same thing that is done in the audio
path and should prevent leaks in decoders that allocate a frame, but
then end up not writing into it.
The vlc reader cannot handle 0-bit huffman codes. For most
situations WebP uses the "simple" huffman coding for this case,
but that will only handle symbols up to 255. For the LZ77 distance
codes, larger symbol values are needed, so it can happen in rare
cases that a normal huffman table is used that only has a single
symbol.
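
One way to handle the single-symbol case is to special-case it before
touching the bit reader. The sketch below uses hypothetical types and a
simplified fixed-length code for the general path, with MSB-first bit
order chosen arbitrarily; it does not reproduce the actual WebP or vlc
reader code.

```c
#include <assert.h>
#include <stdint.h>

typedef struct BitReader {
    const uint8_t *buf;
    int size_bits;
    int pos;                 /* current bit position */
} BitReader;

/* Read n bits, MSB first. Like the reader described above, this sketch
 * refuses zero-length reads. */
static unsigned read_bits(BitReader *br, int n)
{
    assert(n >= 1 && n <= 16 && br->pos + n <= br->size_bits);
    unsigned v = 0;
    while (n--) {
        int bit = (br->buf[br->pos >> 3] >> (7 - (br->pos & 7))) & 1;
        v = (v << 1) | (unsigned)bit;
        br->pos++;
    }
    return v;
}

typedef struct CodeSketch {
    int nb_symbols;          /* number of symbols in the code */
    int code_bits;           /* fixed code length when nb_symbols > 1 */
    const uint16_t *symbols; /* symbols[code] */
} CodeSketch;

/* A code with a single symbol needs zero bits: return the symbol
 * without consuming any input. */
static int read_symbol(BitReader *br, const CodeSketch *c)
{
    if (c->nb_symbols == 1)
        return c->symbols[0];
    return c->symbols[read_bits(br, c->code_bits)];
}
```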
The encoder uses almost none of the mpegvideo infrastructure, only some
fields from MpegEncContext.
The FATE results change because now an all-zero quant matrix is written
into the file. Since it is not used for anything for ljpeg, this should
not be a problem.