The frame dimensions are 16-bit, so the MV bounds can easily overflow
int16_t for large videos.
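Roughly, the safe pattern is to widen before the arithmetic; a sketch
with illustrative names (not the actual bound computation):

    #include <stdint.h>
    #include "libavutil/common.h"

    /* Do the bound arithmetic in int so 16-bit dimensions cannot
     * overflow intermediates; clip back into the int16_t field last. */
    static void clip_mv_x(int16_t *mvx, int frame_width, int block_x)
    {
        int xmin = -block_x * 8;
        int xmax = (frame_width - block_x) * 8; /* may exceed INT16_MAX */
        *mvx = av_clip(*mvx, xmin, xmax);
    }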
Bug-Id: Handbrake/46
CC: libav-stable@libav.org
Signed-off-by: Anton Khirnov <anton@khirnov.net>
Roughly 25% faster MC than ssse3 for block sizes 32 and 64.
Reviewed-by: Ronald S. Bultje <rsbultje@gmail.com>
Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Anton Khirnov <anton@khirnov.net>
pavgb is an SSE integer instruction that MMXEXT CPUs also provide, so the
mmxext flag is enough
Signed-off-by: James Almer <jamrial@gmail.com>
Reviewed-by: "Ronald S. Bultje" <rsbultje@gmail.com>
Signed-off-by: Anton Khirnov <anton@khirnov.net>
This reverts commit 25bacd0a0c.
Since 230b1c070, the bytewise AV_W*() macros only expand their
argument once, so revert to the more readable version of these.
Signed-off-by: Martin Storsjö <martin@martin.st>
AV_WB32 can be implemented as a macro that expands its parameters
multiple times (in case AV_HAVE_FAST_UNALIGNED isn't set and the
compiler doesn't support GCC attributes); make sure not to read
multiple times from the source in this case.
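For illustration, the safe pattern looks like this (the helper name is
made up for the example):

    #include <stdint.h>
    #include "libavutil/intreadwrite.h"

    /* If AV_WB32 is the byte-wise fallback macro, its value argument may
     * be expanded once per byte, so an expression with side effects
     * would read from the source several times. */
    static void copy32(uint8_t *dst, const uint8_t **src)
    {
        uint32_t v = AV_RB32(*src); /* evaluate the source exactly once */
        *src += 4;
        AV_WB32(dst, v);            /* re-expanding 'v' is harmless */
    }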
Signed-off-by: Martin Storsjö <martin@martin.st>
The reference frames are used in update_thread_context(), so modifying
them after finish_setup() is a race. The frame in question will be
released during the next decode call.
CC: libav-stable@libav.org
The current code will ignore the init_get_bits() failure and do an
invalid read from the uninitialized GetBitContext.
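The fix boils down to checking the return value before any reads;
roughly (the surrounding context is illustrative):

    #include "get_bits.h"

    static int parse_header(const uint8_t *buf, int size)
    {
        GetBitContext gb;
        int ret = init_get_bits(&gb, buf, 8 * size);
        if (ret < 0)
            return ret; /* gb must not be used after a failed init */
        /* ... read from gb only past this point ... */
        return 0;
    }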
Found-By: Jan Ruge <jan.s.ruge@gmail.com>
Bug-Id: 952
This fixes retrieving a valid profile for many of the FATE conformance
samples, allowing them to be properly decoded by the hwaccel after a
profile check is added.
Signed-off-by: Anton Khirnov <anton@khirnov.net>
Originally written by Pierre Edouard Lepere <pierre-edouard.lepere@insa-rennes.fr>.
Integrated into Libav by Josh de Kock <josh@itanimul.li>.
Signed-off-by: Alexandra Hájková <alexandra@khirnov.net>
Based on patch 250430bf28
by Mickaël Raulet <mraulet@insa-rennes.fr>, integrated
into Libav by Josh de Kock <josh@itanimul.li>.
Signed-off-by: Alexandra Hájková <alexandra@khirnov.net>
Since we only know whether a NAL unit corresponds to a new field after
parsing the slice header, this requires reorganizing the calls to slice
parsing, per-slice/field/frame init and actual decoding.
In the previous code, the function for slice header decoding also
immediately started a new field/frame as necessary, so any slices
already queued for decoding would no longer be decodable.
After this patch, we first parse the slice header, and if we determine
that a new field needs to be started we decode all the queued slices.
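In outline (all helper names here are made up, not the actual
functions):

    static int handle_slice_nal(H264Context *h, H264SliceContext *sl,
                                const H2645NAL *nal)
    {
        int ret = parse_slice_header(h, sl, nal); /* touches no state */
        if (ret < 0)
            return ret;
        if (slice_starts_new_field(h, sl)) {
            decode_queued_slices(h); /* drain the previous field first */
            field_start(h, sl);      /* only then start the new one */
        }
        return queue_slice(h, sl);
    }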
This function's purpose is not very well defined. Currently it does two
(only marginally related) things: selecting the next output frame and
calling ff_thread_finish_setup() for frame threading. The first of those
more properly belongs under field_start(), while the second can be
called directly from decode_nal_units().
Fixes a regression in ca2f19b9cc with some mov/mp4 files. The files have
several NAL units inside the supposedly single NAL unit after the size field.
Annex B start code prefixes are used to separate them. The first NAL unit
is correctly parsed but the buffer does not point to the next size field.
Instead, semi-random data (it seems to be the rbsp_stop_one_bit and the
start code prefix) is then parsed as length and will exceed the
remaining length of the buffer.
Patch based on the code in h264's decode_nal_units() and a similar
patch by Hendrik Leppkes in FFmpeg (a9bb4cf87d).
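The resync amounts to scanning for the next three-byte start code
instead of trusting the following bytes to be a size field; roughly:

    #include <stdint.h>

    /* Illustrative: find the next Annex B start code (00 00 01). */
    static const uint8_t *next_start_code(const uint8_t *p,
                                          const uint8_t *end)
    {
        while (end - p >= 3) {
            if (p[0] == 0 && p[1] == 0 && p[2] == 1)
                return p;
            p++;
        }
        return end;
    }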
Bug-Id: ffmpeg/trac5529
Reported-By: Vittorio Giovara
Currently, SPS.mb_height is actually what the spec calls
PicHeightInMapUnits, which is half the frame height when interlacing is
allowed. Calling this 'mb_height' is quite confusing, and there are at
least two associated bugs where this field is treated as the actual
frame height - in the h264 parser and in the code computing maximum
reordering buffer size for a given level.
Fix those issues (and avoid possible future ones) by exporting the real
frame height in this field.
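The fix follows the spec relationship; in sketch form:

    /* FrameHeightInMbs = (2 - frame_mbs_only_flag) * PicHeightInMapUnits,
     * so export the real frame height in MBs rather than map units: */
    sps->mb_height = pic_height_in_map_units *
                     (2 - sps->frame_mbs_only_flag);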
This comment isn't true: the height can be different from the width
for these functions (which is why the height is passed as a parameter
to them).
Signed-off-by: Martin Storsjö <martin@martin.st>
GNU as evaluates true as '-1' while Apple's variant and llvm's internal
assembler evaluate it as '1'. The best way to avoid this madness is to
eliminate boolean expressions instead of trying to fix it with
preprocessor directives. Use a direct formula to calculate the
required temporary space on the stack in
ff_put_vp8_{epel,bilin}{4,8,16}_h[246]v[246]_armv6().
Fixes a checkasm segfault in vp8dsp.mc when using llvm's internal
assembler for a non-Apple target.
While it is less featureful (and slower) than the built-in H264
decoder, one could potentially want to use it to take advantage
of the Cisco patent license offer.
Signed-off-by: Martin Storsjö <martin@martin.st>
Previously we would allocate a new one for every frame. This instead
maintains an AVBufferPool of them to use as needed.
Also makes the maximum size of an output buffer adapt to the frame
size - the fixed upper bound was a bit too easy to hit when encoding
large pictures at high quality.
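A sketch of the per-frame path with a pool (creating the pool with
av_buffer_pool_init() at init time and freeing it with
av_buffer_pool_uninit() at close time are elided):

    #include "libavutil/buffer.h"
    #include "libavutil/error.h"

    static int get_output_buffer(AVBufferPool *pool, AVBufferRef **out)
    {
        AVBufferRef *buf = av_buffer_pool_get(pool); /* recycled buffer */
        if (!buf)
            return AVERROR(ENOMEM);
        *out = buf; /* unref when done; that returns it to the pool */
        return 0;
    }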
The encode function is supposed to just return 0 on success.
This stems from a mixup with the return value of decode functions.
Signed-off-by: Martin Storsjö <martin@martin.st>
Currently it's exported as AVFrame.pkt_pts, which is also the only use
for that field. The reason it is done like this is that lavc used to
export various codec-specific "timing" information in AVFrame.pts, which
is not done anymore.
Since it is confusing to the callers to have a separate field which is
used only for decoder timestamps and nothing else, deprecate pkt_pts and
use just AVFrame.pts everywhere.
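For callers, the change is a one-line sketch:

    /* before (deprecated): int64_t ts = frame->pkt_pts; */
    int64_t ts = frame->pts; /* decoders now set this directly */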
This is a more appropriate place for this. H264Context.recovery_frame is
shared between frame threads, so modifying it where it is right now is
invalid.
Move the NAL unit types into it. This will make it possible to stop including the
whole decoder-specific h264dec.h in some code that is unrelated to the
decoder and only needs some enum values.
Right now this code is mixed with selecting the next output frame. Move
it to a separate function called from h264_field_start(), which is a
more appropriate place for this.
While the value of those variables will be constant for the whole frame,
they are only used in two functions called from slice header decoding.
Moving them to the per-slice context allows us to make the H264Context
passed to slice_header_parse() constant.
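Which permits a signature along these lines (a sketch of the intent):

    /* Parsing can no longer mutate the decoder context: */
    static int h264_slice_header_parse(const H264Context *h,
                                       H264SliceContext *sl,
                                       const H2645NAL *nal);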
There is no bitstream parsing in that block and messing with
decoder-global state is not something that belongs into header parsing.
Nothing else in this function depends on the value of current_slice,
except for two validity checks. Those checks are also moved out of
slice_header_parse().
Replace the decoder-global nal_unit_type/nal_ref_idc variables with the
per-NAL ones. The decoder-global ones still cannot be removed because
they are used by hwaccels.
This mimics the behaviour of other av_*_new_side_data().
This is not caught by the malloc check, since padding
is always added to the allocated size.
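The added check, as a sketch:

    #include <limits.h>

    /* malloc only sees (size + padding), so it cannot catch a size
     * that overflows once the padding is added; check up front: */
    if (size > INT_MAX - AV_INPUT_BUFFER_PADDING_SIZE)
        return NULL;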
Signed-off-by: Vittorio Giovara <vittorio.giovara@gmail.com>
The only difference is that the first of them contains a
ff_h264_flush_change() call. While that is not necessary in the second
block, it should cause no problems either.
Reduce the verbosity of the reinit log message from info to verbose,
since now it will be displayed during every decode session.
Do it right before the MMCOs are applied to the DPB. This will allow
moving the frame_start() call out of the slice header parsing, since
generating the implicit MMCOs needs to be done after frame_start().
They are stored in the slice header, so technically they are per-slice
(though they must be the same in every slice). This will simplify the
following commits.
This function does not do any bitstream parsing and it depends on the
current frame being allocated, so this will allow frame_start() to
be moved out eventually.
This will allow postponing the reference list construction (and by
consequence some other functions, like frame_start) until the whole
slice header has been parsed.
Currently it's done in the code that initialises the ref list for
MBAFF, which is not a logical place for it. Move it to the function that
parses the pred table from the bitstream, which is analogous to what is
done for the implicit weight table as well.
That function is currently very long and entangles bitstream parsing and
decoder configuration. This makes the code much harder to read than
necessary.
Begin splitting the code that configures the decoder state (based on the
parsed slice header information) from the slice header parsing itself.
This avoids the danger that get_bits.h might get indirectly #included before
BITSTREAM_READER_LE is defined.
Also sort headers into canonical order where appropriate.
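The hazard being avoided, as a sketch (header names illustrative):

    /* Wrong: get_bits.h may already have been pulled in by the time the
     * define is seen:
     *     #include "golomb.h"          // indirectly includes get_bits.h
     *     #define BITSTREAM_READER_LE  // too late
     * Right: define it before any (indirect) inclusion: */
    #define BITSTREAM_READER_LE
    #include "golomb.h"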
According to avcodec.h, avcodec_decode_video2 should return the number of
bytes used if a frame was decoded.
The current implementation returns the packet size minus the consumed size
of all the subframes, even though the superframe is always fully consumed.
This fixes VLC bug https://trac.videolan.org/vlc/ticket/16836.
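As a sketch of the corrected contract:

    /* The whole superframe was consumed, so report its full size: */
    if (got_frame)
        return size;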
Signed-off-by: Diego Biurrun <diego@biurrun.de>
Split version files into one line per symbol/directive to allow compatibility
with the Solaris linker without preprocessing, and eliminate $ from version
file templates to simplify the postprocessing shell command.
Experimental; requires Skylake and VAAPI 0.39.1 (not yet released).
Also increases the allowed range of the quality option - in low-power
mode, the Intel driver supports levels 1-8 (and 0 meaning default).
Non-reference frames (nal_ref_idc == 0) should be discardable, so
frame_num does not advance after them. Before this change, a stream
containing unreferenced B-frames would be rejected by the reference
decoder.
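The rule, sketched with illustrative names:

    /* frame_num advances only after reference pictures
     * (MaxFrameNum is a power of two, hence the mask): */
    if (nal_ref_idc != 0)
        frame_num = (frame_num + 1) & (max_frame_num - 1);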
This prevents attempts to use unsupported modes, such as low-power
H.264 mode on non-Skylake targets. Also fixes a crash on invalid
configuration, when trying to destroy an invalid VA config/context.
We cannot deprecate it until the new parser API is in place, because of
the way libavformat works. But the majority of the users can already
simply replace it with avcodec_free_context(), which will simplify the
transition once it is finally deprecated.
This function is supposed to "reset" a codec context to a clean state so
that it can be opened again. The only reason it exists is to allow using
AVStream.codec as a decoding context (after it was already
opened/used/closed by avformat_find_stream_info()). Since that behaviour
is now deprecated, there is no reason for this function to exist
anymore.
Since AVCodecContext contains a lot of complex state, copying a codec
context is not a well-defined operation. The purpose for which it is
typically used (which is well-defined) is copying the stream parameters
from one codec context to another. That is now possible through the
AVCodecParameters API. Therefore, there is no reason for
avcodec_copy_context() to exist.
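The parameter-copy use case now reads, in sketch form:

    #include "libavcodec/avcodec.h"

    static int copy_stream_params(AVCodecContext *dst,
                                  const AVCodecContext *src)
    {
        AVCodecParameters *par = avcodec_parameters_alloc();
        int ret;

        if (!par)
            return AVERROR(ENOMEM);
        ret = avcodec_parameters_from_context(par, src);
        if (ret >= 0)
            ret = avcodec_parameters_to_context(dst, par);
        avcodec_parameters_free(&par);
        return ret;
    }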
Initialize the bit buffer with the correct size (the number of bits that
will actually be read) instead of relying on the bitstream reader to
overread and still produce the correct values.
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
Signed-off-by: Diego Biurrun <diego@biurrun.de>
For reasons we are not privy to, nvidia decided that the nvenc encoder
should apply aspect ratio compensation to 'DVD like' content, assuming that
the content is not BT.601 compliant, but needs to be BT.601 compliant. In
this context, that means that they make the following, questionable,
assumptions:
1) If the input dimensions are 720x480 or 720x576, assume the content has
an active area of 704x480 or 704x576.
2) Assume that whatever the input sample aspect ratio is, it does not account
for the difference between 'physical' and 'active' dimensions.
From these assumptions, they then conclude that they can 'help', by adjusting
the sample aspect ratio by a factor of 45/44. And indeed, if you wanted to
display only the 704 wide active area with the same aspect ratio as the full
720 wide image - this would be the correct adjustment factor, but what if you
don't? And more importantly, what if you're used to lavc not making this kind
of adjustment at encode time - because none of the other encoders do this!
And, what if you had already accounted for BT.601 and your input had the
correct attributes? Well, it's going to apply the compensation anyway!
So, if you take some content, and feed it through nvenc repeatedly, it
will keep scaling the aspect ratio every time, stretching your video out
more and more and more.
So, clearly, regardless of whether you want to apply BT.601 aspect ratio
adjustments or not, this is not the way to do it. With any other lavc
encoder, you would do it as part of defining your input parameters or do
the adjustment at playback time, and there's no reason why nvenc should
be any different.
This change adds some logic to undo the compensation that nvenc would
otherwise do.
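In sketch form (the exact trigger condition in the code may differ):

    #include "libavutil/rational.h"

    /* Pre-scale the SAR by 44/45 for 'DVD-like' dimensions so the
     * driver's unconditional 45/44 adjustment cancels out: */
    AVRational sar = avctx->sample_aspect_ratio;
    if (avctx->width == 720 &&
        (avctx->height == 480 || avctx->height == 576))
        sar = av_mul_q(sar, (AVRational){ 44, 45 });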
nvidia engineers have told us that they will work to make this
compensation mechanism optional in a future release of the nvenc
SDK. At that point, we can adapt accordingly.
Signed-off-by: Philip Langdale <philipl@overt.org>
Reviewed-by: Timo Rothenpieler <timo@rothenpieler.org>
Signed-off-by: Anton Khirnov <anton@khirnov.net>
The code needs only a few definitions from cuda.h, so define them
directly when CUDA is not enabled. CUDA is still required for accepting
HW frames as input.
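The fallback amounts to something like this (the exact list of
stand-ins may differ):

    #if CONFIG_CUDA
    #include <cuda.h>
    #else
    /* Minimal local stand-ins for the few cuda.h types referenced: */
    typedef int   CUresult;
    typedef int   CUdevice;
    typedef void *CUcontext;
    typedef void *CUdeviceptr;
    #endif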
Based on the code by Timo Rothenpieler <timo@rothenpieler.org>.