Based on patch 250430bf28
by Mickaël Raulet <mraulet@insa-rennes.fr>, integrated
to Libav by Josh de Kock <josh@itanimul.li>.
Signed-off-by: Alexandra Hájková <alexandra@khirnov.net>
Since we only know whether a NAL unit corresponds to a new field after
parsing the slice header, this requires reorganizing the calls to slice
parsing, per-slice/field/frame init and actual decoding.
In the previous code, the function for slice header decoding also
immediately started a new field/frame as necessary, so any slices
already queued for decoding would no longer be decodable.
After this patch, we first parse the slice header, and if we determine
that a new field needs to be started we decode all the queued slices.
This function's purpose is not very well defined. Currently it does two
(only marginally related) things: selecting the next output frame and
calling ff_thread_finish_setup() for frame threading. The first of those
more properly belongs under field_start(), while the second can be
called directly from decode_nal_units().
Fixes a regression in ca2f19b9cc with some mov/mp4 files. The files have
several NAL units in the supposed single NAL unit after the size field.
Annex B start code prefixes are used to separate them. The first NAL unit
is correctly parsed but the buffer does not point to the next size field.
Instead semi random data (it seems to be the rbsp_stop_one_bit and the
start code prefix) is then parsed as length and will exceed the
remaining length of the buffer.
Patch based on the code in h264's decode_nal_units() and a similar
patch by Hendrik Leppkes in FFmpeg (a9bb4cf87d).
Bug-Id: ffmpeg/trac5529
Reported-By: Vittorio Giovara
Currently, SPS.mb_height is actually what the spec calls
PicHeightInMapUnits, which is half the frame height when interlacing is
allowed. Calling this 'mb_height' is quite confusing, and there are at
least two associated bugs where this field is treated as the actual
frame height - in the h264 parser and in the code computing maximum
reordering buffer size for a given level.
Fix those issues (and avoid possible future ones) by exporting the real
frame height in this field.
This comment isn't true, the height can be different from the width
for these functions (which is why the height is passed as a parameter
to them).
Signed-off-by: Martin Storsjö <martin@martin.st>
GNU as evaluates true as '-1' while Apple's variant and llvm's internal
assembler evaluate it as '1'. The best way to avoid this madness is to
eliminate boolean expressions instead of trying to fix it with
preprocessor directives. Use a direct formula to calculate the
required temporary space on the stack in
ff_put_vp8_{epel,bilin}{4,8,16}_h[246]v[246]_armv6().
Fixes a checkasm segfault in vp8dsp.mc when using llvm's internal
assembler for a non-Apple target.
While it is less featureful (and slower) than the built-in H264
decoder, one could potentially want to use it to take advantage
of the cisco patent license offer.
Signed-off-by: Martin Storsjö <martin@martin.st>
Previously we would allocate a new one for every frame. This instead
maintains an AVBufferPool of them to use as-needed.
Also makes the maximum size of an output buffer adapt to the frame
size - the fixed upper bound was a bit too easy to hit when encoding
large pictures at high quality.
The encode function is supposed to just return 0 on success.
This stems from a mixup with the return value of decode functions.
Signed-off-by: Martin Storsjö <martin@martin.st>
Currently it's exported as AVFrame.pkt_pts, which is also the only use
for that field. The reason it is done like this is that lavc used to
export various codec-specific "timing" information in AVFrame.pts, which
is not done anymore.
Since it is confusing to the callers to have a separate field which is
used only for decoder timestamps and nothing else, deprecate pkt_pts and
use just AVFrame.pts everywhere.
This is a more appropriate place for this. H264Context.recovery_frame is
shared between frame threads, so modifying it where it is right now is
invalid.
Move the NAL unit types into it. This will allow to stop including the
whole decoder-specific h264dec.h in some code that is unrelated to the
decoder and only needs some enum values.
Right now this code is mixed with selecting the next output frame. Move
it to a separate function called from h264_field_start(), which is a
more appropriate place for this.
While the value of those variables will be constant for the whole frame,
they are only used in two functions called from slice header decoding.
Moving them to the per-slice context allows us to make the H264Context
passed to slice_header_parse() constant.
There is no bitstream parsing in that block and messing with
decoder-global state is not something that belongs into header parsing.
Nothing else in this function depends on the value of current_slice,
except for two validity checks. Those checks are also moved out of
slice_header_parse().
Replace the decoder-global nal_unit_type/nal_ref_idc variables with the
per-NAL ones. The decoder-global ones still cannot be removed because
they are used by hwaccels.
This mimics the behaviour of other av_*_new_side_data().
This is not caught by the malloc check, since padding
is always added to the allocated size.
Signed-off-by: Vittorio Giovara <vittorio.giovara@gmail.com>
The only difference is that the first of them contains a
ff_h264_flush_change() call. While that is not necessary in the second
block, it should cause no problems either.
Reduce the verbosity of the reinit log message from info to verbose,
since now it will be displayed during every decode session.
Do it right before the MMCOs are applied to the DPB. This will allow
moving the frame_start() call out of the slice header parsing, since
generating the implicit MMCOs needs to be done after frame_start().
They are stored in the slice header, so technically they are per-slice
(though they must be the same in every slice). This will simplify the
following commits.
This function does not do any bitstream parsing and it depends on the
current frame being allocated, so this will allow the frame_start() to
be moved out eventually.
This will allow postponing the reference list construction (and by
consequence some other functions, like frame_start) until the whole
slice header has been parsed.
Currently it's done in the code that initialises the ref list for
MBAFF, which is not a logical place for it. Move it to the function that
parses the pred table from the bitstream, which is analogous to what is
done for the implicit weight table as well.
That function is currently very long and entangles bitstream parsing and
decoder configuration. This makes the code much harder to read than
necessary.
Begin splitting the code that configures the decoder state based on the
slice header information from the parsing of the slice header.
This avoids the danger that get_bits.h might get indirectly #included before
BITSTREAM_READER_LE is defined.
Also sort headers into canonical order where appropriate.
According to avcodec.h, avcodec_decode_video2 should return the number of
bytes used if a frame was decoded.
The current implementation returns size - used size of all the subframes.
This fixes the VLC's bug https://trac.videolan.org/vlc/ticket/16836.
The superframe is always fully consumed.
Signed-off-by: Diego Biurrun <diego@biurrun.de>
Split version files into one line per symbol/directive to allow compatibility
with the Solaris linker without preprocessing and eliminate $ from version file
templates to simplify the postprocessing shell command.
Experimental; requires Skylake and VAAPI 0.39.1 (not yet released).
Also increases the allowed range of the quality option - in low-power
mode, the Intel driver supports levels 1-8 (and 0 meaning default).
Non-reference frames (nal_ref_idc == 0) should be discardable, so
frame_num does not advance after them. Before this change, a stream
containing unreferenced B-frames would be rejected by the reference
decoder.
This prevents attempts to use unsupported modes, such as low-power
H.264 mode on non-Skylake targets. Also fixes a crash on invalid
configuration, when trying to destroy an invalid VA config/context.
We cannot deprecate it until the new parser API is in place, because of
the way libavformat works. But the majority of the users can already
simply replace it with avcodec_free_context(), which will simplify the
transition once it is finally deprecated.
This function is supposed to "reset" a codec context to a clean state so
that it can be opened again. The only reason it exists is to allow using
AVStream.codec as a decoding context (after it was already
opened/used/closed by avformat_find_stream_info()). Since that behaviour
is now deprecated, there is no reason for this function to exist
anymore.
Since AVCodecContext contains a lot of complex state, copying a codec
context is not a well-defined operation. The purpose for which it is
typically used (which is well-defined) is copying the stream parameters
from one codec context to another. That is now possible with through the
AVCodecParameters API. Therefore, there is no reason for
avcodec_copy_context() to exist.
Initialize the bit buffer with the correct size (amount of bits that will
be read) instead of relying on the bitstream reader overreading the
correct values.
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
Signed-off-by: Diego Biurrun <diego@biurrun.de>
For reasons we are not privy to, nvidia decided that the nvenc encoder
should apply aspect ratio compensation to 'DVD like' content, assuming that
the content is not BT.601 compliant, but needs to be BT.601 compliant. In
this context, that means that they make the following, questionable,
assumptions:
1) If the input dimensions are 720x480 or 720x576, assume the content has
an active area of 704x480 or 704x576.
2) Assume that whatever the input sample aspect ratio is, it does not account
for the difference between 'physical' and 'active' dimensions.
From these assumptions, they then conclude that they can 'help', by adjusting
the sample aspect ratio by a factor of 45/44. And indeed, if you wanted to
display only the 704 wide active area with the same aspect ratio as the full
720 wide image - this would be the correct adjustment factor, but what if you
don't? And more importantly, what if you're used to lavc not making this kind
of adjustment at encode time - because none of the other encoders do this!
And, what if you had already accounted for BT.601 and your input had the
correct attributes? Well, it's going to apply the compensation anyway!
So, if you take some content, and feed it through nvenc repeatedly, it
will keep scaling the aspect ratio every time, stretching your video out
more and more and more.
So, clearly, regardless of whether you want to apply bt.601 aspect ratio
adjustments or not, this is not the way to do it. With any other lavc
encoder, you would do it as part of defining your input parameters or do
the adjustment at playback time, and there's no reason by nvenc should
be any different.
This change adds some logic to undo the compensation that nvenc would
otherwise do.
nvidia engineers have told us that they will work to make this
compensation mechanism optional in a future release of the nvenc
SDK. At that point, we can adapt accordingly.
Signed-off-by: Philip Langdale <philipl@overt.org>
Reviewed-by: Timo Rothenpieler <timo@rothenpieler.org>
Signed-off-by: Anton Khirnov <anton@khirnov.net>
The code needs only a few definitions from cuda.h, so define them
directly when CUDA is not enabled. CUDA is still required for accepting
HW frames as input.
Based on the code by Timo Rothenpieler <timo@rothenpieler.org>.
For some unknown reason enabling these causes proper CBR padding,
so as there are no known downsides just always enable them in CBR mode.
Signed-off-by: Anton Khirnov <anton@khirnov.net>
There is no real advantage to listing some codecs or subsystems
separately simply because they are somehow "hw-accelerated", on the
contrary it makes them harder to find than in a plain alphabetically
ordered list.
Use the newly created vlc.h directly instead of including get_bits when needed.
The VLC and RL_VLC_ELEM structures are independent from the bitreader.
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
Widen the values from limited to full range and use BT.709 where it
should be used according to the video resolution:
SD is BT.601, HD is BT.709
Default to BT.709 due to most observed HDMV content being HD.
This uses a new MMAL feature, which limits the number of extra frames
that can be buffered within the decoder. VIDEO_MAX_NUM_CALLBACKS can
be defined as positive or negative number. Positive numbers are
absolute, and can lead to deadlocks if the user underestimates the
number of required buffers. Negative numbers specify the number of extra
buffers, e.g. -1 means no extra buffer, (-1-N) means N extra buffers.
Set a gratuitous default of -11 (N=10). This is much lower than the
firmware default, which appears to be 96.
This is backwards compatible, but needs a symbol only present in newer
firmware headers. (It's an enum item, so it requires a check in
configure.)
Signed-off-by: Anton Khirnov <anton@khirnov.net>
Slight simplification. The result is the same. Also, change the
wording of the message as requested in patch review.
Signed-off-by: Anton Khirnov <anton@khirnov.net>
Fixes apparent mmal_port_disable() freezes in ffmmal_stop_decoder() when
calling ffmmal_decode() with flush semantics a large number of times in
a row.
Signed-off-by: Anton Khirnov <anton@khirnov.net>
The assert in ffmmal_stop_decoder() could trigger sometimes. The
packets_buffered counter was indeed not correctly maintained, and
packets were not subtracted from it if they were still in the waiting
queue.
For some reason, this happened especially with VC-1.
Signed-off-by: Anton Khirnov <anton@khirnov.net>
Register mmaldec as mpeg2 decoder. Supporting mpeg2 in mmaldec is just a
matter of setting the correct MMAL_ENCODING on the input port. To ease the
addition of further supported mmal codecs a macro is introduced to generate
the decoder and decoder class structs.
Signed-off-by: Julian Scheel <julian@jusst.de>
Signed-off-by: wm4 <nfxjfg@googlemail.com>
Signed-off-by: Anton Khirnov <anton@khirnov.net>
There is no avpriv_atomic_get, instead avpriv_atomic_int_get is to be used for
integers. This fixes building mmaldec.
Signed-off-by: Julian Scheel <julian@jusst.de>
Reviewed-by: wm4 <nfxjfg@googlemail.com>
Signed-off-by: Anton Khirnov <anton@khirnov.net>
In such a case, decode the MBs in parallel without the loop filter, then
execute the filter serially.
The ref2frm array was previously moved to H264SliceContext. That was
incorrect, since it applies to all the slices and should properly be in
H264Context (it did not actually break decoding, since this distinction
only becomes relevant with slice threading and deblocking_filter=1,
which was not implemented before this commit). The ref2frm array is thus
moved back to H264Context.
It is always unconditionally initialized in decode_postinit() and then
immediately used in one place further below. All the other places where
it is accessed are just useless fluff.
Make the SPS/PPS parsing independent of the H264Context, to allow
decoupling the parser from the decoder. The change is modelled after the
one done earlier for HEVC.
Move the dequant buffers to the PPS to avoid complex checks whether they
changed and an expensive copy for frame threads.
Modifying global header extradata in encode_frame is an API violation
and only happens to work currently because mov writes its header
at the end of the file.
Heavily based off of a patch from 2012 by Nicolas George.
Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
Right now they are the first encoders for those codecs in the list, so
they are selected when the caller requests a codec by id.
Since they require special treatment, they should not be selected by
default if there are other encoders (e.g. libx264/5) available.
This can only be used if the input data happens to be laid out
exactly correctly.
This might not be supported on all encoders, so only enable it
with an option, but enable it automatically on raspberry pi,
where it is known to be supported.
Signed-off-by: Martin Storsjö <martin@martin.st>
The raspberry pi uses the alternative API/ABI for OMX; this makes
such builds incompatible with all the normal OpenMAX implementations.
Since this can't easily be detected at configure time (one can
build for raspberry pi's OMX just fine using the generic, pristine
Khronos OpenMAX IL headers, no need for their own extensions),
require a separate configure switch for it instead.
The broadcom host library can't be unloaded once loaded and started;
the deinit function that it provides is a no-op, and after started,
it has got background threads running, so dlclosing it makes it
crash.
Signed-off-by: Martin Storsjö <martin@martin.st>
Restore alphabetical order in lists, break overly long lines, do some
prettyprinting, add some explanatory section comments, group parts
together that belong together logically.
The first byte contains compression level together with keyframe status.
When a frame is not interpreted correctly, its data is summed to the
reference, and would degrade over time, producing an incorrect result.
Signed-off-by: Vittorio Giovara <vittorio.giovara@gmail.com>
Commit ca2f19b9cc modified the meaning of
H264SliceContext.gb: it is now initialised at the start of the NAL unit
header, rather than at the start of the slice header. The VAAPI slice
decoder uses the offset after parsing to determine the offset of the
slice data in the bitstream, so with the changed meaning we no longer
need to add the extra byte to account for the NAL unit header because
it is now included directly.
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>