In such a case, decode the MBs in parallel without the loop filter, then
execute the filter serially.
The ref2frm array was previously moved to H264SliceContext. That was
incorrect, since it applies to all the slices and should properly be in
H264Context (it did not actually break decoding, since this distinction
only becomes relevant with slice threading and deblocking_filter=1,
which was not implemented before this commit). The ref2frm array is thus
moved back to H264Context.
It is unconditionally initialized in decode_postinit() and then
immediately used in one place further below. All the other places where
it is accessed are redundant.
Make the SPS/PPS parsing independent of the H264Context, to allow
decoupling the parser from the decoder. The change is modelled after the
one done earlier for HEVC.
Move the dequant buffers to the PPS to avoid complex checks for whether
they changed and an expensive copy for frame threads.
Modifying global header extradata in encode_frame is an API violation
and only happens to work currently because mov writes its header
at the end of the file.
Heavily based on a patch from 2012 by Nicolas George.
Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
Right now they are the first encoders for those codecs in the list, so
they are selected when the caller requests a codec by id.
Since they require special treatment, they should not be selected by
default if there are other encoders (e.g. libx264/5) available.
This can only be used if the input data happens to be laid out
exactly correctly.
This might not be supported by all encoders, so only enable it
with an option, but enable it automatically on Raspberry Pi,
where it is known to be supported.
Signed-off-by: Martin Storsjö <martin@martin.st>
The Raspberry Pi uses the alternative API/ABI for OMX; this makes
such builds incompatible with all the normal OpenMAX implementations.
Since this can't easily be detected at configure time (one can
build for Raspberry Pi's OMX just fine using the generic, pristine
Khronos OpenMAX IL headers, without needing their own extensions),
require a separate configure switch for it instead.
The Broadcom host library can't be unloaded once loaded and started;
the deinit function that it provides is a no-op, and once started,
it has background threads running, so dlclosing it causes a
crash.
Signed-off-by: Martin Storsjö <martin@martin.st>
Restore alphabetical order in lists, break overly long lines, do some
prettyprinting, add some explanatory section comments, group parts
together that belong together logically.
The first byte contains compression level together with keyframe status.
When a frame is not interpreted correctly, its data is added to the
reference, which degrades over time and produces an incorrect result.
Signed-off-by: Vittorio Giovara <vittorio.giovara@gmail.com>
Commit ca2f19b9cc modified the meaning of
H264SliceContext.gb: it is now initialised at the start of the NAL unit
header, rather than at the start of the slice header. The VAAPI slice
decoder uses the offset after parsing to determine the offset of the
slice data in the bitstream, so with the changed meaning we no longer
need to add the extra byte to account for the NAL unit header because
it is now included directly.
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
Change log level from warning to debug: the E-AC-3 "core"
substream can be successfully decoded without the additional
and dependent substreams, and their presence is already
indicated via avpriv_request_sample in ff_eac3_parse_header.
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
These errors neither prevent nor stop successful decoding
of the E-AC-3 stream's "core", yet they cause avpriv_request_sample
to be called for every single frame in the bitstream.
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
Instead of handling the problem inside NAL decoding code, add a higher
level wrapper function. This should be more robust against future
changes (and easier to read).
Previously, ff_h264_idct_add_neon (originally in the arm version) used
a non-regular transpose in order to be able to use more instructions
that deal with registers as 128-bit register pairs. The aarch64
translation doesn't do it to the same extent, but kept the
same structure since it was a straight translation.
This reshuffles ff_h264_idct_add_neon, bringing it closer to
the C implementation, making the transpose_4x4H macro do a regular
transpose, usable for other algorithms as well.
Previously, the third and fourth output from transpose_4x4H were
swapped, and prior to cc29d96d5a, the same inputs as well. In
addition to just swapping the outputs, also renumber the intermediate
registers for better readability (making the register order match
transpose_4x8B).
This runs with the same number of cycles as before.
Signed-off-by: Martin Storsjö <martin@martin.st>
These buffers are just a way to store frame pointers and be able to
modify them without touching the original ones.
The two dependent decoders (WMV2 and VC1) do not need special care for
these fields: the former does not seem to use the dest buffers, while
the latter reinits them every time to the current frame data buffers.
So only keep a local copy rather than the one from mpegvideo.
Until now, the decoding API was restricted to outputting 0 or 1 frames
per input packet. It also enforces a somewhat rigid dataflow in general.
This new API seeks to relax these restrictions by decoupling input and
output. Instead of doing a single call on each decode step, which may
consume the packet and may produce output, the new API requires the user
to send input first, and then ask for output.
For now, there are no codecs supporting this API. The API can work with
codecs using the old API, and most code added here is to make them
interoperate. The reverse is not possible, although for audio it might be.
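
The decoupled dataflow described above can be illustrated with a toy
model. This is only a sketch and does not use libavcodec: ToyDecoder,
toy_send_packet() and toy_receive_frame() are hypothetical names, and
the "packet" is just a count of frames it expands to (0, 1 or more).
The convention of returning -EAGAIN when the caller has to switch
between sending and receiving is an assumption made for the sketch.

```c
#include <errno.h>

#define TOY_QUEUE_SIZE 4 /* hypothetical bound on buffered output frames */

/* Hypothetical decoder state: a small ring buffer of pending frame ids. */
typedef struct ToyDecoder {
    int queue[TOY_QUEUE_SIZE];
    int head;    /* index of the oldest pending frame */
    int count;   /* number of pending frames */
    int next_id; /* id assigned to the next produced frame */
} ToyDecoder;

/* Feed one input packet that expands to nb_frames output frames.
 * Returns -EAGAIN if pending output must be drained first. */
static int toy_send_packet(ToyDecoder *d, int nb_frames)
{
    if (nb_frames < 0 || nb_frames > TOY_QUEUE_SIZE)
        return -EINVAL;
    if (d->count + nb_frames > TOY_QUEUE_SIZE)
        return -EAGAIN;
    for (int i = 0; i < nb_frames; i++) {
        d->queue[(d->head + d->count) % TOY_QUEUE_SIZE] = d->next_id++;
        d->count++;
    }
    return 0;
}

/* Fetch one output frame.
 * Returns -EAGAIN if more input is needed before output is available. */
static int toy_receive_frame(ToyDecoder *d, int *frame)
{
    if (d->count == 0)
        return -EAGAIN;
    *frame = d->queue[d->head];
    d->head = (d->head + 1) % TOY_QUEUE_SIZE;
    d->count--;
    return 0;
}
```

A caller would send a packet and then call toy_receive_frame() in a loop
until it returns -EAGAIN, so a single packet may yield zero, one or
several frames.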
Signed-off-by: Anton Khirnov <anton@khirnov.net>
The intrax8 decoding process does not imply any kind of error
resilience, and the only call present is more related to how mpegvideo
works rather than anything else.
Therefore have the parent decoders carry out error resilience (er) when
actually needed.
* Change log level from error to debug
* Print report after the first decoded frame, not at the end of decoding
* Drop macro guard and use a context variable instead
Signed-off-by: Vittorio Giovara <vittorio.giovara@gmail.com>
If chan2 is not smaller than the number of channels, it can cause
segmentation faults due to dereferencing a NULL pointer.
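
A minimal sketch of the kind of bounds check this implies, assuming a
simplified context in which per-channel buffers exist only for the
first nb_channels entries. SketchContext, sketch_get_chan2() and the
fixed array size are hypothetical and not taken from the decoder.

```c
#include <stddef.h>

#define SKETCH_MAX_CHANNELS 8 /* hypothetical upper bound */

/* Hypothetical context: entries at index >= nb_channels are NULL. */
typedef struct SketchContext {
    int    nb_channels;
    float *channel_data[SKETCH_MAX_CHANNELS];
} SketchContext;

/* Return the buffer of the second channel of a pair, or NULL when
 * chan2 does not refer to an allocated channel. Rejecting the index
 * up front keeps callers from dereferencing a NULL buffer. */
static float *sketch_get_chan2(const SketchContext *s, int chan2)
{
    if (chan2 < 0 || chan2 >= s->nb_channels)
        return NULL;
    return s->channel_data[chan2];
}
```

The caller is expected to treat a NULL return as invalid data and bail
out instead of processing the channel pair.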
Signed-off-by: Andreas Cadhalpun <Andreas.Cadhalpun@googlemail.com>
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
The original code left-shifts negative values, which is undefined
in the C99 specification (the one used during normal Libav compilation).
This change multiplies by (1 << shift), which is functionally equivalent,
but has defined behavior.
With this change, fate-idct8x8 compiled with -fsanitize=undefined works.
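
A small sketch of the replacement pattern, using a hypothetical helper
name (scale_coef) rather than the actual code touched by this change:

```c
/* Hypothetical helper: scale a possibly negative coefficient by
 * 2^shift. Writing "coef << shift" is undefined in C99 when coef is
 * negative; multiplying by (1 << shift) yields the same value with
 * defined behavior, as long as the product fits in an int. */
static int scale_coef(int coef, unsigned shift)
{
    return coef * (1 << shift);
}
```

For non-negative inputs the two forms give the same result, so the
change only affects how negative values are handled by the language
rules.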
Bug-Id: 686
Rename luma table to delta table and change how it is used.
CC: libav-stable@libav.org
Signed-off-by: Vittorio Giovara <vittorio.giovara@gmail.com>
Signed-off-by: Diego Biurrun <diego@biurrun.de>