It is only used to indicate to ff_h263_show_pict_info()
that we are decoding H.263+; pass this information
via a function parameter instead.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
This means that these buffers won't be allocated any more
for H.263+ with AIC disabled.
Also remove setting h263_plus for the RV20 encoder,
as it has only been done to force allocating dc_val.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Just ensure that dct_unquantize_inter is set iff it is used
and check for the function pointer instead.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
This allows to remove checks from ff_xvid_idct_init()
(and also the AVCodecContext* parameter).
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
ff_xvid_idct_init() is now only called from ff_idctdsp_init()
and only if idct_algo is FF_IDCT_XVID.
This also implies that it is unnecessary to initalize
the permutation on our own.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
It is unnecessary: If the dst context is not already initialized,
then it will be initialized by memcpy(dst, src, sizeof(*dst),
which already initializes the IDCT to the desired one, potentially
followed by ff_mpv_common_init(), which does not touch the IDCT.
(This call has been added in f89d76c10355242c39b08f253c1d1524f45ef778;
the aforementioned copying took place back then, too, but
ff_mpv_common_init() reinitialized the IDCT to a non-xvid one,
therefore the initialization here has been added to fix
multithreaded decoding.)
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
These are sufficiently different to warrant their own functions;
in particular, the earlier code had two callsites for the actual
"write block" function, one for intra and one for inter, yet
the "write block" function nevertheless performed a check
(that the compiler can't optimize away) for whether the current
MB is intra or not.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
While it offers modest speedups compared to the ordinary code,
removing it completely offers even bigger speedups; and given
that these speedups also exist in the ordinary mode, they are
even more important. The no-output code has a 7.8% performance
improvement (based on benchmarking mpeg4_encode_mb()), yet
removing the no-output code resulted in a 9.4% improvement.
Furthermore, the no-output code was broken for the majority of
its existence (until 9207dc3b0d)
and no one complained, so it is likely not used at all.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
The MPEG-4 decoder can now initialize ff_mpeg4_rl_intra
directly given that the MPEG-4 encoder no longer wants
it performed, too.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
There four cases for the LUT entry: An ordinary entry
or one of three escaping methods. Three of these methods
are only rarely possible --they correspond to the 102
codes of the underlying VLC and so only 102 of 16384 entries
are possible.
The earlier code would nevertheless try them all for every
LUT entry and use the best one; the new code meanwhile only
uses one method (the fallback one (i.e. the worst)) for them all
and only processes the 102 valid entries afterwards.
The implementation used also means that index_run, max_level
and max_run of the RLTable are no longer needed; the earlier
code would initialize said static tables although they are only
used for a short time to initialize something else.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Multiple PutBitContexts are used when encoding partitioned
frames. When there are multiple candidates for macroblock types,
multiple states (namely the state before encoding the current MB,
the best state among the ones already tried and the current one)
need to be kept; duplicates of the PutBitContexts are among this
state. The temporary buffers for them are kept on the stack
in encode_thread() and their size is quite generous (MAX_MB_SIZE
- 3000 bytes). This commit uses tighter bounds, bringing the
size of the pb2 buffer down to 15 bytes.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
The reversible VLC tables use a simpler escaping method
than the ordinary VLCs: It does not use max_run, max_level etc.
and therefore one does not need to initialize these at all.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
It is confusing, because the AV_RL32("ASUS") already
returns an endian-independent value, so converting
it via av_le2ne32() makes no real sense: one would need
to transform the native value to le and write it as
a natie endian uint32_t for it to make sense (the current
code only works because le2ne32 and ne2le32 are the same
for both endianness that we care about). Or one can just
use AV_RL32 and create the number via MKTAG().
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Removes implicit checks for "do I need to output the buffer now?".
Codesize with Clang 19 with -O3 decreased from 7136B to 6108B
(although asv2_put_level() is now inlined).
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Up until now, the encoder replicated all the border pixels
for incomplete 16x16 macroblocks. In case the available width
or height is <= 8, some of the luma blocks of the MB
do not correspond to actual input, so that we should encode
them using the least amount of bits. Zeroing the block coefficients
(as this commit does) achieves this, replicating the pixels
and performing an FDCT does not.
This commit also removes the frame copying code for insufficiently
aligned dimensions.
The vsynth3-asv[12] FATE tests use a 34x34 input file and are
therefore affected by this. As the ref updates show, the size
and checksum of the encoded changes, yet the decoded output
stays the same.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Add three missing requirements on bitstream conformance from 7.4.3.19 of
H.266 (V3). Issue found using fuzzing.
Signed-off-by: Frank Plowman <post@frankplowman.com>
When called for palette-predicted CUs, boundary_strength could cause
undefined behaviour due to accessing uninitialised motion information.
The spec doesn't include this, but in the reference software it seems
the deblock strength is always set to 0 for palette CUs due to some
implementation details: perhaps this is a spec issue?
Signed-off-by: Frank Plowman <post@frankplowman.com>
In d5dbcc00d8, it was hoped that detection
of subpicture overlaps could be performed at the tile level, so as to
avoid introducing per-CTU checks. Unfortunately since that patch,
fuzzing has indicated there are some structures involving
pps_subpic_one_or_more_tiles_slice where tile-level checking is not
sufficient. Performing the check at the CTU level should (touch wood)
be the be-all and and-all of this, as CTUs are the lowest common
denominator of the picture partitioning.
Signed-off-by: Frank Plowman <post@frankplowman.com>
The issue is that state->cur[] is 8-bits, but a+b+1 can overflow
before being clipped to 0xF in the following line, causing an incorrect
state to be saved for the next symbol.
This solves numerous bitstream desyncs, particularly when coefficients
with magnitude greater than 127 are sent.
Don't use a 7.1 EAC3 input file for which our decoder is not
bitexact; instead just use the asynth-44100-8.wav file
which (as a 7.1 file) exhibits the same issue fixed by
1b3f4842c1.
(Either the encoder or the resampler are still not completely
bitexact, so we limit the number of frames output.)
Also switch to a framecrc test so that the output channel layout
is directly contained in the ref file.
Reviewed-by: James Almer <jamrial@gmail.com>
Reviewed-by: Martin Storsjö <martin@martin.st>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Flipping can already be accomplished by setting the crop_w/h expressions to
their negative values, so together these options can implement any of the
common frame orientations.
io_open and io_close2 callbacks may use opaque pointer stored in the
context. They are already inherited, so opaque should also be passed
through.
Fixes IMF playback in mpv.
Signed-off-by: Kacper Michajłow <kasper93@gmail.com>
Reviewed-by: Pierre-Anthony Lemieux <pal@sandflow.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
The buffer references may not be writable at this point, as the decoder
calls get_buffer2() with the AV_GET_BUFFER_FLAG_REF flag.
Fixes races as reported by tsan, producing correct output regardless of
threading choices.
Signed-off-by: James Almer <jamrial@gmail.com>
Currently, planarRgbToplanarRgbWrapper() always sets the alpha value to 255,
without taking the bit depth into consideration.
This commit restricts the alpha value to the bit depth.
Currently, packed16togbra16() always sets the alpha value to 0xFFFF,
without taking the bit depth into consideration.
This causes a bug on x86, which can be reproduced with:
./libswscale/tests/swscale -unscaled 1 -src xyz12le -dst gbrap12be
The problem arises in ff_hscale14to15_4_ssse3(), in the conversion
from gbrap12be to yuva444p, which comes after the conversion from
xyz12le to gbrap12be.
It has something to do with pmaddwd not working on unsigned values.
There is some code to deal with 0xFFFF if the input has a bit depth of
16, but not for bit depths < 16.
We could fix ff_hscale14to15_4_ssse3() to also work correctly with
0xFFFF on bit depths < 16, or we could just not write 0xFFFF there in
the first place, which is what this commit does.
Currently, packed30togbra10() always sets the alpha value to 0xFFFF,
without taking the bit depth into consideration.
This commit restricts the alpha value to the bit depth.
Currently, planarCopyWrapper() assumes that src[3] must be NULL when
the source format has no alpha plane.
This commit updates the condition for filling the alpha plane based on
the number of components available in the source format as well.
The SLIBOBJS variable was introduced in 56572787ae but is no longer used.
Another variable, SHLIBOBJS, was introduced after SLIBOBJS, in 20b0d24c2f.
The functionality from SLIBOBJS was effectively migrated to SHLIBOBJS in b77fff47d0.
No code has used SLIBOBJS since.
This commit removes all remaining references to SLIBOBJS from the build system.
The issue is that there is an explicit lack of synchronization as only the very
first invocation writes symbols and updates the state, which other invocations
then store.