These tables are supposed to contain the number of bits needed
to encode a given (run, level) pair. Yet the number of bits
for pairs needing the escape code was wrong (it only contained
the escape code and not the bits needed for run and level).
Furthermore, H.261 (a format with explicit end-of-block codes)
does not work well together with the RLTable API from rl.c:
The EOB code is the first one in ff_h261_rl_tcoeff's VLC table
and has a run value of zero. Therefore the result of get_rl_index()
is off by one for run == 0 and those level values that have
an explicit (run, level) pair.
Fixing this necessitated changing the ref files of the
vsynth*-h261-trellis tests. Both file sizes and PSNR
decreased. If one used a qscale value of 11 for these tests,
one would get files of about the same size as
before this patch (with qscale 12), but with better PSNR.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
The RLTable API in rl.c is not well designed for codecs with
an explicit end-of-block code. ff_h261_rl_tcoeff's vlc has
the EOB code as first element (presumably so that the decoder
can check for it via "if (level == 0)") and this implies
that the indices returned by get_rl_index() are off by one
for run == 0 which is therefore explicitly checked.
This commit changes this by adding a simple LUT for the
values not requiring escaping. It is easy to directly
include the sign bit into this, so this has also been done.
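A minimal sketch of such a LUT follows. It is not the code from this
commit: the table sizes, the VLCEntry type and the code values are made up
for illustration. It assumes that the sign bit is appended after the VLC
(0 for positive, 1 for negative) and that a length of zero marks
a (run, level) pair that needs the escape code.

```c
#include <stdint.h>

#define MAX_RUN   1   /* hypothetical, kept tiny for illustration */
#define MAX_LEVEL 2   /* hypothetical */

typedef struct VLCEntry {
    uint8_t code;
    uint8_t len;      /* 0 means: no direct code, escape needed */
} VLCEntry;

/* Made-up codes indexed by [run][abs(level)]; these are not real H.261 codes. */
static const VLCEntry base[MAX_RUN + 1][MAX_LEVEL + 1] = {
    { { 0, 0 }, { 0x3, 2 }, { 0x4, 4 } },
    { { 0, 0 }, { 0x3, 3 }, { 0,   0 } },
};

/* Indexed by [run][level + MAX_LEVEL]; sign bit already included. */
static VLCEntry lut[MAX_RUN + 1][2 * MAX_LEVEL + 1];

static void init_lut(void)
{
    for (int run = 0; run <= MAX_RUN; run++) {
        for (int level = -MAX_LEVEL; level <= MAX_LEVEL; level++) {
            int alevel = level < 0 ? -level : level;
            VLCEntry e = base[run][alevel];
            VLCEntry *dst = &lut[run][level + MAX_LEVEL];

            if (!e.len)
                continue;                 /* entry stays zeroed: escape */
            /* Append the sign bit to the code and account for it in len. */
            dst->code = (e.code << 1) | (level < 0);
            dst->len  = e.len + 1;
        }
    }
}
```

With this layout the encoder needs one lookup per coefficient and only
falls back to the escape path when len is zero.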
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
These must not be modified (even when they are initialized at runtime
and therefore modifiable).
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
It is a better place for it; no non-h263-based decoder needs
these functions any more (both H.261 and the error resilience
code recently stopped doing so).
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
It is an MPEG-4-only value; it is always five for the MPEG-4
encoder, so just hardcode this value and move the MpegEncContext
field to Mpeg4DecContext.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
The RV10 and RV20 decoders use ff_h263_decode_mb() and also the
H.263 DSP and VLCs. Despite not calling ff_h263_decode_frame(),
it is nevertheless beneficial to call ff_h263_decode_init()
to reduce code duplication.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
The error resilience code does not make up block coefficients
and therefore zeroes them in order to disable the IDCT.
But this can be done in a simpler manner, namely by setting
block_last_index to a negative value. Doing so also has
the advantage that the dct_unquantize functions are never even
called for those codecs that do not use ff_mpv_reconstruct_mb()
for ordinary decoding (namely RV30/40 and the VC-1 family).
This approach would not work for intra macroblocks (there is always
at least one coefficient for them and therefore there is no check
for block_last_index for them), but this does not happen at all.
Add an assert for this.
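The idea can be sketched as follows. This is a simplified model, not the
actual mpegvideo code: the MB struct and both function names are
hypothetical, and the reconstruction function only counts the blocks for
which an IDCT would run.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, stripped-down macroblock state. */
typedef struct MB {
    int16_t block[6][64];
    int     block_last_index[6];
    int     mb_intra;
} MB;

/* Concealment: mark every block as empty instead of zeroing coefficients. */
static void conceal_mb(MB *mb)
{
    /* Intra macroblocks have no block_last_index check, so this would not work. */
    assert(!mb->mb_intra);
    for (int i = 0; i < 6; i++)
        mb->block_last_index[i] = -1;
}

/* Inter reconstruction: returns the number of blocks that would be transformed. */
static int reconstruct_inter_mb(const MB *mb)
{
    int transformed = 0;
    for (int i = 0; i < 6; i++) {
        if (mb->block_last_index[i] >= 0)
            transformed++;   /* dequantization and IDCT would happen here */
    }
    return transformed;
}
```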
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Don't use a LUT to negate followed by a conditional ordinary
negation immediately thereafter. Instead fold the two.
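The commit message does not show the code in question, so the following
is only a generic sketch of the principle with hypothetical names: two
consecutive conditional negations are equivalent to a single negation
controlled by the XOR of the two conditions.

```c
#include <assert.h>

/* Before: LUT-based sign application followed by a conditional negation. */
static int negate_twice(int level, int lut_sign, int cond)
{
    static const int sign_lut[2] = { 1, -1 };

    level *= sign_lut[lut_sign];
    if (cond)
        level = -level;
    return level;
}

/* After: both steps folded into one conditional negation. */
static int negate_folded(int level, int lut_sign, int cond)
{
    return (lut_sign ^ cond) ? -level : level;
}

/* Check that both variants agree for all flag combinations. */
static void check_fold(int level)
{
    for (int s = 0; s < 2; s++)
        for (int c = 0; c < 2; c++)
            assert(negate_twice(level, s, c) == negate_folded(level, s, c));
}
```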
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
This is beneficial for performance: When concatenating
the file from the vsynth1-h261 fate-test 100 times,
performance (measured by timing the codec's decode callback)
improved by 9.6%.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
It is unnecessary because ff_mpeg1_dc_scale_table is the default
for both dc_scale_tables.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
ff_init_block_index() sets MpegEncContext.dest and
MpegEncContext.block_index. The latter is unused by
ff_mpv_reconstruct_mb() (which is what this code is
preparatory for) and dest is overwritten a few lines below.
So don't initialize block_index at all.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
inter_scantable is only used by the dct_unquantize_h263_inter
functions, yet these are not used by the MPEG-4 decoder at all
(in case H.263 quantization is used, the unquantization already
happens in mpeg4_decode_block()).
Also move the common initialization of ff_permute_scantable()
out of the if.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
The WMV2 decoder does not support lowres, so one can optimize
the WMV2 specific code away in the lowres version of this function.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
There are only two mpegvideo decoders that use another
(software) pixel format than YUV420: MPEG-1/2 and
the MPEG-4 studio profile. Neither of these uses this part
of the code, so one can optimize the 422 code away when
this code is compiled for the decoder.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
h261_resync() can be completely removed, because
h261_decode_gob_header() checks for a GOB header itself if
gob_start_code_skipped is zero.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
last_resync_gb is never initialized, causing NULL + 0
in align_get_bits(). In addition to that, the loop is never
entered.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
All valid values of dc_lum and dc_chrom are in the range 0..9,
because they are initialized via tables with 10 elements.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Everything that init_block_index() sets will be overwritten
a few lines below again, so don't call it and simply calculate
the only thing that is used (namely block_index[0]) manually.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
The no-output mode (guarded by AV_CODEC_FLAG2_NO_OUTPUT)
does not provide a noteworthy speedup; in fact, it even
turned out to be slower than the code with the no-output
code removed (ordinary encode: 153259721 decicycles,
noout encode: 153259721; encode with this patch applied:
152451581 decicycles; timings are for encode_frame callbacks
when encoding a 1080p sample to MPEG-4).
(Furthermore, this code was broken for most of its existence
(since 9207dc3b0db368bb9cf5eb295cbc1129c2975e31) and no one
noticed, so the no-output mode is probably not used at all.)
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
The earlier code had two problems:
1. For reference frames that are not directly output (happens unless
low_delay is set), the mb skip values referred to the next reference
frame to be decoded.
2. For non-reference frames, every macroblock was always considered
skipped.
This makes the output (worse than) useless; that no one ever
complained about this shows that this feature is not really used.
It is therefore removed.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
It is not a stream property, but a property of an individual picture
(in fact, it is only set by the FLV decoder that does not even support
frame threading).
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
It is always allocated in ff_mpv_frame_start(), so the only
reason to put it into ff_mpeg_update_thread_context()
would be for the case that a decoder that supports
coded fields implements frame-threading.
The only mpegvideo-decoders supporting coded fields
are MPEG-1/2 and VC-1. The latter's bitstream requires
both coded fields to be part of the same access unit/packet,
so that every frame thread will always call ff_mpv_frame_start()
itself. The former only "need" the framesize buffers when
using lowres. If MPEG-1/2 gains frame-threading, one could either
perform framesize allocation in its update_thread_context
or when starting a field.
(Given that the next packet may trigger a reinitialization
due to a frame size change, it was possible for the buffers
that were allocated here to be thrown away unused.)
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
There is no reason to use a temporary buffer as destination
for the new macroblock before copying it into its proper place.
(Originally, this was added in commit
b68ab2609c67d07b6f12ed65125d76bf9a054479 due to concerns about
copying from GPU memory.)
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Broken in 5ecf5b93dda9d0c69875b80d28929f0d97dd7d06.
More precisely, 3994623df2efd2749631c3492184dd8d4ffa9d1b changed
the precursor of ff_mpv_reconstruct_mb() to always decode
to the first row of macroblocks for B pictures when
a draw_horiz_band callback is set (they are exported to
the caller via said callback and each row overwrites the previously
decoded row; this was probably intended as a cache-optimization).
This first macroblock row was used as source for the draw_horiz_band
callback.
This of course means that the ordinary output B-frame was not
decoded correctly at all. Therefore the first aforementioned commit
removed this special handling of draw_horiz_band; yet it did not
remove the special handling for B-frames in ff_draw_horiz_band(),
which broke draw_horiz_band for B-frames. This commit fixes this.
(Actually, draw_horiz_band was already broken before
5ecf5b93dda9d0c69875b80d28929f0d97dd7d06 when using slice-threading:
All slice-threads would write to the first row of macroblocks
for B-frames, leading to data races. It seems no one has ever complained
about this, just as no one has ever complained about the breakage
caused by 5ecf5b93dda9d0c69875b80d28929f0d97dd7d06. Probably no one
uses draw_horiz_band.)
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Calling it is the first thing ff_clean_h263_qscales() and
ff_clean_mpeg4_qscales() do anyway.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
A MECmpContext is quite big (792B here) and given
how ff_update_duplicate_context() works, it is (unfortunately)
copied quite frequently when using slice threading.
Therefore keep only what is needed from MECmpContext
and remove MECmpContext from MpegEncContext.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Not every function will be set, so zero the context
to initialize everything.
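A minimal sketch of this pattern, using a hypothetical CmpContext and
cmp_init() rather than the real MECmpContext and ff_me_cmp_init(): the
init function zeroes the whole context first, so that function pointers
it does not set are left as NULL instead of indeterminate.

```c
#include <stdint.h>
#include <string.h>

typedef int (*cmp_func)(const uint8_t *a, const uint8_t *b, int n);

/* Hypothetical stand-in for a context holding comparison functions. */
typedef struct CmpContext {
    cmp_func sad[2];
    cmp_func sse[2];
} CmpContext;

/* Sum of absolute differences over n bytes. */
static int sad_c(const uint8_t *a, const uint8_t *b, int n)
{
    int sum = 0;
    for (int i = 0; i < n; i++)
        sum += a[i] > b[i] ? a[i] - b[i] : b[i] - a[i];
    return sum;
}

static void cmp_init(CmpContext *c)
{
    /* Zero everything so that unset entries are well defined. */
    memset(c, 0, sizeof(*c));
    c->sad[0] = sad_c;   /* only some of the functions get set */
}
```

Callers then no longer have to clear the context themselves before
calling the init function.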
This also allows removing an initialization in dvenc.c.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
This avoids using MpegEncContext.mecc; it already makes it
possible to avoid touching the latter for snowenc and svq1enc.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Several of the potential choices of comparison functions
need an initialized MpegEncContext (initialized for encoding,
not only ff_mpv_common_init()) or they crash when called.
Modify ff_set_cmp() to check for this.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
MECmpContext.ildct_cmp is an array of function pointers that
are not set by ff_me_cmp_init(), but that are set by users
to one of the other arrays via ff_set_cmp().
Remove these pointers from MECmpContext and add pointers
for the actually used functions to its users. (The DV encoder
already did so.)
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>