Up until now, the return value of get_vlc2() has been used as an index
in an array that contained the value one is really interested in. Yet
since b613bacca9 this is no longer
necessary, as one can store the value that is right now stored in the
array in the VLC internal table.
This also means that all the information from the eight bit Huffman trees
are now stored in the corresponding VLC table; this will enable us to
remove several allocations lateron.
This improved performance: For GCC 9 the time for one call of
smka_decode_frame() for the sample from ticket #2425 decreased from
1811706 to 1794494 decicycles; for Clang 9 the number went from 1471663
to 1449420 decicycles.
Reviewed-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
This will mean that we will need less stack space lateron when these
arrays are no longer heap-allocated.
No discernible speed impact.
Reviewed-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
Smacker uses two types of Huffman trees: Those for eight bit values and
those for 16 bit values. Given that both return their values via arrays
and that both need to check not to overrun their array, the context for
parsing eight bit values (HuffContext) will necessarily exhibit certain
similarities with the context used for parsing 16 bit values (DBCtx).
These similarities led to using a HuffContext in addition a DBCtx for
parsing 16 bit trees.
This stands in the way of further developments for the HuffContext struct
(when parsing eight bit trees, the length of the arrays are always 256,
so that one can inline said value and move the currently heap-allocated
tables directly in the structure).
Reviewed-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
Using explicit checks has the advantage that one can combine several
checks into one and does not have to check every time. E.g. reading a
16bit PCM sample involves two calls to get_vlc2(), each of which may
read up to three times up to SMKTREE_BITS (= 9) bits. But given that the
padding that the input packet is supposed to have is large enough, it is
no problem to only check once for each sample.
This turned out to be beneficial for performance: For GCC 9, the time for
one call of smka_decode_frame() for the sample from ticket #2425 went down
from 2055905 to 1804751 decicycles; for Clang 9 it went down from 1510538
to 1479680 decicycles.
Reviewed-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
The VLC codes in question originate from a Huffmann tree and so every
sequence of bits that is longer than the longest code contains an
initial sequence that is a valid code. Given that it has been checked
during reading said tree (and once again in ff_init_vlc_sparse()) that
the length of each code is <= 3 * the number of bits read at once when
reading codes, get_vlc2() will always find a matching entry.
These checks have been added in 71d3c25a7e
at a time when the length of the codes had not been checked when parsing
the tree.
For GCC 9 and the sample from ticket #2425 this led to a slight
performance regression: The time for one call to smka_decode_frame()
increased from 2053671 to 2064529 decicycles; for Clang 9, performance
improved from 1521288 to 1508459 decicycles.
Reviewed-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
When length is zero for a leaf node (which happens iff the Huffman tree
consists of one leaf node only), prefix is also automatically zero.
Performance impact is negligible: For GCC 9 and the sample from #2425,
the time for one call to smka_decode_frame() decreased from 2053758 to
2053671 decicycles; for Clang 9 it went from 1523153 to 1521288.
Reviewed-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
With the possible exception of the "last" values when decoding video,
only the part that is actually initialized with values derived from the
bitstream is used afterwards, so it is unnecessary to zero everything at
the beginning. This is also no problem for the "last" values at all,
because they are reset for every frame anyway.
While at it, use sizeof(variable) instead of sizeof(type).
Performance increased slightly: For GCC, from 2068389 decicycles per call
to smka_decode_frame() when decoding the sample from ticket #2425 to 2053758
decicycles; for Clang, from 1534188 to 1523153 decicycles.
Reviewed-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
Using the real number of read codes allows to leave a loop in
ff_init_vlc_sparse earlier; notice that all codes not explicitly
set by reading data have been set to zero earlier (i.e. they are
zero-length codes) and such codes are ignored by ff_init_vlc_sparse.
This improves performance: When compiled with GCC 9, the time spent on
one call to smka_decode_frame() for the sample from ticket #2425
decreased from 2195367 decicycles to 2068389 decicycles. For Clang 9,
it improved from 1602075 to 1534188 decicycles. These tests have been
performed 20 times and each times the input file has been looped
32 times to get a sufficient number of frames.
Reviewed-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
Given that the code currently accepts only 27 bits long Huffman codes,
the shift 1 << (length - 1) with length in 1..28 that is performed when
parsing the tree is safe. Yet if this limit were ever expanded to the
full 32 bits, this shift would be potentially undefined. So simply use
unsigned.
Reviewed-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
smacker_decode_header_tree() uses different variables for return values
(res) and for errors (err) leading to code like
res = foo(bar);
if (res < 0) {
err = res;
goto error;
}
Given that no positive return value is ever used at all one can simplify
the above by removing the intermediate res.
Reviewed-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
The earlier version did not error out directly in case an error happens,
because it would lead to a leak: An allocated array is only reachable
via a local variable at that time; it is only attached to more permanent
storage at the end. While it would be possible to add custom code for
freeing on error (instead of reusing the ordinary code for doing so),
this commit takes the opposite approach and attaches the newly allocated
array to its permanent place immediately after its allocation.
Reviewed-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
The extradata for Smacker video contains Huffman trees as well as a
field containing the size (in bytes) of said Huffman tree when stored
as a table. Due to three special values the decoder allocates more than
the size field indicates; yet when it parses the table it only errors
out if the number of elements exceeds the number of allocated elements
and not the number of elements as indicated by the size field. As a
consequence, there might be less than three elements available at the
end, so that another check for this is necessary.
This commit changes this: It is always made sure that the three elements
reserved to (potentially) use them to store the special values are not
used to store ordinary tree entries. This allows to remove the extra
check at the end.
Reviewed-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
ff_init_vlc_sparse() supports arrays of uint8_t, uint16_t and uint32_t
as input (and it also supports padding/other elements in between the
elements). This makes the typical case in which the input is a simple
array more cumbersome. E.g. for an array of uint8_t one would either
need to call the function with arguments like "array, sizeof(array[0]),
sizeof(array[0])" or with "array, 1, 1". The former is nicer, but
longer, so that the latter is mostly used. Therefore this commit adds a
macro that expands to the sizeof() construct.
Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
The Huffmann tables used by Smacker can consist of exactly one leaf only
in which case the length of the corresponding code is zero; there is
then exactly one value encoded. Our VLC can't handle this and therefore
this case needs to be treated separately; it has been implemented in
commit 48cbdaea15. Yet said commit also
made the decoder emit an error message (despite not erroring out) in this
case, although it seems that this is rather a limitation of our VLC API.
Reviewed-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
The AV1 decoder has the FF_CODEC_CAP_INIT_CLEANUP flag set and yet
the decoder's close function is called manually on some error paths.
This is unnecessary and has been removed in this commit.
Reviewed-by: James Almer <jamrial@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
Up until now, the AV1 decoder always checks before calling its wrapper
around ff_thread_release_buffer() whether the ThreadFrame was used at
all, i.e. it checked whether the first data buffer of the AVFrame
contained therein is NULL or not. Yet this presumes that the AVFrame has
been successfully allocated, even though this can of course fail; and if
it did, one would encounter a segfault.
Fix this by removing the checks altogether: ff_thread_release_buffer()
can handle both unallocated as well as empty frames (since commit
f6774f905f).
Reviewed-by: James Almer <jamrial@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
Before patch, memory was allocated in each thread functions,
which may cause more than one time of memory allocation and
cause crash.
After patch, memory is allocated in the main thread once,
an index was parsed into thread functions. Bug fixed.
Signed-off-by: Xu Jun <xujunzz@sjtu.edu.cn>
Freeing a buffer allocated in the VBLE decoder's init function
is the only thing the decoder's close function does and this implies
that it is unnecessary to call it in case said allocation fails. Yet
this is what has been done.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
Up until now, the SVQ3 decoder allocated several refcounted buffers,
despite no sharing/refcounting happening at all: Their refcount never
exceeds one and they are treated like ordinary buffers (with the
exception that the pointer used to access them is in the middle of the
allocated buffer, but this does not warrant using the AVBuffer API at
all). Given that using the AVBuffer API incurs overhead, it is no longer
used at all.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
Commit b2361cfb94 made all of the
error paths in svq3_decode_init() call svq3_decode_end(); yet several
new error paths that were added later (in merges from Libav) returned
directly without cleaning up properly. This commit fixes the resulting
potential memleaks by setting the FF_CODEC_CAP_INIT_CLEANUP flag. This
also allows to simplify freeing by returning directly.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
The very first thing the SVQ3 decoder currently does is allocating several
SVQ3Frames, a structure which contains members that need to be freed on
their own. If one of these allocations fails, the decoder calls its own
close function to not leak the already allocated SVQ3Frames. Yet said
function presumes that the SVQ3Frames have been successfully allocated
as there is no check before freeing the members that need to be freed.
This commit fixes this by making these frames part of the SVQ3Context,
thereby avoiding the allocations altogether. Notice that the pointers
to the frames have been retained in order to allow to just swap them as
the code already does.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
The Sonic decoder and encoders allocate several buffers in their init
function and return immediately if one of these allocations fails; this
will lead to leaks if there was an earlier successfull allocation. Fix
this by setting the FF_CODEC_CAP_INIT_CLEANUP flag.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
If allocating a buffer in RoQ DPCM encoder's init function failed,
the close function would be called manually; all this function does is
freeing said buffer, but given that it has not been allocated at all,
this is unnecessary.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
Do this by only keeping the only function pointer from the
AVFloatDSPContext that is needed lateron. This also allows to remove the
decoders' close function.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
Do this by only keeping the only function pointer from the
AVFloatDSPContext that is needed lateron. This also allows to remove the
decoder's close function.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
The init function of the real_144 encoder calls its own close function
if a call to ff_lpc_init() fails; yet nothing has been allocated before
that point and ff_lpc_init() can be expected to clean up after itself on
error (the documentation does not say anything to the contrary and the
current implementation can only fail if the only allocation fails, so
there is nothing to clean up on error anyway), so this is unnecessary.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
The qtrle encoder allocates several buffers and an AVFrame in its init
function. If one of these allocations fails, but others succeed, the
successfully allocated objects leak. This is fixed by setting the
FF_CODEC_CAP_INIT_CLEANUP flag.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
Now that ff_ffv1_close() is called upon failure for both the FFV1 encoder
and decoder, the code contained therein can be used to free the partially
allocated slice contexts if allocating the slice contexts failed. One just
has to set the correct number of slice contexts on error. This allows to
remove the code for freeing partially allocated slice contexts in
ff_ffv1_init_slice_contexts().
Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
The FFV1 encoder has so far not cleaned up after itself in this case;
but it can be done easily by setting the FF_CODEC_CAP_INIT_CLEANUP flag.
Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
When allocating FFV1 slice contexts fails, ff_ffv1_init_slice_contexts()
frees everything that it has allocated, yet it does not reset the
counter for the number of allocated slice contexts. This inconsistent
state leads to segfaults lateron in ff_ffv1_close(), because said
function presumes that the slice contexts have been allocated.
Fix this by making sure that the number of slice contexts on error is
consistent (namely zero).
(This issue only affected the FFV1 decoder, because the encoder does not
clean up after itself on init failure.)
Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>