The previous code ended in multiple different infinite
loops. See stl_ten_1_big.sfd as example with and without zzuf
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
This should be replaced by a more appropriate error code of course but
we should not leave compilation broken until that is decided.
Found-by: jb
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
that way qatar maintains the code for me and i dont need to resolve conflicts.
If someone wants the a32 reader back, only thing you need to do is maintain
it, i would be happy to have it back, iam just not volunteering to maintain
it due to lack of time.
Based on: a1e98f198e by Mans Rullgard.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
Many of the test programs directly access internal symbols not
exported from the shared libraries. This allows tests to run
when configured with shared libraries.
Signed-off-by: Mans Rullgard <mans@mansr.com>
This fixes the same overflow as in the RGB48/16-bit YUV scaling;
some filters can overflow both negatively and positively (e.g.
spline/lanczos), so we bias a signed integer so it's "half signed"
and "half unsigned", and can cover overflows in both directions
while maintaining full 31-bit depth.
Signed-off-by: Mans Rullgard <mans@mansr.com>
No difference in PSNR or bitrate in the printed precission with the matrix lobby scene at 322x242
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
We're shifting individual components (8-bit, unsigned) left by 24,
so making them unsigned should give the same results without the
overflow.
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
For certain types of filters where the intermediate sum of coefficients
can go above the fixed-point equivalent of 1.0 in the middle of a filter,
the sum of a 31-bit calculation can overflow in both directions and can
thus not be represented in a 32-bit signed or unsigned integer. To work
around this, we subtract 0x40000000 from a signed integer base, so that
we're halfway signed/unsigned, which makes it fit even if it overflows.
After the filter finishes, we add the scaled bias back after a shift.
We use the same trick for 16-bit bpc YUV output routines.
Signed-off-by: Mans Rullgard <mans@mansr.com>
The buffer splicing relies on the bitstream reader over-reading
the end of the buffer as declared in init_get_bits(), although
more data is actually present. Manually moving the bitstream
boundary after init_get_bits() allows this to work as expected.
Signed-off-by: Mans Rullgard <mans@mansr.com>
The sample has an incomplete last frame. Decoding it is pointless.
The garbage produced was changed by the bitstream reader now
protecting against over-reads.
Signed-off-by: Mans Rullgard <mans@mansr.com>
When turned on, H264/CAVLC gets ~15% (CVPCMNL1_SVA_C.264) slower for
ultra-high-bitrate files, or ~2.5% (CVFI1_SVA_C.264) for lower-bitrate
files. Other codecs are affected to a lesser extent because they are
less optimized; e.g., VC-1 slows down by less than 1% (all on x86).
The patch generated 3 extra instructions (cmp, cmovae and mov) per
call to get_bits().
The performance penalty on ARM is within the error margin for most
files, up to 4% in extreme cases such as CVPCMNL1_SVA_C.264.
Based on work (for GCI) by Aneesh Dogra <lionaneesh@gmail.com>, and
inspired by patch in Chromium by Chris Evans <cevans@chromium.org>.