The call to ff_exp2fi() here always uses arguments in the normal
range, so that the branches in ff_exp2fi() are unnecessary.
This is so because JPEG2000 itself only supports up to
128 bits per component per pixel (we only support far less);
furthermore, expn is always 0..31 for the decoder and also
sane for the encoder, so that the difference between these
two values is always in the normal range of -126..128.
Reviewed-by: Tomas Härdin <git@haerdin.se>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
The JPEG-2000 decoder and encoder share common luts; the decoder
initializes them once, guarded by a dedicated AVOnce, whereas
the encoder initializes them always during init. This means that
the decoder is not init-threadsafe; in fact there is a potential
data race because these luts can be initialized while an active
decoder/encoder is using them.
Fix this and make the decoder init-threadsafe by making the
initialization function guard initialization itself with a dedicated
AVOnce.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
The implementation of the tag tree did not
set the correct reset value for the encoder.
This lead to inefficent tag tree being encoded.
This patch fixes the implementation of the
ff_tag_tree_zero() function.
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
This patch allows setting a compression ratio and to
set multiple layers. The user has to input a compression
ratio for each layer.
The per layer compression ration can be set as follows:
-layer_rates "r1,r2,...rn"
for to create 'n' layers.
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
The implementation of tag tree encoding was incorrect.
However, this error was not visible as the current j2k
encoder encodes only 1 layer.
This patch fixes tag tree coding for JPEG2000 such tag
tree coding would work for multi layer encoding.
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
This patch makes the tag_tree_zero() and tag_tree_size()
functions non static and callable from other files.
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
This patch removes a check which throws an error if
the log2 precinct width/height is 0. The standard allows
the first component to have 0 as the log2 width/height.
However, to ensure proper intialization of coding style,
an extra check has been added.
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
The calculation of precinct boundaries has been
fixed. The precinct boundaries were calculated
as an offset to the band boundary, but must
instead be calculated as an offset from the
reslevel. This patch fixes#4669 and #4679.
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Fixes: value 1.87633e+10 is outside the range of representable values of type 'int'
Fixes: Undefined behavior
Fixes: 14246/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_JPEG2000_fuzzer-5758393601490944
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Fixes: OOM
Fixes: 3541/clusterfuzz-testcase-minimized-6469958596820992
Adds support for decoding codeblock data larger than 8kb
Reduces decoder memory consumption
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
pow is a very wasteful function for this purpose. A low hanging fruit
would be simply to replace with exp2f, and that does yield some speedup.
However, there are 2 drawbacks of this:
1. It does not exploit the integer nature of the argument.
2. (minor) Some platforms lack a proper exp2f routine, making benefits available
only to non broken libm.
3. exp2f does not solve the same issue that plagues pow, namely terrible
worst case performance. This is a fundamental issue known as the
"table-maker's dilemma" recognized by Prof. Kahan himself and
subsequently elaborated and researched by many others. All this is clear from benchmarks below.
This exploits the IEEE-754 format to get very good performance even in
the worst case for integer powers of 2. This solves all the issues noted
above. Function tested with clang usan over [-1000, 1000] (beyond range of
relevance for this, which is [-255, 255]), patch itself with FATE.
Benchmarks obtained on x86-64, Haswell, GNU-Linux via 10^5 iterations of
the pow call, START/STOP, and command ffplay ~/samples/jpeg2000/chiens_dcinema2K.mxf.
Low number of runs also given to prove the point about worst case:
pow:
216270 decicycles in pow, 1 runs, 0 skips
110175 decicycles in pow, 2 runs, 0 skips
56085 decicycles in pow, 4 runs, 0 skips
29013 decicycles in pow, 8 runs, 0 skips
15472 decicycles in pow, 16 runs, 0 skips
8689 decicycles in pow, 32 runs, 0 skips
5295 decicycles in pow, 64 runs, 0 skips
3599 decicycles in pow, 128 runs, 0 skips
2748 decicycles in pow, 256 runs, 0 skips
2304 decicycles in pow, 511 runs, 1 skips
2072 decicycles in pow, 1022 runs, 2 skips
1963 decicycles in pow, 2044 runs, 4 skips
1894 decicycles in pow, 4091 runs, 5 skips
1860 decicycles in pow, 8184 runs, 8 skips
exp2f:
134140 decicycles in pow, 1 runs, 0 skips
68110 decicycles in pow, 2 runs, 0 skips
34530 decicycles in pow, 4 runs, 0 skips
17677 decicycles in pow, 8 runs, 0 skips
9175 decicycles in pow, 16 runs, 0 skips
4931 decicycles in pow, 32 runs, 0 skips
2808 decicycles in pow, 64 runs, 0 skips
1747 decicycles in pow, 128 runs, 0 skips
1208 decicycles in pow, 256 runs, 0 skips
952 decicycles in pow, 512 runs, 0 skips
822 decicycles in pow, 1024 runs, 0 skips
765 decicycles in pow, 2047 runs, 1 skips
722 decicycles in pow, 4094 runs, 2 skips
693 decicycles in pow, 8190 runs, 2 skips
exp2fi:
2740 decicycles in pow, 1 runs, 0 skips
1530 decicycles in pow, 2 runs, 0 skips
955 decicycles in pow, 4 runs, 0 skips
622 decicycles in pow, 8 runs, 0 skips
477 decicycles in pow, 16 runs, 0 skips
368 decicycles in pow, 32 runs, 0 skips
317 decicycles in pow, 64 runs, 0 skips
291 decicycles in pow, 128 runs, 0 skips
277 decicycles in pow, 256 runs, 0 skips
268 decicycles in pow, 512 runs, 0 skips
265 decicycles in pow, 1024 runs, 0 skips
263 decicycles in pow, 2048 runs, 0 skips
263 decicycles in pow, 4095 runs, 1 skips
260 decicycles in pow, 8191 runs, 1 skips
Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: Ganesh Ajjanagadde <gajjanagadde@gmail.com>
Fixes: out of array read
Fixes: 36b8096fefab16c4c9326a508053e95c/signal_sigsegv_1d9ce18_3233_1a55196b018106dfabeace071a432d9e.r3d
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Fixes assertion failure
Fixes: 03e0abe721b1174856d41a1eb5d6a896/signal_sigabrt_7ffff6ae7cc9_3813_e71bf3541abed3ccba031cd5ba0269a4.avi
This fix is choosen to be simple to backport, better solution
for master is planed
Found-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
* commit '95a41311ac3a44773cc4dc407408aca35b1f8e26':
jpeg2000: Factor out band stepsize initialization
Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>
This also reduces the amount of memory needed
Fixes Ticket4672
The new code seems slightly faster as well, probably due to better cache usage
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
This reduces the number of operations
Its not done for 9/7i as that would overflow thanks to JPEG2000 allowing
32 decomposition levels
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
thats how the specification defines it, this also improves numerical
accuracy of the integer wavelet implementation. It otherwise should
be equivalent, in case of overflows this can be reverted.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
These 2 are highly related so they are in the same commit
Fixes part of Ticket4605
Fixes p0_04.j2k
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>