Besides simplifying address computations (it saves 432B of .text
in hevcdsp.o alone here) it also fixes undefined behaviour that
occurs if mx or my are 0 (happens when the filters are unused)
because they lead to an array index of -1 in the old code.
This happens in the checkasm-hevc_pel FATE-test.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Reviewed-by: Nuo Mi <nuomi2021@gmail.com>
Signed-off-by: James Almer <jamrial@gmail.com>
Fixes: out of array read
Fixes: 47875/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_HEVC_fuzzer-5719393113341952
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Fixes: left shift of negative value -1
Fixes: 4690/clusterfuzz-testcase-minimized-6117482428366848
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Fixes: runtime error: left shift of negative value -180
Fixes: 4626/clusterfuzz-testcase-minimized-5647837887987712
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Fixes: runtime error: left shift of negative value -3
Fixes: 4524/clusterfuzz-testcase-minimized-6055590120914944
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Fixes: runtime error: left shift of negative value -127
Fixes: 4397/clusterfuzz-testcase-minimized-4779061080489984
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Fixes: runtime error: left shift of negative value -255
Fixes: 4037/clusterfuzz-testcase-minimized-5290998163832832
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Fixes: runtime error: left shift of negative value -255
Fixes: 3373/clusterfuzz-testcase-minimized-5604083912146944
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Fixes: runtime error: left shift of negative value -95
Fixes: 3077/clusterfuzz-testcase-minimized-4684917524922368
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
* commit 'ba479f3daafc7e4359ec1212164569ebe59f0bb7':
hevc: Change type of array stride parameters to ptrdiff_t
Merged-by: James Almer <jamrial@gmail.com>
* commit 'cc16da75c2f99d92f7a6461100f041352deb6d88':
hevc: Add coefficient limiting to speed up IDCT
Noop again as we have these changes already, only random spacing
changes.
Merged-by: Clément Bœsch <cboesch@gopro.com>
* commit '4f247de3b797cdc9d243d26534412f81c306e5b5':
hevcdsp_template: Templatize IDCT
This commit is a noop as we already have that code from a previous
commits (see 92cccb7bcd).
Spacing is adjusted to reduce the diff.
Merged-by: Clément Bœsch <cboesch@gopro.com>
* commit '1bd890ad173d79e7906c5e1d06bf0a06cca4519d':
hevc: Separate adding residual to prediction from IDCT
This commit should be a noop but isn't because of the following renames:
- transform_add → add_residual
- transform_skip → dequant
- idct_4x4_luma → transform_4x4_luma
Merged-by: Clément Bœsch <cboesch@gopro.com>
Based on patch 250430bf28
by Mickaël Raulet <mraulet@insa-rennes.fr>, integrated
to Libav by Josh de Kock <josh@itanimul.li>.
Signed-off-by: Alexandra Hájková <alexandra@khirnov.net>
GCC 4.9.2 on a Core i5-4200U @ 1.60GHz, Linux x86_64
Before
715487 decicycles in sao_edge_filter_8, 262144 runs, 0 skips
After
672104 decicycles in sao_edge_filter_8, 262144 runs, 0 skips
Reviewed-by: Michael Niedermayer <michaelni@gmx.at>
Signed-off-by: James Almer <jamrial@gmail.com>
As with sao_band_filter, pass instead the two variables from the struct needed in the function.
This simplifies writing asm optimized versions.
Reviewed-by: Mickaël Raulet <mraulet@insa-rennes.fr>
Signed-off-by: James Almer <jamrial@gmail.com>
Pass instead the two variables from the struct needed in the function.
This simplifies writing asm optimized versions of the function
Signed-off-by: James Almer <jamrial@gmail.com>
The x86 asm expects int32_t so use that type.
Reviewed-by: Mickaël Raulet <mraulet@insa-rennes.fr>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>