When called inside a loop, the inline asm version results in one pxor
unnecessarely emitted per iteration, as the contents of the __asm__() block are
opaque to the compiler's instruction scheduler.
This is not the case with intrinsics, where pxor will be emitted once with any
half decent compiler.
This also has the benefit of removing any SSE -> AVX penalty that may happen
when the compiler emits VEX encoded instructions.
Signed-off-by: James Almer <jamrial@gmail.com>
For even small values of 'asrc[x]', shifting them by 24 bits or more
will cause arithmetic overflow and be caught by
GCC's undefined behaviour sanitizer.
Ensure the values do not overflow by up-casting the bracketed
expressions involving 'asrc' to uint32_t.
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Fixes: CID1591930 Wrong sizeof argument
Sponsored-by: Sovereign Tech Fund
Reviewed-by: Steve Lhomme <robux4@ycbcr.xyz>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Fixes: CID1591944 Wrong sizeof argument
Sponsored-by: Sovereign Tech Fund
Reviewed-by: Steve Lhomme <robux4@ycbcr.xyz>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Fixes: CID1598558 Resource leak
Sponsored-by: Sovereign Tech Fund
Reviewed-by: Steve Lhomme <robux4@ycbcr.xyz>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Fixes: CID1591909 Wrong sizeof argument
Sponsored-by: Sovereign Tech Fund
Reviewed-by: Steve Lhomme <robux4@ycbcr.xyz>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Not a bugfix, but might fix CID1604361 Overflowed constant
Sponsored-by: Sovereign Tech Fund
Reviewed-by: Nuo Mi <nuomi2021@gmail.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
An incorrect calculation in ff_perlin_init causes a write to the
stack array at index 256, which is out of bounds.
Fixes: CID1608711
Reviewed-by: Stefano Sabatini <stefasab@gmail.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Fixes: CID1604548 Unused value
Sponsored-by: Sovereign Tech Fund
Reviewed-by: "Xiang, Haihao" <haihao.xiang-at-intel.com@ffmpeg.org>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
scale_frame() inconsistently handled the lifetime of `in`. Fixes a
possible double free and a possible memory leak.
The new code always has `scale_frame` take over ownership of the input
frame. I first tried writing this code in a way where the calling code
retains ownership, but this is nontrivial due to the presence of the
no-op short-circuit condition in which the input frame is directly
returned. (As an alternative, we could use av_frame_clone() instead, but
I wanted to avoid touching the original behavior in this commit)
There are two implementations here:
- a generic scalable one processing two columns at a time,
- a specialised processing one (fixed-size) row at a time.
Unsurprisingly, the generic one works out better with smaller widths.
With larger widths, the gains from filling vectors are outweighed by
the extra cost of strided loads and stores. In other words, memory
accesses become the bottleneck.
T-Head C908:
h264_weight2_8_c: 54.5
h264_weight2_8_rvv_i32: 13.7
h264_weight4_8_c: 101.7
h264_weight4_8_rvv_i32: 27.5
h264_weight8_8_c: 197.0
h264_weight8_8_rvv_i32: 75.5
h264_weight16_8_c: 385.0
h264_weight16_8_rvv_i32: 74.2
SpacemiT X60:
h264_weight2_8_c: 48.5
h264_weight2_8_rvv_i32: 8.2
h264_weight4_8_c: 90.7
h264_weight4_8_rvv_i32: 16.5
h264_weight8_8_c: 175.0
h264_weight8_8_rvv_i32: 37.7
h264_weight16_8_c: 342.2
h264_weight16_8_rvv_i32: 66.0
And av_stream_get_codec_timebase().
They were both added for ffmpeg CLI, which no longer calls either of
them. Furthermore the notion of "internal stream timing info" that needs
to be transferred with a special magic API function is fundamentally
flawed and should be removed.
Use the already available AVCodecParameters pointer instead.
Shortens lines.
Reviewed-by: James Almer <jamrial@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
The check "left >= INT_MAX - right" is supposed to check for
whether left + right does not overflow/wraparound, but given that
left and top are uint32_t INT_MAX - right can already wraparound
for big values of right (and ordinary 32-bit ints):
If right == UINT32_MAX, INT_MAX - right is INT_MAX + 1;
for left in 0..par->width both checks will be passed.
Fix this and simplify the check by using 64-bit types,
where the addition is guaranteed not to overflow.
Reviewed-by: James Almer <jamrial@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Fixes: CID1551679 Data race condition
Fixes: CID1551687 Data race condition
Sponsored-by: Sovereign Tech Fund
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Found while reviewing CID1452449 Uninitialized scalar variable
Sponsored-by: Sovereign Tech Fund
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
It seems reply1 is initialized by ff_rtsp_send_cmd() in most cases but there
are code paths like "continue" which look like they could skip it but even if not
writing this so a complex loop after several layers of calls initialized a local
variable through a pointer is just bad design.
This patch simply initialized the variable.
Fixes: CID1473532 Uninitialized scalar variable
Sponsored-by: Sovereign Tech Fund
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Found while reviewing CID1473532 Uninitialized scalar variable
Sponsored-by: Sovereign Tech Fund
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
This is unlikely to make a difference
Fixes: CID1591896 Unintentional integer overflow
Fixes: CID1591901 Unintentional integer overflow
Sponsored-by: Sovereign Tech Fund
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>