Since the horizontal and vertical filters are identical except for a
transposition, this uses a common subprocedure with an ad-hoc ABI.
To preserve return-address stack prediction, a link register has to be
used (c.f. the "Control Transfer Instructions" from the
RISC-V ISA Manual). The alternate/temporary link register T0 is used
here, so that the normal RA is preserved (something Arm cannot do!).
To load the strength value based on `qscale`, the shortest possible
and PIC-compatible sequence is used: AUIPC; ADD; LBU. The classic
LLA; ADD; LBU sequence would add one more instruction since LLA is a
convenience alias for AUIPC; ADDI. To ensure that this trick works,
relocation relaxation is disabled.
To implement the two signed divisions by a power of two toward zero:
(x / (1 << SHIFT))
the code relies on the small range of integers involved, computing:
(x + (x >> (16 - SHIFT))) >> SHIFT
rather than the more general:
(x + ((x >> (16 - 1)) & ((1 << SHIFT) - 1))) >> SHIFT
Thus one ANDI instruction is avoided.
T-Head C908:
h263dsp.h_loop_filter_c: 228.2
h263dsp.h_loop_filter_rvv_i32: 144.0
h263dsp.v_loop_filter_c: 242.7
h263dsp.v_loop_filter_rvv_i32: 114.0
(C is probably worse in real use due to less predictible branches.)
This is more spec-compliant because it does not rely
on dead-code elimination by the compiler. Especially
MSVC has problems with this, as can be seen in
https://ffmpeg.org/pipermail/ffmpeg-devel/2022-May/296373.html
or
https://ffmpeg.org/pipermail/ffmpeg-devel/2022-May/297022.html
This commit does not eliminate every instance where we rely
on dead code elimination: It only tackles branching to
the initialization of arch-specific dsp code, not e.g. all
uses of CONFIG_ and HAVE_ checks. But maybe it is already
enough to compile FFmpeg with MSVC with whole-programm-optimizations
enabled (if one does not disable too many components).
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
This patch adds MSA (MIPS-SIMD-Arch) optimizations for H263 lpf functions in new file h263dsp_msa.c
Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* commit '0338c396987c82b41d322630ea9712fe5f9561d6':
dsputil: Split off H.263 bits into their own H263DSPContext
Conflicts:
configure
libavcodec/mpegvideo.h
libavcodec/mpegvideo_enc.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>