mirror of https://github.com/FFmpeg/FFmpeg.git synced 2024-12-28 20:53:54 +02:00
FFmpeg/libavcodec/riscv
Rémi Denis-Courmont f0ef11ea83 lavc/bswapdsp: RISC-V B bswap_buf
Using the Zbb REV8 instruction in a simple loop already gives
significant savings:

bswap_buf_c: 1081.0
bswap_buf_rvb_b: 771.0

But we can also use the 64-bit REV8 as a pseudo-SIMD instruction with
just one additional shift, and one fewer load, effectively doubling the
bandwidth. Consequently, this patch is useful even if the compile-time
target has Zbb enabled for C code:

bswap_buf_c: 1081.0
bswap_buf_rvb_b: 341.0  (this patch)
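
The pseudo-SIMD trick described above can be sketched in portable C. This is an illustration under stated assumptions, not the assembly from this patch: `rev8_64` and `bswap_buf_pairs` are hypothetical names, and `rev8_64` is a plain C stand-in for the 64-bit REV8 instruction. Two adjacent 32-bit words are combined as a little-endian 64-bit load would see them, all eight bytes are reversed at once, and the two swapped words come out in exchanged positions, so one shift is enough to separate them for the two stores.

```c
#include <stddef.h>
#include <stdint.h>

/* Portable stand-in for the 64-bit Zbb REV8 instruction: reverse all 8 bytes. */
static uint64_t rev8_64(uint64_t x)
{
    x = ((x & 0x00FF00FF00FF00FFULL) << 8)  | ((x >> 8)  & 0x00FF00FF00FF00FFULL);
    x = ((x & 0x0000FFFF0000FFFFULL) << 16) | ((x >> 16) & 0x0000FFFF0000FFFFULL);
    return (x << 32) | (x >> 32);
}

/* Hypothetical helper: byte-swap len 32-bit words, two per iteration. */
static void bswap_buf_pairs(uint32_t *dst, const uint32_t *src, size_t len)
{
    size_t i = 0;

    for (; i + 1 < len; i += 2) {
        /* Models a single 64-bit load on a little-endian machine. */
        uint64_t x = (uint64_t)src[i] | ((uint64_t)src[i + 1] << 32);
        uint64_t y = rev8_64(x);

        /* After the swap the words sit in exchanged halves of y. */
        dst[i + 1] = (uint32_t)y;         /* low half: swapped src[i + 1] */
        dst[i]     = (uint32_t)(y >> 32); /* the one additional shift */
    }
    if (i < len) {
        /* Odd tail: the swapped word ends up in the high half. */
        dst[i] = (uint32_t)(rev8_64(src[i]) >> 32);
    }
}
```

Per pair of samples, this version performs one load, one byte reversal, one shift and two stores.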

On the other hand, this approach fails miserably for bswap16_buf as the
ratio of shifts and stores becomes unfavorable compared to naïve C:

bswap16_buf_c: 1542.0
bswap16_buf_rvb_b: 1803.7
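
One plausible reading of the unfavorable ratio, sketched in portable C: with four 16-bit samples per 64-bit word, a single byte reversal still swaps every sample, but each sample then has to be extracted and stored separately. The names `rev8_64` and `bswap16_buf_quads` are hypothetical, `rev8_64` is a plain C stand-in for the 64-bit REV8 instruction, and this is not necessarily the code that was benchmarked.

```c
#include <stddef.h>
#include <stdint.h>

/* Portable stand-in for the 64-bit Zbb REV8 instruction: reverse all 8 bytes. */
static uint64_t rev8_64(uint64_t x)
{
    x = ((x & 0x00FF00FF00FF00FFULL) << 8)  | ((x >> 8)  & 0x00FF00FF00FF00FFULL);
    x = ((x & 0x0000FFFF0000FFFFULL) << 16) | ((x >> 16) & 0x0000FFFF0000FFFFULL);
    return (x << 32) | (x >> 32);
}

/* Hypothetical helper: byte-swap len 16-bit samples, four per iteration. */
static void bswap16_buf_quads(uint16_t *dst, const uint16_t *src, size_t len)
{
    size_t i = 0;

    for (; i + 3 < len; i += 4) {
        /* Models a single 64-bit load on a little-endian machine. */
        uint64_t x = (uint64_t)src[i]
                   | ((uint64_t)src[i + 1] << 16)
                   | ((uint64_t)src[i + 2] << 32)
                   | ((uint64_t)src[i + 3] << 48);
        uint64_t y = rev8_64(x);

        /* The samples come out in reverse order: three shifts, four stores. */
        dst[i + 3] = (uint16_t)y;
        dst[i + 2] = (uint16_t)(y >> 16);
        dst[i + 1] = (uint16_t)(y >> 32);
        dst[i]     = (uint16_t)(y >> 48);
    }
    for (; i < len; i++) {
        /* Scalar tail: plain 16-bit byte swap. */
        dst[i] = (uint16_t)((src[i] << 8) | (src[i] >> 8));
    }
}
```

Here one load and one byte reversal are followed by three shifts and four narrow stores, so the saved loads buy much less than in the 32-bit case.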

Unrolling to process 128 bits (4 samples) at a time actually worsens
performance ever so slightly:

bswap_buf_c: 1081.0
bswap_buf_rvb_b: 408.5
2022-10-05 08:26:19 +02:00
aacpsdsp_init.c lavc/aacpsdsp: RISC-V V stereo_interpolate[0] 2022-09-27 13:19:52 +02:00
aacpsdsp_rvv.S lavc/aacpsdsp: RISC-V V stereo_interpolate[0] 2022-09-27 13:19:52 +02:00
alacdsp_init.c lavc/alacdsp: RISC-V V append_extra_bits[1] 2022-10-05 06:51:11 +02:00
alacdsp_rvv.S riscv/alacdsp: drop config.h include 2022-10-05 06:59:43 +02:00
audiodsp_init.c lavc/audiodsp: RISC-V V scalarproduct_int16 2022-09-27 13:19:52 +02:00
audiodsp_rvf.S
audiodsp_rvv.S lavc/audiodsp: RISC-V V scalarproduct_int16 2022-09-27 13:19:52 +02:00
bswapdsp_init.c lavc/bswapdsp: RISC-V B bswap_buf 2022-10-05 08:26:19 +02:00
bswapdsp_rvb.S lavc/bswapdsp: RISC-V B bswap_buf 2022-10-05 08:26:19 +02:00
fmtconvert_init.c riscv: Fix linking without RVV; change #ifdef into #if 2022-09-29 10:28:37 +03:00
fmtconvert_rvv.S riscv: remove unnecessary #include's 2022-10-05 06:54:56 +02:00
idctdsp_init.c lavc/idctdsp: RISC-V V put_signed_pixels_clamped function 2022-09-28 11:46:11 +02:00
idctdsp_rvv.S riscv: remove unnecessary #include's 2022-10-05 06:54:56 +02:00
Makefile lavc/bswapdsp: RISC-V B bswap_buf 2022-10-05 08:26:19 +02:00
pixblockdsp_init.c lavc/pixblockdsp: RISC-V diff_pixels & diff_pixels_unaligned 2022-09-28 11:46:11 +02:00
pixblockdsp_rvi.S riscv: remove unnecessary #include's 2022-10-05 06:54:56 +02:00
pixblockdsp_rvv.S riscv: remove unnecessary #include's 2022-10-05 06:54:56 +02:00
vorbisdsp_init.c lavc/vorbisdsp: RISC-V V inverse_coupling 2022-09-27 13:19:52 +02:00
vorbisdsp_rvv.S riscv: remove unnecessary #include's 2022-10-05 06:54:56 +02:00