This adds runtime support to use Zbb REV8 for 32- and 64-bit byte-wise
swaps. The result is about five times slower than when targeting Zbb
statically, but still a lot faster than the default bespoke C code or a
call to the GCC run-time functions.
For 16-bit swaps, however, this is unsurprisingly a lot worse, so the
baseline code is kept there. In fact, even using REV8 statically does
not seem to be beneficial in that case.
            Zbb static    Zbb dynamic       baseline
bswap16:   0.668184765    3.340764069    0.668029012
bswap32:   0.668174014    3.340763319    9.353855435
bswap64:   0.668221765    3.340496313   14.698672283

(seconds for 1 billion iterations on a SiFive U74 core)
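For illustration, a minimal sketch of such run-time dispatch for the
32-bit case could look as follows. The ff_rv_zbb_supported() predicate
is a hypothetical, cached capability check (not the detection actually
used here), and the .option arch directive assumes a recent enough
assembler; pre-Zbb CPUs take the plain C fallback.

    #include <stdint.h>

    /* Hypothetical, cached CPU capability query; not part of this patch. */
    int ff_rv_zbb_supported(void);

    static inline uint32_t bswap32_runtime(uint32_t x)
    {
        if (ff_rv_zbb_supported()) {
            uintptr_t r;

            __asm__ (
                ".option push\n\t"
                ".option arch, +zbb\n\t" /* assemble REV8 even in a baseline build */
                "rev8 %0, %1\n\t"
                ".option pop"
                : "=r" (r) : "r" (x));
            /* REV8 swaps the whole XLEN-wide register, so on RV64 the
             * 32-bit result lands in the upper half: shift it back down. */
            return r >> (__riscv_xlen - 32);
        }
        /* Baseline byte-wise C swap. */
        return (x >> 24) | ((x >> 8) & 0x0000ff00u) |
               ((x << 8) & 0x00ff0000u) | (x << 24);
    }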
av_bswapXX() are used in contexts that expect exact-size types, notably
as variable arguments to av_log(). On Linux RV64, uint_fast32_t is an
unsigned long, so the current inline assembler does not work properly.
Since GCC and Clang gained their byte-swap built-ins before they
supported RISC-V, we can simply defer to them. As an added bonus, the
compiler can do instruction scheduling, which it couldn't with the Zbb
inline assembler.
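A sketch of what that deferral might look like, assuming the GCC/Clang
__builtin_bswapXX built-ins (the actual header guards in libavutil are
omitted here):

    #include <stdint.h>

    /* The built-ins take and return exact-width integers, so the values
     * promote predictably when passed as variadic arguments (e.g. to
     * av_log()). With Zbb enabled, the compiler itself emits REV8 (plus
     * the shift for sub-XLEN widths) and remains free to schedule it. */
    static inline uint16_t av_bswap16(uint16_t x) { return __builtin_bswap16(x); }
    static inline uint32_t av_bswap32(uint32_t x) { return __builtin_bswap32(x); }
    static inline uint64_t av_bswap64(uint64_t x) { return __builtin_bswap64(x); }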
If the target supports the Basic bit-manipulation (Zbb) extension, then
the REV8 instruction is available to reverse byte order.
Note that this instruction only exists at the "XLEN" register size,
so we need to right shift the result down to the data width.
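For example, a sketch of the resulting helpers when Zbb is selected
statically (the names and guards are illustrative, keyed off the
compiler's __riscv_zbb and __riscv_xlen macros):

    #include <stdint.h>

    #if defined (__riscv_zbb) /* Zbb enabled at compile time */
    static inline uint32_t bswap32_zbb(uint32_t x)
    {
        uintptr_t r;

        __asm__ ("rev8 %0, %1" : "=r" (r) : "r" (x));
        /* REV8 reverses all XLEN/8 bytes, so on RV64 the swapped 32-bit
         * value sits in the upper half of the register: shift it down. */
        return r >> (__riscv_xlen - 32);
    }

    static inline uint64_t bswap64_zbb(uint64_t x)
    {
    #if __riscv_xlen >= 64
        uintptr_t r;

        /* Data width matches XLEN, so no shift is needed. */
        __asm__ ("rev8 %0, %1" : "=r" (r) : "r" (x));
        return r;
    #else
        /* On RV32, swap each 32-bit half and exchange the halves. */
        return (uint64_t)bswap32_zbb((uint32_t)x) << 32 |
               bswap32_zbb((uint32_t)(x >> 32));
    #endif
    }
    #endif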
If Zbb is not supported, then this patchset does nothing. Support for
run-time detection is left for the future. Currently, there are no
bits in auxv/ELF HWCAP for the Z-extensions, so there is no clean way
to do this.