FFmpeg

virtualenv/FFmpeg

Fork 0

mirror of https://github.com/FFmpeg/FFmpeg.git synced 2024-11-21 10:55:51 +02:00

Commit Graph

Author	SHA1	Message	Date
Rémi Denis-Courmont	378d1b06c3	riscv: probe for Zbb extension at load time Due to hysterical raisins, most RISC-V Linux distributions target a RV64GC baseline excluding the Bit-manipulation ISA extensions, most notably: - Zba: address generation extension and - Zbb: basic bit manipulation extension. Most CPUs that would make sense to run FFmpeg on support Zba and Zbb (including the current FATE runner), so it makes sense to optimise for them. In fact a large chunk of existing assembler optimisations relies on Zba and/or Zbb. Since we cannot patch shared library code, the next best thing is to carry a flag initialised at load-time and check it on need basis. This results in 3 instructions overhead on isolated use, e.g.: 1: AUIPC rd, %pcrel_hi(ff_rv_zbb_supported) LBU rd, %pcrel_lo(1b)(rd) BEQZ rd, non_Zbb_fallback_code // Zbb code here The C compiler will typically load the flag ahead of time to reducing latency, and can also keep it around if Zbb is used multiple times in a single optimisation scope. For this to work, the flag symbol must be hidden; otherwise the optimisation degrades with a GOT look-up to support interposition: 1: AUIPC rd, GOT_OFFSET_HI LD rd, GOT_OFFSET_LO(rd) LBU rd, (rd) BEQZ rd, non_Zbb_fallback_code // Zbb code here This patch adds code to provision the flag in libraries using bit manipulation functions from libavutil: byte-swap, bit-weight and counting leading or trailing zeroes.	2024-06-11 20:12:37 +03:00
J. Dekker	ca583b22e4	avfilter/riscv: build afir only if required Signed-off-by: J. Dekker <jdek@itanimul.li>	2024-05-14 17:46:19 +02:00
sunyuechi	afb967b81e	af_afir: RISC-V V fcmul_add Segmented loads are slow, so here we use unit-strided load and narrowing shifts. c910: fcmul_add_c: 2179 fcmul_add_rvv_f64: 1652 c908: fcmul_add_c: 4891.2 fcmul_add_rvv_f64: 2399.5 Signed-off-by: Rémi Denis-Courmont <remi@remlab.net>	2023-11-16 20:53:18 +02:00

Author

SHA1

Message

Date

Rémi Denis-Courmont

378d1b06c3

riscv: probe for Zbb extension at load time

Due to hysterical raisins, most RISC-V Linux distributions target a
RV64GC baseline excluding the Bit-manipulation ISA extensions, most
notably:
- Zba: address generation extension and
- Zbb: basic bit manipulation extension.
Most CPUs that would make sense to run FFmpeg on support Zba and Zbb
(including the current FATE runner), so it makes sense to optimise for
them. In fact a large chunk of existing assembler optimisations relies
on Zba and/or Zbb.

Since we cannot patch shared library code, the next best thing is to
carry a flag initialised at load-time and check it on need basis.
This results in 3 instructions overhead on isolated use, e.g.:
1:  AUIPC rd, %pcrel_hi(ff_rv_zbb_supported)
    LBU   rd, %pcrel_lo(1b)(rd)
    BEQZ  rd, non_Zbb_fallback_code
    // Zbb code here

The C compiler will typically load the flag ahead of time to reducing
latency, and can also keep it around if Zbb is used multiple times in a
single optimisation scope. For this to work, the flag symbol must be
hidden; otherwise the optimisation degrades with a GOT look-up to
support interposition:
1:  AUIPC rd, GOT_OFFSET_HI
    LD    rd, GOT_OFFSET_LO(rd)
    LBU   rd, (rd)
    BEQZ  rd, non_Zbb_fallback_code
    // Zbb code here

This patch adds code to provision the flag in libraries using bit
manipulation functions from libavutil: byte-swap, bit-weight and
counting leading or trailing zeroes.

2024-06-11 20:12:37 +03:00

J. Dekker

ca583b22e4

avfilter/riscv: build afir only if required

Signed-off-by: J. Dekker <jdek@itanimul.li>

2024-05-14 17:46:19 +02:00

sunyuechi

afb967b81e

af_afir: RISC-V V fcmul_add

Segmented loads are slow, so here we use unit-strided load and narrowing shifts.

c910:
fcmul_add_c: 2179
fcmul_add_rvv_f64: 1652

c908:
fcmul_add_c: 4891.2
fcmul_add_rvv_f64: 2399.5

Signed-off-by: Rémi Denis-Courmont <remi@remlab.net>

2023-11-16 20:53:18 +02:00

3 Commits