1
0
mirror of https://github.com/FFmpeg/FFmpeg.git synced 2025-06-30 22:24:04 +02:00
Commit Graph

167 Commits

Author SHA1 Message Date
d9f594209f checkasm/riscv: print official extension names 2024-09-04 22:04:11 +03:00
e758b24396 checkasm: add wildcompares for test & functions
Added:

  --test=<pattern>    Filter tests by glob style pattern.
  --bench[=<pattern>] Run benchmark and optionally filter functions
                      by glob style pattern.

Example:

$ ./tests/checkasm/checkasm --bench=yuva*
[...]
yuva420p_bgr24_8_c:                                     34.5 ( 1.00x)
yuva420p_bgr24_8_ssse3:                                 31.1 ( 1.11x)
yuva420p_bgr24_128_c:                                  310.6 ( 1.00x)
yuva420p_bgr24_128_ssse3:                              178.1 ( 1.74x)
yuva420p_bgr24_1080_c:                                2509.6 ( 1.00x)
yuva420p_bgr24_1080_ssse3:                            1471.5 ( 1.71x)
yuva420p_bgr24_1920_c:                                4462.6 ( 1.00x)
yuva420p_bgr24_1920_ssse3:                            2331.1 ( 1.91x)
[...]

Ported from dav1d.

Signed-off-by: J. Dekker <jdek@itanimul.li>
2024-08-28 11:45:46 +02:00
d0986709a8 checkasm: improve print format
Port dav1d's checkasm output format to FFmpeg's checkasm, includes
relative speedups and aligns results.

Signed-off-by: J. Dekker <jdek@itanimul.li>
2024-08-28 11:45:46 +02:00
03f26549cd checkasm: print only results to stdout
Signed-off-by: J. Dekker <jdek@itanimul.li>
2024-08-28 11:45:46 +02:00
42528ff835 checkasm: add csv/tsv bench output
When collecting performance information from checkasm it is common
to parse the output for use in graphs to compare vs different
architectures.

Signed-off-by: J. Dekker <jdek@itanimul.li>
2024-08-28 11:45:46 +02:00
834964ce1a checkasm/mpegvideoencdsp: add pix_sum, pix_norm1, and draw_edges 2024-08-26 12:48:09 +02:00
a2e01cade8 checkasm/yuv2yuv: add tests for semiplanar unscaled converters 2024-08-26 11:04:46 +02:00
d1326b6347 lavu/riscv: drop probing for zba CPU capability 2024-08-05 21:16:26 +03:00
1b2a925e94 lavc/riscv: drop probing for F & D extensions
F and D extensions are included in all RISC-V application profiles ever
made (so starting from RV64GC a.k.a. RVA20). Realistically they need to be
selected at compilation time.

Currently, there are no consumers for these two flags. If there is ever a
need to reintroduce F- or D-specific optimisations, we can always use
__riscv_f or __riscv_d compiler predefined macros respectively.
2024-08-01 22:56:50 +03:00
45d7078a21 lavu/riscv: add CPU flag for B bit manipulations
The B extension was finally ratified in May 2024, encompassing:
- Zba (addresses),
- Zbb (basics) and
- Zbs (single bits).
It does not include Zbc (base-2 polynomials).
2024-07-25 23:09:58 +03:00
1fb77347c8 checkasm: add tests for yuv2rgb 2024-06-28 14:49:49 +02:00
74b4e550cb tests/checkasm: Remove check on linux perf fd in uninit
The check should be >= 0, not > 0. The check itself is redundant
since uninit only being called after init is success.

Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
2024-06-18 15:23:46 +08:00
874152033d checkasm: add tests for {lum,chr}ConvertRange 2024-06-16 00:34:24 +02:00
8d117024fe checkasm: disable unaligned access emulation
The OS may silently fix (emulate) unaligned hardware access exceptions.
This is extremely slow and code should be fixed not to rely on unaligned
access on affected hardware. Accordingly this requests that the OS
disable emulation and instead throw Bus error, which will be caught by
checkasm's signal handler.

This has no effects if the hardware supports unaligned access in
hardware, since no exceptions are generated. prctl() will fail safe in
that case.
2024-06-07 17:53:05 +03:00
fc85aff72f checkasm: add linear least square tests 2024-06-01 18:05:58 +03:00
44f7f6e010 checkasm: add h263dsp.{h,v}_loop_filter 2024-05-27 22:42:07 +03:00
d03cdfa2b6 checkasm/riscv: test misaligned before V
Otherwise V functions mask scalar misaligned ones.
2024-05-24 17:53:43 +03:00
d43e123837 checkasm: print bench runs when benchmarking
Helps make sense of the possible noise in the results.
2024-05-21 17:48:48 +02:00
b1adf6d1d0 checkasm: add runs argument to adjust during bench
Some timers on certain device and test combinations can produce noisy
results, affecting the reliability of performance measurements. One
notable example of this is the Canaan K230 RISC-V development board.

An option to adjust the number of samples by an exponent (--runs) has
been added, allowing developers to increase the sample count for more
reliable results.

Signed-off-by: J. Dekker <jdek@itanimul.li>
2024-05-21 16:47:45 +02:00
b410439263 lavu/riscv: CPU flag for fast misaligned accesses 2024-05-14 19:50:00 +03:00
9ef6e15b04 tests/checkasm: add checkasm_check_vvc_alf and check_alf_filter
Signed-off-by: Wu Jianhua <toqsxw@outlook.com>
2024-05-14 19:21:35 +08:00
01c5f4ad9f riscv: add Zvbb vector bit manipulation extension 2024-05-11 11:38:49 +03:00
250c0defa2 checkasm: add test for fdct
Reviewed-by: Martin Storsjö <martin@martin.st>
Reviewed-by: Rémi Denis-Courmont <remi@remlab.net>
2024-05-11 10:28:59 +02:00
cfa8d2488d checkasm/rv40dsp: add chroma_mc test
This is similar to h264.

Signed-off-by: Rémi Denis-Courmont <remi@remlab.net>
2024-05-03 18:00:53 +03:00
985fdf8e3d tests/checkasm: add exclude_guest for non-x86 linux perf
The exclude_guest option only has an effect on x86. Omitting
'exclude_guest' defaults to zero which implies that you can count guest
events should you run one. Some non-x86 kernels just ignore it, while
others (e.g. the Asahi Linux kernels) require the user to explicitly set
the option to 1, i.e. the only behaviour that makes sense when counting
guest events isn't supported.

Signed-off-by: J. Dekker <jdek@itanimul.li>
2024-04-10 13:37:40 +02:00
6728edadde checkasm/rv34dsp: add rv34_inv_transform_dc test
Signed-off-by: Rémi Denis-Courmont <remi@remlab.net>
2024-02-17 14:33:35 +02:00
fb26c7bfd4 tests/checkasm: add checkasm_check_vvc_mc
Signed-off-by: Wu Jianhua <toqsxw@outlook.com>
2024-02-01 19:54:29 +08:00
ac40c3bb07 checkasm: Test whether the native FFmpeg timers work
On some platforms (in particular, ARM/AArch64), the implementation
of AV_READ_TIME() may use a privileged instruction - in such
cases, benchmarking just fails with a SIGILL.

Instead of crashing, try executing AV_READ_TIME() once within
a region with the signal handler active, to allow gracefully
informing the user about the issue.

This matches the dav1d checkasm commit
95a192549a448b70d9542e840c4e34b60d09b093.

Signed-off-by: Martin Storsjö <martin@martin.st>
2024-01-15 23:29:12 +02:00
202a35ecdb checkasm/svqenc: add ssd_int8_vs_int16 test
Signed-off-by: Rémi Denis-Courmont <remi@remlab.net>
2024-01-15 19:03:03 +02:00
65739691b9 checkasm: Generalize crash handling
This replaces the riscv specific handling from
7212466e73 (which essentially is
reverted), with a different implementation of the same (plus a bit
more), based on the corresponding feature in dav1d's checkasm,
supporting both Unix and Windows.

See in particular the dav1d commits
0b6ee30eab2400e4f85b735ad29a68a842c34e21,
0421f787ea592fd2cc74c887f20b8dc31393788b,
8501a4b20135f93a4c3b426468e2240e872949c5 and
d23e87f7aee26ddcf5f7a2e185112031477599a7, authored by Henrik Gramner.

The overall approach compared to the existing implementation for
riscv is the same; set up a signal handler, store the state with
sigsetjmp, jump out of the crashing function with siglongjmp.

The main difference is in what happens when the signal handler
is invoked. In the previous implementation, it would resume from
right before calling the crashing function, and then skip that call
based on the setjmp return value.

In the imported implementation from dav1d, we return to right before
the check_func() call, which will skip testing the current function
(as the pointer is the same as it was before).

Other differences are:
- Support for other signal handling mechanisms (Windows
  AddVectoredExceptionHandler)
- Using RtlCaptureContext/RtlRestoreContext instead of setjmp/longjmp
  on Windows with SEH
- Only catching signals once per function - if more than one
  signal is delivered before signal handling is reenabled, any
  signal is handled as it would without our handler
- Not using an arch specific signal handler written in assembly

Signed-off-by: Martin Storsjö <martin@martin.st>
2024-01-11 14:48:53 +02:00
3bdb0fe511 checkasm/takdsp: add decorrelate_ls test
Signed-off-by: Rémi Denis-Courmont <remi@remlab.net>
2023-12-21 22:42:34 +02:00
f5e3e9e04e checkasm: Remove unnecessary const on scalar parameters
The ffmpeg coding style doesn't usually use const on scalar
parameters (or on the pointer values - as opposed to the type
that is pointed to, where it has a semantic meaning), contrary
to the dav1d coding style (where this was imported from).

This avoids warnings about differences in the type signatures
between declaration and definition of this function, with older
versions of MSVC.

The issue was observed with one version of MSVC 2017,
19.16.27024.1, with warnings like these:

    src/tests/checkasm/checkasm.c(969): warning C4028: formal parameter 3 different from declaration

The warning itself is bogus as the const here is harmless, and
newer versions of MSVC no longer warn about this.

Signed-off-by: Martin Storsjö <martin@martin.st>
2023-12-21 00:14:41 +02:00
1c3620b2bb checkasm: test for abs_pow34
Signed-off-by: Rémi Denis-Courmont <remi@remlab.net>
2023-12-11 18:42:07 +02:00
b3825bbe45 riscv: test for assembler support
This should fix the build on LLVM 16 and earlier, at the cost of turning
all non-RVV optimisations off.
2023-12-08 17:21:09 +02:00
d0ec826077 checkasm/ac3dsp: add float_to_fixed24 test
Signed-off-by: Rémi Denis-Courmont <remi@remlab.net>
2023-12-01 20:26:48 +02:00
7212466e73 checkasm/riscv: report an error upon SIGILL
Terminating the whole checkasm process is not very helpful. This will
report if an illegal instruction occurs while executing a tested
function. This is a common occurrence whilst developping RISC-V
assembler, due to the compatibility between vector configuration and
instruction done at run-time.
2023-11-23 19:04:07 +02:00
286d674221 checkasm: add helper to report a fatal signal 2023-11-23 18:57:18 +02:00
6720a509a7 checkasm: add lossless audio DSP 2023-11-16 16:53:44 +02:00
f25ad0fe02 checkasm: improve Linux perf error message
Report the failing system call name, as is convention, rather than just
a rather unhelpful "syscall".
2023-07-22 21:35:15 +03:00
b6585eb04c lavu: add/use flag for RISC-V Zba extension
The code was blindly assuming that Zbb or V implied Zba. While the
earlier is practically always true, the later broke some QEMU setups,
as V was introduced earlier than Zba.
2023-07-19 19:29:35 +03:00
98e4dd39c5 checkasm: test Zbb before V
Without this, Zbb functions get shadowed by V functions on devices
supporting both extensions, and never tested.
2023-07-19 19:29:35 +03:00
d8ea5f50e2 checkasm: print usage on invalid arguments
This checks that arguments are handled. If not, then this prints a
short usage notice and returns an error.
2023-07-17 18:48:42 +03:00
397cb623c8 aarch64: Add cpu flags for the dotprod and i8mm extensions
Set these available if they are available unconditionally for
the compiler.

Signed-off-by: Martin Storsjö <martin@martin.st>
2023-06-06 12:40:42 +03:00
783270bfd1 checkasm: add h264chroma tests
Checks all variants of put_h264_chroma and avg_h264_chroma.
2023-05-20 20:07:21 +02:00
68c151cb1b checkasm: add hevc_deblock chroma test
Signed-off-by: J. Dekker <jdek@itanimul.li>
2023-04-06 06:16:57 +02:00
087faf8cac checkasm: add test for bwdif 2023-03-25 02:38:17 +01:00
3ab11dc5bb libavfilter/x86/vf_convolution: add sobel filter optimization and unit test with intel AVX512 VNNI
This commit enabled assembly code with intel AVX512 VNNI and added unit test for sobel filter

sobel_c: 4537
sobel_avx512icl 2136

Signed-off-by: bwang30 <bin.wang@intel.com>
Signed-off-by: Haihao Xiang <haihao.xiang@intel.com>
2022-11-14 10:04:16 +08:00
1936c06f02 checkasm: add a verbose check function for uint32_t data 2022-11-04 19:37:46 +01:00
37d5ddc317 lavu/riscv: CPU flag for the Zbb extension
Unfortunately, it is common, and will remain so, that the Bit
manipulations are not enabled at compilation time. This is an official
policy for Debian ports in general (though they do not support RISC-V
officially as of yet) to stick to the minimal target baseline, which
does not include the B extension or even its Zbb subset.

For inline helpers (CPOP, REV8), compiler builtins (CTZ, CLZ) or
even plain C code (MIN, MAX, MINU, MAXU), run-time detection seems
impractical. But at least it can work for the byte-swap DSP functions.
2022-10-05 08:26:19 +02:00
0c0a3deb18 lavu/cpu: CPU flags for the RISC-V Vector extension
RVV defines a total of 12 different extensions, including:

- 5 different instruction subsets:
  - Zve32x: 8-, 16- and 32-bit integers,
  - Zve32f: Zve32x plus single precision floats,
  - Zve64x: Zve32x plus 64-bit integers,
  - Zve64f: Zve32f plus Zve64x,
  - Zve64d: Zve64f plus double precision floats.

- 6 different vector lengths:
  - Zvl32b (embedded only),
  - Zvl64b (embedded only),
  - Zvl128b,
  - Zvl256b,
  - Zvl512b,
  - Zvl1024b,

- and the V extension proper: equivalent to Zve64f and Zvl128b.

In total, there are 6 different possible sets of supported instructions
(including the empty set), but for convenience we allocate one bit for
each type sets: up-to-32-bit ints (RVV_I32), floats (RVV_F32),
64-bit ints (RVV_I64) and doubles (RVV_F64).

Whence the vector size is needed, it can be retrieved by reading the
unprivileged read-only vlenb CSR. This should probably be a separate
helper macro if needed at a later point.
2022-09-27 13:19:52 +02:00