mirror of https://github.com/FFmpeg/FFmpeg.git synced 2025-11-23 21:54:53 +02:00

148 Commits

Author SHA1 Message Date
Niklas Haas
5e6ffa0376 tests/checkasm: add checkasm tests for swscale ops
Because of the lack of an external ABI on low-level kernels, we cannot
directly test internal functions. Instead, we construct a minimal op chain
consisting of a read, the op to be tested, and a write.

The bigger complication arises from the fact that the backend may generate
arbitrary internal state that needs to be passed back to the implementation,
which means we cannot directly call `func_ref` on the generated chain. To get
around this, always compile the op chain twice - once using the backend to be
tested, and once using the reference C backend.

The actual entry point may also just be a shared wrapper, so we need to
be very careful to run checkasm_check_func() on a pseudo-pointer that will
actually be unique for each combination of backend and active CPU flags.
2025-09-01 19:28:36 +02:00
Niklas Haas
8406c56b0c tests/checkasm: generalize DEF_CHECKASM_CHECK_FUNC to floats
We split the standard macro into its body (implementation) and declaration,
and use a macro argument in place of the raw `memcmp` call, with the major
difference that we now take the number of pixels to compare instead of the
number of bytes (to match the signature of float_near_ulp_array).
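
A minimal sketch of the idea, not the actual checkasm macro: the comparison is passed in as a macro argument and receives an element count rather than a byte count, so the same body serves integer and float buffers. The names `DEF_CHECK_ROWS`, `equal_u8` and `near_f32` are hypothetical, strides are in elements, and the float comparison uses a fixed absolute tolerance as a stand-in for the ULP-based float_near_ulp_array.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical comparators: both take a number of elements, not bytes. */
static int equal_u8(const uint8_t *a, const uint8_t *b, int n)
{
    return !memcmp(a, b, n * sizeof(*a));
}

static int near_f32(const float *a, const float *b, int n)
{
    for (int i = 0; i < n; i++) {
        float d = a[i] - b[i];
        if (d < 0)
            d = -d;
        if (d > 1e-6f)          /* assumed tolerance, not an ULP check */
            return 0;
    }
    return 1;
}

/* Body shared by all element types; `compare` replaces the raw memcmp. */
#define DEF_CHECK_ROWS(name, type, compare)                              \
static int check_rows_##name(const type *ref, int ref_stride,           \
                             const type *tst, int tst_stride,           \
                             int w, int h)                               \
{                                                                        \
    assert(w > 0 && h > 0);                                              \
    for (int y = 0; y < h; y++)                                          \
        if (!compare(ref + y * ref_stride, tst + y * tst_stride, w))     \
            return y;   /* index of the first mismatching row */         \
    return -1;          /* every row matched */                          \
}

DEF_CHECK_ROWS(u8,  uint8_t, equal_u8)
DEF_CHECK_ROWS(f32, float,   near_f32)
```

Calling check_rows_f32(ref, stride, tst, stride, w, h) then returns -1 on a match or the first differing row, with `w` counted in floats.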
2025-09-01 19:27:53 +02:00
Niklas Haas
faf62cbdf5 tests/checkasm: increase number of runs in between measurements
Sometimes, when measuring very small functions, rdtsc is not accurate enough
to get a reliable measurement. This increases the number of runs inside the
inner loop from 4 to 32, which should help a lot. Less important when using
the more precise linux-perf API, but still useful.

There should be no user-visible change since the number of runs is adjusted
to keep the total time spent measuring the same.
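
The trade-off can be sketched as follows, assuming a fixed budget of calls: more calls between two timer reads means fewer timer reads overall, so the total work stays the same while each reading covers a longer interval. All names here (`bench_cycles_per_call`, `fake_now`, `fake_work`) are hypothetical, and the fake clock only exists to make the sketch deterministic; the real harness reads rdtsc or linux-perf counters.

```c
#include <assert.h>
#include <stdint.h>

enum { CALLS_PER_SAMPLE = 32 };   /* the commit raises this from 4 to 32 */

typedef uint64_t (*timer_fn)(void);
typedef void (*bench_fn)(void);

/* Average timer ticks per call, reading the timer once per 32 calls. */
static double bench_cycles_per_call(bench_fn fn, timer_fn now, int total_calls)
{
    int samples = total_calls / CALLS_PER_SAMPLE;  /* fewer, longer samples */
    uint64_t sum = 0;

    assert(samples > 0);
    for (int s = 0; s < samples; s++) {
        uint64_t t0 = now();
        for (int i = 0; i < CALLS_PER_SAMPLE; i++)
            fn();
        sum += now() - t0;
    }
    return (double)sum / ((double)samples * CALLS_PER_SAMPLE);
}

/* Deterministic stand-ins for a real timer and a real function under test. */
static uint64_t fake_clock;
static uint64_t fake_now(void) { return fake_clock; }
static void fake_work(void)    { fake_clock += 10; }
```

With the fake clock, bench_cycles_per_call(fake_work, fake_now, 1024) takes 32 samples of 32 calls each and reports 10 ticks per call.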
2025-09-01 19:27:53 +02:00
Andreas Rheinhardt
15cec71665 checkasm/h264dsp: Fix stack-buffer-overflow, effective-type violations
Also ensure that the dst buffers are not too big
(they had the right size for >8 bit depths and were therefore
too big for eight bit, letting potential buffer overflows
in the eight bit version go undetected).

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2025-07-28 19:29:51 +02:00
Niklas Haas
f944a70fcc tests/checkasm: add check for vf_colordetect 2025-07-21 18:10:26 +02:00
Niklas Haas
bfab026298 tests/checkasm: add test for vf_blackdetect 2025-07-18 10:47:31 +02:00
Niklas Haas
9251af058a tests/checkasm: add scene_sad checkasm test 2025-07-17 12:26:05 +02:00
Tristan Matthews
5ea3adfcf9 checkasm: add checkasm_check_dctcoef
This is useful for tests that compare dctcoefs which will be either 2 bytes or
4 bytes, depending on bitdepth.
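
As a rough illustration only, a comparison helper can pick the coefficient size from the bit depth before comparing. The function name `dctcoef_equal` is hypothetical, and the rule that 8-bit content uses 2-byte coefficients while higher bit depths use 4-byte ones is an assumption made for this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Compare `count` coefficients whose storage size depends on bit depth. */
static int dctcoef_equal(const void *ref, const void *tst,
                         int count, int bit_depth)
{
    /* Assumption: 2-byte coefficients at 8 bits, 4-byte ones above that. */
    size_t coef_size = bit_depth > 8 ? sizeof(int32_t) : sizeof(int16_t);

    assert(count >= 0);
    return !memcmp(ref, tst, count * coef_size);
}
```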

Reviewed-by: Martin Storsjö <martin@martin.st>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2025-06-16 01:31:44 +02:00
Shaun Loo
45bea45c7b checkasm: add vvc_sao
This is a part of Google Summer of Code 2023

AVX2:
 - vvc_sao.sao_band [OK]
 - vvc_sao.sao_edge [OK]
checkasm: all 54 tests passed
vvc_sao_band_8_8_c:                                    157.4 ( 1.00x)
vvc_sao_band_8_8_avx2:                                  30.7 ( 5.12x)
vvc_sao_band_8_10_c:                                   119.4 ( 1.00x)
vvc_sao_band_8_10_avx2:                                 29.2 ( 4.09x)
vvc_sao_band_8_12_c:                                   144.6 ( 1.00x)
vvc_sao_band_8_12_avx2:                                 30.0 ( 4.82x)
vvc_sao_band_16_8_c:                                   446.5 ( 1.00x)
vvc_sao_band_16_8_avx2:                                103.3 ( 4.32x)
vvc_sao_band_16_10_c:                                  399.2 ( 1.00x)
vvc_sao_band_16_10_avx2:                                64.3 ( 6.21x)
vvc_sao_band_16_12_c:                                  472.9 ( 1.00x)
vvc_sao_band_16_12_avx2:                                56.5 ( 8.37x)
vvc_sao_band_32_8_c:                                  2430.9 ( 1.00x)
vvc_sao_band_32_8_avx2:                                203.3 (11.96x)
vvc_sao_band_32_10_c:                                 1405.7 ( 1.00x)
vvc_sao_band_32_10_avx2:                               208.5 ( 6.74x)
vvc_sao_band_32_12_c:                                 2054.3 ( 1.00x)
vvc_sao_band_32_12_avx2:                               213.0 ( 9.64x)
vvc_sao_band_48_8_c:                                  3835.4 ( 1.00x)
vvc_sao_band_48_8_avx2:                                604.2 ( 6.35x)
vvc_sao_band_48_10_c:                                 3624.6 ( 1.00x)
vvc_sao_band_48_10_avx2:                               468.8 ( 7.73x)
vvc_sao_band_48_12_c:                                 3752.4 ( 1.00x)
vvc_sao_band_48_12_avx2:                               477.5 ( 7.86x)
vvc_sao_band_64_8_c:                                  6061.1 ( 1.00x)
vvc_sao_band_64_8_avx2:                                803.9 ( 7.54x)
vvc_sao_band_64_10_c:                                 6142.5 ( 1.00x)
vvc_sao_band_64_10_avx2:                               827.3 ( 7.43x)
vvc_sao_band_64_12_c:                                 6106.6 ( 1.00x)
vvc_sao_band_64_12_avx2:                               839.9 ( 7.27x)
vvc_sao_band_80_8_c:                                  9478.0 ( 1.00x)
vvc_sao_band_80_8_avx2:                               1516.7 ( 6.25x)
vvc_sao_band_80_10_c:                                10300.5 ( 1.00x)
vvc_sao_band_80_10_avx2:                              1298.7 ( 7.93x)
vvc_sao_band_80_12_c:                                 8941.1 ( 1.00x)
vvc_sao_band_80_12_avx2:                              1315.3 ( 6.80x)
vvc_sao_band_96_8_c:                                 13351.5 ( 1.00x)
vvc_sao_band_96_8_avx2:                               1815.4 ( 7.35x)
vvc_sao_band_96_10_c:                                13197.5 ( 1.00x)
vvc_sao_band_96_10_avx2:                              1872.4 ( 7.05x)
vvc_sao_band_96_12_c:                                11969.0 ( 1.00x)
vvc_sao_band_96_12_avx2:                              1895.8 ( 6.31x)
vvc_sao_band_112_8_c:                                19936.9 ( 1.00x)
vvc_sao_band_112_8_avx2:                              2802.3 ( 7.11x)
vvc_sao_band_112_10_c:                               19534.9 ( 1.00x)
vvc_sao_band_112_10_avx2:                             2635.0 ( 7.41x)
vvc_sao_band_112_12_c:                               16520.6 ( 1.00x)
vvc_sao_band_112_12_avx2:                             2591.8 ( 6.37x)
vvc_sao_band_128_8_c:                                25967.5 ( 1.00x)
vvc_sao_band_128_8_avx2:                              3155.3 ( 8.23x)
vvc_sao_band_128_10_c:                               24002.6 ( 1.00x)
vvc_sao_band_128_10_avx2:                             3374.6 ( 7.11x)
vvc_sao_band_128_12_c:                               20829.4 ( 1.00x)
vvc_sao_band_128_12_avx2:                             3377.0 ( 6.17x)
vvc_sao_edge_8_8_c:                                    174.6 ( 1.00x)
vvc_sao_edge_8_8_avx2:                                  37.0 ( 4.72x)
vvc_sao_edge_8_10_c:                                   174.4 ( 1.00x)
vvc_sao_edge_8_10_avx2:                                 58.5 ( 2.98x)
vvc_sao_edge_8_12_c:                                   171.1 ( 1.00x)
vvc_sao_edge_8_12_avx2:                                 58.5 ( 2.93x)
vvc_sao_edge_16_8_c:                                   677.7 ( 1.00x)
vvc_sao_edge_16_8_avx2:                                 72.2 ( 9.39x)
vvc_sao_edge_16_10_c:                                  724.8 ( 1.00x)
vvc_sao_edge_16_10_avx2:                               106.4 ( 6.81x)
vvc_sao_edge_16_12_c:                                  647.0 ( 1.00x)
vvc_sao_edge_16_12_avx2:                               106.6 ( 6.07x)
vvc_sao_edge_32_8_c:                                  3001.8 ( 1.00x)
vvc_sao_edge_32_8_avx2:                                157.6 (19.04x)
vvc_sao_edge_32_10_c:                                 3071.1 ( 1.00x)
vvc_sao_edge_32_10_avx2:                               404.2 ( 7.60x)
vvc_sao_edge_32_12_c:                                 2698.6 ( 1.00x)
vvc_sao_edge_32_12_avx2:                               398.8 ( 6.77x)
vvc_sao_edge_48_8_c:                                  6557.7 ( 1.00x)
vvc_sao_edge_48_8_avx2:                                380.1 (17.25x)
vvc_sao_edge_48_10_c:                                 6319.9 ( 1.00x)
vvc_sao_edge_48_10_avx2:                               896.3 ( 7.05x)
vvc_sao_edge_48_12_c:                                 6306.4 ( 1.00x)
vvc_sao_edge_48_12_avx2:                               885.5 ( 7.12x)
vvc_sao_edge_64_8_c:                                 11510.7 ( 1.00x)
vvc_sao_edge_64_8_avx2:                                504.1 (22.84x)
vvc_sao_edge_64_10_c:                                10917.4 ( 1.00x)
vvc_sao_edge_64_10_avx2:                              1608.3 ( 6.79x)
vvc_sao_edge_64_12_c:                                11499.8 ( 1.00x)
vvc_sao_edge_64_12_avx2:                              1586.4 ( 7.25x)
vvc_sao_edge_80_8_c:                                 18193.2 ( 1.00x)
vvc_sao_edge_80_8_avx2:                                930.2 (19.56x)
vvc_sao_edge_80_10_c:                                17984.3 ( 1.00x)
vvc_sao_edge_80_10_avx2:                              2420.9 ( 7.43x)
vvc_sao_edge_80_12_c:                                18289.4 ( 1.00x)
vvc_sao_edge_80_12_avx2:                              2412.1 ( 7.58x)
vvc_sao_edge_96_8_c:                                 26361.8 ( 1.00x)
vvc_sao_edge_96_8_avx2:                               1118.4 (23.57x)
vvc_sao_edge_96_10_c:                                26162.2 ( 1.00x)
vvc_sao_edge_96_10_avx2:                              3666.9 ( 7.13x)
vvc_sao_edge_96_12_c:                                25926.6 ( 1.00x)
vvc_sao_edge_96_12_avx2:                              3433.9 ( 7.55x)
vvc_sao_edge_112_8_c:                                36562.9 ( 1.00x)
vvc_sao_edge_112_8_avx2:                              1741.0 (21.00x)
vvc_sao_edge_112_10_c:                               38126.4 ( 1.00x)
vvc_sao_edge_112_10_avx2:                             5153.3 ( 7.40x)
vvc_sao_edge_112_12_c:                               36345.7 ( 1.00x)
vvc_sao_edge_112_12_avx2:                             4684.9 ( 7.76x)
vvc_sao_edge_128_8_c:                                46379.8 ( 1.00x)
vvc_sao_edge_128_8_avx2:                              2012.4 (23.05x)
vvc_sao_edge_128_10_c:                               47029.5 ( 1.00x)
vvc_sao_edge_128_10_avx2:                             6162.2 ( 7.63x)
vvc_sao_edge_128_12_c:                               49647.3 ( 1.00x)
vvc_sao_edge_128_12_avx2:                             6127.1 ( 8.10x)

Co-authored-by: Nuo Mi <nuomi2021@gmail.com>
2025-05-14 20:55:39 +08:00
Mark Thompson
d03c99441d lavc/apv: AVX2 transquant for x86-64
Typical checkasm result on Alder Lake:

decode_transquant_8_c:                                 464.2 ( 1.00x)
decode_transquant_8_avx2:                               86.2 ( 5.38x)
decode_transquant_10_c:                                481.6 ( 1.00x)
decode_transquant_10_avx2:                              83.5 ( 5.77x)
2025-04-27 15:52:30 +01:00
Rodger Combs
779cbc2b97 checkasm: add tests for AES
Signed-off-by: James Almer <jamrial@gmail.com>
2025-04-06 11:02:10 -03:00
Martin Storsjö
b863b81500 checkasm: Implement helpers for defining and checking padded rects
This backports similar functionality from dav1d, from commits
35d1d011fda4a92bcaf42d30ed137583b27d7f6d and
d130da9c315d5a1d3968d278bbee2238ad9051e7.

This allows detecting writes out of bounds, on all 4 sides of
the intended destination rectangle.

The bounds checking also can optionally allow small overwrites
(up to a specified alignment), while still checking for larger
overwrites past the intended allowed region.
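
The technique can be sketched with a canary-filled guard band around the destination. This is a simplified illustration, not the helper API added by this commit: `PaddedRect`, `rect_init`, `rect_pixels` and `rect_padding_intact` are hypothetical names, the rectangle is limited to 16x16 bytes, and only horizontal overwrites up to an alignment are tolerated.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum { PAD = 8, MAX_DIM = 16, CANARY = 0xAA };

typedef struct PaddedRect {
    uint8_t buf[(MAX_DIM + 2 * PAD) * (MAX_DIM + 2 * PAD)];
    int w, h, stride;
} PaddedRect;

/* Fill the whole buffer, padding included, with a known canary value. */
static void rect_init(PaddedRect *r, int w, int h)
{
    assert(w > 0 && w <= MAX_DIM && h > 0 && h <= MAX_DIM);
    r->w = w;
    r->h = h;
    r->stride = w + 2 * PAD;
    memset(r->buf, CANARY, sizeof(r->buf));
}

/* Pointer to the top-left pixel of the intended destination rectangle. */
static uint8_t *rect_pixels(PaddedRect *r)
{
    return r->buf + PAD * r->stride + PAD;
}

/* Returns 1 if the padding on all 4 sides still holds the canary. Writes
 * to the right are tolerated up to the width rounded up to align_w. */
static int rect_padding_intact(const PaddedRect *r, int align_w)
{
    int allowed_w, rows = r->h + 2 * PAD;

    assert(align_w >= 1 && align_w <= PAD);
    allowed_w = (r->w + align_w - 1) / align_w * align_w;
    for (int y = 0; y < rows; y++) {
        for (int x = 0; x < r->stride; x++) {
            int inside_y = y >= PAD && y < PAD + r->h;
            int inside_x = x >= PAD && x < PAD + allowed_w;
            if (inside_y && inside_x)
                continue;               /* allowed region, not checked */
            if (r->buf[y * r->stride + x] != CANARY)
                return 0;               /* out-of-bounds write detected */
        }
    }
    return 1;
}
```

A test would call rect_init(), hand rect_pixels() and the stride to the function under test, then call rect_padding_intact(). A stray write that happens to store the canary value goes unnoticed, which is a known limitation of this kind of check.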

Signed-off-by: Martin Storsjö <martin@martin.st>
2025-04-01 18:34:51 +03:00
Martin Storsjö
37c664a253 checkasm: Make checkasm_fail_func return whether we should print verbosely
This makes it easier to implement custom error printouts in tests.

This is a port of dav1d's commit
13a7d78655f8747c2cd01e8a48d44dcc7f60a8e5 into ffmpeg's checkasm.

Signed-off-by: Martin Storsjö <martin@martin.st>
2025-04-01 18:34:48 +03:00
Niklas Haas
256a38101f tests/checkasm: fix wrong summation of bench time
This was changed 8 years ago with the introduction of the linux-perf path,
with seemingly no justification at the time. Likely a developer oversight
from testing.

This bug not only made --runs completely ineffective, but also meant that we
didn't actually correctly filter out outliers.

Fixes: e0d56f097f
2025-03-31 15:27:24 +02:00
Martin Storsjö
47b1e1bd84 checkasm: vvc: Use checkasm_check for printing failing output
Share the checkasm_check_pixel macro from hevc_pel in checkasm.h,
to allow other tests to use the same. (To use it in other tests,
those tests need to have a similar setup for high bitdepth pixels,
with a local variable named "bit_depth".)

This simplifies the code for checking the output, and can print
the failing output (including a map of matching/mismatching
elements) if checkasm is run with the -v/--verbose option.

Signed-off-by: Martin Storsjö <martin@martin.st>
2024-12-10 11:26:09 +02:00
Zhao Zhili
018ec4fe5f tests/checkasm: Simplify logic for WASI signal handling
Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
Reviewed-by: Martin Storsjö <martin@martin.st>
2024-12-06 10:48:11 +08:00
Zhao Zhili
ea3d21c349 tests/checkasm: Add partial support for wasm
WASI is missing signal and siglongjmp support. This patch works around
the build error and adds the simd128 flag. Please note that many tests use
large arrays on the stack, so you need to increase the stack size when
building checkasm, e.g., --extra-ldflags='-Wl,-z,stack-size=10485760'

Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
2024-12-04 16:43:07 +08:00
Kyosuke Kawakami
711290f9a3 checkasm/diracdsp: test add_dirac_obmc
Signed-off-by: Kyosuke Kawakami <kawakami150708@gmail.com>
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
2024-11-15 13:44:53 -05:00
Ramiro Polla
834964ce1a checkasm/mpegvideoencdsp: add pix_sum, pix_norm1, and draw_edges 2024-08-26 12:48:09 +02:00
Ramiro Polla
a2e01cade8 checkasm/yuv2yuv: add tests for semiplanar unscaled converters 2024-08-26 11:04:46 +02:00
Ramiro Polla
1fb77347c8 checkasm: add tests for yuv2rgb 2024-06-28 14:49:49 +02:00
Ramiro Polla
874152033d checkasm: add tests for {lum,chr}ConvertRange 2024-06-16 00:34:24 +02:00
Rémi Denis-Courmont
fc85aff72f checkasm: add linear least square tests 2024-06-01 18:05:58 +03:00
Rémi Denis-Courmont
44f7f6e010 checkasm: add h263dsp.{h,v}_loop_filter 2024-05-27 22:42:07 +03:00
J. Dekker
b1adf6d1d0 checkasm: add runs argument to adjust during bench
Some timers on certain device and test combinations can produce noisy
results, affecting the reliability of performance measurements. One
notable example of this is the Canaan K230 RISC-V development board.

An option to adjust the number of samples by an exponent (--runs) has
been added, allowing developers to increase the sample count for more
reliable results.

Signed-off-by: J. Dekker <jdek@itanimul.li>
2024-05-21 16:47:45 +02:00
Wu Jianhua
9ef6e15b04 tests/checkasm: add checkasm_check_vvc_alf and check_alf_filter
Signed-off-by: Wu Jianhua <toqsxw@outlook.com>
2024-05-14 19:21:35 +08:00
Ramiro Polla
250c0defa2 checkasm: add test for fdct
Reviewed-by: Martin Storsjö <martin@martin.st>
Reviewed-by: Rémi Denis-Courmont <remi@remlab.net>
2024-05-11 10:28:59 +02:00
sunyuechi
cfa8d2488d checkasm/rv40dsp: add chroma_mc test
This is similar to h264.

Signed-off-by: Rémi Denis-Courmont <remi@remlab.net>
2024-05-03 18:00:53 +03:00
sunyuechi
6728edadde checkasm/rv34dsp: add rv34_inv_transform_dc test
Signed-off-by: Rémi Denis-Courmont <remi@remlab.net>
2024-02-17 14:33:35 +02:00
Wu Jianhua
fb26c7bfd4 tests/checkasm: add checkasm_check_vvc_mc
Signed-off-by: Wu Jianhua <toqsxw@outlook.com>
2024-02-01 19:54:29 +08:00
sunyuechi
202a35ecdb checkasm/svqenc: add ssd_int8_vs_int16 test
Signed-off-by: Rémi Denis-Courmont <remi@remlab.net>
2024-01-15 19:03:03 +02:00
Martin Storsjö
65739691b9 checkasm: Generalize crash handling
This replaces the riscv specific handling from
7212466e73 (which essentially is
reverted), with a different implementation of the same (plus a bit
more), based on the corresponding feature in dav1d's checkasm,
supporting both Unix and Windows.

See in particular the dav1d commits
0b6ee30eab2400e4f85b735ad29a68a842c34e21,
0421f787ea592fd2cc74c887f20b8dc31393788b,
8501a4b20135f93a4c3b426468e2240e872949c5 and
d23e87f7aee26ddcf5f7a2e185112031477599a7, authored by Henrik Gramner.

The overall approach compared to the existing implementation for
riscv is the same; set up a signal handler, store the state with
sigsetjmp, jump out of the crashing function with siglongjmp.

The main difference is in what happens when the signal handler
is invoked. In the previous implementation, it would resume from
right before calling the crashing function, and then skip that call
based on the setjmp return value.

In the imported implementation from dav1d, we return to right before
the check_func() call, which will skip testing the current function
(as the pointer is the same as it was before).

Other differences are:
- Support for other signal handling mechanisms (Windows
  AddVectoredExceptionHandler)
- Using RtlCaptureContext/RtlRestoreContext instead of setjmp/longjmp
  on Windows with SEH
- Only catching signals once per function - if more than one
  signal is delivered before signal handling is reenabled, any
  signal is handled as it would without our handler
- Not using an arch specific signal handler written in assembly
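
The shared approach (signal handler, sigsetjmp, siglongjmp) can be sketched for POSIX systems as below. This is an illustration under assumptions, not the checkasm code: `run_guarded`, `on_fatal_signal` and the two demo functions are hypothetical, the sketch returns to a small wrapper rather than to just before check_func(), and the Windows paths are not covered.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <setjmp.h>
#include <signal.h>

static sigjmp_buf checkpoint;
static volatile sig_atomic_t catch_signals;

static void on_fatal_signal(int sig)
{
    if (catch_signals) {
        catch_signals = 0;            /* only catch once per function */
        siglongjmp(checkpoint, sig);  /* jump out of the crashing function */
    }
    /* Not armed: behave as if our handler was never installed. */
    signal(sig, SIG_DFL);
    raise(sig);
}

/* Runs fn(); returns 0 if it completed, or the signal number it raised. */
static int run_guarded(void (*fn)(void))
{
    int sig;

    signal(SIGILL,  on_fatal_signal);
    signal(SIGSEGV, on_fatal_signal);

    sig = sigsetjmp(checkpoint, 1);   /* 1: save and restore the signal mask */
    if (sig == 0) {
        catch_signals = 1;
        fn();
        catch_signals = 0;
    }
    return sig;
}

/* Demo functions standing in for tested assembly routines. */
static void well_behaved(void) { }
static void crashes(void)      { raise(SIGILL); }
```

Here run_guarded(crashes) returns SIGILL instead of terminating the process, and a later run_guarded(well_behaved) still returns 0 because the handler is re-armed on every call.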

Signed-off-by: Martin Storsjö <martin@martin.st>
2024-01-11 14:48:53 +02:00
sunyuechi
3bdb0fe511 checkasm/takdsp: add decorrelate_ls test
Signed-off-by: Rémi Denis-Courmont <remi@remlab.net>
2023-12-21 22:42:34 +02:00
Martin Storsjö
f5e3e9e04e checkasm: Remove unnecessary const on scalar parameters
The ffmpeg coding style doesn't usually use const on scalar
parameters (or on the pointer values - as opposed to the type
that is pointed to, where it has a semantic meaning), contrary
to the dav1d coding style (where this was imported from).

This avoids warnings about differences in the type signatures
between declaration and definition of this function, with older
versions of MSVC.

The issue was observed with one version of MSVC 2017,
19.16.27024.1, with warnings like these:

    src/tests/checkasm/checkasm.c(969): warning C4028: formal parameter 3 different from declaration

The warning itself is bogus as the const here is harmless, and
newer versions of MSVC no longer warn about this.

Signed-off-by: Martin Storsjö <martin@martin.st>
2023-12-21 00:14:41 +02:00
sunyuechi
1c3620b2bb checkasm: test for abs_pow34
Signed-off-by: Rémi Denis-Courmont <remi@remlab.net>
2023-12-11 18:42:07 +02:00
Rémi Denis-Courmont
b3825bbe45 riscv: test for assembler support
This should fix the build on LLVM 16 and earlier, at the cost of turning
all non-RVV optimisations off.
2023-12-08 17:21:09 +02:00
sunyuechi
d0ec826077 checkasm/ac3dsp: add float_to_fixed24 test
Signed-off-by: Rémi Denis-Courmont <remi@remlab.net>
2023-12-01 20:26:48 +02:00
Rémi Denis-Courmont
7212466e73 checkasm/riscv: report an error upon SIGILL
Terminating the whole checkasm process is not very helpful. This will
report if an illegal instruction occurs while executing a tested
function. This is a common occurrence whilst developing RISC-V
assembler, because compatibility between the vector configuration and
an instruction is only checked at run-time.
2023-11-23 19:04:07 +02:00
Rémi Denis-Courmont
286d674221 checkasm: add helper to report a fatal signal 2023-11-23 18:57:18 +02:00
Rémi Denis-Courmont
6720a509a7 checkasm: add lossless audio DSP 2023-11-16 16:53:44 +02:00
Andreas Rheinhardt
f8503b4c33 avutil/internal: Don't auto-include emms.h
Instead include emms.h wherever it is needed.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2023-09-04 11:04:45 +02:00
Lynne
783270bfd1 checkasm: add h264chroma tests
Checks all variants of put_h264_chroma and avg_h264_chroma.
2023-05-20 20:07:21 +02:00
J. Dekker
68c151cb1b checkasm: add hevc_deblock chroma test
Signed-off-by: J. Dekker <jdek@itanimul.li>
2023-04-06 06:16:57 +02:00
James Darnley
087faf8cac checkasm: add test for bwdif 2023-03-25 02:38:17 +01:00
bwang30
3ab11dc5bb libavfilter/x86/vf_convolution: add sobel filter optimization and unit test with intel AVX512 VNNI
This commit enables assembly code with Intel AVX512 VNNI and adds a unit test for the sobel filter

sobel_c: 4537
sobel_avx512icl: 2136

Signed-off-by: bwang30 <bin.wang@intel.com>
Signed-off-by: Haihao Xiang <haihao.xiang@intel.com>
2022-11-14 10:04:16 +08:00
James Darnley
1936c06f02 checkasm: add a verbose check function for uint32_t data 2022-11-04 19:37:46 +01:00
Rémi Denis-Courmont
c962c78901 checkasm: RISC-V 64-bit assembler test harness 2022-10-10 02:23:18 +02:00
Lynne
3ade6a8644 x86/lpc: implement a new Welch windowing function
Old one was written with the assumption only even inputs would be given.
This very messy replacement supports even and odd inputs, and supports
AVX2 for extra speed. The buffers given are usually quite big (4k samples),
so the speedup is worth it.
The new SSE version is still faster than the old inline asm version by 33%.

Also checkasm is provided to make sure this monstrosity works.

This fixes some FATE tests.
2022-09-21 07:12:39 +02:00
James Almer
8f119b501e tests/checkasm: add a test for VorbisDSPContext
Signed-off-by: James Almer <jamrial@gmail.com>
2022-09-19 21:28:23 -03:00
Martin Storsjö
5cdf4c0bed checkasm: Silence warnings about unused return value from read()
This codepath is enabled by default on arm, if the linux perf API
is available, unless disabled with --disable-linux-perf.

Signed-off-by: Martin Storsjö <martin@martin.st>
2022-08-08 23:39:13 +03:00