Because of the lack of an external ABI on low-level kernels, we cannot
directly test internal functions. Instead, we construct a minimal op chain
consisting of a read, the op to be tested, and a write.
The bigger complication arises from the fact that the backend may generate
arbitrary internal state that needs to be passed back to the implementation,
which means we cannot directly call `func_ref` on the generated chain. To get
around this, always compile the op chain twice - once using the backend to be
tested, and once using the reference C backend.
The actual entry point may also just be a shared wrapper, so we need to
be very careful to run checkasm_check_func() on a pseudo-pointer that will
actually be unique for each combination of backend and active CPU flags.
We split the standard macro into its body (implementation) and declaration,
and use a macro argument in place of the raw `memcmp` call, with the major
difference that we now take the number of pixels to compare instead of the
number of bytes (to match the signature of float_near_ulp_array).
Sometimes, when measuring very small functions, rdtsc is not accurate enough
to get a reliable measurement. This increases the number of runs inside the
inner loop from 4 to 32, which should help a lot. Less important when using
the more precise linux-perf API, but still useful.
There should be no user-visible change since the number of runs is adjusted
to keep the total time spent measuring the same.
Also ensure that the dst buffers are not too big
(they had the right size for >8 bit depths and were therefore
too big for eight bit, letting potential buffer overflows
in the eight bit version go undetected).
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
This is useful for tests that compare dctcoefs which will be either 2 bytes or
4 bytes, depending on bitdepth.
Reviewed-by: Martin Storsjö <martin@martin.st>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
This backports similar functionality from dav1d, from commits
35d1d011fda4a92bcaf42d30ed137583b27d7f6d and
d130da9c315d5a1d3968d278bbee2238ad9051e7.
This allows detecting writes out of bounds, on all 4 sides of
the intended destination rectangle.
The bounds checking also can optionally allow small overwrites
(up to a specified alignment), while still checking for larger
overwrites past the intended allowed region.
Signed-off-by: Martin Storsjö <martin@martin.st>
This makes it easier to implement custom error printouts in tests.
This is a port of dav1d's commit
13a7d78655f8747c2cd01e8a48d44dcc7f60a8e5 into ffmpeg's checkasm.
Signed-off-by: Martin Storsjö <martin@martin.st>
This was changed 8 years ago with the introduction of the linux-perf path,
with seemingly no justification at the time. Likely a developer oversight
from testing.
This bug not only made --runs completely ineffective, but also meant that we
didn't actually correctly filter out outliers.
Fixes: e0d56f097f
Share the checkasm_check_pixel macro from hevc_pel in checkasm.h,
to allow other tests to use the same. (To use it in other tests,
those tests need to have a similar setup for high bitdepth pixels,
with a local variable named "bit_depth".)
This simplifies the code for checking the output, and can print
the failing output (including a map of matching/mismatching
elements) if checkasm is run with the -v/--verbose option.
Signed-off-by: Martin Storsjö <martin@martin.st>
WASI mssing signal and siglongjmp support. This patch workaround
build error and add simd128 flag. Please note that many tests use
large array on stack, so you need to increase the stack size when
build checkasm, e.g., --extra-ldflags='-Wl,-z,stack-size=10485760'
Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
Some timers on certain device and test combinations can produce noisy
results, affecting the reliability of performance measurements. One
notable example of this is the Canaan K230 RISC-V development board.
An option to adjust the number of samples by an exponent (--runs) has
been added, allowing developers to increase the sample count for more
reliable results.
Signed-off-by: J. Dekker <jdek@itanimul.li>
This replaces the riscv specific handling from
7212466e73 (which essentially is
reverted), with a different implementation of the same (plus a bit
more), based on the corresponding feature in dav1d's checkasm,
supporting both Unix and Windows.
See in particular the dav1d commits
0b6ee30eab2400e4f85b735ad29a68a842c34e21,
0421f787ea592fd2cc74c887f20b8dc31393788b,
8501a4b20135f93a4c3b426468e2240e872949c5 and
d23e87f7aee26ddcf5f7a2e185112031477599a7, authored by Henrik Gramner.
The overall approach compared to the existing implementation for
riscv is the same; set up a signal handler, store the state with
sigsetjmp, jump out of the crashing function with siglongjmp.
The main difference is in what happens when the signal handler
is invoked. In the previous implementation, it would resume from
right before calling the crashing function, and then skip that call
based on the setjmp return value.
In the imported implementation from dav1d, we return to right before
the check_func() call, which will skip testing the current function
(as the pointer is the same as it was before).
Other differences are:
- Support for other signal handling mechanisms (Windows
AddVectoredExceptionHandler)
- Using RtlCaptureContext/RtlRestoreContext instead of setjmp/longjmp
on Windows with SEH
- Only catching signals once per function - if more than one
signal is delivered before signal handling is reenabled, any
signal is handled as it would without our handler
- Not using an arch specific signal handler written in assembly
Signed-off-by: Martin Storsjö <martin@martin.st>
The ffmpeg coding style doesn't usually use const on scalar
parameters (or on the pointer values - as opposed to the type
that is pointed to, where it has a semantic meaning), contrary
to the dav1d coding style (where this was imported from).
This avoids warnings about differences in the type signatures
between declaration and definition of this function, with older
versions of MSVC.
The issue was observed with one version of MSVC 2017,
19.16.27024.1, with warnings like these:
src/tests/checkasm/checkasm.c(969): warning C4028: formal parameter 3 different from declaration
The warning itself is bogus as the const here is harmless, and
newer versions of MSVC no longer warn about this.
Signed-off-by: Martin Storsjö <martin@martin.st>
Terminating the whole checkasm process is not very helpful. This will
report if an illegal instruction occurs while executing a tested
function. This is a common occurrence whilst developping RISC-V
assembler, due to the compatibility between vector configuration and
instruction done at run-time.
This commit enabled assembly code with intel AVX512 VNNI and added unit test for sobel filter
sobel_c: 4537
sobel_avx512icl 2136
Signed-off-by: bwang30 <bin.wang@intel.com>
Signed-off-by: Haihao Xiang <haihao.xiang@intel.com>
Old one was written with the assumption only even inputs would be given.
This very messy replacement supports even and odd inputs, and supports
AVX2 for extra speed. The buffers given are usually quite big (4k samples),
so the speedup is worth it.
The new SSE version is still faster than the old inline asm version by 33%.
Also checkasm is provided to make sure this monstrosity works.
This fixes some FATE tests.
This codepath is enabled by default on arm, if the linux perf API
is available, unless disabled with --disable-linux-perf.
Signed-off-by: Martin Storsjö <martin@martin.st>