1
0
mirror of https://github.com/FFmpeg/FFmpeg.git synced 2024-12-18 03:19:31 +02:00
Commit Graph

437 Commits

Author SHA1 Message Date
Wu Jianhua
fb26c7bfd4 tests/checkasm: add checkasm_check_vvc_mc
Signed-off-by: Wu Jianhua <toqsxw@outlook.com>
2024-02-01 19:54:29 +08:00
Michael Niedermayer
d6c62b4bce
tests/checkasm/aacencdsp: Use float_near_ulp_array() for abs_pow34() test
Fixes: ticket/10818
Approved-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2024-01-24 00:42:13 +01:00
Martin Storsjö
ac40c3bb07 checkasm: Test whether the native FFmpeg timers work
On some platforms (in particular, ARM/AArch64), the implementation
of AV_READ_TIME() may use a privileged instruction - in such
cases, benchmarking just fails with a SIGILL.

Instead of crashing, try executing AV_READ_TIME() once within
a region with the signal handler active, to allow gracefully
informing the user about the issue.

This matches the dav1d checkasm commit
95a192549a448b70d9542e840c4e34b60d09b093.

Signed-off-by: Martin Storsjö <martin@martin.st>
2024-01-15 23:29:12 +02:00
sunyuechi
202a35ecdb checkasm/svqenc: add ssd_int8_vs_int16 test
Signed-off-by: Rémi Denis-Courmont <remi@remlab.net>
2024-01-15 19:03:03 +02:00
Martin Storsjö
65739691b9 checkasm: Generalize crash handling
This replaces the riscv specific handling from
7212466e73 (which essentially is
reverted), with a different implementation of the same (plus a bit
more), based on the corresponding feature in dav1d's checkasm,
supporting both Unix and Windows.

See in particular the dav1d commits
0b6ee30eab2400e4f85b735ad29a68a842c34e21,
0421f787ea592fd2cc74c887f20b8dc31393788b,
8501a4b20135f93a4c3b426468e2240e872949c5 and
d23e87f7aee26ddcf5f7a2e185112031477599a7, authored by Henrik Gramner.

The overall approach compared to the existing implementation for
riscv is the same; set up a signal handler, store the state with
sigsetjmp, jump out of the crashing function with siglongjmp.

The main difference is in what happens when the signal handler
is invoked. In the previous implementation, it would resume from
right before calling the crashing function, and then skip that call
based on the setjmp return value.

In the imported implementation from dav1d, we return to right before
the check_func() call, which will skip testing the current function
(as the pointer is the same as it was before).

Other differences are:
- Support for other signal handling mechanisms (Windows
  AddVectoredExceptionHandler)
- Using RtlCaptureContext/RtlRestoreContext instead of setjmp/longjmp
  on Windows with SEH
- Only catching signals once per function - if more than one
  signal is delivered before signal handling is reenabled, any
  signal is handled as it would without our handler
- Not using an arch specific signal handler written in assembly

Signed-off-by: Martin Storsjö <martin@martin.st>
2024-01-11 14:48:53 +02:00
James Almer
46775e64f8 avcodec/takdsp: fix const correctness
Signed-off-by: James Almer <jamrial@gmail.com>
2023-12-22 09:28:04 -03:00
James Almer
c5029bb193 checkasm/takdsp: add decorrelate_sf test
Signed-off-by: James Almer <jamrial@gmail.com>
2023-12-22 09:26:38 -03:00
Martin Storsjö
935837c3d3 checkasm: Fix the takdsp tests
For memcpy and memcmp, we need to multiply by the element size,
otherwise we're copying and comparing only a fraction of the buffer.

For decorrelate_sr, the buffer p1 is the one that is mutated;
copy and check p1 instead of p2.

For decorrelate_sm, both buffers are mutated, so copy and check
both of them.

For decorrelate_sm, the memcpy initialization of p1 and p1_2 was
reversed - p1 is filled with randomize, but then memcpy copies from
p1_2 to p1. As p1_2 is uninitialized at this point, clang concluded
that the copy was bogus and omitted it entirely, triggering failures
in this test on x86 (where there was an existing assembly implementation
to test).

Signed-off-by: Martin Storsjö <martin@martin.st>
2023-12-22 14:20:31 +02:00
sunyuechi
21e2b6b501 checkasm/takdsp: add decorrelate_sm test
Signed-off-by: Rémi Denis-Courmont <remi@remlab.net>
2023-12-21 22:42:34 +02:00
sunyuechi
c064823b95 checkasm/takdsp: add decorrelate_sr test
Signed-off-by: Rémi Denis-Courmont <remi@remlab.net>
2023-12-21 22:42:34 +02:00
sunyuechi
3bdb0fe511 checkasm/takdsp: add decorrelate_ls test
Signed-off-by: Rémi Denis-Courmont <remi@remlab.net>
2023-12-21 22:42:34 +02:00
Martin Storsjö
f5e3e9e04e checkasm: Remove unnecessary const on scalar parameters
The ffmpeg coding style doesn't usually use const on scalar
parameters (or on the pointer values - as opposed to the type
that is pointed to, where it has a semantic meaning), contrary
to the dav1d coding style (where this was imported from).

This avoids warnings about differences in the type signatures
between declaration and definition of this function, with older
versions of MSVC.

The issue was observed with one version of MSVC 2017,
19.16.27024.1, with warnings like these:

    src/tests/checkasm/checkasm.c(969): warning C4028: formal parameter 3 different from declaration

The warning itself is bogus as the const here is harmless, and
newer versions of MSVC no longer warn about this.

Signed-off-by: Martin Storsjö <martin@martin.st>
2023-12-21 00:14:41 +02:00
sunyuechi
1c3620b2bb checkasm: test for abs_pow34
Signed-off-by: Rémi Denis-Courmont <remi@remlab.net>
2023-12-11 18:42:07 +02:00
Rémi Denis-Courmont
b3825bbe45 riscv: test for assembler support
This should fix the build on LLVM 16 and earlier, at the cost of turning
all non-RVV optimisations off.
2023-12-08 17:21:09 +02:00
Martin Storsjö
12598e72e3 checkasm: Fix the signature of float_to_fixed24
The len parameter was changed from unsigned int to size_t in
567c67c6c8.

This fixes crashes in the reference C code, when running checkasm
on aarch64.

Signed-off-by: Martin Storsjö <martin@martin.st>
2023-12-02 18:16:09 +02:00
sunyuechi
d0ec826077 checkasm/ac3dsp: add float_to_fixed24 test
Signed-off-by: Rémi Denis-Courmont <remi@remlab.net>
2023-12-01 20:26:48 +02:00
sunyuechi
ea6817d2a7 checkasm: test for dcmul_add
Signed-off-by: Rémi Denis-Courmont <remi@remlab.net>
2023-11-27 17:55:24 +02:00
Rémi Denis-Courmont
7212466e73 checkasm/riscv: report an error upon SIGILL
Terminating the whole checkasm process is not very helpful. This will
report if an illegal instruction occurs while executing a tested
function. This is a common occurrence whilst developping RISC-V
assembler, due to the compatibility between vector configuration and
instruction done at run-time.
2023-11-23 19:04:07 +02:00
Rémi Denis-Courmont
286d674221 checkasm: add helper to report a fatal signal 2023-11-23 18:57:18 +02:00
Rémi Denis-Courmont
8a984aca59 checkasm/flacdsp: add LPC test 2023-11-18 22:01:59 +02:00
Rémi Denis-Courmont
be1675035f checkasm/flacdsp: fix ls/rs/ms tests
decorrelate_ls, _rs and _ms are decorrelate[1], [2] and [3] respectively.
The code ended up testing indep ([0]) as twice, ms never, and misnaming
the other two.
2023-11-17 23:59:22 +02:00
Rémi Denis-Courmont
6720a509a7 checkasm: add lossless audio DSP 2023-11-16 16:53:44 +02:00
Rémi Denis-Courmont
6b708cd783 checkasm/huffyuvdsp: test for add_hfyu_left_pred_bgr32 2023-11-15 16:51:07 +02:00
Rémi Denis-Courmont
20e6195c54 checkasm: test the noise case of sbrdsp.hf_apply_noise
The tested functions treat s_m[i] == 0 as a special case. Other than
that, the functions are slightly complicated vector additions.

This actually makes the zero case happen pseudorandomly.
2023-11-13 18:34:29 +02:00
Rémi Denis-Courmont
427347309b checkasm: test with random bw value
With a value of zero, the function is a glorified memory copy.
2023-11-12 22:33:11 +02:00
Andreas Rheinhardt
0228e27ded checkasm/motion: Don't allocate AVCodecContext
Instead use one on the stack to avoid pulling in all
of libavcodec.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2023-09-28 00:17:47 +02:00
Andreas Rheinhardt
80afcc8539 avfilter/bwdif: Add proper BWDIFDSPContext
This already avoids unnecessary indirectly included headers
in the arch-specific vf_bwdif_init.c files; it is also in
preparation for splitting the actual functions out of vf_bwdif.c.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2023-09-28 00:17:47 +02:00
Andreas Rheinhardt
a84fe06112 avcodec/idctdsp: Avoid inclusion of avcodec.h
Not every user of idctdsp.h wants to initialize an IDCTDSPContext;
e.g. the proresdsp only uses ff_init_scantable_permutation()
and the IDCT permutation enum; similarly for cavsdsp and wmv2dsp.
Using a forward declaration here avoids an avcodec.h dependency
in the relevant files.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2023-09-11 00:26:34 +02:00
Andreas Rheinhardt
f8503b4c33 avutil/internal: Don't auto-include emms.h
Instead include emms.h wherever it is needed.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2023-09-04 11:04:45 +02:00
Andreas Rheinhardt
a39d6e81fa tests/checkasm/sw_scale: Avoid declare_func_emms where possible
This makes the test stricter because it is checked that the
MMX registers are not accidentally clobbered.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2023-09-04 11:04:45 +02:00
Andreas Rheinhardt
e7866e00c8 tests/checkasm/llvidencdsp: Don't use declare_func_emms
Only sub_media_pred has an MMXEXT version, so one can use
the version with the stricter check (that checks that
the MMX registers have not been clobbered) for sub_left_predict.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2023-09-04 11:04:45 +02:00
Andreas Rheinhardt
3f82b38516 tests/checkasm/hevc_*: Avoid using declare_func_emms where possible
Only the idct_dc and add_residual functions have MMX versions,
so one can use the version with the stricter check (that checks
that the MMX registers have not been clobbered) for all the other
checks.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2023-09-04 11:04:45 +02:00
Matthias Dressel
e41bd6e65e checkasm: hevc_sao: Fix a regression in hevc_sao_edge
check_func() might return NULL, in which case the function is not to be
benched. Introduced in cc679054c7.

Signed-off-by: Matthias Dressel <code@deadcode.eu>
Signed-off-by: Martin Storsjö <martin@martin.st>
2023-08-24 22:09:37 +03:00
Rémi Denis-Courmont
f25ad0fe02 checkasm: improve Linux perf error message
Report the failing system call name, as is convention, rather than just
a rather unhelpful "syscall".
2023-07-22 21:35:15 +03:00
Rémi Denis-Courmont
b6585eb04c lavu: add/use flag for RISC-V Zba extension
The code was blindly assuming that Zbb or V implied Zba. While the
earlier is practically always true, the later broke some QEMU setups,
as V was introduced earlier than Zba.
2023-07-19 19:29:35 +03:00
Rémi Denis-Courmont
98e4dd39c5 checkasm: test Zbb before V
Without this, Zbb functions get shadowed by V functions on devices
supporting both extensions, and never tested.
2023-07-19 19:29:35 +03:00
Rémi Denis-Courmont
d8ea5f50e2 checkasm: print usage on invalid arguments
This checks that arguments are handled. If not, then this prints a
short usage notice and returns an error.
2023-07-17 18:48:42 +03:00
John Cox
697533e76d avfilter/vf_bwdif: Add a filter_line3 method for optimisation
Add an optional filter_line3 to the available optimisations.

filter_line3 is equivalent to filter_line, memcpy, filter_line

filter_line shares quite a number of loads and some calculations in
common with its next iteration and testing shows that using aarch64
neon filter_line3s performance is 30% better than two filter_lines
and a memcpy.

Adds a test for vf_bwdif filter_line3 to checkasm

Rounds job start lines down to a multiple of 4. This means that if
filter_line3 exists then filter_line will not sometimes be called
once at the end of a slice depending on thread count. The final slice
may do up to 3 extra lines but filter_edge is faster than filter_line
so it is unlikely to create any noticable thread load variation.

Signed-off-by: John Cox <jc@kynesim.co.uk>
Signed-off-by: Martin Storsjö <martin@martin.st>
2023-07-06 00:21:05 +03:00
John Cox
7ed7c00f55 tests/checkasm: Add test for vf_bwdif filter_edge
Signed-off-by: John Cox <jc@kynesim.co.uk>
Signed-off-by: Martin Storsjö <martin@martin.st>
2023-07-06 00:21:05 +03:00
John Cox
7caa8d6b91 tests/checkasm: Add test for vf_bwdif filter_intra
Signed-off-by: John Cox <jc@kynesim.co.uk>
Signed-off-by: Martin Storsjö <martin@martin.st>
2023-07-06 00:21:05 +03:00
Martin Storsjö
397cb623c8 aarch64: Add cpu flags for the dotprod and i8mm extensions
Set these available if they are available unconditionally for
the compiler.

Signed-off-by: Martin Storsjö <martin@martin.st>
2023-06-06 12:40:42 +03:00
Lynne
783270bfd1
checkasm: add h264chroma tests
Checks all variants of put_h264_chroma and avg_h264_chroma.
2023-05-20 20:07:21 +02:00
xufuji456
1e91a39502 checkasm: pass context as pointer
Signed-off-by: xufuji456 <839789740@qq.com>
Signed-off-by: Martin Storsjö <martin@martin.st>
2023-04-13 15:17:04 +03:00
xufuji456
30def6365d checkasm/hevc: add transform_luma test
Signed-off-by: xufuji456 <839789740@qq.com>
Signed-off-by: Martin Storsjö <martin@martin.st>
2023-04-13 15:17:04 +03:00
J. Dekker
68c151cb1b checkasm: add hevc_deblock chroma test
Signed-off-by: J. Dekker <jdek@itanimul.li>
2023-04-06 06:16:57 +02:00
James Darnley
087faf8cac checkasm: add test for bwdif 2023-03-25 02:38:17 +01:00
Lynne
bbe95f7353
x86: replace explicit REP_RETs with RETs
From x86inc:
> On AMD cpus <=K10, an ordinary ret is slow if it immediately follows either
> a branch or a branch target. So switch to a 2-byte form of ret in that case.
> We can automatically detect "follows a branch", but not a branch target.
> (SSSE3 is a sufficient condition to know that your cpu doesn't have this problem.)

x86inc can automatically determine whether to use REP_RET rather than
REP in most of these cases, so impact is minimal. Additionally, a few
REP_RETs were used unnecessary, despite the return being nowhere near a
branch.

The only CPUs affected were AMD K10s, made between 2007 and 2011, 16
years ago and 12 years ago, respectively.

In the future, everyone involved with x86inc should consider dropping
REP_RETs altogether.
2023-02-01 04:23:55 +01:00
James Darnley
eef763c705 checkasm/v210dec: add extra space to the destination arrays 2022-12-21 00:36:49 +01:00
James Darnley
6af453ca38 avcodec/x86: add avx512icl function for v210dec
Ice Lake (Xeon Silver 4316): 2.01x faster (1147±36.8 vs. 571±38.2 decicycles) compared with avx2
2022-12-20 15:02:45 +01:00
James Darnley
cfd1c3c0a1 checkasm/v210enc: test the entire width of 10-bit planar input arrays 2022-12-01 18:19:03 +01:00