mirror of https://github.com/FFmpeg/FFmpeg.git synced 2024-12-28 20:53:54 +02:00
409 Commits

Author SHA1 Message Date
James Almer
d52ceed9fd tests/checkasm/sw_scale: use memset() to fill dither
Signed-off-by: James Almer <jamrial@gmail.com>
2021-02-19 16:19:11 -03:00
Alan Kelly
ee18edb13a checkasm/sw_scale: properly initialize src_pixer and filter_coeff buffers
Fixes valgrind uninitialised value warnings.

Signed-off-by: James Almer <jamrial@gmail.com>
2021-02-19 11:20:32 -03:00
James Almer
1371647fc3 checkasm/sw_scale: use av_free() instead of free()
Fixes crashes on Win64

Signed-off-by: James Almer <jamrial@gmail.com>
2021-02-17 20:57:33 -03:00
Alan Kelly
554c2bc708 swscale: move yuv2yuvX_sse3 to yasm, unrolls main loop
And other small optimizations for ~20% speedup.
2021-02-17 21:21:03 +01:00
James Almer
bea7c51307 checkasm/vf_gblur: add a test for postscale_slice
Signed-off-by: James Almer <jamrial@gmail.com>
2021-02-17 13:39:31 -03:00
James Almer
2df3c2ed9b checkasm/vf_gblur: split off the horiz_slice test into its own function
Will come in handy for the following commit.

Signed-off-by: James Almer <jamrial@gmail.com>
2021-02-17 13:39:11 -03:00
Josh Dekker
9c513edb79 checkasm: add hevc_pel tests
Co-authored-by: Niklas Haas <git@haasn.xyz>
Signed-off-by: Josh Dekker <josh@itanimul.li>
2021-01-25 09:24:11 +01:00
Anton Khirnov
c8c2dfbc37 lavu: move LOCAL_ALIGNED from internal.h to mem_internal.h
That is a more appropriate place for it.
2021-01-01 14:11:01 +01:00
Limin Wang
c748bd77dc tests: fix warning ISO C90 forbids mixed declarations and code
Signed-off-by: Limin Wang <lance.lmwang@gmail.com>
2020-09-10 20:34:51 +08:00
Carl Eugen Hoyos
b61376bdee lavfi/hflip: Support Bayer pixel formats.
Fixes part of ticket #8819.
2020-08-25 01:29:24 +02:00
Jiaxun Yang
e387fcd01c libavutil: Detect MMI and MSA flags for MIPS
Add MMI & MSA runtime detection for MIPS.

Basically there are two code paths. For systems that
natively support the CPUCFG instruction, or whose kernel
emulates it, we'll sense this feature from HWCAP and
report the flags according to the values read from CPUCFG.
For systems that have no CPUCFG (or do not export it in
HWCAP), we'll parse /proc/cpuinfo instead.

Signed-off-by: Jiaxun Yang <jiaxun.yang@flygoat.com>
Reviewed-by: Shiyou Yin <yinshiyou-hf@loongson.cn>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2020-07-23 17:21:58 +02:00
James Almer
55e1bc39cb checkasm/vf_blend: use the correct depth parameters to initialize the blend modes
This effectively enables the tests that until now were just running
the C version alone.

Signed-off-by: James Almer <jamrial@gmail.com>
2020-07-12 11:30:23 -03:00
Jun Zhao
7f76f20fa0 checkasm: sw_rgb: Fix mixed declaration and code
Fix mixed declaration and code.

Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: Jun Zhao <barryjzhao@tencent.com>
2020-06-01 23:28:07 +08:00
Andreas Rheinhardt
57e570b508 checkasm/sw_scale: Fix stack-buffer-overflow
A buffer whose size is not a multiple of four has been initialized using
consecutive writes of 32bits. This results in a stack-buffer-overflow
reported by ASAN in the checkasm-sw_scale FATE-test.

Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
2020-05-20 23:18:50 +02:00
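The overflow described in 57e570b508 above can be avoided by writing whole 32-bit words only while four bytes remain and finishing the tail bytewise. The following is a minimal sketch of that idea, not the code from the commit; `demo_rnd` and `fill_random` are hypothetical names, and the generator is a plain linear congruential stand-in for whatever RNG a test would use.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a test RNG: a small linear congruential generator. */
static uint32_t demo_rnd(uint32_t *state)
{
    *state = *state * 1664525u + 1013904223u;
    return *state;
}

/* Fill buf with pseudo-random bytes without ever writing past buf + size. */
static void fill_random(uint8_t *buf, size_t size, uint32_t *state)
{
    size_t i = 0;

    /* Whole 32-bit words first; memcpy avoids alignment assumptions. */
    for (; i + 4 <= size; i += 4) {
        uint32_t r = demo_rnd(state);
        memcpy(buf + i, &r, 4);
    }
    /* Remaining 1 to 3 bytes, one at a time. */
    for (; i < size; i++)
        buf[i] = (uint8_t)(demo_rnd(state) >> 24);
}
```

Calling `fill_random(buf, 6, &state)` on an 8-byte array leaves the last two bytes untouched, which is the property ASAN was checking for.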
Martin Storsjö
9c326af1d0 checkasm: swscale: Fix running the hscale test on 32 bit x86
This function doesn't call emms.

Signed-off-by: Martin Storsjö <martin@martin.st>
2020-05-16 08:16:12 +03:00
Martin Storsjö
eba1ebd9bf checkasm: sw_rgb: Add a test for interleaveBytes
Signed-off-by: Martin Storsjö <martin@martin.st>
2020-05-15 23:38:01 +03:00
Martin Storsjö
5bdffced0a checkasm: pixblockdsp: Add tests for get_pixels_unaligned and diff_pixels_unaligned
Signed-off-by: Martin Storsjö <martin@martin.st>
2020-05-15 23:37:27 +03:00
Martin Storsjö
ed7d73355e checkasm: aarch64: Check for stack overflows
Also fill x8-x17 with garbage before calling the function.

Figure out the number of stack parameters and make sure that the
value on the stack after those is untouched.

Signed-off-by: Martin Storsjö <martin@martin.st>
2020-05-15 21:22:36 +03:00
Martin Storsjö
6cb2d4d94b checkasm: arm: Check for stack overflows
Figure out the number of stack parameters and make sure that the
value on the stack after those is untouched.

Signed-off-by: Martin Storsjö <martin@martin.st>
2020-05-15 21:22:34 +03:00
Martin Storsjö
3f266cf49e checkasm: arm: Don't use blx to call checkasm_fail_func
We should just use a normal bl here, and the linker will add the 'x'
bit if necessary.

This fixes calling the checkasm_fail_func on windows, where the
code is built in thumb mode (and the linker doesn't clear the 'x'
bit in the blx instruction).

Signed-off-by: Martin Storsjö <martin@martin.st>
2020-05-15 21:22:32 +03:00
Martin Storsjö
89cf9e1fb6 checkasm: arm: Make the indentation consistent with other files
This makes it easier to share code with e.g. the dav1d implementation
of checkasm.

Signed-off-by: Martin Storsjö <martin@martin.st>
2020-05-15 21:22:27 +03:00
Josh de Kock
5913cd4e6c checkasm: add hscale test
This tests the hscale 8bpp to 14/18bpp functions with different filter
sizes.

Signed-off-by: Josh de Kock <josh@itanimul.li>
2020-05-15 10:29:30 +01:00
Martin Storsjö
3ce1b2bf8d checkasm: add function to check and diff memory
This was ported from dav1d (c950e7101bdf5f7117bfca816984a21e550509f0).

Signed-off-by: Josh de Kock <josh@itanimul.li>
2020-05-15 10:29:30 +01:00
Linjie Fu
ddf6ca3a0e tests/checkasm: add overflow test for hevc_add_res
Add overflow test for hevc_add_res when int16_t coeff = -32768.

The result of C is good, while ASM is not.

To verify:
    make fate-checkasm-hevc_add_res
    ffmpeg/tests/checkasm/checkasm --test=hevc_add_res

./checkasm --test=hevc_add_res
checkasm: using random seed 679391863
MMXEXT:
    hevc_add_res_4x4_8_mmxext (hevc_add_res.c:69)
  - hevc_add_res.add_residual [FAILED]
SSE2:
    hevc_add_res_8x8_8_sse2 (hevc_add_res.c:69)
    hevc_add_res_16x16_8_sse2 (hevc_add_res.c:69)
    hevc_add_res_32x32_8_sse2 (hevc_add_res.c:69)
  - hevc_add_res.add_residual [FAILED]
AVX:
    hevc_add_res_8x8_8_avx (hevc_add_res.c:69)
    hevc_add_res_16x16_8_avx (hevc_add_res.c:69)
    hevc_add_res_32x32_8_avx (hevc_add_res.c:69)
  - hevc_add_res.add_residual [FAILED]
AVX2:
    hevc_add_res_32x32_8_avx2 (hevc_add_res.c:69)
  - hevc_add_res.add_residual [FAILED]
checkasm: 8 of 14 tests have failed

Signed-off-by: Xu Guangxin <guangxin.xu@intel.com>
Signed-off-by: Linjie Fu <linjie.fu@intel.com>
Signed-off-by: Anton Khirnov <anton@khirnov.net>
2020-03-27 10:57:40 +01:00
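For reference, the behaviour the test in ddf6ca3a0e above expects at coeff = -32768 can be sketched in plain C. This is an illustrative 8-bit version, not FFmpeg's actual implementation; `clip_u8` and `add_residual_8` are hypothetical names. Because the sum is formed in `int`, the extreme residual cannot wrap before clipping.

```c
#include <stddef.h>
#include <stdint.h>

/* Clamp an int to the 8-bit pixel range [0, 255]. */
static uint8_t clip_u8(int v)
{
    return v < 0 ? 0 : v > 255 ? 255 : (uint8_t)v;
}

/* Reference-style add_residual for 8-bit pixels: dst = clip(dst + res). */
static void add_residual_8(uint8_t *dst, const int16_t *res,
                           int size, ptrdiff_t stride)
{
    for (int y = 0; y < size; y++) {
        for (int x = 0; x < size; x++)
            dst[x] = clip_u8(dst[x] + res[x]); /* sum is computed in int */
        dst += stride;
        res += size;
    }
}
```

An implementation that performs the addition in 16-bit lanes has to take care that 255 + (-32768) or similar intermediate values are handled with the same result as this integer version.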
Linjie Fu
69b9548dd6 checkasm/hevc_add_res: prepare test data only if the function is not tested
check_func will return NULL for functions that have already been tested. If
the func is tested and skipped (which happens several times), there is no
need to prepare data (randomize_buffers and memcpy).

Move the relevant code into compare_add_res(), and prepare data and do the check
only if the function is not tested.

Signed-off-by: Linjie Fu <linjie.fu@intel.com>
Signed-off-by: Anton Khirnov <anton@khirnov.net>
2020-03-27 10:57:40 +01:00
Martin Storsjö
5181f491ee checkasm: sbrdsp: Fix a spurious test failure by calculating a better epsilon for sum_square
Signed-off-by: Martin Storsjö <martin@martin.st>
2020-02-08 23:00:21 +02:00
Martin Storsjö
cbb254cb4c checkasm: Check HAVE_GETSTDHANDLE here as well
This was missed in 63418e374f.

Signed-off-by: Martin Storsjö <martin@martin.st>
2020-01-24 22:17:18 +02:00
Martin Storsjö
aad0e26f93 checkasm: aacpsdsp: Tolerate extra intermediate precision in stereo_interpolate
The stereo_interpolate functions add h_step to the values h
BUF_SIZE times. Within the stereo_interpolate C functions, the
values h (h0-h3, h00-h13) are declared as local float variables,
but the compiler is free to keep them in a register with extra
precision.

If the accumulation is rounded to 32 bit float precision after
each step, the less significant bits of h_step end up ignored
and the sum can deviate, affecting the end result more than
the currently set EPS.

By clearing the log2(BUF_SIZE) lower bits of h_step, we make sure
that the accumulation shouldn't differ significantly, regardless
of any extra precision in the accumulating register/variable.

This fixes the aacpsdsp checkasm test when built with clang for
mingw/x86_32.

Signed-off-by: Martin Storsjö <martin@martin.st>
2019-12-18 15:15:29 +02:00
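The bit-clearing step described in aad0e26f93 above can be sketched as follows. This is an illustration under the assumption that `float` is IEEE 754 binary32, not the code from the commit; `clear_low_mantissa_bits` is a hypothetical helper name.

```c
#include <stdint.h>
#include <string.h>

/* Return v with its `bits` least significant mantissa bits cleared.
 * Assumes IEEE 754 binary32 floats and 0 <= bits <= 23. */
static float clear_low_mantissa_bits(float v, int bits)
{
    uint32_t u;

    memcpy(&u, &v, sizeof(u));            /* reinterpret without aliasing issues */
    u &= ~((UINT32_C(1) << bits) - 1);    /* zero the low mantissa bits */
    memcpy(&v, &u, sizeof(v));
    return v;
}
```

With the step value prepared this way (for example with `bits` set to log2(BUF_SIZE)), repeated addition has spare low-order bits to carry into, so rounding after each step and keeping extra precision in a register are less likely to diverge.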
Martin Storsjö
f58bda642d checkasm: af_afir: Use a dynamic tolerance depending on values
As the values generated by av_bmg_get can be arbitrarily large
(only the stddev is specified), we can't use a fixed tolerance.
Calculate a dynamic tolerance (like in float_dsp from 38f966b222),
based on the individual steps of the calculation.

This fixes running this test with certain seeds, when built with
clang for mingw/x86_32.

Signed-off-by: Martin Storsjö <martin@martin.st>
2019-12-12 23:57:08 +02:00
Martin Storsjö
8f70e261fa checkasm: float_dsp: Scale FLT/DBL_EPSILON sufficiently when comparing
As the values generated by av_bmg_get can be arbitrarily large
(only the stddev is specified), we can't use a fixed tolerance.

This matches what was done for test_vector_dmul_scalar in
38f966b222.

This fixes the float_dsp checkasm test for some seeds, when built
with clang for mingw/x86_32.

Signed-off-by: Martin Storsjö <martin@martin.st>
2019-12-11 22:19:45 +02:00
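The idea of scaling the epsilon, as described in 8f70e261fa and f58bda642d above, can be sketched for a single multiplication. This is an assumption-laden illustration rather than the checkasm code: `abs_d` and `product_near` are hypothetical names, and the safety factor of 2 is arbitrary.

```c
#include <float.h>

/* Absolute value without needing libm. */
static double abs_d(double v)
{
    return v < 0 ? -v : v;
}

/* Compare two results of a * b, with a tolerance that grows with the
 * magnitude of the operands instead of being a fixed constant. */
static int product_near(double ref, double test, double a, double b)
{
    double tol = abs_d(a) * abs_d(b) * DBL_EPSILON * 2; /* hypothetical factor */

    return abs_d(ref - test) <= tol;
}
```

With operands around 1e10, a difference of one unit in the last place of the product is about 16384, which a fixed tolerance would reject while the scaled one accepts it.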
Ting Fu
9691e2a426 checkasm/vf_eq: add test for vf_eq
Signed-off-by: Ting Fu <ting.fu@intel.com>
Signed-off-by: Ruiling Song <ruiling.song@intel.com>
2019-09-26 08:10:31 +08:00
James Almer
1d86e4b3eb checkasm/opusdsp: declare opus_deemphasis as a function returning a float
Fixes ticket #8175

Signed-off-by: James Almer <jamrial@gmail.com>
2019-09-18 11:42:06 -03:00
Lynne
4ce1e13b54 checkasm: add opusdsp tests
2019-09-11 03:28:22 +01:00
Ruiling Song
8f4963ad25 checkasm/vf_gblur: add test for horiz_slice simd
Signed-off-by: Ruiling Song <ruiling.song@intel.com>
2019-06-12 08:54:05 +08:00
James Darnley
76c370af64 checkasm: add test for v210dec
2019-05-02 19:21:37 +02:00
James Almer
0dda0f3bdb Merge commit 'f8abf7d4dfa0504f7f65e4f1fd9d22e01cb371cc'
* commit 'f8abf7d4dfa0504f7f65e4f1fd9d22e01cb371cc':
  checkasm/h264: test 4:2:2 chroma loop filter functions

Merged-by: James Almer <jamrial@gmail.com>
2019-03-14 16:31:41 -03:00
James Almer
06476249cd Merge commit '7e5bde93a1e7641e1622814dafac0be3f413d79b'
* commit '7e5bde93a1e7641e1622814dafac0be3f413d79b':
  build: Rename OBJDIRS variable to OUTDIRS

Merged-by: James Almer <jamrial@gmail.com>
2019-03-10 19:31:13 -03:00
Janne Grunau
f8abf7d4df checkasm/h264: test 4:2:2 chroma loop filter functions
2019-02-27 21:57:05 +01:00
James Almer
f32d293955 Merge commit 'd7f4f5c4a18a0c9e62635cfa6fe8a9302b413c01'
* commit 'd7f4f5c4a18a0c9e62635cfa6fe8a9302b413c01':
  checkasm/h264: add loop filter tests

Merged-by: James Almer <jamrial@gmail.com>
2019-02-20 15:28:25 -03:00
Diego Biurrun
7e5bde93a1 build: Rename OBJDIRS variable to OUTDIRS
These directories are not just for object files.
2019-02-16 13:09:35 +01:00
Carl Eugen Hoyos
608572ce84 tests/checkasm/checkasm: Do not define an unused function.
Fixes the following warning:
tests/checkasm/checkasm.c:615:12: warning: 'bench_init_ffmpeg' defined but not used
2019-01-31 20:16:17 +01:00
Janne Grunau
d7f4f5c4a1 checkasm/h264: add loop filter tests
2019-01-26 12:05:10 +01:00
James Almer
f477ee3e89 checkasm/af_afir: relax the max allowed absolute difference
Should fix failures on x86_32.

Signed-off-by: James Almer <jamrial@gmail.com>
2019-01-13 15:00:20 -03:00
James Almer
ba89dc27b5 checkasm: add an af_afir test
Reviewed-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: James Almer <jamrial@gmail.com>
2019-01-03 10:12:18 -03:00
James Almer
93bf1dcaec checkasm/float_dsp: add test for vector_dmul
Signed-off-by: James Almer <jamrial@gmail.com>
2018-09-14 12:51:55 -03:00
Andrey Semashev
d7eb8d8475 lavfi/tests: Fix 16-bit vf_blend test to avoid memory not aligned to 2 bytes
Generic C implementation of vf_blend performs reads and writes of 16-bit
elements, which requires the buffers to be aligned to at least 2-byte
boundary.

Also, the change fixes source buffer overrun caused by src_offset being
added to test handling of misaligned buffers.

Fixes: #7226

Reviewed-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2018-05-30 02:42:10 +02:00
Clément Bœsch
2940af9389 tests/checkasm/nlmeans: fix invalid read/write on ii buffer
2018-05-18 19:12:11 +02:00
Jun Zhao
b30575bc98 checkasm/sw_rgb: fix the function declaration warning
Fix the warning "function declaration isn’t a prototype". In C,
int foo() and int foo(void) are different declarations: int foo()
accepts an unspecified number of arguments, while int foo(void)
accepts no arguments.

Signed-off-by: Jun Zhao <mypopydev@gmail.com>
2018-05-10 19:28:51 +08:00
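The distinction drawn in b30575bc98 above can be illustrated with two declarations. This is a sketch with hypothetical function names; it reflects C before C23, since C23 makes an empty parameter list equivalent to `(void)`.

```c
/* Old-style declaration: no prototype, so the compiler does not check
 * the arguments passed by callers (this is what the warning is about). */
int legacy_answer();

/* Prototype: explicitly declares that the function takes no arguments. */
int checked_answer(void);

int legacy_answer()
{
    return 41;
}

int checked_answer(void)
{
    return legacy_answer() + 1;
}
```

Compilers typically report the first form only when asked to, for example with GCC's -Wstrict-prototypes.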
Clément Bœsch
f679711c1b checkasm: add vf_nlmeans test for ssd_integral_image
2018-05-08 10:28:06 +02:00
Martin Vignali
07a566e7d6 swscale/swscale_unscaled : add X86_64 (SSE2 and AVX) for uyvyto422
and checkasm test
2018-04-22 19:15:32 +02:00
Michael Niedermayer
18d6ff2b42 tests/checkasm/checkasm: Provide verbose failure information on float_near_abs_eps() failures
This will make understanding failures and adjusting EPS easier

Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2018-04-14 15:54:06 +02:00
Martin Vignali
595505083a checkasm/vf_blend : add test for 16 bit version of
grainextract
grainmerge
average
extremity
negation
2018-04-05 21:46:19 +02:00
Josh de Kock
cda43940da checkasm/Makefile: add EXTRALIBS-libavformat
Signed-off-by: Josh de Kock <josh@itanimul.li>
2018-03-31 23:20:16 +01:00
Martin Vignali
a9a7ed4f27 checkasm/swscale : add test for rgb shuffle_bytes func
2018-03-24 20:22:12 +01:00
Yingming Fan
e5b4cd4c4a checkasm/hevc_idct : update test bit depth from 8 9 and 10 to 8 10 and 12
Signed-off-by: James Almer <jamrial@gmail.com>
2018-03-19 00:56:01 -03:00
Yingming Fan
80798e3857 checkasm/hevc_sao : add hevc_sao for checkasm
Signed-off-by: James Almer <jamrial@gmail.com>
2018-03-07 23:53:32 -03:00
Martin Vignali
c0919c4985 checkasm/vf_blend : add test for blend_simple_16, phoenix_16 and difference_16
2018-02-24 21:44:23 +01:00
Martin Vignali
e3fc36a84c checkasm/vf_blend : add depth param in order to add test for 16 bit version
2018-02-24 21:44:09 +01:00
Muhammad Faiz
81d6501be7 checkasm/Makefile: add EXTRALIBS-swresample
Should fix https://ffmpeg.org/pipermail/ffmpeg-devel/2018-February/225058.html

Signed-off-by: Muhammad Faiz <mfcc64@gmail.com>
2018-02-09 17:50:44 +07:00
Martin Vignali
78b982d3b9 checkasm : add test for losslessvideoencdsp for diff bytes and sub_left_pred
2018-01-28 20:23:16 +01:00
James Darnley
40d4b13228 checkasm: support for AVX-512 functions
2017-12-24 22:02:41 +01:00
James Almer
da03242778 Revert "checkasm/vf_interlace : add test for lowpass_line 8 and 16"
This reverts commit adff97be5e.

It currently fails on Windows targets.

Signed-off-by: James Almer <jamrial@gmail.com>
2017-12-19 19:07:24 -03:00
Martin Vignali
adff97be5e checkasm/vf_interlace : add test for lowpass_line 8 and 16
2017-12-19 20:59:51 +01:00
Martin Vignali
cefb7e0060 checkasm/vf_hflip : add test for vf_hflip byte and short simd
2017-12-13 11:34:29 +01:00
Martin Storsjö
18a0f42026 checkasm: Use LOCAL_ALIGNED for aligned variables on the stack
This fixes fate-checkasm-hevc_mc on ARMCC 5.0 after adding
NEON HEVC MC assembly.

Signed-off-by: Martin Storsjö <martin@martin.st>
2017-12-12 11:36:38 +02:00
James Almer
1215889bc1 checkasm/llviddsp: fix mixed code and declarations
Signed-off-by: James Almer <jamrial@gmail.com>
2017-12-10 00:51:35 -03:00
Martin Vignali
e1121f9723 checkasm/llviddsp : add test for add_gradient_pred
2017-12-09 15:19:07 +01:00
Martin Vignali
5bda11e70e checkasm/llviddsp : test return of add_left_pred(16)
2017-12-09 15:15:56 +01:00
Martin Vignali
179a2f04eb checkasm/vf_threshold : add test for threshold16
2017-12-09 14:47:13 +01:00
James Almer
1b324700e3 checkasm/vf_threshold: fix mixed code and declarations
Signed-off-by: James Almer <jamrial@gmail.com>
2017-12-04 15:46:09 -03:00
Martin Vignali
cfce442750 checkasm/vf_threshold : add checkasm test for threshold8
2017-12-03 19:17:15 +01:00
Martin Vignali
9bed17cd0f checkasm/utvideo : be more explicit to the WIDTH_PADDED define
2017-12-01 21:08:07 +01:00
Michael Niedermayer
38f966b222 tests/checkasm/float_dsp: Increase allowed difference for float_dsp.vector_dmul
Tested for 10000 iterations on x86-32

Fixes: Ticket6848

Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2017-11-27 03:31:53 +01:00
James Almer
bea8eeaa2c checkasm/utvideodsp: zero initialize the entire buffer
Signed-off-by: James Almer <jamrial@gmail.com>
2017-11-21 11:24:38 -03:00
James Almer
9a05c873cf checkasm/utvideodsp: fix mixed declarations and code
Signed-off-by: James Almer <jamrial@gmail.com>
2017-11-21 11:13:24 -03:00
Martin Vignali
4a6aa6d1b2 checkasm : add test for huffyuvdsp add_int16
2017-11-21 09:41:42 +01:00
Martin Vignali
6a7eb65e1b checkasm : add utvideodsp test
2017-11-21 09:00:27 +01:00
James Almer
501435e5e6 checkasm/jpeg2000dsp: add test for ict_float
Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: James Almer <jamrial@gmail.com>
2017-11-20 18:33:57 -03:00
James Almer
20a93ea8d4 checkasm/jpeg2000dsp: refactor rct_int test
Signed-off-by: James Almer <jamrial@gmail.com>
2017-11-20 18:33:57 -03:00
James Almer
0cef66c906 Merge commit '516c479172755c63063180b0c0953b68b670cdbd'
* commit '516c479172755c63063180b0c0953b68b670cdbd':
  checkasm: Test more h264 idct variants

See 2d263188ba

Merged-by: James Almer <jamrial@gmail.com>
2017-11-11 15:21:22 -03:00
James Almer
2d263188ba Merge commit '547db1eaecd597031165a2bf637acaaacde52788'
* commit '547db1eaecd597031165a2bf637acaaacde52788':
  checkasm: Test more h264 idct variants

Merged-by: James Almer <jamrial@gmail.com>
2017-11-11 13:18:55 -03:00
James Almer
4cfb46f94f checkasm/llviddsp: fix warnings about mixed declaration and code
Signed-off-by: James Almer <jamrial@gmail.com>
2017-11-08 14:54:15 -03:00
Martin Vignali
fbe9148779 checkasm/llviddsp : add test for other dsp func
add_median_pred
add_left_pred : add two func one with acc 0, and one with random acc
add_left_pred16
2017-11-07 00:54:17 +01:00
James Almer
122a749dfc Merge commit 'd05c9cde0e87c23ca42957646bea483dfc09d6bf'
* commit 'd05c9cde0e87c23ca42957646bea483dfc09d6bf':
  checkasm: aarch64: Specify alignment for the register_init const array

Merged-by: James Almer <jamrial@gmail.com>
2017-10-30 21:03:50 -03:00
James Almer
f568d9d0ba Merge commit 'e00db9f78bb475ed5103364f61892f4e75ef89ba'
* commit 'e00db9f78bb475ed5103364f61892f4e75ef89ba':
  checkasm: hevc: Add a hevc_ prefix to the add_residual functions

Merged-by: James Almer <jamrial@gmail.com>
2017-10-28 18:18:41 -03:00
James Almer
6dfcbd80ad Merge commit '7cb1d9e2dbbe5bf4652be5d78cdd68e956fa3d63'
* commit '7cb1d9e2dbbe5bf4652be5d78cdd68e956fa3d63':
  build: Fine-grained link-time dependency settings

Also included are bug fix commits 5ff3b5cafc,
d9da7151ee and
5e27ef800b.

Merged-by: James Almer <jamrial@gmail.com>
2017-10-11 17:55:25 -03:00
Martin Vignali
cbbec68847 libavcodec/blockdsp : add AVX version
Also modify the required alignment, to 32 instead of 16
for several codecs

Signed-off-by: James Almer <jamrial@gmail.com>
2017-10-03 19:47:37 -03:00
Martin Vignali
ac5908b13f libavcodec/exr : add x86 SIMD for predictor
Signed-off-by: James Almer <jamrial@gmail.com>
2017-10-01 17:35:30 -03:00
Martin Storsjö
516c479172 checkasm: Test more h264 idct variants
Signed-off-by: Martin Storsjö <martin@martin.st>
2017-09-27 13:58:39 +03:00
James Almer
7323c896b2 checkasm: add an exrdsp test
Signed-off-by: James Almer <jamrial@gmail.com>
2017-09-17 19:01:40 -03:00
Clément Bœsch
e0d56f097f checkasm: use perf API on Linux ARM*
On ARM platforms, accessing the PMU registers requires special user
access permissions. Since there is no other way to get accurate timers,
the current implementation of timers in FFmpeg relies on these registers.
Unfortunately, enabling user access to these registers on Linux is not
trivial, and generally involves compiling a random and unreliable GitHub
kernel module, or somehow patching your kernel.

Such a module is very unlikely to reach upstream anytime soon. Quoting
Robin Murphin from ARM:

> Say you do give userspace direct access to the PMU; now run two or more
> programs at once that believe they can use the counters for their own
> "minimal-overhead" profiling. Have fun interpreting those results...
>
> And that's not even getting into the implications of scheduling across
> different CPUs, CPUidle, etc. where the PMU state is completely beyond
> userspace's control. In general, the plan to provide userspace with
> something which might happen to just about work in a few corner cases,
> but is meaningless, misleading or downright broken in all others, is to
> never do so.

As a result, the alternative is to use the Performance Monitoring Linux
API which makes use of these registers internally (assuming the PMU of
your ARM board is supported in the kernel, which is definitely not a
given...).

While the Linux API is obviously cross-platform, it does have a
significant overhead which needs to be taken into account. As a result,
that mode is only weakly enabled, and on ARM platforms exclusively.

Note on the inflexibility of the implementation: the timers (native
FFmpeg vs Linux API) are selected at compilation time to avoid the
need for function calls, which would have a negative impact on the
cycle counters.
2017-09-08 18:51:05 +02:00
Martin Storsjö
e12f1cd616 Revert "checkasm: Test more h264 idct variants"
This reverts commit 547db1eaec.

This commit wasn't supposed to be pushed (yet) since it hasn't
been reviewed.

Signed-off-by: Martin Storsjö <martin@martin.st>
2017-09-02 22:23:30 +03:00
Martin Storsjö
547db1eaec checkasm: Test more h264 idct variants
2017-08-31 14:55:34 +03:00
James Almer
e51073fe00 checkasm/vf_blend: rename addition128 and difference128 to grainmerge and grainextract
This was missing from f8d0689d3f.
Fixes checkasm.
2017-08-24 23:39:09 -03:00
James Almer
6f205a42d7 checkasm: add hybrid_analysis_ileave and hybrid_synthesis_deint tests to aacpsdsp
Signed-off-by: James Almer <jamrial@gmail.com>
2017-07-13 17:03:28 -03:00
James Almer
823cc7e25f checkasm: add a g722dsp test
Signed-off-by: James Almer <jamrial@gmail.com>
2017-07-13 17:00:19 -03:00
James Almer
3d3243577c checkasm: use declare_func_float() in sbrdsp sum_square test
The function returns a float.

This fixes the test in x86_32 targets.

Signed-off-by: James Almer <jamrial@gmail.com>
2017-07-04 23:02:57 -03:00
Matthieu Bouron
7864e07f4a checkasm: add sbrdsp tests
2017-07-03 14:28:17 +02:00
James Almer
0eb783eb06 checkasm: randomize the full input buffer in test_hybrid_analysis
Missed in the last commit.
2017-06-30 22:49:54 -03:00
James Almer
fb7b477a91 checkasm: fix size of input buffer in test_hybrid_analysis
2017-06-30 20:37:06 -03:00
Clément Bœsch
b12a36170b lavc/aacpsdsp: use ptrdiff_t for stride in hybrid_analysis
2017-06-28 12:22:39 +02:00
Clément Bœsch
edd041e64c checkasm: add AAC PS tests
This includes various fixes and improvements from James Almer.

Signed-off-by: James Almer <jamrial@gmail.com>
2017-06-28 12:22:39 +02:00
James Almer
fa50d9360b x86/vf_blend: add sse and ssse3 extremity functions
Reviewed-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: James Almer <jamrial@gmail.com>
2017-06-27 13:17:23 -03:00
James Almer
a579dbb4f7 checkasm: add missing checks to float_dsp's butterflies_float test
2017-06-23 23:38:07 -03:00
Matthieu Bouron
067e42b851 checkasm/aarch64: fix tests returning a float
Avoids overwriting the v0 register (which contains the result of the
tested function) in checkasm_call_checked.
2017-06-22 09:18:10 +02:00
Diego Biurrun
fd502f4f5f build: Generalize yasm/nasm-related variable names
None of them are specific to the YASM assembler.

(Cherry-picked from libav commit 39e208f4d4)

Signed-off-by: James Almer <jamrial@gmail.com>
2017-06-21 17:00:29 -03:00
James Almer
5b10f484e2 checkasm: add float_dsp tests
Ported from libavutil/tests/float_dsp.c

Signed-off-by: James Almer <jamrial@gmail.com>
2017-06-14 19:20:10 -03:00
James Almer
37388b119c checkasm: add a checkasm_checked_call function that doesn't issue emms
Meant for DSP functions returning a float or double, as they'd fail if emms
is called after every run on x86_32.

Signed-off-by: James Almer <jamrial@gmail.com>
2017-06-14 19:18:56 -03:00
James Almer
93dc1c1221 checkasm: add _fixed suffix to fixed_dsp tests
Should prevent future conflicts with the similarly named floatdsp tests.
2017-06-01 13:12:20 -03:00
Martin Storsjö
d05c9cde0e checkasm: aarch64: Specify alignment for the register_init const array
Loads from this don't strictly require alignment, but specify it
just for consistency with the arm version.

Signed-off-by: Martin Storsjö <martin@martin.st>
2017-05-15 10:19:46 +03:00
Martin Storsjö
e00db9f78b checkasm: hevc: Add a hevc_ prefix to the add_residual functions
This makes it easier to group them with the rest when running e.g.
--bench=hevc.

Signed-off-by: Martin Storsjö <martin@martin.st>
2017-04-21 13:32:44 +03:00
James Almer
7b3cb953f7 checkasm: add fixed_dsp tests
Tested-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: James Almer <jamrial@gmail.com>
2017-04-11 18:05:13 -03:00
Clément Bœsch
210678d3c5 Merge commit '3794062ab1a13442b06f6d76c54dce51ffa54697'
* commit '3794062ab1a13442b06f6d76c54dce51ffa54697':
  Remove Plan 9 support

Merged-by: Clément Bœsch <u@pkh.me>
2017-04-09 14:52:00 +02:00
James Almer
6747fc436e Merge commit 'effc1430b2fe5997d9d55bf28dc507c27125eb27'
* commit 'effc1430b2fe5997d9d55bf28dc507c27125eb27':
  Revert "checkasm: vp9dsp: Benchmark the dc-only version of idct_idct separately"

Merged-by: James Almer <jamrial@gmail.com>
2017-04-04 15:26:18 -03:00
Clément Bœsch
edfa7ac8ec Merge commit '81d7f0bbca837afda1f7e60d3ae52ab1360ab44b'
* commit '81d7f0bbca837afda1f7e60d3ae52ab1360ab44b':
  checkasm: vp9dsp: Benchmark the dc-only version of idct_idct separately

Merged-by: Clément Bœsch <u@pkh.me>
2017-04-01 11:54:29 +02:00
Clément Bœsch
b589e83f43 Merge commit '9498237049d15812cecb79df47b196c73013908b'
* commit '9498237049d15812cecb79df47b196c73013908b':
  checkasm: Add --test parameter to check only specific components

Merged-by: Clément Bœsch <cboesch@gopro.com>
2017-03-31 10:06:13 +02:00
Clément Bœsch
1c9f4b5078 lavc/vp9: split into vp9{block,data,mvs}
This is following Libav layout to ease merges.
2017-03-27 21:38:21 +02:00
James Almer
09ce5519f3 fate/checkasm: fix use of uninitialized memory on hevc_add_res tests
2017-03-24 22:11:34 -03:00
James Almer
36eae45510 fate/checkasm: use LOCAL_ALIGNED_32 on hevc_add_res tests
2017-03-24 22:11:22 -03:00
Clément Bœsch
3d4039f964 Merge commit 'ed48a9d8143d2575a4458589cebde69ec326afd8'
* commit 'ed48a9d8143d2575a4458589cebde69ec326afd8':
  checkasm: Add a test for HEVC add_residual

Merged-by: Clément Bœsch <u@pkh.me>
2017-03-24 12:37:09 +01:00
James Almer
0d34473d8e Merge commit 'dd5d4a0e1e3a30a254d1a57ecbdcedf230c6014b'
* commit 'dd5d4a0e1e3a30a254d1a57ecbdcedf230c6014b':
  checkasm: aarch64: Don't clobber x29 in checkasm_stack_clobber

Merged-by: James Almer <jamrial@gmail.com>
2017-03-23 18:31:36 -03:00
James Almer
f23078904f Merge commit '2816f8a8bb33bd67fec5e94f5d357918caf4e055'
* commit '2816f8a8bb33bd67fec5e94f5d357918caf4e055':
  build: Drop arch-specific checkasm Makefiles

Merged-by: James Almer <jamrial@gmail.com>
2017-03-23 18:01:47 -03:00
James Almer
3ddae9eee9 Merge commit '93d5b022a9fd3a1a1f9c521a1eac7f0410e05b81'
* commit '93d5b022a9fd3a1a1f9c521a1eac7f0410e05b81':
  build: Drop duplicate asm recipe

Merged-by: James Almer <jamrial@gmail.com>
2017-03-23 17:57:35 -03:00
James Almer
67b639b496 Merge commit 'c91d6a33f872574c95c8784277cf60ffcf6bff4f'
* commit 'c91d6a33f872574c95c8784277cf60ffcf6bff4f':
  checkasm: aarch64: Add filler args to make sure all parameters are passed on the stack

Merged-by: James Almer <jamrial@gmail.com>
2017-03-23 17:38:20 -03:00
James Almer
a2d34cc51b Merge commit 'f1b3e131385176c3c9d9783b25047856a0dcebf6'
* commit 'f1b3e131385176c3c9d9783b25047856a0dcebf6':
  checkasm: aarch64: Clobber the stack before calling functions

Merged-by: James Almer <jamrial@gmail.com>
2017-03-23 17:36:53 -03:00
James Almer
cab4c7fa19 Merge commit 'a05cc56124b4f1237f6355784de821e3290ddb44'
* commit 'a05cc56124b4f1237f6355784de821e3290ddb44':
  checkasm: arm/aarch64: Fix the amount of space reserved for stack parameters

Merged-by: James Almer <jamrial@gmail.com>
2017-03-23 17:35:38 -03:00
Clément Bœsch
50bbb67472 Merge commit 'e3f941cb03b139b866a0ad6dc95fbe1b247d54af'
* commit 'e3f941cb03b139b866a0ad6dc95fbe1b247d54af':
  checkasm: add a test for HEVC IDCT

Merged-by: Clément Bœsch <u@pkh.me>
2017-03-23 12:17:39 +01:00
Diego Biurrun
dcc39ee10e lavc: Remove deprecated XvMC support hacks
Deprecated in 11/2013.
2017-03-23 10:09:14 +01:00
James Almer
30cadfe071 avcodec/lossless_videodsp: use ptrdiff_t for length parameters
Signed-off-by: James Almer <jamrial@gmail.com>
2017-03-22 18:38:35 -03:00
Clément Bœsch
7c2a7f9c11 Merge commit '22c3ab18646924ce24dc6017a9e882ff69689e40'
* commit '22c3ab18646924ce24dc6017a9e882ff69689e40':
  checkasm: Add test for huffyuvdsp add_bytes

huffyuvdsp is renamed to llviddsp to be consistent with our codebase.

Note: af607b7e07 wasn't actually required for this test since this
commit is not actually testing huffyuvdsp.

Merged-by: Clément Bœsch <u@pkh.me>
2017-03-22 16:31:38 +01:00
Clément Bœsch
83cd80d10a Merge commit '12004a9a7f20e44f4da2ee6c372d5e1794c8d6c5'
* commit '12004a9a7f20e44f4da2ee6c372d5e1794c8d6c5':
  audiodsp/x86: yasmify vector_clipf_sse
  audiodsp: reorder arguments for vector_clipf

Merged the version from Libav after a discussion with James Almer on
IRC:

19:22 <ubitux> jamrial: opinion on 12004a9a7f20e44f4da2ee6c372d5e1794c8d6c5?
19:23 <ubitux> it was apparently yasmified differently
19:23 <ubitux> (it depends on the previous commit arg shuffle)
19:24 <ubitux> i don't see the magic movsxdifnidn in your port btw
19:24 <ubitux> it's a port from 1d36defe94
19:25 <jamrial> seems better thanks to said arg shuffle
19:25 <jamrial> the loop is the same, but init is simpler
19:25 <jamrial> probably worth merging
19:25 <ubitux> OK
19:25 <ubitux> thanks
19:26 <jamrial> curious they didn't make len ptrdiff_t after the previous bunch of commits, heh
19:26 <ubitux> yeah indeed

Both commits are merged at the same time to prevent a conflict with our
existing yasmified ff_vector_clipf_sse.

Merged-by: Clément Bœsch <u@pkh.me>
2017-03-20 22:35:07 +01:00
Clément Bœsch
8414755486 Merge commit 'e9ef6171396dc4106526aaa86b620c61ca3d1017'
* commit 'e9ef6171396dc4106526aaa86b620c61ca3d1017':
  checkasm: add tests for audiodsp

Merged-by: Clément Bœsch <u@pkh.me>
2017-03-20 19:10:56 +01:00
Clément Bœsch
c50b2164a6 Merge commit '2eb97af66af90ca3978229da151f0b8b3a5d9370'
* commit '2eb97af66af90ca3978229da151f0b8b3a5d9370':
  checkasm: add a test for blockdsp

Merged-by: Clément Bœsch <u@pkh.me>
2017-03-20 19:05:05 +01:00
Clément Bœsch
e07fa3008b Merge commit 'de452e503734ebb0fdbce86e9d16693b3530fad3'
* commit 'de452e503734ebb0fdbce86e9d16693b3530fad3':
  pixblockdsp: Change type of stride parameters to ptrdiff_t

Merged-by: Clément Bœsch <u@pkh.me>
2017-03-20 15:58:32 +01:00
Clément Bœsch
3c8f7a8f6b Merge commit 'e89cef40506d990a982aefedfde7d3ca4f88c524'
* commit 'e89cef40506d990a982aefedfde7d3ca4f88c524':
  checkasm: Read the unsigned value as it should

Merged-by: Clément Bœsch <u@pkh.me>
2017-03-20 11:55:20 +01:00
James Almer
e5623aafd8 Merge commit '87c6c78604e4dd16f1f45862b27ca006da010527'
* commit '87c6c78604e4dd16f1f45862b27ca006da010527':
  vp8: Change type of stride parameters to ptrdiff_t

Merged-by: James Almer <jamrial@gmail.com>
2017-03-19 15:11:44 -03:00
Clément Bœsch
8b13492c9e Merge commit '40ad05bab206c932a32171d45581080c914b06ec'
* commit '40ad05bab206c932a32171d45581080c914b06ec':
  checkasm: Cast unsigned to signed

Merged-by: Clément Bœsch <cboesch@gopro.com>
2017-03-15 12:32:15 +01:00
Diego Biurrun
39e208f4d4 build: Generalize yasm/nasm-related variable names
None of them are specific to the YASM assembler.
2017-03-01 10:18:15 +01:00
Diego Biurrun
7cb1d9e2db build: Fine-grained link-time dependency settings
Previously, all link-time dependencies were added for all libraries,
resulting in bogus link-time dependencies since not all dependencies
are shared across libraries. Also, in some cases like libavutil, not
all dependencies were taken into account, resulting in some cases of
underlinking.

To address this, machinery is added to track which dependency belongs
to which library component; it is then used to determine the correct
dependencies for each individual library.
2017-03-01 09:00:40 +01:00
Clément Bœsch
92cb9a3869 Merge commit '9064777dbb335ab4809ae09e3fdcc0245f925cdc'
* commit '9064777dbb335ab4809ae09e3fdcc0245f925cdc':
  checkasm: add HEVC test for testing IDCT DC

Merged-by: Clément Bœsch <cboesch@gopro.com>
2017-02-02 11:40:58 +01:00
Clément Bœsch
a0860b0a38 Merge commit '6f9e34baea4f6f484392e4e67f606a0835d07b73'
* commit '6f9e34baea4f6f484392e4e67f606a0835d07b73':
  arm: Check for support for the .fpu directive

Merged-by: Clément Bœsch <cboesch@gopro.com>
2017-02-02 11:22:04 +01:00
Clément Bœsch
9f1c81e5ec Merge commit '71a0472114574993df7035f4de9aa007e03817b8'
* commit '71a0472114574993df7035f4de9aa007e03817b8':
  checkasm: arm: report the first clobbered register in checkasm_checked_call

Also includes 446353ea18, 59aeed93e4, and 37961044c6 to avoid breaking
too much stuff.

Merged-by: Clément Bœsch <u@pkh.me>
2017-01-24 19:21:29 +01:00
Martin Storsjö
388f6e6715 arm: vp9itxfm: Skip empty slices in the first pass of idct_idct 16x16 and 32x32
This work is sponsored by, and copyright, Google.

Previously all subpartitions except the eob=1 (DC) case ran with
the same runtime:

                                     Cortex A7       A8       A9      A53
vp9_inv_dct_dct_16x16_sub16_add_neon:   3188.1   2435.4   2499.0   1969.0
vp9_inv_dct_dct_32x32_sub32_add_neon:  18531.7  16582.3  14207.6  12000.3

By skipping individual 4x16 or 4x32 pixel slices in the first pass,
we reduce the runtime of these functions like this:

vp9_inv_dct_dct_16x16_sub1_add_neon:     274.6    189.5    211.7    235.8
vp9_inv_dct_dct_16x16_sub2_add_neon:    2064.0   1534.8   1719.4   1248.7
vp9_inv_dct_dct_16x16_sub4_add_neon:    2135.0   1477.2   1736.3   1249.5
vp9_inv_dct_dct_16x16_sub8_add_neon:    2446.7   1828.7   1993.6   1494.7
vp9_inv_dct_dct_16x16_sub12_add_neon:   2832.4   2118.3   2266.5   1735.1
vp9_inv_dct_dct_16x16_sub16_add_neon:   3211.7   2475.3   2523.5   1983.1
vp9_inv_dct_dct_32x32_sub1_add_neon:     756.2    456.7    862.0    553.9
vp9_inv_dct_dct_32x32_sub2_add_neon:   10682.2   8190.4   8539.2   6762.5
vp9_inv_dct_dct_32x32_sub4_add_neon:   10813.5   8014.9   8518.3   6762.8
vp9_inv_dct_dct_32x32_sub8_add_neon:   11859.6   9313.0   9347.4   7514.5
vp9_inv_dct_dct_32x32_sub12_add_neon:  12946.6  10752.4  10192.2   8280.2
vp9_inv_dct_dct_32x32_sub16_add_neon:  14074.6  11946.5  11001.4   9008.6
vp9_inv_dct_dct_32x32_sub20_add_neon:  15269.9  13662.7  11816.1   9762.6
vp9_inv_dct_dct_32x32_sub24_add_neon:  16327.9  14940.1  12626.7  10516.0
vp9_inv_dct_dct_32x32_sub28_add_neon:  17462.7  15776.1  13446.2  11264.7
vp9_inv_dct_dct_32x32_sub32_add_neon:  18575.5  17157.0  14249.3  12015.1

I.e. in general a very minor overhead for the full subpartition case due
to the additional loads and cmps, but a significant speedup for the cases
when we only need to process a small part of the actual input data.

In common VP9 content in a few inspected clips, 70-90% of the non-dc-only
16x16 and 32x32 IDCTs only have nonzero coefficients in the upper left
8x8 or 16x16 subpartitions respectively.
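
The idea of skipping empty slices in the first pass can be sketched in
plain C. This is only an illustration under stated assumptions, not the
NEON code from this commit: transform_1d is a made-up linear stand-in for
the real 1-D IDCT, and nonzero_rows is a hypothetical bound that a caller
would have to derive from eob. The names N, SLICE, transform_2d and
skip_matches_full are likewise invented for the example.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define N     16  /* block size */
#define SLICE 4   /* rows handled per first-pass slice */

/* Stand-in 1-D linear transform (NOT the VP9 IDCT). Any linear transform
 * maps an all-zero input to an all-zero output, which is all the skip
 * relies on. */
static void transform_1d(const int32_t *in, int in_stride,
                         int32_t *out, int out_stride)
{
    for (int k = 0; k < N; k++) {
        int32_t sum = 0;
        for (int n = 0; n < N; n++)
            sum += in[n * in_stride] * (((k * n + k + n) % 7) - 3);
        out[k * out_stride] = sum;
    }
}

/* nonzero_rows: hypothetical bound; rows at or beyond it hold only zeros. */
static void transform_2d(const int32_t *coef, int32_t *dst, int nonzero_rows)
{
    int32_t tmp[N * N];

    assert(nonzero_rows >= 0 && nonzero_rows <= N);
    memset(tmp, 0, sizeof(tmp));

    /* First pass: row transforms, one slice of SLICE rows at a time. */
    for (int y = 0; y < N; y += SLICE) {
        if (y >= nonzero_rows)
            break;  /* remaining slices are empty, tmp stays zero there */
        for (int r = y; r < y + SLICE; r++)
            transform_1d(coef + r * N, 1, tmp + r * N, 1);
    }

    /* Second pass: column transforms over the whole intermediate block. */
    for (int x = 0; x < N; x++)
        transform_1d(tmp + x, N, dst + x, N);
}

/* Usage check: coefficients confined to the upper-left 8x8 corner give the
 * same result whether or not the empty slices are skipped. */
static int skip_matches_full(void)
{
    int32_t coef[N * N] = { 0 }, full[N * N], skipped[N * N];

    for (int y = 0; y < 8; y++)
        for (int x = 0; x < 8; x++)
            coef[y * N + x] = (y * 8 + x) % 11 - 5;

    transform_2d(coef, full, N);     /* process every slice */
    transform_2d(coef, skipped, 8);  /* skip the two empty slices */
    return memcmp(full, skipped, sizeof(full)) == 0;
}
```

In this sketch the skip costs one comparison per slice, which mirrors the
small overhead reported above for the full subpartition case, while the
saving grows with the number of slices that can be left untouched.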

This is cherrypicked from libav commit
9c8bc74c2b.

Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2017-01-14 21:13:30 +01:00
Ronald S. Bultje
1c8fbd7b90 checkasm/vp9: benchmark all sub-IDCTs (but not WHT or ADST).
2016-12-27 10:02:33 -05:00
Diego Biurrun
3794062ab1 Remove Plan 9 support
Supporting the system was a nice joke for the 9 release, but it has
run its course. Nowadays Plan 9 receives no testing and has no
practical usefulness.
2016-12-03 09:15:01 +01:00
Martin Storsjö
9c8bc74c2b arm: vp9itxfm: Skip empty slices in the first pass of idct_idct 16x16 and 32x32
This work is sponsored by, and copyright, Google.

Previously all subpartitions except the eob=1 (DC) case ran with
the same runtime:

                                     Cortex A7       A8       A9      A53
vp9_inv_dct_dct_16x16_sub16_add_neon:   3188.1   2435.4   2499.0   1969.0
vp9_inv_dct_dct_32x32_sub32_add_neon:  18531.7  16582.3  14207.6  12000.3

By skipping individual 4x16 or 4x32 pixel slices in the first pass,
we reduce the runtime of these functions like this:

vp9_inv_dct_dct_16x16_sub1_add_neon:     274.6    189.5    211.7    235.8
vp9_inv_dct_dct_16x16_sub2_add_neon:    2064.0   1534.8   1719.4   1248.7
vp9_inv_dct_dct_16x16_sub4_add_neon:    2135.0   1477.2   1736.3   1249.5
vp9_inv_dct_dct_16x16_sub8_add_neon:    2446.7   1828.7   1993.6   1494.7
vp9_inv_dct_dct_16x16_sub12_add_neon:   2832.4   2118.3   2266.5   1735.1
vp9_inv_dct_dct_16x16_sub16_add_neon:   3211.7   2475.3   2523.5   1983.1
vp9_inv_dct_dct_32x32_sub1_add_neon:     756.2    456.7    862.0    553.9
vp9_inv_dct_dct_32x32_sub2_add_neon:   10682.2   8190.4   8539.2   6762.5
vp9_inv_dct_dct_32x32_sub4_add_neon:   10813.5   8014.9   8518.3   6762.8
vp9_inv_dct_dct_32x32_sub8_add_neon:   11859.6   9313.0   9347.4   7514.5
vp9_inv_dct_dct_32x32_sub12_add_neon:  12946.6  10752.4  10192.2   8280.2
vp9_inv_dct_dct_32x32_sub16_add_neon:  14074.6  11946.5  11001.4   9008.6
vp9_inv_dct_dct_32x32_sub20_add_neon:  15269.9  13662.7  11816.1   9762.6
vp9_inv_dct_dct_32x32_sub24_add_neon:  16327.9  14940.1  12626.7  10516.0
vp9_inv_dct_dct_32x32_sub28_add_neon:  17462.7  15776.1  13446.2  11264.7
vp9_inv_dct_dct_32x32_sub32_add_neon:  18575.5  17157.0  14249.3  12015.1

I.e. in general a very minor overhead for the full subpartition case due
to the additional loads and cmps, but a significant speedup for the cases
when we only need to process a small part of the actual input data.

In common VP9 content in a few inspected clips, 70-90% of the non-dc-only
16x16 and 32x32 IDCTs only have nonzero coefficients in the upper left
8x8 or 16x16 subpartitions respectively.

Signed-off-by: Martin Storsjö <martin@martin.st>
2016-11-30 23:54:07 +02:00
Ronald S. Bultje
06fec74cac checkasm: vp9dsp: benchmark all sub-IDCTs (but not WHT or ADST).
Signed-off-by: Martin Storsjö <martin@martin.st>
2016-11-23 23:55:38 +02:00
Martin Storsjö
effc1430b2 Revert "checkasm: vp9dsp: Benchmark the dc-only version of idct_idct separately"
This reverts commit 81d7f0bbca.

Instead of just benchmarking dc separately, test all relevant subparts
(in the next commit).

Signed-off-by: Martin Storsjö <martin@martin.st>
2016-11-23 23:55:26 +02:00
Hendrik Leppkes
286d8bae61 Merge commit '7b1ae0e73ab7f7c5eabc70dbe2e579127c6e154f'
* commit '7b1ae0e73ab7f7c5eabc70dbe2e579127c6e154f':
  checkasm/arm: preserve the stack alignment checkasm_checked_call

Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>
2016-11-17 15:21:32 +01:00
Hendrik Leppkes
c0af1ee90d Merge commit '80fbb7becae530167373fe5178966b7d7604306e'
* commit '80fbb7becae530167373fe5178966b7d7604306e':
  checkasm: vp8.mc: initialize the full src buffer after ec32574209

Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>
2016-11-17 15:20:10 +01:00