1
0
mirror of https://github.com/FFmpeg/FFmpeg.git synced 2024-12-28 20:53:54 +02:00
Commit Graph

409 Commits

Author SHA1 Message Date
Clément Bœsch
b12a36170b lavc/aacpsdsp: use ptrdiff_t for stride in hybrid_analysis 2017-06-28 12:22:39 +02:00
Clément Bœsch
edd041e64c checkasm: add AAC PS tests
This includes various fixes and improvements from James Almer.

Signed-off-by: James Almer <jamrial@gmail.com>
2017-06-28 12:22:39 +02:00
James Almer
fa50d9360b x86/vf_blend: add sse and ssse3 extremity functions
Reviewed-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: James Almer <jamrial@gmail.com>
2017-06-27 13:17:23 -03:00
James Almer
a579dbb4f7 checkasm: add missing checks to float_dsp's butterflies_float test 2017-06-23 23:38:07 -03:00
Matthieu Bouron
067e42b851 checkasm/aarch64: fix tests returning a float
Avoids overriding the v0 register (which containins the result of the
tested function) in checkasm_call_checked.
2017-06-22 09:18:10 +02:00
Diego Biurrun
fd502f4f5f build: Generalize yasm/nasm-related variable names
None of them are specific to the YASM assembler.

(Cherry-picked from libav commit 39e208f4d4)

Signed-off-by: James Almer <jamrial@gmail.com>
2017-06-21 17:00:29 -03:00
James Almer
5b10f484e2 checkasm: add float_dsp tests
Ported from libavutil/tests/float_dsp.c

Signed-off-by: James Almer <jamrial@gmail.com>
2017-06-14 19:20:10 -03:00
James Almer
37388b119c checkasm: add a checkasm_checked_call function that doesn't issue emms
Meant for DSP functions returning a float or double, as they'd fail if emms
is called after every run on x86_32.

Signed-off-by: James Almer <jamrial@gmail.com>
2017-06-14 19:18:56 -03:00
James Almer
93dc1c1221 checkasm: add _fixed suffix to fixed_dsp tests
Should prevents future conflicts with the similarly named floatdsp tests
2017-06-01 13:12:20 -03:00
Martin Storsjö
d05c9cde0e checkasm: aarch64: Specify alignment for the register_init const array
Loads from this strictly doesn't require alignment, but specify it
just for consistency with the arm version.

Signed-off-by: Martin Storsjö <martin@martin.st>
2017-05-15 10:19:46 +03:00
Martin Storsjö
e00db9f78b checkasm: hevc: Add a hevc_ prefix to the add_residual functions
This makes it easier to group them with the rest when running e.g.
--bench=hevc.

Signed-off-by: Martin Storsjö <martin@martin.st>
2017-04-21 13:32:44 +03:00
James Almer
7b3cb953f7 checkasm: add fixed_dsp tests
Tested-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: James Almer <jamrial@gmail.com>
2017-04-11 18:05:13 -03:00
Clément Bœsch
210678d3c5 Merge commit '3794062ab1a13442b06f6d76c54dce51ffa54697'
* commit '3794062ab1a13442b06f6d76c54dce51ffa54697':
  Remove Plan 9 support

Merged-by: Clément Bœsch <u@pkh.me>
2017-04-09 14:52:00 +02:00
James Almer
6747fc436e Merge commit 'effc1430b2fe5997d9d55bf28dc507c27125eb27'
* commit 'effc1430b2fe5997d9d55bf28dc507c27125eb27':
  Revert "checkasm: vp9dsp: Benchmark the dc-only version of idct_idct separately"

Merged-by: James Almer <jamrial@gmail.com>
2017-04-04 15:26:18 -03:00
Clément Bœsch
edfa7ac8ec Merge commit '81d7f0bbca837afda1f7e60d3ae52ab1360ab44b'
* commit '81d7f0bbca837afda1f7e60d3ae52ab1360ab44b':
  checkasm: vp9dsp: Benchmark the dc-only version of idct_idct separately

Merged-by: Clément Bœsch <u@pkh.me>
2017-04-01 11:54:29 +02:00
Clément Bœsch
b589e83f43 Merge commit '9498237049d15812cecb79df47b196c73013908b'
* commit '9498237049d15812cecb79df47b196c73013908b':
  checkasm: Add --test parameter to check only specific components

Merged-by: Clément Bœsch <cboesch@gopro.com>
2017-03-31 10:06:13 +02:00
Clément Bœsch
1c9f4b5078 lavc/vp9: split into vp9{block,data,mvs}
This is following Libav layout to ease merges.
2017-03-27 21:38:21 +02:00
James Almer
09ce5519f3 fate/checkasm: fix use of uninitialized memory on hevc_add_res tests 2017-03-24 22:11:34 -03:00
James Almer
36eae45510 fate/checkasm: use LOCAL_ALINGED_32 on hevc_add_res tests 2017-03-24 22:11:22 -03:00
Clément Bœsch
3d4039f964 Merge commit 'ed48a9d8143d2575a4458589cebde69ec326afd8'
* commit 'ed48a9d8143d2575a4458589cebde69ec326afd8':
  checkasm: Add a test for HEVC add_residual

Merged-by: Clément Bœsch <u@pkh.me>
2017-03-24 12:37:09 +01:00
James Almer
0d34473d8e Merge commit 'dd5d4a0e1e3a30a254d1a57ecbdcedf230c6014b'
* commit 'dd5d4a0e1e3a30a254d1a57ecbdcedf230c6014b':
  checkasm: aarch64: Don't clobber x29 in checkasm_stack_clobber

Merged-by: James Almer <jamrial@gmail.com>
2017-03-23 18:31:36 -03:00
James Almer
f23078904f Merge commit '2816f8a8bb33bd67fec5e94f5d357918caf4e055'
* commit '2816f8a8bb33bd67fec5e94f5d357918caf4e055':
  build: Drop arch-specific checkasm Makefiles

Merged-by: James Almer <jamrial@gmail.com>
2017-03-23 18:01:47 -03:00
James Almer
3ddae9eee9 Merge commit '93d5b022a9fd3a1a1f9c521a1eac7f0410e05b81'
* commit '93d5b022a9fd3a1a1f9c521a1eac7f0410e05b81':
  build: Drop duplicate asm recipe

Merged-by: James Almer <jamrial@gmail.com>
2017-03-23 17:57:35 -03:00
James Almer
67b639b496 Merge commit 'c91d6a33f872574c95c8784277cf60ffcf6bff4f'
* commit 'c91d6a33f872574c95c8784277cf60ffcf6bff4f':
  checkasm: aarch64: Add filler args to make sure all parameters are passed on the stack

Merged-by: James Almer <jamrial@gmail.com>
2017-03-23 17:38:20 -03:00
James Almer
a2d34cc51b Merge commit 'f1b3e131385176c3c9d9783b25047856a0dcebf6'
* commit 'f1b3e131385176c3c9d9783b25047856a0dcebf6':
  checkasm: aarch64: Clobber the stack before calling functions

Merged-by: James Almer <jamrial@gmail.com>
2017-03-23 17:36:53 -03:00
James Almer
cab4c7fa19 Merge commit 'a05cc56124b4f1237f6355784de821e3290ddb44'
* commit 'a05cc56124b4f1237f6355784de821e3290ddb44':
  checkasm: arm/aarch64: Fix the amount of space reserved for stack parameters

Merged-by: James Almer <jamrial@gmail.com>
2017-03-23 17:35:38 -03:00
Clément Bœsch
50bbb67472 Merge commit 'e3f941cb03b139b866a0ad6dc95fbe1b247d54af'
* commit 'e3f941cb03b139b866a0ad6dc95fbe1b247d54af':
  checkasm: add a test for HEVC IDCT

Merged-by: Clément Bœsch <u@pkh.me>
2017-03-23 12:17:39 +01:00
Diego Biurrun
dcc39ee10e lavc: Remove deprecated XvMC support hacks
Deprecated in 11/2013.
2017-03-23 10:09:14 +01:00
James Almer
30cadfe071 avcodec/lossless_videodsp: use ptrdiff_t for length parameters
Signed-off-by: James Almer <jamrial@gmail.com>
2017-03-22 18:38:35 -03:00
Clément Bœsch
7c2a7f9c11 Merge commit '22c3ab18646924ce24dc6017a9e882ff69689e40'
* commit '22c3ab18646924ce24dc6017a9e882ff69689e40':
  checkasm: Add test for huffyuvdsp add_bytes

huffyuvdsp is renamed to llviddsp to be consistent with our codebase.

Note: af607b7e07 wasn't actually required for this test since this
commit is not actually testing huffyuvdsp.

Merged-by: Clément Bœsch <u@pkh.me>
2017-03-22 16:31:38 +01:00
Clément Bœsch
83cd80d10a Merge commit '12004a9a7f20e44f4da2ee6c372d5e1794c8d6c5'
* commit '12004a9a7f20e44f4da2ee6c372d5e1794c8d6c5':
  audiodsp/x86: yasmify vector_clipf_sse
  audiodsp: reorder arguments for vector_clipf

Merged the version from Libav after a discussion with James Almer on
IRC:

19:22 <ubitux> jamrial: opinion on 12004a9a7f20e44f4da2ee6c372d5e1794c8d6c5?
19:23 <ubitux> it was apparently yasmified differently
19:23 <ubitux> (it depends on the previous commit arg shuffle)
19:24 <ubitux> i don't see the magic movsxdifnidn in your port btw
19:24 <ubitux> it's a port from 1d36defe94
19:25 <jamrial> seems better thanks to said arg shuffle
19:25 <jamrial> the loop is the same, but init is simpler
19:25 <jamrial> probably worth merging
19:25 <ubitux> OK
19:25 <ubitux> thanks
19:26 <jamrial> curious they didn't make len ptrdiff_t after the previous bunch of commits, heh
19:26 <ubitux> yeah indeed

Both commits are merged at the same time to prevent a conflict with our
existing yasmified ff_vector_clipf_sse.

Merged-by: Clément Bœsch <u@pkh.me>
2017-03-20 22:35:07 +01:00
Clément Bœsch
8414755486 Merge commit 'e9ef6171396dc4106526aaa86b620c61ca3d1017'
* commit 'e9ef6171396dc4106526aaa86b620c61ca3d1017':
  checkasm: add tests for audiodsp

Merged-by: Clément Bœsch <u@pkh.me>
2017-03-20 19:10:56 +01:00
Clément Bœsch
c50b2164a6 Merge commit '2eb97af66af90ca3978229da151f0b8b3a5d9370'
* commit '2eb97af66af90ca3978229da151f0b8b3a5d9370':
  checkasm: add a test for blockdsp

Merged-by: Clément Bœsch <u@pkh.me>
2017-03-20 19:05:05 +01:00
Clément Bœsch
e07fa3008b Merge commit 'de452e503734ebb0fdbce86e9d16693b3530fad3'
* commit 'de452e503734ebb0fdbce86e9d16693b3530fad3':
  pixblockdsp: Change type of stride parameters to ptrdiff_t

Merged-by: Clément Bœsch <u@pkh.me>
2017-03-20 15:58:32 +01:00
Clément Bœsch
3c8f7a8f6b Merge commit 'e89cef40506d990a982aefedfde7d3ca4f88c524'
* commit 'e89cef40506d990a982aefedfde7d3ca4f88c524':
  checkasm: Read the unsigned value as it should

Merged-by: Clément Bœsch <u@pkh.me>
2017-03-20 11:55:20 +01:00
James Almer
e5623aafd8 Merge commit '87c6c78604e4dd16f1f45862b27ca006da010527'
* commit '87c6c78604e4dd16f1f45862b27ca006da010527':
  vp8: Change type of stride parameters to ptrdiff_t

Merged-by: James Almer <jamrial@gmail.com>
2017-03-19 15:11:44 -03:00
Clément Bœsch
8b13492c9e Merge commit '40ad05bab206c932a32171d45581080c914b06ec'
* commit '40ad05bab206c932a32171d45581080c914b06ec':
  checkasm: Cast unsigned to signed

Merged-by: Clément Bœsch <cboesch@gopro.com>
2017-03-15 12:32:15 +01:00
Diego Biurrun
39e208f4d4 build: Generalize yasm/nasm-related variable names
None of them are specific to the YASM assembler.
2017-03-01 10:18:15 +01:00
Diego Biurrun
7cb1d9e2db build: Fine-grained link-time dependency settings
Previously, all link-time dependencies were added for all libraries,
resulting in bogus link-time dependencies since not all dependencies
are shared across libraries. Also, in some cases like libavutil, not
all dependencies were taken into account, resulting in some cases of
underlinking.

To address all this mess a machinery is added for tracking which
dependency belongs to which library component and then leveraged
to determine correct dependencies for all individual libraries.
2017-03-01 09:00:40 +01:00
Clément Bœsch
92cb9a3869 Merge commit '9064777dbb335ab4809ae09e3fdcc0245f925cdc'
* commit '9064777dbb335ab4809ae09e3fdcc0245f925cdc':
  checkasm: add HEVC test for testing IDCT DC

Merged-by: Clément Bœsch <cboesch@gopro.com>
2017-02-02 11:40:58 +01:00
Clément Bœsch
a0860b0a38 Merge commit '6f9e34baea4f6f484392e4e67f606a0835d07b73'
* commit '6f9e34baea4f6f484392e4e67f606a0835d07b73':
  arm: Check for support for the .fpu directive

Merged-by: Clément Bœsch <cboesch@gopro.com>
2017-02-02 11:22:04 +01:00
Clément Bœsch
9f1c81e5ec Merge commit '71a0472114574993df7035f4de9aa007e03817b8'
* commit '71a0472114574993df7035f4de9aa007e03817b8':
  checkasm: arm: report the first clobbered register in checkasm_checked_call

Also includes 446353ea18, 59aeed93e4, and 37961044c6 to avoid breaking
too much stuff.

Merged-by: Clément Bœsch <u@pkh.me>
2017-01-24 19:21:29 +01:00
Martin Storsjö
388f6e6715 arm: vp9itxfm: Skip empty slices in the first pass of idct_idct 16x16 and 32x32
This work is sponsored by, and copyright, Google.

Previously all subpartitions except the eob=1 (DC) case ran with
the same runtime:

                                     Cortex A7       A8       A9      A53
vp9_inv_dct_dct_16x16_sub16_add_neon:   3188.1   2435.4   2499.0   1969.0
vp9_inv_dct_dct_32x32_sub32_add_neon:  18531.7  16582.3  14207.6  12000.3

By skipping individual 4x16 or 4x32 pixel slices in the first pass,
we reduce the runtime of these functions like this:

vp9_inv_dct_dct_16x16_sub1_add_neon:     274.6    189.5    211.7    235.8
vp9_inv_dct_dct_16x16_sub2_add_neon:    2064.0   1534.8   1719.4   1248.7
vp9_inv_dct_dct_16x16_sub4_add_neon:    2135.0   1477.2   1736.3   1249.5
vp9_inv_dct_dct_16x16_sub8_add_neon:    2446.7   1828.7   1993.6   1494.7
vp9_inv_dct_dct_16x16_sub12_add_neon:   2832.4   2118.3   2266.5   1735.1
vp9_inv_dct_dct_16x16_sub16_add_neon:   3211.7   2475.3   2523.5   1983.1
vp9_inv_dct_dct_32x32_sub1_add_neon:     756.2    456.7    862.0    553.9
vp9_inv_dct_dct_32x32_sub2_add_neon:   10682.2   8190.4   8539.2   6762.5
vp9_inv_dct_dct_32x32_sub4_add_neon:   10813.5   8014.9   8518.3   6762.8
vp9_inv_dct_dct_32x32_sub8_add_neon:   11859.6   9313.0   9347.4   7514.5
vp9_inv_dct_dct_32x32_sub12_add_neon:  12946.6  10752.4  10192.2   8280.2
vp9_inv_dct_dct_32x32_sub16_add_neon:  14074.6  11946.5  11001.4   9008.6
vp9_inv_dct_dct_32x32_sub20_add_neon:  15269.9  13662.7  11816.1   9762.6
vp9_inv_dct_dct_32x32_sub24_add_neon:  16327.9  14940.1  12626.7  10516.0
vp9_inv_dct_dct_32x32_sub28_add_neon:  17462.7  15776.1  13446.2  11264.7
vp9_inv_dct_dct_32x32_sub32_add_neon:  18575.5  17157.0  14249.3  12015.1

I.e. in general a very minor overhead for the full subpartition case due
to the additional loads and cmps, but a significant speedup for the cases
when we only need to process a small part of the actual input data.

In common VP9 content in a few inspected clips, 70-90% of the non-dc-only
16x16 and 32x32 IDCTs only have nonzero coefficients in the upper left
8x8 or 16x16 subpartitions respectively.

This is cherrypicked from libav commit
9c8bc74c2b.

Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2017-01-14 21:13:30 +01:00
Ronald S. Bultje
1c8fbd7b90 checkasm/vp9: benchmark all sub-IDCTs (but not WHT or ADST). 2016-12-27 10:02:33 -05:00
Diego Biurrun
3794062ab1 Remove Plan 9 support
Supporting the system was a nice joke for the 9 release, but it has
run its course. Nowadays Plan 9 receives no testing and has no
practical usefulness.
2016-12-03 09:15:01 +01:00
Martin Storsjö
9c8bc74c2b arm: vp9itxfm: Skip empty slices in the first pass of idct_idct 16x16 and 32x32
This work is sponsored by, and copyright, Google.

Previously all subpartitions except the eob=1 (DC) case ran with
the same runtime:

                                     Cortex A7       A8       A9      A53
vp9_inv_dct_dct_16x16_sub16_add_neon:   3188.1   2435.4   2499.0   1969.0
vp9_inv_dct_dct_32x32_sub32_add_neon:  18531.7  16582.3  14207.6  12000.3

By skipping individual 4x16 or 4x32 pixel slices in the first pass,
we reduce the runtime of these functions like this:

vp9_inv_dct_dct_16x16_sub1_add_neon:     274.6    189.5    211.7    235.8
vp9_inv_dct_dct_16x16_sub2_add_neon:    2064.0   1534.8   1719.4   1248.7
vp9_inv_dct_dct_16x16_sub4_add_neon:    2135.0   1477.2   1736.3   1249.5
vp9_inv_dct_dct_16x16_sub8_add_neon:    2446.7   1828.7   1993.6   1494.7
vp9_inv_dct_dct_16x16_sub12_add_neon:   2832.4   2118.3   2266.5   1735.1
vp9_inv_dct_dct_16x16_sub16_add_neon:   3211.7   2475.3   2523.5   1983.1
vp9_inv_dct_dct_32x32_sub1_add_neon:     756.2    456.7    862.0    553.9
vp9_inv_dct_dct_32x32_sub2_add_neon:   10682.2   8190.4   8539.2   6762.5
vp9_inv_dct_dct_32x32_sub4_add_neon:   10813.5   8014.9   8518.3   6762.8
vp9_inv_dct_dct_32x32_sub8_add_neon:   11859.6   9313.0   9347.4   7514.5
vp9_inv_dct_dct_32x32_sub12_add_neon:  12946.6  10752.4  10192.2   8280.2
vp9_inv_dct_dct_32x32_sub16_add_neon:  14074.6  11946.5  11001.4   9008.6
vp9_inv_dct_dct_32x32_sub20_add_neon:  15269.9  13662.7  11816.1   9762.6
vp9_inv_dct_dct_32x32_sub24_add_neon:  16327.9  14940.1  12626.7  10516.0
vp9_inv_dct_dct_32x32_sub28_add_neon:  17462.7  15776.1  13446.2  11264.7
vp9_inv_dct_dct_32x32_sub32_add_neon:  18575.5  17157.0  14249.3  12015.1

I.e. in general a very minor overhead for the full subpartition case due
to the additional loads and cmps, but a significant speedup for the cases
when we only need to process a small part of the actual input data.

In common VP9 content in a few inspected clips, 70-90% of the non-dc-only
16x16 and 32x32 IDCTs only have nonzero coefficients in the upper left
8x8 or 16x16 subpartitions respectively.

Signed-off-by: Martin Storsjö <martin@martin.st>
2016-11-30 23:54:07 +02:00
Ronald S. Bultje
06fec74cac checkasm: vp9dsp: benchmark all sub-IDCTs (but not WHT or ADST).
Signed-off-by: Martin Storsjö <martin@martin.st>
2016-11-23 23:55:38 +02:00
Martin Storsjö
effc1430b2 Revert "checkasm: vp9dsp: Benchmark the dc-only version of idct_idct separately"
This reverts commit 81d7f0bbca.

Instead of just benchmarking dc separately, test all relevant subparts
(in the next commit).

Signed-off-by: Martin Storsjö <martin@martin.st>
2016-11-23 23:55:26 +02:00
Hendrik Leppkes
286d8bae61 Merge commit '7b1ae0e73ab7f7c5eabc70dbe2e579127c6e154f'
* commit '7b1ae0e73ab7f7c5eabc70dbe2e579127c6e154f':
  checkasm/arm: preserve the stack alignment checkasm_checked_call

Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>
2016-11-17 15:21:32 +01:00
Hendrik Leppkes
c0af1ee90d Merge commit '80fbb7becae530167373fe5178966b7d7604306e'
* commit '80fbb7becae530167373fe5178966b7d7604306e':
  checkasm: vp8.mc: initialize the full src buffer after ec32574209

Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>
2016-11-17 15:20:10 +01:00
Hendrik Leppkes
90b72f6bda Merge commit '8c816c0c9b12fdefd9046415e97df299880bc9b8'
* commit '8c816c0c9b12fdefd9046415e97df299880bc9b8':
  checkasm/arm: align the clobber check data properly for ldrd

Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>
2016-11-17 15:06:10 +01:00
Hendrik Leppkes
4fe013fc70 Merge commit 'ec32574209f36467ef0d22c21a7e811ba98c15b6'
* commit 'ec32574209f36467ef0d22c21a7e811ba98c15b6':
  checkasm: vp8: mc: test unequal width/height for partitions

Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>
2016-11-17 15:05:25 +01:00
Martin Storsjö
81d7f0bbca checkasm: vp9dsp: Benchmark the dc-only version of idct_idct separately
The dc-only mode is already checked to work correctly above, but this
allows benchmarking this mode for performance tuning, and allows making
sure that it actually is correctly hooked up.

Signed-off-by: Martin Storsjö <martin@martin.st>
2016-11-16 10:06:32 +02:00
Hendrik Leppkes
47f75839e4 Merge commit 'f8d17d53957056c053a46f9320fa7ae6fe1479a5'
* commit 'f8d17d53957056c053a46f9320fa7ae6fe1479a5':
  checkasm: Add tests for vp8dsp

Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>
2016-11-14 15:29:08 +01:00
Hendrik Leppkes
f75035b06f Merge commit 'e48746deec48e9ff195841bc3266b4e153a878cd'
* commit 'e48746deec48e9ff195841bc3266b4e153a878cd':
  checkasm: h264dsp: Move the x and y variables into the randomize_buffer macro

Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>
2016-11-13 23:02:39 +01:00
Ronald S. Bultje
0b37cd09a6 checkasm: add vp9dsp.itxfm_add tests.
This includes fixes by Henrik Gramner.

The forward transforms are derived from the reference encoder.

Signed-off-by: Martin Storsjö <martin@martin.st>
2016-11-11 11:09:05 +02:00
Diego Biurrun
9498237049 checkasm: Add --test parameter to check only specific components
Inspired by a patch from Martin Storsjö <martin@martin.st>.
2016-11-08 17:32:25 +01:00
Martin Storsjö
2e55e26b40 vp9: Flip the order of arguments in MC functions
This makes it match the pattern already used for VP8 MC functions.

This also makes the signature match ffmpeg's version of these
functions, easing porting of code in both directions.

Signed-off-by: Martin Storsjö <martin@martin.st>
2016-11-03 09:12:02 +02:00
Alexandra Hájková
ed48a9d814 checkasm: Add a test for HEVC add_residual 2016-10-22 17:33:35 +02:00
Martin Storsjö
dd5d4a0e1e checkasm: aarch64: Don't clobber x29 in checkasm_stack_clobber
x29 (FP) is a callee saved register and should be restored on
return. Instead of backing up x29 and restoring it here, back up
sp in a register that we are allowed to overwrite.

This fixes crashes in checkasm on aarch64 since f1b3e13138.
For some reason, gcc builds didn't crash, but clang builds do.

Signed-off-by: Martin Storsjö <martin@martin.st>
2016-10-18 16:17:12 +03:00
Diego Biurrun
2816f8a8bb build: Drop arch-specific checkasm Makefiles
They only contain one line and will never contain more.
2016-10-17 16:25:38 +02:00
Diego Biurrun
93d5b022a9 build: Drop duplicate asm recipe
And move the asm recipe to the top-level Makefile next to the other
local pattern rules for .o files.
2016-10-17 16:25:35 +02:00
Martin Storsjö
c91d6a33f8 checkasm: aarch64: Add filler args to make sure all parameters are passed on the stack
This, combined with clobbering the stack space prior to the call,
increases the chances of finding cases where 32 bit parameters
are erroneously treated as 64 bit.

Signed-off-by: Martin Storsjö <martin@martin.st>
2016-10-16 23:26:33 +03:00
Martin Storsjö
f1b3e13138 checkasm: aarch64: Clobber the stack before calling functions
Signed-off-by: Martin Storsjö <martin@martin.st>
2016-10-16 23:26:22 +03:00
Martin Storsjö
a05cc56124 checkasm: arm/aarch64: Fix the amount of space reserved for stack parameters
Even if MAX_ARGS - 2 (for arm) or MAX_ARGS - 7 (for aarch64) parameters
are passed on the stack to checkasm_checked_call, we actually only
need to store MAX_ARGS - 4 (for arm) or MAX_ARGS - 8 (for aarch64)
parameters on the stack when calling the tested function.

Signed-off-by: Martin Storsjö <martin@martin.st>
2016-10-16 23:26:15 +03:00
Alexandra Hájková
e3f941cb03 checkasm: add a test for HEVC IDCT
Signed-off-by: Anton Khirnov <anton@khirnov.net>
2016-10-11 18:15:40 +02:00
Hendrik Leppkes
6fc74934de Merge commit 'dc7501e524dc3270335749302c7aa449973625f3'
* commit 'dc7501e524dc3270335749302c7aa449973625f3':
  checkasm: Issue emms after benchmarking functions

Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>
2016-10-07 13:18:05 +02:00
Ronald S. Bultje
c935b54bd6 checkasm: add VP9 loopfilter tests.
The randomize_buffer() implementation assures that "most of the time",
we'll do a good mix of wide16/wide8/hev/regular/no filters for complete
code coverage. However, this is not mathematically assured because that
would make the code either much more complex, or much less random.

Some fixes and improvements by Rodger Combs <rodger.combs@gmail.com>

Signed-off-by: Anton Khirnov <anton@khirnov.net>
2016-10-04 10:54:07 +02:00
Alexandra Hájková
22c3ab1864 checkasm: Add test for huffyuvdsp add_bytes
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
2016-10-02 17:13:26 +02:00
Diego Biurrun
ba479f3daa hevc: Change type of array stride parameters to ptrdiff_t
ptrdiff_t is the correct type for array strides and similar.
2016-09-29 17:54:23 +02:00
Anton Khirnov
e9ef617139 checkasm: add tests for audiodsp 2016-09-22 09:47:52 +02:00
Anton Khirnov
2eb97af66a checkasm: add a test for blockdsp 2016-09-22 09:47:52 +02:00
Anton Khirnov
683da86aab audiodsp: reorder arguments for vector_clipf
This will make the x86 asm simpler.

ARM conversion by Martin Storsjö <martin@martin.st> and Janne Grunau
<janne-libav@jannau.net>
2016-09-22 09:47:52 +02:00
Luca Barbato
e89cef4050 checkasm: Read the unsigned value as it should
Reading a value larger than int using atoi() may give the wrong result.
2016-09-11 14:12:18 +02:00
Diego Biurrun
87c6c78604 vp8: Change type of stride parameters to ptrdiff_t
ptrdiff_t is the correct type for array strides and similar.
2016-08-26 11:36:53 +02:00
Martin Storsjö
2e95054ebb checkasm: h264dsp: Initialize the padding area
This fixes valgrind warnings about conditional jumps based on
uninitialized data (even though the uninitialized data only ever
was compared with a direct copy of the same uninitialized data).

Signed-off-by: Martin Storsjö <martin@martin.st>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2016-08-11 19:55:16 +02:00
Ronald S. Bultje
e99ecda550 checkasm: add vp9 MC tests.
Signed-off-by: Anton Khirnov <anton@khirnov.net>
2016-08-03 11:07:01 +02:00
James Almer
54a0a52be1 checkasm/vp9dsp: use declare_func_emms in check_loopfilter
Fixes checkasm failures on mmxext functions

Signed-off-by: James Almer <jamrial@gmail.com>
2016-07-26 22:16:21 -03:00
Luca Barbato
40ad05bab2 checkasm: Cast unsigned to signed
Avoid a warning for passing an unsigned value to abs(), some compilers
might optimize away abs().
2016-07-23 08:27:32 +02:00
Alexandra Hájková
9064777dbb checkasm: add HEVC test for testing IDCT DC
Signed-off-by: Anton Khirnov <anton@khirnov.net>
2016-07-22 19:08:12 +02:00
Martin Storsjö
6f9e34baea arm: Check for support for the .fpu directive
When targeting COFF (windows), clang doesn't support this
directive (while binutils supports it for all targets).

Signed-off-by: Martin Storsjö <martin@martin.st>
2016-07-21 12:52:10 +03:00
Martin Storsjö
37961044c6 checkasm: arm: Ignore changes to bits 0-4 and 7 of FPSCR
These bits are set by exceptions in NEON instructions.

Also print the differing bits when FPSCR is clobbered,
and use bic instead of lsl, for clearing the topmost bits.

Signed-off-by: Martin Storsjö <martin@martin.st>
2016-07-17 21:48:17 +03:00
Janne Grunau
59aeed93e4 cheackasm/arm: remove NEON instructions from checkasm_checked_call_vfp
Fixes AS error on non NEON builds introduced in 71a0472114. Also
set the fpu directly to vfp in checkasm.S to cause build errors on NEON
builds.
2016-07-17 11:28:21 +02:00
Martin Storsjö
446353ea18 checkasm: arm: Don't start new const blocks for each string
Each const block needs to be terminated by one endconst
invocation so either call endconst after each, or just
declare plain labels to the later strings.

This fixes errors such as this, on some binutils versions:

checkasm.S:38: Error: Macro `endconst' was already defined

Signed-off-by: Martin Storsjö <martin@martin.st>
2016-07-17 12:21:19 +03:00
Janne Grunau
71a0472114 checkasm: arm: report the first clobbered register in checkasm_checked_call 2016-07-16 12:57:18 +02:00
Janne Grunau
7b1ae0e73a checkasm/arm: preserve the stack alignment checkasm_checked_call
The stack used by checkasm_checked_call_vfp was a multiple of 4 when the
checked function is called. AAPCS requires a double word (8 byte)
aligned stack public interfaces. Since both calls are public interfaces
the stack is misaligned when the checked is called.

Might fix the SIGBUS error in the armv7-linux-clang-3.7 fate config.
2016-07-13 22:18:53 +02:00
Janne Grunau
80fbb7beca checkasm: vp8.mc: initialize the full src buffer after ec32574209
Fixes "Use of uninitialised value" valgrind warnings in checkasm.
2016-07-13 22:18:52 +02:00
Matthieu Bouron
a91c330a29 Merge commit '105998fb5ca3c343f5c8cb39ce3197f87a5e4d36'
* commit '105998fb5ca3c343f5c8cb39ce3197f87a5e4d36':
  checkasm: Add tests for h264 idct

Merged-by: Matthieu Bouron <matthieu.bouron@stupeflix.com>
2016-07-13 17:22:29 +02:00
Matthieu Bouron
495a40cecb tests/checkasm: reduce cosmetic diff with libav
Chunk was not merged in ca5ec2bf51.
2016-07-13 17:11:58 +02:00
Janne Grunau
8c816c0c9b checkasm/arm: align the clobber check data properly for ldrd
Should fix the SIGBUS in the armv7-linux-clang-3.7 fate target.
2016-07-10 13:35:41 +02:00
Janne Grunau
ec32574209 checkasm: vp8: mc: test unequal width/height for partitions 2016-07-10 13:35:41 +02:00
Martin Storsjö
f8d17d5395 checkasm: Add tests for vp8dsp
The tests are inspired by similar tests for vp9 by
Ronald Bultje.

Signed-off-by: Martin Storsjö <martin@martin.st>
2016-07-08 14:10:46 +03:00
Michael Niedermayer
fb6b6b5166 tests/checkasm/pixblockdsp: Test 8 byte aligned positions
The code is documented as to require 8byte alignment

Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2016-07-02 22:21:53 +02:00
Martin Storsjö
67cb2c0f73 checkasm: hevc: Iterate over features first, then over bitdepths
This avoids listing the same feature multiple times in the
test output. Previously the output contained something like this:

SSE2:
 - hevc_mc.qpel              [OK]
 - hevc_mc.epel              [OK]
 - hevc_mc.unweighted_pred   [OK]
 - hevc_mc.qpel              [OK]
 - hevc_mc.epel              [OK]
 - hevc_mc.unweighted_pred   [OK]

Signed-off-by: Martin Storsjö <martin@martin.st>
2016-06-29 21:12:05 +03:00
Martin Storsjö
e48746deec checkasm: h264dsp: Move the x and y variables into the randomize_buffer macro
This avoids the risk of accidentally clobbering such variables outside
of the macro if the same variables are used there.

Signed-off-by: Martin Storsjö <martin@martin.st>
2016-06-28 14:24:04 +03:00
Martin Storsjö
e57de6faa1 checkasm: h264dsp: Initialize the padding area
This fixes valgrind warnings about conditional jumps based on
uninitialized data (even though the uninitialized data only ever
was compared with a direct copy of the same uninitialized data).

Signed-off-by: Martin Storsjö <martin@martin.st>
2016-06-28 14:24:01 +03:00
Clément Bœsch
5558ff3a9f Merge commit '257f00ec1ab06a2a161f535036c6512f3fc8e801'
* commit '257f00ec1ab06a2a161f535036c6512f3fc8e801':
  Split global .gitignore file into per-directory files

Merged-by: Clément Bœsch <clement@stupeflix.com>
2016-06-22 11:28:51 +02:00
Clément Bœsch
8ef57a0d61 Merge commit '41ed7ab45fc693f7d7fc35664c0233f4c32d69bb'
* commit '41ed7ab45fc693f7d7fc35664c0233f4c32d69bb':
  cosmetics: Fix spelling mistakes

Merged-by: Clément Bœsch <u@pkh.me>
2016-06-21 21:55:34 +02:00
Martin Storsjö
dc7501e524 checkasm: Issue emms after benchmarking functions
The functions may not clean up properly after using MMX
registers. For the normal testing calls, the checkasm_checked_call
functions will do the cleanup (and check that functions that
should clean up do it as well), but when benchmarking functions
that don't clean up, we don't currently properly clean up at all.

This causes issues if a benchmarked function is followed by testing
of a function that is supposed to not clobber the MMX/FPU state but
doesn't touch it at all.

Signed-off-by: Martin Storsjö <martin@martin.st>
2016-06-21 22:09:29 +03:00
Martin Storsjö
105998fb5c checkasm: Add tests for h264 idct
The tests are inspired by similar tests for vp9 by
Ronald Bultje.

Signed-off-by: Martin Storsjö <martin@martin.st>
2016-06-17 21:37:56 +03:00
Diego Biurrun
257f00ec1a Split global .gitignore file into per-directory files 2016-05-13 14:55:56 +02:00
Derek Buitenhuis
ca5ec2bf51 Merge commit '01621202aad7e27b2a05c71d9ad7a19dfcbe17ec'
* commit '01621202aad7e27b2a05c71d9ad7a19dfcbe17ec':
  build: miscellaneous cosmetics

Merged-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
2016-05-09 16:25:28 +01:00
Vittorio Giovara
41ed7ab45f cosmetics: Fix spelling mistakes
Signed-off-by: Diego Biurrun <diego@biurrun.de>
2016-05-04 18:16:21 +02:00
Michael Niedermayer
3c0511f29e tests/checkasm/vf_colorspace: Make bpp_mask const
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2016-04-13 22:39:41 +02:00
Michael Niedermayer
4d59d075a9 tests/checkasm/vf_colorspace: Fix dst array sizes
Suggested & Approved by: BBB
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2016-04-12 23:46:52 +02:00
Ronald S. Bultje
5ce703a6bf vf_colorspace: x86-64 SIMD (SSE2) optimizations. 2016-04-12 16:42:48 -04:00
Diego Biurrun
01621202aa build: miscellaneous cosmetics
Restore alphabetical order in lists, break overly long lines, do some
prettyprinting, add some explanatory section comments, group parts
together that belong together logically.
2016-04-07 15:26:08 +02:00
Derek Buitenhuis
ca408cf557 Merge commit '7c82d31cbe9fc5d5a321ad49c14a472bd629b50f'
* commit '7c82d31cbe9fc5d5a321ad49c14a472bd629b50f':
  checkasm: Use standard multiple inclusion guards

Merged-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
2016-02-24 17:36:52 +00:00
James Almer
26034929d5 checkasm: bench each vf_blend mode once
Also bench a smaller buffer. This drastically reduces --bench runtime
and reports smaller, more readable numbers.

Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: James Almer <jamrial@gmail.com>
2016-02-22 13:54:07 -03:00
James Almer
76af0c7877 checkasm: fix dependencies for vf_blend tests
They will now compile if avcodec is disabled

Reviewed-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: James Almer <jamrial@gmail.com>
2016-02-19 16:31:55 -03:00
Diego Biurrun
7c82d31cbe checkasm: Use standard multiple inclusion guards 2016-02-18 15:35:44 +01:00
Timothy Gu
ebf648d490 checkasm/vf_blend: Decrease iteration count
The test is already slow.
2016-02-14 10:48:24 -08:00
Timothy Gu
a953a2991e checkasm: Add vf_blend tests 2016-02-14 10:46:56 -08:00
Timothy Gu
180f9a0958 all: Make header guard names consistent 2016-01-31 15:44:11 -08:00
foo86
ae5b2c5250 avcodec/dca: add new decoder based on libdcadec 2016-01-31 17:09:38 +01:00
foo86
4608996772 avcodec/dca: remove old decoder
Remove all files and functions which are not going to be reused,
and disable all functions and FATE tests temporarily which will be.
2016-01-31 17:09:38 +01:00
Geza Lore
cc602061ee x86inc: Add debug symbols indicating sizes of compiled functions
Some debuggers/profilers use this metadata to determine which function a
given instruction is in; without it they get can confused by local labels
(if you haven't stripped those). On the other hand, some tools are still
confused even with this metadata. e.g. this fixes `gdb`, but not `perf`.

Currently only implemented for ELF.

Signed-off-by: Anton Khirnov <anton@khirnov.net>
2016-01-23 20:46:28 +01:00
Geza Lore
d39c229e54 x86inc: Add debug symbols indicating sizes of compiled functions
Some debuggers/profilers use this metadata to determine which function a
given instruction is in; without it they get can confused by local labels
(if you haven't stripped those). On the other hand, some tools are still
confused even with this metadata. e.g. this fixes `gdb`, but not `perf`.

Currently only implemented for ELF.
2016-01-21 23:19:46 +01:00
Ronald S. Bultje
8c9103c4af checkasm: add videodsp emulated_edge_mc test. 2016-01-21 10:25:27 -05:00
Hendrik Leppkes
7e29903526 Merge commit 'fec76cd430f3c865183a6e5b4caec0743e055605'
* commit 'fec76cd430f3c865183a6e5b4caec0743e055605':
  checkasm: Check register clobbering on aarch64

Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>
2016-01-19 08:50:44 +01:00
Hendrik Leppkes
0b40e290e3 Merge commit '26ec75aec3576daea691dee53a78ec67c0dc4040'
* commit '26ec75aec3576daea691dee53a78ec67c0dc4040':
  checkasm: Check register clobbering on arm

Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>
2016-01-19 08:49:27 +01:00
Martin Storsjö
fec76cd430 checkasm: Check register clobbering on aarch64
This is disabled on iOS, since iOS uses a slightly different ABI
for vararg parameters.

Signed-off-by: Martin Storsjö <martin@martin.st>
2016-01-07 09:33:24 +02:00
Martin Storsjö
26ec75aec3 checkasm: Check register clobbering on arm
Use two separate functions, depending on whether VFP/NEON is available.

This is set to require armv5te - it uses blx, which is only available
since armv5t, but we don't have a separate configure item for that.
(It also uses ldrd, which requires armv5te, but this could be avoided
if necessary.)

Signed-off-by: Martin Storsjö <martin@martin.st>
2016-01-07 09:33:24 +02:00
Hendrik Leppkes
0eefc758e2 Merge commit 'f0f54117c8f206e8045d301c2eb975b26e9f263d'
* commit 'f0f54117c8f206e8045d301c2eb975b26e9f263d':
  checkasm: x86: post commit review fixes

Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>
2016-01-02 13:26:28 +01:00
Hendrik Leppkes
d03da3e240 Merge commit '2008f76054906e9ff6bf744800af0e5a5bfe61be'
* commit '2008f76054906e9ff6bf744800af0e5a5bfe61be':
  dca: remove unused decode_hf function and quant_d tables

Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>
2016-01-02 13:17:48 +01:00
Hendrik Leppkes
f299d8d9f2 Merge commit '489e6add4478b0f5717dbf644234c6f3a3baf02c'
* commit '489e6add4478b0f5717dbf644234c6f3a3baf02c':
  checkasm: add fmtconvert tests

Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>
2016-01-02 12:50:13 +01:00
Hendrik Leppkes
eb50a3d440 Merge commit '568a4323fbde03665b2b23a98068d02b39121812'
* commit '568a4323fbde03665b2b23a98068d02b39121812':
  checkasm: add synth_filter test

Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>
2016-01-02 12:45:34 +01:00
Hendrik Leppkes
d882c0b9f9 Merge commit 'e71b747e9dc56cb84f8a06ec8214d5f3bd98bb6d'
* commit 'e71b747e9dc56cb84f8a06ec8214d5f3bd98bb6d':
  checkasm: add tests for dcadsp

Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>
2016-01-02 12:38:46 +01:00
Hendrik Leppkes
0c7ade547a Merge commit '9d218d573f8088c606d873e80df572582e6773ef'
* commit '9d218d573f8088c606d873e80df572582e6773ef':
  checkasm: add float comparison util functions

Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>
2016-01-02 12:35:52 +01:00
Hendrik Leppkes
69ead86027 Merge commit '711781d7a1714ea4eb0217eb1ba04811978c43d1'
* commit '711781d7a1714ea4eb0217eb1ba04811978c43d1':
  x86: checkasm: check for or handle missing cleanup after MMX instructions

Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>
2016-01-02 11:55:44 +01:00
Hendrik Leppkes
e754c8e8ca Merge commit 'e2710e790c09e49e86baa58c6063af0097cc8cb0'
* commit 'e2710e790c09e49e86baa58c6063af0097cc8cb0':
  arm: add a cpu flag for the VFPv2 vector mode

Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>
2016-01-02 11:01:29 +01:00
Janne Grunau
f0f54117c8 checkasm: x86: post commit review fixes
Check the full FPU tag word instead of only the lower half and simplify
the comparison.
Use upper-case function base name as macro name to instantiate both
checked_call variants.
2015-12-29 12:50:38 +01:00
Alexandra Hájková
2008f76054 dca: remove unused decode_hf function and quant_d tables
They were superseded with their integer equivalents. Rename integer
decode_hf to decode_hf.
2015-12-24 13:58:18 +01:00
Janne Grunau
489e6add44 checkasm: add fmtconvert tests 2015-12-21 18:58:46 +01:00
Janne Grunau
568a4323fb checkasm: add synth_filter test 2015-12-21 17:40:18 +01:00
Janne Grunau
e71b747e9d checkasm: add tests for dcadsp 2015-12-21 17:40:18 +01:00
Janne Grunau
9d218d573f checkasm: add float comparison util functions 2015-12-21 17:40:18 +01:00
Janne Grunau
711781d7a1 x86: checkasm: check for or handle missing cleanup after MMX instructions
Not every asm routine is expected clear the MMX state after returning.
It is however a requisite for testing floating point code in checkasm.
Annotate functions requiring cleanup with declare_func_emms() and issue
emms after the call. The remaining functions are checked for having  a
cleared MMX state after return.
2015-12-21 17:40:18 +01:00
Janne Grunau
e2710e790c arm: add a cpu flag for the VFPv2 vector mode
The vector mode was deprecated in ARMv7-A/VFPv3 and various cpu
implementations do not support it in hardware. Vector mode code will
depending the OS either be emulated in software or result in an illegal
instruction on cpus which does not support it. This was not really
problem in practice since NEON implementations of the same functions are
preferred. It will however become a problem for checkasm which tests
every cpu flag separately.

Since this is a cpu feature newer cpu do not support anymore the
behaviour of this flag differs from the other flags. It can be only
activated by runtime cpu feature selection.
2015-12-14 16:42:35 +01:00
Anton Khirnov
0cef06df07 checkasm: add HEVC MC tests 2015-12-05 21:11:21 +01:00
Timothy Gu
3d20f8e7c0 Add pixblockdsp checkasm tests 2015-11-07 18:46:55 -08:00
Rodger Combs
1e477a970f lavu: add AESNI CPU flag 2015-10-28 04:23:14 -05:00
Ronald S. Bultje
eb4b5ff738 vp9: add itxfm_add eob shortcuts to 10/12bpp functions.
These aren't quite as helpful as the ones in 8bpp, since over there,
we can use pmulhrsw, but here the coefficients have too many bits to
be able to take advantage of pmulhrsw. However, we can still skip
cols for which all coefs are 0, and instead just zero the input data
for the row itx. This helps a few % on overall decoding speed.
2015-10-13 11:06:01 -04:00
James Almer
285e41c34c checkasm: add alacdsp tests
Signed-off-by: James Almer <jamrial@gmail.com>
2015-10-06 20:25:49 -03:00
Henrik Gramner
ec85153f25 checkasm: Fix compilation with --disable-avcodec 2015-10-04 15:35:16 +02:00
Henrik Gramner
99982524f9 checkasm: Remove use of deprecated av_set_cpu_flags_mask() 2015-10-03 15:08:24 +02:00
Henrik Gramner
8bb376cf6b checkasm: Fix the function name sorting algorithm
The previous implementation was behaving incorrectly in some corner cases.

Signed-off-by: Anton Khirnov <anton@khirnov.net>
2015-10-03 13:38:03 +02:00
Henrik Gramner
69e456d7fb checkasm/vp9dsp: Fix iszero() to read the correct data 2015-09-28 18:50:13 +02:00
Ronald S. Bultje
0b227c6d47 checkasm: add vp9dsp.itxfm_add tests. 2015-09-28 10:51:53 -04:00
Henrik Gramner
19b28d047d checkasm: Fix the function name sorting algorithm
The previous implementation was behaving incorrectly in some corner cases.
2015-09-28 16:38:23 +02:00
Henrik Gramner
cc28552100 checkasm/x86: Correctly handle variadic functions
The System V ABI on x86-64 specifies that the al register contains an upper
bound of the number of arguments passed in vector registers when calling
variadic functions, so we aren't allowed to clobber it.

checkasm_fail_func() is a variadic function so also zero al before calling it.

Signed-off-by: Anton Khirnov <anton@khirnov.net>
2015-09-28 14:25:59 +02:00
Henrik Gramner
5405584b7b checkasm: Use a self-balancing tree
Tested functions are internally kept in a binary search tree for efficient
lookups. The downside of the current implementation is that the tree quickly
becomes unbalanced which causes an unneccessary amount of comparisons between
nodes. Improve this by changing the tree into a self-balancing left-leaning
red-black tree with a worst case lookup/insertion time complexity of O(log n).

Significantly reduces the recursion depth and makes the tests run around 10%
faster overall. The relative performance improvement compared to the existing
non-balanced tree will also most likely increase as more tests are added.

Signed-off-by: Anton Khirnov <anton@khirnov.net>
2015-09-28 11:16:33 +02:00
Henrik Gramner
7ca1de5b4f checkasm/x86: Correctly handle variadic functions
The System V ABI on x86-64 specifies that the al register contains an upper
bound of the number of arguments passed in vector registers when calling
variadic functions, so we aren't allowed to clobber it.

checkasm_fail_func() is a variadic function so also zero al before calling it.
2015-09-27 20:21:26 +02:00
James Almer
4e03f0ab08 checkasm/vp9dsp: add const to suppress "discards const qualifier" warnings
Reviewed-by: Henrik Gramner <henrik@gramner.com>
Signed-off-by: James Almer <jamrial@gmail.com>
2015-09-26 16:35:39 -03:00
James Almer
af990d72b7 checkasm/Makefile: add missing testclean target
Reviewed-by: Henrik Gramner <henrik@gramner.com>
Signed-off-by: James Almer <jamrial@gmail.com>
2015-09-26 16:35:34 -03:00
Henrik Gramner
2ab65b652d checkasm: Use a self-balancing tree
Tested functions are internally kept in a binary search tree for efficient
lookups. The downside of the current implementation is that the tree quickly
becomes unbalanced which causes an unneccessary amount of comparisons between
nodes. Improve this by changing the tree into a self-balancing left-leaning
red-black tree with a worst case lookup/insertion time complexity of O(log n).

Significantly reduces the recursion depth and makes the tests run around 10%
faster overall. The relative performance improvement compared to the existing
non-balanced tree will also most likely increase as more tests are added.
2015-09-26 15:11:11 +02:00
Ronald S. Bultje
7a4b97e946 checkasm: clip vp9 loopfilter test pixels inside allowed bitdepth range. 2015-09-26 06:42:33 -04:00
Rodger Combs
f559812a84 tests/checkasm: make randomize_buffers a function for easier debugging
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2015-09-26 02:47:53 +02:00
Michael Niedermayer
5ba40c3c71 tests/checkasm/vp9dsp: Revert first hunk of bddcf758d3
The change was wrong, also add a comment explaining it

Found-by: BBB
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2015-09-24 18:34:43 +02:00
Ronald S. Bultje
350e9c6765 vp9: fix loopfilter test code to address Hendrik's comments.
(I forgot to actually merge them into the patch I just pushed.)
2015-09-21 20:44:14 -04:00
Rodger Combs
df2a2643fe tests/checkasm: fix stack smash in check_loopfilter
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2015-09-20 20:26:09 +02:00
Michael Niedermayer
bddcf758d3 tests/checkasm/vp9dsp: Add () to protect macro arguments
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2015-09-20 11:37:57 +02:00
Ronald S. Bultje
b074367405 checkasm: add VP9 loopfilter tests.
The randomize_buffer() implementation assures that "most of the time",
we'll do a good mix of wide16/wide8/hev/regular/no filters for complete
code coverage. However, this is not mathematically assured because that
would make the code either much more complex, or much less random.
2015-09-20 10:33:04 +02:00
James Almer
784792788b checkasm: add jpeg2000dsp rct_int tests
Reviewed-by: Henrik Gramner <henrik@gramner.com>
Signed-off-by: James Almer <jamrial@gmail.com>
2015-09-20 00:49:35 -03:00
James Almer
763ffa2029 checkasm: add flacdsp decorrelate tests
Reviewed-by: Henrik Gramner <henrik@gramner.com>
Signed-off-by: James Almer <jamrial@gmail.com>
2015-09-17 15:33:07 -03:00
Henrik Gramner
781a25e9c4 checkasm: v210: Fix array overwrite 2015-09-17 10:33:06 +02:00
Michael Niedermayer
a860adb49c tests/checkasm/vp9dsp: Use snprintf() for safetey
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2015-09-16 14:19:37 +02:00
Henrik Gramner
6115966ad3 checkasm: v210: Fix array overwrite 2015-09-16 13:50:09 +02:00
Henrik Gramner
985e7d8cc1 checkasm: v210: s/Libav/FFmpeg/ 2015-09-16 13:48:43 +02:00
Ronald S. Bultje
bbd44e124a checkasm: add vp9 intra pred tests. 2015-09-15 16:43:29 -04:00
Ronald S. Bultje
084451e1e4 checkasm: add vp9 MC tests. 2015-09-15 16:43:28 -04:00
Hendrik Leppkes
8537e24927 Merge commit '3cdda78deb19b39dbbf8961ae0aec44dbb19bf6d'
* commit '3cdda78deb19b39dbbf8961ae0aec44dbb19bf6d':
  checkasm: add unit tests for v210enc

Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>
2015-09-08 14:30:00 +02:00
Henrik Gramner
3cdda78deb checkasm: add unit tests for v210enc
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
2015-09-06 10:36:24 +02:00
Henrik Gramner
c457bdebe7 checkasm: Fix floating point arguments on 64-bit Windows
Signed-off-by: Anton Khirnov <anton@khirnov.net>
2015-08-28 09:54:54 +02:00
Henrik Gramner
33a58d7bf4 checkasm: Fix floating point arguments on 64-bit Windows 2015-08-25 19:34:46 +02:00
Henrik Gramner
515b69f8f8 checkasm: Explicitly declare function prototypes
Now we no longer have to rely on function pointers intentionally
declared without specified argument types.

This makes it easier to support functions with floating point parameters
or return values as well as functions returning 64-bit values on 32-bit
architectures. It also avoids having to explicitly cast strides to
ptrdiff_t for example.

Signed-off-by: Anton Khirnov <anton@khirnov.net>
2015-08-20 19:22:34 +02:00
Henrik Gramner
e13da244f4 checkasm: x86: properly save rdx/edx in checked_call()
If the return value doesn't fit in a single register rdx/edx can in some
cases be used in addition to rax/eax.

Doesn't affect any of the existing checkasm tests but might be useful later.

Also comment the relevant code a bit better.

Signed-off-by: Anton Khirnov <anton@khirnov.net>
2015-08-20 19:22:06 +02:00
Henrik Gramner
e6b8797b82 checkasm: x86: properly save rdx/edx in checked_call()
If the return value doesn't fit in a single register rdx/edx can in some
cases be used in addition to rax/eax.

Doesn't affect any of the existing checkasm tests but might be useful later.

Also comment the relevant code a bit better.
2015-08-19 16:17:35 +02:00
Henrik Gramner
18b101ff59 checkasm: Explicitly declare function prototypes
Now we no longer have to rely on function pointers intentionally
declared without specified argument types.

This makes it easier to support functions with floating point parameters
or return values as well as functions returning 64-bit values on 32-bit
architectures. It also avoids having to explicitly cast strides to
ptrdiff_t for example.
2015-08-19 16:17:35 +02:00
Henrik Gramner
8f4a06faf4 checkasm: Remove unnecessary include
Signed-off-by: Anton Khirnov <anton@khirnov.net>
2015-08-11 11:00:53 +02:00
Henrik Gramner
5e8e121fcc checkasm: Remove unnecessary include 2015-08-05 23:21:28 +02:00
Michael Niedermayer
1919827f2c Merge commit 'bf0cef5c3a114df452e5476167634dd8f51eb448'
* commit 'bf0cef5c3a114df452e5476167634dd8f51eb448':
  checkasm: Include io.h for isatty, if available

Merged-by: Michael Niedermayer <michael@niedermayer.cc>
2015-07-30 12:23:54 +02:00
Martin Storsjö
bf0cef5c3a checkasm: Include io.h for isatty, if available
configure does check for isatty, and checkasm properly checks
HAVE_ISATTY, but on some platforms (e.g. WinRT), io.h needs to be
included for isatty to be available.

Signed-off-by: Martin Storsjö <martin@martin.st>
2015-07-30 09:27:09 +03:00
Henrik Gramner
65c1480152 checkasm: Modify report format
Makes it a bit more clear where each test belongs.

Suggested by Anton Khirnov.

Signed-off-by: Anton Khirnov <anton@khirnov.net>
2015-07-27 07:45:11 +02:00
Michael Niedermayer
b940145c67 Merge commit '65c14801527068fcaf729eeffc142ffd4682a21a'
* commit '65c14801527068fcaf729eeffc142ffd4682a21a':
  checkasm: Modify report format

Merged-by: Michael Niedermayer <michael@niedermayer.cc>
2015-07-27 12:27:12 +02:00
Michael Niedermayer
ce466275f8 Merge commit '4d0d55cd623bcd504867f948849380f6b4060b4d'
* commit '4d0d55cd623bcd504867f948849380f6b4060b4d':
  checkasm: Use LOCAL_ALIGNED

See: f467fc02b4
See: 9e83ac6114
Merged-by: Michael Niedermayer <michael@niedermayer.cc>
2015-07-26 11:39:58 +02:00
Michael Niedermayer
4d0d55cd62 checkasm: Use LOCAL_ALIGNED
Fixes alignment issues and bus errors.

Signed-off-by: Martin Storsjö <martin@martin.st>
2015-07-26 10:36:22 +03:00
Michael Niedermayer
f467fc02b4 tests/checkasm/h264pred: Use LOCAL_ALIGNED_16()
Fixes alignment issue and bus errors

Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2015-07-23 00:44:02 +02:00
Michael Niedermayer
9e83ac6114 tests/checkasm/h264qpel: Use LOCAL_ALIGNED_16()
Fixes alignment issue and bus errors

Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2015-07-23 00:35:18 +02:00
Michael Niedermayer
e1b5a2e46e Merge commit 'e605bf3b590d295f215fcc9fd58eb11be55b68cb'
* commit 'e605bf3b590d295f215fcc9fd58eb11be55b68cb':
  checkasm: remove empty array initializer list in h264pred test

Merged-by: Michael Niedermayer <michael@niedermayer.cc>
2015-07-22 16:38:16 +02:00
Michael Niedermayer
c1692439e0 Merge commit '3ae0e721c7b6e0483801b9039b3d140e3b68b7f5'
* commit '3ae0e721c7b6e0483801b9039b3d140e3b68b7f5':
  checkasm: Always link statically

Merged-by: Michael Niedermayer <michael@niedermayer.cc>
2015-07-22 16:30:37 +02:00
Janne Grunau
e605bf3b59 checkasm: remove empty array initializer list in h264pred test
Fixes MSVC compilation.
2015-07-22 12:06:32 +02:00
Luca Barbato
3ae0e721c7 checkasm: Always link statically
Checkasm needs to use internal symbols that should not be made public.
2015-07-21 23:22:42 +02:00
Michael Niedermayer
593731efa8 tests/checkasm/Makefile: Fix checkasm with SDL
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2015-07-21 15:55:05 +02:00
Michael Niedermayer
e6b01480e5 Merge commit '82e6ac85ff9aa7631b8c01521b3d6b5ca0bc8014'
* commit '82e6ac85ff9aa7631b8c01521b3d6b5ca0bc8014':
  checkasm: test all architectures with optimisations

Merged-by: Michael Niedermayer <michael@niedermayer.cc>
2015-07-18 02:31:14 +02:00
Michael Niedermayer
78274b19f1 Merge commit '6cc4d3e9a982e926494f4b919d9733fe29774acf'
* commit '6cc4d3e9a982e926494f4b919d9733fe29774acf':
  checkasm: exit with status 0 instead of 1 if there are no tests to perform

Merged-by: Michael Niedermayer <michael@niedermayer.cc>
2015-07-18 02:22:43 +02:00
Michael Niedermayer
b1861f18b6 Merge commit 'fc56868399213d3e9be19bdebeb64df233b39a7e'
* commit 'fc56868399213d3e9be19bdebeb64df233b39a7e':
  cosmetics: Reformat checkasm tests

Merged-by: Michael Niedermayer <michael@niedermayer.cc>
2015-07-18 01:57:04 +02:00
Janne Grunau
82e6ac85ff checkasm: test all architectures with optimisations 2015-07-18 01:06:45 +02:00
Henrik Gramner
6cc4d3e9a9 checkasm: exit with status 0 instead of 1 if there are no tests to perform 2015-07-18 01:06:44 +02:00
Michael Niedermayer
cb33f8d0f4 checkasm: Give macro a body to avoid potential unexpected syntax issues 2015-07-18 01:06:44 +02:00
Michael Niedermayer
72d1409e23 Merge commit 'd37f23263584774e1798e9ac909a398304a05091'
* commit 'd37f23263584774e1798e9ac909a398304a05091':
  checkasm: Add unit tests for bswapdsp

Merged-by: Michael Niedermayer <michael@niedermayer.cc>
2015-07-17 23:26:59 +02:00
Luca Barbato
fc56868399 cosmetics: Reformat checkasm tests 2015-07-17 21:29:20 +02:00
Henrik Gramner
d37f232635 checkasm: Add unit tests for bswapdsp
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
2015-07-17 20:03:55 +02:00
Henrik Gramner
2cb34f82b9 checkasm: Add unit tests for h264qpel
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
2015-07-15 19:47:07 +02:00
Michael Niedermayer
a39512ba9e tests/checkasm/checkasm: Give macro a body to avoid potential unexpected syntax issues
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2015-07-16 04:35:14 +02:00
Michael Niedermayer
cbd4a1dbde Merge commit '2cb34f82b92c15b811f5c03dc7f61a4baf6e02e3'
* commit '2cb34f82b92c15b811f5c03dc7f61a4baf6e02e3':
  checkasm: Add unit tests for h264qpel

Merged-by: Michael Niedermayer <michael@niedermayer.cc>
2015-07-15 22:44:28 +02:00
Michael Niedermayer
7c944b0a36 tests/checkasm/x86/Makefile: Use ASMSTRIPFLAGS for asm
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2015-07-13 03:15:44 +02:00
Michael Niedermayer
f14fc55969 Merge commit '8bc67ec2c0d2b5444d51a1bed1d50f0e10d92717'
* commit '8bc67ec2c0d2b5444d51a1bed1d50f0e10d92717':
  Checkasm: assembly testing and benchmarking tool

Merged-by: Michael Niedermayer <michael@niedermayer.cc>
2015-07-12 21:03:06 +02:00
Henrik Gramner
8bc67ec2c0 Checkasm: assembly testing and benchmarking tool
It provides the following features:
 * verify correctness by comparing output to the C version.
 * detect failure to save and restore clobbered callee-saved registers.
 * detect 32-bit parameters being used as if they were 64-bit in x86-64
   (the upper halves are not guaranteed to be zero - but in practice
   they very often are, which makes those bugs hard to spot otherwise).
 * easy benchmarking.

Compile by running 'make checkasm'.
Execute by running 'tests/checkasm/checkasm'.

Optional arguments are '--bench' to run benchmarks for all functions,
'--bench=<pattern>' to run benchmarks for all functions that starts with
<pattern>, and '<integer>' to seed the PRNG for reproducible results.

Contains unit tests for most h264pred functions to get started, more tests
can be added afterwards using those as a reference.

Loosely based on code from x264. Currently only supports x86 and x86-64,
but additional architectures shouldn't be too much of an obstacle to add.

Note that functions with floating point parameters or floating point
return values are not supported. Some compiler-specific features or
preprocessor hacks would likely be required to add support for that.

Signed-off-by: Janne Grunau <janne-libav@jannau.net>
2015-07-12 16:39:07 +02:00