1
0
mirror of https://github.com/FFmpeg/FFmpeg.git synced 2024-11-21 10:55:51 +02:00
Commit Graph

5631 Commits

Author SHA1 Message Date
bwang30
3ab11dc5bb libavfilter/x86/vf_convolution: add sobel filter optimization and unit test with intel AVX512 VNNI
This commit enabled assembly code with intel AVX512 VNNI and added unit test for sobel filter

sobel_c: 4537
sobel_avx512icl 2136

Signed-off-by: bwang30 <bin.wang@intel.com>
Signed-off-by: Haihao Xiang <haihao.xiang@intel.com>
2022-11-14 10:04:16 +08:00
Peter Ross
b48d2320f1 fate/video: vqc testcase 2022-11-07 16:08:35 +11:00
Peter Ross
e75e3ac106 fate/audio: msnsiren test case 2022-11-07 16:08:35 +11:00
Peter Ross
99499125ed fate/microsoft: add mss2 region test case 2022-11-07 16:08:35 +11:00
Lynne
e6afa61be9
imc: convert to lavu/tx, remove NIH iMDCT and replace with a standard one 2022-11-06 14:39:42 +01:00
Lynne
b428003c1c
dcaenc: convert to lavu/tx
The encoder is fixed point, and uses an MDCT only for analysis. Due
to the slightly different rounding, the encoder makes a different
decision, so the tests have to be adjusted as well.
2022-11-06 14:39:37 +01:00
Lynne
e0661fc805
dca_core: convert to lavu/tx
Thanks to Martin Storsjö <martin@martin.st> for fixing and testing the
arm32 and aarch64 changes.
2022-11-06 14:39:36 +01:00
Lynne
469cd8d7fa
aacdec: convert to lavu/tx and support fixed-point 960-sample decoding
This patch replaces the transform used in AAC with lavu/tx and removes
the limitation on only being able to decode 960-sample files
with the float decoder.
This commit also removes a whole bunch of unnecessary and slow
lifting steps the decoder did to compensate for the poor accuracy
of the old integer transformation code.

Overall float decoder speedup on Zen 3 for 64kbps: 32%
2022-11-06 14:39:33 +01:00
Lynne
4cee7ebd75
ac3: convert to lavu/tx 2022-11-06 14:39:27 +01:00
James Darnley
1936c06f02 checkasm: add a verbose check function for uint32_t data 2022-11-04 19:37:46 +01:00
James Almer
6228ba141d avutil/channel_layout: add a 7.1(top) channel layout
Signed-off-by: James Almer <jamrial@gmail.com>
2022-11-03 19:39:45 -03:00
Pierre-Anthony Lemieux
906219e3ca
avformat/tests/imf: add CPL timecode test 2022-11-03 21:16:10 +10:00
Peter Ross
6fe8556a19 avcodec/svq1: fix interframe mean VLC symbols
Fixes ticket #128.

The SVQ1 interframe mean VLC symbols -128 and 128 are incorrectly swapped
in our SVQ1 implementation, resulting in visible artifacts for some videos.
This patch unswaps the order of these two symbols.

The most noticable example of the artiacts caused by this error can be observed in
https://trac.ffmpeg.org/attachment/ticket/128/svq1_set.7z '352_288_k_50.mov'.
The artifacts are not observed when using the reference decoder
(QuickTime 7.7.9 x86 binary).

As a result of this patch, the reference data for the fate-svq1 test
($SAMPLES/svq1/marymary-shackles.mov) must be modified. For this file, our
decoder output is now bitwise identical to the reference decoder. I have
tested patch with various other samples and they are all now bitwise identical.
2022-11-01 09:24:29 +11:00
Peter Ross
b0c1f248d9 avcodec/svq1enc: output ident string in extradata field
This will enable the acurate identification of FFmpeg produced
SVQ1 streams, should there be new bugs found in the encoder.
2022-11-01 09:24:29 +11:00
Peter Ross
e1dd4a27ca avcodec/svq1enc: do not use ambiguous interframe mean symbols
Don't emit interframe mean symbols -128 and 128.
2022-11-01 09:24:29 +11:00
James Almer
83e918de71 avutil/channel_layout: add a cube channel layout
Signed-off-by: James Almer <jamrial@gmail.com>
2022-10-30 16:18:30 -03:00
Jan Ekström
c889248647 ffmpeg: Add display_{rotation, hflip, vflip} options
This enables overriding the rotation as well as horizontal/vertical
flip state of a specific video stream on the input side.

Additionally, switch the singular test that was utilizing the rotation
metadata to instead override the input display rotation, thus leading
to the same result.
2022-10-19 11:53:52 +02:00
Andreas Rheinhardt
37ee36f689 checkasm/idctdsp: Use declare_func_emms only when needed
There is no MMX code for (add|put|put_signed)_pixels_clamped
since commit bfb28b5ce8, so use
declare_func instead of declare_func_emms() to also test that
we are not in MMX mode after return.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2022-10-11 14:18:54 +02:00
Andreas Rheinhardt
5102b98b7a checkasm/llviddspenc: Use declare_func_emms only when needed
There is no MMX code for diff_bytes since commit
230ea38de1, so use declare_func
instead of declare_func_emms() to also test that we are not
in MMX mode after return.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2022-10-11 14:18:54 +02:00
Andreas Rheinhardt
e814569c8d checkasm/huffyuvdsp: Use declare_func_emms only when needed
There is no MMX code for add_int16 since commit
4b6ffc2880, so use declare_func
instead of declare_func_emms() to also test that we are not
in MMX mode after return.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2022-10-11 14:18:54 +02:00
Andreas Rheinhardt
cd8a33bcce checkasm/llviddsp: Be strict about MMX
There is no MMX code for llviddsp after commit
fed07efcde, so use declare_func
instead of declare_func_emms() to also test that we are not
in MMX mode after return.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2022-10-11 14:18:54 +02:00
Andreas Rheinhardt
b4e2d67636 checkasm/pixblockdsp: Be strict about MMX
There is no MMX code for pixblockdsp after commit
92b5800277, so use declare_func
instead of declare_func_emms() to also test that we are not
in MMX mode after return.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2022-10-11 14:18:54 +02:00
Andreas Rheinhardt
42921190cb checkasm/audiodsp: Be strict about MMX
There is no MMX code for audiodsp after commit
3d716d38ab, so use declare_func
instead of declare_func_emms() to also test that we are not
in MMX mode after return.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2022-10-11 14:18:54 +02:00
Andreas Rheinhardt
18afaa20f1 checkasm/blockdsp: Be strict about MMX
There is no MMX code for blockdsp after commit
ee551a21dd, so use declare_func
instead of declare_func_emms() to also test that we are not
in MMX mode after return.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2022-10-11 14:18:54 +02:00
Andreas Rheinhardt
f224c195e0 checkasm/vc1dsp: Use declare_func_emms only when needed
There is no MMX code for vc1_inv_trans_8x8 or
vc1_unescape_buffer, so use declare_func instead of
declare_func_emms() to also test that we are not in MMX
mode after return.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2022-10-11 14:18:54 +02:00
Rémi Denis-Courmont
c962c78901 checkasm: RISC-V 64-bit assembler test harness 2022-10-10 02:23:18 +02:00
Andreas Rheinhardt
bcfa427c8f checkasm/vp8dsp: Use declare_func_emms only when needed
There is no MMX code for loop filters since commit
6a551f1405, so use declare_func
instead of declare_func_emms() to also test that we are not
in MMX mode after return.

Reviewed-by: Ronald S. Bultje <rsbultje@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2022-10-08 09:33:36 +02:00
Andreas Rheinhardt
406c7fceeb fate/vcodec: Add speedhq tests
The vsynth3 tests are disabled, because the encoder produces garbage.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2022-10-06 15:00:21 +02:00
Andreas Rheinhardt
ce4713ea73 avcodec/sgidec: Use planar pixel formats
The data in SGI images is stored planar, so exporting
it via planar pixel formats is natural.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2022-10-05 14:38:51 +02:00
Rémi Denis-Courmont
37d5ddc317 lavu/riscv: CPU flag for the Zbb extension
Unfortunately, it is common, and will remain so, that the Bit
manipulations are not enabled at compilation time. This is an official
policy for Debian ports in general (though they do not support RISC-V
officially as of yet) to stick to the minimal target baseline, which
does not include the B extension or even its Zbb subset.

For inline helpers (CPOP, REV8), compiler builtins (CTZ, CLZ) or
even plain C code (MIN, MAX, MINU, MAXU), run-time detection seems
impractical. But at least it can work for the byte-swap DSP functions.
2022-10-05 08:26:19 +02:00
Michael Niedermayer
c5f61c99f9
tests/fate/truehd: Add test for shortened Ticket1726 testcase
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2022-10-04 23:47:53 +02:00
Tristan Matthews
1d326e9187 fate/opus: add silk LBRR test (refs #9890)
This adds a fate test for a sample with LBRR packets.

Signed-off-by: Anton Khirnov <anton@khirnov.net>
2022-10-04 11:54:57 +02:00
Andreas Rheinhardt
98aaaf08b3 avcodec/rl2: Remove wrong check
This check is intended to be avoid buffer overflows,
yet there are four problems with it:
1. It has an in-built off-by-one error: len == out_end - out
is perfectly fine and nothing to worry about.
This off-by-one error led to the pixel in the lower-right corner
not being set properly for the back frame of the sample from
the rl2 FATE-test. This pixel is copied to every frame which
is the reason for the update to the reference file of said test.
With this patch, the output of the decoder matches the output
as captured from the reference decoder* (apart from the fact
that said reference somehow lacks the top part of the frame
(copied over from the background frame)).
2. Given that the stride of the buffer may be different
from the width of the video (despite one pixel taking one byte),
there is a second check lateron making the first check redundant
(if one returns immediately; a simple break at the second check
is not sufficient, because it only exits the inner loop).
3. The check is based around the assumption of the stride being
positive (it has this in common with the other check which
will be fixed in a future commit).
4. Even after fixing the off-by-one error, the check in
question is still triggered by all the non-background frames
in the FATE sample as well as by A1100100.RL2. In all these
cases, they use len == 255 and val == 128. For videos with
background frame this just means "copy from the background
frame", which would be done anyway lateron.* Yet for videos
without it copying it is necessary to avoid leaving
uninitialized parts in the video.

*: Available in https://samples.mplayerhq.hu/game-formats/voyeur-rl2/
**: Due to this, the code that copies the rest from the
back frame is no longer executed for any of the samples
available on the sample server. Given that these are only
the files from the demo version of this game, I don't know
whether this code is executed for any file in existence or not.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2022-10-02 20:27:36 +02:00
Rémi Denis-Courmont
0c0a3deb18 lavu/cpu: CPU flags for the RISC-V Vector extension
RVV defines a total of 12 different extensions, including:

- 5 different instruction subsets:
  - Zve32x: 8-, 16- and 32-bit integers,
  - Zve32f: Zve32x plus single precision floats,
  - Zve64x: Zve32x plus 64-bit integers,
  - Zve64f: Zve32f plus Zve64x,
  - Zve64d: Zve64f plus double precision floats.

- 6 different vector lengths:
  - Zvl32b (embedded only),
  - Zvl64b (embedded only),
  - Zvl128b,
  - Zvl256b,
  - Zvl512b,
  - Zvl1024b,

- and the V extension proper: equivalent to Zve64f and Zvl128b.

In total, there are 6 different possible sets of supported instructions
(including the empty set), but for convenience we allocate one bit for
each type sets: up-to-32-bit ints (RVV_I32), floats (RVV_F32),
64-bit ints (RVV_I64) and doubles (RVV_F64).

Whence the vector size is needed, it can be retrieved by reading the
unprivileged read-only vlenb CSR. This should probably be a separate
helper macro if needed at a later point.
2022-09-27 13:19:52 +02:00
Rémi Denis-Courmont
b95e2fbd85 lavu/cpu: detect RISC-V base extensions
This introduces compile-time and run-time CPU detection on RISC-V. In
practice, I doubt that FFmpeg will ever see a RISC-V CPU without all of
I, F and D extensions, and if it does, it probably won't have run-time
detection. So the flags are essentially always set.

But as things stand, checkasm wants them that way. Compare the ARMV8
flag on AArch64. We are nowhere near running short on CPU flag bits.
2022-09-27 13:19:52 +02:00
Paul B Mahol
7bb0afc245 avutil: add RGBA single-float precision packed formats 2022-09-25 18:34:48 +02:00
Paul B Mahol
63bb6d6a9b avutil: add RGB single-precision float formats 2022-09-25 18:34:48 +02:00
Andreas Rheinhardt
54b29e1656 fate/cbs: Add tests for h264_redundant_pps BSF
This also tests writing slice data in the unaligned mode
(some of these files use CAVLC) as well as updating
side data as well as parsing ISOBMFF avcc extradata.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2022-09-25 14:56:08 +02:00
Lynne
ace42cf581
x86/tx_float: add 15xN PFA FFT AVX SIMD
~4x faster than the C version.
The shuffles in the 15pt dim1 are seriously expensive. Not happy with it,
but I'm contempt.

Can be easily converted to pure AVX by removing all vpermpd/vpermps
instructions.
2022-09-23 12:35:27 +02:00
Lynne
668f43af20
tests/checkasm/lpc: correct arithmetic when randomizing buffers
Results weren't signed.
2022-09-23 01:50:59 +02:00
Lynne
6ad39f01df
tests/checkasm/lpc: reduce range and use signed values
This is more similar to its regular use, and prevents inaccuracies
of huge float*float multiplications from failing the tests.
2022-09-23 01:42:34 +02:00
James Almer
9cbfffa0d4 tests/checkasm/lpc: print mismatching values
Will help debugging.

Signed-off-by: James Almer <jamrial@gmail.com>
2022-09-22 18:18:52 -03:00
James Almer
a1c6f4b653 tests/checkasm/lpc: randomize buffer length
Simplifies the test, while trying more values and preventing pointlessly
running benchmarks in a loop.

Signed-off-by: James Almer <jamrial@gmail.com>
2022-09-22 18:17:26 -03:00
James Almer
c8c4a162fc avcodec/lpc: use ptrdiff_t for length parameters
Signed-off-by: James Almer <jamrial@gmail.com>
2022-09-22 18:17:26 -03:00
Lynne
f2d75d7fb0
fate/checkasm: add LPC test to list 2022-09-22 04:27:20 +02:00
Lynne
b67776e12f
x86/lpc: fix even scalar loop overreads/writes
Passes checkasm with valgrind, tested to sizes of more than 4000 samples.
2022-09-22 04:27:19 +02:00
Andreas Rheinhardt
9beba05311 avcodec/fmtconvert: Remove unused AVCodecContext parameter
Unused since d74a8cb7e4.

Reviewed-by: Rémi Denis-Courmont <remi@remlab.net>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2022-09-21 20:26:40 +02:00
Andreas Rheinhardt
fd72d8aea3 avcodec/blockdsp: Remove unused AVCodecContext parameter
Possible since be95df12bb.

Reviewed-by: Rémi Denis-Courmont <remi@remlab.net>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2022-09-21 20:24:40 +02:00
Andreas Rheinhardt
6a288ada55 fate/lavf-*: Add missing dependency on pipe protocol
Forgotten in bf1337f99c.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2022-09-21 15:43:52 +02:00
Lynne
3ade6a8644
x86/lpc: implement a new Welch windowing function
Old one was written with the assumption only even inputs would be given.
This very messy replacement supports even and odd inputs, and supports
AVX2 for extra speed. The buffers given are usually quite big (4k samples),
so the speedup is worth it.
The new SSE version is still faster than the old inline asm version by 33%.

Also checkasm is provided to make sure this monstrosity works.

This fixes some FATE tests.
2022-09-21 07:12:39 +02:00