Some encoders, like flac, can send side data only packets at the end.
Eventually, said extradata update should ideally be used to update the header
when writting to seekable output, but for now, ignore them.
Should fix the undefined behavior of passing NULL to memcpy().
Signed-off-by: James Almer <jamrial@gmail.com>
PFM (aka Portable FloatMap) encodes its scanlines from bottom-to-top,
not from top-to-bottom, unlike other NetPBM formats. Without this
patch, FFmpeg ignores this exception and decodes/encodes PFM images
mirrored vertically from their proper orientation.
For reference, see the NetPBM tool pfmtopam, which encodes a .pam
from a .pfm, using the correct orientation (and which FFmpeg reads
correctly). Also compare ffplay to magick display, which shows the
correct orientation as well.
See: http://www.pauldebevec.com/Research/HDR/PFM/ and see:
https://netpbm.sourceforge.net/doc/pfm.html for descriptions of this
image format.
Signed-off-by: Leo Izen <leo.izen@gmail.com>
Reviewed-by: Anton Khirnov <anton@khirnov.net>
Signed-off-by: James Almer <jamrial@gmail.com>
This filter, when used in the "pad" mode, currently makes the
distinction between limited and full range solely by testing for YUVJ
pixel formats at link setup time. This is deprecated and should be
improved to perform the detection based on the per-frame metadata.
In order to make this distinction based on color range metadata, which
is only known at the time of filtering frames, for simplicity, we simply
allocate two copies of the "black" frame - one for limited range and the
other for full range metadata. This could be done more dynamically (e.g.
as-needed or simply by blitting the appropriate pixel value directly),
but this change is relatively simple and preserves the structure of the
existing code.
This commit actually fixes a bug in FATE - the new output is correct for
the first time. The previous md5 ref was of a frame that incorrectly
combined full-range pixel data with limited-range black fields. The
corresponding result has been updated.
Signed-off-by: Niklas Haas <git@haasn.dev>
The idea behind last_pkt_props was to store the properties of the last packet
fed to the decoder. Any sort of queueing required by CODEC_CAP_DELAY decoders
that consume several packets before they start outputting frames should be done
by the decoders in question. An example of this is libdav1d.
This is required for the following commits that will fix last_pkt_props in
frame threading scenarios, as well as maintain its contents during flush.
This revers commit 022a12b306.
Signed-off-by: James Almer <jamrial@gmail.com>
The Encoding field (and the \fe tag) allows to limit font selection to
only those fonts declaring support for the specified codepage in their
OS/2's table "Code Page Character Range" field.
Particularly, Encoding=0 means only font's declaring support for "ANSI",
or rather "Latin (Western European)", are allowed to be selected.
Specifying Encoding=1 allows all fonts to be considered.
We do not want to limit font selection, so specify Encoding=1.
NB: at the time of writing libass only partially supports this field,
thus hiding the issue in any libass-based renderer. A VSFilter-based
DirectShow filter or XySubFilter will reveal the issue when a font not
declaring support for latin characters is specified in a style.
Colour values used in ASS files without a "YCbCr Matrix" header set to
"None" are subject to colour mangling, due to how ASS was historically
conceived. A more in-depth description can be found in the documetation
inside libass' public ass_types.h header. The important part is, if this
header is not set to "None", the final output colours can deviate from
the literal value specified in the file. When converting from non-ASS
formats we do not want any colour shift to happen, so let's set the
appropiate header.
NB: ffmpeg's subtitle filter, does not follow libass' documentation
regarding colour mangling, thus hiding the bug. Anything based on
VSFilter, XySubFilter or e.g. mpv do and might show the issue.
(Of course native ASS subs, which _do_ rely on colour mangling won't
work properly with the subtitle filter, but this can be fixed another
time)
It is valid for HEVC; in fact, the ATSC-HEVC spec [1] simply
refers to the relevant H.264 spec.
It is also trivial to implement now: Just move applying AFD
to ff_h2645_sei_to_frame() and stop ignoring AFD when parsing
a HEVC SEI containing it.
A FATE-test for this has been added.
[1]: https://www.atsc.org/atsc-documents/a3412017-video-hevc/
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
The encoder is sensitive to changes in precision, and its test target
was a compromise. It was already close to failing on x87 FPUs.
ff_mdct_init used double precision entirely from the scale to computing
the MDCT exp tables. av_tx_init uses single-precision for the scale,
with a small input change which was enough to tip the test into failing on
x87 FPUs.
Increase the fuzz factor in line with other AAC encoder tests to fix.
floating point uses a slightly different predictor technique describe here
http://chriscox.org/TIFFTN3d1.pdf
Signed-off-by: Anton Khirnov <anton@khirnov.net>
This commit enabled assembly code with intel AVX512 VNNI and added unit test for sobel filter
sobel_c: 4537
sobel_avx512icl 2136
Signed-off-by: bwang30 <bin.wang@intel.com>
Signed-off-by: Haihao Xiang <haihao.xiang@intel.com>
The encoder is fixed point, and uses an MDCT only for analysis. Due
to the slightly different rounding, the encoder makes a different
decision, so the tests have to be adjusted as well.
This patch replaces the transform used in AAC with lavu/tx and removes
the limitation on only being able to decode 960-sample files
with the float decoder.
This commit also removes a whole bunch of unnecessary and slow
lifting steps the decoder did to compensate for the poor accuracy
of the old integer transformation code.
Overall float decoder speedup on Zen 3 for 64kbps: 32%
Fixes ticket #128.
The SVQ1 interframe mean VLC symbols -128 and 128 are incorrectly swapped
in our SVQ1 implementation, resulting in visible artifacts for some videos.
This patch unswaps the order of these two symbols.
The most noticable example of the artiacts caused by this error can be observed in
https://trac.ffmpeg.org/attachment/ticket/128/svq1_set.7z '352_288_k_50.mov'.
The artifacts are not observed when using the reference decoder
(QuickTime 7.7.9 x86 binary).
As a result of this patch, the reference data for the fate-svq1 test
($SAMPLES/svq1/marymary-shackles.mov) must be modified. For this file, our
decoder output is now bitwise identical to the reference decoder. I have
tested patch with various other samples and they are all now bitwise identical.
This enables overriding the rotation as well as horizontal/vertical
flip state of a specific video stream on the input side.
Additionally, switch the singular test that was utilizing the rotation
metadata to instead override the input display rotation, thus leading
to the same result.
There is no MMX code for (add|put|put_signed)_pixels_clamped
since commit bfb28b5ce8, so use
declare_func instead of declare_func_emms() to also test that
we are not in MMX mode after return.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
There is no MMX code for diff_bytes since commit
230ea38de1, so use declare_func
instead of declare_func_emms() to also test that we are not
in MMX mode after return.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
There is no MMX code for add_int16 since commit
4b6ffc2880, so use declare_func
instead of declare_func_emms() to also test that we are not
in MMX mode after return.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
There is no MMX code for llviddsp after commit
fed07efcde, so use declare_func
instead of declare_func_emms() to also test that we are not
in MMX mode after return.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
There is no MMX code for pixblockdsp after commit
92b5800277, so use declare_func
instead of declare_func_emms() to also test that we are not
in MMX mode after return.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
There is no MMX code for audiodsp after commit
3d716d38ab, so use declare_func
instead of declare_func_emms() to also test that we are not
in MMX mode after return.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
There is no MMX code for blockdsp after commit
ee551a21dd, so use declare_func
instead of declare_func_emms() to also test that we are not
in MMX mode after return.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
There is no MMX code for vc1_inv_trans_8x8 or
vc1_unescape_buffer, so use declare_func instead of
declare_func_emms() to also test that we are not in MMX
mode after return.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
There is no MMX code for loop filters since commit
6a551f1405, so use declare_func
instead of declare_func_emms() to also test that we are not
in MMX mode after return.
Reviewed-by: Ronald S. Bultje <rsbultje@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
The data in SGI images is stored planar, so exporting
it via planar pixel formats is natural.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Unfortunately, it is common, and will remain so, that the Bit
manipulations are not enabled at compilation time. This is an official
policy for Debian ports in general (though they do not support RISC-V
officially as of yet) to stick to the minimal target baseline, which
does not include the B extension or even its Zbb subset.
For inline helpers (CPOP, REV8), compiler builtins (CTZ, CLZ) or
even plain C code (MIN, MAX, MINU, MAXU), run-time detection seems
impractical. But at least it can work for the byte-swap DSP functions.
This check is intended to be avoid buffer overflows,
yet there are four problems with it:
1. It has an in-built off-by-one error: len == out_end - out
is perfectly fine and nothing to worry about.
This off-by-one error led to the pixel in the lower-right corner
not being set properly for the back frame of the sample from
the rl2 FATE-test. This pixel is copied to every frame which
is the reason for the update to the reference file of said test.
With this patch, the output of the decoder matches the output
as captured from the reference decoder* (apart from the fact
that said reference somehow lacks the top part of the frame
(copied over from the background frame)).
2. Given that the stride of the buffer may be different
from the width of the video (despite one pixel taking one byte),
there is a second check lateron making the first check redundant
(if one returns immediately; a simple break at the second check
is not sufficient, because it only exits the inner loop).
3. The check is based around the assumption of the stride being
positive (it has this in common with the other check which
will be fixed in a future commit).
4. Even after fixing the off-by-one error, the check in
question is still triggered by all the non-background frames
in the FATE sample as well as by A1100100.RL2. In all these
cases, they use len == 255 and val == 128. For videos with
background frame this just means "copy from the background
frame", which would be done anyway lateron.* Yet for videos
without it copying it is necessary to avoid leaving
uninitialized parts in the video.
*: Available in https://samples.mplayerhq.hu/game-formats/voyeur-rl2/
**: Due to this, the code that copies the rest from the
back frame is no longer executed for any of the samples
available on the sample server. Given that these are only
the files from the demo version of this game, I don't know
whether this code is executed for any file in existence or not.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
RVV defines a total of 12 different extensions, including:
- 5 different instruction subsets:
- Zve32x: 8-, 16- and 32-bit integers,
- Zve32f: Zve32x plus single precision floats,
- Zve64x: Zve32x plus 64-bit integers,
- Zve64f: Zve32f plus Zve64x,
- Zve64d: Zve64f plus double precision floats.
- 6 different vector lengths:
- Zvl32b (embedded only),
- Zvl64b (embedded only),
- Zvl128b,
- Zvl256b,
- Zvl512b,
- Zvl1024b,
- and the V extension proper: equivalent to Zve64f and Zvl128b.
In total, there are 6 different possible sets of supported instructions
(including the empty set), but for convenience we allocate one bit for
each type sets: up-to-32-bit ints (RVV_I32), floats (RVV_F32),
64-bit ints (RVV_I64) and doubles (RVV_F64).
Whence the vector size is needed, it can be retrieved by reading the
unprivileged read-only vlenb CSR. This should probably be a separate
helper macro if needed at a later point.
This introduces compile-time and run-time CPU detection on RISC-V. In
practice, I doubt that FFmpeg will ever see a RISC-V CPU without all of
I, F and D extensions, and if it does, it probably won't have run-time
detection. So the flags are essentially always set.
But as things stand, checkasm wants them that way. Compare the ARMV8
flag on AArch64. We are nowhere near running short on CPU flag bits.
This also tests writing slice data in the unaligned mode
(some of these files use CAVLC) as well as updating
side data as well as parsing ISOBMFF avcc extradata.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
~4x faster than the C version.
The shuffles in the 15pt dim1 are seriously expensive. Not happy with it,
but I'm contempt.
Can be easily converted to pure AVX by removing all vpermpd/vpermps
instructions.
Old one was written with the assumption only even inputs would be given.
This very messy replacement supports even and odd inputs, and supports
AVX2 for extra speed. The buffers given are usually quite big (4k samples),
so the speedup is worth it.
The new SSE version is still faster than the old inline asm version by 33%.
Also checkasm is provided to make sure this monstrosity works.
This fixes some FATE tests.
The aim of this test is to show the interleavement
of the file generated in the first pass; so make the
interleavement queue in the framecrc muxer in the second
pass as small as possible so that the framecrc muxer does not
fix wrong interleavement of the input file behind our backs.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
enc_dec is designed for raw input and output and computes
the PSNR between these two. The input of the shortest-sub
test is the idx file of a vobsub sub+idx combination
and the output is the output of framecrc of said vobsub
subtitle muxed into Matroska together with a synthesized
video. Calculating the PSNR between these two files makes
no sense, therefore switch to a transcode test, where
the ref file file contains the output of framecrc directly,
making the interleavement better visible in the ref file
at the cost of a larger ref file (>400 lines).
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Also covers writing mastering display metadata.
Reviewed-by: Tomas Härdin <tjoppen@acc.umu.se>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Do this by setting AVCodecInternal.pad_samples.
This prevents reading into the frame's padding and writing
into the packet's padding.
This actually happened in our FATE tests (where the number of samples
is 2 mod 4), which therefore needed to be updated.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
APTX decodes four bytes of input to four stereo samples; APTX HD
does the same with six bytes of input. So it can be easily supported
in av_get_audio_frame_duration().
This fixes invalid durations and (derived) timestamps of demuxed
APTX HD packets and therefore fixed the timestamp in the aptx-hd
FATE test.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
We have de- and encoders for APTX and APTX HD, yet not FATE tests.
This commit therefore adds a transcoding test to utilize them.
Furthermore, during creating these tests it turned out that
the duration is set incorrectly for APTX HD. This will be fixed
in a future commit.
(Thanks to Andriy Gelman for finding an issue in an earlier version
that used a 192kHz input sample which does not work reliably accross
platforms.)
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Since introducing the various packed formats used by VAAPI (and p012),
we've noticed that there's actually a gap in how
av_find_best_pix_fmt_of_2 works. It doesn't actually assign any value
to having the same bit depth as the source format, when comparing
against formats with a higher bit depth. This usually doesn't matter,
because av_get_padded_bits_per_pixel() will account for it.
However, as many of these formats use padding internally, we find that
av_get_padded_bits_per_pixel() actually returns the same value for the
10 bit, 12 bit, 16 bit flavours, etc. In these tied situations, we end
up just picking the first of the two provided formats, even if the
second one should be preferred because it matches the actual bit depth.
This bug already existed if you tried to compare yuv420p10 against p016
and p010, for example, but it simply hadn't come up before so we never
noticed.
But now, we actually got a situation in the VAAPI VP9 decoder where it
offers both p010 and p012 because Profile 3 could be either depth and
ends up picking p012 for 10 bit content due to the ordering of the
testing.
In addition, in the process of testing the fix, I realised we have the
same gap when it comes to chroma subsampling - we do not favour a
format that has exactly the same subsampling vs one with less
subsampling when all else is equal.
To fix this, I'm introducing a small score penalty if the bit depth or
subsampling doesn't exactly match the source format. This will break
the tie in favour of the format with the exact match, but not offset
any of the other scoring penalties we already have.
I have added a set of tests around these formats which will fail
without this fix.
These tests test both the demuxer as well as the muxer
wherever possible. It is not always possible due to the fact
that the muxer supports more codecs than the demuxer.
The spdif demuxer does currently not set the need_parsing flag.
If one were to set this to AVSTREAM_PARSE_FULL, the test results
would change as follows:
- For spdif-aac-remux, the packets are currently padded to 16bits,
i.e. if the actual packet size is odd, there is a padding byte.
The parser splits this byte away into a one byte packet of its own.
Insanely, these one byte packets get the same duration as normal
packets, i.e. timing is ruined.
- The DCA-remux tests get proper duration/timestamps.
- In the spdif-mp2-remux test the demuxer marks the stream as
being MP2; the parser sets it to MP3 and this triggers
the "Codec change in IEC 61937" codepath; this test therefore
returns only two packets with the parser.
- For spdif-mp3-remux some bytes end up in different packets:
Some input packets of this file have an odd length (417B instead
of 418B like all the other packets) and are padded to 418B.
Without a parser, all returned packets from the spdif-demuxer
are 418B. With a parser, the packets that were originally 417B
are 417B again, but the padding byte has not been discarded,
but added to the next packet which is now 419B.
This fixes "Multiple frames in a packet" warning and avoids
an "Invalid data found when processing input" error when decoding.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
This duration is equal to the longest duration in all track's tkhd atoms, which
may be comprised of the sum of all edit lists in each track. Empty edit lists
in tracks represent start_time, and the actual media duration is stored in the
mdhd atom.
This change lets the generic demux code derive the longest track duration taken
from mdhd atoms, so the correct duration and start_time combination will be
reported.
Should fix ticket #9775.
Reviewed-by: zhilizhao(赵志立) <quinkblack@foxmail.com>
Signed-off-by: James Almer <jamrial@gmail.com>
The field is not specific to Opus.
The mp2fixed encoder signals initial_padding and is used
by both the matroska-encoding-delay test as well as
the lavf-mkv tests which necessitated several FATE ref changes.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Matroska generally requires timestamps to be nonnegative, but
there is an exception: Data that corresponds to encoder delay
and is not supposed to be output anyway can have a negative
timestamp. This is achieved by using the CodecDelay header
field: The demuxer has to subtract this value from the raw
(nonnegative) timestamps of the corresponding track.
Therefore the muxer has to add this value first to write
this raw timestamp.
Support for writing CodecDelay has been added in FFmpeg commit
d92b1b1bab and in Libav commit
a1aa37dd0b. The former simply
wrote the header field and did not apply any timestamp offsets,
leading to desynchronisation (if one uses multiple tracks).
The latter applied it at two places, but not at the one where
it actually matters, namely in mkv_write_block(), leading to
the same desynchronisation as with the former commit. It furthermore
used the wrong stream timebase to convert the delay to the
stream's timebase, as the conversion used the timebase from
before avpriv_set_pts_info().
When the latter was merged in 82e4f39883,
it was only done in a deactivated state that still did not
offset the timestamps when muxing due to "assertion failures
and av sync errors". a1aa37dd0b
made it definitely more likely to run into assertion failures
(namely if the relative block timestamp doesn't fit into an int16_t).
Yet all of the above issues have been fixed (in commits
962d631573,
5d3953a5dc and
4ebeab15b0. This commit therefore
enables applying CodecDelay, fixing ticket #7182.
There is just one slight regression from this: If one has input
with encoder delay where the first timestamp is negative, but
the pts of the part of the data that is actually intended to be
output is nonnegative, then the timestamps will currently by default
be shifted to make them nonnegative before they reach the muxer;
the muxer will then ensure that the shifted timestamps are retained.
Before this commit, the muxer did not ensure this; instead the
timestamps that the demuxer will output were shifted and
if the first timestamp of the actually intended output was zero
before shifting, then this unintentional shift just cancels
the shift performed before the packet reached the muxer.
(But notice that this only applies if all the tracks use the same
CodecDelay, or the relative sync between tracks will be impaired.)
This happens in the matroska-opus-remux and matroska-ogg-opus-remux
FATE tests. Future commits will forward the information that
the Matroska muxer has a limited capability to handle negative
timestamps so that the shifting in libavformat can take advantage
of it.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
It is possible for the trailing padding to be zero, namely
e.g. if the AV_PKT_DATA_SKIP_SAMPLES side data is used
for leading padding. Matroska supports this (use a negative
DiscardPadding), but players do not; at least Firefox refuses
to play such a file. So for now only write DiscardPadding
if it is trailing padding and nonzero.
The fate-matroska-ogg-opus-remux was affected by this.
(I wish CodecDelay would not exist and DiscardPadding would
be used to instead trim the codec delay away (with the Block
timestamp corresponding to the time at which the actually
output audio is output).)
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
The function contains only two assignments, setting DVVideoContext.avctx
and AVCodecContext.chroma_sample_location. However, the decoder does not
use the former, and the encoder should not be setting the latter.
Therefore move the first assignment to dvenc and the second to dvdec.
Make the encoder warn if the user-signalled chroma sample location does
not match the supported one, and return an error on higher compliance
levels.
These are the formats we want/need to use when dealing with the Intel
VAAPI decoder for 12bit 4:2:0, 12bit 4:2:2, 10bit 4:4:4 and 12bit 4:4:4
respectively.
As with the already supported Y210 and YUVX (XVUY) formats, they are
based on formats Microsoft picked as their preferred 4:2:2 and 4:4:4
video formats, and Intel ran with it.
P12 and Y212 are simply an extension of 10 bit formats to say 12 bits
will be used, with 4 unused bits instead of 6.
XV30, and XV36, as exotic as they sound, are variants of Y410 and Y412
where the alpha channel is left formally undefined. We prefer these
over the alpha versions because the hardware cannot actually do
anything with the alpha channel and respecting it is just overhead.
Y412/XV46 is a normal looking packed 4 channel format where each
channel is 16bits wide but only the 12msb are used (like P012).
Y410/XV30 packs three 10bit channels in 32bits with 2bits of alpha,
like A/X2RGB10 style formats. This annoying layout forced me to define
the BE version as a bitstream format. It seems like our pixdesc
infrastructure can handle the LE version being byte-defined, but not
when it's reversed. If there's a better way to handle this, please
let me know. Our existing X2 formats all have the 2 bits at the MSB
end, but this format places them at the LSB end and that seems to be
the root of the problem.
As we already have support for VUYA, I figured I should do the small
amount of work to support VUYX as well. That means a little refactoring
to share code.
This is the alphaless version of VUYA that I introduced recently. After
further discussion and noting that the Intel vaapi driver explicitly
lists XYUV as a support format for encoding and decoding 8bit 444
content, we decided to switch our usage and avoid the overhead of
having a declared alpha channel around.
Note that I am not removing VUYA, as this turned out to have another
use, which was to replace the need for v408enc/dec when dealing with
the format.
The vaapi switching will happen in the next change