Stores the item ids of all the items found in the file and
processes the primary item at the end of the meta box. This patch
does not change any behavior. It sets up the code for parsing
alpha channel (and possibly images with 'grid') in follow up
patches.
Reviewed-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Signed-off-by: Vignesh Venkatasubramanian <vigneshv@google.com>
Signed-off-by: James Zern <jzern@google.com>
Streams with all zero sample_delta in 'stts' have all zero dts.
They have higher chance be chose by mov_find_next_sample(), which
leads to seek again and again.
For example, GoPro created a 'GoPro SOS' stream:
Stream #0:4[0x5](eng): Data: none (fdsc / 0x63736466), 13 kb/s (default)
Metadata:
creation_time : 2022-06-21T08:49:19.000000Z
handler_name : GoPro SOS
With 'ffprobe -show_frames http://example.com/gopro.mp4', ffprobe
blocks until all samples in 'GoPro SOS' stream are consumed first.
Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
Data does not have to be decrypted in 16-byte blocks for AES-CTR mode, so
existing buggy code can be hugely simplified.
Fixes ticket #9829.
Signed-off-by: Marton Balint <cus@passwd.hu>
In order to not generate 0 sized packets or create a huge index table
needlessly.
Fixes: Timeout
Fixes: 43717/clusterfuzz-testcase-minimized-ffmpeg_IO_DEMUXER_fuzzer-5206008287330304
Fixes: 45738/clusterfuzz-testcase-minimized-ffmpeg_IO_DEMUXER_fuzzer-6142535657979904
Signed-off-by: Marton Balint <cus@passwd.hu>
Update the still AVIF parser to only read the primary item. With this
patch, AVIF still images with exif/icc/alpha channel will no longer
fail to parse.
For example, this patch enables parsing of files in:
https://github.com/AOMediaCodec/av1-avif/tree/master/testFiles/Microsoft
Adding two fate tests:
1) demuxing of still image with 1 item - this test will pass regardless
of this patch.
2) demuxing of still image with 2 items - this test will fail without
this patch and will pass with patch applied.
Partially fixes trac ticket #7621
Signed-off-by: Vignesh Venkatasubramanian <vigneshv@google.com>
Signed-off-by: James Zern <jzern@google.com>
For ipcm and fpcm streams, big-endian format is the default, but it can be changed
with additional 'pcmC' sub-atom of audio sample description.
Details can be found in ISO/IEC 23003-5:2020
Fixes ticket #9763.
Fixes ticket #9790.
Patch simplified by Marton Balint.
Signed-off-by: Marton Balint <cus@passwd.hu>
Fixes: signed integer overflow: 536870913 * 536870913 cannot be represented in type 'int'
Fixes: 45862/clusterfuzz-testcase-minimized-ffmpeg_dem_MOV_fuzzer-4730373768085504
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
ff_codec_get_id loops over ff_codec_movvideo_tags (which is a large
array) two times. The result is unused most of the cases.
Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
The stsc_index is checked and updated for the next sample. If the
next sample needs to update stsd_index and stsc_index, then only
stsc_index is updated, which leads to a missing
AV_PKT_DATA_NEW_EXTRADATA. For example, the sample in the second
chunk needs to update both.
entry[0]
first_chunk = 1
samples_per_chunk = 3
sample_description_index = 1
entry[1]
first_chunk = 2
samples_per_chunk = 1
sample_description_index = 2
entry[2]
first_chunk = 3
samples_per_chunk = 8
sample_description_index = 2
The fix is simple: first check and update stsd_index for current
sample, then check and update stsc_index for the next.
Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
AVIF still and animations are now supported by the MOV parser.
Add the "avif" extension to the list of supported extensions to
AVInputFormat.
Signed-off-by: Vignesh Venkatasubramanian <vigneshv@google.com>
This patch supports AVIF still images conforming to the
final specification that have exactly one item (i.e. no alpha channel).
The iloc box is parsed and the mov index populated.
Partially fixes#7621.
Signed-off-by: Vignesh Venkatasubramanian <vigneshv@google.com>
Signed-off-by: Gyan Doshi <ffmpeg@gyani.pro>
60 fps content have "Number of Frames" set to 30 in the tmcd atom, but the
frame duration / timescale reflects the original video frame rate.
Therefore we multiply the frame count with the quotient of the rounded timecode
frame rate and the "Number of Frames" per second to get a frame count in the original
(higher) frame rate.
Note that the frames part in the timecode will be in high frame rate which will
make the timecode different to e.g. MediaInfo which seems to show the 30 fps
timecode even for 120 fps content.
Regression since 428b4aacb1.
Fixes ticket #9710.
Fixes ticket #9492.
Signed-off-by: Marton Balint <cus@passwd.hu>
This avoids unnecessary rebuilds of most source files if only the
list of enabled components has changed, but not the other properties
of the build, set in config.h.
Signed-off-by: Martin Storsjö <martin@martin.st>
As defined by Google's Spatial Audio RFC.
Signed-off-by: Vittorio Giovara <vittorio.giovara@gmail.com>
Signed-off-by: Anton Khirnov <anton@khirnov.net>
Signed-off-by: James Almer <jamrial@gmail.com>
This was tested with medias recorded from an iPhone XR and an iPhone 13.
Here is how a typical stream looks like in coding order:
┌────────┬─────┬─────┬──────────┐
│ sample | PTS | DTS | keyframe |
├────────┼─────┼─────┼──────────┤
┊ ┊ ┊ ┊ ┊
│ 53 │ 560 │ 510 │ No │
│ 54 │ 540 │ 520 │ No │
│ 55 │ 530 │ 530 │ No │
│ 56 │ 550 │ 540 │ No │
│ 57 │ 600 │ 550 │ Yes │
│ * 58 │ 580 │ 560 │ No │
│ * 59 │ 570 │ 570 │ No │
│ * 60 │ 590 │ 580 │ No │
│ 61 │ 640 │ 590 │ No │
│ 62 │ 620 │ 600 │ No │
┊ ┊ ┊ ┊ ┊
In composition/display order:
┌────────┬─────┬─────┬──────────┐
│ sample | PTS | DTS | keyframe |
├────────┼─────┼─────┼──────────┤
┊ ┊ ┊ ┊ ┊
│ 55 │ 530 │ 530 │ No │
│ 54 │ 540 │ 520 │ No │
│ 56 │ 550 │ 540 │ No │
│ 53 │ 560 │ 510 │ No │
│ * 59 │ 570 │ 570 │ No │
│ * 58 │ 580 │ 560 │ No │
│ * 60 │ 590 │ 580 │ No │
│ 57 │ 600 │ 550 │ Yes │
│ 63 │ 610 │ 610 │ No │
│ 62 │ 620 │ 600 │ No │
┊ ┊ ┊ ┊ ┊
Sample/frame 58, 59 and 60 are B-frames which actually depends on the
key frame (57). Here the key frame is not an IDR but a "CRA" (Clean
Random Access).
Initially, I thought I could rely on the sdtp box (independent and
disposable samples), but unfortunately:
sdtp[54] is_leading:0 sample_depends_on:1 sample_is_depended_on:0 sample_has_redundancy:0
sdtp[55] is_leading:0 sample_depends_on:1 sample_is_depended_on:2 sample_has_redundancy:0
sdtp[56] is_leading:0 sample_depends_on:1 sample_is_depended_on:2 sample_has_redundancy:0
sdtp[57] is_leading:0 sample_depends_on:2 sample_is_depended_on:0 sample_has_redundancy:0
sdtp[58] is_leading:0 sample_depends_on:1 sample_is_depended_on:0 sample_has_redundancy:0
sdtp[59] is_leading:0 sample_depends_on:1 sample_is_depended_on:2 sample_has_redundancy:0
sdtp[60] is_leading:0 sample_depends_on:1 sample_is_depended_on:2 sample_has_redundancy:0
sdtp[61] is_leading:0 sample_depends_on:1 sample_is_depended_on:0 sample_has_redundancy:0
sdtp[62] is_leading:0 sample_depends_on:1 sample_is_depended_on:0 sample_has_redundancy:0
The information that might have been useful here would have been
is_leading, but all the samples are set to 0 so this was unusable.
Instead, we need to rely on sgpd/sbgp tables. In my case the video track
contained 3 sgpd tables with the following grouping types: tscl, sync
and tsas. In the sync table we have the following 2 entries (only):
sgpd.sync[1]: sync nal_unit_type:0x14
sgpd.sync[2]: sync nal_unit_type:0x15
(The count starts at 1 because 0 carries the undefined semantic, we'll
see that later in the reference table).
The NAL unit types presented here correspond to:
libavcodec/hevc.h: HEVC_NAL_IDR_N_LP = 20,
libavcodec/hevc.h: HEVC_NAL_CRA_NUT = 21,
In parallel, the sbgp sync table contains the following:
┌────┬───────┬─────┐
│ id │ count │ gdi │
├────┼───────┼─────┤
│ 0 │ 1 │ 1 │
│ 1 │ 56 │ 0 │
│ 2 │ 1 │ 2 │
│ 3 │ 59 │ 0 │
│ 4 │ 1 │ 2 │
│ 5 │ 59 │ 0 │
│ 6 │ 1 │ 2 │
│ 7 │ 59 │ 0 │
│ 8 │ 1 │ 2 │
│ 9 │ 59 │ 0 │
│ 10 │ 1 │ 2 │
│ 11 │ 11 │ 0 │
└────┴───────┴─────┘
The gdi column (group description index) directly refers to the index in
the sgpd.sync table. This means the first frame is an IDR, then we have
batches of undefined frames interlaced with CRA frames. No IDR ever
appears again (tried on a 30+ seconds sample).
With that information, we can build an heuristic using the presentation
order.
A few things needed to be introduced in this commit:
1. min_sample_duration is extracted from the stts: we need the minimal
step between sample in order to PTS-step backward to a valid point
2. In order to avoid a loop over the ctts table systematically during a
seek, we build an expanded list of sample offsets which will be used
to translate from DTS to PTS
3. An open_key_samples index to keep track of all the non-IDR key
frames; for now it only supports HEVC CRA frames. We should probably
add BLA frames as well, but I don't have any sample so I prefered to
leave that for later
It is entirely possible I missed something obvious in my approach, but I
couldn't come up with a better solution. Also, as mentioned in the diff,
we could optimize is_open_key_sample(), but the linear scaling overhead
should be fine for now since it only happens in seek events.
Fixing this issue prevents sending broken packets to the decoder. With
FFmpeg hevc decoder the frames are skipped, with VideoToolbox the frames
are glitching.
sgpd means Sample Group Description Box.
For now, only the sync grouping type is parsed, but the function can
easily be adjusted to support other flavours.
The sbgp (Sample to Group Box) sync_group table built in previous commit
contains references to this table through the group_description_index
field.
It appears this is not allowed "Each Segment Index box documents how a (sub)segment is divided into one or more subsegments
(which may themselves be further subdivided using Segment Index boxes)."
Fixes: Null pointer dereference
Fixes: Ticket9517
Reviewed-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Fixes: signed integer overflow: -9223372036854775808 - 8 cannot be represented in type 'long'
Fixes: 43542/clusterfuzz-testcase-minimized-ffmpeg_dem_MOV_fuzzer-5237670148702208
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
MOVAtom.type is always read as a little-endian number
(despite MOV/ISOBMFF being big-endian).
Fixes the matroska-dovi-write-config8 FATE-test on big-endian
arches (which runs into the "index out of range" warning message).
Reviewed-by: Martin Storsjö <martin@martin.st>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
It is small (16 B) and therefore the overhead of exporting it more
than outweighs the size savings from not having duplicated symbols:
When the symbol is no longer avpriv, one saves twice the size of
the string containing the symbols name (2x30 byte), two entries
in .dynsym (24 bytes each on x64), one entry in the importing libraries
.got and .rela.dyn (8 + 24 bytes on x64) and two entries for the
symbol version (2 bytes each) and one hash value in the exporting
library (4 bytes).
(The exact numbers are of course different for other platforms
(e.g. when using dlls), but given that the strings saved alone
more than outweigh the array size it can be presumed that this
is beneficial for all platforms.)
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
To avoid duplicating code. The implementation in dovi_isom is identical.
Signed-off-by: quietvoid <tcChlisop0@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Very high stts sample deltas may occasionally be intended but usually
they are written in error or used to store a negative value for dts correction
when treated as signed 32-bit integers.
This option lets the user set an upper limit, beyond which the delta is clamped to 1.
Values greater than the limit if negative when cast to int32 are used to adjust onward dts.
Unit is the track time scale. Default is UINT_MAX - 48000*10 which
allows upto a 10 second dts correction for 48 kHz audio streams while
accommodating 99.9% of uint32 range.
Signed-off-by: Gyan Doshi <ffmpeg@gyani.pro>
Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
Fixes: signed integer overflow: 9223372036200463215 + 1109914409 cannot be represented in type 'long'
Fixes: 41480/clusterfuzz-testcase-minimized-ffmpeg_dem_MOV_fuzzer-6553086177443840
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Fixes: signed integer overflow: -776522110086937600 * 16 cannot be represented in type 'long'
Fixes: 40563/clusterfuzz-testcase-minimized-ffmpeg_dem_MOV_fuzzer-6644829447127040
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
As per 8.6.1.2.2 of ISO/IEC 14496-12:2015(E), STTS sample offsets
are to be always stored as uint32_t. So far, they have been signed ints
which led to desync in files with very large offsets.
The MOVStts struct was used to store CTTS offsets as well. These can be
negative in version 1. So a new struct MOVCtts was created and all
declarations for CTTS usage changed to MOVCtts.
bit_rate is not a critical field, and we shouln't hard fail if we
can't caluclate it due to a large timebase - it needlessly breaks
valid files.
Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
correct implementation of 'cenc' encryption scheme to support
decryption of partial cipher blocks at the end of subsamples
https://www.iso.org/standard/68042.html
Signed-off-by: Nachiket Tarate <nachiket.programmer@gmail.com>
Signed-off-by: Steven Liu <lq@chinaffmpeg.org>
This information is coded in a standard MP4 KindBox and utilizes the
scheme and values as per the DASH role scheme defined in MPEG-DASH.
Other schemes are technically allowed, but where multiple schemes
define the same concepts, the DASH scheme should be utilized.
Such flagging is additionally utilized by the DASH-IF CMAF ingest
specification, enabling an encoder to inform the following component
of the roles of the incoming media streams.
A test is added for this functionality in a similar manner to the
matroska test.
Signed-off-by: Jan Ekström <jan.ekstrom@24i.com>