mirror of https://github.com/FFmpeg/FFmpeg.git synced 2025-08-10 06:10:52 +02:00

Clément Bœsch ab77b878f1 avformat/mov: fix seeking with HEVC open GOP files

This was tested with media recorded on an iPhone XR and an iPhone 13.

Here is what a typical stream looks like in coding order:

    ┌────────┬─────┬─────┬──────────┐
    │ sample │ PTS │ DTS │ keyframe │
    ├────────┼─────┼─────┼──────────┤
    ┊        ┊     ┊     ┊          ┊
    │   53   │ 560 │ 510 │    No    │
    │   54   │ 540 │ 520 │    No    │
    │   55   │ 530 │ 530 │    No    │
    │   56   │ 550 │ 540 │    No    │
    │   57   │ 600 │ 550 │    Yes   │
    │ * 58   │ 580 │ 560 │    No    │
    │ * 59   │ 570 │ 570 │    No    │
    │ * 60   │ 590 │ 580 │    No    │
    │   61   │ 640 │ 590 │    No    │
    │   62   │ 620 │ 600 │    No    │
    ┊        ┊     ┊     ┊          ┊

In composition/display order:

    ┌────────┬─────┬─────┬──────────┐
    │ sample │ PTS │ DTS │ keyframe │
    ├────────┼─────┼─────┼──────────┤
    ┊        ┊     ┊     ┊          ┊
    │   55   │ 530 │ 530 │    No    │
    │   54   │ 540 │ 520 │    No    │
    │   56   │ 550 │ 540 │    No    │
    │   53   │ 560 │ 510 │    No    │
    │ * 59   │ 570 │ 570 │    No    │
    │ * 58   │ 580 │ 560 │    No    │
    │ * 60   │ 590 │ 580 │    No    │
    │   57   │ 600 │ 550 │    Yes   │
    │   63   │ 610 │ 610 │    No    │
    │   62   │ 620 │ 600 │    No    │
    ┊        ┊     ┊     ┊          ┊

Samples/frames 58, 59 and 60 (marked with *) are B-frames which actually
depend on the key frame (57). Here the key frame is not an IDR but a
"CRA" (Clean Random Access).
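
The starred samples can be identified mechanically from the two tables: they
follow the key frame in coding order but have a lower PTS. The following is a
minimal sketch of that check, not code from the patch; `struct sample_info`
and `count_leading_samples()` are hypothetical names introduced for
illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-sample record mirroring the columns of the tables above. */
struct sample_info {
    int     id;        /* sample number */
    int64_t pts;
    int64_t dts;
    int     keyframe;  /* 1 if flagged as key frame */
};

/*
 * Count the samples that follow samples[key] in coding order but must be
 * displayed before it, i.e. whose PTS is lower than the key frame's PTS.
 * `samples` is expected to be in coding (DTS) order.
 */
static size_t count_leading_samples(const struct sample_info *samples,
                                    size_t nb_samples, size_t key)
{
    size_t n = 0;

    for (size_t i = key + 1; i < nb_samples; i++)
        if (samples[i].pts < samples[key].pts)
            n++;
    return n;
}
```

Fed with the ten rows of the coding-order table and `key` pointing at sample
57, this counts samples 58, 59 and 60.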

Initially, I thought I could rely on the sdtp box (independent and
disposable samples), but unfortunately:

    sdtp[54] is_leading:0 sample_depends_on:1 sample_is_depended_on:0 sample_has_redundancy:0
    sdtp[55] is_leading:0 sample_depends_on:1 sample_is_depended_on:2 sample_has_redundancy:0
    sdtp[56] is_leading:0 sample_depends_on:1 sample_is_depended_on:2 sample_has_redundancy:0
    sdtp[57] is_leading:0 sample_depends_on:2 sample_is_depended_on:0 sample_has_redundancy:0
    sdtp[58] is_leading:0 sample_depends_on:1 sample_is_depended_on:0 sample_has_redundancy:0
    sdtp[59] is_leading:0 sample_depends_on:1 sample_is_depended_on:2 sample_has_redundancy:0
    sdtp[60] is_leading:0 sample_depends_on:1 sample_is_depended_on:2 sample_has_redundancy:0
    sdtp[61] is_leading:0 sample_depends_on:1 sample_is_depended_on:0 sample_has_redundancy:0
    sdtp[62] is_leading:0 sample_depends_on:1 sample_is_depended_on:0 sample_has_redundancy:0

The information that might have been useful here would have been
is_leading, but all the samples are set to 0 so this was unusable.

Instead, we need to rely on sgpd/sbgp tables. In my case the video track
contained 3 sgpd tables with the following grouping types: tscl, sync
and tsas. In the sync table we have the following 2 entries (only):

    sgpd.sync[1]: sync nal_unit_type:0x14
    sgpd.sync[2]: sync nal_unit_type:0x15

(The count starts at 1 because 0 carries the "undefined" semantic; we'll
see that later in the reference table.)

The NAL unit types presented here correspond to:

    libavcodec/hevc.h:    HEVC_NAL_IDR_N_LP       = 20,
    libavcodec/hevc.h:    HEVC_NAL_CRA_NUT        = 21,

In parallel, the sbgp sync table contains the following:

    ┌────┬───────┬─────┐
    │ id │ count │ gdi │
    ├────┼───────┼─────┤
    │  0 │   1   │  1  │
    │  1 │   56  │  0  │
    │  2 │   1   │  2  │
    │  3 │   59  │  0  │
    │  4 │   1   │  2  │
    │  5 │   59  │  0  │
    │  6 │   1   │  2  │
    │  7 │   59  │  0  │
    │  8 │   1   │  2  │
    │  9 │   59  │  0  │
    │ 10 │   1   │  2  │
    │ 11 │   11  │  0  │
    └────┴───────┴─────┘

The gdi column (group description index) directly refers to the index in
the sgpd.sync table. This means the first frame is an IDR, then we have
batches of undefined frames interlaced with CRA frames. No IDR ever
appears again (tried on a 30+ seconds sample).
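
As an illustration of how the run-length sbgp table maps back to sample
numbers, here is a rough sketch that walks the runs and records every sample
whose group description resolves to a CRA NAL unit type. It assumes 0-based
sample numbering and a 1-based gdi; `struct sbgp_run` and
`collect_cra_samples()` are hypothetical names, not the ones used in
libavformat.

```c
#include <stddef.h>
#include <stdint.h>

/* NAL unit type values as listed above from libavcodec/hevc.h. */
enum { NAL_IDR_N_LP = 20, NAL_CRA_NUT = 21 };

/* One sbgp run: `count` consecutive samples share the same `gdi`. */
struct sbgp_run {
    uint32_t count;
    uint32_t gdi;   /* 1-based index into the sgpd table, 0 = undefined */
};

/*
 * Walk the sbgp runs and store the (0-based) number of every sample whose
 * group description is a CRA. Returns the number of CRA samples found; at
 * most out_cap of them are written to `out`.
 */
static size_t collect_cra_samples(const struct sbgp_run *runs, size_t nb_runs,
                                  const uint8_t *sgpd_nal_types, size_t nb_sgpd,
                                  uint32_t *out, size_t out_cap)
{
    size_t found = 0;
    uint32_t sample = 0;

    for (size_t i = 0; i < nb_runs; i++) {
        const uint32_t gdi = runs[i].gdi;
        const int is_cra = gdi >= 1 && gdi <= nb_sgpd &&
                           sgpd_nal_types[gdi - 1] == NAL_CRA_NUT;

        if (is_cra) {
            for (uint32_t j = 0; j < runs[i].count; j++) {
                if (found < out_cap)
                    out[found] = sample + j;
                found++;
            }
        }
        sample += runs[i].count;
    }
    return found;
}
```

Applied to the table above, the first CRA lands on sample 57, which matches
the key frame shown in the coding-order table.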

With that information, we can build a heuristic using the presentation
order.
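
One plausible reading of such a heuristic, sketched under simplifying
assumptions (timestamps on a single timeline, key samples sorted by DTS): an
open key sample is rejected as a seek target when its PTS lies after the
requested PTS, and the search steps backward until an acceptable key sample is
found. This is not the code from the patch; `struct key_sample`,
`last_key_at_or_before()`, `pick_seek_key()` and the `step` parameter are
hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one key sample. */
struct key_sample {
    int     sample;   /* sample number */
    int64_t dts;
    int64_t pts;
    int     is_open;  /* 1 for a non-IDR key frame such as a CRA */
};

/* Index of the last key sample with DTS <= ts, or -1 if there is none. */
static int last_key_at_or_before(const struct key_sample *keys, int nb_keys,
                                 int64_t ts)
{
    int found = -1;

    for (int i = 0; i < nb_keys && keys[i].dts <= ts; i++)
        found = i;
    return found;
}

/*
 * Pick the key sample to seek to for requested_pts. An open key sample that
 * is presented after requested_pts is skipped, because the frames displayed
 * before it may need data from the previous GOP. `step` is the amount the
 * search timestamp moves backward on each retry.
 */
static int pick_seek_key(const struct key_sample *keys, int nb_keys,
                         int64_t requested_pts, int64_t step)
{
    int64_t ts = requested_pts;

    if (step < 1)
        step = 1;
    for (;;) {
        const int k = last_key_at_or_before(keys, nb_keys, ts);

        if (k <= 0)
            return k; /* first key sample (0), or nothing found (-1) */
        if (!keys[k].is_open || keys[k].pts <= requested_pts)
            return k;
        ts -= step;
    }
}
```

With sample 57 from the tables (DTS 550, PTS 600, open), a request for PTS 580
skips it and falls back to the previous key sample, while a request for PTS
600 accepts it.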

A few things needed to be introduced in this commit:

1. min_sample_duration is extracted from the stts: we need the minimal
   step between samples in order to PTS-step backward to a valid point
2. In order to avoid a loop over the ctts table systematically during a
   seek, we build an expanded list of sample offsets which will be used
   to translate from DTS to PTS
3. An open_key_samples index to keep track of all the non-IDR key
   frames; for now it only supports HEVC CRA frames. We should probably
   add BLA frames as well, but I don't have any sample so I preferred to
   leave that for later
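
Items 1 and 2 can be pictured with the following sketch, which assumes the
stts and ctts tables are already parsed into simple run-length arrays. The
struct and function names are made up for the example and are not the ones
used in libavformat/mov.c.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical parsed table entries: `count` consecutive samples per run. */
struct stts_entry { uint32_t count; uint32_t duration; };
struct ctts_entry { uint32_t count; int32_t  offset;   };

/* Smallest sample duration found in the stts table, or 0 if it is empty. */
static uint32_t min_sample_duration(const struct stts_entry *stts, size_t nb)
{
    uint32_t min = UINT32_MAX;

    for (size_t i = 0; i < nb; i++)
        if (stts[i].count && stts[i].duration < min)
            min = stts[i].duration;
    return min == UINT32_MAX ? 0 : min;
}

/*
 * Expand the run-length ctts table into one offset per sample, so that
 * PTS = DTS + offsets[sample] becomes a direct lookup during a seek.
 * Returns the number of offsets written (at most cap).
 */
static size_t expand_ctts(const struct ctts_entry *ctts, size_t nb,
                          int32_t *offsets, size_t cap)
{
    size_t n = 0;

    for (size_t i = 0; i < nb; i++)
        for (uint32_t j = 0; j < ctts[i].count && n < cap; j++)
            offsets[n++] = ctts[i].offset;
    return n;
}
```

For sample 57 in the tables above, the expanded offset is 50, which turns its
DTS of 550 into the PTS of 600.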

It is entirely possible I missed something obvious in my approach, but I
couldn't come up with a better solution. Also, as mentioned in the diff,
we could optimize is_open_key_sample(), but the linear scaling overhead
should be fine for now since it only happens in seek events.
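
To make the trade-off concrete, here is a sketch of both lookups over a sorted
array of open key sample numbers. Neither function is taken from the diff;
the names and signatures are assumptions chosen for the example.

```c
#include <stddef.h>

/* Linear scan with early exit; `samples` must be sorted in increasing order. */
static int is_open_key_sample_linear(const int *samples, size_t nb, int sample)
{
    for (size_t i = 0; i < nb; i++) {
        if (samples[i] == sample)
            return 1;
        if (samples[i] > sample)
            break;
    }
    return 0;
}

/* Same lookup as a bisection: O(log n) instead of O(n). */
static int is_open_key_sample_bisect(const int *samples, size_t nb, int sample)
{
    size_t lo = 0, hi = nb;

    while (lo < hi) {
        const size_t mid = lo + (hi - lo) / 2;

        if (samples[mid] == sample)
            return 1;
        if (samples[mid] < sample)
            lo = mid + 1;
        else
            hi = mid;
    }
    return 0;
}
```

With one open key sample roughly every 60 frames, as in the sbgp table above,
the list stays short for typical recordings, which is consistent with the
linear version being acceptable on seek.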

Fixing this issue prevents sending broken packets to the decoder. With
the FFmpeg hevc decoder the frames are skipped; with VideoToolbox the
frames are glitching.

2022-03-04 15:50:51 +01:00

compat

Replace all occurences of av_mallocz_array() by av_calloc()

2021-09-20 01:03:52 +02:00

doc

avfilter/f_graphmonitor: add several more flags

2022-03-04 13:54:11 +01:00

ffbuild

Makefile: Redo duplicating object files in shared builds

2022-01-04 05:01:04 +01:00

fftools

fftools/ffmpeg: Don't presume frame_queue to have been allocated

2022-03-03 03:48:04 +01:00

libavcodec

avcodec/tiff: do not abort on zero denominator

2022-03-03 21:22:48 +01:00

libavdevice

libavcodec, libavdevice: Remove unnecessary includes of version.h

2022-02-24 22:36:15 +02:00

libavfilter

avfilter/vf_zscale: fix leaks in fast/bypass path

2022-03-04 14:07:20 +01:00

libavformat

avformat/mov: fix seeking with HEVC open GOP files

2022-03-04 15:50:51 +01:00

libavutil

avutil: [loongarch] Update loongson_intrinsics.h to v1.1.0

2022-03-01 23:53:40 +01:00

libpostproc

lib*/version.h: Bump Versions after release/5.0 branch

2022-01-04 14:29:06 +01:00

libswresample

lib*/version.h: Bump Versions after release/5.0 branch

2022-01-04 14:29:06 +01:00

libswscale

swscale: Take the destination range into account for yuv->rgb->yuv conversions

2022-02-25 11:01:17 +02:00

presets

…

tests

configure: stop allowing disabling lzo

2022-02-26 14:22:07 -03:00

tools

tools/target_bsf_fuzzer: simplify the loop feeding packets to the filter

2022-02-28 12:06:55 -03:00

.gitattributes

…

.gitignore

Remove mentions of a nonexistent avversion.h

2022-02-25 11:01:17 +02:00

.mailmap

mailmap: add entry for myself

2021-03-09 02:09:55 +00:00

.travis.yml

Merge commit '899ee03088d55152a48830df0899887f055da1de'

2019-03-14 15:53:16 -03:00

Changelog

lavc/mpeg*: drop the XvMC hwaccel code

2022-02-15 10:16:15 +01:00

configure

configure: Fix detecting/using getauxval

2022-03-04 14:29:42 +02:00

CONTRIBUTING.md

…

COPYING.GPLv2

…

COPYING.GPLv3

…

COPYING.LGPLv2.1

…

COPYING.LGPLv3

…

CREDITS

…

INSTALL.md

INSTALL.md: Fix Markdown formatting

2019-01-31 10:29:16 -09:00

LICENSE.md

avfilter/vf_geq: Relicense to LGPL

2019-12-28 11:20:48 +01:00

MAINTAINERS

lavc/mpeg*: drop the XvMC hwaccel code

2022-02-15 10:16:15 +01:00

Makefile

Remove mentions of a nonexistent avversion.h

2022-02-25 11:01:17 +02:00

README.md

README: fix typo and description of libavfilter

2021-10-08 09:44:34 +05:30

RELEASE

Bump Versions before release/4.4 branch

2021-03-20 01:01:12 +01:00

README.md

FFmpeg README

FFmpeg is a collection of libraries and tools to process multimedia content such as audio, video, subtitles and related metadata.

Libraries

libavcodec provides implementations of a wide range of codecs.
libavformat implements streaming protocols, container formats and basic I/O access.
libavutil includes hashers, decompressors and miscellaneous utility functions.
libavfilter provides means to alter decoded audio and video through a directed graph of connected filters.
libavdevice provides an abstraction to access capture and playback devices.
libswresample implements audio mixing and resampling routines.
libswscale implements color conversion and scaling routines.

Tools

ffmpeg is a command line toolbox to manipulate, convert and stream multimedia content.
ffplay is a minimalistic multimedia player.
ffprobe is a simple analysis tool to inspect multimedia content.
Additional small tools such as aviocat, ismindex and qt-faststart.

Documentation

The offline documentation is available in the doc/ directory.

The online documentation is available on the main website and in the wiki.

Examples

Coding examples are available in the doc/examples directory.

License

The FFmpeg codebase is mainly LGPL-licensed with optional components licensed under GPL. Please refer to the LICENSE file for detailed information.

Contributing

Patches should be submitted to the ffmpeg-devel mailing list using git format-patch or git send-email. GitHub pull requests should be avoided because they are not part of our review process and will be ignored.

Languages

C 90.1%

Assembly 7.9%

Makefile 1.3%

C++ 0.2%

Objective-C 0.2%

Other 0.1%