Currently, we have a AV_CODEC_ID_SSA, which matches the way the ASS/SSA
markup is muxed in a standalone .ass/.ssa file. This means the AVPacket
data starts with a "Dialogue:" string, followed by a timing information
(start and end of the event as string) and a trailing CRLF after each
line. One packet can contain several lines. We'll refer to this layout
as "SSA" or "SSA lines".
In matroska, this markup is not stored as such: it has no "Dialogue:"
prefix, it contains a ReadOrder field, the timing information is not in
the payload, and it doesn't contain the trailing CRLF. See [1] for more
info. We'll refer to this layout as "ASS".
Since we have only one common codec for both formats, the matroska
demuxer is constructing an AVPacket following the "SSA lines" format.
This causes several problems, so it was decided to change this into
clean ASS packets.
Some insight about what is changed or unchanged in this commit:
CODECS
------
- the decoding process still writes "SSA lines" markup inside the ass
fields of the subtitles rectangles (sub->rects[n]->ass), which is
still the current common way of representing decoded subtitles
markup. It is meant to change later.
- new ASS codec id: AV_CODEC_ID_ASS (which is different from the
legacy AV_CODEC_ID_SSA)
- lavc/assdec: the "ass" decoder is renamed into "ssa" (instead of
"ass") for consistency with the codec id and allows to add a real
ass decoder. This ass decoder receives clean ASS lines (so it starts
with a ReadOrder, is followed by the Layer, etc). We make sure this
is decoded properly in a new ass-line rectangle of the decoded
subtitles (the ssa decoder OTOH is doing a simple straightforward
copy). Using the packet timing instead of data string makes sure the
ass-line now contains the appropriate timing.
- lavc/assenc: just like the ass decoder, the "ssa" encoder is renamed
into "ssa" (instead of "ass") for consistency with the codec id, and
allows to add a real "ass" encoder.
One important thing about this encoder is that it only supports one
ass rectangle: we could have put several dialogue events in the
AVPacket (separated by a \0 for instance) but this would have cause
trouble for the muxer which needs not only the start time, but also
the duration: typically, you have merged events with the same start
time (stored in the AVPacket->pts) but a different duration. At the
moment, only the matroska do the merge with the SSA-line codec.
We will need to make sure all the decoders in the future can't add
more than one rectangle (and only one Dialogue line in it
obviously).
FORMATS
-------
- lavf/assenc: the .ass/.ssa muxer can take both SSA and ASS packets.
In the case of ASS packets as input, it adds the timing based on the
AVPacket pts and duration, and mux it with "Dialogue:", trailing
CRLF, etc.
- lavf/assdec: unchanged; it currently still only outputs SSA-lines
packets.
- lavf/mkv: the demuxer can now output ASS packets without the need of
any "SSA-lines" reconstruction hack. It will become the default at
next libavformat bump, and the SSA support will be dropped from the
demuxer. The muxer can take ASS packets since it's muxed normally,
and still supports the old SSA packets. All the SSA support and
hacks in Matroska code will be dropped at next lavf bump.
[1]: http://www.matroska.org/technical/specs/subtitles/ssa.html
* commit 'bcc94328980e6c56546792ab08b0756abdce310b':
opt: check the return values of av_get_token for ENOMEM.
doc: Fix best_nb_channells typo
matroska: pass the lace size to the matroska_parse_rm_audio
Conflicts:
libavformat/matroskadec.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
* commit '8a96df7b70be509dae9ceec82d2c10a20361356d':
matroska: fix a corner case in ebml-lace parsing
Merged-by: Michael Niedermayer <michaelni@gmx.at>
Each lace must be independent according to the specification.
Fix heap-buffer-overflow in matroska_parse_block for
corrupted real media in mkv files.
Stricter check than fc43c19a56
CC: libav-stable@libav.org
Fix heap-buffer-overflow in matroska_parse_block for
corrupted real media in mkv files.
CC: libav-stable@libav.org
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
This patch adds the enums for the ContentEncryption elements.
This patch also adds support for parsing the ContentEncKeyID. The
ContentEncKeyID is then base64 encoded and stored in the stream's
metadata.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
Adding support for parsing AlphaMode element in the Track header
and export that information as a metadata tag. This flag indicates
presence of alpha channel data in BlockAdditional element.
Signed-off-by: Vignesh Venkatasubramanian <vigneshv@google.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
Matroska specification lists support for BlockAdditional element
which is not supported by ffmpeg's matroska parser. This patch
adds grammar definitions for parsing that element (and few other
related elements) and then puts the data in AVPacket.side_data
with new AVPacketSideDataType AV_PKT_DATA_MATROSKA_BLOCKADDITIONAL.
Signed-off-by: Vignesh Venkatasubramanian <vigneshv@google.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* commit '7b8c5b263bc680eff5710bee5994de39d47fc15e':
vc1dec: prevent a crash due missing pred_flag parameter
matroska: Fix use after free
Merged-by: Michael Niedermayer <michaelni@gmx.at>
* qatar/master:
doc: add apidoc target for doxygen API documentation
matroskadec: do not use avpacket internals
Conflicts:
doc/Makefile
libavformat/matroskadec.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
* commit '4c995fafd861f537360b3717901cdbed6a6844e7':
configure: simplify get_version() function
build: support asan and tsan toolchain shortcuts
rmdec: Move SIPR code shared with Matroska demuxer to a separate file
Merged-by: Michael Niedermayer <michaelni@gmx.at>