style_active doesn't do anything any more: It is already assured
that style_active is one when one reaches the end of a style.
Reviewed-by: Philip Langdale <philipl@overt.org>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
The checks for whether a style should be opened/closed at the current
character position are as follows: A variable entry contained the index
of the currently active or potentially next active style. If the current
character position coincided with the start of style[entry], the style
was activated; this was followed by a check whether the current
character position coincided with the end of style[entry]; if so, the
style was deactivated and entry incremented. Afterwards the char was
processed.
The order of the checks leads to problems in case the endChar of style A
coincides with the startChar of the next style (say B): Style B was never
opened. When we are at said common position, the currently active style
is A and so the start pos check does not succeed; but the end pos check
does and it closes the currently active style A and increments entry.
At the next iteration of the loop, the current character position is
bigger than the start position of style B (which is style[entry]) and
therefore the style is not activated.
The solution is of course to first check for whether a style needs to be
closed (and increment entry if it does) before checking whether the next
style needs to be opened.
Reviewed-by: Philip Langdale <philipl@overt.org>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
They would either lead to unnecessary ASS tags being emitted (namely
tags that are reset immediately thereafter) or would lead to problems
when parsing: e.g. if a zero-length style immediately follows another
style, the current code will end the preceding style and set the
zero-length style as the next potentially active style, but it is only
tested for activation when the next character is parsed at which point
the current offset is already greater than both the starting as well
as the end offset of the empty style. It will therefore neither be
opened nor closed and all subsequent styles will be ignored.
Reviewed-by: Philip Langdale <philipl@overt.org>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
Up until now, the 3GPP Timed Text decoder used av_dynarray_add()
for a list of style entries. Said entries are individually allocated
and owned by the pointers in the dynamic array and are therefore
unsuitable for av_dynarray_add() which simply frees the array,
but not the entries on error. In this case the intended new entry
also leaks because it has been forgotten to free it.
This commit fixes this. It is now allocated in one go and not
reallocated multiple times (and it won't be overallocated any more).
After all, the final number of elements (pending errors) is already
known in advance.
Furthermore, the style entries are now the entries of the new array,
i.e. they are no longer allocated separately. This also removes one
level of indirection.
Reviewed-by: Philip Langdale <philipl@overt.org>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
There is no need to walk through the list of fonts twice.
Reviewed-by: Philip Langdale <philipl@overt.org>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
Every font entry occupies at least three bytes, so checking early
whether there is that much data available is a low-effort way to exclude
invalid extradata. Doing so leads to an overall simplification.
Reviewed-by: Philip Langdale <philipl@overt.org>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
Up until now, the 3GPP Timed Text decoder used av_dynarray_add()
for a list of font entries, a structure which contains an allocated
string. The font entries are owned by the pointers in the dynamic array
and are therefore unsuitable for av_dynarray_add() which simply frees
the array, but not the font entries and of course not the strings. The
latter all leak if reallocating the dynamic array fails.
This commit fixes this. It stops reallocating the array altogether:
After all, the final number of elements (pending errors) is already
known in advance.
Furthermore, the font entries are now the entries of the new array,
i.e. the font entries are no longer allocated separately. This also
removes one level of indirection.
Reviewed-by: Philip Langdale <philipl@overt.org>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
If allocating fonts fails when reading the header, all fonts are freed,
yet the counter of fonts is not reset and no error is returned; when
subtitles are decoded lateron, the inexistent list of fonts is searched
for the matching font for this particular entry which of course leads to
a segfault.
Reviewed-by: Philip Langdale <philipl@overt.org>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
Fixes: left shift of 243 by 24 places cannot be represented in type 'int'
Fixes: 22716/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_MOVTEXT_fuzzer-5704263425851392
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Font sizes are relative to the subtitle frame dimensions. If the
expected frame dimensions are not known, the font sizes will most
likely be incorrect.
Signed-off-by: Philip Langdale <philipl@overt.org>
Style flags were only being turned on. If the default was on and style
record turned off, style flag remained on.
Signed-off-by: Philip Langdale <philipl@overt.org>
It's not necessary to walk the style record list twice per subtitle
character. style records are in order and do not overlap.
Signed-off-by: Philip Langdale <philipl@overt.org>
Limits based on 3GPP TS 26.245 V14.0.0
Fixes: Timeout
Fixes: 6377/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_MOVTEXT_fuzzer-5175929115508736
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Reviewed-by: Philip Langdale <philipl@overt.org>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Subtitles which contained styled UTF-8 subtitles (i.e. not just 7 bit
ASCII characters) were not handled correctly. The spec mandates that
styling start/end ranges are in "characters". It's not quite clear what
a "character" is supposed to be, but maybe they mean unicode codepoints.
FFmpeg's decoder treated the style ranges as byte idexes, which could
lead to UTF-8 sequences being broken, and the common code dropping the
whole subtitle line.
Change this and count the codepoint instead. This also means that even
if this is somehow wrong, the decoder won't break UTF-8 sequences
anymore. The sample which led me to investigate this now appears to work
correctly.
The value of wrap_flag in the Text Wrap Box specifies if the text is to
be wrapped or not. Uses 'end of line wrap' amongst the wrap styles
supported by ASS if the text is to be wrapped, i.e; fill as much text
in a line as possible, then break to next line.
The 3GPP spec has no provision for smart wrapping.
Signed-off-by: Niklesh <niklesh.lalwani@iitb.ac.in>
As suggested, posting the combined patch with the fate changes.
The patch sets the default style in ASS from the default style
information present in the movtext header.
Signed-off-by: Niklesh <niklesh.lalwani@iitb.ac.in>
The commit broke sub-movtext and sub-subripenc
fate output differs between mips ad x86 so updating fate ref is not
possible
This reverts commit d670848d4c.
Add support for fontsize in style records. The patch uses reset to
directly change back to default style instead of using closing tags,
since we are not handling the default styles right now.
Signed-off-by: Niklesh <niklesh.lalwani@iitb.ac.in>