Istvan Sebok provided a sample where field pair -> two fields content
was being misdetected by the existing logic. I added an additional
test to check the input picture type as identified by our h.264
parser.
Signed-off-by: Philip Langdale <philipl@overt.org>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
With the flag in place, it's hard to actually use the decoder, and
I'm happy with how it works, with the exception of DivX3 where I've
never found a sample that worked that I was confident actually
matched what the hardware claimed to support.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
This was a regression that came in when I switched to using the
h.264 annex b filter all the time. As the filter modifies extradata,
its use violates the statelessness assumption that exists in the
'ffmpeg' command line tool, and maybe elsewhere. The tool assumes that
a decoder can be reinitialised and pointed at an existing stream and
get the same results.
For now, the only way to meet this requirement is to back up the
extradata.
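A minimal sketch of such a backup, assuming plain malloc/memcpy rather
than the libavutil allocators. ExtradataBackup and the three helpers
are hypothetical names for illustration, not the decoder's actual code:

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical holder for a private copy of the codec extradata. */
typedef struct ExtradataBackup {
    uint8_t *data;
    size_t   size;
} ExtradataBackup;

/* Save a copy before the filter is allowed to modify the extradata.
 * Returns 0 on success, -1 if the allocation fails. */
static int extradata_save(ExtradataBackup *b, const uint8_t *extradata,
                          size_t size)
{
    assert(b != NULL);
    b->data = NULL;
    b->size = 0;
    if (!extradata || size == 0)
        return 0;
    b->data = malloc(size);
    if (!b->data)
        return -1;
    memcpy(b->data, extradata, size);
    b->size = size;
    return 0;
}

/* Hand back a fresh copy of the saved bytes for the caller to install
 * as the stream's extradata again. The caller owns the result. */
static uint8_t *extradata_dup(const ExtradataBackup *b, size_t *size)
{
    assert(b != NULL && size != NULL);
    *size = 0;
    if (!b->data)
        return NULL;
    uint8_t *copy = malloc(b->size);
    if (!copy)
        return NULL;
    memcpy(copy, b->data, b->size);
    *size = b->size;
    return copy;
}

/* Drop the saved copy once the decoder is closed. */
static void extradata_release(ExtradataBackup *b)
{
    assert(b != NULL);
    free(b->data);
    b->data = NULL;
    b->size = 0;
}
```

Handing out a fresh copy on restore, rather than the saved buffer
itself, keeps the backup valid if the decoder is reinitialised more
than once.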
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
Now that we're converting all streams to Annex B format, we
can identify them as such to the hardware.
Signed-off-by: Philip Langdale <philipl@overt.org>
As we're now always running mp4 format streams through the annex b
filter, it makes sense to pass the filtered stream down, as
libcrystalhd would be doing the conversion internally anyway.
Signed-off-by: Philip Langdale <philipl@overt.org>
Originally, we needed to restore the original extradata after
initialising the mp4toannexb filter because mplayer would end up
taking two passes through the init sequence for the same stream
and end up miscategorising the stream. This doesn't seem to happen
anymore, making the backup/restore process unnecessary.
Signed-off-by: Philip Langdale <philipl@overt.org>
The H.264 parser that we use to detect interlacing can only handle
an Annex B stream, so we need to actually use the filter. This is
unfortunate as the crystalhd library is already doing this conversion
internally. A future change will reorganise the decode path more
completely so that we can feed the converted stream into libcrystalhd
and avoid the second conversion.
Signed-off-by: Philip Langdale <philipl@overt.org>
In preparation for using the filter on the actual bitstream, we need
to extend its lifetime to match that of the decoder.
Signed-off-by: Philip Langdale <philipl@overt.org>
I still don't fully understand the cause but the difference between
the samples that trigger the bug and the samples that don't is
that the former uses delay frames and the latter uses drop frames
as placeholders for the packed frame. So, depending on which type
of frame we see, we can tell whether the bug will be present.
Right now, I'm detecting the frame types by size, which may not be
safe in general, but given the specific codec and file type, I
expect that any scenario where these frames appear without being
used for b-frame packing won't care whether the workaround is in
effect or not.
Signed-off-by: Philip Langdale <philipl@overt.org>
The CrystalHD hardware can do scaling, which is particularly
desirable when dealing with some high resolution clips that take
so long to decode and copy out that they end up playing back
slower than realtime. By using scaling, we can make the output
frames smaller and reduce the copy out time.
This option takes the desired output width in pixels, and the
hardware scales the height to preserve the aspect ratio. Upscaling
is not supported and the hardware will simply ignore any request to
do so.
Signed-off-by: Philip Langdale <philipl@overt.org>
I was using the wrong value to track the position of the parser in the
stream. For an error-free stream, the size of the frame and number of
bytes consumed will be the same, but in an error situation they can
diverge.
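To illustrate the distinction, here is a toy sketch; it is not the
h.264 parser. toy_parse() and its one-byte framing are invented for
the example. The point is only that the caller must advance by the
bytes consumed, not by the size of the frame found:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy format (an assumption for this sketch): a frame is a 0xFF marker,
 * a length byte N, then N payload bytes. Anything before the marker is
 * junk that gets skipped. Returns the bytes consumed and stores the
 * size of the frame found in *frame_size (0 if none was found). */
static size_t toy_parse(const uint8_t *buf, size_t size, size_t *frame_size)
{
    assert(frame_size != NULL);
    size_t skip = 0;
    *frame_size = 0;
    while (skip < size && buf[skip] != 0xFF)
        skip++;
    if (size - skip < 2)
        return size;                 /* no complete header left */
    size_t len = 2 + (size_t)buf[skip + 1];
    if (len > size - skip)
        return size;                 /* truncated frame */
    *frame_size = len;
    return skip + len;               /* junk + frame */
}

/* Walk the stream, tracking the position by bytes consumed. */
static int count_frames(const uint8_t *buf, size_t size)
{
    size_t pos = 0;
    int frames = 0;
    while (pos < size) {
        size_t frame_size;
        size_t consumed = toy_parse(buf + pos, size - pos, &frame_size);
        if (frame_size)
            frames++;
        pos += consumed;             /* not frame_size */
    }
    return frames;
}
```

With junk between two frames, the second call consumes more bytes than
the frame it returns; advancing by the frame size would leave the
position short of the next frame.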
Signed-off-by: Philip Langdale <philipl@overt.org>
As previously discussed, the CrystalHD hardware returns exceptionally
useless information about interlaced h.264 content - to the extent
that it's not possible to distinguish most MBAFF and PAFF content until
it's too late.
In an attempt to compensate for this, I'm introducing two mechanisms:
1) Peeking at the picture number of the next picture
The hardware provides a capability to peek the next picture number. If
it is the same as the current picture number, then we are clearly dealing
with two fields and not a frame or fieldpair.
If this always worked, it would be all we need, but it's not guaranteed
to work. Sometimes, the next picture may not be decoded sufficiently
for the number to be known; alternatively, a corruption in the stream may
cause the hardware to refuse to return the number even if the next
intact frame is decoded. In either case, the query will return 0.
If we are unable to peek the next picture number, we assume that the
picture is a frame/fieldpair and return it accordingly. If that turns
out to be incorrect, we discard the second field, and the user has
to live with the glitch. In testing, false detection can occur for
the first couple of seconds, and then the pipeline stabilises and
we get correct detection.
2) Use the h264_parser to detect when individual input fields have
been combined into an output fieldpair.
I have multiple PAFF samples where this behaviour is detected. The
peeking mechanism described above will correctly detect that the
output is a fieldpair, but we need to know what the input type was
to ensure pipeline stability (only return one output frame per input
frame).
If we find ourselves with an output fieldpair, yet the input picture
type was a field, as reported by the parser, then we are dealing with
this case, and can make sure not to return anything on the next
decode() call.
Taken together, these allow us to remove the hard-coded hacks for
different h.264 types, and we can clearly describe the conditions
under which we can trust the hardware's claim that content is
interlaced.
Signed-off-by: Philip Langdale <philipl@overt.org>
Now that we know the type of the input picture, we have to bring
that information to the output picture to help identify its type.
We do this by adding a field to the opaque_list node.
Signed-off-by: Philip Langdale <philipl@overt.org>
As the hardware is unreliable, we will have to use the h.264 parser
to identify whether an input picture is a field or a frame. This
change loads the parser and extracts the picture type.
Signed-off-by: Philip Langdale <philipl@overt.org>
In preparation for adding additional fields to the node, return
the node instead of the pts value. This requires the caller to
free the node.
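A minimal sketch of the new ownership rule, using a simplified list.
OpaqueNode, its fields and the helper names are illustrative
assumptions and may not match the decoder's actual opaque_list code:

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical node: keyed by a lookup value, carrying the pts plus
 * room for extra per-picture data such as the picture type. */
typedef struct OpaqueNode {
    struct OpaqueNode *next;
    uint64_t key;
    int64_t  pts;
    uint8_t  pic_type;
} OpaqueNode;

/* Add a node at the head of the list. Returns 0, or -1 on failure. */
static int node_push(OpaqueNode **head, uint64_t key, int64_t pts,
                     uint8_t pic_type)
{
    assert(head != NULL);
    OpaqueNode *n = malloc(sizeof(*n));
    if (!n)
        return -1;
    n->key = key;
    n->pts = pts;
    n->pic_type = pic_type;
    n->next = *head;
    *head = n;
    return 0;
}

/* Unlink and return the node for key, or NULL if there is none.
 * The node is no longer owned by the list: the caller must free it. */
static OpaqueNode *node_pop(OpaqueNode **head, uint64_t key)
{
    assert(head != NULL);
    for (OpaqueNode **p = head; *p; p = &(*p)->next) {
        if ((*p)->key == key) {
            OpaqueNode *n = *p;
            *p = n->next;
            n->next = NULL;
            return n;
        }
    }
    return NULL;
}
```

Returning the whole node means new fields reach the caller without
changing the lookup function's signature again.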
Signed-off-by: Philip Langdale <philipl@overt.org>
I found another MBAFF sample where the input:output pattern is
the same as mpeg2 and vc1 (fieldpair input, individual field output).
While I'm not sure how you can output individual fields from MBAFF,
if I apply the mpeg2/vc1 handling to this file, it plays correctly.
So, this changes the detection algorithm to handle the known cases.
Whitespace will be fixed in a separate change.
Signed-off-by: Philip Langdale <philipl@overt.org>
As previously discussed, the CrystalHD hardware treats some PAFF
clips differently from others; even when input fields are always in
separate packets, the hardware might return a single fieldpair for
one clip and individual fields for another.
Given the bogus flags set by the hardware, it is impossible to
distinguish these two cases without knowing about the current
picture and the next one. The hardware can usually provide the
picture number of the next picture and when that is available,
we can detect the two cases.
When it is not available, we have to guess - and find out later
if we were right or wrong.
With this change, clips will play correctly unless they are PAFF
where individual fields are returned *and* no next picture number
is available. Generally speaking, the incorrect cases arise in
the first couple of seconds of a clip as the delay calibration takes
place. Once that's set, things work fine.
As previously discussed, the CrystalHD hardware returns exceptionally
useless information about interlaced h.264 content - to the extent
that it's not possible to distinguish MBAFF and PAFF content until
it's too late.
This change introduces use of the h264_parser to help bridge the
gap; it can indicate if the input data is PAFF fields or not.
With this clarity, some of the heuristics can be removed from the
code, making this less convoluted.
Finally, I found an MBAFF clip that acts like non-h.264 content so
I had to make allowances for that.
Note that I still cannot distinguish between two forms of PAFF,
where the hardware either returns individual fields or a field-pair.
It's not clear that there's even a spec-relevant difference between
the two forms, as opposed to hardware idiosyncrasies.