Also fix a couple of possible overflows while at it.
Fixes the negative initial timestamps in ticket #10358.
Signed-off-by: Marton Balint <cus@passwd.hu>
Not sure if the function name frame_queue_destory is intended, because
"destory" is not really a word. Changing it to "destroy" makes more sense.
Signed-off-by: QiTong Li <liqitong@163.com>
Signed-off-by: Marton Balint <cus@passwd.hu>
This is only a preparatory step to a fully threaded architecture and
does not yet make decoding truly parallel - the main thread will
currently submit a packet and wait until it has been fully processed by
the decoding thread before moving on. Decoder behavior as observed by
the rest of the program should remain unchanged. That will change in
future commits after encoders and filters are moved to threads and a
thread-aware scheduler is added.
It is only used for flushing the subtitle decoder, so allocate a
dedicated packet for that.
Keep Decoder.pkt unused for now; it will be repurposed in future
commits.
Make the function process just one input stream at a time and save an
indentation level. Also rename it to ist_add() to be consistent with an
analogous function in ffmpeg_mux_init.
Currently of_output_packet() reuses the input packet, which requires its
callers to submit blank packets even on EOF, which makes the code more
complex.
It is set by the muxing code, which will not be synchronized with
encoding code after upcoming threading changes. Use an encoder-private
variable instead.
Packets submitted to the muxer now have their timebase attached to them,
so the muxer can do conversion to muxing timebase and avoid exposing it
to callers.
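As a rough illustration of the idea (not the actual fftools code): once
each packet carries its own timebase, the conversion can live entirely
in the muxer. The Rational, Pkt, rescale() and mux_fix_timebase() names
below are hypothetical. rescale() is a simplified stand-in for
av_rescale_q() that assumes non-negative timestamps, positive timebase
terms and no overflow.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical minimal types, for illustration only. */
typedef struct Rational { int num, den; } Rational;
typedef struct Pkt { int64_t pts; Rational tb; } Pkt;

/* Convert ts from timebase `from` to timebase `to`, rounding to nearest.
 * Simplified: assumes ts >= 0, positive terms and no 64-bit overflow. */
static int64_t rescale(int64_t ts, Rational from, Rational to)
{
    int64_t num, den;
    assert(ts >= 0 && from.num > 0 && from.den > 0 && to.num > 0 && to.den > 0);
    num = (int64_t)from.num * to.den;
    den = (int64_t)from.den * to.num;
    return (ts * num + den / 2) / den;
}

/* The muxer converts using the timebase attached to the packet, so the
 * caller never needs to know the muxing timebase. */
static void mux_fix_timebase(Pkt *pkt, Rational mux_tb)
{
    pkt->pts = rescale(pkt->pts, pkt->tb, mux_tb);
    pkt->tb  = mux_tb;
}
```

A caller would only fill in pkt->pts and pkt->tb before submitting the
packet, for example one second of audio at 48 kHz as pts=48000 with a
timebase of 1/48000.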
There is no reason to postpone it until opening the encoder. Also, abort
when the input stream is unknown, rather than disregard an explicit
request from the user.
The code will currently add a small offset to avoid exact midpoints, but
this can cause inexact results when a float timestamp is exactly
representable as an integer.
Fixes off-by-one in the first frame duration in multiple FATE tests.
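A minimal sketch of the kind of guard described here, assuming the
offset is only meant for fractional values. The nudge_ts() name and the
1/(1<<17) offset are illustrative assumptions, not a copy of the actual
change.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: push a fractional timestamp slightly away from a
 * possible .5 midpoint, but leave integer-valued input untouched so that
 * exact values stay exact.
 * Assumes ts fits comfortably into the int64_t range. */
static double nudge_ts(double ts)
{
    assert(ts > -9e18 && ts < 9e18);

    /* Already exactly representable as an integer: do not perturb it. */
    if ((double)(int64_t)ts == ts)
        return ts;

    /* Otherwise add a small offset in the direction of the sign. */
    return ts + (ts < 0 ? -1.0 : 1.0) / (1 << 17);
}
```

With this guard a timestamp of exactly 3.0 is returned unchanged, while
2.5 is moved slightly above the midpoint before any later rounding.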
Current code marks the output stream as finished and waits for a flush
packet, but that is both unnecessary and suspect, as in theory nothing
should be sent to a finished stream - not even flush packets.
Make all relevant state per-filtergraph input, rather than per-input
stream. Refactor the code to make it work and avoid leaking memory when
a single subtitle stream is sent to multiple filters.
Set them in ifilter_parameters_from_dec(), similarly to audio/video
streams. This reduces the extent to which sub2video filters need to be
treated specially.
This function should not take an InputStream, as it only uses it to get
the InputFile and the timebase. Pass those directly instead and avoid
confusion over dealing with multiple InputStreams.
This queue should be associated with a specific filtergraph input - if
a subtitle stream is sent to multiple filters then each should have its
own queue.
This code is a sub2video analogue of ifilter_send_frame(), so it
properly belongs to the filtering code.
Note that using sub2video with more than one target for a given input
subtitle stream is currently broken and this commit does not change
that. It will be addressed in following commits.
When the filtergraph has no inputs, it can be configured immediately
when all its outputs are bound to output streams. This will simplify
treating some corner cases.
This way the list of filtergraph inputs/outputs is always known after
FilterGraph creation. This will allow treating simple and complex
filtergraphs in a more uniform manner.
Currently NULL would be passed for simple filtergraphs, which would
make the filter code extract the graph description from the output
stream when needed. This is unnecessarily convoluted.
A non-limiting stream could mistakenly end up being the queue head,
which would then produce incorrect synchronization, seen e.g. in
fate-matroska-flac-extradata-update for certain numbers of frame threads
(e.g. 5).
Found-By: James Almer
Checking whether the user requested an unsupported conversion between
text and bitmap subtitles can be done immediately when creating the
output stream.
This function is entangled with encoder setup, so it belongs with the
encoding code rather than in ffmpeg_hw. This will allow making more
encoder state private in the future.
This function is entangled with decoder setup, so it belongs with the
decoding code rather than in ffmpeg_hw. This will allow making more
decoder state private in the future.
As the comment in the code mentions, EAGAIN is not an expected value here
because we call avcodec_receive_frame() until all frames have been returned.
avcodec_send_packet() returning EAGAIN means a packet is still buffered, which
hints that the underlying decoder is buggy and not fetching packets as it
should.
An example of this behavior was in the libdav1d wrapper before f209614290,
where feeding it split frames (or individual OBUs) would result in the CLI
eventually printing the confusing "Error submitting packet to decoder: Resource
temporarily unavailable" error message, and just keep going until EOF without
returning new frames.
Signed-off-by: James Almer <jamrial@gmail.com>
It currently emulates the long-removed
avcodec_decode_audio4/avcodec_decode_video2 APIs, which obfuscates the
actual decoding flow. Restructure the decoding calls so that they
naturally follow the new avcodec_send_packet()/avcodec_receive_frame()
design.
This is not only significantly easier to read, but also shorter.
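The send/receive flow can be sketched with a toy decoder, assuming a
decoder that buffers at most one packet. ToyDecoder, toy_send_packet(),
toy_receive_frame() and decode_packet() are all hypothetical stand-ins
and do not use the libavcodec API. Error codes are modelled as negative
errno values.

```c
#include <assert.h>
#include <errno.h>

/* Toy decoder: buffers one "packet" (an int) and turns it into one
 * "frame" (the doubled value). Purely illustrative. */
typedef struct ToyDecoder { int pending; int has_pending; } ToyDecoder;

static int toy_send_packet(ToyDecoder *d, int pkt)
{
    if (d->has_pending)
        return -EAGAIN;          /* input is full, output must be read first */
    d->pending     = pkt;
    d->has_pending = 1;
    return 0;
}

static int toy_receive_frame(ToyDecoder *d, int *frame)
{
    if (!d->has_pending)
        return -EAGAIN;          /* nothing to output, more input is needed */
    *frame         = d->pending * 2;
    d->has_pending = 0;
    return 0;
}

/* Submit one packet, then read frames until the decoder asks for more
 * input. Returns the number of frames written to out, or a negative
 * error code. */
static int decode_packet(ToyDecoder *d, int pkt, int *out, int max_out)
{
    int n = 0, ret;
    assert(d && out && max_out > 0);

    /* Because output is always drained below, -EAGAIN is not expected
     * here and is passed on to the caller as an error. */
    ret = toy_send_packet(d, pkt);
    if (ret < 0)
        return ret;

    while (n < max_out) {
        ret = toy_receive_frame(d, &out[n]);
        if (ret == -EAGAIN)
            break;               /* fully drained, ready for the next packet */
        if (ret < 0)
            return ret;
        n++;
    }
    return n;
}
```

The point of the structure is that each call leaves the decoder drained,
so the next send should never be refused.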
It is currently handled in the same loop as audio and video, but this
obscures the actual flow, because only one iteration is ever performed
for subtitles.
Also, avoid a pointless packet reference.
process_input_packet() contains two non-interacting pieces of nontrivial
size and complexity - decoding and streamcopy. Separating them makes the
code easier to read.
New placement requires fewer explicit conditions and is easier to
understand.
The logic should be exactly equivalent, since this is the only place
where eof_reached is set for decoding.
Passing ist=NULL is currently used to identify stream types that do not
decode into AVFrames, i.e. subtitles. That is highly non-obvious -
always pass a non-NULL InputStream and just check the type explicitly.
It tracks whether the decoder for this stream ever produced any frames
and its only use is for checking whether a filter input ever received a
frame - those that did not are prioritized by the scheduler.
This is awkward and unnecessarily complicated - checking whether the
filtergraph input format is valid works just as well and does not
require maintaining an extra variable.
In case no decoder is available, dec_open() called from ist_use() will
fail with 'Decoding requested, but no decoder found', so this check is
redundant.
When a filtergraph input receives EOF but never saw any input frames, we
use the fallback parameters. Currently an attempt to actually configure
the filtergraph will happen elsewhere, but there is no reason to
postpone this.
With complex filtergraphs it can happen that the filtergraph is
unconfigured because some other filter than the one we just got EOF on
is missing parameters.
Make sure that the fallback parameters for a given input are only used
when that input is unconfigured.
It does no initialization anymore, except for setting
transcode_init_done - the bulk of the function is printing the
input/output maps. It also cannot fail anymore, so remove the useless
return value.
Export the corresponding flag in InputFile instead. This will allow
making the demuxer AVFormatContext private in future commits, similarly
to what was previously done for muxers.
There is no point in having a per-stream wallclock start time, since
they are all computed at the same instant. Keep a per-file start time
instead, initialized when the demuxer thread starts.
That is a more appropriate place for this code and will allow hiding
more of InputStream.
The value of repeat_pict extracted from libavformat internal parser no
longer needs to be transmitted outside of the demuxing thread.
Move readrate handling to the demuxer thread. This has to be done in the
same commit, since it reads InputStream.dts,nb_packets, which are now
set in the demuxer thread.
This way computing it and using it for streamcopy does not need to
happen in sync. Will be useful in following commits, where updating
InputStream.dts will be moved to the demuxing thread.
This code runs post-demuxing and is not synchronized with the decoder
output (which may be delayed with respect to its input by arbitrary and
unknowable amounts), so accessing any decoder properties is incorrect.
Move them to a separate function called right after timestamp
discontinuity processing. This is now possible, since these values have
no interaction with decoding anymore.
The two checks using eof_reached are testing whether more input can
possibly appear on this filtergraph input. InputFilterPriv.eof is the
more authoritative source for this information.
When an input stream terminates and no frames were successfully decoded,
filtering code will currently configure the filtergraph using demuxer
stream parameters. Use decoder parameters instead, which should be more
reliable. Also, initialize them immediately when an input stream is
bound to a filtergraph input, so that these parameters are always
available (if at all) and filtering code does not need to reach into the
decoder at some arbitrary later point.
When no frames are ever seen by an encoder, encoder flush will do a
last-ditch attempt to configure its source filtergraph in order to at
least get the stream parameters. This involves extracting demuxer
parameters from filtergraph source inputs, which is
* a bad layering violation
* probably unreachable, because decoders are flushed before encoders,
  which should call ifilter_send_eof(), which will also set these
  parameters; however due to complex control flow it is hard to be
  entirely sure this code can never be triggered
Even if this code can actually be reached, it is probably better to
return an error as the comment above it says.
These two functions are a part of a single logical action - determining
which, if any, output stream needs to be processed next. Keeping them
separate is a historical artifact that obscures what is actually being
done.
Currently those are set in different ways depending on whether the
stream is decoded or not, using some values from the decoder if it is.
This is wrong, because there may be an arbitrary amount of delay between
input packets and output frames (depending e.g. on the thread count when
frame threading is used).
Always use the path that was previously used only for streamcopy. This
should not cause any issues, because these values are now used only for
streamcopy and discontinuity handling.
This change will allow decoupling discontinuity processing from
decoding and moving it to ffmpeg_demux. It also makes the code simpler.
Changes output in fate-cover-art-aiff-id3v2-remux and
fate-cover-art-mp3-id3v2-remux, where attached pictures are now written
in the correct order. This happens because InputStream.dts is no longer
reset to AV_NOPTS_VALUE after decoding, so streamcopy actually sees
valid dts values.
This was added in 380db56928 as a
temporary crutch that is not needed anymore. The only case where this
code can be triggered is the very first frame, for which InputStream.pts
is always equal to 0.
Stop using InputStream.dts for generating missing timestamps for decoded
frames, because it contains pre-decoding timestamps and there may be
an arbitrary amount of delay between input packets and output frames (e.g.
dependent on the thread count when frame threading is used). It is also
in AV_TIME_BASE (i.e. microseconds), which may introduce unnecessary
rounding issues.
New code maintains a timebase that is the inverse of the LCM of all the
samplerates seen so far, and thus can accurately represent every audio
sample. This timebase is used to generate missing timestamps after
decoding.
Changes the result of the following FATE tests
* pcm_dvd-16-5.1-96000
* lavf-smjpeg
* adpcm-ima-smjpeg
In all of these the timestamps now better correspond to actual frame
durations.
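A simplified sketch of the LCM-based timebase described above, assuming
integer sample rates and no overflow. AudioTs, audio_ts_update_rate()
and audio_ts_next() are hypothetical names and this is not the actual
decoder code.

```c
#include <assert.h>
#include <stdint.h>

static int64_t gcd64(int64_t a, int64_t b)
{
    while (b) {
        int64_t t = a % b;
        a = b;
        b = t;
    }
    return a;
}

/* Hypothetical state: the timebase is 1/tb_den, where tb_den is the LCM
 * of all sample rates seen so far. next_pts is expressed in it. */
typedef struct AudioTs { int64_t tb_den; int64_t next_pts; } AudioTs;

/* Extend the timebase so that it can also represent sample_rate exactly,
 * rescaling the running timestamp (always an exact multiplication). */
static void audio_ts_update_rate(AudioTs *s, int sample_rate)
{
    int64_t lcm;
    assert(sample_rate > 0);

    if (!s->tb_den) {
        s->tb_den = sample_rate;
        return;
    }
    lcm          = s->tb_den / gcd64(s->tb_den, sample_rate) * sample_rate;
    s->next_pts *= lcm / s->tb_den;
    s->tb_den    = lcm;
}

/* Generate a timestamp for a frame that has none, then advance by the
 * frame length. Each sample lasts tb_den / sample_rate ticks. */
static int64_t audio_ts_next(AudioTs *s, int sample_rate, int nb_samples)
{
    int64_t pts;
    audio_ts_update_rate(s, sample_rate);
    pts          = s->next_pts;
    s->next_pts += nb_samples * (s->tb_den / sample_rate);
    return pts;
}
```

For example, after 441 samples at 44100 Hz followed by a switch to
48000 Hz, the timebase becomes 1/7056000 and the elapsed 10 ms map to
exactly 70560 ticks.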
If input packets have timestamps, they will be propagated to output
frames by the decoder, so at best this block does not do anything.
There can also be an arbitrary amount of delay between packets sent to
the decoder and decoded frames (e.g. due to decoder's intrinsic delay or
frame threading), so deriving any timestamps from packet properties is
wrong.
The previous name is misleading, because the function does not actually
initialize any filters - it creates a new output stream and binds a
filtergraph output to it.
Their only function is checking that encoding was not specified for
data/unknown-type streams, but the check is broken because enc_ctx will
not be created in ost_add() unless a valid encoder can be found.
Add an actually working check for all types for which encoding is not
supported in choose_encoder().
ts_discontinuity_detect() is applied right after demuxing, while
InputStream.pts is a post-decoding timestamp, which may be delayed with
respect to demuxing by an arbitrary amount (e.g. depending on the thread
count when frame threading is used).
The name is misleading, because it is not a picture in the sense of MPEG
terminology (which defines "picture" as "frame or field"), but always a
full frame. 'next' is also redundant and/or misleading, because it is
the _current_ frame to be encoded.
Previously they would only be used with trivial filtergraphs, because
filters did not handle frame durations. That is no longer true - most
filters process frame durations properly (there may still be some that
don't - this change will help find and fix them).
Improves output video frame durations in a number of FATE tests.
When an encoder exports sum-of-squared-differences information in
encoded packets, print_report() will print PSNR information in the
status line. However,
* the code computing PSNR assumes 8bit 420 video and prints incorrect
  values otherwise; there are no issues on trac about this
* only a few encoders (namely aom, vpx, mpegvideo, snow) export this
  information; other often-used encoders such as libx26[45] do not
  export this, even though they could
This suggests that this feature is not useful and it is better to remove
it rather than spend effort on fixing it.
Remove now-obsolete code setting packet durations pre-muxing for CFR
encoded video.
Changes output in the following FATE tests:
* numerous adpcm tests
* ffmpeg-filter_complex_audio
* lavf-asf
* lavf-mkv
* lavf-mkv_attachment
* matroska-encoding-delay
All of these change due to the fact that the output duration is now
the actual input data duration and does not include padding added by
the encoder.
* apng-osample: less wrong packet durations are now passed to the muxer.
  They are not entirely correct, because the first frame duration should
  be 3 rather than 2. This is caused by the vsync code and should be
  addressed later, but this change is a step in the right direction.
* tscc2-mov: last output frame has a duration of 11 rather than 1 - this
  corresponds to the duration actually returned by the demuxer.
* film-cvid: video frame durations are now 2 rather than 1 - this
  corresponds to durations actually returned by the demuxer and matches
  the timestamps.
* mpeg2-ticket6677: durations of some video frames are now 2 rather than
  1 - this matches the timestamps.
That field was added to store timestamp conversion state for audio
decoding code. Later it started being used by streamcopy as well, but
that use is wrong because, for a given input stream, both decoding and
an arbitrary number of streamcopies may be performed simultaneously.
They would then all overwrite the same state variable.
Store this state in MuxStream instead.
This is the last use of InputStream in of_streamcopy(), so the ist
parameter can now be removed.
It stores codec parameters of the stream submitted to the muxer, which
may be different from the codec parameters in AVStream due to bitstream
filtering.
This avoids the confusing back and forth synchronisation between the
encoder, bitstream filters, and the muxer; information now flows only in
one direction. It also reduces the need for non-muxing code to access
AVStream.
Reduces access to a deeply nested muxer property
OutputStream.st->codecpar->codec_type for this fundamental and immutable
stream property.
Besides making the code shorter, this will allow making the AVStream
(OutputStream.st) private to the muxer in the future.
Set InputStream.decoding_needed/discard/etc. only from
ist_{filter,output}_add() functions. Reduces the knowledge of
InputStream internals in muxing/filtering code.
init_input_stream() can print log messages directly, there is no need to
ship them to the caller.
Also, log errors to the InputStream and avoid duplicate information in
the message.
Changing AVCodecContext.sample_aspect_ratio after the encoder was opened
is by itself questionable, but if anywhere it belongs in encoding rather
than filtering code.
Creating a new output stream of a given type is currently done by
calling new_<type>_stream(), which all start by calling
new_output_stream() to allocate the stream and do common init, followed
by type-specific init.
Reverse this structure - the caller now calls the common function
ost_add() with the type as a parameter, which then calls the
type-specific function internally. This will allow adding common code
that runs after type-specific code in future commits.
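The reversed call structure might look roughly like the following
sketch. OutStream, StreamType, new_stream_video(), new_stream_audio()
and ost_add_sketch() are hypothetical simplifications and not the real
fftools signatures.

```c
#include <assert.h>

typedef enum StreamType { ST_VIDEO, ST_AUDIO, ST_SUBTITLE, ST_DATA } StreamType;

/* Hypothetical, heavily reduced output stream. */
typedef struct OutStream {
    StreamType type;
    int type_init_done;    /* set by the type-specific step, if any */
    int common_post_done;  /* set by common code that runs afterwards */
} OutStream;

/* Type-specific init; only two types have one in this sketch. */
static int new_stream_video(OutStream *ost) { ost->type_init_done = 1; return 0; }
static int new_stream_audio(OutStream *ost) { ost->type_init_done = 1; return 0; }

/* Single entry point: common init, then type-specific init selected by
 * the type parameter, then a place for common code that must run last. */
static int ost_add_sketch(OutStream *ost, StreamType type)
{
    int ret;
    assert(ost);

    /* common init */
    ost->type             = type;
    ost->type_init_done   = 0;
    ost->common_post_done = 0;

    switch (type) {
    case ST_VIDEO: ret = new_stream_video(ost); break;
    case ST_AUDIO: ret = new_stream_audio(ost); break;
    default:       ret = 0;                     break;
    }
    if (ret < 0)
        return ret;

    /* common code that depends on the type-specific init having run */
    ost->common_post_done = 1;
    return 0;
}
```

The old structure had no natural place for the final step, since each
type-specific function was the outermost call.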
In most cases this should only occur once or so per stream in an
input, and in case the logic ends up in an eternal loop, it should
be visible to the end user instead of being completely invisible.
Signed-off-by: Jan Ekström <jan.ekstrom@24i.com>
When no packet dts values are available from the container, video
decoding code will currently use its own guessed values, which will then
be propagated to pkt_dts on decoded frames and used as pts in certain
cases. This is inaccurate, fragile, and unnecessarily convoluted.
Simply removing this allows the extrapolation code introduced in the
previous commit to do a better job.
Changes the results of numerous h264 and hevc FATE tests, where no
spurious timestamp gaps are generated anymore. Several tests no longer
need -vsync passthrough.
When no timestamps are available from the container, the video decoding
code will currently use fake dts values - generated in
process_input_packet() based on a combination of information from the
decoder and the parser (obtained via the demuxer) - to generate
timestamps during decoder flushing. This is fragile, hard to follow, and
unnecessarily convoluted, since more reliable information can be
obtained directly from post-decoding values.
The new code keeps track of the last decoded frame pts and estimates its
duration based on a number of heuristics. Timestamps generated when both
pts and pkt_dts are missing are then simply pts+duration of the last frame.
The heuristics are somewhat complicated by the fact that lavf insists on
making up packet timestamps based on its highly incomplete information.
That should be removed in the future, allowing this code to be
simplified further.
The results of the following tests change:
* h264-3386 now requires -fps_mode passthrough to avoid dropping frames
  at the end; this is a pathology of the interaction of the new and old
  code, and the fact that the sample switches from field to frame coding
  in the last packet, and will be fixed in following commits
* hevc-conformance-DELTAQP_A_BRCM_4 stops inventing an arbitrary
  timestamp gap at the end
* hevc-small422chroma - the single frame output by this test now has a
  timestamp of 0, rather than an arbitrary 7
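The extrapolation step can be sketched as follows, leaving out the
duration heuristics entirely (the caller passes in an already estimated
duration). VideoTs, NOPTS and video_ts_fixup() are hypothetical names.
Starting at 0 when there is no previous frame is an assumption of this
sketch.

```c
#include <assert.h>
#include <stdint.h>

#define NOPTS INT64_MIN   /* stand-in for AV_NOPTS_VALUE */

/* Hypothetical per-stream state: timestamp and estimated duration of the
 * last decoded frame. */
typedef struct VideoTs { int64_t last_pts; int64_t last_duration; } VideoTs;

/* Pick a timestamp for a decoded frame:
 *   1. the frame's own pts, if set
 *   2. otherwise pkt_dts, if set
 *   3. otherwise last frame's pts + last frame's duration
 * Then remember this frame for the next call. */
static int64_t video_ts_fixup(VideoTs *s, int64_t pts, int64_t pkt_dts,
                              int64_t duration)
{
    assert(s && duration >= 0);

    if (pts == NOPTS)
        pts = pkt_dts;
    if (pts == NOPTS)
        pts = s->last_pts == NOPTS ? 0 : s->last_pts + s->last_duration;

    s->last_pts      = pts;
    s->last_duration = duration;
    return pts;
}
```

Only post-decoding values are used here, so the result does not depend
on how far the decoder output lags behind its input.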
This field contains different values depending on whether the stream is
being decoded or not. When it is, InputStream.pts is set to the
timestamp of the last decoded frame. Otherwise, it is made equal to
InputStream.dts.
Since a given InputStream can be at the same time decoded and
streamcopied to any number of output streams, this use is incorrect, as
decoded frame timestamps can be delayed with respect to input packets by
an arbitrary amount (e.g. depending on the thread count when frame
threading is used).
Replace all uses of InputStream.pts for streamcopy with InputStream.dts,
which is its value when decoding is not performed. Stop setting
InputStream.pts for pure streamcopy.
Also, pass InputStream.dts as a parameter to do_streamcopy(), which
will allow that function to be decoupled from InputStream completely in
the future.