Encoder initialization is currently split rather arbitrarily between
init_output_stream_encode() and init_output_stream(). Move all of it to
init_output_stream_encode().
The code currently uses lavfi for this, which creates a sort of
configuration dependency loop - the encoder should be ideally
initialized with information from the first audio frame, but to get this
frame one needs to first open the encoder to know the frame size. This
necessitates an awkward workaround, which causes audio handling to be
different from video.
With this change, audio encoder initialization is congruent with video.
Frame counters can overflow relatively easily (INT_MAX number of frames is
slightly more than 1 year for 60 fps content), so make sure we use 64 bit
values for them.
Also deprecate the old 32 bit frame_number attribute.
Signed-off-by: Marton Balint <cus@passwd.hu>
Analogous to -enc_stats*, but happens right before muxing. Useful
because bitstream filters and the sync queue can modify packets after
encoding and before muxing. Also has access to the muxing timebase.
Splits the currently handled subtitle at random access point
packets that can be configured to follow a specific output stream.
Currently only subtitle streams which are directly mapped into the
same output in which the heartbeat stream resides are affected.
This way the subtitle - which is known to be shown at this time
can be split and passed to muxer before its full duration is
yet known. This is also a drawback, as this essentially outputs
multiple subtitles from a single input subtitle that continues
over multiple random access points. Thus this feature should not
be utilized in cases where subtitle output latency does not matter.
Co-authored-by: Andrzej Nadachowski <andrzej.nadachowski@24i.com>
Co-authored-by: Bernard Boulay <bernard.boulay@24i.com>
Signed-off-by: Jan Ekström <jan.ekstrom@24i.com>
This way we can call process_subtitles without causing the decoded
frame counter to get bumped.
Additionally, this now takes into mention all of the decoded
subtitle frames without fix_sub_duration latency/buffering, or filtering
out decoded reset/end subtitles without any rendered rectangles, which
matches the original intent in 4754345027
.
Signed-off-by: Jan Ekström <jan.ekstrom@24i.com>
This enables us to later call this when generating additional
subtitles for splitting purposes.
Co-authored-by: Andrzej Nadachowski <andrzej.nadachowski@24i.com>
Signed-off-by: Jan Ekström <jan.ekstrom@24i.com>
Rather than the encoder timebase. Since the times are parsed as
microseconds, this will not reduce precision, except possibly when
chapter times are used and the chapter timebase happens to be better
aligned with the encoder timebase, which is unlikely.
This will allow parsing the keyframe times earlier (before encoder
timebase is known) in future commits.
There are 8 of them and they are typically used together. Allows to pass
just this struct to forced_kf_apply(), which makes it clear that the
rest of the OutputStream is not accessed there.
Replace it with an array of streams in each InputFile. This is a more
accurate reflection of the actual relationship between InputStream and
InputFile.
Analogous to what was previously done to output streams in
7ef7a22251.
Encoding init code will currently fall back to a 25fps default when no
framerate is known or specified, but only if there is a known source
input stream. There is no good reason for this condition, so drop it.
Frame limiting is now handled using sync queues. This code prevents the
sync queue from triggering EOF, resulting in unnecessarily many frames
being decoded, filtered, and then discarded.
Found-by: U. Artie Eoff <ullysses.a.eoff@intel.com>
The current adjustment of input start times just adjusts the tsoffset.
And it does so, by resetting the tsoffset to nullify the new start time.
This leads to breakage of -copyts, ignoring of input_ts_offset, breaking
of -isync as well as breaking wrap correction.
Fixed by taking cognizance of these parameters, and by correcting start times
just before sync offsets are applied.
Now that we have proper options for defining display matrix
overrides, this should no longer be required.
fftools does not have its own versioning, so for now the define is
just set to 1 and disables the functionality if set to zero.
Replace it with an array of streams in each OutputFile. This is a more
accurate reflection of the actual relationship between OutputStream and
OutputFile. This is easier to handle and will allow further
simplifications in future commits.
It has been deprecated in favor of the aresample filter for almost 10
years.
Another thing this option can do is drop audio timestamps and have them
generated by the encoding code or the muxer, but
- for encoding, this can already be done with the setpts filter
- for muxing this should almost never be done as timestamp generation by
the muxer is deprecated, but people who really want to do this can use
the setts bitstream filter
update_video_stats() currently uses OutputStream.data_size to print the
total size of the encoded stream so far and the average bitrate.
However, that field is updated in the muxer thread, right before the
packet is sent to the muxer. Not only is this racy, but the numbers may
not match even if muxing was in the main thread due to bitstream
filters, filesize limiting, etc.
Introduce a new counter, data_size_enc, for total size of the packets
received from the encoder and use that in update_video_stats(). Rename
data_size to data_size_mux to indicate its semantics more clearly.
No synchronization is needed for data_size_mux, because it is only read
in the main thread in print_final_stats(), which runs after the muxer
threads are terminated.
It is either equal to OutputStream.enc_ctx->codec, or NULL when enc_ctx
is NULL. Replace the use of enc with enc_ctx->codec, or the equivalent
enc_ctx->codec_* fields where more convenient.
It races with the demuxing thread. Instead, send the information along
with the demuxed packets.
Ideally, the code should stop using the stream-internal parsing
completely, but that requires considerably more effort.
Fixes races, e.g. in:
- fate-h264-brokensps-2580
- fate-h264-extradata-reload
- fate-iv8-demux
- fate-m4v-cfr
- fate-m4v
Use it instead of AVStream.codecpar in the main thread. While
AVStream.codecpar is documented to only be updated when the stream is
added or avformat_find_stream_info(), it is actually updated during
demuxing. Accessing it from a different thread then constitutes a race.
Ideally, some mechanism should eventually be provided for signalling
parameter updates to the user. Then the demuxing thread could pick up
the changes and propagate them to the decoder.
Discontinuity detection/correction is left in the main thread, as it is
entangled with InputStream.next_dts and related variables, which may be
set by decoding code.
Fixes races e.g. in fate-ffmpeg-streamloop after
aae9de0cb2.
This will allow to move normal offset handling to demuxer thread, since
discontinuities currently have to be processed in the main thread, as
the code uses some decoder-produced values.
InputFile.ts_offset can change during transcoding, due to discontinuity
correction. This should not affect the streamcopy starting timestamp.
Cf. bf2590aed3
-stream_loop is currently handled by destroying the demuxer thread,
seeking, then recreating it anew. This is very messy and conflicts with
the future goal of moving each major ffmpeg component into its own
thread.
Handle -stream_loop directly in the demuxer thread. Looping requires the
demuxer to know the duration of the file, which takes into account the
duration of the last decoded audio frame (if any). Use a thread message
queue to communicate this information from the main thread to the
demuxer thread.
This avoids a potential race with the demuxer adding new streams. It is
also more efficient, since we no longer do inter-thread transfers of
packets that will be just discarded.
This undocumented feature runtime-enables dumping input packets. I can
think of no reasonable real-world use case that cannot also be
accomplished in a different way. Keeping this functionality would
interfere with the following commit moving it to the input thread (then
setting the variable would require locking or atomics, which would be
unnecessarily complicated for a feature that probably nobody uses).
There are currently three possible modes for an output stream:
1) The stream is produced by encoding output from some filtergraph. This
is true when ost->enc_ctx != NULL, or equivalently when
ost->encoding_needed != 0.
2) The stream is produced by copying some input stream's packets. This
is true when ost->enc_ctx == NULL && ost->source_index >= 0.
3) The stream is produced by attaching some file directly. This is true
when ost->enc_ctx == NULL && ost->source_index < 0.
OutputStream.stream_copy is currently used to identify case 2), and
sometimes to confusingly (or even incorrectly) identify case 1). Remove
it, replacing its usage with checking enc_ctx/source_index values.
The streamcopy initialization code briefly needs an AVCodecContext to
apply AVOptions to. Allocate a temporary codec context, do not use the
encoding one.
It retrieves the muxer's internal timestamp with under-defined
semantics. Continuing to use this value would also require
synchronization once the muxer is moved to a separate thread.
Replace the value with last_mux_dts.
This field means different things when the video is encoded (number of
frames emitted to the encoding sync queue/encoder by the video sync
code) or copied (number of packets sent to the muxer sync queue).
Print the value of packets_written instead, which means the same thing
in both cases. It is also more accurate, since packets may be dropped by
the sync queue or bitstream filters.
Same issues apply to it as to -shortest.
Changes the results of the following tests:
- matroska-flac-extradata-update
The test reencodes two input FLAC streams into three output FLAC
streams. The last output stream is limited to 8 frames. The current
code results in the first two output streams having 12 frames, after
this commit all three streams have 8 frames and are the same length.
This new result is better, since it is predictable.
- mkv-1242
The test streamcopies one video and one audio stream, video is limited
to 11 frames. The new result shortens the audio stream so that it is
not longer than the video.
The -shortest option (which finishes the output file at the time the
shortest stream ends) is currently implemented by faking the -t option
when an output stream ends. This approach is fragile, since it depends
on the frames/packets being processed in a specific order. E.g. there
are currently some situations in which the output file length will
depend unpredictably on unrelated factors like encoder delay. More
importantly, the present work aiming at splitting various ffmpeg
components into different threads will make this approach completely
unworkable, since the frames/packets will arrive in effectively random
order.
This commit introduces a "sync queue", which is essentially a collection
of FIFOs, one per stream. Frames/packets are submitted to these FIFOs
and are then released for further processing (encoding or muxing) when
it is ensured that the frame in question will not cause its stream to
get ahead of the other streams (the logic is similar to libavformat's
interleaving queue).
These sync queues are then used for encoding and/or muxing when the
-shortest option is specified.
A new option – -shortest_buf_duration – controls the maximum number of
queued packets, to avoid runaway memory usage.
This commit changes the results of the following tests:
- copy-shortest[12]: the last audio frame is now gone. This is
correct, since it actually outlasts the last video frame.
- shortest-sub: the video packets following the last subtitle packet are
now gone. This is also correct.
The following commits will add a new buffering stage after bitstream
filters, which should not be taken into account for choosing next
output.
OutputStream.last_mux_dts is also used by the muxing code to make up
missing DTS values - that field is now moved to the muxer-private
MuxStream object.
The current placement of this free is historical - it used to be
followed by avcodec_close(), since removed.
The proper place for freeing the stats is currently right before the
encoder context itself is freed.
It is currently called from two places:
- output_packet() in ffmpeg.c, which submits the newly available output
packet to the muxer
- from of_check_init() in ffmpeg_mux.c after the header has been
written, to flush the muxing queue
Some packets will thus be processed by this function twice, so it
requires an extra parameter to indicate the place it is called from and
avoid modifying some state twice.
This is fragile and hard to follow, so split this function into two.
Also rename of_write_packet() to of_submit_packet() to better reflect
its new purpose.
The muxing queue currently lives in OutputStream, which is a very large
struct storing the state for both encoding and muxing. The muxing queue
is only used by the code in ffmpeg_mux, so it makes sense to restrict it
to that file.
This makes the first step towards reducing the scope of OutputStream.
Figure out earlier whether the output stream/file should be bitexact and
store this information in a flag in OutputFile/OutputStream.
Stop accessing the muxer in set_encoder_id(), which will become
forbidden in future commits.
The current code postpones closing the files until after printing the
final report, which accesses the output file size. Deal with this by
storing the final file size before closing the file.
Move the file size checking code to ffmpeg_mux. Use the recently
introduced of_filesize(), making this code consistent with the size
shown by print_report().
Frame counters can overflow relatively easily (INT_MAX number of frames is
slightly more than 1 year for 60 fps content), so make sure we are always
using 64 bit values for them.
A live stream can easily run for more than a year and the framedup logic breaks
on an overflow.
Signed-off-by: Marton Balint <cus@passwd.hu>
Option was added in commit 39aafa5ee9 but was never documented.
Also does not seem there are current use cases for it,
tests for which it was introduced are still working therefore we drop
it altogether.
Indirectly fix trac issue: http://trac.ffmpeg.org/ticket/1698
Signed-off-by: Marton Balint <cus@passwd.hu>
This is a more appropriate place for this code, since the values we read
from AV_PKT_DATA_QUALITY_STATS side data are primarily written into
video stats. This ensures that the values written into stats actually
apply to the right packet.
Rename the function to update_video_stats() to better reflect its new
purpose.
It retrieves libavformat's internal dts value (contrary to the
function's name), which is not only incorrect in general, but also
unnecessary because we can access the packet directly.
Its use for muxing is not documented, in practice it is incremented per
each packet successfully passed to the muxer's write_packet(). Since
there is a lot of indirection between ffmpeg receiving a packet from the
encoder and it actually being written (e.g. bitstream filters, the
interleaving queue), using nb_frames here is incorrect.
Add a new counter for packets received from encoder instead.
This field is currently used by checks
- skipping packets before the first keyframe
- skipping packets before start time
to test whether any packets have been output already. But since
frame_number is incremented after the bitstream filters are applied
(which may involve delay), this use is incorrect. The keyframe check
works around this by adding an extra flag, the start-time check does
not.
Simplify both checks by replacing the seen_kf flag with a flag tracking
whether any packets have been output by do_streamcopy().
Especially useful when debugging subtitle output, but also shows
if values are set or not for demux and encoding.
Co-authored-by: Jan Ekström <jan.ekstrom@24i.com>
Signed-off-by: Jan Ekström <jan.ekstrom@24i.com>
This avoids including version.h in all source files, avoiding
unnecessary rebuilds when the version number is bumped. Only
version_major.h is included by the main header, which defines
availability of e.g. FF_API_* macros, and which is bumped much
less often.
This isn't done for libavutil/version.h, because that header needs
to be included essentially everywhere due to LIBAVUTIL_VERSION_INT
being used wherever an AVClass is constructed.
Signed-off-by: Martin Storsjö <martin@martin.st>
Bitstream filters inserted between the input and output were never drained,
resulting in packets being lost if the bsf had any buffered.
Signed-off-by: James Almer <jamrial@gmail.com>
Fixes ticket 9086.
Since early 2021, some of YouTube's VP9 encodes have non-monotonous DTS.
This makes ffmpeg fatally fail when trying to copy or encode the V9 video.
ffmpeg already includes functionality to correct this, however it was
disabled without explanation for VP9 stream copies in
2e6636aa87
This patch restores the DTS correction logic, and allows ffmpeg to correctly
encode (invalid) videos produced by youtube.com. I have verified that frames
are NOT being cut (so it does not re-introduce 4313).
Reviwed-by: Ronald S. Bultje <rsbultje@gmail.com>
Signed-off-by: James Almer <jamrial@gmail.com>
If the input stream framerate is known, it will be configured on the
relevant filtergraph input and get propagated to the output stream in
the above line. That makes these assignments redundant.
The only caller of do_video_out() doesn't need the frame afterwards,
ergo one can replace an av_frame_ref() by av_frame_move_ref().
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Don't mark all streams as finished, instead make sync_opts keep track of the
stream's duration, and set recording_time to it, same as in transcoding paths.
Fixes tickets #9512 and #9513.
Signed-off-by: James Almer <jamrial@gmail.com>
This was almost completely redundant. The only functionality that's no longer
available after this removal is the videotoolbox_pixfmt arg, which has been
obsolete for several years.
The types used by the AVFifo API are inconsistent:
av_fifo_(space|size)() returns an int; av_fifo_alloc() takes an
unsigned, other parts use size_t. This commit therefore ensures
that the size of the muxing_queue FIFO never exceeds INT_MAX.
While just at it, also make sure not to call av_fifo_size()
unnecessarily often.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
send_frame_to_filters() sends a frame to all the filters that
need said frame; for every filter except the last one this involves
creating a reference to the frame, because
av_buffersrc_add_frame_flags() by default takes ownership of
the supplied references. Yet said function has a flag which
changes its behaviour to create a reference itself.
This commit uses this flag and stops creating the references itself;
this allows to remove the spare AVFrame holding the temporary
references; it also avoids unreferencing said frame.
Reviewed-by: James Almer <jamrial@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
As well as the custom get_buffer2() implementation which would become a
redundant wrapper for avcodec_default_get_buffer2() after this
Signed-off-by: James Almer <jamrial@gmail.com>
Treat values returned from av_dict_get() as const, since they are
internal to AVDictionary.
Signed-off-by: Chad Fraleigh <chadf@triularity.org>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Currently, the code doing this is spread over several places and may
behave in unexpected ways. E.g. automatic 'default' marking is only done
for streams fed by complex filtergraphs. It is also applied in the order
in which the output streams are initialized, which is effectively
random.
Move processing the dispositions at the end of open_output_file(), when
we already have all the necessary information.
Apply the automatic default marking only if no explicit -disposition
options were supplied by the user, and apply it to the first stream of
each type (excluding attached pics) when there is more than one stream
of that type and no default markings were copied from the input streams.
Explicitly document the new behavior.
Changes the results of some tests, where the output file gets a default
disposition, while it previously did not.
When viewing logs, it's sometimes useful to be able to see whether
execution was ended via q command.
Signed-off-by: softworkz <softworkz@hotmail.com>
Signed-off-by: Marton Balint <cus@passwd.hu>