mirror of https://github.com/FFmpeg/FFmpeg.git synced 2025-11-23 21:54:53 +02:00
13 Commits

Author SHA1 Message Date
James Almer
cce85642c9 fftools/ffmpeg_sched: add a function to remove a filtergraph from the scheduler
For the purpose of merging streams in a stream group, a filtergraph has to be
created before we know whether it will be used. Therefore, allow filtergraphs
to be removed from the scheduler after being added.

Signed-off-by: James Almer <jamrial@gmail.com>
2025-10-30 11:02:01 -03:00
Niklas Haas
5f4cbb5617 fftools/ffmpeg_sched: choke inputs during filtergraph configuration
Currently, while the filter graph is being initially created, the scheduler
continues demuxing frames on the last input that happened to be active before
the filter graph was complete.

This can lead to an excess number of decoded frames "piling" up on this input,
regardless of whether or not it will actually be requested by the configured
filter graph. Suspending the filter graph during this initialization phase
reduces the amount of wasted memory.
2025-09-30 13:16:59 +02:00
Niklas Haas
fd4b5b24ce fftools/ffmpeg_sched: lower default frame queue size
I tested this extensively under different conditions and could not come up
with any scenario where using a larger queue size was actually beneficial.
Moreover, having such a large default queue is very wasteful, especially
for larger frame sizes, and can in the worst case lead to an extra ~50% memory
footprint per input (with the default 16 threads), regardless of whether that
input is currently in use or not.
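
The ~50% figure can be reproduced with a little arithmetic. The sketch below is only an illustration: it assumes that a frame-threaded decoder holds roughly one frame per thread, and it uses a 3840x2160 8-bit 4:2:0 frame as an example size. Neither assumption comes from the commit message, and the helper names are hypothetical.

```python
def frame_bytes(width, height, bytes_per_pixel=1.5):
    """Size of one raw frame; 1.5 bytes per pixel corresponds to 8-bit 4:2:0."""
    return int(width * height * bytes_per_pixel)


def queue_overhead(queue_frames, decoder_frames):
    """Queued frames as a fraction of the frames the decoder already holds."""
    return queue_frames / decoder_frames


# Assumption: a frame-threaded decoder holds about one frame per thread.
THREADS = 16
size = frame_bytes(3840, 2160)

for queue_frames in (8, 2):
    extra_mib = queue_frames * size / 2**20
    pct = 100 * queue_overhead(queue_frames, THREADS)
    print(f"{queue_frames} queued frames: {extra_mib:.1f} MiB, "
          f"+{pct:.1f}% relative to the decoder's own frames")
```

Under these assumptions an 8-frame queue adds half again as many frames as the decoder itself holds, while a 2-frame queue adds one eighth.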

My methodology was to add logging in the event of a queue underrun/overrun,
and then observe the frequency of such events in practice,
as well as the impact on performance. I came up with an example filter graph
involving decoding, filtering and encoding with several input files and
various changes to move the bottleneck around.
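
The kind of instrumentation described here can be sketched with a toy model. The following is not the ffmpeg scheduler code: it is a minimal, single-threaded simulation in which a producer and a consumer share a bounded queue, and every failed pop or push is counted as an underrun or overrun. The function name, the tick-based timing and the counting rule (one event per blocked attempt) are all assumptions made for illustration.

```python
from collections import deque


def simulate(capacity, produce_every, consume_every, ticks):
    """Count how often a bounded queue is found empty or full.

    A producer tries to push one frame every `produce_every` ticks and a
    consumer tries to pop one every `consume_every` ticks. A failed pop is
    an underrun and a failed push is an overrun. A failed attempt is
    retried on each following tick but counted only once.
    """
    queue = deque()
    underruns = overruns = consumed = 0
    next_push = next_pop = 0
    push_blocked = pop_blocked = False

    for tick in range(ticks):
        if tick >= next_push:
            if len(queue) < capacity:
                queue.append(tick)
                next_push = tick + produce_every
                push_blocked = False
            elif not push_blocked:
                overruns += 1
                push_blocked = True

        if tick >= next_pop:
            if queue:
                queue.popleft()
                consumed += 1
                next_pop = tick + consume_every
                pop_blocked = False
            elif not pop_blocked:
                underruns += 1
                pop_blocked = True

    return {"underruns": underruns, "overruns": overruns, "consumed": consumed}


# Bottleneck in the consumer: the producer is three times faster.
for capacity in (8, 4, 2, 1):
    print(capacity, simulate(capacity, produce_every=1, consume_every=3, ticks=1000))
```

Swapping the two periods moves the bottleneck to the producer, which turns the overruns into underruns.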

I found that, in all configurations I tested, with all thread counts and
bottlenecks, using a queue size of 2 frames yielded practically identical
performance to a queue size of 8 frames. I was only able to consistently
measure a slowdown when restricting the queue to a single frame, where the
underruns ended up making up almost 1.1% of frame events in the worst case.

A summary of my test log follows:

= Bottleneck in decoder =

ffmpeg -i A -i B -i C -filter_complex "concat=n=3" -f null -

== 16 threads ==

=== Queue statistics (dec -> filtergraph) ===
- 8 frames = 91355 underruns, 1 overrun
- 4 frames = 91381 underruns, 2 overruns
- 2 frames = 91326 underruns, 21 overruns
- 1 frame  = 91284 underruns, 102 overruns

=== Time elapsed ===
- 8 frames = 14.37s
- 4 frames = 14.28s
- 2 frames = 14.27s
- 1 frame  = 14.35s

== 1 thread ==

=== Queue statistics (dec -> filtergraph) ===
- 8 frames = 91801 underruns, 0 overruns
- 4 frames = 91929 underruns, 1 overrun
- 2 frames = 91854 underruns, 7 overruns
- 1 frame  = 91745 underruns, 83 overruns

=== Time elapsed ===
- 8 frames = 39.51s
- 4 frames = 39.94s
- 2 frames = 39.91s
- 1 frame  = 41.69s

= Bottleneck in filter graph =

ffmpeg -i A -i B -i C -filter_complex "concat=n=3,scale=3840x2160" -f null -

== 16 threads ==

=== Queue statistics (dec -> filtergraph) ===
- 8 frames =  277 underruns, 84673 overruns
- 4 frames =  640 underruns, 86523 overruns
- 2 frames =  850 underruns, 88751 overruns
- 1 frame  = 1028 underruns, 89957 overruns

=== Time elapsed ===
- 8 frames = 26.35s
- 4 frames = 26.31s
- 2 frames = 26.38s
- 1 frame  = 26.55s

== 1 thread ==

=== Queue statistics (dec -> filtergraph) ===
- 8 frames = 29746 underruns, 57033 overruns
- 4 frames = 29940 underruns, 58948 overruns
- 2 frames = 30160 underruns, 60185 overruns
- 1 frame  = 30259 underruns, 61126 overruns

=== Time elapsed ===
- 8 frames = 52.08s
- 4 frames = 52.49s
- 2 frames = 52.25s
- 1 frame  = 52.69s

= Bottleneck in encoder =

ffmpeg -i A -i B -i C -filter_complex "concat=n=3" -c:v libx264 -preset veryfast -f null -

== 1 thread ==

=== Queue statistics (filtergraph -> enc) ===
- 8 frames = 26763 underruns, 63535 overruns
- 4 frames = 26863 underruns, 63810 overruns
- 2 frames = 27243 underruns, 63839 overruns
- 1 frame  = 27670 underruns, 63953 overruns

=== Time elapsed ===
- 8 frames = 89.45s
- 4 frames = 89.04s
- 2 frames = 89.24s
- 1 frame  = 90.26s
2025-09-30 13:16:59 +02:00
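
To make the timing tables in the test log above easier to compare, the short sketch below expresses each elapsed time as a percent change relative to the 8-frame queue. The numbers are copied from the log; the dictionary layout and the function name are only illustrative.

```python
# Elapsed times in seconds, copied from the test log above,
# keyed by scenario and then by queue size in frames.
ELAPSED = {
    "decoder, 16 threads":      {8: 14.37, 4: 14.28, 2: 14.27, 1: 14.35},
    "decoder, 1 thread":        {8: 39.51, 4: 39.94, 2: 39.91, 1: 41.69},
    "filter graph, 16 threads": {8: 26.35, 4: 26.31, 2: 26.38, 1: 26.55},
    "filter graph, 1 thread":   {8: 52.08, 4: 52.49, 2: 52.25, 1: 52.69},
    "encoder, 1 thread":        {8: 89.45, 4: 89.04, 2: 89.24, 1: 90.26},
}


def slowdown_pct(times, baseline=8):
    """Percent change in elapsed time relative to the baseline queue size."""
    base = times[baseline]
    return {size: 100.0 * (t - base) / base for size, t in times.items()}


for scenario, times in ELAPSED.items():
    deltas = slowdown_pct(times)
    row = ", ".join(f"{size}: {deltas[size]:+.2f}%" for size in (4, 2, 1))
    print(f"{scenario:26s} {row}")
```

Read this way, the 2-frame results stay within roughly 1% of the 8-frame baseline in every scenario, while the single-frame queue with a single-threaded decoder bottleneck is about 5.5% slower.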
Timo Rothenpieler
262d41c804 all: fix typos found by codespell
2025-08-03 13:48:47 +02:00
Anton Khirnov
68c198fae2 fftools/ffmpeg_sched: allow decoders to have multiple outputs
Will be useful for multilayer video.
2024-09-23 17:15:02 +02:00
Anton Khirnov
255ae03601 fftools/ffmpeg_sched: allow filtergraphs to send to filtergraphs
Will be useful for filtergraph chaining that will be added in following
commits.
2024-04-09 10:34:18 +02:00
Mark Thompson
7f4b8d2f5e ffmpeg: set extra_hw_frames to account for frames held in queues
Since e0da916b8f the ffmpeg utility has
held multiple frames output by the decoder in internal queues without
telling the decoder that it is going to do so.  When the decoder has a
fixed-size pool of frames (common in some hardware APIs where the output
frames must be stored as an array texture) this could lead to the pool
being exhausted and the decoder getting stuck.  Fix this by telling the
decoder to allocate additional frames according to the queue size.
2024-03-19 22:56:56 +00:00
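
The failure mode described in this commit can be illustrated with a toy model of a fixed-size frame pool. This is a sketch under stated assumptions, not the libavcodec implementation: the class, the function and the frame counts are hypothetical, and only the idea of requesting extra frames for the queue (the role of `extra_hw_frames`) comes from the commit message.

```python
class FixedFramePool:
    """A frame pool whose size is fixed at creation time."""

    def __init__(self, size):
        self.free = size

    def acquire(self):
        """Take one frame from the pool; return False if it is exhausted."""
        if self.free == 0:
            return False
        self.free -= 1
        return True


def decoder_can_run(decoder_frames, extra_frames, queue_size):
    """Return True if the decoder keeps working while the queue is full.

    decoder_frames: frames the decoder needs for itself (hypothetical value)
    extra_frames:   additional frames requested on behalf of the caller
    queue_size:     frames held downstream in the inter-thread queue
    """
    pool = FixedFramePool(decoder_frames + extra_frames)

    # The downstream queue fills up and holds on to its frames.
    for _ in range(queue_size):
        if not pool.acquire():
            return False

    # The decoder still needs its own working set.
    for _ in range(decoder_frames):
        if not pool.acquire():
            return False
    return True


print(decoder_can_run(decoder_frames=4, extra_frames=0, queue_size=8))
print(decoder_can_run(decoder_frames=4, extra_frames=8, queue_size=8))
```

In this model the decoder only keeps running when the extra frames requested match the number of frames the queue can hold, which is the relationship the fix establishes.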
Anton Khirnov
efab83c156 fftools/ffmpeg_sched: allow connecting encoder output to decoders
2024-03-13 08:01:15 +01:00
Anton Khirnov
2ee9362419 fftools/ffmpeg: remove unnecessary casts for *_thread() return values
These functions used to be passed directly to pthread_create(), which
required them to return void*. This is no longer the case, so they can
return a plain int.
2024-03-13 08:01:15 +01:00
Anton Khirnov
e0da916b8f fftools/ffmpeg: optimize inter-thread queue sizes
Use 8 packets/frames by default rather than 1, which seems to provide
better throughput.

Allow -thread_queue_size to set the muxer queue size manually again.
2024-01-28 13:34:56 +01:00
Anton Khirnov
00013341df fftools/ffmpeg_sched: add filter API to signal EOF on input
2024-01-27 09:24:29 +01:00
Anton Khirnov
2305091a3a fftools/ffmpeg: update the reported timestamp at the end
Reported-by: microchip
2023-12-14 20:16:54 +01:00
Anton Khirnov
9b8cc36ce0 fftools/ffmpeg: add thread-aware transcode scheduling infrastructure
See the comment block at the top of fftools/ffmpeg_sched.h for more
details on what this scheduler is for.

This commit adds the scheduling code itself, along with minimal
integration with the rest of the program:
* allocating and freeing the scheduler
* passing it throughout the call stack in order to register the
  individual components (demuxers/decoders/filtergraphs/encoders/muxers)
  with the scheduler

The scheduler is not actually used as of this commit, so it should not
result in any change in behavior. That will change in future commits.
2023-12-12 08:24:18 +01:00