1
0
mirror of https://github.com/FFmpeg/FFmpeg.git synced 2025-10-06 05:47:18 +02:00

fftools/ffmpeg_sched: lower default frame queue size

I tested this extensively under different conditions and could not come up
with any scenario where using a larger queue size was actually beneficial.
Moreover, having such a large default queue is very wasteful especially
for larger frame sizes; and can in the worst case lead to an extra ~50% memory
footprint per input (with the default 16 threads), regardless of whether that
input is currently in use or not.

My methodology was to add logging in the event of a queue underrun/overrun,
and then observe and then observe the frequency of such events in practice,
as well as the impact on performance. I came up with an example filter graph
involving decoding, filtering and encoding with several input files and
various changes to move the bottleneck around.

I found that, in all configurations I tested, with all thread counts and
bottlenecks, using a queue size of 2 frames yielded practically identical
performance to a queue size of 8 frames. I was only able to consistently
measure a slowdown when restricting the queue to a single frame, where the
underruns ended up making up almost 1.1% of frame events in the worst case.

A summary of my test log follows:

= Bottleneck in decoder =

ffmpeg -i A -i B -i C -filter_complex "concat=n=3" -f null -

== 16 threads ==

=== Queue statistics (dec -> filtergraph) ===
- 8 frames = 91355 underruns, 1 overrun
- 4 frames = 91381 underruns, 2 overruns
- 2 frames = 91326 underruns, 21 overruns
- 1 frame  = 91284 underruns, 102 overruns

=== Time elapsed ===
- 8 frames = 14.37s
- 4 frames = 14.28s
- 2 frames = 14.27s
- 1 frame  = 14.35s

== 1 thread ==

=== Queue statistics (dec -> filtergraph) ===
- 8 frames = 91801 underruns, 0 overruns
- 4 frames = 91929 underruns, 1 overrun
- 2 frames = 91854 underruns, 7 overruns
- 1 frame  = 91745 underrons, 83 overruns

=== Time elapsed ===
- 8 frames = 39.51s
- 4 frames = 39.94s
- 2 frames = 39.91s
- 1 frame  = 41.69s

= Bottleneck in filter graph: =

ffmpeg -i A -i B -i C -filter_complex "concat=n=3,scale=3840x2160" -f null -

== 16 threads ==

=== Queue statistics (dec -> filtergraph) ===
- 8 frames =  277 underruns, 84673 overruns
- 4 frames =  640 underruns, 86523 overruns
- 2 frames =  850 underruns, 88751 overruns
- 1 frame  = 1028 underruns, 89957 overruns

=== Time elapsed ===
- 8 frames = 26.35s
- 4 frames = 26.31s
- 2 frames = 26.38s
- 1 frame  = 26.55s

== 1 thread ==

=== Queue statistics (dec -> filtergraph) ===
- 8 frames = 29746 underruns, 57033 overruns
- 4 frames = 29940 underruns, 58948 overruns
- 2 frames = 30160 underruns, 60185 overruns
- 1 frame  = 30259 underruns, 61126 overruns

=== Time elapsed ===
- 8 frames = 52.08s
- 4 frames = 52.49s
- 2 frames = 52.25s
- 1 frame  = 52.69s

= Bottleneck in encoder: =

ffmpeg -i A -i B -i C -filter_complex "concat=n=3" -c:v libx264 -preset veryfast -f null -

== 1 thread ==

== Queue statistics (filtergraph -> enc) ==
- 8 frames = 26763 underruns, 63535 overruns
- 4 frames = 26863 underruns, 63810 overruns
- 2 frames = 27243 underruns, 63839 overruns
- 1 frame  = 27670 underruns, 63953 overruns

== Time elapsed ==
- 8 frames = 89.45s
- 4 frames = 89.04s
- 2 frames = 89.24s
- 1 frame  = 90.26s
This commit is contained in:
Niklas Haas
2025-09-03 14:38:38 +02:00
parent 15407cf90b
commit fd4b5b24ce

View File

@@ -257,7 +257,7 @@ int sch_add_mux(Scheduler *sch, SchThreadFunc func, int (*init)(void *),
/**
* Default size of a frame thread queue.
*/
#define DEFAULT_FRAME_THREAD_QUEUE_SIZE 8
#define DEFAULT_FRAME_THREAD_QUEUE_SIZE 2
/**
* Add a muxed stream for a previously added muxer.