For the purpose of merging streams in a stream group, a filtergraph may have
to be created before we know whether it will actually be used. Therefore,
allow filtergraphs to be removed from the scheduler after being added.
Signed-off-by: James Almer <jamrial@gmail.com>
Currently, while the filter graph is initially being created, the scheduler
continues demuxing frames on the last input that happened to be active before
the filter graph was complete.
This can lead to an excessive number of decoded frames piling up on this
input, regardless of whether it will actually be requested by the configured
filter graph. Suspending the filter graph during this initialization phase
reduces the amount of wasted memory.
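A minimal sketch of the idea, with illustrative names that do not match the
actual scheduler code: while a filtergraph is still being configured, none of
its inputs report demand, so the scheduler stops feeding whichever input
happened to be active last.

    typedef struct FilterGraphState {
        int  configured;    // set once graph configuration has finished
        int *input_needed;  // per-input demand flags set by the graph
    } FilterGraphState;

    // The scheduler only keeps demuxing/decoding for input i while the
    // graph is configured and actually requests data on that input.
    static int fg_wants_input(const FilterGraphState *fg, int i)
    {
        if (!fg->configured)
            return 0;       // graph still initializing: suspend all inputs
        return fg->input_needed[i];
    }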
I tested this extensively under different conditions and could not come up
with any scenario where using a larger queue size was actually beneficial.
Moreover, such a large default queue is very wasteful, especially for larger
frame sizes, and can in the worst case lead to an extra ~50% memory footprint
per input (with the default 16 threads), regardless of whether that input is
currently in use.
My methodology was to add logging in the event of a queue underrun/overrun,
and then observe the frequency of such events in practice, as well as the
impact on performance. I came up with an example filter graph involving
decoding, filtering and encoding with several input files and various changes
to move the bottleneck around.
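As a rough illustration (a generic bounded producer/consumer queue, not the
actual fftools thread queue), such counters record an underrun whenever the
consumer finds the queue empty and an overrun whenever the producer finds it
full:

    #include <pthread.h>

    #define QUEUE_CAP 2

    typedef struct FrameQueue {
        void           *slots[QUEUE_CAP];
        int             head, count;
        unsigned        underruns, overruns;
        pthread_mutex_t lock;
        pthread_cond_t  not_full, not_empty;
    } FrameQueue;

    // Example instance, set up with static initializers.
    static FrameQueue fq = {
        .lock      = PTHREAD_MUTEX_INITIALIZER,
        .not_full  = PTHREAD_COND_INITIALIZER,
        .not_empty = PTHREAD_COND_INITIALIZER,
    };

    static void fq_push(FrameQueue *q, void *frame)
    {
        pthread_mutex_lock(&q->lock);
        if (q->count == QUEUE_CAP)
            q->overruns++;                 // producer must wait: queue full
        while (q->count == QUEUE_CAP)
            pthread_cond_wait(&q->not_full, &q->lock);
        q->slots[(q->head + q->count++) % QUEUE_CAP] = frame;
        pthread_cond_signal(&q->not_empty);
        pthread_mutex_unlock(&q->lock);
    }

    static void *fq_pop(FrameQueue *q)
    {
        void *frame;
        pthread_mutex_lock(&q->lock);
        if (q->count == 0)
            q->underruns++;                // consumer must wait: queue empty
        while (q->count == 0)
            pthread_cond_wait(&q->not_empty, &q->lock);
        frame = q->slots[q->head];
        q->head = (q->head + 1) % QUEUE_CAP;
        q->count--;
        pthread_cond_signal(&q->not_full);
        pthread_mutex_unlock(&q->lock);
        return frame;
    }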
I found that, in all configurations I tested, with all thread counts and
bottlenecks, using a queue size of 2 frames yielded practically identical
performance to a queue size of 8 frames. I was only able to consistently
measure a slowdown when restricting the queue to a single frame, where
underruns accounted for almost 1.1% of frame events in the worst case.
A summary of my test log follows:
= Bottleneck in decoder =
ffmpeg -i A -i B -i C -filter_complex "concat=n=3" -f null -
== 16 threads ==
=== Queue statistics (dec -> filtergraph) ===
- 8 frames = 91355 underruns, 1 overrun
- 4 frames = 91381 underruns, 2 overruns
- 2 frames = 91326 underruns, 21 overruns
- 1 frame = 91284 underruns, 102 overruns
=== Time elapsed ===
- 8 frames = 14.37s
- 4 frames = 14.28s
- 2 frames = 14.27s
- 1 frame = 14.35s
== 1 thread ==
=== Queue statistics (dec -> filtergraph) ===
- 8 frames = 91801 underruns, 0 overruns
- 4 frames = 91929 underruns, 1 overrun
- 2 frames = 91854 underruns, 7 overruns
- 1 frame = 91745 underruns, 83 overruns
=== Time elapsed ===
- 8 frames = 39.51s
- 4 frames = 39.94s
- 2 frames = 39.91s
- 1 frame = 41.69s
= Bottleneck in filter graph =
ffmpeg -i A -i B -i C -filter_complex "concat=n=3,scale=3840x2160" -f null -
== 16 threads ==
=== Queue statistics (dec -> filtergraph) ===
- 8 frames = 277 underruns, 84673 overruns
- 4 frames = 640 underruns, 86523 overruns
- 2 frames = 850 underruns, 88751 overruns
- 1 frame = 1028 underruns, 89957 overruns
=== Time elapsed ===
- 8 frames = 26.35s
- 4 frames = 26.31s
- 2 frames = 26.38s
- 1 frame = 26.55s
== 1 thread ==
=== Queue statistics (dec -> filtergraph) ===
- 8 frames = 29746 underruns, 57033 overruns
- 4 frames = 29940 underruns, 58948 overruns
- 2 frames = 30160 underruns, 60185 overruns
- 1 frame = 30259 underruns, 61126 overruns
=== Time elapsed ===
- 8 frames = 52.08s
- 4 frames = 52.49s
- 2 frames = 52.25s
- 1 frame = 52.69s
= Bottleneck in encoder =
ffmpeg -i A -i B -i C -filter_complex "concat=n=3" -c:v libx264 -preset veryfast -f null -
== 1 thread ==
=== Queue statistics (filtergraph -> enc) ===
- 8 frames = 26763 underruns, 63535 overruns
- 4 frames = 26863 underruns, 63810 overruns
- 2 frames = 27243 underruns, 63839 overruns
- 1 frame = 27670 underruns, 63953 overruns
=== Time elapsed ===
- 8 frames = 89.45s
- 4 frames = 89.04s
- 2 frames = 89.24s
- 1 frame = 90.26s
Since e0da916b8f, the ffmpeg utility has
held multiple frames output by the decoder in internal queues without
telling the decoder that it is going to do so. When the decoder has a
fixed-size pool of frames (common in some hardware APIs where the output
frames must be stored as an array texture) this could lead to the pool
being exhausted and the decoder getting stuck. Fix this by telling the
decoder to allocate additional frames according to the queue size.
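The field in question is AVCodecContext.extra_hw_frames; a hedged sketch of
the idea follows (the wrapper function and its queue_size parameter are
illustrative, not the exact code from this change):

    #include <libavcodec/avcodec.h>

    // Reserve extra surfaces in a fixed-size hardware frame pool to cover
    // the frames that may sit in inter-thread queues downstream of the
    // decoder. queue_size stands in for the actual queue capacity.
    static int dec_open(AVCodecContext *dec_ctx, const AVCodec *codec,
                        int queue_size)
    {
        if (dec_ctx->extra_hw_frames >= 0)
            dec_ctx->extra_hw_frames += queue_size; // add to user's value
        else
            dec_ctx->extra_hw_frames = queue_size;  // field defaults to -1

        return avcodec_open2(dec_ctx, codec, NULL);
    }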
These functions used to be passed directly to pthread_create(), which
required them to return void*. This is no longer the case, so they can
return a plain int.
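For context, a function passed directly to pthread_create() must have the
signature void *(*)(void *); with a trampoline in between, workers can return
a plain int. A hypothetical sketch of that pattern (names are illustrative):

    #include <pthread.h>

    typedef int (*ThreadFunc)(void *arg);

    typedef struct ThreadCtx {
        ThreadFunc func;  // worker that now returns a plain int
        void      *arg;
        int        ret;   // captured return value, read after joining
    } ThreadCtx;

    // pthread_create() still gets the void *(*)(void *) signature it
    // requires; the trampoline adapts the int-returning worker to it.
    static void *thread_trampoline(void *arg)
    {
        ThreadCtx *ctx = arg;
        ctx->ret = ctx->func(ctx->arg);
        return NULL;
    }

A thread is then started with pthread_create(&tid, NULL, thread_trampoline,
&ctx), and ctx.ret is inspected after pthread_join().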
Use 8 packets/frames by default rather than 1, which seems to provide
better throughput.
Allow -thread_queue_size to set the muxer queue size manually again.
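For example (illustrative values; per the ffmpeg documentation,
-thread_queue_size applies to the next input or output file on the command
line):
ffmpeg -i INPUT -thread_queue_size 64 OUTPUT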
See the comment block at the top of fftools/ffmpeg_sched.h for more
details on what this scheduler is for.
This commit adds the scheduling code itself, along with minimal
integration with the rest of the program:
* allocating and freeing the scheduler
* passing it throughout the call stack in order to register the
individual components (demuxers/decoders/filtergraphs/encoders/muxers)
with the scheduler
The scheduler is not actually used as of this commit, so it should not
result in any change in behavior. That will change in future commits.
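As a rough illustration of that integration flow (all names and signatures
are hypothetical stand-ins, not the actual fftools/ffmpeg_sched.h API):

    typedef struct Scheduler Scheduler;

    // Illustrative registration interface; see fftools/ffmpeg_sched.h for
    // the real declarations.
    Scheduler *sch_alloc(void);
    void       sch_free(Scheduler **schp);
    int        sch_add_demux(Scheduler *sch, void *demuxer);
    int        sch_add_dec(Scheduler *sch, void *decoder);
    int        sch_add_filtergraph(Scheduler *sch, void *fg);
    int        sch_add_enc(Scheduler *sch, void *encoder);
    int        sch_add_mux(Scheduler *sch, void *muxer);

    static int build_pipeline(Scheduler *sch, void *demuxer, void *decoder,
                              void *fg, void *encoder, void *muxer)
    {
        // Each component is registered while the transcoding pipeline is
        // being set up; the returned indices would later identify the
        // endpoints that exchange packets and frames.
        if (sch_add_demux(sch, demuxer)  < 0 ||
            sch_add_dec(sch, decoder)    < 0 ||
            sch_add_filtergraph(sch, fg) < 0 ||
            sch_add_enc(sch, encoder)    < 0 ||
            sch_add_mux(sch, muxer)      < 0)
            return -1;
        return 0;
    }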