Applying film grain happens after ff_thread_finish_setup(),
so the parameters synced in hevc_update_thread_context() must not
be modified. But this is exactly what happens in case applying
film grain fails. (The likely result is that in case of frame threading
an uninitialized frame is output.)
Given that it is actually very easy to know in advance whether
ff_h274_apply_film_grain() supports a given set of parameters,
one can check for this before ff_thread_finish_setup()
and avoid allocating an unused buffer lateron.
Reviewed-by: Niklas Haas <ffmpeg@haasn.xyz>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
This could arguably also be a vf, but I decided to put it here since
decoders are technically required to apply film grain during the output
step, and I would rather want to avoid requiring users insert the
correct film grain synthesis filter on their own.
The code, while in C, is written in a way that unrolls/vectorizes fairly
well under -O3, and is reasonably cache friendly. On my CPU, a single
thread pushes about 400 FPS at 1080p.
Apart from hand-written assembly, one possible avenue of improvement
would be to change the access order to compute the grain row-by-row
rather than in 8x8 blocks. This requires some redundant PRNG calls, but
would make the algorithm more cache-oblivious.
The implementation has been written to the wording of SMPTE RDD 5-2006
as faithfully as I can manage. However, apart from passing a visual
inspection, no guarantee of correctness can be made due to the lack of
any publicly available reference implementation against which to
compare it.
Signed-off-by: Niklas Haas <git@haasn.dev>
Signed-off-by: James Almer <jamrial@gmail.com>