avcodec/h274: add film grain synthesis routine

This could arguably also be a vf, but I decided to put it here since
decoders are technically required to apply film grain during the output
step, and I would rather avoid requiring users to insert the
correct film grain synthesis filter on their own.

The code, while in C, is written in a way that unrolls/vectorizes fairly
well under -O3, and is reasonably cache friendly. On my CPU, a single
thread pushes about 400 FPS at 1080p.

Apart from hand-written assembly, one possible avenue of improvement
would be to change the access order to compute the grain row-by-row
rather than in 8x8 blocks. This requires some redundant PRNG calls, but
would make the algorithm more cache-oblivious.

The implementation has been written to the wording of SMPTE RDD 5-2006
as faithfully as I can manage. However, apart from passing a visual
inspection, no guarantee of correctness can be made due to the lack of
any publicly available reference implementation against which to
compare it.

Signed-off-by: Niklas Haas <git@haasn.dev>
Signed-off-by: James Almer <jamrial@gmail.com>
2021-08-17 21:25:32 +02:00

/*
 * H.274 film grain synthesis
 * Copyright (c) 2021 Niklas Haas <ffmpeg@haasn.xyz>
 *
 * This file is part of FFmpeg.
 *
 * FFmpeg is free software; you can redistribute it and/or
 * modify it under the terms of the GNU Lesser General Public
 * License as published by the Free Software Foundation; either
 * version 2.1 of the License, or (at your option) any later version.
 *
 * FFmpeg is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
 * Lesser General Public License for more details.
 *
 * You should have received a copy of the GNU Lesser General Public
 * License along with FFmpeg; if not, write to the Free Software
 * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
 */

/**
 * @file
 * H.274 film grain synthesis.
 * @author Niklas Haas <ffmpeg@haasn.xyz>
 */

#ifndef AVCODEC_H274_H
#define AVCODEC_H274_H

#include "libavutil/film_grain_params.h"

// Must be initialized to {0} prior to first usage
typedef struct H274FilmGrainDatabase {
    // Database of film grain patterns, lazily computed as-needed
    int8_t db[13 /* h */][13 /* v */][64][64];
    uint16_t residency[13 /* h */]; // bit field of v

    // Temporary buffer for slice generation
    int16_t slice_tmp[64][64];
} H274FilmGrainDatabase;
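
// Illustrative sketch only, not the code in h274.c: one way the `residency`
// bit field could guard lazy generation of the 13 x 13 pattern table.
// GrainCache, generate_pattern(), get_pattern() and the `generated` counter
// are hypothetical names, and the placeholder generator merely fills the
// block with a constant instead of synthesizing real grain.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in mirroring the db/residency layout of
 * H274FilmGrainDatabase (the table alone is 13*13*64*64 bytes, 676 KiB). */
typedef struct GrainCache {
    int8_t   db[13][13][64][64];
    uint16_t residency[13];      /* bit v set => db[h][v] has been computed */
} GrainCache;

/* Counts how many patterns were actually generated (for demonstration). */
static int generated;

/* Placeholder generator (assumption): fills the block with h + v. */
static void generate_pattern(int8_t out[64][64], int h, int v)
{
    memset(out, h + v, sizeof(int8_t[64][64]));
    generated++;
}

/* Returns the pattern for (h, v), computing it on first use only.
 * The cache must start out zeroed so that every residency bit is clear. */
static const int8_t *get_pattern(GrainCache *c, int h, int v)
{
    assert(h >= 0 && h < 13 && v >= 0 && v < 13);

    if (!(c->residency[h] & (1u << v))) {
        generate_pattern(c->db[h][v], h, v);
        c->residency[h] |= (uint16_t)(1u << v);
    }
    return &c->db[h][v][0][0];
}
```

// Because a clear bit means "not yet computed", zero-initializing the whole
// structure is enough to mark every pattern as absent, which is consistent
// with the {0} requirement stated above. Thirteen values of v fit in the
// 16 bits of each residency entry.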

/**
 * Check whether ff_h274_apply_film_grain() supports the given parameter combination.
 *
 * @param model_id model_id from AVFilmGrainParams to be supplied
 * @param pix_fmt  pixel format of the frames to be supplied
 */
static inline int ff_h274_film_grain_params_supported(int model_id, enum AVPixelFormat pix_fmt)
{
    return model_id == 0 && pix_fmt == AV_PIX_FMT_YUV420P;
}
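
// Illustrative sketch only: how a caller might gate grain synthesis on the
// check above. PixFmt, params_supported(), GrainAction and
// choose_grain_action() are hypothetical stand-ins, not FFmpeg API; only the
// condition itself (model_id 0 with 8-bit planar YUV 4:2:0) is taken from
// this header.

```c
#include <assert.h>

/* Hypothetical stand-in for enum AVPixelFormat, reduced to two formats. */
enum PixFmt { PIX_FMT_YUV420P, PIX_FMT_YUV420P10 };

/* Mirrors the condition used by ff_h274_film_grain_params_supported(). */
static int params_supported(int model_id, enum PixFmt fmt)
{
    return model_id == 0 && fmt == PIX_FMT_YUV420P;
}

/* What the caller decides to do with a frame (hypothetical). */
typedef enum GrainAction {
    GRAIN_APPLY, /* safe to call the synthesis routine */
    GRAIN_SKIP   /* unsupported combination: leave the frame as decoded */
} GrainAction;

/* Decide up front so that the synthesis call is only made for parameter
 * combinations that were reported as supported. */
static GrainAction choose_grain_action(int model_id, enum PixFmt fmt)
{
    return params_supported(model_id, fmt) ? GRAIN_APPLY : GRAIN_SKIP;
}
```

// The check must be fed the same model_id and pixel format that the frames
// and params passed to the synthesis routine actually carry; otherwise its
// answer says nothing about whether that call will succeed.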

// Synthesizes film grain on top of `in` and stores the result to `out`. `out`
// must already have been allocated and set to the same size and format as
// `in`.
//
// Returns a negative error code on error, such as invalid params.
// If ff_h274_film_grain_params_supported() indicated that the parameters
// are supported, no error will be returned if the arguments given to
// ff_h274_film_grain_params_supported() coincide with actual values
// from the frames and params.
int ff_h274_apply_film_grain(AVFrame *out, const AVFrame *in,
                             H274FilmGrainDatabase *db,
                             const AVFilmGrainParams *params);

#endif /* AVCODEC_H274_H */