It differs from query_func() in accepting arrays of input/output format
configurations to be filled as callback parameters. This allows to mark
the filter context as const, ensuring it is not modified by this
function, as it is not supposed to have any side effects beyond
returning the supported formats.
The color range should be set to match the input when creating
the VideoToolbox context. Otherwise, the new context will default
to limited range, creates inconsistencies with full range inputs.
Signed-off-by: Gnattu OC <gnattuoc@me.com>
Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
The only thing this function does beyond calling av_get_pix_fmt() is
falling back onto parsing the argument as a number. No other filters
should need to do this.
The current logic hard-coded a check for v_sub == 1. We can extend this
logic slightly to cover the case of interlaced 4:1:0 (which has v_sub ==
2).
Here is a diagram explaining this scenario (with center-siting):
a a a a a a a a
b b b b b b b b
X X
a a a a a a a a
b b b b b b b b
a a a a a a a a
b b b b b b b b
Y Y
a a a a a a a a
b b b b b b b b
a = even luma rows
b = odd luma rows
X = even chroma sample
Y = odd chroma sample
In progressive mode, the chroma samples sit at (384, 384) respectively.
Relative to the 8x4 grid of even luma samples (a), the X sample sits at:
h_chr_pos = 384
v_chr_pos = 192
Relative to the 8x4 grid of odd luma samples (b), the Y sample sits at:
h_chr_pos = 384
v_chr_pos = 576
The new code calculates the correct values in all circumstances.
Currently, this just functions as a more principled and user-friendly
replacement for the (undocumented and hard to use) *_chr_pos fields.
However, the goal is to automatically infer these values from the input
frames' chroma location, and deprecate the manual use of *_chr_pos
altogether. (Indeed, my plans for an swscale replacement will most
likely also end up limiting the set of legal chroma locations to those
permissible by AVFrame properties)
The current logic only fixes it when the user does not explicitly
specify the chroma location. However, this does not make a lot of sense.
Since there is no way to specify this property per-field, it effectively
*prevents* the user from being able to correctly scale interlaced frames
with top-aligned chroma.
It makes more sense to consider the user setting in the progressive case
only, and automatically adapt it to the correct interlaced field
positions, following the details of the MPEG specification.
We check for whether subformats support storage immediately below.
Those are the ones we require storage for, rather than the base format
itself.
This permits better reuse of AVHWFrame contexts.
The patch also removes an always-false check in the subformat check.
Vulkan encoding was designed in a very... consolidated way.
You had to know the exact codec and profile that the image was going to
eventually be encoded as at... image creation time. Unfortunately, as good
as our code is, glimpsing into the exact future isn't what its capable of.
video_maintenance1 removed that requirement, which only then made encoding
images practically possible.
Specifically those that should be visible to filters, but hidden from
API callers. Such properties are currently located at the end of the
public AVFilterLink struct, demarcated by a comment marking them as
private. However it is generally better to hide them explicitly, using
the same pattern already employed in avformat or avcodec.
The new struct is currently trivial, but will become more useful in
following commits.
Fixes: CID1458148 Result is not floating-point
Fixes: CID1458149 Result is not floating-point
Fixes: CID1458150 Result is not floating-point
Fixes: CID1458151 Result is not floating-point
Fixes: CID1458152 Result is not floating-point
Fixes: CID1458154 Result is not floating-point
Fixes: CID1458155 Result is not floating-point
Fixes: CID1458156 Result is not floating-point
Sponsored-by: Sovereign Tech Fund
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
The issue is that shaderc_result_get_num_errors may sometime
return 0 even when shaderc_result_get_compilation_status returns
a non-zero error code.
Since we use the result from the former, override the status
if it returned 0.
Callers of ff_framesync_get_frame() generally do not expect the result
to be writable, those that do (e.g. ff_framesync_dualinput_get_writable())
ensure writability themselves.
Significantly reduces memory consumption in complex graphs with
framesync-based filters (e.g. scale, ssim).
Reported-By: Mark Shwartzman
Found by reviewing CID1513722 Operands don't affect result
Sponsored-by: Sovereign Tech Fund
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Fixes: CID1604398 Unchecked return value
Fixes: CID1604542 Unchecked return value
Sponsored-by: Sovereign Tech Fund
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Fixes: CID1452758 Out-of-bounds read (actual out of bounds access depends on a frame with more than 3 planes)
Sponsored-by: Sovereign Tech Fund
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Fixes: CID1437470 Out-of-bounds read (out of bounds read would only occur with a pixel format of more than 4 planes)
Sponsored-by: Sovereign Tech Fund
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
An incorrect calculation in ff_perlin_init causes a write to the
stack array at index 256, which is out of bounds.
Fixes: CID1608711
Reviewed-by: Stefano Sabatini <stefasab@gmail.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
scale_frame() inconsistently handled the lifetime of `in`. Fixes a
possible double free and a possible memory leak.
The new code always has `scale_frame` take over ownership of the input
frame. I first tried writing this code in a way where the calling code
retains ownership, but this is nontrivial due to the presence of the
no-op short-circuit condition in which the input frame is directly
returned. (As an alternative, we could use av_frame_clone() instead, but
I wanted to avoid touching the original behavior in this commit)
Fixes: CID1551694 Use after free (false positive based on assuming that out == in and one is freed and one used)
Sponsored-by: Sovereign Tech Fund
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Fixes: CID1598548 Logically dead code
Sponsored-by: Sovereign Tech Fund
Reviewed-by: "Xiang, Haihao" <haihao.xiang@intel.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Due to hysterical raisins, most RISC-V Linux distributions target a
RV64GC baseline excluding the Bit-manipulation ISA extensions, most
notably:
- Zba: address generation extension and
- Zbb: basic bit manipulation extension.
Most CPUs that would make sense to run FFmpeg on support Zba and Zbb
(including the current FATE runner), so it makes sense to optimise for
them. In fact a large chunk of existing assembler optimisations relies
on Zba and/or Zbb.
Since we cannot patch shared library code, the next best thing is to
carry a flag initialised at load-time and check it on need basis.
This results in 3 instructions overhead on isolated use, e.g.:
1: AUIPC rd, %pcrel_hi(ff_rv_zbb_supported)
LBU rd, %pcrel_lo(1b)(rd)
BEQZ rd, non_Zbb_fallback_code
// Zbb code here
The C compiler will typically load the flag ahead of time to reducing
latency, and can also keep it around if Zbb is used multiple times in a
single optimisation scope. For this to work, the flag symbol must be
hidden; otherwise the optimisation degrades with a GOT look-up to
support interposition:
1: AUIPC rd, GOT_OFFSET_HI
LD rd, GOT_OFFSET_LO(rd)
LBU rd, (rd)
BEQZ rd, non_Zbb_fallback_code
// Zbb code here
This patch adds code to provision the flag in libraries using bit
manipulation functions from libavutil: byte-swap, bit-weight and
counting leading or trailing zeroes.
Add xpu device support to libtorch backend.
To enable xpu support you need to add
"-Wl,--no-as-needed -lintel-ext-pt-gpu -Wl,--as-needed" to
"--extra-libs" when configure ffmpeg.
Signed-off-by: Wenbin Chen <wenbin.chen@intel.com>
Actually, the jaccard distance is defined as D = 1 - intersect / union.
Additionally, the distance value is compared against a constant that
must be between 0 and 1, which is not the case here. Both facts together
has led to the fact, that the function always returned a matching course
signature. To leave the constant intact and to avoid floating point
computation, this commit multiplies with 1 << 16 making the constant
effectively 9000 / (1<<16) =~ 0.14.
Reported-by: Sachin Tilloo <sachin.tilloo@gmail.com>
Reviewed-by: Sachin Tilloo <sachin.tilloo@gmail.com>
Tested-by: Sachin Tilloo <sachin.tilloo@gmail.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
The result might not fit into 32bit if an image has gigantic
dimensions and one of the planes has a dominant value
(particularly so if said value is big).
Fixes Coverity issues #1598399, #1598401, #1598402, #1598403, #1598404.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
For code such as 'model->model = ov_model' is confusing. We can
just drop the member variable and use cast to get the subclass.
Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
Reviewed-by: Wenbin Chen <wenbin.chen@intel.com>
Reviewed-by: Guo Yejun <yejun.guo@intel.com>
It will be freed again by ff_dnn_uninit.
Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
Reviewed-by: Wenbin Chen <wenbin.chen@intel.com>
Reviewed-by: Guo Yejun <yejun.guo@intel.com>
It will be freed again by ff_dnn_uninit.
Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
Reviewed-by: Wenbin Chen <wenbin.chen@intel.com>
Reviewed-by: Guo Yejun <yejun.guo@intel.com>
The earlier code distinguished between a partial reset
(yae_clear()) and a complete reset (yae_release_buffers()
which also releases the buffers); this separation existed
to avoid allocations, as buffers were reallocated on reconfigs.
Yet it is pointless since a5704659e3,
so simply use yae_release_buffers() everywhere.
Reviewed-by: Pavel Koshevoy <pkoshevoy@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Should fix many Coverity false positives, namely #1457947-#1457994
as well as #1461195-#146210.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
This patch trying to resolve mulitiple issues related to parameter
configuration:
Firstly, each DNN filters duplicate DNN_COMMON_OPTIONS, which should
be the common options of backend.
Secondly, backend options are hidden behind the scene. It's a
AV_OPT_TYPE_STRING backend_configs for user, and parsed by each
backend. We don't know each backend support what kind of options
from the help message.
Third, DNN backends duplicate DNN_BACKEND_COMMON_OPTIONS.
Last but not the least, pass backend options via AV_OPT_TYPE_STRING
makes it hard to pass AV_OPT_TYPE_BINARY to backend, if not impossible.
This patch puts backend common options and each backend options inside
DnnContext to reduce code duplication, make options user friendly, and
easy to extend for future usecase.
For example,
./ffmpeg -h filter=dnn_processing
dnn_processing AVOptions:
dnn_backend <int> ..FV....... DNN backend (from INT_MIN to INT_MAX) (default tensorflow)
tensorflow 1 ..FV....... tensorflow backend flag
openvino 2 ..FV....... openvino backend flag
torch 3 ..FV....... torch backend flag
dnn_base AVOptions:
model <string> ..F........ path to model file
input <string> ..F........ input name of the model
output <string> ..F........ output name of the model
backend_configs <string> ..F.......P backend configs (deprecated)
options <string> ..F.......P backend configs (deprecated)
nireq <int> ..F........ number of request (from 0 to INT_MAX) (default 0)
async <boolean> ..F........ use DNN async inference (default true)
device <string> ..F........ device to run model
dnn_tensorflow AVOptions:
sess_config <string> ..F........ config for SessionOptions
dnn_openvino AVOptions:
batch_size <int> ..F........ batch size per request (from 1 to 1000) (default 1)
input_resizable <boolean> ..F........ can input be resizable or not (default false)
layout <int> ..F........ input layout of model (from 0 to 2) (default none)
none 0 ..F........ none
nchw 1 ..F........ nchw
nhwc 2 ..F........ nhwc
scale <float> ..F........ Add scale preprocess operation. Divide each element of input by specified value. (from INT_MIN to INT_MAX) (default 0)
mean <float> ..F........ Add mean preprocess operation. Subtract specified value from each element of input. (from INT_MIN to INT_MAX) (default 0)
dnn_th AVOptions:
optimize <int> ..F........ turn on graph executor optimization (from 0 to 1) (default 0)
Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
Reviewed-by: Wenbin Chen <wenbin.chen@intel.com>
Reviewed-by: Guo Yejun <yejun.guo@intel.com>