Otherwise, passing an UNSPECIFIED frame to am MPEG-only filter graph
would trigger insertion of an unnecessary vf_scale filter, which would
perform a memcpy to convert between the two.
This is safe to do because unspecified YUV frames are already
universally assumed to be MPEG range, in particular by swscale.
Currently, this only affects untagged RGB/XYZ/Gray, which get forced to
their corresponding metadata before entering the filter graph. The main
justification for this change, however, is the planned ability to add
automatic promotion of unspecified yuv to mpeg range yuv.
Notably, this change will never allow accidentally cross-promoting
unspecified to jpeg or to a specific YUV matrix, since that is still
bound by the constraints of YUV range negotiation as set up by
query_formats.
When this filter overrides frame properties, the outgoing frames have
a different YUV colorspace than the incoming ones. This requires
signalling the new colorspace on the outlink, and in particular, making
sure it's *not* set to a common ref with the input - otherwise the point
of this filter would be destroyed.
Untouched fields will continue being passed through, so we don't need to
do anything there.
An oversight in my previous series. This omission slipped under the
radar because fftools/ffmpeg_filter.c did not use these options, instead
preferring to insert an explicit format filter.
We don't need to include fifo.h, because we don't need AVFifo
as a complete type. Also add the other used headers directly.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Besides being extremly simple this also avoids including
ff_ccfifo_ccdetected() unnecessarily (it is only used by decklink).
This is possible because this is not avpriv, but duplicated into
lavd if necessary.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Instead of overriding the frame properties in fill_picture(), advertise
the supported YUV colorspace and range at format negotiation time. (The
correct metadata will now be set automatically by ff_get_video_buffer)
This makes all ff_draw_* based filters aware of YUV colorspaces and
ranges. Needed for YUVJ removal. Also fixes a bug where e.g. vf_pad
would generate a limited range background even after conversion to
full-scale grayscale.
The FATE changes were a consequence of the aforementioned bugfix - the
gray scale files are output as full range (due to conversion by
libswscale, which hard-codes gray = full), and appropriately tagged as
such, but before this change the padded version incorrectly used
a limited range (16) black background for these formats.
Yadif filter assumed that the output timebase is always half of the input
timebase. This is not true if halving the input time base is not representable
as an AVRational causing the output timestamps to be invalidly scaled in such a
case.
So let's use av_reduce instead of av_mul_q when calculating the output time
base and if the conversion is inexact then let's fall back to the original
timebase which probably makes more parctical sense than using x/INT_MAX.
Fixes invalidly scaled pts_time values in this command line:
ffmpeg -f lavfi -i testsrc -vf settb=tb=1/2000000000,yadif,showinfo -f null none
Signed-off-by: Marton Balint <cus@passwd.hu>
The video param change check will print loglines below debug level
for each frame which is different from the inlink parameters. This
can spam the console. It is now printed at warning level once for
each param change else it is kept at debug level.
Partially addresses #10823
Use class confidence instead of box_score to filt boxes, which is more
accurate. Class confidence is obtained by multiplying class probability
distribution and box_score.
Signed-off-by: Wenbin Chen <wenbin.chen@intel.com>
Reviewed-by: Guo Yejun <yejun.guo@intel.com>
For detect and classify output, width and height make no sence, so
change width, height to dims to represent the shape of tensor. Use
layout and dims to get width, height and channel.
Signed-off-by: Wenbin Chen <wenbin.chen@intel.com>
Reviewed-by: Guo Yejun <yejun.guo@intel.com>
Now when using openvino backend, user doesn't need to set input/output
names in command line. Model ports will be automatically detected.
For example:
ffmpeg -i input.png -vf \
dnn_detect=dnn_backend=openvino:model=model.xml:input=image:\
output=detection_out -y output.png
can be simplified to:
ffmpeg -i input.png -vf dnn_detect=dnn_backend=openvino:model=model.xml\
-y output.png
Signed-off-by: Wenbin Chen <wenbin.chen@intel.com>
Reviewed-by: Guo Yejun <yejun.guo@intel.com>
Fixed blackstripe on bottom or segmentation fault in case
when patch width and height differ.
Signed-off-by: Vladimir Petrov <vppetrovmms@gmail.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
hw_frames_ctx on the input link is only set when the input link is
configured, which hasn't happened yet. This temporarily hacks around
the problem (in a way no worse than before the format negotiation
changes) until a proper fix can be applied.
Generalize drawtext utilities to make them usable in other filters.
This will be needed to introduce the QR code source and filter without
duplicating functionality.
Rewrite the format parsing code to make it more easily generalizable. In
particular, `invert_formats` does not depend on the type of format list
passed to it, which allows me to re-use this helper in an upcoming
commit.
Slightly shortens the code, at the sole cost of doing several malloc
(ff_add_format) instead of a single malloc.
This is at odds with the YUV matrix negotiation API, in which such
dynamic changes in YUV encoding are no longer easily possible. There is
also no really strong motivating reason to do this, since the choice of
YUV matrix is essentially arbitrary and not actually related to the
Dolby Vision decoding process.
This filter will always accept any input format, even if the user sets
a specific in_range/in_color_matrix. This is to preserve status quo with
current behavior, where passing a specific in_color_matrix merely
overrides the incoming frames' attributes. (Use `vf_format` to force
a specific input range)
Because changing colorspace and color_range now requires reconfiguring
the link, we can lift sws_setColorspaceDetails out of scale_frame and
into config_props. (This will also get re-called if the input frame
properties change)
To allow adding proper negotiation, in particular, to fftools.
These values will simply be negotiated downstream for YUV formats, and
ignored otherwise.
Motivated by YUVJ removal. This change will allow full negotiation
between color ranges and matrices as needed. By default, all ranges and
matrices are marked as supported.
Because grayscale formats are currently handled very inconsistently (and
in particular, assumed as forced full-range by swscale), we exclude them
from negotiation altogether for the time being, to get this API merged.
After filter negotiation is available, we can relax the
grayscale-is-forced-jpeg restriction again, when it will be more
feasible to do so without breaking a million test cases.
Note that this commit updates one FATE test as a consequence of the
sanity fallback for non-YUV formats. In particular, the test case now
writes rgb24(pc, gbr/unspecified/unspecified) to the matroska file,
instead of rgb24(unspecified/unspecified/unspecified) as before.
Currently, the logic inside the FF_FILTER_FORMATS_QUERY_FUNC branch
prevents this code from running in the event that we have a filter with
a single video input and a single audio output, as the resulting audio
output link will not have its channel counts / samplerates correctly
initialized to their default values, possibly triggering a segfault
downstream.
An example of such a filter is vaf_spectrumsynth. Although this
particular filter already sets up the channel counts and samplerates as
part of the query function and therefore avoids triggering this bug, the
bug still exists in principle. (And importantly, sets a wrong precedent)
Even if a query func is set. This is safe to do, because
ff_default_query_formats is documented not to touch any filter lists
that were already set by the query func.
The reason to do this is because it allows us to extend
AVFilterFormatsConfig without having to touch every filter in existence.
An alternative implementation of this commit would be to explicitly add
a `ff_default_query_formats` call at the end of every query_formats
function, but that would end up functionally equivalent to this change
while touching a whole lot more code paths for no reason.
As a bonus, eliminates some code/logic duplication from this function.
For this kind of model, we can directly use its output as final result
just like ssd model. The difference is that it splits output into two
tensors. [x_min, y_min, x_max, y_max, confidence] and [lable_id].
Model example refer to: https://github.com/openvinotoolkit/open_model_zoo/tree/master/models/intel/person-detection-0106
Signed-off-by: Wenbin Chen <wenbin.chen@intel.com>
Reviewed-by: Guo Yejun <yejun.guo@intel.com>
Add dynamic outputs support. Some models don't have fixed output size.
Its size changes according to result. Now openvino can run these kinds of
models.
Signed-off-by: Wenbin Chen <wenbin.chen@intel.com>
Reviewed-by: Guo Yejun <yejun.guo@intel.com>
Fixes: out of array access:
Fixes: tickets/10745/poc12ffmpeg
Found-by: Li Zeyuan and Zeng Yunxiang.
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Fixes: out of array read
Fixes: tickets/10744/poc11ffmpeg
Found-by: Li Zeyuan and Zeng Yunxiang.
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Fixes: out of array access
Fixes: tickets/10743/poc10ffmpeg
Found-by: Zeng Yunxiang and Li Zeyuan
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Fixes: out of array access
Fixes: tickets/10747/poc14ffmpeg
Found-by: Zeng Yunxiang and Song Jiaxuan
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
The code works in steps of 2 lines and lacks support for odd height
Implementing odd height support is better but for now this fixes the
out of array access
Fixes: out of array access
Fixes: tickets/10702/poc6ffmpe
Found-by: Zeng Yunxiang
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Some encoders (e.g., libx264) dump encoder configuration as user
data unregistered SEI message. This option try to print it as
ascii character when possible.
Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
Unnecessary since acf63d5350adeae551d412db699f8ca03f7e76b9;
also avoids relocations.
Reviewed-by: Anton Khirnov <anton@khirnov.net>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
The difference of yolov4 is that sigmoid function needed to be applied
on x, y coordinates. Also make it compatiple with NHWC output as the
yolov4 model from openvino model zoo has NHWC output layout.
Model refer to: https://github.com/openvinotoolkit/open_model_zoo/tree/master/models/public/yolo-v4-tf
Signed-off-by: Wenbin Chen <wenbin.chen@intel.com>
Reviewed-by: Guo Yejun <yejun.guo@intel.com>
Add yolov3 support. The difference of yolov3 is that it has multiple
outputs in different scale to perform better on both large and small
object.
The model detail refer to: https://github.com/openvinotoolkit/open_model_zoo/tree/master/models/public/yolo-v3-tf
Signed-off-by: Wenbin Chen <wenbin.chen@intel.com>
Reviewed-by: Guo Yejun <yejun.guo@intel.com>
Add input pad to get model input resolution. Detection models always
have fixed input size. And the output coordinators are based on the
input resolution, so we need to get input size to map coordinators to
our real output frames.
Signed-off-by: Wenbin Chen <wenbin.chen@intel.com>
Reviewed-by: Guo Yejun <yejun.guo@intel.com>
Add multiple output support to openvino backend. You can use '&' to
split different output when you set output name using command line.
Signed-off-by: Wenbin Chen <wenbin.chen@intel.com>
Reviewed-by: Guo Yejun <yejun.guo@intel.com>
The libvmaf filter was doing substring checks in place of string equality
comparisons. This led to a bug when the user specified the pooling method
"harmonic_mean", since "mean" was checked first and the substring comparison
returned true. This patch changes all substring comparisons for string equality
comparisons. This is both correct and more efficient than the existing method.
Signed-off-by: nilfm <nilf@netflix.com>
The current logic for detecting frames that are too small for the
algorithm does not account for chroma sub-sampling, and so a sample
where the luma plane is large enough, but the chroma planes are not
will not be rejected. In that event, a heap overflow will occur.
This change adjusts the logic to consider the chroma planes and makes
the change to all three bwdif implementations.
Fixes#10688
Signed-off-by: Cosmin Stejerean <cosmin@cosmin.at>
Reviewed-by: Thomas Mundt <tmundt75@gmail.com>
Signed-off-by: Philip Langdale <philipl@overt.org>
Both qsv encoders and decoders use 4 as the default value of
async_depth, let's use 4 as the default value for vpp_qsv filter too.
Signed-off-by: Haihao Xiang <haihao.xiang@intel.com>
Fixes OOM when caller keeps adding frames into filtergraph
that reached EOF by other means, for example EOF is signalled
by other filter in filtergraph or by buffersink.
Fixes OOM when caller keeps adding frames into filtergraph
that reached EOF by other means, for example EOF is signalled
by other filter in filtergraph or by buffersink.
Add yolo support. Yolo model doesn't output final result. It outputs
candidate boxes, so we need post-process to remove overlap boxes to
get final results. Also, the box's coordinators relate to cell and
anchors, so we need these information to calculate boxes as well.
Model detail please refer to: https://github.com/openvinotoolkit/open_model_zoo/tree/master/models/public/yolo-v2-tf
Signed-off-by: Wenbin Chen <wenbin.chen@intel.com>
Reviewed-by: Guo Yejun <yejun.guo@intel.com>
There are many kinds of detection DNN model and they have different
preprocess and postprocess methods. To support more models,
"model_type" option is added to help to choose preprocess and
postprocess function.
Signed-off-by: Wenbin Chen <wenbin.chen@intel.com>
Reviewed-by: Guo Yejun <yejun.guo@intel.com>
Segmented loads are slow, so here we use unit-strided load and narrowing shifts.
c910:
fcmul_add_c: 2179
fcmul_add_rvv_f64: 1652
c908:
fcmul_add_c: 4891.2
fcmul_add_rvv_f64: 2399.5
Signed-off-by: Rémi Denis-Courmont <remi@remlab.net>
Fixes OOM when caller keeps adding frames into filtergraph
that reached EOF by other means, for example EOF is signalled
by other filter in filtergraph or by buffersink.
When using vf_scale to force a specific output color space, also tag
this on the AVFrame. (Mirroring existing logic for output range)
Move the sanity fix for RGB after the new assignment, to avoid leaking
bogus YUV colorspace metadata for RGB spaces.
No need to write a custom string parser when we can just use an integer
option with preset values. The various bits of fallback logic are wholly
redundant with equivalent logic already inside sws_getCoefficients.
Note: I disallowed setting 'out_color_matrix=auto', because this does
not do anything meaningful in the current code (just hard-codes
AVCOL_SPC_BT470BG fallback).
The check if drawing needs to be initialized and supported formats
should be drawable ones was flawed, as pad_stop/pad_start is only
populated from stop_duration/start_duration after these checks.
To fix that, check the _duration variants as well and for better
readability and maintainability break the check out into its own
helper.
Fixes a regression from 86b252ea9dFix#10621
Signed-off-by: Anton Khirnov <anton@khirnov.net>
We calculated offsets as pairs, but addressed them in the shader
as single float values, while reading them as ivec2s.
Also removes unused code (was provisionally added if cooperative matrices
could be used, but that turned out to be impossible).
Reviewed-by: Sean McGovern <gseanmcg@gmail.com>
Reviewed-by: Nicolas George <george@nsup.org>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Reviewed-by: Sean McGovern <gseanmcg@gmail.com>
Reviewed-by: Nicolas George <george@nsup.org>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Due to making the decode frames context use the coded size, the
filter started to display those artifacts as it reused the input frame's size.
Change it to instead output the real image size for images, not the input.
The code above this does a whitelist on desc->flags, which now includes
the (disallowed) AV_PIX_FMT_FLAG_XYZ for XYZ formats. So there is no
more need for a separate check, here.
These are not supported by the drawing functions at all, and were
incorrectly advertised as supported in the past.
Note: This check is added only to separate the logic change from the API
change in the following commit, and will be removed again after it
becomes redundant.
The vidstab library added support in Nov 2020 for writing/reading
the transforms data in binary in addition to ASCII. The library default
was changed to binary format but no changes were made to the AVfilters
resulting in data file for writing or reading being always opened as text.
This effectively broke the filters.
Option added to vidstabdetect to specify file format and open files in
both filters with the correct attributes.
This logic only covers the case of yuv420p. Extend this logic to cover
*all* vertically subsampled YUV formats, which require the same
interlaced scaling logic.
Fortunately, we can get away with re-using the same code for both JPEG
and MPEG range YUV, because the only difference here is the horizontal
alignment. (Which I omit touching for now, to avoid introducing possibly
unintended changes in default behavior)
Fixes multiplane support on Nvidia.
Also, remove the ENCODE usage, even if the driver signals it as supported.
Currently, it's not used, and when it is used, it'll be gated behind
two extension checks.
Instead, use forward declarations; and in order not to affect
any user include these headers for them, but not internally.
This has the advantage of removing implicit inclusions of these
headers from almost all files providing codecs.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
In this logical branch, fq->queued and fq->allocated must be equal.
Deleting this code will make it easier to understand the logic
(copy the data before tail to the newly requested space)
Signed-off-by: yangyalei <yangyalei@xiaomi.com>
This function is used by the AARCH64 code to filter a few
remaining lines in case the dimensions are not suitably
aligned; it is furthermore also used by checkasm to actually
test the AARCH64 code against. But it is normally not used
and this patch therefore moves it in bwdifdsp.h as a static
inline function so that it can be avoided if possible
or inlined where necessary.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Otherwise checkasm/vf_bwdif.c pulls in vf_bwdif.c and
then all of libavfilter. Besides being bad size-wise
this also has the downside that it pulls in
avpriv_(cga|vga16)_font from libavutil which are marked
as being imported from another library when building
libavfilter as a DLL and this breaks checkasm because
it links both lavfi and lavu statically.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
This already avoids unnecessary indirectly included headers
in the arch-specific vf_bwdif_init.c files; it is also in
preparation for splitting the actual functions out of vf_bwdif.c.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>