- Properly initialize all the mappings to -1 by default.
- Make sure every output channel is assigned exactly once
- Autodetect a native layout when only native channels are present
- Always honor the user specified layout, but make sure the mapping is
compatible with it
The last item is a regression from 4af412be71.
Signed-off-by: Marton Balint <cus@passwd.hu>
In this case in_channel_idx was never set and the default 0 was used.
Suprisingly no one noticed that the respective fate test output was wrong.
Signed-off-by: Marton Balint <cus@passwd.hu>
PyTorch is an open source machine learning framework that accelerates
the path from research prototyping to production deployment. Official
website: https://pytorch.org/. We call the C++ library of PyTorch as
LibTorch, the same below.
To build FFmpeg with LibTorch, please take following steps as
reference:
1. download LibTorch C++ library in
https://pytorch.org/get-started/locally/,
please select C++/Java for language, and other options as your need.
Please download cxx11 ABI version:
(libtorch-cxx11-abi-shared-with-deps-*.zip).
2. unzip the file to your own dir, with command
unzip libtorch-shared-with-deps-latest.zip -d your_dir
3. export libtorch_root/libtorch/include and
libtorch_root/libtorch/include/torch/csrc/api/include to $PATH
export libtorch_root/libtorch/lib/ to $LD_LIBRARY_PATH
4. config FFmpeg with ../configure --enable-libtorch \
--extra-cflag=-I/libtorch_root/libtorch/include \
--extra-cflag=-I/libtorch_root/libtorch/include/torch/csrc/api/include \
--extra-ldflags=-L/libtorch_root/libtorch/lib/
5. make
To run FFmpeg DNN inference with LibTorch backend:
./ffmpeg -i input.jpg -vf \
dnn_processing=dnn_backend=torch:model=LibTorch_model.pt -y output.jpg
The LibTorch_model.pt can be generated by Python with torch.jit.script()
api. https://pytorch.org/tutorials/advanced/cpp_export.html. This is
pytorch official guide about how to convert and load torchscript model.
Please note, torch.jit.trace() is not recommanded, since it does
not support ambiguous input size.
Signed-off-by: Ting Fu <ting.fu@intel.com>
Signed-off-by: Wenbin Chen <wenbin.chen@intel.com>
Reviewed-by: Guo Yejun <yejun.guo@intel.com>
All versions of MSVC that support C11 (namely >= v19.27)
also support the restrict keyword, therefore av_restrict
is no longer necessary since 75697836b1.
Reviewed-by: Martin Storsjö <martin@martin.st>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
At present, consume_update evaluates timeline state on all links for
a multi-input filter. This can lead to the filter being incorrectly
en/dis-abled when evaluation on a frame on a secondary link leads to
a different result than the frame on the current main link next in
line for processing.
In some builds, the following object files could be left behind
after make clean:
./libavfilter/metal/utils.o
./libavfilter/metal/vf_yadif_videotoolbox.metallib.o
./libavcodec/x86/h26x/h2656dsp.o
./libavcodec/neon/mpegvideo.o
./ffbuild/bin2c_host.o
Fixes: http://trac.ffmpeg.org/ticket/10895
Signed-off-by: Martin Storsjö <martin@martin.st>
src/libavfilter/internal.h:255:45: note: passing argument to parameter 'filter' here
int ff_filter_config_links(AVFilterContext *filter);
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Previously to support dynamic reconfigurations of the matrix string (e.g. 0m),
the rdiv values would always be cleared to 0.f, causing the rdiv to be
recalculated based on the new filter. This however had the side effect of
always ignoring user specified rdiv values.
Instead float user_rdiv[0] is added to ConvolutionContext which will store the
user specified rdiv values. Then the original rdiv array will store either the
user_rdiv or the automatically calculated 1/sum.
This fixes trac ticket #10294, #10867.
Signed-off-by: Stone Chen <chen.stonechen@gmail.com>
Signed-off-by: Marton Balint <cus@passwd.hu>
Avoids ugly casts when uninitializing.
(One could actually avoid allocating this separately if one
were willing to expose FFFramePool to those files including
link_internal.h.)
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Also make FFFilterGraph.sink_links a FilterLinkInternal**
because sink_links is used to access FilterLinkInternal
fields.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
(These fields were in AVFilterGraph although AVFilterGraphInternal
existed for years.)
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
To do this, allocate AVFilterGraphInternal jointly with AVFilterGraph
and rename it to FFFilterGraph in the process (similarly to
AVStream/FFStream).
The AVFilterGraphInternal* will be removed on the next major version
bump.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
This commit moves the generic-layer stuff (that is not used
by filters) to a new header of its own, similarly to
5e7b5b0090 for libavcodec.
thread.h and link_internal.h are merged into this header.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
To do this, allocate AVFilterInternal jointly with AVFilterContext
and rename it to FFFilterContext in the process (similarly to
AVStream/FFStream).
The AVFilterInternal* will be removed from AVFilterContext
on the next major bump.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
These logs use the wrong loglevel and are uninformative;
and it is of course highly unlikely that a buffer of 56B
can't be allocated.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
This hack is used to limit the visibility of some AVFilterLink fields to
only certain files. Replace it with the same pattern that is used e.g.
in lavf AVStream/FFStream and avoid exposing these internal fields in a
public header completely.
Makes it robust against adding fields before it, which will be useful in
following commits.
Majority of the patch generated by the following Coccinelle script:
@@
typedef AVOption;
identifier arr_name;
initializer list il;
initializer list[8] il1;
expression tail;
@@
AVOption arr_name[] = { il, { il1,
- tail
+ .unit = tail
}, ... };
with some manual changes, as the script:
* has trouble with options defined inside macros
* sometimes does not handle options under an #else branch
* sometimes swallows whitespace
avfilter_insert_filter() already reports (also with AV_LOG_VERBOSE)
when a filter is auto-inserted.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Otherwise, passing an UNSPECIFIED frame to am MPEG-only filter graph
would trigger insertion of an unnecessary vf_scale filter, which would
perform a memcpy to convert between the two.
This is safe to do because unspecified YUV frames are already
universally assumed to be MPEG range, in particular by swscale.
Currently, this only affects untagged RGB/XYZ/Gray, which get forced to
their corresponding metadata before entering the filter graph. The main
justification for this change, however, is the planned ability to add
automatic promotion of unspecified yuv to mpeg range yuv.
Notably, this change will never allow accidentally cross-promoting
unspecified to jpeg or to a specific YUV matrix, since that is still
bound by the constraints of YUV range negotiation as set up by
query_formats.
When this filter overrides frame properties, the outgoing frames have
a different YUV colorspace than the incoming ones. This requires
signalling the new colorspace on the outlink, and in particular, making
sure it's *not* set to a common ref with the input - otherwise the point
of this filter would be destroyed.
Untouched fields will continue being passed through, so we don't need to
do anything there.
An oversight in my previous series. This omission slipped under the
radar because fftools/ffmpeg_filter.c did not use these options, instead
preferring to insert an explicit format filter.
We don't need to include fifo.h, because we don't need AVFifo
as a complete type. Also add the other used headers directly.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Besides being extremly simple this also avoids including
ff_ccfifo_ccdetected() unnecessarily (it is only used by decklink).
This is possible because this is not avpriv, but duplicated into
lavd if necessary.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Instead of overriding the frame properties in fill_picture(), advertise
the supported YUV colorspace and range at format negotiation time. (The
correct metadata will now be set automatically by ff_get_video_buffer)
This makes all ff_draw_* based filters aware of YUV colorspaces and
ranges. Needed for YUVJ removal. Also fixes a bug where e.g. vf_pad
would generate a limited range background even after conversion to
full-scale grayscale.
The FATE changes were a consequence of the aforementioned bugfix - the
gray scale files are output as full range (due to conversion by
libswscale, which hard-codes gray = full), and appropriately tagged as
such, but before this change the padded version incorrectly used
a limited range (16) black background for these formats.
Yadif filter assumed that the output timebase is always half of the input
timebase. This is not true if halving the input time base is not representable
as an AVRational causing the output timestamps to be invalidly scaled in such a
case.
So let's use av_reduce instead of av_mul_q when calculating the output time
base and if the conversion is inexact then let's fall back to the original
timebase which probably makes more parctical sense than using x/INT_MAX.
Fixes invalidly scaled pts_time values in this command line:
ffmpeg -f lavfi -i testsrc -vf settb=tb=1/2000000000,yadif,showinfo -f null none
Signed-off-by: Marton Balint <cus@passwd.hu>
The video param change check will print loglines below debug level
for each frame which is different from the inlink parameters. This
can spam the console. It is now printed at warning level once for
each param change else it is kept at debug level.
Partially addresses #10823
Use class confidence instead of box_score to filt boxes, which is more
accurate. Class confidence is obtained by multiplying class probability
distribution and box_score.
Signed-off-by: Wenbin Chen <wenbin.chen@intel.com>
Reviewed-by: Guo Yejun <yejun.guo@intel.com>
For detect and classify output, width and height make no sence, so
change width, height to dims to represent the shape of tensor. Use
layout and dims to get width, height and channel.
Signed-off-by: Wenbin Chen <wenbin.chen@intel.com>
Reviewed-by: Guo Yejun <yejun.guo@intel.com>
Now when using openvino backend, user doesn't need to set input/output
names in command line. Model ports will be automatically detected.
For example:
ffmpeg -i input.png -vf \
dnn_detect=dnn_backend=openvino:model=model.xml:input=image:\
output=detection_out -y output.png
can be simplified to:
ffmpeg -i input.png -vf dnn_detect=dnn_backend=openvino:model=model.xml\
-y output.png
Signed-off-by: Wenbin Chen <wenbin.chen@intel.com>
Reviewed-by: Guo Yejun <yejun.guo@intel.com>
Fixed blackstripe on bottom or segmentation fault in case
when patch width and height differ.
Signed-off-by: Vladimir Petrov <vppetrovmms@gmail.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
hw_frames_ctx on the input link is only set when the input link is
configured, which hasn't happened yet. This temporarily hacks around
the problem (in a way no worse than before the format negotiation
changes) until a proper fix can be applied.
Generalize drawtext utilities to make them usable in other filters.
This will be needed to introduce the QR code source and filter without
duplicating functionality.
Rewrite the format parsing code to make it more easily generalizable. In
particular, `invert_formats` does not depend on the type of format list
passed to it, which allows me to re-use this helper in an upcoming
commit.
Slightly shortens the code, at the sole cost of doing several malloc
(ff_add_format) instead of a single malloc.
This is at odds with the YUV matrix negotiation API, in which such
dynamic changes in YUV encoding are no longer easily possible. There is
also no really strong motivating reason to do this, since the choice of
YUV matrix is essentially arbitrary and not actually related to the
Dolby Vision decoding process.
This filter will always accept any input format, even if the user sets
a specific in_range/in_color_matrix. This is to preserve status quo with
current behavior, where passing a specific in_color_matrix merely
overrides the incoming frames' attributes. (Use `vf_format` to force
a specific input range)
Because changing colorspace and color_range now requires reconfiguring
the link, we can lift sws_setColorspaceDetails out of scale_frame and
into config_props. (This will also get re-called if the input frame
properties change)
To allow adding proper negotiation, in particular, to fftools.
These values will simply be negotiated downstream for YUV formats, and
ignored otherwise.
Motivated by YUVJ removal. This change will allow full negotiation
between color ranges and matrices as needed. By default, all ranges and
matrices are marked as supported.
Because grayscale formats are currently handled very inconsistently (and
in particular, assumed as forced full-range by swscale), we exclude them
from negotiation altogether for the time being, to get this API merged.
After filter negotiation is available, we can relax the
grayscale-is-forced-jpeg restriction again, when it will be more
feasible to do so without breaking a million test cases.
Note that this commit updates one FATE test as a consequence of the
sanity fallback for non-YUV formats. In particular, the test case now
writes rgb24(pc, gbr/unspecified/unspecified) to the matroska file,
instead of rgb24(unspecified/unspecified/unspecified) as before.
Currently, the logic inside the FF_FILTER_FORMATS_QUERY_FUNC branch
prevents this code from running in the event that we have a filter with
a single video input and a single audio output, as the resulting audio
output link will not have its channel counts / samplerates correctly
initialized to their default values, possibly triggering a segfault
downstream.
An example of such a filter is vaf_spectrumsynth. Although this
particular filter already sets up the channel counts and samplerates as
part of the query function and therefore avoids triggering this bug, the
bug still exists in principle. (And importantly, sets a wrong precedent)
Even if a query func is set. This is safe to do, because
ff_default_query_formats is documented not to touch any filter lists
that were already set by the query func.
The reason to do this is because it allows us to extend
AVFilterFormatsConfig without having to touch every filter in existence.
An alternative implementation of this commit would be to explicitly add
a `ff_default_query_formats` call at the end of every query_formats
function, but that would end up functionally equivalent to this change
while touching a whole lot more code paths for no reason.
As a bonus, eliminates some code/logic duplication from this function.
For this kind of model, we can directly use its output as final result
just like ssd model. The difference is that it splits output into two
tensors. [x_min, y_min, x_max, y_max, confidence] and [lable_id].
Model example refer to: https://github.com/openvinotoolkit/open_model_zoo/tree/master/models/intel/person-detection-0106
Signed-off-by: Wenbin Chen <wenbin.chen@intel.com>
Reviewed-by: Guo Yejun <yejun.guo@intel.com>
Add dynamic outputs support. Some models don't have fixed output size.
Its size changes according to result. Now openvino can run these kinds of
models.
Signed-off-by: Wenbin Chen <wenbin.chen@intel.com>
Reviewed-by: Guo Yejun <yejun.guo@intel.com>
Fixes: out of array access:
Fixes: tickets/10745/poc12ffmpeg
Found-by: Li Zeyuan and Zeng Yunxiang.
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Fixes: out of array read
Fixes: tickets/10744/poc11ffmpeg
Found-by: Li Zeyuan and Zeng Yunxiang.
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Fixes: out of array access
Fixes: tickets/10743/poc10ffmpeg
Found-by: Zeng Yunxiang and Li Zeyuan
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Fixes: out of array access
Fixes: tickets/10747/poc14ffmpeg
Found-by: Zeng Yunxiang and Song Jiaxuan
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
The code works in steps of 2 lines and lacks support for odd height
Implementing odd height support is better but for now this fixes the
out of array access
Fixes: out of array access
Fixes: tickets/10702/poc6ffmpe
Found-by: Zeng Yunxiang
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Some encoders (e.g., libx264) dump encoder configuration as user
data unregistered SEI message. This option try to print it as
ascii character when possible.
Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
Unnecessary since acf63d5350adeae551d412db699f8ca03f7e76b9;
also avoids relocations.
Reviewed-by: Anton Khirnov <anton@khirnov.net>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
The difference of yolov4 is that sigmoid function needed to be applied
on x, y coordinates. Also make it compatiple with NHWC output as the
yolov4 model from openvino model zoo has NHWC output layout.
Model refer to: https://github.com/openvinotoolkit/open_model_zoo/tree/master/models/public/yolo-v4-tf
Signed-off-by: Wenbin Chen <wenbin.chen@intel.com>
Reviewed-by: Guo Yejun <yejun.guo@intel.com>
Add yolov3 support. The difference of yolov3 is that it has multiple
outputs in different scale to perform better on both large and small
object.
The model detail refer to: https://github.com/openvinotoolkit/open_model_zoo/tree/master/models/public/yolo-v3-tf
Signed-off-by: Wenbin Chen <wenbin.chen@intel.com>
Reviewed-by: Guo Yejun <yejun.guo@intel.com>
Add input pad to get model input resolution. Detection models always
have fixed input size. And the output coordinators are based on the
input resolution, so we need to get input size to map coordinators to
our real output frames.
Signed-off-by: Wenbin Chen <wenbin.chen@intel.com>
Reviewed-by: Guo Yejun <yejun.guo@intel.com>
Add multiple output support to openvino backend. You can use '&' to
split different output when you set output name using command line.
Signed-off-by: Wenbin Chen <wenbin.chen@intel.com>
Reviewed-by: Guo Yejun <yejun.guo@intel.com>
The libvmaf filter was doing substring checks in place of string equality
comparisons. This led to a bug when the user specified the pooling method
"harmonic_mean", since "mean" was checked first and the substring comparison
returned true. This patch changes all substring comparisons for string equality
comparisons. This is both correct and more efficient than the existing method.
Signed-off-by: nilfm <nilf@netflix.com>
The current logic for detecting frames that are too small for the
algorithm does not account for chroma sub-sampling, and so a sample
where the luma plane is large enough, but the chroma planes are not
will not be rejected. In that event, a heap overflow will occur.
This change adjusts the logic to consider the chroma planes and makes
the change to all three bwdif implementations.
Fixes#10688
Signed-off-by: Cosmin Stejerean <cosmin@cosmin.at>
Reviewed-by: Thomas Mundt <tmundt75@gmail.com>
Signed-off-by: Philip Langdale <philipl@overt.org>
Both qsv encoders and decoders use 4 as the default value of
async_depth, let's use 4 as the default value for vpp_qsv filter too.
Signed-off-by: Haihao Xiang <haihao.xiang@intel.com>
Fixes OOM when caller keeps adding frames into filtergraph
that reached EOF by other means, for example EOF is signalled
by other filter in filtergraph or by buffersink.
Fixes OOM when caller keeps adding frames into filtergraph
that reached EOF by other means, for example EOF is signalled
by other filter in filtergraph or by buffersink.