The difference of yolov4 is that sigmoid function needed to be applied
on x, y coordinates. Also make it compatiple with NHWC output as the
yolov4 model from openvino model zoo has NHWC output layout.
Model refer to: https://github.com/openvinotoolkit/open_model_zoo/tree/master/models/public/yolo-v4-tf
Signed-off-by: Wenbin Chen <wenbin.chen@intel.com>
Reviewed-by: Guo Yejun <yejun.guo@intel.com>
Add yolov3 support. The difference of yolov3 is that it has multiple
outputs in different scale to perform better on both large and small
object.
The model detail refer to: https://github.com/openvinotoolkit/open_model_zoo/tree/master/models/public/yolo-v3-tf
Signed-off-by: Wenbin Chen <wenbin.chen@intel.com>
Reviewed-by: Guo Yejun <yejun.guo@intel.com>
Add input pad to get model input resolution. Detection models always
have fixed input size. And the output coordinators are based on the
input resolution, so we need to get input size to map coordinators to
our real output frames.
Signed-off-by: Wenbin Chen <wenbin.chen@intel.com>
Reviewed-by: Guo Yejun <yejun.guo@intel.com>
Add multiple output support to openvino backend. You can use '&' to
split different output when you set output name using command line.
Signed-off-by: Wenbin Chen <wenbin.chen@intel.com>
Reviewed-by: Guo Yejun <yejun.guo@intel.com>
The libvmaf filter was doing substring checks in place of string equality
comparisons. This led to a bug when the user specified the pooling method
"harmonic_mean", since "mean" was checked first and the substring comparison
returned true. This patch changes all substring comparisons for string equality
comparisons. This is both correct and more efficient than the existing method.
Signed-off-by: nilfm <nilf@netflix.com>
The current logic for detecting frames that are too small for the
algorithm does not account for chroma sub-sampling, and so a sample
where the luma plane is large enough, but the chroma planes are not
will not be rejected. In that event, a heap overflow will occur.
This change adjusts the logic to consider the chroma planes and makes
the change to all three bwdif implementations.
Fixes#10688
Signed-off-by: Cosmin Stejerean <cosmin@cosmin.at>
Reviewed-by: Thomas Mundt <tmundt75@gmail.com>
Signed-off-by: Philip Langdale <philipl@overt.org>
Both qsv encoders and decoders use 4 as the default value of
async_depth, let's use 4 as the default value for vpp_qsv filter too.
Signed-off-by: Haihao Xiang <haihao.xiang@intel.com>
Fixes OOM when caller keeps adding frames into filtergraph
that reached EOF by other means, for example EOF is signalled
by other filter in filtergraph or by buffersink.
Fixes OOM when caller keeps adding frames into filtergraph
that reached EOF by other means, for example EOF is signalled
by other filter in filtergraph or by buffersink.
Add yolo support. Yolo model doesn't output final result. It outputs
candidate boxes, so we need post-process to remove overlap boxes to
get final results. Also, the box's coordinators relate to cell and
anchors, so we need these information to calculate boxes as well.
Model detail please refer to: https://github.com/openvinotoolkit/open_model_zoo/tree/master/models/public/yolo-v2-tf
Signed-off-by: Wenbin Chen <wenbin.chen@intel.com>
Reviewed-by: Guo Yejun <yejun.guo@intel.com>
There are many kinds of detection DNN model and they have different
preprocess and postprocess methods. To support more models,
"model_type" option is added to help to choose preprocess and
postprocess function.
Signed-off-by: Wenbin Chen <wenbin.chen@intel.com>
Reviewed-by: Guo Yejun <yejun.guo@intel.com>
Segmented loads are slow, so here we use unit-strided load and narrowing shifts.
c910:
fcmul_add_c: 2179
fcmul_add_rvv_f64: 1652
c908:
fcmul_add_c: 4891.2
fcmul_add_rvv_f64: 2399.5
Signed-off-by: Rémi Denis-Courmont <remi@remlab.net>
Fixes OOM when caller keeps adding frames into filtergraph
that reached EOF by other means, for example EOF is signalled
by other filter in filtergraph or by buffersink.
When using vf_scale to force a specific output color space, also tag
this on the AVFrame. (Mirroring existing logic for output range)
Move the sanity fix for RGB after the new assignment, to avoid leaking
bogus YUV colorspace metadata for RGB spaces.
No need to write a custom string parser when we can just use an integer
option with preset values. The various bits of fallback logic are wholly
redundant with equivalent logic already inside sws_getCoefficients.
Note: I disallowed setting 'out_color_matrix=auto', because this does
not do anything meaningful in the current code (just hard-codes
AVCOL_SPC_BT470BG fallback).
The check if drawing needs to be initialized and supported formats
should be drawable ones was flawed, as pad_stop/pad_start is only
populated from stop_duration/start_duration after these checks.
To fix that, check the _duration variants as well and for better
readability and maintainability break the check out into its own
helper.
Fixes a regression from 86b252ea9dFix#10621
Signed-off-by: Anton Khirnov <anton@khirnov.net>
We calculated offsets as pairs, but addressed them in the shader
as single float values, while reading them as ivec2s.
Also removes unused code (was provisionally added if cooperative matrices
could be used, but that turned out to be impossible).
Reviewed-by: Sean McGovern <gseanmcg@gmail.com>
Reviewed-by: Nicolas George <george@nsup.org>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Reviewed-by: Sean McGovern <gseanmcg@gmail.com>
Reviewed-by: Nicolas George <george@nsup.org>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Due to making the decode frames context use the coded size, the
filter started to display those artifacts as it reused the input frame's size.
Change it to instead output the real image size for images, not the input.
The code above this does a whitelist on desc->flags, which now includes
the (disallowed) AV_PIX_FMT_FLAG_XYZ for XYZ formats. So there is no
more need for a separate check, here.
These are not supported by the drawing functions at all, and were
incorrectly advertised as supported in the past.
Note: This check is added only to separate the logic change from the API
change in the following commit, and will be removed again after it
becomes redundant.
The vidstab library added support in Nov 2020 for writing/reading
the transforms data in binary in addition to ASCII. The library default
was changed to binary format but no changes were made to the AVfilters
resulting in data file for writing or reading being always opened as text.
This effectively broke the filters.
Option added to vidstabdetect to specify file format and open files in
both filters with the correct attributes.
This logic only covers the case of yuv420p. Extend this logic to cover
*all* vertically subsampled YUV formats, which require the same
interlaced scaling logic.
Fortunately, we can get away with re-using the same code for both JPEG
and MPEG range YUV, because the only difference here is the horizontal
alignment. (Which I omit touching for now, to avoid introducing possibly
unintended changes in default behavior)
Fixes multiplane support on Nvidia.
Also, remove the ENCODE usage, even if the driver signals it as supported.
Currently, it's not used, and when it is used, it'll be gated behind
two extension checks.
Instead, use forward declarations; and in order not to affect
any user include these headers for them, but not internally.
This has the advantage of removing implicit inclusions of these
headers from almost all files providing codecs.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
In this logical branch, fq->queued and fq->allocated must be equal.
Deleting this code will make it easier to understand the logic
(copy the data before tail to the newly requested space)
Signed-off-by: yangyalei <yangyalei@xiaomi.com>
This function is used by the AARCH64 code to filter a few
remaining lines in case the dimensions are not suitably
aligned; it is furthermore also used by checkasm to actually
test the AARCH64 code against. But it is normally not used
and this patch therefore moves it in bwdifdsp.h as a static
inline function so that it can be avoided if possible
or inlined where necessary.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Otherwise checkasm/vf_bwdif.c pulls in vf_bwdif.c and
then all of libavfilter. Besides being bad size-wise
this also has the downside that it pulls in
avpriv_(cga|vga16)_font from libavutil which are marked
as being imported from another library when building
libavfilter as a DLL and this breaks checkasm because
it links both lavfi and lavu statically.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
This already avoids unnecessary indirectly included headers
in the arch-specific vf_bwdif_init.c files; it is also in
preparation for splitting the actual functions out of vf_bwdif.c.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Dnn models has different data preprocess requirements. Scale and mean
parameters are added to preprocess input data.
Signed-off-by: Wenbin Chen <wenbin.chen@intel.com>
Dnn models have different input layout (NCHW or NHWC), so a
"layout" option is added
Use openvino's API to do layout conversion for input data. Use swscale
to do layout conversion for output data as openvino doesn't have
similiar C API for output.
Signed-off-by: Wenbin Chen <wenbin.chen@intel.com>
The PSNR filter uses the pixel values without considering
the color ranges. This is incorrect. Patch adds a warning
so at least the user knows it.
Let's see an example:
(1) Let's get a simple black pixel/white pixel image.
```
$ echo -n -e "\x00\x00\x00\xff\xff\xff" > /tmp/foo.rgb24
```
(2) From this image, let's distill full and limited range y4m
copies.
```
$ ffmpeg -y -f rawvideo -video_size 2x1 -pix_fmt rgb24 -i /tmp/foo.rgb24 -vf scale="out_range=full" -pix_fmt yuv420p /tmp/foo.full.y4m
$ xxd /tmp/foo.full.y4m
00000000: 5955 5634 4d50 4547 3220 5732 2048 3120 YUV4MPEG2 W2 H1
00000010: 4632 353a 3120 4970 2041 303a 3020 4334 F25:1 Ip A0:0 C4
00000020: 3230 6a70 6567 2058 5953 4353 533d 3432 20jpeg XYSCSS=42
00000030: 304a 5045 4720 5843 4f4c 4f52 5241 4e47 0JPEG XCOLORRANG
00000040: 453d 4655 4c4c 0a46 5241 4d45 0a00 ff80 E=FULL.FRAME....
00000050: 80 .
```
and
```
$ ffmpeg -y -f rawvideo -video_size 2x1 -pix_fmt rgb24 -i /tmp/foo.rgb24 -vf scale="out_range=limited" -pix_fmt yuv420p /tmp/foo.limited.y4m
$ xxd /tmp/foo.limited.y4m
00000000: 5955 5634 4d50 4547 3220 5732 2048 3120 YUV4MPEG2 W2 H1
00000010: 4632 353a 3120 4970 2041 303a 3020 4334 F25:1 Ip A0:0 C4
00000020: 3230 6a70 6567 2058 5953 4353 533d 3432 20jpeg XYSCSS=42
00000030: 304a 5045 4720 5843 4f4c 4f52 5241 4e47 0JPEG XCOLORRANG
00000040: 453d 4c49 4d49 5445 440a 4652 414d 450a E=LIMITED.FRAME.
00000050: 10eb 8080 ....
```
Note that the 2x images are the same (both have 1x pixel at the
darkest black, and one at the brightest white). Only difference
is the range.
(3) Let's calculate the PSNR score:
```
$ ./ffmpeg -filter_threads 1 -filter_complex_threads 1 -i /tmp/foo.full.y4m -i /tmp/foo.limited.y4m -lavfi "psnr" -f null -
...
[Parsed_psnr_0 @ 0x2f5dac0] PSNR y:22.972065 u:inf v:inf average:25.982365 min:25.982365 max:25.982365
```
As we are comparing an image with itself, we expect "y:inf" as the
luma PSNR. Issue here is that the PSNR filter just uses the pixel
values, ignoring the color ranges.
A possible solution would be to have the filter do the conversion.
Proposed solution is to add a warning.
```
$ ./ffmpeg -filter_threads 1 -filter_complex_threads 1 -i /tmp/foo.full.y4m -i /tmp/foo.limited.y4m -lavfi "psnr" -f null -
...
[Parsed_psnr_0 @ 0x2f5dac0] master and reference frames use different color ranges (pc != tv)
...
[Parsed_psnr_0 @ 0x2f5dac0] PSNR y:22.972065 u:inf v:inf average:25.982365 min:25.982365 max:25.982365
```
Tested:
Ran fate.
```
$ make fate -j
...
TEST seek-lavf-ppmpipe
TEST seek-lavf-pgmpipe
TEST seek-lavf-mxf_opatom
```
When ov_model_const_input_by_name/ov_model_const_output_by_name
failed, input_port/output_port can be wild pointer.
Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
av_image_copy() accepts const uint8_t* const * as source;
lots of user have uint8_t* const * and therefore either
cast (the majority) or copy the array of pointers.
This commit changes this by adding a static inline wrapper
for av_image_copy() that casts between the two types
so that we do not need to add casts everywhere else.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
These casts cast const away temporarily; they are safe, because
the pointers that are initialized point to const data. But this
is nevertheless not nice and leads to warnings when using
-Wcast-qual. blend_modes.c generates 546 (2*39*7) such warnings
which is the majority of such warnings for FFmpeg as a whole.
vf_blend.c and vf_blend_init.h also use this pattern;
they have also been changed.
Reviewed-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
The const makes no sense and is later cast away.
Reviewed-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
It is not even clear what these emms_c() are supposed to achieve:
create_filtergraph() (where it may be called) does not call
ASM functions itself; it merely sets some function pointers.
Furthermore, there are no colorspacedsp functions using MMX
(checked by checkasm which does not use declare_new_emms()).
Finally, checking whether to issue emms_c() is overblown anyway.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Can be used to configure libplacebo's underlying raw options, which
sometimes includes new or advanced / in-depth settings not (yet) exposed
by vf_libplacebo.
This new upstream struct simplifies params struct management by allowing
them to all be contained in a single dynamically allocated struct. This
commit switches to the new API in a backwards-compatible way.
The only nontrivial change that was required was to handle
`sigmoid_params` in a way consistent with the rest of the params
structs, instead of setting it directly to the upstream default.
OpenVINO API 2.0 was released in March 2022, which introduced new
features.
This commit implements current OpenVINO features with new 2.0 APIs. And
will add other features in API 2.0.
Please add installation path, which include openvino.pc, to
PKG_CONFIG_PATH mannually for new OpenVINO libs config.
Signed-off-by: Ting Fu <ting.fu@intel.com>
Signed-off-by: Wenbin Chen <wenbin.chen@intel.com>
This commit fixes bug #10495
The code had several bugs related to post-loop compensation code:
- test assembly instruction performs bitwise AND operation and
generate flags used by jz branch instruction. Wrong test condition
leads to incorrect branching
- Incorrect compensation code for some branches
Signed-off-by: Evgeny Pavlov <lucenticus@gmail.com>
If user doesn't set framerate when he creates a filter, the filter uses
default framerate {0, 1}. This causes error when setting timebase to
1/framerate. Now change it to pass inlink->time_base to outlink when
framerate is not set.
This patch fixes ticket: #10476#10468
Signed-off-by: Wenbin Chen <wenbin.chen@intel.com>
Including winsock2.h or windows.h without WIN32_LEAN_AND_MEAN cause
bzlib.h to parse as nonsense, due to an instance of #define char small
in rpcndr.h.
See:
https://stackoverflow.com/a/27794577
Signed-off-by: L. E. Segovia <amy@amyspark.me>
Signed-off-by: Martin Storsjö <martin@martin.st>
A filter needs formats.h iff it uses FILTER_QUERY_FUNC();
since lots of filters have been switched to use something
else than FILTER_QUERY_FUNC, they don't need it any more,
but removing this header has been forgotten.
This commit does this; files with formats.h inclusion went down
from 304 to 139 here (it were 449 before the preceding commit).
While just at it, also improve the other headers a bit.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
internal.h doesn't rely on it; instead include it directly
in every user that needs it (a filter needing it is basically
equivalent to it using FILTER_QUERY_FUNC, i.e. a majority of
filters doesn't need it).
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
This file is only used by the greyedge filter and therefore
only compiled if said filter is enabled. This also allows
to remove a config_components.h inclusion, avoiding unnecessary
rebuilds.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Lots of video filters use a very simple input or output:
An array with a single AVFilterPad whose name is "default"
and whose type is AVMEDIA_TYPE_VIDEO; everything else is unset.
Given that we never use pointer equality for inputs or outputs*,
we can simply use a single AVFilterPad instead of dozens; this
even saves .data.rel.ro (8312B here) as well as relocations.
*: In fact, several filters (like the filters in vf_lut.c)
already use the same outputs; furthermore, ff_filter_alloc()
duplicates the input and output pads so that we do not even
work with the pads directly.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
internal.h does not depend on video.h (and should not depend on it)
and therefore should not include video.h at all; instead all users
of video.h should include it directly.
Doing so also avoids unnecessary video.h inclusions in files that
don't need it, like most audio filters.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Lots of audio filters use very simple inputs or outputs:
An array with a single AVFilterPad whose name is "default"
and whose type is AVMEDIA_TYPE_AUDIO; everything else is unset.
Given that we never use pointer equality for inputs or outputs*,
we can simply use a single AVFilterPad instead of dozens; this
even saves .data.rel.ro (4784B here) as well as relocations.
*: In fact, several filters (like the filters in af_biquads.c)
already use the same inputs; furthermore, ff_filter_alloc()
duplicates the input and output pads so that we do not even
work with the pads directly.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Otherwise the var_names and the corresponding enum will be off
and e.g. the array holding the variable values will be too small.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
User may set color range / matrix coefficient set / primaries / transfer
characteristics for output.
Signed-off-by: Haihao Xiang <haihao.xiang@intel.com>