In cases where the execution inside the function execute_model_ov fails,
the OVRequestItem must be pushed back to the request_queue before returning
the error. In case pushing back fails, release the allocated memory.
Signed-off-by: Shubhanshu Saxena <shubhanshu.e01@gmail.com>
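A minimal sketch of this error path (modeled on the commit text; the SafeQueue helper's exact return convention and the cleanup shown are assumptions, not the committed code):

    /* on failure inside execute_model_ov(): try to re-queue the item */
    if (ff_safe_queue_push_back(ov_model->request_queue, request) < 0) {
        /* re-queueing failed too: free the item ourselves to avoid a leak */
        av_freep(&request);
    }
    return DNN_ERROR; /* propagate the original failure */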
Fixes: floating point division by 0
Fixes: -nan is outside the range of representable values of type 'int'
Fixes: Ticket8307
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Fixes: 1.04064e+10 is outside the range of representable values of type 'int'
Fixes: Ticket 8279
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
It does not modify anything; it only returns a value, so it fulfills
the requirements for av_pure.
The deeper rationale behind this change is that this function is quite
often called inside arguments to FFMIN, which may lead to two calls to
it; declaring this function as av_pure allows the compiler to optimize
the second call away.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
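As a hedged illustration of why this matters (av_pure and FFMIN are real libavutil macros; the context struct and accessors below are hypothetical):

    #include "libavutil/attributes.h"
    #include "libavutil/macros.h"

    typedef struct SomeContext { int block_size; } SomeContext; /* hypothetical */

    /* reads state, writes nothing: eligible for av_pure */
    static av_pure int get_block_size(const SomeContext *ctx)
    {
        return ctx->block_size;
    }

    static int bytes_to_copy(const SomeContext *ctx, int remaining)
    {
        /* FFMIN(a,b) evaluates each argument twice; without av_pure the
         * get_block_size() call below could be emitted twice, with it
         * the compiler may reuse the first result. */
        return FFMIN(get_block_size(ctx), remaining);
    }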
Avoids adding empty "Channel" or "Overall" header lines to the log output
when measurement is restricted to one scope using
"measure_perchannel=none" or "measure_overall=none".
Signed-off-by: Tobias Rapp <t.rapp@noa-archive.com>
This commit adds error handling that releases the allocated memory
resources when an error occurs.
Signed-off-by: Shubhanshu Saxena <shubhanshu.e01@gmail.com>
This commit rearranges the existing code to create a separate function
for the completion callback in execute_model_tf.
Signed-off-by: Shubhanshu Saxena <shubhanshu.e01@gmail.com>
This commit rearranges the existing code to create a separate function
for filling the request with execution data.
Signed-off-by: Shubhanshu Saxena <shubhanshu.e01@gmail.com>
This commit uses TFRequestItem and the existing sync execution
mechanism to switch to request-based execution. This will help in
adding async functionality to the TensorFlow backend later.
Signed-off-by: Shubhanshu Saxena <shubhanshu.e01@gmail.com>
This commit introduces a typedef TFInferRequest to store
execution parameters for a single call to the TensorFlow C API.
This typedef is used in the TFRequestItem.
Signed-off-by: Shubhanshu Saxena <shubhanshu.e01@gmail.com>
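A sketch of the typedef's likely shape (TF_Output, TF_Tensor and TF_SessionRun are real TensorFlow C API names; the exact field set is an assumption based on the commit text):

    #include <tensorflow/c/c_api.h>

    typedef struct TFInferRequest {
        TF_Output *tf_outputs;       /* output ops to fetch in TF_SessionRun() */
        TF_Tensor **output_tensors;  /* tensors produced by the session run */
        TF_Output *tf_input;         /* input op */
        TF_Tensor *input_tensor;     /* tensor holding the input frame data */
    } TFInferRequest;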
This commit uses the common TaskItem and InferenceItem typedefs
for execution in the TensorFlow backend.
Signed-off-by: Shubhanshu Saxena <shubhanshu.e01@gmail.com>
The early return caused issues for the "add" mode (fixed in
c95dfe5cce), and the "select" mode needs a similar fix. It is probably
better to remove the check entirely, since all modes work correctly
with NULL metadata.
Signed-off-by: Marton Balint <cus@passwd.hu>
In dd770883e9, support for expressions was added. Among the constants
added were the labels qnstc, qpal, sntsc & spal.
These were added in ba2a8cb40b to represent parameter permutations where
only the resolution is different. They see no current usage and don't
represent any industry standard or convention in terms of frame rate.
In cases where the execution inside the function execute_model_ov fails,
push the RequestItem back to the request_queue before returning the error.
If pushing back fails, release the allocated memory.
Signed-off-by: Shubhanshu Saxena <shubhanshu.e01@gmail.com>
Support a single input for the guided filter by adding a guidance mode.
If guidance mode is off, a single input is required and edge-preserving
smoothing is performed. If the mode is on, two inputs are needed, with
the second input serving as the guidance. In this mode, more tasks are
supported, such as detail enhancement, dehazing and so on.
Signed-off-by: Xuewei Meng <xwmeng96@gmail.com>
Reviewed-by: Steven Liu <lq@chinaffmpeg.org>
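For illustration, hypothetical invocations of the two modes (option names follow the commit text but are assumptions, not verified syntax):

    ./ffmpeg -i in.mp4 -vf guided out.mp4
    (guidance off: single input, edge-preserving smoothing)
    ./ffmpeg -i in.mp4 -i guide.mp4 -filter_complex guided=guidance=on out.mp4
    (guidance on: the second input serves as the guidance)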
Since commit 89ffcd1, when EOF is reached, the status pts of the output
link is set to a value in the input link time base, not in the output
link time base. Usually this pts value is larger than the required one
because the output link time base is much larger than the input link
time base. When "-vf vpp_qsv,fps" is used, the user has to wait a long
time for the pipeline to finish because the fps filter outputs a huge
number of frames until the wrong status pts is reached.
The issue can be triggered with the command below (use a clip with 1000
frames in this case):
$> time ffmpeg -hwaccel qsv -c:v hevc_qsv -i input.h265 -vf
"vpp_qsv=w=1920:h=1080,fps=fps=30" -f null -
...
[out_0_0 @ 0x564ccd27e020] 10000000 buffers queued in out_0_0, something
may be wrong.
frame=40119596 fps=88080 q=-0.0 Lsize=N/A time=371:28:39.96 bitrate=N/A
speed=2.94e+03x
video:17238889kB audio:0kB subtitle:0kB other streams:0kB global
headers:0kB muxing overhead: unknown
real 9m7.451s
user 2m34.102s
sys 0m39.734s
In order to avoid the above issue, this patch uses the same time base
for the input and output links.
Fixes ticket #9286
Signed-off-by: Zhong Li <zhongli_dev@126.com>
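A minimal sketch of the idea (the config hook below is an illustrative assumption, not the actual patch):

    static int config_output(AVFilterLink *outlink)
    {
        AVFilterLink *inlink = outlink->src->inputs[0];
        /* keep both links in the same time base so the EOF status pts,
         * forwarded in input-link units, maps correctly downstream */
        outlink->time_base = inlink->time_base;
        return 0;
    }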
Fix memory leak for RequestItem upon error while pushing to the
request_queue in the completion callback.
Signed-off-by: Shubhanshu Saxena <shubhanshu.e01@gmail.com>
AV_OPT_TYPE_VIDEO_RATE AVOption types are parsed as expressions, but in a
limited way. For example, named constants can only be parsed alone and not as
part of a longer expression.
This change allows usage like
ffmpeg -i IN -vf fps="if(eq(source_fps\,film)\,ntsc_film\,source_fps)" OUT
Suggested-by: ffmpeg@fb.com
Signed-off-by: James Almer <jamrial@gmail.com>
These properties only take the values 0 or 1, so uint8_t is a better
option than int.
Signed-off-by: Shubhanshu Saxena <shubhanshu.e01@gmail.com>
Convert output_name to char **output_names in TaskItem and use it as
a pointer to an array of output names in the DNN backend.
Signed-off-by: Shubhanshu Saxena <shubhanshu.e01@gmail.com>
Extract TaskItem and InferenceItem from the OpenVINO backend and
convert the ov_model field to a void pointer in TaskItem.
Signed-off-by: Shubhanshu Saxena <shubhanshu.e01@gmail.com>
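A sketch of what the extracted typedef plausibly looks like (the field set is an assumption assembled from this and the neighbouring commit messages, e.g. the void pointer conversion and char **output_names):

    #include <stdint.h>
    #include "libavutil/frame.h"

    typedef struct TaskItem {
        void *model;               /* backend-specific model (e.g. OVModel) */
        AVFrame *in_frame;
        AVFrame *out_frame;
        const char *input_name;
        const char **output_names; /* see the output_names commit above */
        uint8_t async;
        uint8_t do_ioproc;
        uint32_t nb_output;
    } TaskItem;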
CID 1485004: Uninitialized variables (UNINIT)
Using uninitialized value "x" when calling "*pixel_belongs_to_region".
Signed-off-by: Ting Fu <ting.fu@intel.com>
Fixes: floating point division by 0
Fixes: undefined behavior in handling NaN
Fixes: Ticket 8268
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Fix a problem when x is set to an odd number for nv12 in overlay_cuda.
Test steps:
1. ffmpeg -f lavfi -i testsrc2=s=176x144 -pix_fmt nv12 -t 1 output_overlay.yuv
2. ffmpeg -f lavfi -i testsrc2=s=352x288 -pix_fmt nv12 -t 1 output_main.yuv
Before this patch:
overlay_cuda=x=0:y=0 renders correctly,
overlay_cuda=x=3:y=0 renders incorrectly;
both render correctly after this patch.
Signed-off-by: Steven Liu <liuqi05@kuaishou.com>
Signed-off-by: Timo Rothenpieler <timo@rothenpieler.org>
This commit corrects the type of the pointer to the elements of the
inference queue in ff_dnn_free_model_ov.
Signed-off-by: Shubhanshu Saxena <shubhanshu.e01@gmail.com>
This fixes an issue where the yadif filter could cause the timebase denominator to overflow.
Signed-off-by: Tom Boshoven <tom@jwplayer.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
This feature can be used with dnn detection by setting vf_drawtext's option
text_source=side_data_detection_bboxes, for example:
./ffmpeg -i face.jpeg -vf dnn_detect=dnn_backend=openvino:model=face-detection-adas-0001.xml:\
input=data:output=detection_out:labels=face-detection-adas-0001.label,drawbox=box_source=
side_data_detection_bboxes,drawtext=text_source=side_data_detection_bboxes:fontcolor=green:\
fontsize=40 -y face_detect.jpeg
Please note, the default fontsize of vf_drawtext is 12, which may be too
small to be seen clearly.
Signed-off-by: Ting Fu <ting.fu@intel.com>
This feature can be used with dnn detection by setting vf_drawbox's
option box_source=side_data_detection_bboxes, for example:
./ffmpeg -i face.jpeg -vf dnn_detect=dnn_backend=openvino:model=face-detection-adas-0001.xml:\
input=data:output=detection_out:labels=face-detection-adas-0001.label,\
drawbox=box_source=side_data_detection_bboxes -y face_detect.jpeg
Signed-off-by: Ting Fu <ting.fu@intel.com>
ref_frame is owned by the framesync structure and should therefore not
be modified; furthermore, these properties that are copied don't seem to
be used at all, so copying is unnecessary. Finally, copying when the
destination frame is NULL gives a guaranteed segfault.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Fixes: CID1398579 Dereference before null check
Reviewed-by: Nicolas George <george@nsup.org>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Two modes are supported in the guided filter: basic mode and fast mode.
Basic mode is the guided filter as initially submitted, without
optimization. Fast mode builds on the basic one via sub-sampling. The
sub-sampling ratio, which can be set by the user, controls the
algorithm complexity: the larger the sub-sampling ratio, the lower the
complexity.
Signed-off-by: Xuewei Meng <xwmeng96@gmail.com>
Reviewed-by: Steven Liu <liuqi05@kuaishou.com>
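A hypothetical fast-mode invocation (option names are assumptions based on the commit text):

    ./ffmpeg -i in.mp4 -vf guided=mode=fast:sub=4 out.mp4

Here a larger sub-sampling ratio lowers the complexity at some cost in output quality.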
Duplicate the ff_hex_to_data() function from avformat and rename it to
hex_to_data() as a static function.
Reviewed-by: Guo, Yejun <yejun.guo@intel.com>
Signed-off-by: Limin Wang <lance.lmwang@gmail.com>
CID: 1482090
av_frame_get_side_data() can return NULL, and sd->data is used after
the call, so the return value must be checked for NULL.
Signed-off-by: Steven Liu <liuqi05@kuaishou.com>
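The fix boils down to the standard pattern below (av_frame_get_side_data() and AV_FRAME_DATA_DETECTION_BBOXES are real libavutil API; the side-data type and surrounding context are illustrative):

    AVFrameSideData *sd;
    sd = av_frame_get_side_data(frame, AV_FRAME_DATA_DETECTION_BBOXES);
    if (!sd)
        return 0; /* no side data present: nothing to do */
    /* only after this check is sd->data safe to dereference */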
The testing model is an official TensorFlow model from the GitHub repo; please refer to
https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/tf2_detection_zoo.md
to download the detection model you need.
For example, local testing was carried out with 'ssd_mobilenet_v2_coco_2018_03_29.tar.gz' and
used one image of a dog from
https://github.com/tensorflow/models/blob/master/research/object_detection/test_images/image1.jpg
Testing command is:
./ffmpeg -i image1.jpg -vf dnn_detect=dnn_backend=tensorflow:input=image_tensor:output=\
"num_detections&detection_scores&detection_classes&detection_boxes":model=ssd_mobilenet_v2_coco.pb,\
showinfo -f null -
We will see a result similar to the one below:
[Parsed_showinfo_1 @ 0x33e65f0] side data - detection bounding boxes:
[Parsed_showinfo_1 @ 0x33e65f0] source: ssd_mobilenet_v2_coco.pb
[Parsed_showinfo_1 @ 0x33e65f0] index: 0, region: (382, 60) -> (1005, 593), label: 18, confidence: 9834/10000.
[Parsed_showinfo_1 @ 0x33e65f0] index: 1, region: (12, 8) -> (328, 549), label: 18, confidence: 8555/10000.
[Parsed_showinfo_1 @ 0x33e65f0] index: 2, region: (293, 7) -> (682, 458), label: 1, confidence: 8033/10000.
[Parsed_showinfo_1 @ 0x33e65f0] index: 3, region: (342, 0) -> (690, 325), label: 1, confidence: 5878/10000.
There are two boxes of dog with scores 98.34% & 85.55% and two boxes of person with scores 80.33% & 58.78%.
Signed-off-by: Ting Fu <ting.fu@intel.com>
Signed-off-by: Guo, Yejun <yejun.guo@intel.com>
Add examples on how to use this filter, and improve the code style.
Implement the slice-level parallelism for guided filter.
Add the basic version of guided filter.
Signed-off-by: Xuewei Meng <xwmeng96@gmail.com>
Reviewed-by: Steven Liu <liuqi05@kuaishou.com>
Classification is done on every detection bounding box in the frame's
side data, which is the result of object detection (filter dnn_detect).
Please refer to the commit log of dnn_detect for the detection
material, and see below for classification.
- download material for classification:
wget https://github.com/guoyejun/ffmpeg_dnn/raw/main/models/openvino/2021.1/emotions-recognition-retail-0003.bin
wget https://github.com/guoyejun/ffmpeg_dnn/raw/main/models/openvino/2021.1/emotions-recognition-retail-0003.xml
wget https://github.com/guoyejun/ffmpeg_dnn/raw/main/models/openvino/2021.1/emotions-recognition-retail-0003.label
- run command as:
./ffmpeg -i cici.jpg -vf dnn_detect=dnn_backend=openvino:model=face-detection-adas-0001.xml:input=data:output=detection_out:confidence=0.6:labels=face-detection-adas-0001.label,dnn_classify=dnn_backend=openvino:model=emotions-recognition-retail-0003.xml:input=data:output=prob_emotion:confidence=0.3:labels=emotions-recognition-retail-0003.label:target=face,showinfo -f null -
We'll see the detection & classification results as below:
[Parsed_showinfo_2 @ 0x55b7d25e77c0] side data - detection bounding boxes:
[Parsed_showinfo_2 @ 0x55b7d25e77c0] source: face-detection-adas-0001.xml, emotions-recognition-retail-0003.xml
[Parsed_showinfo_2 @ 0x55b7d25e77c0] index: 0, region: (1005, 813) -> (1086, 905), label: face, confidence: 10000/10000.
[Parsed_showinfo_2 @ 0x55b7d25e77c0] classify: label: happy, confidence: 6757/10000.
[Parsed_showinfo_2 @ 0x55b7d25e77c0] index: 1, region: (888, 839) -> (967, 926), label: face, confidence: 6917/10000.
[Parsed_showinfo_2 @ 0x55b7d25e77c0] classify: label: anger, confidence: 4320/10000.
Signed-off-by: Guo, Yejun <yejun.guo@intel.com>
Different model function types require different parameters; for
example, object detection detects lots of objects (cat/dog/...) in the
frame, while classification needs to know which object (cat or dog) it
is going to classify.
The current interface would need a new function with more parameters
for every new requirement. With this change, we can just add a new
struct (for example DNNExecClassifyParams) based on DNNExecBaseParams,
and so we can continue to use the current interface execute_model with
only the params changed.
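A sketch of the struct layering described above (DNNExecBaseParams and DNNExecClassifyParams are named in the commit text; the exact fields are assumptions):

    typedef struct DNNExecBaseParams {
        const char *input_name;
        const char **output_names;
        uint32_t nb_output;
        AVFrame *in_frame;
        AVFrame *out_frame;
    } DNNExecBaseParams;

    typedef struct DNNExecClassifyParams {
        DNNExecBaseParams base;  /* must stay first so a base pointer works */
        const char *target;      /* e.g. only classify boxes labelled "face" */
    } DNNExecClassifyParams;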
There is one task item for each function call from the DNN interface,
and one request item for each call to OpenVINO. For classification, one
task might need multiple inferences, one per bounding box, so add
InferenceItem.
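A sketch of the relationship (names from the commit text; the fields are an assumption): one TaskItem fans out into one InferenceItem per bounding box, and each inference is carried by a request to OpenVINO.

    typedef struct InferenceItem {
        TaskItem *task;       /* owning task from the DNN interface */
        uint32_t bbox_index;  /* which bounding box this inference covers */
    } InferenceItem;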