This filter accepts all the dnn networks which do image processing.
Currently, frame with formats rgb24 and bgr24 are supported. Other
formats such as gray and YUV will be supported next. The dnn network
can accept data in float32 or uint8 format. And the dnn network can
change frame size.
The following is a python script to halve the value of the first
channel of the pixel. It demos how to setup and execute dnn model
with python+tensorflow. It also generates .pb file which will be
used by ffmpeg.
import tensorflow as tf
import numpy as np
import imageio
in_img = imageio.imread('in.bmp')
in_img = in_img.astype(np.float32)/255.0
in_data = in_img[np.newaxis, :]
filter_data = np.array([0.5, 0, 0, 0, 1., 0, 0, 0, 1.]).reshape(1,1,3,3).astype(np.float32)
filter = tf.Variable(filter_data)
x = tf.placeholder(tf.float32, shape=[1, None, None, 3], name='dnn_in')
y = tf.nn.conv2d(x, filter, strides=[1, 1, 1, 1], padding='VALID', name='dnn_out')
sess=tf.Session()
sess.run(tf.global_variables_initializer())
output = sess.run(y, feed_dict={x: in_data})
graph_def = tf.graph_util.convert_variables_to_constants(sess, sess.graph_def, ['dnn_out'])
tf.train.write_graph(graph_def, '.', 'halve_first_channel.pb', as_text=False)
output = output * 255.0
output = output.astype(np.uint8)
imageio.imsave("out.bmp", np.squeeze(output))
To do the same thing with ffmpeg:
- generate halve_first_channel.pb with the above script
- generate halve_first_channel.model with tools/python/convert.py
- try with following commands
./ffmpeg -i input.jpg -vf dnn_processing=model=halve_first_channel.model:input=dnn_in:output=dnn_out:fmt=rgb24:dnn_backend=native -y out.native.png
./ffmpeg -i input.jpg -vf dnn_processing=model=halve_first_channel.pb:input=dnn_in:output=dnn_out:fmt=rgb24:dnn_backend=tensorflow -y out.tf.png
Signed-off-by: Guo, Yejun <yejun.guo@intel.com>
Signed-off-by: Pedro Arthur <bygrandao@gmail.com>
Signed-off-by: Limin Wang <lance.lmwang@gmail.com>
Reviewed-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
If filter changes frame w/h, AVFILTER_FLAG_SUPPORT_TIMELINE_GENERIC
cannot be supported.
Signed-off-by: Guo, Yejun <yejun.guo@intel.com>
Signed-off-by: Pedro Arthur <bygrandao@gmail.com>
to support dnn networks more general, we need to know the input info
of the dnn model.
background:
The data type of dnn model's input could be float32, uint8 or fp16, etc.
And the w/h of input image could be fixed or variable.
Signed-off-by: Guo, Yejun <yejun.guo@intel.com>
Signed-off-by: Pedro Arthur <bygrandao@gmail.com>
so, we can make a filter more general to accept different network
models, by adding a data type convertion after getting data from network.
After we add dt field into struct DNNData, it becomes the same as
DNNInputData, so merge them with one struct: DNNData.
Signed-off-by: Guo, Yejun <yejun.guo@intel.com>
Signed-off-by: Pedro Arthur <bygrandao@gmail.com>
Unlike other tf.*.conv2d layers, tf.nn.conv2d does not create many
nodes (within a scope) in the graph, it just acts like other layers.
tf.nn.conv2d only creates one node in the graph, and no internal
nodes such as 'kernel' are created.
The format of native model file is also changed, a flag named
has_bias is added, so change the version number.
Signed-off-by: Guo, Yejun <yejun.guo@intel.com>
Signed-off-by: Pedro Arthur <bygrandao@gmail.com>
Or it'll cause invalid color and s->filter is NULL.
Please reproduce it with below command on big endian system:
$ ./ffmpeg -f lavfi -i "anoisesrc=d=60:c=1:r=48000" -f s16le -c:a pcm_s16le -f
null -
Segmentation fault (core dumped)
Signed-off-by: Limin Wang <lance.lmwang@gmail.com>
Reviewed-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>