FFmpeg/libavfilter/dnn_interface.h

/*
 * Copyright (c) 2018 Sergey Lavrushkin
 *
 * This file is part of FFmpeg.
 *
 * FFmpeg is free software; you can redistribute it and/or
 * modify it under the terms of the GNU Lesser General Public
 * License as published by the Free Software Foundation; either
 * version 2.1 of the License, or (at your option) any later version.
 *
 * FFmpeg is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
 * Lesser General Public License for more details.
 *
 * You should have received a copy of the GNU Lesser General Public
 * License along with FFmpeg; if not, write to the Free Software
 * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
 */

/**
 * @file
 * DNN inference engine interface.
 */

#ifndef AVFILTER_DNN_INTERFACE_H
#define AVFILTER_DNN_INTERFACE_H

#include <stdint.h>
#include "libavutil/frame.h"

typedef enum {DNN_SUCCESS, DNN_ERROR} DNNReturnType;

typedef enum {DNN_NATIVE, DNN_TF, DNN_OV} DNNBackendType;

typedef enum {DNN_FLOAT = 1, DNN_UINT8 = 4} DNNDataType;

typedef struct DNNData{
    void *data;
    DNNDataType dt;
    int width, height, channels;
} DNNData;

typedef struct DNNModel{
    // Stores model that can be different for different backends.
    void *model;
    // Stores the backend options used when the model is executed.
    const char *options;
    // Stores userdata used for the interaction between AVFrame and DNNData
    void *userdata;
    // Gets model input information.
    // Reuses struct DNNData for the result; the DNNData.data field is not used here.
    DNNReturnType (*get_input)(void *model, DNNData *input, const char *input_name);
    // Gets the model output width/height for the given input width/height.
    DNNReturnType (*get_output)(void *model, const char *input_name, int input_width, int input_height,
                                const char *output_name, int *output_width, int *output_height);
    // Optional pre-processing hook that transfers data from AVFrame to DNNData.
    // If the filter does not provide one, the default implementation within DNN is used.
    int (*pre_proc)(AVFrame *frame_in, DNNData *model_input, void *user_data);
    // Optional post-processing hook that transfers data from DNNData to AVFrame.
    // If the filter does not provide one, the default implementation within DNN is used.
    int (*post_proc)(AVFrame *frame_out, DNNData *model_output, void *user_data);
} DNNModel;

// Stores pointers to functions for loading, executing, freeing DNN models for one of the backends.
typedef struct DNNModule{
    // Loads model and parameters from given file. Returns NULL if it is not possible.
    DNNModel *(*load_model)(const char *model_filename, const char *options, void *userdata);
    // Executes model with specified input and output. Returns DNN_SUCCESS on success, DNN_ERROR otherwise.
    DNNReturnType (*execute_model)(const DNNModel *model, const char *input_name, AVFrame *in_frame,
                                   const char **output_names, uint32_t nb_output, AVFrame *out_frame);
    // Frees memory allocated for model.
    void (*free_model)(DNNModel **model);
} DNNModule;

// Initializes DNNModule depending on chosen backend.
DNNModule *ff_get_dnn_module(DNNBackendType backend_type);

#endif
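
The interface above is driven in four steps: pick a module for a backend, load a model, query and execute it, then free it. The following is a minimal sketch of that call sequence, not FFmpeg's own code. Since `dnn_interface.h` is internal to libavfilter, the types are re-declared here, `AVFrame` is replaced by a hypothetical two-field stand-in, and `mock_get_dnn_module` with its `mock_*` callbacks is an invented backend that pretends to be a 2x upscaler. The tensor names `"x"` and `"y"` and the file name `model.bin` are placeholders as well.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for libavutil's AVFrame so the sketch compiles alone. */
typedef struct AVFrame { int width, height; } AVFrame;

typedef enum {DNN_SUCCESS, DNN_ERROR} DNNReturnType;
typedef enum {DNN_NATIVE, DNN_TF, DNN_OV} DNNBackendType;
typedef enum {DNN_FLOAT = 1, DNN_UINT8 = 4} DNNDataType;
typedef struct DNNData { void *data; DNNDataType dt; int width, height, channels; } DNNData;

typedef struct DNNModel {
    void *model;
    const char *options;
    void *userdata;
    DNNReturnType (*get_input)(void *model, DNNData *input, const char *input_name);
    DNNReturnType (*get_output)(void *model, const char *input_name, int input_width, int input_height,
                                const char *output_name, int *output_width, int *output_height);
    int (*pre_proc)(AVFrame *frame_in, DNNData *model_input, void *user_data);
    int (*post_proc)(AVFrame *frame_out, DNNData *model_output, void *user_data);
} DNNModel;

typedef struct DNNModule {
    DNNModel *(*load_model)(const char *model_filename, const char *options, void *userdata);
    DNNReturnType (*execute_model)(const DNNModel *model, const char *input_name, AVFrame *in_frame,
                                   const char **output_names, uint32_t nb_output, AVFrame *out_frame);
    void (*free_model)(DNNModel **model);
} DNNModule;

/* ---- Mock backend (assumption): behaves like a 2x upscaling network. ---- */
static DNNReturnType mock_get_input(void *model, DNNData *input, const char *input_name)
{
    (void)model; (void)input_name;
    input->dt = DNN_FLOAT;
    input->channels = 1;
    input->width = input->height = -1;   /* mock convention: size is not fixed */
    return DNN_SUCCESS;
}

static DNNReturnType mock_get_output(void *model, const char *input_name, int iw, int ih,
                                     const char *output_name, int *ow, int *oh)
{
    (void)model; (void)input_name; (void)output_name;
    *ow = iw * 2;
    *oh = ih * 2;
    return DNN_SUCCESS;
}

static DNNModel *mock_load_model(const char *filename, const char *options, void *userdata)
{
    DNNModel *m = filename ? calloc(1, sizeof(*m)) : NULL;
    if (!m)
        return NULL;                     /* load_model reports failure with NULL */
    m->options    = options;
    m->userdata   = userdata;
    m->get_input  = mock_get_input;
    m->get_output = mock_get_output;
    return m;
}

static DNNReturnType mock_execute_model(const DNNModel *model, const char *input_name, AVFrame *in,
                                        const char **output_names, uint32_t nb_output, AVFrame *out)
{
    if (!model || !in || !out || nb_output != 1)
        return DNN_ERROR;
    /* A real backend would run inference; the mock only sets the output size. */
    return model->get_output(model->model, input_name, in->width, in->height,
                             output_names[0], &out->width, &out->height);
}

static void mock_free_model(DNNModel **model)
{
    if (model && *model) {
        free(*model);
        *model = NULL;
    }
}

static DNNModule *mock_get_dnn_module(DNNBackendType backend_type)
{
    DNNModule *mod = backend_type == DNN_NATIVE ? malloc(sizeof(*mod)) : NULL;
    if (!mod)
        return NULL;
    mod->load_model    = mock_load_model;
    mod->execute_model = mock_execute_model;
    mod->free_model    = mock_free_model;
    return mod;
}

/* Caller side: load, query, execute, free. Returns 0 on success, -1 on failure. */
int run_upscale(int in_w, int in_h, int *out_w, int *out_h)
{
    const char *outputs[] = { "y" };     /* hypothetical output tensor name */
    AVFrame in = { in_w, in_h }, out = { 0, 0 };
    DNNData info = { 0 };
    DNNModule *mod = mock_get_dnn_module(DNN_NATIVE);
    DNNModel *model = mod ? mod->load_model("model.bin", NULL, NULL) : NULL;
    int ret = -1;

    if (model &&
        model->get_input(model->model, &info, "x") == DNN_SUCCESS &&
        info.dt == DNN_FLOAT &&
        mod->execute_model(model, "x", &in, outputs, 1, &out) == DNN_SUCCESS) {
        *out_w = out.width;
        *out_h = out.height;
        ret = 0;
    }
    if (mod)
        mod->free_model(&model);
    free(mod);
    return ret;
}
```

Calling `run_upscale(320, 240, &w, &h)` returns 0 and sets `w` and `h` to 640 and 480 with this mock. Inside libavfilter, `ff_get_dnn_module` would take the place of `mock_get_dnn_module`; how the returned module is released there is not specified by this header, so the `free(mod)` call is part of the mock's own contract.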
Blame history for dnn_interface.h, oldest commit first. Each entry ends with the lines of the file above that it last touched.

2018-05-25 20:31:04 +03:00
Adds dnn inference module for simple convolutional networks. Reimplements srcnn filter based on it.
Signed-off-by: Pedro Arthur <bygrandao@gmail.com>
Lines: license header, `@file` comment, include guard and `#endif`, `DNNReturnType`, the closing lines of `DNNData`, `DNNModel` and `DNNModule`, the opening lines of `DNNModel` and `DNNModule`, and the comments above `model`, `load_model`, `free_model` and `ff_get_dnn_module`.

2018-07-27 19:34:02 +03:00
libavfilter: Code style fixes for pointers in DNN module and sr filter.
Signed-off-by: Pedro Arthur <bygrandao@gmail.com>
Lines: `void *model;`, the `free_model` pointer, the `ff_get_dnn_module` declaration.

2019-04-25 10:14:33 +08:00
libavfilter/dnn: support multiple outputs for tensorflow model
Some models, such as ssd and yolo, have more than one output. The cleanup code in this patch is a little complex because set_input_output_tf can be called many times together with ff_dnn_execute_model_tf, so resources have to be cleaned up for the case where the two interfaces are called interleaved.
Signed-off-by: Guo, Yejun <yejun.guo@intel.com>
Signed-off-by: Pedro Arthur <bygrandao@gmail.com>
Lines: `#include <stdint.h>`.

2019-04-25 10:14:42 +08:00
libavfilter/dnn: add more data type support for dnn model input
Currently, only float is supported as model input, but there are other data types. This patch adds uint8.
Signed-off-by: Guo, Yejun <yejun.guo@intel.com>
Signed-off-by: Pedro Arthur <bygrandao@gmail.com>
Lines: the `data`, `dt` and `width, height, channels` fields of `DNNData`.

2019-08-20 16:50:34 +08:00
dnn: export operand info in python script and load in c code
Signed-off-by: Guo, Yejun <yejun.guo@intel.com>
Signed-off-by: Pedro Arthur <bygrandao@gmail.com>
Lines: `DNNDataType`.

2019-10-21 20:38:10 +08:00
avfilter/dnn: get the data type of network output from dnn execution result
This makes a filter more general, so that it can accept different network models by adding a data type conversion after getting data from the network. After the dt field is added to struct DNNData, it becomes the same as DNNInputData, so both are merged into one struct: DNNData.
Signed-off-by: Guo, Yejun <yejun.guo@intel.com>
Signed-off-by: Pedro Arthur <bygrandao@gmail.com>
Lines: `typedef struct DNNData{`.

2019-10-21 20:38:17 +08:00
avfilter/dnn: add a new interface to query dnn model's input info
To support dnn networks more generally, we need to know the input info of the dnn model. Background: the data type of the dnn model's input could be float32, uint8 or fp16, etc., and the w/h of the input image could be fixed or variable.
Signed-off-by: Guo, Yejun <yejun.guo@intel.com>
Signed-off-by: Pedro Arthur <bygrandao@gmail.com>
Lines: `get_input` and its comments.

2020-05-25 15:38:09 +08:00
dnn: add openvino as one of dnn backend
OpenVINO is a Deep Learning Deployment Toolkit at https://github.com/openvinotoolkit/openvino. It supports CPU, GPU and heterogeneous plugins to accelerate deep learning inferencing.
Please refer to https://github.com/openvinotoolkit/openvino/blob/master/build-instruction.md to build openvino (the c library is built at the same time). Please add option -DENABLE_MKL_DNN=ON for cmake to enable the CPU path. The header files and libraries are installed to /usr/local/deployment_tools/inference_engine/ with default options on my system.
To build FFmpeg with openvino, taking my system as an example, run:
    $ export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/deployment_tools/inference_engine/lib/intel64/:/usr/local/deployment_tools/inference_engine/external/tbb/lib/
    $ ../ffmpeg/configure --enable-libopenvino --extra-cflags=-I/usr/local/deployment_tools/inference_engine/include/ --extra-ldflags=-L/usr/local/deployment_tools/inference_engine/lib/intel64
    $ make
Here are the features provided by the OpenVINO inference engine:
- support more DNN model formats
  It supports TensorFlow, Caffe, ONNX, MXNet and Kaldi by converting them into OpenVINO format with a python script. A torch model can first be converted into ONNX and then into OpenVINO format. See the script at https://github.com/openvinotoolkit/openvino/tree/master/model-optimizer/mo.py, which also does some optimization at model level.
- optimize at inference stage
  It optimizes for X86 CPUs with SSE, AVX etc. It also optimizes based on OpenCL for Intel GPUs (only Intel GPUs are supported because an Intel OpenCL extension is used for optimization).
Signed-off-by: Guo, Yejun <yejun.guo@intel.com>
Signed-off-by: Pedro Arthur <bygrandao@gmail.com>
Lines: `DNNBackendType`.

2020-08-07 14:32:55 +08:00
dnn: add backend options when load the model
Different backends might need different options for better performance, so add the parameter to the dnn interface as a preparation.
Signed-off-by: Guo, Yejun <yejun.guo@intel.com>
Lines: the `options` field and its comment.

2020-08-24 16:09:59 +08:00
dnn: add userdata for load model parameter
The userdata will be used for the interaction between AVFrame and DNNData.
Lines: the `userdata` field and its comment, the `load_model` pointer.

2020-08-28 12:51:44 +08:00
dnn: change dnn interface to replace DNNData* with AVFrame*
Currently, every filter needs to provide code to transfer data from AVFrame* to model input (DNNData*), and also from model output (DNNData*) to AVFrame*. Such a transfer can be implemented within the DNN module, so the filter can focus on its own business logic.
The DNN module also exports the function pointers pre_proc and post_proc in struct DNNModel, in case a filter has special logic to transfer data between AVFrame* and DNNData*. The default implementation within the DNN module is used if the filter does not set pre/post_proc.
Lines: `#include "libavutil/frame.h"`, `pre_proc` and `post_proc` with their comments.

2020-09-10 22:29:57 +08:00
dnn: put DNNModel.set_input and DNNModule.execute_model together
Suppose we have a detect and a classify filter in the future. The detect filter generates some bounding boxes (BBox) as AVFrame sidedata, and the classify filter executes the DNN model for each BBox. For each BBox, we need to crop the AVFrame, copy data to the DNN model input and do the model execution. So we would have to save the in_frame at DNNModel.set_input and use it at DNNModule.execute_model, and such saving is not feasible once async execute_model is supported.
This patch passes the in_frame as an execute_model parameter, so all the information is put together within the same function for each inference. It also makes it easy to support BBox async inference.
Lines: `execute_model` and its comment.

2020-09-11 22:15:04 +08:00
dnn: add a new interface DNNModel.get_output
For some cases (for example, super resolution), the DNN model changes the frame size, which impacts the filter behavior, so the filter needs to know the out frame size at the very beginning. Currently, the filter reuses DNNModule.execute_model to query the out frame size, which is not clear from an interface perspective, so add a new explicit interface DNNModel.get_output for such a query.
Lines: `get_output` and its comment.
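
The `pre_proc` fallback described in the 2020-08-28 commit (use the filter's hook if it is set, otherwise the default inside the DNN module) can be sketched as below. This is an illustration under stated assumptions rather than FFmpeg's implementation: `AVFrame` is a hypothetical single-plane stand-in, `default_pre_proc`, `scaled_pre_proc` and `fill_model_input` are invented names, and the behaviour of the default (a plain 8-bit to float copy) is assumed, since the source does not say what the real default does.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical single-plane stand-in for AVFrame (the real one is in libavutil/frame.h). */
typedef struct AVFrame { uint8_t *data; int linesize, width, height; } AVFrame;

typedef enum {DNN_FLOAT = 1, DNN_UINT8 = 4} DNNDataType;
typedef struct DNNData { void *data; DNNDataType dt; int width, height, channels; } DNNData;

/* Same signature as DNNModel.pre_proc. */
typedef int (*PreProc)(AVFrame *frame_in, DNNData *model_input, void *user_data);

/* Assumed default: copy 8-bit gray samples into a float tensor unchanged. */
static int default_pre_proc(AVFrame *frame_in, DNNData *model_input, void *user_data)
{
    float *dst = model_input->data;
    (void)user_data;
    if (model_input->dt != DNN_FLOAT || model_input->channels != 1 ||
        model_input->width != frame_in->width || model_input->height != frame_in->height)
        return -1;
    for (int y = 0; y < frame_in->height; y++)
        for (int x = 0; x < frame_in->width; x++)
            dst[y * model_input->width + x] = frame_in->data[y * frame_in->linesize + x];
    return 0;
}

/* Filter-specific hook: same copy, then scaled by a factor passed through user_data. */
int scaled_pre_proc(AVFrame *frame_in, DNNData *model_input, void *user_data)
{
    const float scale = *(const float *)user_data;
    float *dst = model_input->data;
    if (default_pre_proc(frame_in, model_input, NULL) < 0)
        return -1;
    for (int i = 0; i < model_input->width * model_input->height; i++)
        dst[i] *= scale;
    return 0;
}

/* Dispatch: use the filter's hook when it is set, otherwise the default. */
int fill_model_input(PreProc hook, AVFrame *frame_in, DNNData *model_input, void *user_data)
{
    PreProc fn = hook ? hook : default_pre_proc;
    return fn(frame_in, model_input, user_data);
}
```

A filter that wants samples normalized to the 0..1 range would pass `scaled_pre_proc` with a `float` scale of `1.0f / 255.0f` as `user_data`; passing `NULL` as the hook leaves the samples as copied by the default. The `userdata` pointer stored in `DNNModel` is the natural carrier for such per-filter state.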