We introduced a ff_horiz_slice_avx2/512() implemented on a new algorithm.
In a nutshell, the new algorithm does three things, gathering data from
8/16 rows, blurring data, and scattering data back to the image buffer.
Here we used a customized transpose 8x8/16x16 to avoid the huge overhead
brought by gather and scatter instructions, which is dependent on the
temporary buffer called localbuf added newly.
Performance data:
ff_horiz_slice_avx2(old): 109.89
ff_horiz_slice_avx2(new): 666.67
ff_horiz_slice_avx512: 1000
Co-authored-by: Cheng Yanfei <yanfei.cheng@intel.com>
Co-authored-by: Jin Jun <jun.i.jin@intel.com>
Signed-off-by: Wu Jianhua <jianhua.wu@intel.com>
The new vertical slice with AVX2/512 acceleration can significantly
improve the performance of Gaussian Filter 2D.
Performance data:
ff_verti_slice_c: 32.57
ff_verti_slice_avx2: 476.19
ff_verti_slice_avx512: 833.33
Co-authored-by: Cheng Yanfei <yanfei.cheng@intel.com>
Co-authored-by: Jin Jun <jun.i.jin@intel.com>
Signed-off-by: Wu Jianhua <jianhua.wu@intel.com>
Implements a gray world color correction algorithm
using a log scale LAB colorspace.
Signed-off-by: Paul Buxton <paulbuxton.mail@googlemail.com>
Signed-off-by: Paul B Mahol <onemda@gmail.com>
This patch renames the InferenceItem to LastLevelTaskItem in the
three backends to avoid confusion among the meanings of these structs.
The following are the renames done in this patch:
1. extract_inference_from_task -> extract_lltask_from_task
2. InferenceItem -> LastLevelTaskItem
3. inference_queue -> lltask_queue
4. inference -> lltask
5. inference_count -> lltask_count
Signed-off-by: Shubhanshu Saxena <shubhanshu.e01@gmail.com>
Remove async flag from filter's perspective after the unification
of async and sync modes in the DNN backend.
Signed-off-by: Shubhanshu Saxena <shubhanshu.e01@gmail.com>
This commit removes the unused sync mode specific code from the DNN
filters since the sync and async mode are now unified from the
filters' perspective.
Signed-off-by: Shubhanshu Saxena <shubhanshu.e01@gmail.com>
This commit unifies the async and sync mode from the DNN filters'
perspective. As of this commit, the Native backend only supports
synchronous execution mode.
Now the user can switch between async and sync mode by using the
'async' option in the backend_configs. The values can be 1 for
async and 0 for sync mode of execution.
This commit affects the following filters:
1. vf_dnn_classify
2. vf_dnn_detect
3. vf_dnn_processing
4. vf_sr
5. vf_derain
This commit also updates the filters vf_dnn_detect and vf_dnn_classify
to send only the input frame and send NULL as output frame instead of
input frame to the DNN backends.
Signed-off-by: Shubhanshu Saxena <shubhanshu.e01@gmail.com>
This affects only the xmedian filter, not tmedian.
Reviewed-by: Nicolas George <george@nsup.org>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>