FFmpeg

mirror of https://github.com/FFmpeg/FFmpeg.git synced 2025-01-29 22:00:58 +02:00

Author	SHA1	Message	Date
Wu Jianhua	4041c1029b	libavfilter/x86/vf_gblur: add localbuf and ff_horiz_slice_avx2/512() We introduced a ff_horiz_slice_avx2/512() implemented on a new algorithm. In a nutshell, the new algorithm does three things, gathering data from 8/16 rows, blurring data, and scattering data back to the image buffer. Here we used a customized transpose 8x8/16x16 to avoid the huge overhead brought by gather and scatter instructions, which is dependent on the temporary buffer called localbuf added newly. Performance data: ff_horiz_slice_avx2(old): 109.89 ff_horiz_slice_avx2(new): 666.67 ff_horiz_slice_avx512: 1000 Co-authored-by: Cheng Yanfei <yanfei.cheng@intel.com> Co-authored-by: Jin Jun <jun.i.jin@intel.com> Signed-off-by: Wu Jianhua <jianhua.wu@intel.com>	2021-08-29 19:58:33 +02:00
Wu Jianhua	68a2722aee	libavfilter/x86/vf_gblur: add ff_verti_slice_avx2/512() The new vertical slice with AVX2/512 acceleration can significantly improve the performance of Gaussian Filter 2D. Performance data: ff_verti_slice_c: 32.57 ff_verti_slice_avx2: 476.19 ff_verti_slice_avx512: 833.33 Co-authored-by: Cheng Yanfei <yanfei.cheng@intel.com> Co-authored-by: Jin Jun <jun.i.jin@intel.com> Signed-off-by: Wu Jianhua <jianhua.wu@intel.com>	2021-08-29 19:58:33 +02:00
Andreas Rheinhardt	8be701d9f7	avfilter/avfilter: Add numbers of (in|out)pads directly to AVFilter Up until now, an AVFilter's lists of input and output AVFilterPads were terminated by a sentinel and the only way to get the length of these lists was by using avfilter_pad_count(). This has two drawbacks: first, sizeof(AVFilterPad) is not negligible (i.e. 64B on 64bit systems); second, getting the size involves a function call instead of just reading the data. This commit therefore changes this. The sentinels are removed and new private fields nb_inputs and nb_outputs are added to AVFilter that contain the number of elements of the respective AVFilterPad array. Given that AVFilter.(in|out)puts are the only arrays of zero-terminated AVFilterPads an API user has access to (AVFilterContext.(in|out)put_pads are not zero-terminated and they already have a size field) the argument to avfilter_pad_count() is always one of these lists, so it just has to find the filter the list belongs to and read said number. This is slower than before, but a replacement function that just reads the internal numbers that users are expected to switch to will be added soon; and furthermore, avfilter_pad_count() is probably never called in hot loops anyway. This saves about 49KiB from the binary; notice that these sentinels are not in .bss despite being zeroed: they are in .data.rel.ro due to the non-sentinels. Reviewed-by: Nicolas George <george@nsup.org> Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2021-08-20 12:53:58 +02:00
Andreas Rheinhardt	1b20853fb3	avfilter/internal: Factor out executing a filter's execute_func The current way of doing it involves writing the ctx parameter twice. Reviewed-by: Nicolas George <george@nsup.org> Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2021-08-15 21:33:25 +02:00
Andreas Rheinhardt	18ec426a86	avfilter/formats: Factor common function combinations out Several combinations of functions happen quite often in query_format functions; e.g. ff_set_common_formats(ctx, ff_make_format_list(sample_fmts)) is very common. This commit therefore adds functions that are equivalent to commonly used function combinations in order to reduce code duplication. Reviewed-by: Nicolas George <george@nsup.org> Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2021-08-13 17:36:22 +02:00
Andreas Rheinhardt	a04ad248a0	avfilter: Constify all AVFilters This is possible now that the next-API is gone. Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com> Signed-off-by: James Almer <jamrial@gmail.com>	2021-04-27 11:48:05 -03:00
James Almer	3c77584be8	avfilter/vf_gblur: add missing arch check Removed by mistake in 2b4da1cb8c2984b37e5c912e103a1b8b734e7c1f where it should have been replaced instead. Signed-off-by: James Almer <jamrial@gmail.com>	2021-02-17 15:45:40 -03:00
James Almer	2b4da1cb8c	x86/vf_gblur: fix postscale_slice prologue x86_32 ABI does not pass float arguments directly on xmm regs, and the Win64 ABI uses only the first four regs for this purpose. Signed-off-by: James Almer <jamrial@gmail.com>	2021-02-17 13:33:20 -03:00
Paul B Mahol	44cf3a2b16	avfilter/x86/vf_gblur: add postscale SIMD	2021-02-16 21:12:11 +01:00
Paul B Mahol	058db59e16	avfilter/vf_gblur: factor out postscale function	2021-02-16 21:12:11 +01:00
Paul B Mahol	34922dffca	avfilter/vf_gblur: add float format support	2021-02-12 21:09:51 +01:00
Paul B Mahol	1b26f27026	avfilter/vf_gblur: add support for 12bit yuva formats	2019-11-18 17:26:59 +01:00
Paul B Mahol	1e35519fe0	avfilter/vf_gblur: fix undefined behaviour Fixes #8292	2019-10-16 19:29:56 +02:00
Paul B Mahol	64a805883d	avfilter/vf_gblur: fix heap-buffer overflow Fixes #8282	2019-10-16 12:13:04 +02:00
Paul B Mahol	33e69806aa	avfilter/vf_gblur: switch to ff_filter_process_command()	2019-10-14 11:40:17 +02:00
Paul B Mahol	da9337c911	avfilter/vf_gblur: add support for commands	2019-10-06 15:34:28 +02:00
Ruiling Song	83f9da7768	avfilter/vf_gblur: add x86 SIMD optimizations The horizontal pass get ~2x performance with the patch under single thread. Tested overall performance using the command(avx2 enabled): ./ffmpeg -i 1080p.mp4 -vf gblur -f null /dev/null ./ffmpeg -i 1080p.mp4 -vf gblur=threads=1 -f null /dev/null For single thread, the fps improves from 43 to 60, about 40%. For multi-thread, the fps improves from 110 to 130, about 20%. Signed-off-by: Ruiling Song <ruiling.song@intel.com>	2019-06-12 08:53:11 +08:00
Ruiling Song	06ba4783a0	lavfi/gblur: doing several columns at the same time Instead of doing each column one by one, doing several columns together gives about 30% better performance. Reviewed-by: Paul B Mahol <onemda@gmail.com> Signed-off-by: Ruiling Song <ruiling.song@intel.com>	2019-05-08 10:39:43 +08:00
Paul B Mahol	bd6c57d532	avfilter: add support for gray14 format	2018-09-09 19:10:44 +02:00
Paul B Mahol	bac508fec1	avfilter: add support for GRAY9 and GBRAP10	2017-08-07 13:11:09 +02:00
Paul B Mahol	27ebdcf079	avfilter: add GRAY10 and GRAY12 to some filters Signed-off-by: Paul B Mahol <onemda@gmail.com>	2017-04-10 18:13:02 +02:00
Michael Niedermayer	6294247730	avfilter/vf_gblur: Increase supported pixel count from 31bit to 32bit in filter_postscale() Fixes CID1396252 Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	2017-01-27 22:16:37 +01:00
Paul B Mahol	443c9fab57	avfilter/vf_gblur: add sigmaV option, different vertical filtering Signed-off-by: Paul B Mahol <onemda@gmail.com>	2016-09-04 23:59:45 +02:00
Paul B Mahol	ee605aa730	avfilter: add gblur filter Signed-off-by: Paul B Mahol <onemda@gmail.com>	2016-09-04 15:33:05 +02:00

24 Commits
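
Commit 8be701d9f7 above replaces sentinel-terminated pad lists with explicit element counts. The trade-off it describes can be sketched in a few lines of C. This is only an illustration under stated assumptions: `Pad`, `Filter`, `count_pads_sentinel()` and `count_pads_stored()` are hypothetical names invented for this sketch and are not FFmpeg's AVFilterPad, AVFilter, or avfilter_pad_count().

```c
#include <stddef.h>

/* Hypothetical stand-in for a pad descriptor; not FFmpeg's AVFilterPad. */
typedef struct Pad {
    const char *name;   /* NULL marks the sentinel in the old layout */
} Pad;

/* Old layout: the array ends with a zeroed sentinel element. */
static const Pad old_inputs[] = {
    { "main" },
    { "overlay" },
    { NULL },           /* sentinel: costs one full sizeof(Pad) per list */
};

/* New layout: no sentinel; the owner stores the element count instead. */
static const Pad new_inputs[] = {
    { "main" },
    { "overlay" },
};

/* Hypothetical owner of a pad list; not FFmpeg's AVFilter. */
typedef struct Filter {
    const Pad *inputs;
    unsigned   nb_inputs;  /* count stored next to the array pointer */
} Filter;

static const Filter example_filter = {
    .inputs    = new_inputs,
    .nb_inputs = sizeof(new_inputs) / sizeof(new_inputs[0]),
};

/* Counting a sentinel-terminated list requires walking it. */
static size_t count_pads_sentinel(const Pad *pads)
{
    size_t n = 0;
    if (!pads)
        return 0;
    while (pads[n].name)
        n++;
    return n;
}

/* With a stored count, the length is a plain field read. */
static size_t count_pads_stored(const Filter *f)
{
    return f->nb_inputs;
}
```

In this sketch the sentinel layout spends one extra `sizeof(Pad)` per list and needs a loop to find the length, while the counted layout spends one small integer per filter and reads the length directly. That is the same kind of saving the commit message reports, although the real structure sizes and field layout differ.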