1
0
mirror of https://github.com/FFmpeg/FFmpeg.git synced 2024-12-02 03:06:28 +02:00
FFmpeg/libswscale
Hubert Mazur 2537fdc510 sw_scale: Add specializations for hscale 16 to 19
Provide arm64 neon optimized implementations for hscale16To19 with
filter sizes 4, 8 and X4.

The tests and benchmarks run on AWS Graviton 2 instances.
The results from a checkasm tool are shown below.

hscale_16_to_19__fs_4_dstW_512_c: 6216.0
hscale_16_to_19__fs_4_dstW_512_neon: 2257.0
hscale_16_to_19__fs_8_dstW_512_c: 10417.7
hscale_16_to_19__fs_8_dstW_512_neon: 3112.5
hscale_16_to_19__fs_12_dstW_512_c: 14890.5
hscale_16_to_19__fs_12_dstW_512_neon: 3899.0
hscale_16_to_19__fs_16_dstW_512_c: 19006.5
hscale_16_to_19__fs_16_dstW_512_neon: 5341.2
hscale_16_to_19__fs_32_dstW_512_c: 36629.5
hscale_16_to_19__fs_32_dstW_512_neon: 9502.7
hscale_16_to_19__fs_40_dstW_512_c: 45477.5
hscale_16_to_19__fs_40_dstW_512_neon: 11552.0

(Note, the checkasm tests for these functions haven't been
merged since they fail on x86.)

Signed-off-by: Hubert Mazur <hum@semihalf.com>
Signed-off-by: Martin Storsjö <martin@martin.st>
2022-11-01 15:24:58 +02:00
..
aarch64 sw_scale: Add specializations for hscale 16 to 19 2022-11-01 15:24:58 +02:00
arm sws: rename SwsContext.swscale to convert_unscaled 2021-07-03 15:57:53 +02:00
loongarch swscale/la: Add output_lasx.c file. 2022-09-10 22:56:39 +02:00
ppc sws: rename SwsContext.swscale to convert_unscaled 2021-07-03 15:57:53 +02:00
riscv sws/rgb2rgb: RISC-V 64-bit V packed YUYV/UYVY to planar 4:2:2 2022-09-30 07:25:44 +02:00
tests swscale: introduce isSwappedChroma 2022-01-04 19:39:22 -06:00
x86 swscale/x86/rgb_2_rgb: Empty MMX state in ff_shuffle_bytes_2103_mmxext 2022-08-23 12:21:00 +02:00
alphablend.c swscale/alphablend: Fix slice handling 2021-10-03 20:38:29 +02:00
bayer_template.c swscale: do not drop half of bits from 16bit bayer formats 2020-08-08 12:03:42 +02:00
gamma.c swscale: re-enable gamma 2015-09-04 19:00:20 -03:00
half2float.c swscale/input: add rgbaf16 input support 2022-08-19 22:09:36 +02:00
hscale_fast_bilinear.c sws: Move fast bilinear C code into seperate file 2014-07-19 05:36:26 +02:00
hscale.c swscale: add opaque parameter to input functions 2022-08-19 22:09:36 +02:00
input.c swscale/input: Avoid calls to av_pix_fmt_desc_get() 2022-09-19 23:40:41 +02:00
libswscale.v build: Change structure of the linker version script templates 2016-05-29 16:43:11 +02:00
log2_tab.c lsws: duplicate ff_log2_tab 2014-08-12 20:52:21 +02:00
Makefile swscale/input: add rgbaf16 input support 2022-08-19 22:09:36 +02:00
options.c Remove unnecessary libavutil/(avutil|common|internal).h inclusions 2022-02-24 12:56:49 +01:00
output.c swscale/output: Don't call av_pix_fmt_desc_get() in a loop 2022-09-19 23:40:41 +02:00
rgb2rgb_template.c swscale/rgb2rgb_template: use shuffle macro on big-endian arches 2020-12-12 23:07:22 -05:00
rgb2rgb.c sws/rgb2rgb: RISC-V V shuffle_bytes_xxxx functions 2022-09-30 07:24:09 +02:00
rgb2rgb.h sws/rgb2rgb: RISC-V V shuffle_bytes_xxxx functions 2022-09-30 07:24:09 +02:00
slice.c swscale/input: add rgbaf16 input support 2022-08-19 22:09:36 +02:00
swscale_internal.h swscale/la: Optimize hscale functions with lasx. 2022-09-10 22:56:38 +02:00
swscale_unscaled.c libswscale: force a minimum size of the slide for bayer sources 2022-10-14 12:19:13 +02:00
swscale.c swscale/la: Optimize hscale functions with lasx. 2022-09-10 22:56:38 +02:00
swscale.h swscale: document some missing arguments 2022-10-17 09:56:47 +02:00
swscaleres.rc Add Windows resource file support for shared libraries 2013-12-05 23:42:07 +01:00
utils.c swscale/la: Optimize hscale functions with lasx. 2022-09-10 22:56:38 +02:00
version_major.h libswscale: Split version.h 2022-03-16 14:05:26 +02:00
version.c lib*/version: Move library version functions into files of their own 2022-05-10 06:49:32 +02:00
version.h swscale/output: add support for Y210LE and Y212LE 2022-09-10 12:29:12 -07:00
vscale.c Replace all occurences of av_mallocz_array() by av_calloc() 2021-09-20 01:03:52 +02:00
yuv2rgb.c swscale/la: Add yuv2rgb_lasx.c and rgb2rgb_lasx.c files 2022-09-10 22:56:38 +02:00