FFmpeg/libswscale at fe100bc556d7b25d301ed65f7ae7a74880770f09

virtualenv/FFmpeg

mirror of https://github.com/FFmpeg/FFmpeg.git synced 2025-03-03 14:32:16 +02:00

Martin Storsjö 70db14376c swscale: aarch64: Optimize the final summation in the hscale routine

Before:                     Cortex A53      A72      A73  Graviton 2  Graviton 3
hscale_8_to_15_width8_neon:     8273.0   4602.5   4289.5      2429.7      1629.1
hscale_8_to_15_width16_neon:   12405.7   6803.0   6359.0      3549.0      2378.4
hscale_8_to_15_width32_neon:   21258.7  11491.7  11469.2      5797.2      3919.6
hscale_8_to_15_width40_neon:   25652.0  14173.7  12488.2      6893.5      4810.4

After:
hscale_8_to_15_width8_neon:     7633.0   3981.5   3350.2      1980.7      1261.1
hscale_8_to_15_width16_neon:   11666.7   5951.0   5512.0      3080.7      2131.4
hscale_8_to_15_width32_neon:   20900.7  10733.2   9481.7      5275.2      3862.1
hscale_8_to_15_width40_neon:   24826.0  13536.2  11502.0      6397.2      4731.9

Overall, this gives an 8-29% speedup for the smaller filter
sizes and around 1-8% for the larger filter sizes.
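
The percentages can be rechecked from the two tables. The following is a minimal sketch, not part of the commit: it assumes that lower figures are better and that "speedup" means before / after - 1, which the commit message does not state explicitly. The names CPUS, BEFORE, AFTER, speedup_percent and speedups are made up for this illustration.

```python
# Benchmark figures copied verbatim from the commit message.
# Assumption: lower is better (the commit does not name the unit).
CPUS = ["Cortex A53", "A72", "A73", "Graviton 2", "Graviton 3"]

# Keyed by the width in the benchmark name (hscale_8_to_15_width<N>_neon).
BEFORE = {
    8:  [8273.0, 4602.5, 4289.5, 2429.7, 1629.1],
    16: [12405.7, 6803.0, 6359.0, 3549.0, 2378.4],
    32: [21258.7, 11491.7, 11469.2, 5797.2, 3919.6],
    40: [25652.0, 14173.7, 12488.2, 6893.5, 4810.4],
}
AFTER = {
    8:  [7633.0, 3981.5, 3350.2, 1980.7, 1261.1],
    16: [11666.7, 5951.0, 5512.0, 3080.7, 2131.4],
    32: [20900.7, 10733.2, 9481.7, 5275.2, 3862.1],
    40: [24826.0, 13536.2, 11502.0, 6397.2, 4731.9],
}


def speedup_percent(before, after):
    """Speedup in percent, assuming speedup = before / after - 1."""
    return (before / after - 1.0) * 100.0


def speedups(width):
    """Map each CPU name to its speedup in percent for one benchmark width."""
    return {
        cpu: speedup_percent(b, a)
        for cpu, b, a in zip(CPUS, BEFORE[width], AFTER[width])
    }


if __name__ == "__main__":
    for width in sorted(BEFORE):
        row = speedups(width)
        cells = "  ".join(f"{cpu}: {pct:5.1f}%" for cpu, pct in row.items())
        print(f"width{width:<3} {cells}")
```

Under that definition, the width8 row spans roughly the 8-29% range quoted above. Other definitions of speedup (for example 1 - after / before) would give somewhat smaller numbers.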

Inspired by a patch by Jonathan Swinney <jswinney@amazon.com>.

Signed-off-by: Martin Storsjö <martin@martin.st>

2022-04-22 10:49:46 +03:00

swscale: aarch64: Optimize the final summation in the hscale routine

2022-04-22 10:49:46 +03:00

sws: rename SwsContext.swscale to convert_unscaled

2021-07-03 15:57:53 +02:00

sws: rename SwsContext.swscale to convert_unscaled

2021-07-03 15:57:53 +02:00

swscale: introduce isSwappedChroma

2022-01-04 19:39:22 -06:00

swscale/x86/swscale: Remove superfluous and invalid ';'

2022-01-22 17:00:45 +01:00

alphablend.c

swscale/alphablend: Fix slice handling

2021-10-03 20:38:29 +02:00

bayer_template.c

swscale: do not drop half of bits from 16bit bayer formats

2020-08-08 12:03:42 +02:00

gamma.c

swscale: re-enable gamma

2015-09-04 19:00:20 -03:00

hscale_fast_bilinear.c

sws: Move fast bilinear C code into seperate file

2014-07-19 05:36:26 +02:00

hscale.c

avutil: Rename FF_CEIL_COMPAT to AV_CEIL_COMPAT

2016-01-27 16:36:46 +00:00

input.c

Remove unnecessary libavutil/(avutil|common|internal).h inclusions

2022-02-24 12:56:49 +01:00

libswscale.v

build: Change structure of the linker version script templates

2016-05-29 16:43:11 +02:00

log2_tab.c

lsws: duplicate ff_log2_tab

2014-08-12 20:52:21 +02:00

Makefile

libswscale: Split version.h

2022-03-16 14:05:26 +02:00

options.c

Remove unnecessary libavutil/(avutil|common|internal).h inclusions

2022-02-24 12:56:49 +01:00

output.c

swscale/output: use isSwappedChroma

2022-01-04 19:39:22 -06:00

rgb2rgb_template.c

swscale/rgb2rgb_template: use shuffle macro on big-endian arches

2020-12-12 23:07:22 -05:00

rgb2rgb.c

swscale: aarch64: Add a NEON implementation of interleaveBytes

2020-05-15 23:38:17 +03:00

rgb2rgb.h

Remove unnecessary libavutil/(avutil|common|internal).h inclusions

2022-02-24 12:56:49 +01:00

slice.c

Replace all occurences of av_mallocz_array() by av_calloc()

2021-09-20 01:03:52 +02:00

swscale_internal.h

Remove unnecessary libavutil/(avutil|common|internal).h inclusions

2022-02-24 12:56:49 +01:00

swscale_unscaled.c

sws: add a new scaling API

2021-09-06 09:16:52 +02:00

swscale.c

Remove unnecessary libavutil/(avutil|common|internal).h inclusions

2022-02-24 12:56:49 +01:00

swscale.h

Keep including the full version.h when headers are included externally

2022-03-19 00:01:57 +02:00

swscaleres.rc

Add Windows resource file support for shared libraries

2013-12-05 23:42:07 +01:00

utils.c

libswscale: Split version.h

2022-03-16 14:05:26 +02:00

version_major.h

libswscale: Split version.h

2022-03-16 14:05:26 +02:00

version.h

doc: Add an entry to APIchanges about changes to version.h and version_major.h

2022-03-16 14:12:46 +02:00

vscale.c

Replace all occurences of av_mallocz_array() by av_calloc()

2021-09-20 01:03:52 +02:00

yuv2rgb.c

swscale/yuv2rgb: Silence a set-but-unused-variable warning

2021-12-03 16:10:51 +01:00