1
0
mirror of https://github.com/FFmpeg/FFmpeg.git synced 2024-12-07 11:13:41 +02:00
Commit Graph

115846 Commits

Author SHA1 Message Date
Anton Khirnov
d86ac94df2 lavc/hevcdec: output RASL frames based on the value of no_rasl_output_flag
Instead of an ad-hoc scheme. Also, combine skipping RASL frames with
skip_frame handling - current code seems flawed as it only executes for
the first slice of a RASL frame and unnecessarily unsets is_decoded,
which should not be set at this point anyway..

Some RASL frames in fate-hevc-afd-tc-sei that were previously discarded
are now output.
2024-06-11 17:39:35 +02:00
Anton Khirnov
3115c84015 lavc/hevcdec: only set no_rasl_output_flag for IRAP frames
Its meaning is only specified for IRAP frames.

As it's currently never used otherwise, this should not change decoder
behaviour, but will be useful in future commits.
2024-06-11 17:39:35 +02:00
Anton Khirnov
381b70e173 lavc/hevcdec: do not pass HEVCContext to ff_hevc_frame_nb_refs()
Pass the only things required from it - slice header and PPS -
explicitly.

Will be useful in the following commits to avoid mofiying HEVCContext in
hls_slice_header().
2024-06-11 17:39:35 +02:00
Anton Khirnov
07eb60c0da lavc/hevcdec: only call export_stream_params_from_sei() once per frame
Not once per each slice header, as it makes no sense and may cause races
with frame threading.
2024-06-11 17:39:35 +02:00
Anton Khirnov
01b379a93e lavc/hevcdec: move pocTid0 computation to hevc_frame_start()
It is only done once per frame. Also, rename the variable to poc_tid0 to
be consistent with our naming conventions.
2024-06-11 17:39:35 +02:00
Anton Khirnov
5e438511ab lavc/hevcdec: do not pass HEVCContext to decode_lt_rps()
Pass the two numbers needed from it explicitly.

Makes it clear that HEVCContext is not modified by this function.
2024-06-11 17:39:35 +02:00
Anton Khirnov
0892ec947c lavc/hevcdec: pass SliceHeader explicitly to pred_weight_table()
And replace the HEVCContext* parameter by void *logctx.

Makes it clear that only SliceHeader is modified by this function.
2024-06-11 17:39:35 +02:00
Anton Khirnov
90fc331b0f lavc/hevcdec: only ignore INVALIDDATA in decode_nal_unit()
All other errors should cause a failure, regardless of the value of
err_recognition. Also, print a warning message when skipping invalid NAL
units.
2024-06-11 17:39:35 +02:00
Anton Khirnov
8eb134f4f9 lavc/hevcdec: drop an always-zero variable 2024-06-11 17:39:35 +02:00
Anton Khirnov
8c8072c29c lavc/hevcdec: move active PPS from HEVCParamSets to HEVCContext
"Currently active PPS" is a property of the decoding process, not of the
list of available parameter sets.
2024-06-11 17:39:34 +02:00
Anton Khirnov
0f47342c12 lavc/hevcdec: stop accessing parameter sets through HEVCParamSets
Instead, accept PPS/SPS as function arguments.

Makes the code shorter and significantly reduces diff in future commits.
2024-06-11 17:39:34 +02:00
Anton Khirnov
38b8ae4112 lavc/hevc/pred: stop accessing parameter sets through HEVCParamSets
Instead, accept PPS/SPS as function arguments.

Makes the code shorter and significantly reduces diff in future commits.
2024-06-11 17:39:34 +02:00
Anton Khirnov
d0868d70ea lavc/hevc/cabac: stop accessing parameter sets through HEVCParamSets
Instead, accept PPS/SPS as function arguments.

Makes the code shorter and significantly reduces diff in future commits.
2024-06-11 17:39:34 +02:00
Anton Khirnov
b38aecffec lavc/hevc/filter: stop accessing parameter sets through HEVCParamSets
Instead, accept PPS as a function argument and retrieve SPS through it.

Makes the code shorter and significantly reduces diff in future commits.
2024-06-11 17:39:34 +02:00
Anton Khirnov
fb873a05b3 lavc/hevc/mvs: stop accessing parameter sets through HEVCParamSets
Instead, accept PPS as a function argument and retrieve SPS through it.

Makes the code shorter and significantly reduces diff in future commits.
2024-06-11 17:39:34 +02:00
Anton Khirnov
6ddba110eb lavc/hevc/parser: stop using HEVCParamSets.[psv]ps
The parser does not need to preserve these between frames.
2024-06-11 17:39:34 +02:00
Anton Khirnov
2e46d68f55 lavc/hevc_ps: make SPS hold a reference to its VPS
SPS and its dependent PPSes depend on, and are parsed for, specific VPS data.

This will be useful in following commits.
2024-06-11 17:39:34 +02:00
Anton Khirnov
c879165b39 lavc/hevc_ps: make PPS hold a reference to its SPS
PPS depends on, and is parsed for, specific SPS data.

This will be useful in following commits.
2024-06-11 17:39:34 +02:00
Anton Khirnov
e12fd62d1d lavc/hevcdec: drop a redundant assignment in hevc_decode_frame()
The exact same code is executed at the beginning of decode_nal_units()
2024-06-11 17:39:34 +02:00
Anton Khirnov
a82f2b0924 lavc/hevcdec: simplify condition 2024-06-11 17:39:34 +02:00
Anton Khirnov
0407556716 lavc/hevcdec: do not free SliceHeader arrays in pic_arrays_free()
SliceHeader.{entry_point_offset,size,offset} are not derived from frame
size and do not need to be freed here.
2024-06-11 17:39:34 +02:00
sfan5
0455a62d84 lavf/tls_mbedtls: handle session ticket error code as no-op
When TLSv1.3 and session tickets are enabled mbedtls_ssl_read()
will return an error code to inform about a received session ticket.
This can simply be handled like EAGAIN instead of errornously
aborting the connection.

ref: https://github.com/Mbed-TLS/mbedtls/issues/8749
Signed-off-by: Anton Khirnov <anton@khirnov.net>
2024-06-11 17:00:35 +02:00
sfan5
1b1e9cadc5 lavf/tls_mbedtls: fix handling of certification validation failures
We manually check the verification status after the handshake has completed
using mbedtls_ssl_get_verify_result(). However with VERIFY_REQUIRED
mbedtls_ssl_handshake() already returns an error, so this code is never reached.
Fix that by using VERIFY_OPTIONAL, which performs the verification but
does not abort the handshake.

Signed-off-by: Anton Khirnov <anton@khirnov.net>
2024-06-11 16:58:22 +02:00
sfan5
827578ca76 lavf/tls_mbedtls: hook up debug message callback
Unfortunately this won't work out-of-the-box because mbedTLS
only provides a global (not per-context) debug toggle.

Signed-off-by: Anton Khirnov <anton@khirnov.net>
2024-06-11 16:58:15 +02:00
sfan5
807d1505bf lavf/tls_mbedtls: add missing call to psa_crypto_init
This is mandatory depending on configuration or at least with mbedTLS 3.6.0.

Signed-off-by: Anton Khirnov <anton@khirnov.net>
2024-06-11 16:35:46 +02:00
sfan5
63b6620ad3 lavf/tls_mbedtls: handle more error codes for human-readable messages
Signed-off-by: Anton Khirnov <anton@khirnov.net>
2024-06-11 16:35:31 +02:00
Rémi Denis-Courmont
b6f37ffba7 lavc/vc1dsp: match C block layout in inv_trans_4x8_rvv
Although checkasm does not verify this, the decoder requires that the
transform updates the input block exactly like the C code does.

This fixes vc1-ism, vc1_ilaced_twomv, vc1_sa00040, vc1_sa10091,
vc1_sa10143, vc1_sa20021, vc1test_smm0005 and wmv3-drm-dec tests.
2024-06-11 17:15:09 +03:00
Rémi Denis-Courmont
6c05069e68 lavc/vc1dsp: match C block layout in inv_trans_4x4_rvv
Although checkasm does not verify this, the decoder requires that the
transform updates the input block exactly like the C code does.

This fixes vc1-ism, vc1_ilaced_twomv, vc1_sa00040, vc1_sa10091,
vc1_sa10143, vc1_sa20021, vc1test_smm0005 and wmv3-drm-dec tests.
2024-06-11 17:15:09 +03:00
Andreas Rheinhardt
6ae1a337f2 fftools/ffmpeg_mux_init: Fix leak when using non-encoding option
Reviewed-by: Anton Khirnov <anton@khirnov.net>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2024-06-11 14:32:25 +02:00
Andreas Rheinhardt
8754c9bd82 configure: Disable DNN without backend
The DNN filters are useless without a backend.
This will also "fix" Coverity issues #1598288 and #1601718.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2024-06-11 19:36:54 +08:00
Andreas Rheinhardt
c84e40d9e6 fftools/ffmpeg_mux_init: Return error upon error
Currently it may return an uninitialized value.
Introduced in 840f2bc18e.
Fixes Coverity issue #1603565.

Reviewed-by: Anton Khirnov <anton@khirnov.net>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2024-06-11 08:16:42 +02:00
Andreas Rheinhardt
a0ff31e740 avcodec/vvc/inter: Don't return void
Returning a void is not allowed by the spec. Just return instead.

Reviewed-by: Nuo Mi <nuomi2021@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2024-06-11 02:43:14 +02:00
Rémi Denis-Courmont
417957ec5e sws/range_convert: R-V V to/from JPEG
C908   X60
chrRangeFromJpeg_8_c:          2.7    2.5
chrRangeFromJpeg_8_rvv_i32:    1.7    1.5
chrRangeFromJpeg_24_c:         7.5    6.7
chrRangeFromJpeg_24_rvv_i32:   1.7    1.5
chrRangeFromJpeg_128_c:       55.2   34.7
chrRangeFromJpeg_128_rvv_i32:  6.5    3.0
chrRangeFromJpeg_144_c:       44.0   39.2
chrRangeFromJpeg_144_rvv_i32:  7.7    4.5
chrRangeFromJpeg_256_c:       78.2   69.5
chrRangeFromJpeg_256_rvv_i32: 12.2    6.0
chrRangeFromJpeg_512_c:      172.2  138.5
chrRangeFromJpeg_512_rvv_i32: 24.5   11.7
chrRangeToJpeg_8_c:            4.7    4.2
chrRangeToJpeg_8_rvv_i32:      2.0    1.7
chrRangeToJpeg_24_c:          13.7   12.2
chrRangeToJpeg_24_rvv_i32:     2.0    1.5
chrRangeToJpeg_128_c:         72.0   63.7
chrRangeToJpeg_128_rvv_i32:    6.7    3.2
chrRangeToJpeg_144_c:         80.7   71.7
chrRangeToJpeg_144_rvv_i32:    8.5    4.7
chrRangeToJpeg_256_c:        143.2  127.2
chrRangeToJpeg_256_rvv_i32:   13.5    6.5
chrRangeToJpeg_512_c:        285.7  253.7
chrRangeToJpeg_512_rvv_i32:   27.0   13.0
lumRangeFromJpeg_8_c:          1.7    1.5
lumRangeFromJpeg_8_rvv_i32:    1.2    1.0
lumRangeFromJpeg_24_c:         4.2    3.7
lumRangeFromJpeg_24_rvv_i32:   1.2    1.0
lumRangeFromJpeg_128_c:       21.7   19.2
lumRangeFromJpeg_128_rvv_i32:  3.7    1.7
lumRangeFromJpeg_144_c:       24.7   22.0
lumRangeFromJpeg_144_rvv_i32:  4.7    2.7
lumRangeFromJpeg_256_c:       43.7   39.0
lumRangeFromJpeg_256_rvv_i32:  7.5    3.2
lumRangeFromJpeg_512_c:       87.0   77.2
lumRangeFromJpeg_512_rvv_i32: 14.5    6.7
lumRangeToJpeg_8_c:            2.7    2.2
lumRangeToJpeg_8_rvv_i32:      1.0    1.0
lumRangeToJpeg_24_c:           7.2    6.5
lumRangeToJpeg_24_rvv_i32:     1.2    1.0
lumRangeToJpeg_128_c:         37.7   33.7
lumRangeToJpeg_128_rvv_i32:    3.7    2.0
lumRangeToJpeg_144_c:         42.5   37.7
lumRangeToJpeg_144_rvv_i32:    4.7    2.7
lumRangeToJpeg_256_c:         75.0   66.7
lumRangeToJpeg_256_rvv_i32:    7.5    3.5
lumRangeToJpeg_512_c:        149.5  133.0
lumRangeToJpeg_512_rvv_i32:   14.7    7.0
2024-06-10 22:48:52 +03:00
Zhao Zhili
9dac8495b0 swscale/aarch64: Add rgb24 to yuv implementation
Test on Apple M1:

rgb24_to_uv_8_c: 0.0
rgb24_to_uv_8_neon: 0.2
rgb24_to_uv_128_c: 1.0
rgb24_to_uv_128_neon: 0.5
rgb24_to_uv_1080_c: 7.0
rgb24_to_uv_1080_neon: 5.7
rgb24_to_uv_1920_c: 12.5
rgb24_to_uv_1920_neon: 9.5
rgb24_to_uv_half_8_c: 0.2
rgb24_to_uv_half_8_neon: 0.2
rgb24_to_uv_half_128_c: 1.0
rgb24_to_uv_half_128_neon: 0.5
rgb24_to_uv_half_1080_c: 6.2
rgb24_to_uv_half_1080_neon: 3.0
rgb24_to_uv_half_1920_c: 11.2
rgb24_to_uv_half_1920_neon: 5.2
rgb24_to_y_8_c: 0.2
rgb24_to_y_8_neon: 0.0
rgb24_to_y_128_c: 0.5
rgb24_to_y_128_neon: 0.5
rgb24_to_y_1080_c: 4.7
rgb24_to_y_1080_neon: 3.2
rgb24_to_y_1920_c: 8.0
rgb24_to_y_1920_neon: 5.7

On Pixel 6:

rgb24_to_uv_8_c: 30.7
rgb24_to_uv_8_neon: 56.9
rgb24_to_uv_128_c: 213.9
rgb24_to_uv_128_neon: 173.2
rgb24_to_uv_1080_c: 1649.9
rgb24_to_uv_1080_neon: 1424.4
rgb24_to_uv_1920_c: 2907.9
rgb24_to_uv_1920_neon: 2480.7
rgb24_to_uv_half_8_c: 36.2
rgb24_to_uv_half_8_neon: 33.4
rgb24_to_uv_half_128_c: 167.9
rgb24_to_uv_half_128_neon: 99.4
rgb24_to_uv_half_1080_c: 1293.9
rgb24_to_uv_half_1080_neon: 778.7
rgb24_to_uv_half_1920_c: 2292.7
rgb24_to_uv_half_1920_neon: 1328.7
rgb24_to_y_8_c: 19.7
rgb24_to_y_8_neon: 27.7
rgb24_to_y_128_c: 129.9
rgb24_to_y_128_neon: 96.7
rgb24_to_y_1080_c: 995.4
rgb24_to_y_1080_neon: 767.7
rgb24_to_y_1920_c: 1747.4
rgb24_to_y_1920_neon: 1337.2

Note both tests use clang as compiler, which has vectorization
enabled by default with -O3.

Reviewed-by: Rémi Denis-Courmont <remi@remlab.net>
Reviewed-by: Martin Storsjö <martin@martin.st>
Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
2024-06-11 01:12:09 +08:00
Zhao Zhili
b1240c983f tests/checkasm: Fix build error when enable linux perf on Android
B0 is defined by system header, see f0f596dbc6 for ref.

Reviewed-by: Martin Storsjö <martin@martin.st>
Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
2024-06-11 01:11:46 +08:00
Zhao Zhili
33e4cc963d avutil/timer: Add clock_gettime as a fallback of AV_READ_TIME
Reviewed-by: Rémi Denis-Courmont <remi@remlab.net>
Reviewed-by: Martin Storsjö <martin@martin.st>
Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
2024-06-11 01:11:36 +08:00
Zhao Zhili
6a18c0bc87 avutil/aarch64: Skip define AV_READ_TIME for apple
It will fallback to mach_absolute_time inside libavutil/timer.h

Reviewed-by: Martin Storsjö <martin@martin.st>
Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
2024-06-11 01:10:42 +08:00
James Almer
94f2274a8b x86/aacencdsp: fix ff_aac_quantize_bands_avx on unix64 ABI
Signed-off-by: James Almer <jamrial@gmail.com>
2024-06-09 17:16:02 -03:00
James Almer
17c3cc5bb6 swscale/x86/rgb_2_rgb: add missing wrap to ff_uyvytoyuv422_avx2
Fixes old yasm.

Signed-off-by: James Almer <jamrial@gmail.com>
2024-06-09 16:04:36 -03:00
James Almer
03546f49a3 swscale/x86/rgb2rgb: add missing wrap for ff_uyvytoyuv422_avx2
Signed-off-by: James Almer <jamrial@gmail.com>
2024-06-09 15:56:52 -03:00
James Almer
287d139b77 checkasm/sw_rgb: fix alignment of buffers for rgb_to_yuv tests
src is apparently not guaranteed to be >8 byte aligned, but align to 16
nonetheless as the x86 asm will do unaligned loads anyway.
dst is guaranteed to be 32 byte aligned for the Y plane, but 16 byte for UV.

Signed-off-by: James Almer <jamrial@gmail.com>
2024-06-09 14:12:51 -03:00
James Almer
e8cef5e152 swscale/x86/rgb2rgb: remove mmxext version of shuffle_bytes_2103
Signed-off-by: James Almer <jamrial@gmail.com>
2024-06-09 13:43:11 -03:00
James Almer
c578bb9864 swscale/x86/input: add AVX2 optimized uyvytoyuv422
uyvytoyuv422_c: 23991.8
uyvytoyuv422_sse2: 2817.8
uyvytoyuv422_avx: 2819.3
uyvytoyuv422_avx2: 1972.3

Signed-off-by: James Almer <jamrial@gmail.com>
2024-06-09 13:43:11 -03:00
James Almer
e9cfd53257 swscale/x86/input: add AVX2 optimized RGB32 to YUV functions
abgr_to_uv_8_c: 43.3
abgr_to_uv_8_sse2: 14.3
abgr_to_uv_8_avx: 15.3
abgr_to_uv_8_avx2: 18.8
abgr_to_uv_128_c: 650.3
abgr_to_uv_128_sse2: 110.8
abgr_to_uv_128_avx: 112.3
abgr_to_uv_128_avx2: 64.8
abgr_to_uv_1080_c: 5456.3
abgr_to_uv_1080_sse2: 888.8
abgr_to_uv_1080_avx: 900.8
abgr_to_uv_1080_avx2: 518.3
abgr_to_uv_1920_c: 9692.3
abgr_to_uv_1920_sse2: 1593.8
abgr_to_uv_1920_avx: 1613.3
abgr_to_uv_1920_avx2: 864.8
abgr_to_y_8_c: 23.3
abgr_to_y_8_sse2: 12.8
abgr_to_y_8_avx: 13.3
abgr_to_y_8_avx2: 17.3
abgr_to_y_128_c: 308.3
abgr_to_y_128_sse2: 67.3
abgr_to_y_128_avx: 66.8
abgr_to_y_128_avx2: 44.8
abgr_to_y_1080_c: 2371.3
abgr_to_y_1080_sse2: 512.8
abgr_to_y_1080_avx: 505.8
abgr_to_y_1080_avx2: 314.3
abgr_to_y_1920_c: 4177.3
abgr_to_y_1920_sse2: 915.8
abgr_to_y_1920_avx: 926.8
abgr_to_y_1920_avx2: 519.3
bgra_to_uv_8_c: 37.3
bgra_to_uv_8_sse2: 13.3
bgra_to_uv_8_avx: 14.8
bgra_to_uv_8_avx2: 19.8
bgra_to_uv_128_c: 563.8
bgra_to_uv_128_sse2: 111.3
bgra_to_uv_128_avx: 112.3
bgra_to_uv_128_avx2: 64.8
bgra_to_uv_1080_c: 4691.8
bgra_to_uv_1080_sse2: 893.8
bgra_to_uv_1080_avx: 899.8
bgra_to_uv_1080_avx2: 517.8
bgra_to_uv_1920_c: 8332.8
bgra_to_uv_1920_sse2: 1590.8
bgra_to_uv_1920_avx: 1605.8
bgra_to_uv_1920_avx2: 867.3
bgra_to_y_8_c: 22.3
bgra_to_y_8_sse2: 12.8
bgra_to_y_8_avx: 12.8
bgra_to_y_8_avx2: 17.3
bgra_to_y_128_c: 291.3
bgra_to_y_128_sse2: 67.8
bgra_to_y_128_avx: 69.3
bgra_to_y_128_avx2: 45.3
bgra_to_y_1080_c: 2357.3
bgra_to_y_1080_sse2: 508.3
bgra_to_y_1080_avx: 518.3
bgra_to_y_1080_avx2: 399.8
bgra_to_y_1920_c: 4202.8
bgra_to_y_1920_sse2: 906.8
bgra_to_y_1920_avx: 907.3
bgra_to_y_1920_avx2: 526.3

Signed-off-by: James Almer <jamrial@gmail.com>
2024-06-09 13:43:11 -03:00
James Almer
d5fe99dc5f swscale/x86/input: add AVX2 optimized RGB24 to YUV functions
rgb24_to_uv_8_c: 39.3
rgb24_to_uv_8_sse2: 14.3
rgb24_to_uv_8_ssse3: 13.3
rgb24_to_uv_8_avx: 12.8
rgb24_to_uv_8_avx2: 14.3
rgb24_to_uv_128_c: 582.8
rgb24_to_uv_128_sse2: 127.3
rgb24_to_uv_128_ssse3: 107.3
rgb24_to_uv_128_avx: 111.3
rgb24_to_uv_128_avx2: 62.3
rgb24_to_uv_1080_c: 4981.3
rgb24_to_uv_1080_sse2: 1048.3
rgb24_to_uv_1080_ssse3: 876.8
rgb24_to_uv_1080_avx: 887.8
rgb24_to_uv_1080_avx2: 492.3
rgb24_to_uv_1280_c: 5906.8
rgb24_to_uv_1280_sse2: 1263.3
rgb24_to_uv_1280_ssse3: 1048.3
rgb24_to_uv_1280_avx: 1045.8
rgb24_to_uv_1280_avx2: 579.8
rgb24_to_uv_1920_c: 8665.3
rgb24_to_uv_1920_sse2: 1888.8
rgb24_to_uv_1920_ssse3: 1571.8
rgb24_to_uv_1920_avx: 1558.8
rgb24_to_uv_1920_avx2: 869.3
rgb24_to_y_8_c: 20.3
rgb24_to_y_8_sse2: 11.8
rgb24_to_y_8_ssse3: 10.3
rgb24_to_y_8_avx: 10.3
rgb24_to_y_8_avx2: 10.8
rgb24_to_y_128_c: 284.8
rgb24_to_y_128_sse2: 83.3
rgb24_to_y_128_ssse3: 66.8
rgb24_to_y_128_avx: 64.8
rgb24_to_y_128_avx2: 39.3
rgb24_to_y_1080_c: 2451.3
rgb24_to_y_1080_sse2: 696.3
rgb24_to_y_1080_ssse3: 516.8
rgb24_to_y_1080_avx: 518.8
rgb24_to_y_1080_avx2: 301.8
rgb24_to_y_1280_c: 2892.8
rgb24_to_y_1280_sse2: 816.8
rgb24_to_y_1280_ssse3: 623.3
rgb24_to_y_1280_avx: 616.3
rgb24_to_y_1280_avx2: 350.8
rgb24_to_y_1920_c: 4338.8
rgb24_to_y_1920_sse2: 1210.8
rgb24_to_y_1920_ssse3: 928.3
rgb24_to_y_1920_avx: 920.3
rgb24_to_y_1920_avx2: 534.8

Signed-off-by: James Almer <jamrial@gmail.com>
2024-06-09 13:42:09 -03:00
James Almer
6743c2fc6a checkasm/sw_rgb: test rgb32/rgb32_1 to yuv
Test all four pixel formats, but only bench the two native endian ones for a
given target.

Signed-off-by: James Almer <jamrial@gmail.com>
2024-06-09 12:29:49 -03:00
James Almer
91b9af0058 x86/aacencdsp: add AVX version of quantize_bands
quant_bands_signed_c: 1928.0
quant_bands_signed_sse2: 406.0
quant_bands_signed_avx: 207.0
quant_bands_unsigned_c: 1702.0
quant_bands_unsigned_sse2: 404.0
quant_bands_unsigned_avx: 209.0

Signed-off-by: James Almer <jamrial@gmail.com>
2024-06-09 12:29:49 -03:00
Rémi Denis-Courmont
7a3369398f sws/input: R-V V 32-bit RGB to halved UV
T-Head C908:
abgr_to_uv_half_8_c:            2.2
abgr_to_uv_half_8_rvv_i32:      3.5
abgr_to_uv_half_128_c:         44.0
abgr_to_uv_half_128_rvv_i32:   13.0
abgr_to_uv_half_1080_c:       245.0
abgr_to_uv_half_1080_rvv_i32: 107.2
abgr_to_uv_half_1920_c:       406.2
abgr_to_uv_half_1920_rvv_i32: 188.7
bgra_to_uv_half_8_c:            2.2
bgra_to_uv_half_8_rvv_i32:      3.5
bgra_to_uv_half_128_c:         26.5
bgra_to_uv_half_128_rvv_i32:   13.0
bgra_to_uv_half_1080_c:       219.7
bgra_to_uv_half_1080_rvv_i32: 107.0
bgra_to_uv_half_1920_c:       406.7
bgra_to_uv_half_1920_rvv_i32: 188.7

SpacemiT X60:
abgr_to_uv_half_8_c:           2.2
abgr_to_uv_half_8_rvv_i32:     3.0
abgr_to_uv_half_128_c:        28.2
abgr_to_uv_half_128_rvv_i32:   5.7
abgr_to_uv_half_1080_c:      235.5
abgr_to_uv_half_1080_rvv_i32: 47.7
abgr_to_uv_half_1920_c:      418.2
abgr_to_uv_half_1920_rvv_i32: 84.0
bgra_to_uv_half_8_c:           2.0
bgra_to_uv_half_8_rvv_i32:     3.0
bgra_to_uv_half_128_c:        23.7
bgra_to_uv_half_128_rvv_i32:   5.7
bgra_to_uv_half_1080_c:      195.5
bgra_to_uv_half_1080_rvv_i32: 47.7
bgra_to_uv_half_1920_c:      346.5
bgra_to_uv_half_1920_rvv_i32: 84.0
2024-06-09 14:33:04 +03:00
Rémi Denis-Courmont
e2f069905e sws/input: R-V V 32-bit RGB to UV 2024-06-09 14:33:04 +03:00
Rémi Denis-Courmont
f5555cb106 sws/input: R-V V 32-bit RGB to Y
T-Head C908:
abgr_to_y_8_c:            2.5
abgr_to_y_8_rvv_i32:      2.2
abgr_to_y_128_c:         37.0
abgr_to_y_128_rvv_i32:    8.5
abgr_to_y_1080_c:       327.0
abgr_to_y_1080_rvv_i32:  69.5
abgr_to_y_1920_c:       552.0
abgr_to_y_1920_rvv_i32: 122.2
bgra_to_y_8_c:            2.5
bgra_to_y_8_rvv_i32:      2.2
bgra_to_y_128_c:         37.2
bgra_to_y_128_rvv_i32:    8.5
bgra_to_y_1080_c:       310.2
bgra_to_y_1080_rvv_i32:  69.5
bgra_to_y_1920_c:       568.2
bgra_to_y_1920_rvv_i32: 122.5

SpacemiT X60:
abgr_to_y_8_c:            2.5
abgr_to_y_8_rvv_i32:      2.0
abgr_to_y_128_c:         33.0
abgr_to_y_128_rvv_i32:    3.7
abgr_to_y_1080_c:       276.0
abgr_to_y_1080_rvv_i32:  31.5
abgr_to_y_1920_c:       493.7
abgr_to_y_1920_rvv_i32:  55.5
bgra_to_y_8_c:            2.2
bgra_to_y_8_rvv_i32:      2.0
bgra_to_y_128_c:         33.0
bgra_to_y_128_rvv_i32:    3.7
bgra_to_y_1080_c:       276.0
bgra_to_y_1080_rvv_i32:  31.5
bgra_to_y_1920_c:       490.7
bgra_to_y_1920_rvv_i32:  55.5
2024-06-09 14:33:04 +03:00