Anton Khirnov
d86ac94df2
lavc/hevcdec: output RASL frames based on the value of no_rasl_output_flag
...
Instead of an ad-hoc scheme. Also, combine skipping RASL frames with
skip_frame handling - current code seems flawed as it only executes for
the first slice of a RASL frame and unnecessarily unsets is_decoded,
which should not be set at this point anyway.
Some RASL frames in fate-hevc-afd-tc-sei that were previously discarded
are now output.
2024-06-11 17:39:35 +02:00
Anton Khirnov
3115c84015
lavc/hevcdec: only set no_rasl_output_flag for IRAP frames
...
Its meaning is only specified for IRAP frames.
As it's currently never used otherwise, this should not change decoder
behaviour, but will be useful in future commits.
2024-06-11 17:39:35 +02:00
Anton Khirnov
381b70e173
lavc/hevcdec: do not pass HEVCContext to ff_hevc_frame_nb_refs()
...
Pass the only things required from it - slice header and PPS -
explicitly.
Will be useful in the following commits to avoid modifying HEVCContext in
hls_slice_header().
2024-06-11 17:39:35 +02:00
Anton Khirnov
07eb60c0da
lavc/hevcdec: only call export_stream_params_from_sei() once per frame
...
Not once per slice header, as that makes no sense and may cause races
with frame threading.
2024-06-11 17:39:35 +02:00
Anton Khirnov
01b379a93e
lavc/hevcdec: move pocTid0 computation to hevc_frame_start()
...
It is only done once per frame. Also, rename the variable to poc_tid0 to
be consistent with our naming conventions.
2024-06-11 17:39:35 +02:00
Anton Khirnov
5e438511ab
lavc/hevcdec: do not pass HEVCContext to decode_lt_rps()
...
Pass the two numbers needed from it explicitly.
Makes it clear that HEVCContext is not modified by this function.
2024-06-11 17:39:35 +02:00
Anton Khirnov
0892ec947c
lavc/hevcdec: pass SliceHeader explicitly to pred_weight_table()
...
And replace the HEVCContext* parameter by void *logctx.
Makes it clear that only SliceHeader is modified by this function.
2024-06-11 17:39:35 +02:00
Anton Khirnov
90fc331b0f
lavc/hevcdec: only ignore INVALIDDATA in decode_nal_unit()
...
All other errors should cause a failure, regardless of the value of
err_recognition. Also, print a warning message when skipping invalid NAL
units.
2024-06-11 17:39:35 +02:00
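The error policy described in 90fc331b0f can be sketched as below. This is a minimal illustration, not the decoder's code: the SKETCH_* constants and filter_nal_error() are hypothetical stand-ins, and it assumes that invalid data is ignored only when an "explode"-style bit is absent from err_recognition.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-ins; the real error codes and flags are defined
 * by libavutil and libavcodec, not here. */
#define SKETCH_INVALIDDATA (-1)
#define SKETCH_ENOMEM      (-2)
#define SKETCH_EF_EXPLODE  (1 << 3)

/* Decide what to propagate after decoding one NAL unit returned `ret`.
 * Only invalid-data errors may be swallowed, and only when the caller
 * did not ask to fail on every error. Everything else is a failure. */
static int filter_nal_error(int ret, int err_recognition)
{
    if (ret >= 0)
        return ret;

    if (ret == SKETCH_INVALIDDATA && !(err_recognition & SKETCH_EF_EXPLODE)) {
        /* Warn so that skipped units are visible to the user. */
        fprintf(stderr, "Skipping invalid NAL unit\n");
        return 0;
    }

    return ret; /* e.g. out of memory: never ignored */
}
```

With this shape, an allocation failure reaches the caller even when error recognition is lenient.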
Anton Khirnov
8eb134f4f9
lavc/hevcdec: drop an always-zero variable
2024-06-11 17:39:35 +02:00
Anton Khirnov
8c8072c29c
lavc/hevcdec: move active PPS from HEVCParamSets to HEVCContext
...
"Currently active PPS" is a property of the decoding process, not of the
list of available parameter sets.
2024-06-11 17:39:34 +02:00
Anton Khirnov
0f47342c12
lavc/hevcdec: stop accessing parameter sets through HEVCParamSets
...
Instead, accept PPS/SPS as function arguments.
Makes the code shorter and significantly reduces diff in future commits.
2024-06-11 17:39:34 +02:00
Anton Khirnov
38b8ae4112
lavc/hevc/pred: stop accessing parameter sets through HEVCParamSets
...
Instead, accept PPS/SPS as function arguments.
Makes the code shorter and significantly reduces diff in future commits.
2024-06-11 17:39:34 +02:00
Anton Khirnov
d0868d70ea
lavc/hevc/cabac: stop accessing parameter sets through HEVCParamSets
...
Instead, accept PPS/SPS as function arguments.
Makes the code shorter and significantly reduces diff in future commits.
2024-06-11 17:39:34 +02:00
Anton Khirnov
b38aecffec
lavc/hevc/filter: stop accessing parameter sets through HEVCParamSets
...
Instead, accept PPS as a function argument and retrieve SPS through it.
Makes the code shorter and significantly reduces diff in future commits.
2024-06-11 17:39:34 +02:00
Anton Khirnov
fb873a05b3
lavc/hevc/mvs: stop accessing parameter sets through HEVCParamSets
...
Instead, accept PPS as a function argument and retrieve SPS through it.
Makes the code shorter and significantly reduces diff in future commits.
2024-06-11 17:39:34 +02:00
Anton Khirnov
6ddba110eb
lavc/hevc/parser: stop using HEVCParamSets.[psv]ps
...
The parser does not need to preserve these between frames.
2024-06-11 17:39:34 +02:00
Anton Khirnov
2e46d68f55
lavc/hevc_ps: make SPS hold a reference to its VPS
...
SPS and its dependent PPSes depend on, and are parsed for, specific VPS data.
This will be useful in following commits.
2024-06-11 17:39:34 +02:00
Anton Khirnov
c879165b39
lavc/hevc_ps: make PPS hold a reference to its SPS
...
PPS depends on, and is parsed for, specific SPS data.
This will be useful in following commits.
2024-06-11 17:39:34 +02:00
Anton Khirnov
e12fd62d1d
lavc/hevcdec: drop a redundant assignment in hevc_decode_frame()
...
The exact same code is executed at the beginning of decode_nal_units().
2024-06-11 17:39:34 +02:00
Anton Khirnov
a82f2b0924
lavc/hevcdec: simplify condition
2024-06-11 17:39:34 +02:00
Anton Khirnov
0407556716
lavc/hevcdec: do not free SliceHeader arrays in pic_arrays_free()
...
SliceHeader.{entry_point_offset,size,offset} are not derived from frame
size and do not need to be freed here.
2024-06-11 17:39:34 +02:00
sfan5
0455a62d84
lavf/tls_mbedtls: handle session ticket error code as no-op
...
When TLSv1.3 and session tickets are enabled, mbedtls_ssl_read()
will return an error code to inform about a received session ticket.
This can simply be handled like EAGAIN instead of erroneously
aborting the connection.
ref: https://github.com/Mbed-TLS/mbedtls/issues/8749
Signed-off-by: Anton Khirnov <anton@khirnov.net>
2024-06-11 17:00:35 +02:00
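The idea in 0455a62d84 (treat the session-ticket notification like a retryable condition) can be sketched as follows. The SKETCH_* codes and map_read_result() are hypothetical placeholders chosen for illustration; real code would compare against the constants from the mbedTLS headers and return the project's own error codes.

```c
#include <assert.h>

/* Placeholder codes for illustration only; these are NOT the real
 * mbedTLS or errno values. */
enum {
    SKETCH_WANT_READ          = -1,  /* no data yet, try again */
    SKETCH_NEW_SESSION_TICKET = -2,  /* informational, not a failure */
    SKETCH_FATAL_ALERT        = -3,  /* a genuine TLS failure */
    SKETCH_EAGAIN             = -11, /* what the caller sees as "retry" */
    SKETCH_EIO                = -5   /* what the caller sees as "failed" */
};

/* Map the result of a TLS read call to what the I/O layer reports. */
static int map_read_result(int ret)
{
    if (ret >= 0)
        return ret; /* number of bytes read */

    switch (ret) {
    case SKETCH_WANT_READ:
    case SKETCH_NEW_SESSION_TICKET:
        /* Both simply mean "call read again later". */
        return SKETCH_EAGAIN;
    default:
        return SKETCH_EIO;
    }
}
```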
sfan5
1b1e9cadc5
lavf/tls_mbedtls: fix handling of certification validation failures
...
We manually check the verification status after the handshake has completed
using mbedtls_ssl_get_verify_result(). However, with VERIFY_REQUIRED,
mbedtls_ssl_handshake() already returns an error, so this code is never reached.
Fix that by using VERIFY_OPTIONAL, which performs the verification but
does not abort the handshake.
Signed-off-by: Anton Khirnov <anton@khirnov.net>
2024-06-11 16:58:22 +02:00
sfan5
827578ca76
lavf/tls_mbedtls: hook up debug message callback
...
Unfortunately this won't work out-of-the-box because mbedTLS
only provides a global (not per-context) debug toggle.
Signed-off-by: Anton Khirnov <anton@khirnov.net>
2024-06-11 16:58:15 +02:00
sfan5
807d1505bf
lavf/tls_mbedtls: add missing call to psa_crypto_init
...
This is mandatory depending on configuration or at least with mbedTLS 3.6.0.
Signed-off-by: Anton Khirnov <anton@khirnov.net>
2024-06-11 16:35:46 +02:00
sfan5
63b6620ad3
lavf/tls_mbedtls: handle more error codes for human-readable messages
...
Signed-off-by: Anton Khirnov <anton@khirnov.net>
2024-06-11 16:35:31 +02:00
Rémi Denis-Courmont
b6f37ffba7
lavc/vc1dsp: match C block layout in inv_trans_4x8_rvv
...
Although checkasm does not verify this, the decoder requires that the
transform updates the input block exactly like the C code does.
This fixes vc1-ism, vc1_ilaced_twomv, vc1_sa00040, vc1_sa10091,
vc1_sa10143, vc1_sa20021, vc1test_smm0005 and wmv3-drm-dec tests.
2024-06-11 17:15:09 +03:00
Rémi Denis-Courmont
6c05069e68
lavc/vc1dsp: match C block layout in inv_trans_4x4_rvv
...
Although checkasm does not verify this, the decoder requires that the
transform updates the input block exactly like the C code does.
This fixes vc1-ism, vc1_ilaced_twomv, vc1_sa00040, vc1_sa10091,
vc1_sa10143, vc1_sa20021, vc1test_smm0005 and wmv3-drm-dec tests.
2024-06-11 17:15:09 +03:00
Andreas Rheinhardt
6ae1a337f2
fftools/ffmpeg_mux_init: Fix leak when using non-encoding option
...
Reviewed-by: Anton Khirnov <anton@khirnov.net>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2024-06-11 14:32:25 +02:00
Andreas Rheinhardt
8754c9bd82
configure: Disable DNN without backend
...
The DNN filters are useless without a backend.
This will also "fix" Coverity issues #1598288 and #1601718.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2024-06-11 19:36:54 +08:00
Andreas Rheinhardt
c84e40d9e6
fftools/ffmpeg_mux_init: Return error upon error
...
Currently it may return an uninitialized value.
Introduced in 840f2bc18e.
Fixes Coverity issue #1603565.
Reviewed-by: Anton Khirnov <anton@khirnov.net>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2024-06-11 08:16:42 +02:00
Andreas Rheinhardt
a0ff31e740
avcodec/vvc/inter: Don't return void
...
Returning a void expression from a void function is not allowed by the C spec. Use a plain return instead.
Reviewed-by: Nuo Mi <nuomi2021@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2024-06-11 02:43:14 +02:00
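The rule behind a0ff31e740 can be shown with a small example. The function names are made up for illustration; the point is that ISO C forbids a return statement with an expression in a function whose return type is void, even when the expression itself has type void.

```c
#include <assert.h>

static int calls; /* counts how often helper() ran */

static void helper(void)
{
    calls++;
}

/* Not valid ISO C (some compilers accept it as an extension, and C++
 * allows it):
 *
 *     static void wrapper(void) { return helper(); }
 */

/* Portable form: make the call, then return without an expression. */
static void wrapper(void)
{
    helper();
    return;
}
```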
Rémi Denis-Courmont
417957ec5e
sws/range_convert: R-V V to/from JPEG
...
C908 X60
chrRangeFromJpeg_8_c: 2.7 2.5
chrRangeFromJpeg_8_rvv_i32: 1.7 1.5
chrRangeFromJpeg_24_c: 7.5 6.7
chrRangeFromJpeg_24_rvv_i32: 1.7 1.5
chrRangeFromJpeg_128_c: 55.2 34.7
chrRangeFromJpeg_128_rvv_i32: 6.5 3.0
chrRangeFromJpeg_144_c: 44.0 39.2
chrRangeFromJpeg_144_rvv_i32: 7.7 4.5
chrRangeFromJpeg_256_c: 78.2 69.5
chrRangeFromJpeg_256_rvv_i32: 12.2 6.0
chrRangeFromJpeg_512_c: 172.2 138.5
chrRangeFromJpeg_512_rvv_i32: 24.5 11.7
chrRangeToJpeg_8_c: 4.7 4.2
chrRangeToJpeg_8_rvv_i32: 2.0 1.7
chrRangeToJpeg_24_c: 13.7 12.2
chrRangeToJpeg_24_rvv_i32: 2.0 1.5
chrRangeToJpeg_128_c: 72.0 63.7
chrRangeToJpeg_128_rvv_i32: 6.7 3.2
chrRangeToJpeg_144_c: 80.7 71.7
chrRangeToJpeg_144_rvv_i32: 8.5 4.7
chrRangeToJpeg_256_c: 143.2 127.2
chrRangeToJpeg_256_rvv_i32: 13.5 6.5
chrRangeToJpeg_512_c: 285.7 253.7
chrRangeToJpeg_512_rvv_i32: 27.0 13.0
lumRangeFromJpeg_8_c: 1.7 1.5
lumRangeFromJpeg_8_rvv_i32: 1.2 1.0
lumRangeFromJpeg_24_c: 4.2 3.7
lumRangeFromJpeg_24_rvv_i32: 1.2 1.0
lumRangeFromJpeg_128_c: 21.7 19.2
lumRangeFromJpeg_128_rvv_i32: 3.7 1.7
lumRangeFromJpeg_144_c: 24.7 22.0
lumRangeFromJpeg_144_rvv_i32: 4.7 2.7
lumRangeFromJpeg_256_c: 43.7 39.0
lumRangeFromJpeg_256_rvv_i32: 7.5 3.2
lumRangeFromJpeg_512_c: 87.0 77.2
lumRangeFromJpeg_512_rvv_i32: 14.5 6.7
lumRangeToJpeg_8_c: 2.7 2.2
lumRangeToJpeg_8_rvv_i32: 1.0 1.0
lumRangeToJpeg_24_c: 7.2 6.5
lumRangeToJpeg_24_rvv_i32: 1.2 1.0
lumRangeToJpeg_128_c: 37.7 33.7
lumRangeToJpeg_128_rvv_i32: 3.7 2.0
lumRangeToJpeg_144_c: 42.5 37.7
lumRangeToJpeg_144_rvv_i32: 4.7 2.7
lumRangeToJpeg_256_c: 75.0 66.7
lumRangeToJpeg_256_rvv_i32: 7.5 3.5
lumRangeToJpeg_512_c: 149.5 133.0
lumRangeToJpeg_512_rvv_i32: 14.7 7.0
2024-06-10 22:48:52 +03:00
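For readers unfamiliar with what the benchmarked lumRange* functions compute, here is a simplified 8-bit sketch of luma range conversion between full ("JPEG") range 0..255 and limited range 16..235. It is an illustration only: swscale operates on higher-precision intermediate samples with its own fixed-point constants, and the function names below are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Full range 0..255 -> limited range 16..235, rounded to nearest. */
static uint8_t luma_from_jpeg(uint8_t y)
{
    return (uint8_t)(16 + (y * 219 + 127) / 255);
}

/* Limited range 16..235 -> full range 0..255. Inputs outside the
 * nominal range are clamped rather than wrapped. */
static uint8_t luma_to_jpeg(uint8_t y)
{
    int v = ((y - 16) * 255 + 109) / 219;

    if (v < 0)
        v = 0;
    if (v > 255)
        v = 255;
    return (uint8_t)v;
}
```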
Zhao Zhili
9dac8495b0
swscale/aarch64: Add rgb24 to yuv implementation
...
Tested on Apple M1:
rgb24_to_uv_8_c: 0.0
rgb24_to_uv_8_neon: 0.2
rgb24_to_uv_128_c: 1.0
rgb24_to_uv_128_neon: 0.5
rgb24_to_uv_1080_c: 7.0
rgb24_to_uv_1080_neon: 5.7
rgb24_to_uv_1920_c: 12.5
rgb24_to_uv_1920_neon: 9.5
rgb24_to_uv_half_8_c: 0.2
rgb24_to_uv_half_8_neon: 0.2
rgb24_to_uv_half_128_c: 1.0
rgb24_to_uv_half_128_neon: 0.5
rgb24_to_uv_half_1080_c: 6.2
rgb24_to_uv_half_1080_neon: 3.0
rgb24_to_uv_half_1920_c: 11.2
rgb24_to_uv_half_1920_neon: 5.2
rgb24_to_y_8_c: 0.2
rgb24_to_y_8_neon: 0.0
rgb24_to_y_128_c: 0.5
rgb24_to_y_128_neon: 0.5
rgb24_to_y_1080_c: 4.7
rgb24_to_y_1080_neon: 3.2
rgb24_to_y_1920_c: 8.0
rgb24_to_y_1920_neon: 5.7
On Pixel 6:
rgb24_to_uv_8_c: 30.7
rgb24_to_uv_8_neon: 56.9
rgb24_to_uv_128_c: 213.9
rgb24_to_uv_128_neon: 173.2
rgb24_to_uv_1080_c: 1649.9
rgb24_to_uv_1080_neon: 1424.4
rgb24_to_uv_1920_c: 2907.9
rgb24_to_uv_1920_neon: 2480.7
rgb24_to_uv_half_8_c: 36.2
rgb24_to_uv_half_8_neon: 33.4
rgb24_to_uv_half_128_c: 167.9
rgb24_to_uv_half_128_neon: 99.4
rgb24_to_uv_half_1080_c: 1293.9
rgb24_to_uv_half_1080_neon: 778.7
rgb24_to_uv_half_1920_c: 2292.7
rgb24_to_uv_half_1920_neon: 1328.7
rgb24_to_y_8_c: 19.7
rgb24_to_y_8_neon: 27.7
rgb24_to_y_128_c: 129.9
rgb24_to_y_128_neon: 96.7
rgb24_to_y_1080_c: 995.4
rgb24_to_y_1080_neon: 767.7
rgb24_to_y_1920_c: 1747.4
rgb24_to_y_1920_neon: 1337.2
Note that both tests use clang as the compiler, which enables
auto-vectorization by default at -O3.
Reviewed-by: Rémi Denis-Courmont <remi@remlab.net>
Reviewed-by: Martin Storsjö <martin@martin.st>
Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
2024-06-11 01:12:09 +08:00
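As a scalar point of reference for the rgb24_to_y benchmarks above, the following sketch converts packed RGB24 to 8-bit limited-range luma using the widely used BT.601 integer approximation. This is an assumption-laden illustration: swscale takes its coefficients from a table and produces higher-precision output, so neither the constants nor the function name below come from the commit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Convert `width` packed R,G,B pixels to 8-bit limited-range luma.
 * Coefficients are the common BT.601 8-bit integer approximation. */
static void rgb24_to_y_sketch(uint8_t *dst, const uint8_t *src, size_t width)
{
    for (size_t i = 0; i < width; i++) {
        int r = src[3 * i + 0];
        int g = src[3 * i + 1];
        int b = src[3 * i + 2];

        dst[i] = (uint8_t)(((66 * r + 129 * g + 25 * b + 128) >> 8) + 16);
    }
}
```

A SIMD version processes many pixels per iteration, which is why the gains in the tables grow with the row width and small widths such as 8 can even be slower.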
Zhao Zhili
b1240c983f
tests/checkasm: Fix build error when enable linux perf on Android
...
B0 is defined by a system header; see f0f596dbc6 for reference.
Reviewed-by: Martin Storsjö <martin@martin.st>
Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
2024-06-11 01:11:46 +08:00
Zhao Zhili
33e4cc963d
avutil/timer: Add clock_gettime as a fallback of AV_READ_TIME
...
Reviewed-by: Rémi Denis-Courmont <remi@remlab.net>
Reviewed-by: Martin Storsjö <martin@martin.st>
Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
2024-06-11 01:11:36 +08:00
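A timer fallback of the kind named in 33e4cc963d could look roughly like this. It is a sketch under assumptions: the function name is hypothetical, and the choice of CLOCK_MONOTONIC and of nanosecond units is a guess at a reasonable design, not a statement about what libavutil/timer.h does.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Return a monotonically non-decreasing timestamp in nanoseconds. */
static uint64_t read_time_ns(void)
{
    struct timespec ts;
    int ret = clock_gettime(CLOCK_MONOTONIC, &ts);

    assert(ret == 0); /* only fails for an unsupported clock ID */
    (void)ret;        /* silence unused warning when NDEBUG is set */

    return (uint64_t)ts.tv_sec * 1000000000u + (uint64_t)ts.tv_nsec;
}
```

Such a clock is coarser and slower to read than a CPU cycle counter, so it makes sense only as a fallback where no cheaper counter is available.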
Zhao Zhili
6a18c0bc87
avutil/aarch64: Skip define AV_READ_TIME for apple
...
It will fall back to mach_absolute_time inside libavutil/timer.h.
Reviewed-by: Martin Storsjö <martin@martin.st>
Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
2024-06-11 01:10:42 +08:00
James Almer
94f2274a8b
x86/aacencdsp: fix ff_aac_quantize_bands_avx on unix64 ABI
...
Signed-off-by: James Almer <jamrial@gmail.com>
2024-06-09 17:16:02 -03:00
James Almer
17c3cc5bb6
swscale/x86/rgb_2_rgb: add missing wrap to ff_uyvytoyuv422_avx2
...
Fixes old yasm.
Signed-off-by: James Almer <jamrial@gmail.com>
2024-06-09 16:04:36 -03:00
James Almer
03546f49a3
swscale/x86/rgb2rgb: add missing wrap for ff_uyvytoyuv422_avx2
...
Signed-off-by: James Almer <jamrial@gmail.com>
2024-06-09 15:56:52 -03:00
James Almer
287d139b77
checkasm/sw_rgb: fix alignment of buffers for rgb_to_yuv tests
...
src is apparently not guaranteed to be >8 byte aligned, but align to 16
nonetheless as the x86 asm will do unaligned loads anyway.
dst is guaranteed to be 32 byte aligned for the Y plane, but 16 byte for UV.
Signed-off-by: James Almer <jamrial@gmail.com>
2024-06-09 14:12:51 -03:00
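The alignment guarantees described in 287d139b77 can be mirrored in a test harness roughly as follows. FFmpeg has its own alignment macros; this sketch uses standard C11 alignas instead, and the buffer names and sizes are hypothetical.

```c
#include <assert.h>
#include <stdalign.h>
#include <stdint.h>

#define SKETCH_WIDTH 64 /* arbitrary row width for the example */

/* Source: aligned to 16 even though the asm uses unaligned loads. */
static alignas(16) uint8_t src_buf[SKETCH_WIDTH * 3];
/* Destination: 32-byte aligned for Y, 16-byte aligned for UV. */
static alignas(32) uint8_t dst_y[SKETCH_WIDTH * 2];
static alignas(16) uint8_t dst_uv[SKETCH_WIDTH * 2];

/* Return nonzero if `p` is a multiple of `align` bytes. */
static int is_aligned(const void *p, uintptr_t align)
{
    return ((uintptr_t)p % align) == 0;
}
```

Declaring test buffers with exactly the guaranteed alignment, rather than a larger one, helps catch SIMD code that silently assumes more than the caller promises.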
James Almer
e8cef5e152
swscale/x86/rgb2rgb: remove mmxext version of shuffle_bytes_2103
...
Signed-off-by: James Almer <jamrial@gmail.com>
2024-06-09 13:43:11 -03:00
James Almer
c578bb9864
swscale/x86/input: add AVX2 optimized uyvytoyuv422
...
uyvytoyuv422_c: 23991.8
uyvytoyuv422_sse2: 2817.8
uyvytoyuv422_avx: 2819.3
uyvytoyuv422_avx2: 1972.3
Signed-off-by: James Almer <jamrial@gmail.com>
2024-06-09 13:43:11 -03:00
James Almer
e9cfd53257
swscale/x86/input: add AVX2 optimized RGB32 to YUV functions
...
abgr_to_uv_8_c: 43.3
abgr_to_uv_8_sse2: 14.3
abgr_to_uv_8_avx: 15.3
abgr_to_uv_8_avx2: 18.8
abgr_to_uv_128_c: 650.3
abgr_to_uv_128_sse2: 110.8
abgr_to_uv_128_avx: 112.3
abgr_to_uv_128_avx2: 64.8
abgr_to_uv_1080_c: 5456.3
abgr_to_uv_1080_sse2: 888.8
abgr_to_uv_1080_avx: 900.8
abgr_to_uv_1080_avx2: 518.3
abgr_to_uv_1920_c: 9692.3
abgr_to_uv_1920_sse2: 1593.8
abgr_to_uv_1920_avx: 1613.3
abgr_to_uv_1920_avx2: 864.8
abgr_to_y_8_c: 23.3
abgr_to_y_8_sse2: 12.8
abgr_to_y_8_avx: 13.3
abgr_to_y_8_avx2: 17.3
abgr_to_y_128_c: 308.3
abgr_to_y_128_sse2: 67.3
abgr_to_y_128_avx: 66.8
abgr_to_y_128_avx2: 44.8
abgr_to_y_1080_c: 2371.3
abgr_to_y_1080_sse2: 512.8
abgr_to_y_1080_avx: 505.8
abgr_to_y_1080_avx2: 314.3
abgr_to_y_1920_c: 4177.3
abgr_to_y_1920_sse2: 915.8
abgr_to_y_1920_avx: 926.8
abgr_to_y_1920_avx2: 519.3
bgra_to_uv_8_c: 37.3
bgra_to_uv_8_sse2: 13.3
bgra_to_uv_8_avx: 14.8
bgra_to_uv_8_avx2: 19.8
bgra_to_uv_128_c: 563.8
bgra_to_uv_128_sse2: 111.3
bgra_to_uv_128_avx: 112.3
bgra_to_uv_128_avx2: 64.8
bgra_to_uv_1080_c: 4691.8
bgra_to_uv_1080_sse2: 893.8
bgra_to_uv_1080_avx: 899.8
bgra_to_uv_1080_avx2: 517.8
bgra_to_uv_1920_c: 8332.8
bgra_to_uv_1920_sse2: 1590.8
bgra_to_uv_1920_avx: 1605.8
bgra_to_uv_1920_avx2: 867.3
bgra_to_y_8_c: 22.3
bgra_to_y_8_sse2: 12.8
bgra_to_y_8_avx: 12.8
bgra_to_y_8_avx2: 17.3
bgra_to_y_128_c: 291.3
bgra_to_y_128_sse2: 67.8
bgra_to_y_128_avx: 69.3
bgra_to_y_128_avx2: 45.3
bgra_to_y_1080_c: 2357.3
bgra_to_y_1080_sse2: 508.3
bgra_to_y_1080_avx: 518.3
bgra_to_y_1080_avx2: 399.8
bgra_to_y_1920_c: 4202.8
bgra_to_y_1920_sse2: 906.8
bgra_to_y_1920_avx: 907.3
bgra_to_y_1920_avx2: 526.3
Signed-off-by: James Almer <jamrial@gmail.com>
2024-06-09 13:43:11 -03:00
James Almer
d5fe99dc5f
swscale/x86/input: add AVX2 optimized RGB24 to YUV functions
...
rgb24_to_uv_8_c: 39.3
rgb24_to_uv_8_sse2: 14.3
rgb24_to_uv_8_ssse3: 13.3
rgb24_to_uv_8_avx: 12.8
rgb24_to_uv_8_avx2: 14.3
rgb24_to_uv_128_c: 582.8
rgb24_to_uv_128_sse2: 127.3
rgb24_to_uv_128_ssse3: 107.3
rgb24_to_uv_128_avx: 111.3
rgb24_to_uv_128_avx2: 62.3
rgb24_to_uv_1080_c: 4981.3
rgb24_to_uv_1080_sse2: 1048.3
rgb24_to_uv_1080_ssse3: 876.8
rgb24_to_uv_1080_avx: 887.8
rgb24_to_uv_1080_avx2: 492.3
rgb24_to_uv_1280_c: 5906.8
rgb24_to_uv_1280_sse2: 1263.3
rgb24_to_uv_1280_ssse3: 1048.3
rgb24_to_uv_1280_avx: 1045.8
rgb24_to_uv_1280_avx2: 579.8
rgb24_to_uv_1920_c: 8665.3
rgb24_to_uv_1920_sse2: 1888.8
rgb24_to_uv_1920_ssse3: 1571.8
rgb24_to_uv_1920_avx: 1558.8
rgb24_to_uv_1920_avx2: 869.3
rgb24_to_y_8_c: 20.3
rgb24_to_y_8_sse2: 11.8
rgb24_to_y_8_ssse3: 10.3
rgb24_to_y_8_avx: 10.3
rgb24_to_y_8_avx2: 10.8
rgb24_to_y_128_c: 284.8
rgb24_to_y_128_sse2: 83.3
rgb24_to_y_128_ssse3: 66.8
rgb24_to_y_128_avx: 64.8
rgb24_to_y_128_avx2: 39.3
rgb24_to_y_1080_c: 2451.3
rgb24_to_y_1080_sse2: 696.3
rgb24_to_y_1080_ssse3: 516.8
rgb24_to_y_1080_avx: 518.8
rgb24_to_y_1080_avx2: 301.8
rgb24_to_y_1280_c: 2892.8
rgb24_to_y_1280_sse2: 816.8
rgb24_to_y_1280_ssse3: 623.3
rgb24_to_y_1280_avx: 616.3
rgb24_to_y_1280_avx2: 350.8
rgb24_to_y_1920_c: 4338.8
rgb24_to_y_1920_sse2: 1210.8
rgb24_to_y_1920_ssse3: 928.3
rgb24_to_y_1920_avx: 920.3
rgb24_to_y_1920_avx2: 534.8
Signed-off-by: James Almer <jamrial@gmail.com>
2024-06-09 13:42:09 -03:00
James Almer
6743c2fc6a
checkasm/sw_rgb: test rgb32/rgb32_1 to yuv
...
Test all four pixel formats, but only bench the two native endian ones for a
given target.
Signed-off-by: James Almer <jamrial@gmail.com>
2024-06-09 12:29:49 -03:00
James Almer
91b9af0058
x86/aacencdsp: add AVX version of quantize_bands
...
quant_bands_signed_c: 1928.0
quant_bands_signed_sse2: 406.0
quant_bands_signed_avx: 207.0
quant_bands_unsigned_c: 1702.0
quant_bands_unsigned_sse2: 404.0
quant_bands_unsigned_avx: 209.0
Signed-off-by: James Almer <jamrial@gmail.com>
2024-06-09 12:29:49 -03:00
Rémi Denis-Courmont
7a3369398f
sws/input: R-V V 32-bit RGB to halved UV
...
T-Head C908:
abgr_to_uv_half_8_c: 2.2
abgr_to_uv_half_8_rvv_i32: 3.5
abgr_to_uv_half_128_c: 44.0
abgr_to_uv_half_128_rvv_i32: 13.0
abgr_to_uv_half_1080_c: 245.0
abgr_to_uv_half_1080_rvv_i32: 107.2
abgr_to_uv_half_1920_c: 406.2
abgr_to_uv_half_1920_rvv_i32: 188.7
bgra_to_uv_half_8_c: 2.2
bgra_to_uv_half_8_rvv_i32: 3.5
bgra_to_uv_half_128_c: 26.5
bgra_to_uv_half_128_rvv_i32: 13.0
bgra_to_uv_half_1080_c: 219.7
bgra_to_uv_half_1080_rvv_i32: 107.0
bgra_to_uv_half_1920_c: 406.7
bgra_to_uv_half_1920_rvv_i32: 188.7
SpacemiT X60:
abgr_to_uv_half_8_c: 2.2
abgr_to_uv_half_8_rvv_i32: 3.0
abgr_to_uv_half_128_c: 28.2
abgr_to_uv_half_128_rvv_i32: 5.7
abgr_to_uv_half_1080_c: 235.5
abgr_to_uv_half_1080_rvv_i32: 47.7
abgr_to_uv_half_1920_c: 418.2
abgr_to_uv_half_1920_rvv_i32: 84.0
bgra_to_uv_half_8_c: 2.0
bgra_to_uv_half_8_rvv_i32: 3.0
bgra_to_uv_half_128_c: 23.7
bgra_to_uv_half_128_rvv_i32: 5.7
bgra_to_uv_half_1080_c: 195.5
bgra_to_uv_half_1080_rvv_i32: 47.7
bgra_to_uv_half_1920_c: 346.5
bgra_to_uv_half_1920_rvv_i32: 84.0
2024-06-09 14:33:04 +03:00
Rémi Denis-Courmont
e2f069905e
sws/input: R-V V 32-bit RGB to UV
2024-06-09 14:33:04 +03:00
Rémi Denis-Courmont
f5555cb106
sws/input: R-V V 32-bit RGB to Y
...
T-Head C908:
abgr_to_y_8_c: 2.5
abgr_to_y_8_rvv_i32: 2.2
abgr_to_y_128_c: 37.0
abgr_to_y_128_rvv_i32: 8.5
abgr_to_y_1080_c: 327.0
abgr_to_y_1080_rvv_i32: 69.5
abgr_to_y_1920_c: 552.0
abgr_to_y_1920_rvv_i32: 122.2
bgra_to_y_8_c: 2.5
bgra_to_y_8_rvv_i32: 2.2
bgra_to_y_128_c: 37.2
bgra_to_y_128_rvv_i32: 8.5
bgra_to_y_1080_c: 310.2
bgra_to_y_1080_rvv_i32: 69.5
bgra_to_y_1920_c: 568.2
bgra_to_y_1920_rvv_i32: 122.5
SpacemiT X60:
abgr_to_y_8_c: 2.5
abgr_to_y_8_rvv_i32: 2.0
abgr_to_y_128_c: 33.0
abgr_to_y_128_rvv_i32: 3.7
abgr_to_y_1080_c: 276.0
abgr_to_y_1080_rvv_i32: 31.5
abgr_to_y_1920_c: 493.7
abgr_to_y_1920_rvv_i32: 55.5
bgra_to_y_8_c: 2.2
bgra_to_y_8_rvv_i32: 2.0
bgra_to_y_128_c: 33.0
bgra_to_y_128_rvv_i32: 3.7
bgra_to_y_1080_c: 276.0
bgra_to_y_1080_rvv_i32: 31.5
bgra_to_y_1920_c: 490.7
bgra_to_y_1920_rvv_i32: 55.5
2024-06-09 14:33:04 +03:00