Rémi Denis-Courmont
6c05069e68
lavc/vc1dsp: match C block layout in inv_trans_4x4_rvv
...
Although checkasm does not verify this, the decoder requires that the
transform updates the input block exactly like the C code does.
This fixes vc1-ism, vc1_ilaced_twomv, vc1_sa00040, vc1_sa10091,
vc1_sa10143, vc1_sa20021, vc1test_smm0005 and wmv3-drm-dec tests.
2024-06-11 17:15:09 +03:00
Andreas Rheinhardt
6ae1a337f2
fftools/ffmpeg_mux_init: Fix leak when using non-encoding option
...
Reviewed-by: Anton Khirnov <anton@khirnov.net>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2024-06-11 14:32:25 +02:00
Andreas Rheinhardt
8754c9bd82
configure: Disable DNN without backend
...
The DNN filters are useless without a backend.
This will also "fix" Coverity issues #1598288 and #1601718.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2024-06-11 19:36:54 +08:00
Andreas Rheinhardt
c84e40d9e6
fftools/ffmpeg_mux_init: Return error upon error
...
Currently it may return an uninitialized value.
Introduced in 840f2bc18e.
Fixes Coverity issue #1603565.
Reviewed-by: Anton Khirnov <anton@khirnov.net>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2024-06-11 08:16:42 +02:00
Andreas Rheinhardt
a0ff31e740
avcodec/vvc/inter: Don't return void
...
Returning a void expression is not allowed by the C standard. Just return instead.
Reviewed-by: Nuo Mi <nuomi2021@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2024-06-11 02:43:14 +02:00
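A minimal sketch of the issue behind a0ff31e740, not the actual code from avcodec/vvc/inter: ISO C does not permit a return statement with an expression in a function whose return type is void, even when that expression itself has type void (C++ and some compilers accept it). The names `helper`, `wrapper` and `calls` are hypothetical.

```c
#include <stdio.h>

/* Hypothetical counter so the effect of the call is observable. */
static int calls;

static void helper(void)
{
    calls++;
}

/*
 * Not valid ISO C (accepted by C++ and, as an extension, by some
 * C compilers):
 *
 *     static void wrapper_bad(void) { return helper(); }
 */

/* Portable form: call the void function, then return without a value. */
static void wrapper(void)
{
    helper();
    return;
}
```

Calling `wrapper()` once increments `calls` by one, exactly as the rejected form would have done.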
Rémi Denis-Courmont
417957ec5e
sws/range_convert: R-V V to/from JPEG
...
C908 X60
chrRangeFromJpeg_8_c: 2.7 2.5
chrRangeFromJpeg_8_rvv_i32: 1.7 1.5
chrRangeFromJpeg_24_c: 7.5 6.7
chrRangeFromJpeg_24_rvv_i32: 1.7 1.5
chrRangeFromJpeg_128_c: 55.2 34.7
chrRangeFromJpeg_128_rvv_i32: 6.5 3.0
chrRangeFromJpeg_144_c: 44.0 39.2
chrRangeFromJpeg_144_rvv_i32: 7.7 4.5
chrRangeFromJpeg_256_c: 78.2 69.5
chrRangeFromJpeg_256_rvv_i32: 12.2 6.0
chrRangeFromJpeg_512_c: 172.2 138.5
chrRangeFromJpeg_512_rvv_i32: 24.5 11.7
chrRangeToJpeg_8_c: 4.7 4.2
chrRangeToJpeg_8_rvv_i32: 2.0 1.7
chrRangeToJpeg_24_c: 13.7 12.2
chrRangeToJpeg_24_rvv_i32: 2.0 1.5
chrRangeToJpeg_128_c: 72.0 63.7
chrRangeToJpeg_128_rvv_i32: 6.7 3.2
chrRangeToJpeg_144_c: 80.7 71.7
chrRangeToJpeg_144_rvv_i32: 8.5 4.7
chrRangeToJpeg_256_c: 143.2 127.2
chrRangeToJpeg_256_rvv_i32: 13.5 6.5
chrRangeToJpeg_512_c: 285.7 253.7
chrRangeToJpeg_512_rvv_i32: 27.0 13.0
lumRangeFromJpeg_8_c: 1.7 1.5
lumRangeFromJpeg_8_rvv_i32: 1.2 1.0
lumRangeFromJpeg_24_c: 4.2 3.7
lumRangeFromJpeg_24_rvv_i32: 1.2 1.0
lumRangeFromJpeg_128_c: 21.7 19.2
lumRangeFromJpeg_128_rvv_i32: 3.7 1.7
lumRangeFromJpeg_144_c: 24.7 22.0
lumRangeFromJpeg_144_rvv_i32: 4.7 2.7
lumRangeFromJpeg_256_c: 43.7 39.0
lumRangeFromJpeg_256_rvv_i32: 7.5 3.2
lumRangeFromJpeg_512_c: 87.0 77.2
lumRangeFromJpeg_512_rvv_i32: 14.5 6.7
lumRangeToJpeg_8_c: 2.7 2.2
lumRangeToJpeg_8_rvv_i32: 1.0 1.0
lumRangeToJpeg_24_c: 7.2 6.5
lumRangeToJpeg_24_rvv_i32: 1.2 1.0
lumRangeToJpeg_128_c: 37.7 33.7
lumRangeToJpeg_128_rvv_i32: 3.7 2.0
lumRangeToJpeg_144_c: 42.5 37.7
lumRangeToJpeg_144_rvv_i32: 4.7 2.7
lumRangeToJpeg_256_c: 75.0 66.7
lumRangeToJpeg_256_rvv_i32: 7.5 3.5
lumRangeToJpeg_512_c: 149.5 133.0
lumRangeToJpeg_512_rvv_i32: 14.7 7.0
2024-06-10 22:48:52 +03:00
Zhao Zhili
9dac8495b0
swscale/aarch64: Add rgb24 to yuv implementation
...
Test on Apple M1:
rgb24_to_uv_8_c: 0.0
rgb24_to_uv_8_neon: 0.2
rgb24_to_uv_128_c: 1.0
rgb24_to_uv_128_neon: 0.5
rgb24_to_uv_1080_c: 7.0
rgb24_to_uv_1080_neon: 5.7
rgb24_to_uv_1920_c: 12.5
rgb24_to_uv_1920_neon: 9.5
rgb24_to_uv_half_8_c: 0.2
rgb24_to_uv_half_8_neon: 0.2
rgb24_to_uv_half_128_c: 1.0
rgb24_to_uv_half_128_neon: 0.5
rgb24_to_uv_half_1080_c: 6.2
rgb24_to_uv_half_1080_neon: 3.0
rgb24_to_uv_half_1920_c: 11.2
rgb24_to_uv_half_1920_neon: 5.2
rgb24_to_y_8_c: 0.2
rgb24_to_y_8_neon: 0.0
rgb24_to_y_128_c: 0.5
rgb24_to_y_128_neon: 0.5
rgb24_to_y_1080_c: 4.7
rgb24_to_y_1080_neon: 3.2
rgb24_to_y_1920_c: 8.0
rgb24_to_y_1920_neon: 5.7
On Pixel 6:
rgb24_to_uv_8_c: 30.7
rgb24_to_uv_8_neon: 56.9
rgb24_to_uv_128_c: 213.9
rgb24_to_uv_128_neon: 173.2
rgb24_to_uv_1080_c: 1649.9
rgb24_to_uv_1080_neon: 1424.4
rgb24_to_uv_1920_c: 2907.9
rgb24_to_uv_1920_neon: 2480.7
rgb24_to_uv_half_8_c: 36.2
rgb24_to_uv_half_8_neon: 33.4
rgb24_to_uv_half_128_c: 167.9
rgb24_to_uv_half_128_neon: 99.4
rgb24_to_uv_half_1080_c: 1293.9
rgb24_to_uv_half_1080_neon: 778.7
rgb24_to_uv_half_1920_c: 2292.7
rgb24_to_uv_half_1920_neon: 1328.7
rgb24_to_y_8_c: 19.7
rgb24_to_y_8_neon: 27.7
rgb24_to_y_128_c: 129.9
rgb24_to_y_128_neon: 96.7
rgb24_to_y_1080_c: 995.4
rgb24_to_y_1080_neon: 767.7
rgb24_to_y_1920_c: 1747.4
rgb24_to_y_1920_neon: 1337.2
Note that both tests use clang as the compiler, which has vectorization
enabled by default with -O3.
Reviewed-by: Rémi Denis-Courmont <remi@remlab.net>
Reviewed-by: Martin Storsjö <martin@martin.st>
Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
2024-06-11 01:12:09 +08:00
Zhao Zhili
b1240c983f
tests/checkasm: Fix build error when enable linux perf on Android
...
B0 is defined by a system header; see f0f596dbc6 for reference.
Reviewed-by: Martin Storsjö <martin@martin.st>
Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
2024-06-11 01:11:46 +08:00
Zhao Zhili
33e4cc963d
avutil/timer: Add clock_gettime as a fallback of AV_READ_TIME
...
Reviewed-by: Rémi Denis-Courmont <remi@remlab.net>
Reviewed-by: Martin Storsjö <martin@martin.st>
Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
2024-06-11 01:11:36 +08:00
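The 33e4cc963d entry gives no body, so the following is only a sketch of what a clock_gettime-based timer read could look like, not the definition added to libavutil/timer.h. The function name `read_time_ns`, the choice of `CLOCK_MONOTONIC` and the nanosecond unit are all assumptions.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L  /* expose clock_gettime under strict ISO C modes */
#endif

#include <stdint.h>
#include <time.h>

/* Hypothetical helper: returns a monotonic timestamp in nanoseconds,
 * or 0 if the clock cannot be read. */
static uint64_t read_time_ns(void)
{
    struct timespec ts;

    if (clock_gettime(CLOCK_MONOTONIC, &ts) != 0)
        return 0;

    return (uint64_t)ts.tv_sec * UINT64_C(1000000000) + (uint64_t)ts.tv_nsec;
}
```

Two successive reads give a non-decreasing pair of values, which is the property a benchmark timer needs when no cycle counter is available.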
Zhao Zhili
6a18c0bc87
avutil/aarch64: Skip define AV_READ_TIME for apple
...
It will fall back to mach_absolute_time inside libavutil/timer.h.
Reviewed-by: Martin Storsjö <martin@martin.st>
Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
2024-06-11 01:10:42 +08:00
James Almer
94f2274a8b
x86/aacencdsp: fix ff_aac_quantize_bands_avx on unix64 ABI
...
Signed-off-by: James Almer <jamrial@gmail.com>
2024-06-09 17:16:02 -03:00
James Almer
17c3cc5bb6
swscale/x86/rgb_2_rgb: add missing wrap to ff_uyvytoyuv422_avx2
...
Fixes old yasm.
Signed-off-by: James Almer <jamrial@gmail.com>
2024-06-09 16:04:36 -03:00
James Almer
03546f49a3
swscale/x86/rgb2rgb: add missing wrap for ff_uyvytoyuv422_avx2
...
Signed-off-by: James Almer <jamrial@gmail.com>
2024-06-09 15:56:52 -03:00
James Almer
287d139b77
checkasm/sw_rgb: fix alignment of buffers for rgb_to_yuv tests
...
src is apparently not guaranteed to be >8 byte aligned, but align to 16
nonetheless as the x86 asm will do unaligned loads anyway.
dst is guaranteed to be 32 byte aligned for the Y plane, but 16 byte for UV.
Signed-off-by: James Almer <jamrial@gmail.com>
2024-06-09 14:12:51 -03:00
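The alignment requirements described in 287d139b77 can be illustrated with standard C11 `alignas`. This is a sketch under assumptions: checkasm has its own buffer declaration conventions, and the buffer names, the `MAX_WIDTH` constant and the `is_aligned` helper here are hypothetical.

```c
#include <stdalign.h>
#include <stdint.h>

#define MAX_WIDTH 1920  /* hypothetical test width */

/* src: aligned to 16 bytes, as the commit message chooses to do. */
static alignas(16) uint8_t src_buf[MAX_WIDTH * 4];
/* dst for the Y plane: 32-byte aligned. */
static alignas(32) uint8_t dst_y_buf[MAX_WIDTH * 2];
/* dst for the UV planes: 16-byte aligned. */
static alignas(16) uint8_t dst_uv_buf[MAX_WIDTH * 2];

/* Returns nonzero if p is a multiple of the given alignment. */
static int is_aligned(const void *p, uintptr_t alignment)
{
    return ((uintptr_t)p % alignment) == 0;
}
```

A test harness could check `is_aligned(dst_y_buf, 32)` before invoking assembly that relies on aligned stores.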
James Almer
e8cef5e152
swscale/x86/rgb2rgb: remove mmxext version of shuffle_bytes_2103
...
Signed-off-by: James Almer <jamrial@gmail.com>
2024-06-09 13:43:11 -03:00
James Almer
c578bb9864
swscale/x86/input: add AVX2 optimized uyvytoyuv422
...
uyvytoyuv422_c: 23991.8
uyvytoyuv422_sse2: 2817.8
uyvytoyuv422_avx: 2819.3
uyvytoyuv422_avx2: 1972.3
Signed-off-by: James Almer <jamrial@gmail.com>
2024-06-09 13:43:11 -03:00
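To read the c578bb9864 figures as relative gains, one can divide the baseline timing by the optimized one. The helper below is a hypothetical illustration; the commit does not state the unit of its numbers, and the ratio does not depend on it.

```c
/* Hypothetical helper: how many times faster `optimized` is than
 * `baseline`, given two timings in the same unit. Returns 0.0 for a
 * non-positive optimized timing to avoid dividing by zero. */
static double speedup(double baseline, double optimized)
{
    return optimized > 0.0 ? baseline / optimized : 0.0;
}
```

With the numbers above, `speedup(23991.8, 1972.3)` is roughly 12.2 for AVX2 over C, and `speedup(2817.8, 1972.3)` is about 1.43 for AVX2 over SSE2.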
James Almer
e9cfd53257
swscale/x86/input: add AVX2 optimized RGB32 to YUV functions
...
abgr_to_uv_8_c: 43.3
abgr_to_uv_8_sse2: 14.3
abgr_to_uv_8_avx: 15.3
abgr_to_uv_8_avx2: 18.8
abgr_to_uv_128_c: 650.3
abgr_to_uv_128_sse2: 110.8
abgr_to_uv_128_avx: 112.3
abgr_to_uv_128_avx2: 64.8
abgr_to_uv_1080_c: 5456.3
abgr_to_uv_1080_sse2: 888.8
abgr_to_uv_1080_avx: 900.8
abgr_to_uv_1080_avx2: 518.3
abgr_to_uv_1920_c: 9692.3
abgr_to_uv_1920_sse2: 1593.8
abgr_to_uv_1920_avx: 1613.3
abgr_to_uv_1920_avx2: 864.8
abgr_to_y_8_c: 23.3
abgr_to_y_8_sse2: 12.8
abgr_to_y_8_avx: 13.3
abgr_to_y_8_avx2: 17.3
abgr_to_y_128_c: 308.3
abgr_to_y_128_sse2: 67.3
abgr_to_y_128_avx: 66.8
abgr_to_y_128_avx2: 44.8
abgr_to_y_1080_c: 2371.3
abgr_to_y_1080_sse2: 512.8
abgr_to_y_1080_avx: 505.8
abgr_to_y_1080_avx2: 314.3
abgr_to_y_1920_c: 4177.3
abgr_to_y_1920_sse2: 915.8
abgr_to_y_1920_avx: 926.8
abgr_to_y_1920_avx2: 519.3
bgra_to_uv_8_c: 37.3
bgra_to_uv_8_sse2: 13.3
bgra_to_uv_8_avx: 14.8
bgra_to_uv_8_avx2: 19.8
bgra_to_uv_128_c: 563.8
bgra_to_uv_128_sse2: 111.3
bgra_to_uv_128_avx: 112.3
bgra_to_uv_128_avx2: 64.8
bgra_to_uv_1080_c: 4691.8
bgra_to_uv_1080_sse2: 893.8
bgra_to_uv_1080_avx: 899.8
bgra_to_uv_1080_avx2: 517.8
bgra_to_uv_1920_c: 8332.8
bgra_to_uv_1920_sse2: 1590.8
bgra_to_uv_1920_avx: 1605.8
bgra_to_uv_1920_avx2: 867.3
bgra_to_y_8_c: 22.3
bgra_to_y_8_sse2: 12.8
bgra_to_y_8_avx: 12.8
bgra_to_y_8_avx2: 17.3
bgra_to_y_128_c: 291.3
bgra_to_y_128_sse2: 67.8
bgra_to_y_128_avx: 69.3
bgra_to_y_128_avx2: 45.3
bgra_to_y_1080_c: 2357.3
bgra_to_y_1080_sse2: 508.3
bgra_to_y_1080_avx: 518.3
bgra_to_y_1080_avx2: 399.8
bgra_to_y_1920_c: 4202.8
bgra_to_y_1920_sse2: 906.8
bgra_to_y_1920_avx: 907.3
bgra_to_y_1920_avx2: 526.3
Signed-off-by: James Almer <jamrial@gmail.com>
2024-06-09 13:43:11 -03:00
James Almer
d5fe99dc5f
swscale/x86/input: add AVX2 optimized RGB24 to YUV functions
...
rgb24_to_uv_8_c: 39.3
rgb24_to_uv_8_sse2: 14.3
rgb24_to_uv_8_ssse3: 13.3
rgb24_to_uv_8_avx: 12.8
rgb24_to_uv_8_avx2: 14.3
rgb24_to_uv_128_c: 582.8
rgb24_to_uv_128_sse2: 127.3
rgb24_to_uv_128_ssse3: 107.3
rgb24_to_uv_128_avx: 111.3
rgb24_to_uv_128_avx2: 62.3
rgb24_to_uv_1080_c: 4981.3
rgb24_to_uv_1080_sse2: 1048.3
rgb24_to_uv_1080_ssse3: 876.8
rgb24_to_uv_1080_avx: 887.8
rgb24_to_uv_1080_avx2: 492.3
rgb24_to_uv_1280_c: 5906.8
rgb24_to_uv_1280_sse2: 1263.3
rgb24_to_uv_1280_ssse3: 1048.3
rgb24_to_uv_1280_avx: 1045.8
rgb24_to_uv_1280_avx2: 579.8
rgb24_to_uv_1920_c: 8665.3
rgb24_to_uv_1920_sse2: 1888.8
rgb24_to_uv_1920_ssse3: 1571.8
rgb24_to_uv_1920_avx: 1558.8
rgb24_to_uv_1920_avx2: 869.3
rgb24_to_y_8_c: 20.3
rgb24_to_y_8_sse2: 11.8
rgb24_to_y_8_ssse3: 10.3
rgb24_to_y_8_avx: 10.3
rgb24_to_y_8_avx2: 10.8
rgb24_to_y_128_c: 284.8
rgb24_to_y_128_sse2: 83.3
rgb24_to_y_128_ssse3: 66.8
rgb24_to_y_128_avx: 64.8
rgb24_to_y_128_avx2: 39.3
rgb24_to_y_1080_c: 2451.3
rgb24_to_y_1080_sse2: 696.3
rgb24_to_y_1080_ssse3: 516.8
rgb24_to_y_1080_avx: 518.8
rgb24_to_y_1080_avx2: 301.8
rgb24_to_y_1280_c: 2892.8
rgb24_to_y_1280_sse2: 816.8
rgb24_to_y_1280_ssse3: 623.3
rgb24_to_y_1280_avx: 616.3
rgb24_to_y_1280_avx2: 350.8
rgb24_to_y_1920_c: 4338.8
rgb24_to_y_1920_sse2: 1210.8
rgb24_to_y_1920_ssse3: 928.3
rgb24_to_y_1920_avx: 920.3
rgb24_to_y_1920_avx2: 534.8
Signed-off-by: James Almer <jamrial@gmail.com>
2024-06-09 13:42:09 -03:00
James Almer
6743c2fc6a
checkasm/sw_rgb: test rgb32/rgb32_1 to yuv
...
Test all four pixel formats, but only bench the two native endian ones for a
given target.
Signed-off-by: James Almer <jamrial@gmail.com>
2024-06-09 12:29:49 -03:00
James Almer
91b9af0058
x86/aacencdsp: add AVX version of quantize_bands
...
quant_bands_signed_c: 1928.0
quant_bands_signed_sse2: 406.0
quant_bands_signed_avx: 207.0
quant_bands_unsigned_c: 1702.0
quant_bands_unsigned_sse2: 404.0
quant_bands_unsigned_avx: 209.0
Signed-off-by: James Almer <jamrial@gmail.com>
2024-06-09 12:29:49 -03:00
Rémi Denis-Courmont
7a3369398f
sws/input: R-V V 32-bit RGB to halved UV
...
T-Head C908:
abgr_to_uv_half_8_c: 2.2
abgr_to_uv_half_8_rvv_i32: 3.5
abgr_to_uv_half_128_c: 44.0
abgr_to_uv_half_128_rvv_i32: 13.0
abgr_to_uv_half_1080_c: 245.0
abgr_to_uv_half_1080_rvv_i32: 107.2
abgr_to_uv_half_1920_c: 406.2
abgr_to_uv_half_1920_rvv_i32: 188.7
bgra_to_uv_half_8_c: 2.2
bgra_to_uv_half_8_rvv_i32: 3.5
bgra_to_uv_half_128_c: 26.5
bgra_to_uv_half_128_rvv_i32: 13.0
bgra_to_uv_half_1080_c: 219.7
bgra_to_uv_half_1080_rvv_i32: 107.0
bgra_to_uv_half_1920_c: 406.7
bgra_to_uv_half_1920_rvv_i32: 188.7
SpacemiT X60:
abgr_to_uv_half_8_c: 2.2
abgr_to_uv_half_8_rvv_i32: 3.0
abgr_to_uv_half_128_c: 28.2
abgr_to_uv_half_128_rvv_i32: 5.7
abgr_to_uv_half_1080_c: 235.5
abgr_to_uv_half_1080_rvv_i32: 47.7
abgr_to_uv_half_1920_c: 418.2
abgr_to_uv_half_1920_rvv_i32: 84.0
bgra_to_uv_half_8_c: 2.0
bgra_to_uv_half_8_rvv_i32: 3.0
bgra_to_uv_half_128_c: 23.7
bgra_to_uv_half_128_rvv_i32: 5.7
bgra_to_uv_half_1080_c: 195.5
bgra_to_uv_half_1080_rvv_i32: 47.7
bgra_to_uv_half_1920_c: 346.5
bgra_to_uv_half_1920_rvv_i32: 84.0
2024-06-09 14:33:04 +03:00
Rémi Denis-Courmont
e2f069905e
sws/input: R-V V 32-bit RGB to UV
2024-06-09 14:33:04 +03:00
Rémi Denis-Courmont
f5555cb106
sws/input: R-V V 32-bit RGB to Y
...
T-Head C908:
abgr_to_y_8_c: 2.5
abgr_to_y_8_rvv_i32: 2.2
abgr_to_y_128_c: 37.0
abgr_to_y_128_rvv_i32: 8.5
abgr_to_y_1080_c: 327.0
abgr_to_y_1080_rvv_i32: 69.5
abgr_to_y_1920_c: 552.0
abgr_to_y_1920_rvv_i32: 122.2
bgra_to_y_8_c: 2.5
bgra_to_y_8_rvv_i32: 2.2
bgra_to_y_128_c: 37.2
bgra_to_y_128_rvv_i32: 8.5
bgra_to_y_1080_c: 310.2
bgra_to_y_1080_rvv_i32: 69.5
bgra_to_y_1920_c: 568.2
bgra_to_y_1920_rvv_i32: 122.5
SpacemiT X60:
abgr_to_y_8_c: 2.5
abgr_to_y_8_rvv_i32: 2.0
abgr_to_y_128_c: 33.0
abgr_to_y_128_rvv_i32: 3.7
abgr_to_y_1080_c: 276.0
abgr_to_y_1080_rvv_i32: 31.5
abgr_to_y_1920_c: 493.7
abgr_to_y_1920_rvv_i32: 55.5
bgra_to_y_8_c: 2.2
bgra_to_y_8_rvv_i32: 2.0
bgra_to_y_128_c: 33.0
bgra_to_y_128_rvv_i32: 3.7
bgra_to_y_1080_c: 276.0
bgra_to_y_1080_rvv_i32: 31.5
bgra_to_y_1920_c: 490.7
bgra_to_y_1920_rvv_i32: 55.5
2024-06-09 14:33:04 +03:00
Andreas Rheinhardt
8b62fb231a
swscale/x86/rgb2rgb: Detemplatize
...
Every function in rgb2rgb_template.c is only compiled exactly
once; there is no overlap at all between the MMXEXT and the
SSE2 functions, so detemplatize it.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2024-06-09 12:03:47 +02:00
Andreas Rheinhardt
5421dee0e7
swscale/x86/rgb2rgb_template: Remove unused uyvytoyv12
...
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2024-06-09 12:03:47 +02:00
Andreas Rheinhardt
c1c35380a7
swscale/x86/rgb2rgb: Don't unnecessarily check for inline ASM
...
The SSE2 and AVX versions of deinterleaveBytes are external ASM.
Move them out of the inline ASM template.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2024-06-09 12:03:47 +02:00
Andreas Rheinhardt
f7305eb3b3
swscale/x86/rgb2rgb_template: Remove unnecessary SFENCE
...
The ff_nv12ToUV_* functions don't use non-temporal stores
at all.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2024-06-09 12:03:47 +02:00
Andreas Rheinhardt
fca796ac3b
tests/checkasm/sw_rgb: Be more strict about clobbering MMX state
...
The MMXEXT versions of the rgb2rgb functions tested here
always emit emms on their own. Therefore one can use
a stricter test to ensure that it stays that way.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2024-06-09 12:03:47 +02:00
Andreas Rheinhardt
3af6136669
avcodec/dnxhdenc: Simplify padding
...
It is unnecessary to first pad to 32 bits; the memset later
will pad everything with zeroes anyway.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2024-06-09 10:59:33 +02:00
Andreas Rheinhardt
b0e0b3c58a
avcodec/dnxhdenc: Move PutBitContext from ctx to stack
...
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2024-06-09 10:59:33 +02:00
Andreas Rheinhardt
542abee213
avcodec/cbs_h266_syntax_template: Use correct format specifier
...
H266RawSliceHeader.num_entry_points is a uint32_t.
Fixes -Wformat warnings:
https://fate.ffmpeg.org/log.cgi?slot=aarch64-osx-clang-1200.0.32.29&time=20240604151047&log=compile
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2024-06-09 10:59:33 +02:00
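The 542abee213 message does not say which specifier was chosen. As a sketch of one portable option, the `PRIu32` macro from `<inttypes.h>` always matches `uint32_t`, whereas a hard-coded `%u` or `%lu` can trigger -Wformat on platforms where the underlying type differs. The function name `format_entry_points` and the output text are hypothetical.

```c
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: formats a uint32_t count with the matching
 * specifier. Returns the snprintf result. */
static int format_entry_points(char *buf, size_t size, uint32_t num_entry_points)
{
    return snprintf(buf, size, "num_entry_points=%" PRIu32, num_entry_points);
}
```

String literal concatenation joins the format text and the macro at compile time, so the call stays a single format string.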
Andreas Rheinhardt
8f199cfb5b
avformat/evc: Fix format specifiers
...
Fixes -Wformat warnings; see e.g.
https://fate.ffmpeg.org/log.cgi?slot=aarch64-osx-clang-1200.0.32.29&time=20240604151047&log=compile
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2024-06-09 10:59:33 +02:00
Andreas Rheinhardt
5f31a4fd16
avformat/vvc: Don't use uint8_t iterators, fix shadowing
...
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2024-06-09 10:59:33 +02:00
Andreas Rheinhardt
1c4362cce9
avformat/vvc: Fix comment
...
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2024-06-09 10:59:33 +02:00
Andreas Rheinhardt
fa77dc8c44
avformat/vvc: Reindent after the previous commit
...
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2024-06-09 10:59:33 +02:00
Andreas Rheinhardt
8b6c7e7cda
avformat/vvc: Fix crash on allocation failure, avoid allocations
...
This is the VVC version of 8b5d155301.
(Hint: This ensures that the order of NALU arrays is OPI-VPS-SPS-PPS-
Prefix-SEI-Suffix-SEI, regardless of the order in the original
extradata. I hope this is right.)
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2024-06-09 10:59:33 +02:00
Andreas Rheinhardt
4482b3353d
avformat/vvc: Don't use ff_copy_bits()
...
There is no benefit in using it: The fast path of copying
is not taken because of misalignment; furthermore we are
only dealing with a few bytes here anyway, so simply copy
the bytes manually, avoiding the dependency on bitstream.c
in lavf (which also contains a function that is completely
unused in lavf).
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2024-06-09 10:59:33 +02:00
Andreas Rheinhardt
52fb49a8a3
avformat/vvc: Use put_bytes_output()
...
The PutBitContext has just been flushed.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2024-06-09 10:59:33 +02:00
Andreas Rheinhardt
dd8fb0aaae
avcodec/hevc/Makefile: Move rules for lavc/* files to lavc/Makefile
...
If any of these files (say A) would be changed in such a way
that A acquires a new dependency on another file B, building B
would need to be added to all the rules that lead to A being built.
Yet currently the rules for several files are spread over
the lavc Makefile and the Makefile of the lavc/hevc subdir, making
it more likely to be forgotten. So move the rules for these files
to the lavc/Makefile.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2024-06-09 10:59:33 +02:00
Rémi Denis-Courmont
daac101e61
lavc/aacencdsp: fix rounding in R-V V quantize_bands
...
We need to round toward zero here.
2024-06-08 18:30:43 +03:00
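The daac101e61 message says the conversion must round toward zero. In scalar C, a float-to-int cast already truncates toward zero, which differs from rounding to nearest for values with a fractional part of one half or more. The sketch below only contrasts the two behaviours; it is not the R-V V code, and the function names are hypothetical.

```c
/* Truncation toward zero: a C cast from float to int discards the
 * fractional part. */
static int to_int_toward_zero(float x)
{
    return (int)x;
}

/* For contrast: round half away from zero, built from the same cast. */
static int to_int_nearest(float x)
{
    return (int)(x < 0.0f ? x - 0.5f : x + 0.5f);
}
```

For an input of 2.7 the first function yields 2 and the second yields 3; for -2.7 they yield -2 and -3. A vector implementation that rounds to nearest would therefore diverge from a C reference that truncates.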
Rémi Denis-Courmont
658439934b
lavc/vp8dsp: R-V V vp8_idct_add
...
T-Head C908 (cycles):
vp8_idct_add_c: 312.2
vp8_idct_add_rvv_i32: 117.0
2024-06-08 18:30:43 +03:00
Rémi Denis-Courmont
e0f4d185f1
sws/input: R-V V rgb24ToUV_half and bgr24ToUV_half
...
T-Head C908:
rgb24_to_uv_half_4_c: 2.0
rgb24_to_uv_half_4_rvv_i32: 3.5
rgb24_to_uv_half_64_c: 27.0
rgb24_to_uv_half_64_rvv_i32: 12.5
rgb24_to_uv_half_540_c: 223.7
rgb24_to_uv_half_540_rvv_i32: 105.2
rgb24_to_uv_half_640_c: 265.5
rgb24_to_uv_half_640_rvv_i32: 123.7
rgb24_to_uv_half_960_c: 414.5
rgb24_to_uv_half_960_rvv_i32: 249.5
SpacemiT X60:
rgb24_to_uv_half_4_c: 1.7
rgb24_to_uv_half_4_rvv_i32: 4.2
rgb24_to_uv_half_64_c: 24.0
rgb24_to_uv_half_64_rvv_i32: 8.7
rgb24_to_uv_half_540_c: 199.2
rgb24_to_uv_half_540_rvv_i32: 72.5
rgb24_to_uv_half_640_c: 235.7
rgb24_to_uv_half_640_rvv_i32: 85.2
rgb24_to_uv_half_960_c: 353.5
rgb24_to_uv_half_960_rvv_i32: 127.5
2024-06-08 18:30:43 +03:00
Rémi Denis-Courmont
3ef5867e4b
sws/input: R-V V rgb24ToUV and bgr24ToUV
...
T-Head C908:
rgb24_to_uv_8_c: 2.7
rgb24_to_uv_8_rvv_i32: 3.2
rgb24_to_uv_128_c: 41.0
rgb24_to_uv_128_rvv_i32: 12.7
rgb24_to_uv_1080_c: 342.5
rgb24_to_uv_1080_rvv_i32: 105.7
rgb24_to_uv_1280_c: 406.0
rgb24_to_uv_1280_rvv_i32: 124.2
rgb24_to_uv_1920_c: 626.0
rgb24_to_uv_1920_rvv_i32: 186.0
SpacemiT X60:
rgb24_to_uv_8_c: 2.5
rgb24_to_uv_8_rvv_i32: 3.0
rgb24_to_uv_128_c: 36.5
rgb24_to_uv_128_rvv_i32: 5.7
rgb24_to_uv_1080_c: 304.2
rgb24_to_uv_1080_rvv_i32: 49.0
rgb24_to_uv_1280_c: 360.5
rgb24_to_uv_1280_rvv_i32: 57.5
rgb24_to_uv_1920_c: 540.7
rgb24_to_uv_1920_rvv_i32: 86.2
2024-06-08 18:30:43 +03:00
Rémi Denis-Courmont
79dfdac4db
sws/input: R-V V rgb24ToY & bgr24ToY
...
T-Head C908:
rgb24_to_y_8_c: 2.0
rgb24_to_y_8_rvv_i32: 2.7
rgb24_to_y_128_c: 26.2
rgb24_to_y_128_rvv_i32: 9.2
rgb24_to_y_1080_c: 219.5
rgb24_to_y_1080_rvv_i32: 76.2
rgb24_to_y_1280_c: 276.2
rgb24_to_y_1280_rvv_i32: 89.7
rgb24_to_y_1920_c: 389.7
rgb24_to_y_1920_rvv_i32: 134.2
SpacemiT X60:
rgb24_to_y_8_c: 1.7
rgb24_to_y_8_rvv_i32: 2.2
rgb24_to_y_128_c: 23.2
rgb24_to_y_128_rvv_i32: 4.2
rgb24_to_y_1080_c: 195.0
rgb24_to_y_1080_rvv_i32: 33.7
rgb24_to_y_1280_c: 231.0
rgb24_to_y_1280_rvv_i32: 40.0
rgb24_to_y_1920_c: 346.2
rgb24_to_y_1920_rvv_i32: 59.7
2024-06-08 18:30:43 +03:00
Wenbin Chen
7560db937d
libavfi/dnn: enable LibTorch xpu device option support
...
Add xpu device support to libtorch backend.
To enable xpu support you need to add
"-Wl,--no-as-needed -lintel-ext-pt-gpu -Wl,--as-needed" to
"--extra-libs" when configuring ffmpeg.
Signed-off-by: Wenbin Chen <wenbin.chen@intel.com>
2024-06-08 19:45:21 +08:00
Nuo Mi
f68f40736f
avcodec/vvcdec: support mv wraparound
...
A tool specific to 360-degree video;
see https://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=9503377
passed files:
DMVR_A_Huawei_3.bit
WRAP_D_InterDigital_4.bit
WRAP_A_InterDigital_4.bit
WRAP_B_InterDigital_4.bit
WRAP_C_InterDigital_4.bit
ERP_A_MediaTek_3.bit
2024-06-08 17:45:55 +08:00
Nuo Mi
685174069f
avcodec/vvcdec: misc, reindent inter.c
2024-06-08 17:45:55 +08:00
Nuo Mi
a4013e748a
avcodec/vvcdec: refact out emulated_edge_no_wrap
...
prepare for reference wraparound
2024-06-08 17:45:55 +08:00
Nuo Mi
8abdf0a28e
avcodec/vvcdec: misc, move src offset inside emulated_edge
2024-06-08 17:45:55 +08:00
Nuo Mi
2d98786fee
avcodec/vvcdec: refact, remove emulated_edge_dmvr and emulated_edge_bilinear to simplify code
2024-06-08 17:45:55 +08:00