Rémi Denis-Courmont
b3825bbe45
riscv: test for assembler support
...
This should fix the build on LLVM 16 and earlier, at the cost of turning
all non-RVV optimisations off.
2023-12-08 17:21:09 +02:00
sunyuechi
0b9d009b4a
lavc/vc1dsp: R-V V inv_trans
...
C908:
vc1dsp.vc1_inv_trans_4x4_dc_c: 125.7
vc1dsp.vc1_inv_trans_4x4_dc_rvv_i32: 53.5
vc1dsp.vc1_inv_trans_4x8_dc_c: 230.7
vc1dsp.vc1_inv_trans_4x8_dc_rvv_i32: 65.5
vc1dsp.vc1_inv_trans_8x4_dc_c: 228.7
vc1dsp.vc1_inv_trans_8x4_dc_rvv_i64: 64.5
vc1dsp.vc1_inv_trans_8x8_dc_c: 476.5
vc1dsp.vc1_inv_trans_8x8_dc_rvv_i64: 80.2
Signed-off-by: Rémi Denis-Courmont <remi@remlab.net>
2023-12-08 17:20:48 +02:00
Mikhail Nitenko
0f745b74ec
lavc/aarch64: h264qpel, add 10-bit lowpass_8_10 based functions
...
Benchmarks A53 A55 A72 A76
avg_h264_qpel_8_mc01_10_c: 936.5 924.0 656.0 504.7
avg_h264_qpel_8_mc01_10_neon: 234.7 202.0 120.7 63.2
avg_h264_qpel_8_mc02_10_c: 921.0 920.0 669.2 493.7
avg_h264_qpel_8_mc02_10_neon: 202.0 173.2 102.7 58.5
avg_h264_qpel_8_mc03_10_c: 936.5 924.0 656.0 509.5
avg_h264_qpel_8_mc03_10_neon: 236.2 203.7 120.0 63.2
avg_h264_qpel_8_mc10_10_c: 1441.0 1437.7 806.7 478.5
avg_h264_qpel_8_mc10_10_neon: 325.7 324.0 153.7 94.2
avg_h264_qpel_8_mc11_10_c: 2160.7 2148.2 1366.7 906.7
avg_h264_qpel_8_mc11_10_neon: 492.0 464.0 242.5 134.5
avg_h264_qpel_8_mc13_10_c: 2157.0 2138.2 1357.0 908.2
avg_h264_qpel_8_mc13_10_neon: 494.0 467.2 242.0 140.0
avg_h264_qpel_8_mc20_10_c: 1433.5 1410.0 785.2 486.0
avg_h264_qpel_8_mc20_10_neon: 293.7 289.7 138.0 91.5
avg_h264_qpel_8_mc30_10_c: 1458.5 1461.7 813.7 483.2
avg_h264_qpel_8_mc30_10_neon: 341.7 339.2 154.0 95.2
avg_h264_qpel_8_mc31_10_c: 2194.7 2197.2 1358.7 928.0
avg_h264_qpel_8_mc31_10_neon: 520.0 495.0 245.5 142.5
avg_h264_qpel_8_mc33_10_c: 2188.0 2205.5 1356.7 910.7
avg_h264_qpel_8_mc33_10_neon: 521.0 494.5 245.7 145.7
avg_h264_qpel_16_mc01_10_c: 3717.2 3595.0 2610.0 2012.0
avg_h264_qpel_16_mc01_10_neon: 920.5 791.5 483.2 240.5
avg_h264_qpel_16_mc02_10_c: 3684.0 3633.0 2659.0 1919.7
avg_h264_qpel_16_mc02_10_neon: 790.7 678.2 409.2 217.0
avg_h264_qpel_16_mc03_10_c: 3726.5 3596.0 2606.7 2010.0
avg_h264_qpel_16_mc03_10_neon: 922.0 792.5 483.2 239.7
avg_h264_qpel_16_mc10_10_c: 5912.0 5803.2 3241.5 1916.7
avg_h264_qpel_16_mc10_10_neon: 1267.5 1277.2 616.5 365.0
avg_h264_qpel_16_mc11_10_c: 8599.2 8482.5 5338.0 3616.2
avg_h264_qpel_16_mc11_10_neon: 1913.0 1827.0 956.2 542.2
avg_h264_qpel_16_mc13_10_c: 8643.7 8488.5 5388.0 3628.5
avg_h264_qpel_16_mc13_10_neon: 1914.7 1828.7 969.2 530.5
avg_h264_qpel_16_mc20_10_c: 5719.5 5641.0 3147.0 1946.2
avg_h264_qpel_16_mc20_10_neon: 1139.5 1150.0 539.5 344.0
avg_h264_qpel_16_mc30_10_c: 5930.0 5872.5 3267.5 1918.0
avg_h264_qpel_16_mc30_10_neon: 1331.5 1341.2 616.5 369.5
avg_h264_qpel_16_mc31_10_c: 8758.7 8697.7 5353.0 3630.7
avg_h264_qpel_16_mc31_10_neon: 2018.7 1941.7 982.2 574.7
avg_h264_qpel_16_mc33_10_c: 8683.2 8675.2 5339.2 3634.7
avg_h264_qpel_16_mc33_10_neon: 2019.7 1940.2 994.5 566.0
put_h264_qpel_8_mc01_10_c: 854.2 843.0 599.2 478.0
put_h264_qpel_8_mc01_10_neon: 192.7 168.0 101.7 56.7
put_h264_qpel_8_mc02_10_c: 766.5 760.0 550.2 441.0
put_h264_qpel_8_mc02_10_neon: 160.0 139.2 88.7 53.0
put_h264_qpel_8_mc03_10_c: 854.2 843.0 599.2 479.0
put_h264_qpel_8_mc03_10_neon: 194.2 169.7 102.0 56.2
put_h264_qpel_8_mc10_10_c: 1352.7 1353.7 749.7 446.7
put_h264_qpel_8_mc10_10_neon: 289.7 294.2 135.5 88.5
put_h264_qpel_8_mc11_10_c: 2080.0 2066.2 1309.5 876.7
put_h264_qpel_8_mc11_10_neon: 450.0 429.7 229.7 131.2
put_h264_qpel_8_mc13_10_c: 2074.7 2060.2 1294.5 870.5
put_h264_qpel_8_mc13_10_neon: 452.5 434.5 226.5 130.0
put_h264_qpel_8_mc20_10_c: 1221.5 1216.0 684.5 399.7
put_h264_qpel_8_mc20_10_neon: 257.7 262.5 121.2 78.7
put_h264_qpel_8_mc30_10_c: 1379.0 1374.7 757.2 449.5
put_h264_qpel_8_mc30_10_neon: 305.7 310.2 135.5 86.5
put_h264_qpel_8_mc31_10_c: 2109.2 2119.7 1299.5 878.0
put_h264_qpel_8_mc31_10_neon: 478.0 458.5 226.0 137.2
put_h264_qpel_8_mc33_10_c: 2101.5 2115.2 1306.5 887.0
put_h264_qpel_8_mc33_10_neon: 479.0 458.7 229.7 141.7
put_h264_qpel_16_mc01_10_c: 3485.7 3396.7 2460.5 1914.5
put_h264_qpel_16_mc01_10_neon: 752.5 665.5 397.0 213.2
put_h264_qpel_16_mc02_10_c: 3103.5 3023.2 2154.7 1720.7
put_h264_qpel_16_mc02_10_neon: 622.7 551.2 347.7 196.2
put_h264_qpel_16_mc03_10_c: 3486.2 3394.0 2436.5 1917.7
put_h264_qpel_16_mc03_10_neon: 754.0 666.5 397.0 215.7
put_h264_qpel_16_mc10_10_c: 5533.0 5488.5 2989.0 1783.0
put_h264_qpel_16_mc10_10_neon: 1123.5 1165.2 535.2 334.7
put_h264_qpel_16_mc11_10_c: 8437.7 8281.2 5209.0 3510.7
put_h264_qpel_16_mc11_10_neon: 1745.0 1697.0 878.5 513.5
put_h264_qpel_16_mc13_10_c: 8567.7 8468.0 5221.5 3528.0
put_h264_qpel_16_mc13_10_neon: 1751.7 1698.2 889.2 507.0
put_h264_qpel_16_mc20_10_c: 4907.5 4885.0 2786.2 1607.5
put_h264_qpel_16_mc20_10_neon: 995.5 1034.5 475.5 307.0
put_h264_qpel_16_mc30_10_c: 5579.7 5537.7 3045.2 1789.5
put_h264_qpel_16_mc30_10_neon: 1187.5 1231.2 532.5 334.5
put_h264_qpel_16_mc31_10_c: 8677.2 8672.5 5204.2 3516.0
put_h264_qpel_16_mc31_10_neon: 1850.7 1813.2 893.0 545.2
put_h264_qpel_16_mc33_10_c: 8688.7 8671.2 5223.2 3512.0
put_h264_qpel_16_mc33_10_neon: 1851.7 1814.2 908.5 535.2
Signed-off-by: Mikhail Nitenko <mnitenko@gmail.com>
Signed-off-by: Martin Storsjö <martin@martin.st>
2023-12-07 23:20:14 +02:00
Haihao Xiang
f89cff96d0
lavu/hwcontext_qsv: Make sure hardware vendor is Intel for qsv on d3d11va
...
When multiple hardware devices are available, the default one might not
be an Intel device. We can use the vendor_id option to choose the
required vendor.
Tested-by: Artem Galin <artem.galin@intel.com>
Signed-off-by: Haihao Xiang <haihao.xiang@intel.com>
2023-12-07 10:32:16 +08:00
Haihao Xiang
e5f8b5313e
doc/ffmpeg: Update the description about d3d11va
...
Signed-off-by: Haihao Xiang <haihao.xiang@intel.com>
2023-12-07 10:32:16 +08:00
Artem Galin
a556be69a7
lavu/hwcontext_d3d11va: Add option vendor_id
...
The user may choose the hardware via the vendor_id option when multiple
hardware devices are available.
Signed-off-by: Artem Galin <artem.galin@intel.com>
Signed-off-by: Haihao Xiang <haihao.xiang@intel.com>
2023-12-07 10:32:16 +08:00
Cosmin Stejerean
737ede405b
avfilter/bwdif: account for chroma sub-sampling in min size calculation
...
The current logic for detecting frames that are too small for the
algorithm does not account for chroma sub-sampling, and so a sample
where the luma plane is large enough, but the chroma planes are not
will not be rejected. In that event, a heap overflow will occur.
This change adjusts the logic to consider the chroma planes and makes
the change to all three bwdif implementations.
Fixes #10688
Signed-off-by: Cosmin Stejerean <cosmin@cosmin.at>
Reviewed-by: Thomas Mundt <tmundt75@gmail.com>
Signed-off-by: Philip Langdale <philipl@overt.org>
2023-12-07 10:00:12 +08:00
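The chroma-aware size check described in 737ede405b can be illustrated with a minimal sketch. This is not the code from the bwdif filters: the helper names are hypothetical, and the minimum width and height are passed in as parameters because the commit message does not state them. The rounding in ceil_rshift is meant to match what FFmpeg's AV_CEIL_RSHIFT macro computes.

```c
#include <stdbool.h>

/* Hypothetical helper: ceil(x / 2^shift) for non-negative x. */
static int ceil_rshift(int x, int shift)
{
    return (x + (1 << shift) - 1) >> shift;
}

/* Hypothetical check: a frame is acceptable only if its chroma planes
 * meet the minimum size. Chroma planes are never larger than the luma
 * plane, so testing them also covers luma.
 * min_w and min_h are assumptions supplied by the caller. */
static bool frame_large_enough(int w, int h,
                               int log2_chroma_w, int log2_chroma_h,
                               int min_w, int min_h)
{
    int chroma_w = ceil_rshift(w, log2_chroma_w);
    int chroma_h = ceil_rshift(h, log2_chroma_h);

    return chroma_w >= min_w && chroma_h >= min_h;
}
```

With 4:2:0 sub-sampling (both shifts equal to 1), a 4x4 frame has 2x2 chroma planes, so a check that only looks at luma would accept a frame whose chroma planes are too small.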
sunyuechi
8bdb663062
lavc/ac3dsp: R-V V float_to_fixed24
...
c910
float_to_fixed24_c: 2207.2
float_to_fixed24_rvv_f32: 696.2
Signed-off-by: Rémi Denis-Courmont <remi@remlab.net>
2023-12-06 16:04:22 +02:00
Paul B Mahol
ebfd3912e9
avfilter/asrc_flite: use streaming function
...
Fix continuous accumulation of audio samples for large text inputs.
2023-12-06 10:52:46 +01:00
Paul B Mahol
d793af982e
avfilter/asrc_flite: switch to activate
...
Allows setting the EOF timestamp.
2023-12-06 10:52:45 +01:00
Anton Khirnov
99d2fa38ad
fftools/ffmpeg: make sure FrameData is writable when we modify it
...
Also, add a function that returns const FrameData* for cases that only
read from it.
2023-12-06 10:30:28 +01:00
Anton Khirnov
1d536e0283
fftools/ffmpeg_filter: track input/output index in {Input,Output}FilterPriv
...
Will be useful in following commits.
2023-12-06 10:01:21 +01:00
Anton Khirnov
c6483f1c2a
lavfi/buffersink: avoid leaking peeked_frame on uninit
2023-12-06 10:01:09 +01:00
Nuo Mi
e78c6a1f2c
MAINTAINERS: add myself as vvc maintainer
2023-12-05 20:51:37 +01:00
Paul B Mahol
3047f05a99
tests/fate: add asegment filter tests
2023-12-05 14:50:40 +01:00
Paul B Mahol
7e453dad3c
avcodec/qoadec: fix overreads and fix packet size check
2023-12-05 14:50:21 +01:00
Jean-Baptiste Kempf
6e26a5a64e
doc/developer: require asm for RISC-V
...
Explicitly document our usage of assembly, following suit with other
architectures.
Signed-off-by: J. Dekker <jdek@itanimul.li>
2023-12-05 14:44:18 +01:00
Michael Niedermayer
22daf2148f
avcodec/av1dec: Fix resolving zero divisor
...
Fixes: Out of array read
Fixes: global-buffer-overflow-AV1
Found-by: "Leonelli, Matteo" <matteo.leonelli@cispa.de>
Tested-by: "Wang, Fei W" <fei.w.wang@intel.com>
Reviewed-by: "Wang, Fei W" <fei.w.wang@intel.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2023-12-05 12:38:16 +01:00
Michael Niedermayer
4cdf2c7f76
avformat/mov: Ignore duplicate ftyp
...
Fixes: switch_1080p_720p.mp4
Found-by: Dale Curtis <dalecurtis@chromium.org>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2023-12-05 12:12:10 +01:00
Leo Izen
b60d39fbe2
fate/jpegxl: add parser test for extboxes and small files
...
Add a fate test for the above commits fixing extremely small files or
files with extended box sizes.
Signed-off-by: Leo Izen <leo.izen@gmail.com>
2023-12-05 05:54:58 -05:00
Leo Izen
c4be080e65
avcodec/jpegxl_parser: fix parsing sequences of extremely small files
...
This patch allows the JXL parser to parse sequences of extremely small
files concatenated together (e.g. files smaller than the parser buffer).
Signed-off-by: Leo Izen <leo.izen@gmail.com>
2023-12-05 05:54:34 -05:00
Leo Izen
019b3ea65a
avcodec/jpegxl_parse{,r}: use correct ISOBMFF extended size location
...
According to ISO/IEC 14496-12, size == 1 means a 64-bit extended-size
field occurs *after* the 32-bit box type, not before. This fix should
allow correct parsing of JXL files with extended-size boxes.
Signed-off-by: Leo Izen <leo.izen@gmail.com>
2023-12-05 05:53:32 -05:00
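The box header layout that 019b3ea65a refers to can be sketched as follows. This is an illustrative parser, not the code in jpegxl_parser: the function names are hypothetical, and the handling of size == 0 (box extends to the end of the file) is left to the caller.

```c
#include <stddef.h>
#include <stdint.h>

/* Read a big-endian 32-bit value. */
static uint32_t rb32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Read a big-endian 64-bit value. */
static uint64_t rb64(const uint8_t *p)
{
    return ((uint64_t)rb32(p) << 32) | rb32(p + 4);
}

/* Hypothetical helper: parse an ISOBMFF box header.
 * Layout: 32-bit size, 32-bit type, then (only if size == 1) a 64-bit
 * extended size that follows the type.
 * Returns the header length (8 or 16), or 0 if buf is too short. */
static int parse_box_header(const uint8_t *buf, size_t len,
                            uint64_t *size, uint32_t *type)
{
    uint32_t size32;

    if (len < 8)
        return 0;
    size32 = rb32(buf);
    *type  = rb32(buf + 4);
    if (size32 == 1) {
        if (len < 16)
            return 0;
        *size = rb64(buf + 8);  /* extended size comes after the type */
        return 16;
    }
    *size = size32;
    return 8;
}
```

Reading the 64-bit field before the type would interpret the type bytes as part of the size, which is the kind of misparse the commit describes.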
Haihao Xiang
35a555e2b9
lavfi/vf_vpp_qsv: set the default value of async_depth to 4
...
Both qsv encoders and decoders use 4 as the default value of
async_depth, so use 4 as the default value for the vpp_qsv filter too.
Signed-off-by: Haihao Xiang <haihao.xiang@intel.com>
2023-12-05 10:12:16 +08:00
Haihao Xiang
05debdaa5f
lavu/hwcontext_qsv: use mfxImplDescription instead of mfxExtendedDeviceId on Linux
...
mfxExtendedDeviceId may not be supported in certain configurations of
oneVPL on Linux, so we can't ensure a property filter for
mfxExtendedDeviceId.DeviceID or mfxExtendedDeviceId.VendorID works as
expected. This fixes the issue mentioned in [1].
[1] http://ffmpeg.org/pipermail/ffmpeg-user/2023-October/056983.html
Signed-off-by: Haihao Xiang <haihao.xiang@intel.com>
2023-12-05 10:11:19 +08:00
Haihao Xiang
fc73b372cd
lavc/qsvdec: reduce info message when more data is required
...
Demote the message to AV_LOG_VERBOSE.
Signed-off-by: Haihao Xiang <haihao.xiang@intel.com>
2023-12-05 10:10:57 +08:00
Haihao Xiang
e233f3e75f
lavc/qsvdec: return 0 if more data is required
...
QSV decoders are of type FF_CODEC_CB_TYPE_DECODE, which must not
return AVERROR(EAGAIN). Commit 42b20c9 added an assertion to check the
returned value.
Signed-off-by: Haihao Xiang <haihao.xiang@intel.com>
2023-12-05 10:10:57 +08:00
Haihao Xiang
5717fbbea2
configure: don't warn deprecated symbols from libvpl
...
libvpl deprecated some symbols (e.g. MFX_EXTBUFF_VPP_DENOISE2 is used to
replace MFX_EXTBUFF_VPP_DENOISE), however the new symbols aren't supported
by the MediaSDK runtime. In order to support the combination of libvpl and
MediaSDK runtime on legacy devices, we continue to use the deprecated
symbols in FFmpeg. This patch adds -DMFX_DEPRECATED_OFF to CFLAGS to
silence the corresponding compilation warnings.
Signed-off-by: Haihao Xiang <haihao.xiang@intel.com>
2023-12-05 10:10:29 +08:00
Haihao Xiang
d36d9994e4
lavu/hwcontext_vaapi: ignore nonexistent device in default DRM device selection
...
It is possible that renderD128 doesn't exist but renderD129 is
available on a system (see [1]). This change makes sure the default
DRM device selection works even if renderD128 doesn't exist.
[1] https://github.com/intel/intel-device-plugins-for-kubernetes/blob/main/cmd/gpu_plugin/README.md#issues-with-media-workloads-on-multi-gpu-setups
Signed-off-by: Haihao Xiang <haihao.xiang@intel.com>
2023-12-05 10:09:55 +08:00
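The selection behaviour described in d36d9994e4 can be sketched as a scan that skips missing nodes instead of stopping at the first one. This is an assumption-laden illustration, not the hwcontext_vaapi code: pick_render_node, probe_fn and only_d129 are hypothetical names, the /dev/dri path is the usual Linux location rather than something stated in the commit, and the probe is injected so the example does not depend on real hardware.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Caller-supplied probe: returns nonzero if the node at path is usable. */
typedef int (*probe_fn)(const char *path);

/* Hypothetical helper: scan renderD128 .. renderD(128 + max_devices - 1)
 * and return the number of the first usable node, or -1 if there is none.
 * On success, path holds the selected device path. */
static int pick_render_node(probe_fn probe, int max_devices,
                            char *path, size_t path_size)
{
    for (int n = 0; n < max_devices; n++) {
        snprintf(path, path_size, "/dev/dri/renderD%d", 128 + n);
        if (probe(path))
            return 128 + n;
        /* Missing or unusable node: keep scanning rather than stopping. */
    }
    return -1;
}

/* Stand-in probe for demonstration: pretends only renderD129 exists. */
static int only_d129(const char *path)
{
    return strcmp(path, "/dev/dri/renderD129") == 0;
}
```

In real code the probe would try to open the node and query the driver; the point of the sketch is only that a nonexistent renderD128 does not end the search.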
Paul B Mahol
9e74c7ae87
avfilter/af_dialoguenhance: add double-floating point sample format support
2023-12-04 23:14:40 +01:00
Paul B Mahol
704ef556fe
avfilter/af_surround: refactor some code
2023-12-04 23:14:39 +01:00
Cosmin Stejerean
634216dc40
fate: Add tests for QOA decoder
2023-12-04 23:14:38 +01:00
Kyle Swanson
9f1dbca820
avfilter/libvmaf: small cleanup for style, whitespace, unused LIBVMAFContext struct members
...
Signed-off-by: Kyle Swanson <kswanson@netflix.com>
2023-12-04 10:00:01 -08:00
Lynne
8c117b75af
lavc/Makefile: build vulkan decode code if vulkan_av1 has been enabled
...
Forgotten.
Reviewed-by: Neal Gompa <ngompa13@gmail.com>
Tested-by: Neal Gompa <ngompa13@gmail.com>
2023-12-04 07:57:27 +01:00
Paul B Mahol
d9e41ead82
avfilter/avfilter: fix OOM case for default activate
...
Fixes OOM when the caller keeps adding frames to a filtergraph
that has reached EOF by other means, for example when EOF is signalled
by another filter in the filtergraph or by buffersink.
2023-12-03 23:26:43 +01:00
Paul B Mahol
e3e3531d1e
avfilter/vsrc_gradients: make rotation always continuous if speed changes
2023-12-03 18:43:26 +01:00
Paul B Mahol
8888574d10
avfilter/vsrc_gradients: add commands support
2023-12-03 18:30:06 +01:00
Stefano Sabatini
ddecc39c39
doc/encoders/libx264: review and extend option description
...
Also, merge x264opts and x264-opts option docs to avoid duplication
and make it clearer that they provide mostly the same functionality.
2023-12-03 13:03:01 +01:00
Paul B Mahol
f84412d6f4
avfilter/vf_corr: for all zero returns zero score instead of 1
2023-12-03 03:10:05 +01:00
Paul B Mahol
aad3223978
avfilter/vf_corr: add slice threading support
2023-12-03 03:10:03 +01:00
Martin Storsjö
12598e72e3
checkasm: Fix the signature of float_to_fixed24
...
The len parameter was changed from unsigned int to size_t in
567c67c6c8.
This fixes crashes in the reference C code, when running checkasm
on aarch64.
Signed-off-by: Martin Storsjö <martin@martin.st>
2023-12-02 18:16:09 +02:00
Paul B Mahol
0a13178de8
avcodec/qoadec: add support for midstream sample rate/layout changes
2023-12-02 16:51:00 +01:00
Anton Khirnov
5230257ea1
lavc/dvdsubenc: only check canvas size when it is actually set
...
Fixes #10650
2023-12-02 11:22:46 +01:00
Anton Khirnov
6a22d80041
doc/filters:ddagrab: elaborate on the semantics of framerate
2023-12-02 11:22:46 +01:00
Alfred Wingate
e5ce473040
swscale/x86/rgb_2_rgb: Add opaque pointer to missed definitions of ff_nv12ToUV
...
Opaque parameters were previously added to the original definition of
ff_nv12ToUV, leading to gcc noticing a type mismatch with -Wlto-type-mismatch.
f2de911818
https://bugs.gentoo.org/907484
Signed-off-by: Alfred Wingate <parona@protonmail.com>
Signed-off-by: Anton Khirnov <anton@khirnov.net>
2023-12-02 11:22:46 +01:00
Dale Curtis
2182173a69
avformat/mov: Fix integer overflow in mov_read_packet().
...
Fixes https://crbug.com/1499669 :
runtime error: signed integer overflow: 9223372036853334272 + 1375731456
cannot be represented in type 'int64_t' (aka 'long')
Signed-off-by: Dale Curtis <dalecurtis@chromium.org>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2023-12-02 00:25:15 +01:00
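The overflow reported in 2182173a69 is a signed 64-bit addition whose result does not fit in int64_t. One general way to guard such an addition is shown below; this is a sketch of the technique, not the change made in mov_read_packet(), and checked_add_i64 is a hypothetical helper.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: store a + b in *out and return true if the sum
 * fits in int64_t; otherwise return false and leave *out untouched.
 * The comparison is done before adding, so no signed overflow occurs. */
static bool checked_add_i64(int64_t a, int64_t b, int64_t *out)
{
    if ((b > 0 && a > INT64_MAX - b) ||
        (b < 0 && a < INT64_MIN - b))
        return false;
    *out = a + b;
    return true;
}
```

With the operands from the error message, 9223372036853334272 and 1375731456, the helper returns false instead of invoking undefined behaviour.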
Paul B Mahol
db7b838237
avfilter/vf_chromanr: compare correct variables for advanced mode
2023-12-01 21:31:38 +01:00
Logan Lyu
fa0470347e
lavc/aarch64: new optimization for 8-bit hevc_qpel_bi_hv
...
put_hevc_qpel_bi_hv4_8_c: 433.7
put_hevc_qpel_bi_hv4_8_i8mm: 117.9
put_hevc_qpel_bi_hv6_8_c: 803.9
put_hevc_qpel_bi_hv6_8_i8mm: 252.7
put_hevc_qpel_bi_hv8_8_c: 1296.4
put_hevc_qpel_bi_hv8_8_i8mm: 316.2
put_hevc_qpel_bi_hv12_8_c: 2867.4
put_hevc_qpel_bi_hv12_8_i8mm: 669.2
put_hevc_qpel_bi_hv16_8_c: 4709.4
put_hevc_qpel_bi_hv16_8_i8mm: 929.9
put_hevc_qpel_bi_hv24_8_c: 9639.7
put_hevc_qpel_bi_hv24_8_i8mm: 2072.4
put_hevc_qpel_bi_hv32_8_c: 16663.7
put_hevc_qpel_bi_hv32_8_i8mm: 3391.4
put_hevc_qpel_bi_hv48_8_c: 36972.9
put_hevc_qpel_bi_hv48_8_i8mm: 7505.7
put_hevc_qpel_bi_hv64_8_c: 64106.4
put_hevc_qpel_bi_hv64_8_i8mm: 13145.2
Co-Authored-By: J. Dekker <jdek@itanimul.li>
Signed-off-by: Martin Storsjö <martin@martin.st>
2023-12-01 21:25:39 +02:00
Logan Lyu
595f97028b
lavc/aarch64: new optimization for 8-bit hevc_qpel_bi_v
...
put_hevc_qpel_bi_v4_8_c: 166.1
put_hevc_qpel_bi_v4_8_neon: 61.9
put_hevc_qpel_bi_v6_8_c: 309.4
put_hevc_qpel_bi_v6_8_neon: 75.6
put_hevc_qpel_bi_v8_8_c: 531.1
put_hevc_qpel_bi_v8_8_neon: 78.1
put_hevc_qpel_bi_v12_8_c: 1139.9
put_hevc_qpel_bi_v12_8_neon: 238.1
put_hevc_qpel_bi_v16_8_c: 2063.6
put_hevc_qpel_bi_v16_8_neon: 308.9
put_hevc_qpel_bi_v24_8_c: 4317.1
put_hevc_qpel_bi_v24_8_neon: 629.9
put_hevc_qpel_bi_v32_8_c: 8241.9
put_hevc_qpel_bi_v32_8_neon: 1140.1
put_hevc_qpel_bi_v48_8_c: 18422.9
put_hevc_qpel_bi_v48_8_neon: 2533.9
put_hevc_qpel_bi_v64_8_c: 37508.6
put_hevc_qpel_bi_v64_8_neon: 4520.1
Co-Authored-By: J. Dekker <jdek@itanimul.li>
Signed-off-by: Martin Storsjö <martin@martin.st>
2023-12-01 21:25:39 +02:00
Logan Lyu
00290a64f7
lavc/aarch64: new optimization for 8-bit hevc_epel_bi_hv
...
put_hevc_epel_bi_hv4_8_c: 242.9
put_hevc_epel_bi_hv4_8_i8mm: 68.6
put_hevc_epel_bi_hv6_8_c: 402.4
put_hevc_epel_bi_hv6_8_i8mm: 135.9
put_hevc_epel_bi_hv8_8_c: 636.4
put_hevc_epel_bi_hv8_8_i8mm: 145.6
put_hevc_epel_bi_hv12_8_c: 1363.1
put_hevc_epel_bi_hv12_8_i8mm: 324.1
put_hevc_epel_bi_hv16_8_c: 2222.1
put_hevc_epel_bi_hv16_8_i8mm: 509.1
put_hevc_epel_bi_hv24_8_c: 4793.4
put_hevc_epel_bi_hv24_8_i8mm: 1091.9
put_hevc_epel_bi_hv32_8_c: 8393.9
put_hevc_epel_bi_hv32_8_i8mm: 1720.6
put_hevc_epel_bi_hv48_8_c: 19526.6
put_hevc_epel_bi_hv48_8_i8mm: 4285.9
put_hevc_epel_bi_hv64_8_c: 33915.4
put_hevc_epel_bi_hv64_8_i8mm: 6783.6
Co-Authored-By: J. Dekker <jdek@itanimul.li>
Signed-off-by: Martin Storsjö <martin@martin.st>
2023-12-01 21:25:39 +02:00
Logan Lyu
0448f27f41
lavc/aarch64: new optimization for 8-bit hevc_epel_bi_v
...
put_hevc_epel_bi_v4_8_c: 138.4
put_hevc_epel_bi_v4_8_neon: 33.7
put_hevc_epel_bi_v6_8_c: 302.9
put_hevc_epel_bi_v6_8_neon: 46.7
put_hevc_epel_bi_v8_8_c: 408.7
put_hevc_epel_bi_v8_8_neon: 48.7
put_hevc_epel_bi_v12_8_c: 779.4
put_hevc_epel_bi_v12_8_neon: 139.7
put_hevc_epel_bi_v16_8_c: 1344.9
put_hevc_epel_bi_v16_8_neon: 160.2
put_hevc_epel_bi_v24_8_c: 2981.7
put_hevc_epel_bi_v24_8_neon: 344.9
put_hevc_epel_bi_v32_8_c: 5280.9
put_hevc_epel_bi_v32_8_neon: 618.4
put_hevc_epel_bi_v48_8_c: 12494.9
put_hevc_epel_bi_v48_8_neon: 1364.4
put_hevc_epel_bi_v64_8_c: 22127.7
put_hevc_epel_bi_v64_8_neon: 2473.7
Co-Authored-By: J. Dekker <jdek@itanimul.li>
Signed-off-by: Martin Storsjö <martin@martin.st>
2023-12-01 21:25:39 +02:00