FFmpeg

mirror of https://github.com/FFmpeg/FFmpeg.git synced 2024-12-18 03:19:31 +02:00

Author	SHA1	Message	Date
Logan Lyu	fa0470347e	lavc/aarch64: new optimization for 8-bit hevc_qpel_bi_hv put_hevc_qpel_bi_hv4_8_c: 433.7 put_hevc_qpel_bi_hv4_8_i8mm: 117.9 put_hevc_qpel_bi_hv6_8_c: 803.9 put_hevc_qpel_bi_hv6_8_i8mm: 252.7 put_hevc_qpel_bi_hv8_8_c: 1296.4 put_hevc_qpel_bi_hv8_8_i8mm: 316.2 put_hevc_qpel_bi_hv12_8_c: 2867.4 put_hevc_qpel_bi_hv12_8_i8mm: 669.2 put_hevc_qpel_bi_hv16_8_c: 4709.4 put_hevc_qpel_bi_hv16_8_i8mm: 929.9 put_hevc_qpel_bi_hv24_8_c: 9639.7 put_hevc_qpel_bi_hv24_8_i8mm: 2072.4 put_hevc_qpel_bi_hv32_8_c: 16663.7 put_hevc_qpel_bi_hv32_8_i8mm: 3391.4 put_hevc_qpel_bi_hv48_8_c: 36972.9 put_hevc_qpel_bi_hv48_8_i8mm: 7505.7 put_hevc_qpel_bi_hv64_8_c: 64106.4 put_hevc_qpel_bi_hv64_8_i8mm: 13145.2 Co-Authored-By: J. Dekker <jdek@itanimul.li> Signed-off-by: Martin Storsjö <martin@martin.st>	2023-12-01 21:25:39 +02:00
Logan Lyu	595f97028b	lavc/aarch64: new optimization for 8-bit hevc_qpel_bi_v put_hevc_qpel_bi_v4_8_c: 166.1 put_hevc_qpel_bi_v4_8_neon: 61.9 put_hevc_qpel_bi_v6_8_c: 309.4 put_hevc_qpel_bi_v6_8_neon: 75.6 put_hevc_qpel_bi_v8_8_c: 531.1 put_hevc_qpel_bi_v8_8_neon: 78.1 put_hevc_qpel_bi_v12_8_c: 1139.9 put_hevc_qpel_bi_v12_8_neon: 238.1 put_hevc_qpel_bi_v16_8_c: 2063.6 put_hevc_qpel_bi_v16_8_neon: 308.9 put_hevc_qpel_bi_v24_8_c: 4317.1 put_hevc_qpel_bi_v24_8_neon: 629.9 put_hevc_qpel_bi_v32_8_c: 8241.9 put_hevc_qpel_bi_v32_8_neon: 1140.1 put_hevc_qpel_bi_v48_8_c: 18422.9 put_hevc_qpel_bi_v48_8_neon: 2533.9 put_hevc_qpel_bi_v64_8_c: 37508.6 put_hevc_qpel_bi_v64_8_neon: 4520.1 Co-Authored-By: J. Dekker <jdek@itanimul.li> Signed-off-by: Martin Storsjö <martin@martin.st>	2023-12-01 21:25:39 +02:00
Logan Lyu	00290a64f7	lavc/aarch64: new optimization for 8-bit hevc_epel_bi_hv put_hevc_epel_bi_hv4_8_c: 242.9 put_hevc_epel_bi_hv4_8_i8mm: 68.6 put_hevc_epel_bi_hv6_8_c: 402.4 put_hevc_epel_bi_hv6_8_i8mm: 135.9 put_hevc_epel_bi_hv8_8_c: 636.4 put_hevc_epel_bi_hv8_8_i8mm: 145.6 put_hevc_epel_bi_hv12_8_c: 1363.1 put_hevc_epel_bi_hv12_8_i8mm: 324.1 put_hevc_epel_bi_hv16_8_c: 2222.1 put_hevc_epel_bi_hv16_8_i8mm: 509.1 put_hevc_epel_bi_hv24_8_c: 4793.4 put_hevc_epel_bi_hv24_8_i8mm: 1091.9 put_hevc_epel_bi_hv32_8_c: 8393.9 put_hevc_epel_bi_hv32_8_i8mm: 1720.6 put_hevc_epel_bi_hv48_8_c: 19526.6 put_hevc_epel_bi_hv48_8_i8mm: 4285.9 put_hevc_epel_bi_hv64_8_c: 33915.4 put_hevc_epel_bi_hv64_8_i8mm: 6783.6 Co-Authored-By: J. Dekker <jdek@itanimul.li> Signed-off-by: Martin Storsjö <martin@martin.st>	2023-12-01 21:25:39 +02:00
Logan Lyu	0448f27f41	lavc/aarch64: new optimization for 8-bit hevc_epel_bi_v put_hevc_epel_bi_v4_8_c: 138.4 put_hevc_epel_bi_v4_8_neon: 33.7 put_hevc_epel_bi_v6_8_c: 302.9 put_hevc_epel_bi_v6_8_neon: 46.7 put_hevc_epel_bi_v8_8_c: 408.7 put_hevc_epel_bi_v8_8_neon: 48.7 put_hevc_epel_bi_v12_8_c: 779.4 put_hevc_epel_bi_v12_8_neon: 139.7 put_hevc_epel_bi_v16_8_c: 1344.9 put_hevc_epel_bi_v16_8_neon: 160.2 put_hevc_epel_bi_v24_8_c: 2981.7 put_hevc_epel_bi_v24_8_neon: 344.9 put_hevc_epel_bi_v32_8_c: 5280.9 put_hevc_epel_bi_v32_8_neon: 618.4 put_hevc_epel_bi_v48_8_c: 12494.9 put_hevc_epel_bi_v48_8_neon: 1364.4 put_hevc_epel_bi_v64_8_c: 22127.7 put_hevc_epel_bi_v64_8_neon: 2473.7 Co-Authored-By: J. Dekker <jdek@itanimul.li> Signed-off-by: Martin Storsjö <martin@martin.st>	2023-12-01 21:25:39 +02:00
Logan Lyu	216275bd80	lavc/aarch64: new optimization for 8-bit hevc_epel_bi_h put_hevc_epel_bi_h4_8_c: 96.0 put_hevc_epel_bi_h4_8_neon: 36.3 put_hevc_epel_bi_h6_8_c: 288.3 put_hevc_epel_bi_h6_8_neon: 59.3 put_hevc_epel_bi_h8_8_c: 358.5 put_hevc_epel_bi_h8_8_neon: 61.5 put_hevc_epel_bi_h12_8_c: 759.8 put_hevc_epel_bi_h12_8_neon: 159.5 put_hevc_epel_bi_h16_8_c: 1307.0 put_hevc_epel_bi_h16_8_neon: 182.0 put_hevc_epel_bi_h24_8_c: 2778.3 put_hevc_epel_bi_h24_8_neon: 430.5 put_hevc_epel_bi_h32_8_c: 4952.3 put_hevc_epel_bi_h32_8_neon: 679.5 put_hevc_epel_bi_h48_8_c: 11803.3 put_hevc_epel_bi_h48_8_neon: 1443.5 put_hevc_epel_bi_h64_8_c: 20654.8 put_hevc_epel_bi_h64_8_neon: 2737.0 put_hevc_qpel_bi_h4_8_c: 140.0 put_hevc_qpel_bi_h4_8_neon: 111.5 put_hevc_qpel_bi_h6_8_c: 318.0 put_hevc_qpel_bi_h6_8_neon: 85.8 put_hevc_qpel_bi_h8_8_c: 536.5 put_hevc_qpel_bi_h8_8_neon: 95.3 put_hevc_qpel_bi_h12_8_c: 1188.5 put_hevc_qpel_bi_h12_8_neon: 291.3 put_hevc_qpel_bi_h16_8_c: 2064.3 put_hevc_qpel_bi_h16_8_neon: 365.3 put_hevc_qpel_bi_h24_8_c: 4757.5 put_hevc_qpel_bi_h24_8_neon: 1010.0 put_hevc_qpel_bi_h32_8_c: 8351.8 put_hevc_qpel_bi_h32_8_neon: 2917.8 put_hevc_qpel_bi_h48_8_c: 19299.8 put_hevc_qpel_bi_h48_8_neon: 2976.8 put_hevc_qpel_bi_h64_8_c: 34182.5 put_hevc_qpel_bi_h64_8_neon: 5236.3 Co-Authored-By: J. Dekker <jdek@itanimul.li> Signed-off-by: Martin Storsjö <martin@martin.st>	2023-12-01 21:25:39 +02:00
Logan Lyu	40cf4a5ca3	lavc/aarch64: new optimization for 8-bit hevc_pel_bi_pixels put_hevc_pel_bi_pixels4_8_c: 54.7 put_hevc_pel_bi_pixels4_8_neon: 43.0 put_hevc_pel_bi_pixels6_8_c: 94.7 put_hevc_pel_bi_pixels6_8_neon: 37.0 put_hevc_pel_bi_pixels8_8_c: 171.0 put_hevc_pel_bi_pixels8_8_neon: 24.0 put_hevc_pel_bi_pixels12_8_c: 354.0 put_hevc_pel_bi_pixels12_8_neon: 68.7 put_hevc_pel_bi_pixels16_8_c: 588.2 put_hevc_pel_bi_pixels16_8_neon: 77.5 put_hevc_pel_bi_pixels24_8_c: 1670.7 put_hevc_pel_bi_pixels24_8_neon: 173.0 put_hevc_pel_bi_pixels32_8_c: 2267.7 put_hevc_pel_bi_pixels32_8_neon: 281.2 put_hevc_pel_bi_pixels48_8_c: 5787.5 put_hevc_pel_bi_pixels48_8_neon: 673.5 put_hevc_pel_bi_pixels64_8_c: 9897.0 put_hevc_pel_bi_pixels64_8_neon: 1159.5 Co-Authored-By: J. Dekker <jdek@itanimul.li> Signed-off-by: Martin Storsjö <martin@martin.st>	2023-12-01 21:25:39 +02:00
sunyuechi	d0ec826077	checkasm/ac3dsp: add float_to_fixed24 test Signed-off-by: Rémi Denis-Courmont <remi@remlab.net>	2023-12-01 20:26:48 +02:00
James Almer	6d19611251	avcodec/ac3dsp: add missing stddef.h include Should fix make checkheaders Signed-off-by: James Almer <jamrial@gmail.com>	2023-12-01 12:42:22 -03:00
Paul B Mahol	a30adf9f96	avfilter/framesync: fix OOM case Fixes OOM when caller keeps adding frames into filtergraph that reached EOF by other means, for example EOF is signalled by other filter in filtergraph or by buffersink.	2023-11-30 11:08:34 +01:00
Paul B Mahol	47e214245b	avfilter/arls_template: use defines for all constants	2023-11-28 16:09:12 +01:00
Paul B Mahol	f66536cc58	avfilter: add Affine Projection adaptive audio filter	2023-11-28 15:40:34 +01:00
xufuji456	cc86343b96	lavc/hevcdsp_qpel_neon: using movi.16b instead of movi.2d Building iOS platform with arm64, the compiler has a warning: "instruction movi.2d with immediate #0 may not function correctly on this CPU, converting to movi.16b" Signed-off-by: xufuji456 <839789740@qq.com> Signed-off-by: Martin Storsjö <martin@martin.st>	2023-11-28 15:54:49 +02:00
Paul B Mahol	67ce690bc6	avfilter/af_anlms: set output frame duration	2023-11-28 13:17:19 +01:00
Paul B Mahol	411c516453	avfilter/af_arls: set output frame duration	2023-11-28 13:17:13 +01:00
Paul B Mahol	bafbb0697e	avfilter/af_amix: set output frame duration	2023-11-28 13:17:13 +01:00
Paul B Mahol	358aced447	avfilter/af_amultiply: set output frame duration	2023-11-28 13:17:12 +01:00
Paul B Mahol	8b9c400f1d	avfilter/af_amerge: use already provided outlink	2023-11-28 13:17:12 +01:00
Paul B Mahol	c979ccdfd7	avfilter: no need to request more samples if internal frame is available	2023-11-28 13:16:18 +01:00
Anton Khirnov	66a02a8508	tools/general_assembly: add newly voted-in extra GA members Cf. * https://vote.ffmpeg.org/cgi-bin/civs/results.pl?id=E_d0b225b9aa8d45d5 * http://lists.ffmpeg.org/pipermail/ffmpeg-devel/2023-November/317496.html Message-Id <170115613784.8914.4950266152609138336@lain.khirnov.net>	2023-11-28 09:07:23 +01:00
Paul B Mahol	3bca828d39	avfilter/af_arls: add double sample format support	2023-11-27 20:27:27 +01:00
Paul B Mahol	42e45ea8ff	avfilter/af_anlms: add double sample format support	2023-11-27 20:27:25 +01:00
sunyuechi	ea6817d2a7	checkasm: test for dcmul_add Signed-off-by: Rémi Denis-Courmont <remi@remlab.net>	2023-11-27 17:55:24 +02:00
Zhao Zhili	d526a34c20	avcodec/videotoolboxenc: refactor dump encoder name Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>	2023-11-27 23:49:01 +08:00
Zhao Zhili	cb049d377f	avcodec/videotoolboxenc: Fix build failure due to PropertyKey_EncoderID Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>	2023-11-27 23:48:55 +08:00
Leo Izen	36980179a0	fftools/ffplay_renderer: declare function argument as const Declaring the function argument as const fixes a warning down the line that the const parameter is stripped. We don't modify this argument. Signed-off-by: Leo Izen <leo.izen@gmail.com> Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>	2023-11-27 23:39:48 +08:00
Paul B Mahol	5f87a68cf7	avfilter/vf_colorcorrect: fix memory leaks	2023-11-27 12:10:26 +01:00
Paul B Mahol	f1f973313b	avfilter/af_dialoguenhance: do output scaling once	2023-11-27 11:56:27 +01:00
Paul B Mahol	b1942734c7	avfilter/af_afwtdn: fix crash with EOF handling	2023-11-27 11:56:26 +01:00
Paul B Mahol	4671fb7dfb	avfilter/af_dialoguenhance: simplify channels copy	2023-11-27 11:56:23 +01:00
Gyan Doshi	0ea9e26636	doc/filters: restore entry for libvmaf option pool `3d29724c00` removed the doc entry for the option pool while adding a parser function for it at the same time! The option remains available and undeprecated. Fixes trac #10693	2023-11-27 15:37:01 +05:30
Paul B Mahol	44e9cccffa	avformat: add QOA demuxer	2023-11-26 17:49:11 +01:00
Paul B Mahol	3609d2b783	avcodec: add QOA decoder	2023-11-26 17:49:09 +01:00
Geoffrey McRae	93b5d9030b	libavcodec/mlpdec: add missing correction to ch_layout when downmixing This fixes corrupted audio for applications relying on ch_layout when codec downmixing is active. Signed-off-by: Geoffrey McRae <geoff@hostfission.com> Signed-off-by: James Almer <jamrial@gmail.com>	2023-11-26 10:18:33 -03:00
Geoffrey McRae	a8677bcc8f	libavcodec/dcadec: adjust the `ch_layout` when downmix is active Applications making use of this codec with the `downmix` option are segfaulting unless the `ch_layout` is overridden after `avcodec_open2` as can be seen in projects like MythTV[1] This patch fixes this by overriding the ch_layout as done in other decoders such as AC3. 1: `af6f362a14/mythtv/libs/libmythtv/decoders/avformatdecoder.cpp (L4607)` Signed-off-by: Geoffrey McRae <geoff@hostfission.com> Signed-off-by: James Almer <jamrial@gmail.com>	2023-11-26 10:18:33 -03:00
Wenbin Chen	47b2328076	libavfilter/vf_dnn_detect: Add yolo support Add yolo support. The Yolo model doesn't output the final result. It outputs candidate boxes, so we need post-processing to remove overlapping boxes to get the final results. Also, the boxes' coordinates are relative to cells and anchors, so we need this information to calculate the boxes as well. For model details, please refer to: https://github.com/openvinotoolkit/open_model_zoo/tree/master/models/public/yolo-v2-tf Signed-off-by: Wenbin Chen <wenbin.chen@intel.com> Reviewed-by: Guo Yejun <yejun.guo@intel.com>	2023-11-26 20:38:36 +08:00
Wenbin Chen	caa5d123a7	libavfilter/vf_dnn_detect: Add model_type option. There are many kinds of detection DNN model and they have different preprocess and postprocess methods. To support more models, "model_type" option is added to help to choose preprocess and postprocess function. Signed-off-by: Wenbin Chen <wenbin.chen@intel.com> Reviewed-by: Guo Yejun <yejun.guo@intel.com>	2023-11-26 20:15:55 +08:00
Anton Khirnov	2020ef9770	tools/general_assembly: restore printing HEAD	2023-11-26 10:12:01 +01:00
Anton Khirnov	56a8b34b64	tools/general_assembly: implement extra GA members	2023-11-26 10:12:01 +01:00
Paul B Mahol	e7111ba44a	avfilter/vsrc_gradients: allow zero speed	2023-11-26 02:07:45 +01:00
Paul B Mahol	f1acb0d843	avfilter/vsrc_gradients: add square type	2023-11-26 02:07:44 +01:00
James Almer	72390dea00	mips/ac3dsp_mips: add missing stddef.h header include Fixes compilation failures after `567c67c6c8`. Reviewed-by: Michael Niedermayer <michael@niedermayer.cc> Signed-off-by: James Almer <jamrial@gmail.com>	2023-11-25 21:51:04 -03:00
James Almer	e40ea9f34b	x86/ac3dsp: add ff_float_to_fixed24_avx() Signed-off-by: James Almer <jamrial@gmail.com>	2023-11-25 21:50:56 -03:00
James Almer	d8b1a34433	x86/ac3dsp: reduce instruction count inside the float_to_fixed24 loop Signed-off-by: James Almer <jamrial@gmail.com>	2023-11-25 21:50:56 -03:00
Paul B Mahol	2d9ed64859	avfilter/af_dialoguenhance: fix overreads	2023-11-25 13:05:31 +01:00
Paul B Mahol	37c5bcc4e8	avfilter/af_channelmap: do not override set channel layout	2023-11-25 13:05:31 +01:00
Zhao Zhili	bbdedd9663	Revert "avformat/rtmpproto: Pass rw_timeout to underlying transport protocol" This reverts commit `bec6dfcd5c`. The patch is a NOP since ffurl_open_whitelist copies options from the parent automatically. Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>	2023-11-25 10:56:00 +08:00
Rémi Denis-Courmont	7212466e73	checkasm/riscv: report an error upon SIGILL Terminating the whole checkasm process is not very helpful. This will report if an illegal instruction occurs while executing a tested function. This is a common occurrence whilst developing RISC-V assembler, due to the compatibility between vector configuration and instruction done at run-time.	2023-11-23 19:04:07 +02:00
Rémi Denis-Courmont	286d674221	checkasm: add helper to report a fatal signal	2023-11-23 18:57:18 +02:00
Rémi Denis-Courmont	0fa421c8f1	lavc/llvidencdsp: add R-V V diff_bytes diff_bytes_c: 163.0 diff_bytes_rvv_i32: 52.7	2023-11-23 18:57:18 +02:00
Rémi Denis-Courmont	0183c2c830	lavc/aacpsdsp: use LMUL=2 and amortise strides The input is laid out in 16 segments, of which 13 actually need to be loaded. There are no really efficient ways to deal with this: 1) If we load 8 segments with unit stride, then narrow to 16 segments with right shifts, we can only get one half-size vector per segment, or just 2 elements per vector (EMUL=1/2) - at least with 128-bit vectors. This ends up unsurprisingly about as fast as the C code. 2) The current approach is to load with strides. We keep that approach, but improve it using three 4-segmented loads instead of 12 single-segment loads. This divides the number of distinct loaded addresses by 4. 3) A potential third approach would be to avoid segmentation altogether and splat the scalar coefficient into vectors. Then we can use a unit-stride and maximum EMUL. But the downside then is that we have to multiply the 3 (of 16) unused segments with zero as part of the multiply-accumulate operations. In addition, we also reuse vectors mid-loop so as to increase the EMUL from 1 to 2, which also improves performance a little bit. Overall the gains are quite small with the device under test, as it does not deal with segmented loads very well. But at least the code is tidier, and should enjoy bigger speed-ups on better hardware implementations. Before: ps_hybrid_analysis_c: 1819.2 ps_hybrid_analysis_rvv_f32: 1037.0 (before) ps_hybrid_analysis_rvv_f32: 990.0 (after)	2023-11-23 18:57:18 +02:00
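
Several of the commit messages above list benchmark timings as pairs of a C reference (`_c`) and an optimized variant (`_neon`, `_i8mm`). The sketch below shows one way to turn such a message into speed-up ratios. It is an illustration only, not part of FFmpeg or checkasm: the names `MESSAGE`, `PAIR`, and `speedups` are hypothetical, and the sample text is the first three size pairs copied from the message of commit fa0470347e.

```python
import re

# Timings copied from the message of commit fa0470347e (first three sizes only).
MESSAGE = (
    "put_hevc_qpel_bi_hv4_8_c: 433.7 put_hevc_qpel_bi_hv4_8_i8mm: 117.9 "
    "put_hevc_qpel_bi_hv6_8_c: 803.9 put_hevc_qpel_bi_hv6_8_i8mm: 252.7 "
    "put_hevc_qpel_bi_hv8_8_c: 1296.4 put_hevc_qpel_bi_hv8_8_i8mm: 316.2"
)

# Hypothetical pattern: "<function>_<variant>: <timing>", where the variant
# suffix is one of the three that appear in the aarch64 commits above.
PAIR = re.compile(r"(\w+)_(c|neon|i8mm): ([0-9.]+)")


def speedups(message):
    """Map each function name to (variant, C timing / optimized timing)."""
    c_times = {}
    optimized = {}
    for name, variant, value in PAIR.findall(message):
        if variant == "c":
            c_times[name] = float(value)
        else:
            optimized[name] = (variant, float(value))
    # Only report functions that have both a C and an optimized timing.
    return {
        name: (variant, c_times[name] / timing)
        for name, (variant, timing) in optimized.items()
        if name in c_times
    }


if __name__ == "__main__":
    for name, (variant, ratio) in speedups(MESSAGE).items():
        print(f"{name}: {variant} is {ratio:.2f}x faster than C")
```

A ratio above 1 means the optimized variant took less time than the C reference in that run. The figures are only as meaningful as the measurements quoted in the commit message.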

113035 Commits