FFmpeg

mirror of https://github.com/FFmpeg/FFmpeg.git synced 2025-11-23 21:54:53 +02:00

Author	SHA1	Message	Date
Michael Yang	26dee5b43e	libavfilter/vf_nlmeans_vulkan: reverse img_bar	2025-10-16 21:32:43 +00:00
Michael Yang	71ff349cc1	libavfilter/vf_nlmeans_vulkan: lower strength min Lower (per-component) strength minimum from 1.0 to 0.0, with 0.0 skipping integral and weights calculations.	2025-10-16 21:32:43 +00:00
Michael Yang	2e12b3251d	libavfilter/vf_nlmeans_vulkan: clean up naming Add `nb_components` to push data. Rename `ws_total_` to `ws_`.	2025-10-16 21:32:43 +00:00
Michael Yang	3fac2d8593	avfilter/vf_nlmeans_vulkan: rewrite filter This is a major rewrite of the existing nlmeans vulkan code, with bug fixes and major performance improvement. Fix visual artifacts found in ticket #10661, #10733. Add OOB checks for image loading and patch sized area around the border. Correct chroma plane height, strength and buffer barrier index. Improve parallelism with component workgroup axis and more but smaller workgroups. Split weights pass into vertical/horizontal (integral) and weights passes. Remove h/v order logic to always calculate sum on vertical pass. Remove atomic float requirement, which causes high memory locking contentions, at the cost of higher memory usage of w/s buffer. Use cache blocking in h pass to reduce memory bandwidth usage.	2025-10-16 21:32:43 +00:00
Martin Storsjö	36896af64a	movenc: Make the hybrid_fragmented mode more robust Write the moov tag at the end first, before overwriting the mdat size at the start of the file. In case writing the final moov box fails (e.g. due to being out of disk), we haven't broken the initial moov box yet. Thus if writing stops between these steps, we could end up with a file with two moov boxes - which arguably is more feasible to recover from, than from a file with no moov boxes at all.	2025-10-16 18:58:54 +00:00
Niklas Haas	a45d30a675	avutil/hwcontext_vulkan: always enable baseline usage flags The documentation states that this field is for enabling "extra" usage flags. This conflicts with the implementation, and the rest of the comment, though. In resolving this ambiguity, I think it's better to lean towards the first sentence and treat this field purely as specifying extra usage flags to enable. Otherwise, this may break vulkan encoding or subsequent hwdownload if the upstream filter did not specifically advertise this. Change the default behavior and update the documentation slightly to more clearly document the semantics.	2025-10-16 17:40:25 +00:00
Andreas Rheinhardt	b1f2eea1cd	avfilter/vf_noise: Deduplicate option flags Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2025-10-16 19:10:51 +02:00
Andreas Rheinhardt	3ba570de8b	avfilter/x86/vf_noise: Port line_noise funcs to SSE2 This avoids having to fix up ABI violations via emms_c and also leads to a 73% speedup for the line noise average version here. Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2025-10-16 19:09:45 +02:00
Andreas Rheinhardt	adfec0f52e	avfilter/x86/vf_noise: Make line_noise_avg_mmx() match C function Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2025-10-16 18:41:19 +02:00
Andreas Rheinhardt	214b52df43	avfilter/vf_noise: Avoid cast Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2025-10-16 18:41:19 +02:00
Andreas Rheinhardt	ece623b1b3	avfilter/vf_noise: Fix race with very tall images When using averaged noise with height > MAX_RES (i.e. 4096), multiple threads would access the same prev_shift slot, leading to races. Fix this by disabling slice threading in such scenarios. Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2025-10-16 18:41:19 +02:00
Andreas Rheinhardt	6a53a4e341	avfilter/vf_noise: Don't write beyond end-of-array This is not only UB, but also leads to races and nondeterministic output, because the write one past the end of the buffer actually conflicts with accesses by the thread that actually owns it. Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2025-10-16 18:41:18 +02:00
Andreas Rheinhardt	94948bd6b9	avfilter/vf_noise: Make private context smaller "all" only exists to set options; it does not need the big arrays contained in FilterParams. Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2025-10-16 18:41:18 +02:00
Zhao Zhili	cd4b01707d	Revert "avformat/movenc: sidx earliest_presentation_time is applied after editlist" This reverts commit `301141b576`. cluster[0].dts, pts and frag_info[0].time are already in presentation timeline, so they shouldn't be shifted by start_pts. Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>	2025-10-16 11:22:37 +08:00
Zhao Zhili	0de3b1f358	avformat/mov: don't shift sidx_pts sidx_pts is already in presentation time, so it shouldn't be shifted by sc->time_offset again. Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>	2025-10-16 11:22:37 +08:00
James Almer	2e1d702cfc	avformat/dump: fix log level passed to av_log when printing stream group side data Signed-off-by: James Almer <jamrial@gmail.com>	2025-10-15 17:49:11 -03:00
Andreas Rheinhardt	74a3c1ddb6	avfilter/x86/vf_pullup: Port pullup functions to SSE2, SSSE3 The diff and var functions benefit from psadbw, comb from wider registers which allows to avoid reloading values, reducing the number of loads from 48 to 10. Performance increased by 117% (the loop in compute_metric() has been timed); codesize decreased by 144B. Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2025-10-15 19:43:37 +02:00
Andreas Rheinhardt	dcb28ed860	avfilter/x86/vf_spp: Port store_slice to SSE2 This allows to remove an emms_c from the filter. It also gives 25% speedup here (when timing the calls to store_slice using START/STOP_TIMER). Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2025-10-15 19:43:37 +02:00
Andreas Rheinhardt	f4a87d8ca4	avcodec/x86/mpegvideoencdsp_init: Use xmm registers in SSSE3 functions Improves performance and no longer breaks the ABI (by forgetting to call emms). Old benchmarks: add_8x8basis_c: 43.6 ( 1.00x) add_8x8basis_ssse3: 12.3 ( 3.55x) New benchmarks: add_8x8basis_c: 43.0 ( 1.00x) add_8x8basis_ssse3: 6.3 ( 6.79x) Notice that the output of try_8x8basis_ssse3 changes a bit: Before this commit, it computes certain values and adds the values for i,i+1,i+4 and i+5 before right shifting them; now it adds the values for i,i+1,i+8,i+9. The second pair in these lists could be avoided (by shifting xmm0 and xmm1 before adding both together instead of only shifting xmm0 after adding them), but the former i,i+1 is inherent in using pmaddwd. This is the reason that this function is not bitexact. Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2025-10-15 08:55:13 +02:00
Andreas Rheinhardt	cffd029e98	avcodec/x86/mpegvideoencdsp_init: Don't use slow path unnecessarily The only requirement of this code (and essentially the pmulhrsw instruction) is that the scaled scale fits into an int16_t. Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2025-10-15 08:55:13 +02:00
Andreas Rheinhardt	ce499ebf96	tests/checkasm/mpegvideoencdsp: Add test for add_8x8basis Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2025-10-15 08:55:13 +02:00
Michael Niedermayer	566e9032b1	swscale/output: Fix unsigned cast position in yuv2* Fixes: signed overflow Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	2025-10-14 20:55:54 +02:00
Michael Niedermayer	0c6b7f9483	swscale/output: Fix integer overflow in yuv2ya16_X_c_template() Found-by: colod colod <colodcolod7@gmail.com> Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	2025-10-14 20:55:53 +02:00
Zhao Zhili	6b961f5963	avformat/mov: fix missing video size when some decoders are disabled Fix #20667 Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>	2025-10-14 20:05:55 +08:00
Andreas Rheinhardt	a24e0f536d	avcodec/x86/hpeldsp_init: Remove check for inline mmx Forgotten in `4c55724da8`. Reviewed-by: Michael Niedermayer <michael@niedermayer.cc> Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2025-10-14 12:31:15 +02:00
Frank Plowman	b0c77e5a12	lavc/vvc: Store RefStruct references to referenced PSs/headers in slice This loosens the coupling between CBS and the decoder by no longer using CodedBitstreamH266Context (containing the most recently parsed PSs & PH) to retrieve the PSs & PH in the decoder. Doing so is beneficial in two ways: 1. It improves robustness to the case in which an AVPacket doesn't contain precisely one PU. 2. It allows the decoder parameter set manager to properly handle the case in which a single PU (erroneously) contains conflicting parameter sets. Signed-off-by: Frank Plowman <post@frankplowman.com>	2025-10-13 19:05:36 +01:00
Andreas Rheinhardt	31f0749cd4	avcodec/vp3: Optimize alignment check away when possible Check only on arches that need said check. (Btw: I do not see how h_loop_filter benefits from alignment at all and why h_loop_filter_unaligned exists.) Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2025-10-13 18:59:49 +02:00
Andreas Rheinhardt	5823ab347a	avcodec/vp3dsp: Remove unused flags parameter from ff_vp3dsp_init() No longer necessary now that the x86 loop filter functions are bitexact. Reviewed-by: Sean McGovern <gseanmcg@gmail.com> Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2025-10-13 18:59:24 +02:00
Andreas Rheinhardt	e3ca57ae8f	avcodec/x86/vp3dsp: Port loop filters to SSE2 The old code operated on bytes and did lots of tricks due to their limited range; it did not completely succeed, which is why the old versions were not used when bitexact output was requested. In contrast, the new version is much simpler: It operates on signed 16 bit words whose range is more than sufficient. This means that these functions don't need a check for bitexactness (and can be used in FATE). Old benchmarks (for this, the AV_CODEC_FLAG_BITEXACT check has been removed from checkasm): h_loop_filter_c: 29.8 ( 1.00x) h_loop_filter_mmxext: 32.2 ( 0.93x) h_loop_filter_unaligned_c: 29.9 ( 1.00x) h_loop_filter_unaligned_mmxext: 31.4 ( 0.95x) v_loop_filter_c: 39.3 ( 1.00x) v_loop_filter_mmxext: 14.2 ( 2.78x) v_loop_filter_unaligned_c: 38.9 ( 1.00x) v_loop_filter_unaligned_mmxext: 14.3 ( 2.72x) New benchmarks: h_loop_filter_c: 29.2 ( 1.00x) h_loop_filter_sse2: 28.6 ( 1.02x) h_loop_filter_unaligned_c: 29.0 ( 1.00x) h_loop_filter_unaligned_sse2: 26.9 ( 1.08x) v_loop_filter_c: 38.3 ( 1.00x) v_loop_filter_sse2: 11.0 ( 3.47x) v_loop_filter_unaligned_c: 35.5 ( 1.00x) v_loop_filter_unaligned_sse2: 11.2 ( 3.18x) Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2025-10-13 18:58:50 +02:00
Andreas Rheinhardt	5d9a392bce	tests/checkasm: Add VP3 loop filter test Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2025-10-13 18:58:50 +02:00
zhanghongyuan	0bc54cddb1	fftools/opt_common: add long-form license option Add "license" as a long-form command line option alongside the existing "L" short option for showing license information. This maintains consistent option naming patterns with other commands that provide both short and long forms (help/?/help, etc.) and improves command line usability by providing more descriptive option names.	2025-10-12 03:26:21 +00:00
Tong Wu	10e9672a8c	avcodec/d3d12va_encode: use macros to set QP range and max frame size Signed-off-by: Tong Wu <wutong1208@outlook.com>	2025-10-12 01:50:57 +00:00
Andreas Rheinhardt	36f92206bb	avcodec/x86/hpeldsp: Improve ff_{avg,put}_pixels8_xy2_ssse3() This SSSE3 function uses MMX registers (of course without emms at the end) and processes eight bytes of input by unpacking it into two MMX registers. This is very suboptimal given that one can just use XMM registers to process eight words. This commit switches them to using XMM registers. Old benchmarks: avg_pixels_tab[1][3]_c: 114.5 ( 1.00x) avg_pixels_tab[1][3]_ssse3: 43.6 ( 2.62x) put_pixels_tab[1][3]_c: 83.6 ( 1.00x) put_pixels_tab[1][3]_ssse3: 34.0 ( 2.46x) New benchmarks: avg_pixels_tab[1][3]_c: 115.3 ( 1.00x) avg_pixels_tab[1][3]_ssse3: 24.6 ( 4.69x) put_pixels_tab[1][3]_c: 83.8 ( 1.00x) put_pixels_tab[1][3]_ssse3: 19.7 ( 4.24x) Reviewed-by: Kieran Kunhya <kieran@kunhya.com> Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2025-10-12 02:45:37 +02:00
Andreas Rheinhardt	4c55724da8	avcodec/x86/hpeldsp: Add ff_put_no_rnd_pixels8_xy2_ssse3() Given that one has to deal with 16 byte intermediates it is unsurprising that SSE2 wins against MMX; the MMX version has therefore been removed (as well as the now unused inline_asm.h). The new function is even 32B smaller than the old MMX one. Old benchmarks: put_no_rnd_pixels_tab[1][3]_c: 84.1 ( 1.00x) put_no_rnd_pixels_tab[1][3]_mmx: 41.1 ( 2.05x) New benchmarks: put_no_rnd_pixels_tab[1][3]_c: 84.0 ( 1.00x) put_no_rnd_pixels_tab[1][3]_ssse3: 22.1 ( 3.80x) Reviewed-by: Kieran Kunhya <kieran@kunhya.com> Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2025-10-12 02:45:25 +02:00
Andreas Rheinhardt	f84e06026a	avcodec/x86/hpeldsp: Add SSE2 of {avg,put} no_rnd xy2 with blocksize 16 Also remove the now superseded MMX versions (the new functions have the exact same codesize as the removed ones). Old benchmarks: avg_no_rnd_pixels_tab[0][3]_c: 233.7 ( 1.00x) avg_no_rnd_pixels_tab[0][3]_mmx: 121.5 ( 1.92x) put_no_rnd_pixels_tab[0][3]_c: 171.4 ( 1.00x) put_no_rnd_pixels_tab[0][3]_mmx: 82.6 ( 2.08x) New benchmarks: avg_no_rnd_pixels_tab[0][3]_c: 233.3 ( 1.00x) avg_no_rnd_pixels_tab[0][3]_sse2: 45.0 ( 5.18x) put_no_rnd_pixels_tab[0][3]_c: 172.1 ( 1.00x) put_no_rnd_pixels_tab[0][3]_sse2: 40.9 ( 4.21x) Reviewed-by: Kieran Kunhya <kieran@kunhya.com> Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2025-10-12 02:43:29 +02:00
Andreas Rheinhardt	ce9d181444	avcodec/mjpegdec: Remove unnecessary reloads Hint: The parts of this patch in decode_block_progressive() and decode_block_refinement() rely on the fact that GET_VLC returns -1 on error, so that it enters the codepaths for actually coded block coefficients. Reviewed-by: Ramiro Polla <ramiro.polla@gmail.com> Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2025-10-11 08:20:42 +02:00
Andreas Rheinhardt	dad06a445f	avcodec/Makefile: Remove h263 decoder->mpeg4videodec.o dependency Also prefer using #if CONFIG_MPEG4_DECODER checks in order not to rely on DCE. Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2025-10-11 07:51:01 +02:00
Andreas Rheinhardt	10d3479da0	avcodec/h263dec: Avoid redundant branch Only the MPEG-4 decoder can have partitioned frames here. Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2025-10-11 07:51:01 +02:00
Andreas Rheinhardt	d96f8d32ad	avcodec/x86/h264_qpel: Don't instantiate unused functions The v_lowpass wrappers (which are instantiated by this macro) are only used in the put (and not the avg) form for SSSE3 (the avg form is only used for mc02, which doesn't exist for SSSE3). Clang warns about the unused functions. Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2025-10-10 16:27:57 +02:00
Niklas Haas	6f1ab828d3	libavfilter/vf_libplacebo: add `temperature` option	2025-10-09 20:45:09 +00:00
Leo Izen	eab3b68237	avcodec/exif: avoid printing errors for makernote non-IFD parsing When we parse a MakerNote, we first try to parse it as an IFD and if that fails, we try to re-parse it as a binary blob. This is because MakerNote is not well-documented in its nature. However, if we fail to parse it the first time, we should not av_log error messages about the parse failure, so instead we log these as AV_LOG_DEBUG. Signed-off-by: Leo Izen <leo.izen@gmail.com> Reported-by: Ramiro Polla <ramiro.polla@gmail.com>	2025-10-09 12:40:41 -04:00
James Almer	41c168444e	avcodec/hevc/sei: don't attempt to use stale values in HEVCSEITimeCode Invalidate the whole struct on SEI reset. Signed-off-by: James Almer <jamrial@gmail.com>	2025-10-09 12:09:35 -03:00
James Almer	8e01bff774	avcodec/hevc/sei: don't attempt to use stale values in HEVCSEITDRDI Invalidate the whole struct on SEI reset. Signed-off-by: James Almer <jamrial@gmail.com>	2025-10-09 12:09:35 -03:00
James Almer	d448d6d1a0	avcodec/hevc/sei: prevent storing a potentially bogus num_ref_displays value in HEVCSEITDRDI Fixes: 439711052/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_HEVC_fuzzer-4956250308935680 Fixes: out of array access Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg Signed-off-by: James Almer <jamrial@gmail.com>	2025-10-09 12:09:35 -03:00
Jack Lau	a934d48440	doc/muxers: correct default pkt_size value of whip Signed-off-by: Jack Lau <jacklau1222@qq.com>	2025-10-09 14:33:02 +00:00
Jack Lau	b43f8dec18	avformat/whip: add macros to replace magic number Signed-off-by: Jack Lau <jacklau1222@qq.com>	2025-10-09 14:32:03 +00:00
Jack Lau	bc6164eb6f	avformat/whip: remove WHIP_STATE_DTLS_CONNECTING This value is only useful when the DTLS handshake is in NONBLOCK mode; the DTLS handshake just needs to call ffurl_handshake once since it forces blocking mode. Signed-off-by: Jack Lau <jacklau1222@qq.com>	2025-10-09 14:32:03 +00:00
Jack Lau	76b13ca0a6	avformat/whip: check whether the peer is ice lite See RFC 5245 Section 4.3: If an agent is a lite implementation, it MUST include an "a=ice-lite" session-level attribute in its SDP. If an agent is a full implementation, it MUST NOT include this attribute. Signed-off-by: Jack Lau <jacklau1222@qq.com>	2025-10-09 14:32:03 +00:00
Jack Lau	ec0a04de0d	avformat/whip: remind user to increase -buffer_size The UDP buffer may be small enough to fill up temporarily and return WSAEWOULDBLOCK. The UDP code handles this Windows error code and converts it to AVERROR(EAGAIN). This issue can only be reproduced on Windows. Sleeping for an interval and retrying the send on EAGAIN would increase latency, and an appropriate interval is hard to define. So this patch just reminds the user to increase the buffer size via -buffer_size to avoid this issue. Signed-off-by: Jack Lau <jacklau1222@qq.com>	2025-10-09 09:55:18 +00:00
Jack Lau	b3793d9941	avformat/whip: pass through buffer_size option to udp Signed-off-by: Jack Lau <jacklau1222@qq.com>	2025-10-09 09:55:18 +00:00
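
The vf_noise race fix in the log above (ece623b1b3) disables slice threading when averaged noise is used on images taller than MAX_RES. A minimal sketch of that decision follows. The helper name `noise_nb_jobs`, its parameters, and the modulo-indexed slot layout described in the comment are assumptions made for illustration; this is not the code from the commit.

```c
#include <assert.h>

#define MAX_RES 4096 /* limit named in the commit message */

/* Hypothetical helper, not the actual vf_noise code: choose how many
 * slice jobs to run for one plane. */
static int noise_nb_jobs(int height, int averaged, int nb_threads)
{
    assert(height > 0 && nb_threads > 0);

    /* Assumption: averaged noise keeps one state slot per row, indexed
     * modulo MAX_RES, so rows y and y + MAX_RES would share a slot.
     * With more than MAX_RES rows, two slices could touch the same
     * slot, so fall back to a single job. */
    if (averaged && height > MAX_RES)
        return 1;

    /* Otherwise never use more jobs than there are rows. */
    return height < nb_threads ? height : nb_threads;
}
```

Under these assumptions, an 8192-row plane with averaged noise runs as one job, while the same plane without averaging can still be split across all threads.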

121456 Commits
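
The swscale commit in the log above (566e9032b1) fixes a signed overflow by moving an unsigned cast. The general idea can be sketched as below: casting an operand before the arithmetic makes the multiply and add wrap modulo 2^32 instead of overflowing a signed int, which is undefined behaviour in C. The function `accumulate_wrapping` is hypothetical and is not taken from swscale.

```c
#include <assert.h>
#include <limits.h>

/* The wraparound described below assumes a 32-bit unsigned type. */
static_assert(sizeof(unsigned) * CHAR_BIT == 32, "sketch assumes 32-bit unsigned");

/* Hypothetical helper, not the swscale code: add sample * coeff to acc. */
static int accumulate_wrapping(int acc, int sample, int coeff)
{
    /* Casting before the multiply makes the whole expression unsigned,
     * so it wraps modulo 2^32. Writing (unsigned)(sample * coeff)
     * instead would still perform the multiply in signed int and
     * overflow first. */
    unsigned u = (unsigned)acc + (unsigned)sample * coeff;

    /* Converting an out-of-range value back to int is
     * implementation-defined; common compilers use two's complement. */
    return (int)u;
}
```

For small inputs the result matches ordinary signed arithmetic; only products that exceed the int range behave differently, and they now wrap in a defined way.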