mirror of https://github.com/FFmpeg/FFmpeg.git synced 2025-11-23 21:54:53 +02:00

121735 Commits

Author SHA1 Message Date
Baptiste Coudurier
06b04da4dc lavf/mxfenc: require pixel format to be set for video streams 2025-10-24 00:16:21 +00:00
Baptiste Coudurier
0b0cb7cd6c lavf/movenc: improve AVdh atom generation for DNxHD/DNxHR 2025-10-24 00:16:21 +00:00
cenzhanquan1
1120b3db30 avcodec/liblc3enc: support packed float (AV_SAMPLE_FMT_FLT) input.
Previously, the LC3 encoder only accepted planar float (AV_SAMPLE_FMT_FLTP).
This change extends support to packed float (AV_SAMPLE_FMT_FLT) by properly
handling channel layout and sample stride.

The pcm data pointer and stride are now calculated based on the sample
format: for planar, use frame->data[ch]; for packed, use frame->data[0]
with channel offset. The stride is set to 1 for planar and number of
channels for packed layout.

This enables encoding from common packed audio sources without requiring
a prior planar conversion, improving usability and efficiency.

Signed-off-by: cenzhanquan1 <cenzhanquan1@xiaomi.com>
2025-10-23 14:42:50 +00:00
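The pointer and stride rule described in this message can be sketched as follows. This is a minimal illustration, not the encoder's actual code: `PcmView` and `pcm_view` are hypothetical names, and the `planar` flag stands in for a check of the frame's sample format.

```c
/* Hypothetical helper: locate the first sample of channel `ch` and
 * report the step (in floats) between consecutive samples of it. */
typedef struct PcmView {
    const float *samples; /* first sample of the channel */
    int stride;           /* distance between samples, in floats */
} PcmView;

static PcmView pcm_view(float *const data[], int ch, int nb_channels, int planar)
{
    PcmView v;
    if (planar) {
        v.samples = data[ch];     /* one buffer per channel */
        v.stride  = 1;
    } else {
        v.samples = data[0] + ch; /* interleaved: offset into the shared buffer */
        v.stride  = nb_channels;
    }
    return v;
}
```

With this view, sample `i` of a channel is `v.samples[i * v.stride]` in both layouts, which is why no prior planar conversion is needed.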
cenzhanquan1
0eb572f080 avcodec/liblc3dec: support sample format negotiation and planar layout.
1. Adds support for respecting the requested sample format. Previously,
   the decoder always used AV_SAMPLE_FMT_FLTP. Now it checks if the
   caller requested a specific format via avctx->request_sample_fmt and
   honors that request when supported.

2. Improves planar/interleaved audio buffer handling. The decoding
   logic now properly handles both planar and interleaved sample
   formats by calculating the correct stride and buffer pointers based
   on the actual sample format.

The changes include:
- Added format mapping between AVSampleFormat and lc3_pcm_format
- Implemented format selection logic in initialization.
- Updated buffer pointer calculation for planar/interleaved data.
- Maintained backward compatibility with existing behavior.

Signed-off-by: cenzhanquan1 <cenzhanquan1@xiaomi.com>
2025-10-23 22:06:04 +08:00
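The negotiation described in item 1 might look roughly like the sketch below. The enum and the set of supported formats are assumptions made for illustration; the real decoder works with `AVSampleFormat` and `avctx->request_sample_fmt`, and its supported set may differ.

```c
/* Hypothetical stand-in for AVSampleFormat; values are illustrative only. */
enum SampleFmt { FMT_NONE = -1, FMT_S16, FMT_FLT, FMT_FLTP };

/* Honor the caller's request when it is supported (assumed here to be
 * packed or planar float), otherwise keep the previous default. */
static enum SampleFmt pick_format(enum SampleFmt requested)
{
    switch (requested) {
    case FMT_FLT:
    case FMT_FLTP:
        return requested;
    default:
        return FMT_FLTP; /* backward-compatible default */
    }
}
```

Falling back to planar float for unknown or unset requests is what keeps the existing behavior intact for callers that never set a preference.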
Jack Lau
cb5e201f5c avformat/whip: cleanup the redundant variable
Signed-off-by: Jack Lau <jacklau1222@qq.com>
2025-10-23 10:58:45 +00:00
Jack Lau
0e8cff52bc avformat/whip: add DTLS active role support
Add dtls_active flag to specify the dtls role.

Properly set the send key and recv key depending on the DTLS role:

As DTLS server, the recv key is client master key plus salt,
the send key is server master key plus salt.
As DTLS client, the recv key is server master key plus salt,
the send key is client master key plus salt.

Signed-off-by: Jack Lau <jacklau1222@qq.com>
2025-10-23 10:58:45 +00:00
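The key assignment above reduces to a swap that depends on the role. A minimal sketch, assuming the key material has already been derived; `KeyMaterial`, `SrtpKeys` and `select_srtp_keys` are hypothetical names, not the muxer's API:

```c
/* Hypothetical types for illustration; not the WHIP muxer's code. */
typedef struct KeyMaterial {
    const unsigned char *client; /* client master key plus salt */
    const unsigned char *server; /* server master key plus salt */
} KeyMaterial;

typedef struct SrtpKeys {
    const unsigned char *send;
    const unsigned char *recv;
} SrtpKeys;

static SrtpKeys select_srtp_keys(const KeyMaterial *km, int is_dtls_client)
{
    SrtpKeys k;
    if (is_dtls_client) {
        k.send = km->client; /* we encrypt with the client key */
        k.recv = km->server; /* the peer encrypts with the server key */
    } else {
        k.send = km->server;
        k.recv = km->client;
    }
    return k;
}
```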
Gyan Doshi
535d4047d3 ffmpeg: unbreak max_error_rate application
The calculation of decode error rate neglected to cast
its operands to float, thus always leading to a value of 0.
2025-10-21 13:22:08 +00:00
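The class of bug described here is easy to reproduce in isolation. The sketch below uses hypothetical function names and counters; it does not reproduce the expression in ffmpeg itself, only the effect of dividing integers before converting to float.

```c
#include <stdint.h>

/* Integer division happens first, so any ratio below 1 truncates to 0. */
static float error_rate_truncated(uint64_t errors, uint64_t total)
{
    return total ? (float)(errors / total) : 0.0f;
}

/* Casting the operands first keeps the fractional part. */
static float error_rate(uint64_t errors, uint64_t total)
{
    return total ? (float)errors / (float)total : 0.0f;
}
```

A threshold check against the truncated value can only trigger when every frame fails, which is why the limit appeared to have no effect.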
Zhao Zhili
edf5b777c9 avcodec/vvc: fix false alarm of missing ref on RASL 2025-10-21 13:21:52 +00:00
Andreas Rheinhardt
05b8608c76 avcodec/x86/mpegvideoencdsp_init: Fix left shift of negative number
Uncovered by UBSan when running the mpegvideoencdsp checkasm
test.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2025-10-21 12:11:55 +02:00
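For context, left-shifting a negative signed value is undefined behavior in C, which is what UBSan reports. One common way to avoid it is to multiply by a power of two instead; whether this commit used that approach is not stated, so the helper below is only an illustrative sketch with a hypothetical name.

```c
#include <stdint.h>

/* Scale by 2^shift without shifting a negative value.
 * Assumes shift < 31 and that the product fits in int32_t. */
static int32_t scale_pow2(int32_t v, unsigned shift)
{
    return v * (int32_t)(1u << shift);
}
```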
Martin Storsjö
e442128944 movenc: Make sure to flush the delayed moov atom for hybrid fragmented
If using the delay_moov flag in combination with hybrid_fragment
(which is a potentially problematic combination otherwise - the
ftyp box does end up hidden in the end), then we need to flush
twice to get both the moov box and the first fragment, if the
file is finished before the first fragment is completed.
2025-10-21 08:38:32 +00:00
Martin Storsjö
27f5561885 movenc: Fix sample clustering for hybrid_fragmented+delay_moov
If samples were available when the moov was written, chunking
for those samples has been done already, which has to be reset
here.

This is the case when not using empty_moov, when the moov box
describes the first fragment - this case was accounted for already.
But if using the delay_moov flag, then those samples also were
available when writing the moov, so chunking for them has already
been done in this case as well.

Therefore, always reset chunking here (it should be harmless to
always do it), and update the comment to clarify the cases
involved here.
2025-10-21 08:38:32 +00:00
Leo Izen
dc39a576ad avcodec/pngenc: include EXIF buffer in max_packet_size
When calculating the max size of an output PNG packet, we should
include the size of a possible eXIf chunk that we may write.

This fixes a regression since d3190a64c3
as well as a bug that existed prior in the apng encoder since commit
4a580975d4.

Signed-off-by: Leo Izen <leo.izen@gmail.com>
2025-10-19 09:17:38 -04:00
Michael Niedermayer
51d3c4b4b6 tools/target_dec_fuzzer: Adjust threshold for PIXLET
Fixes: Timeout
Fixes: 425754611/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_PIXLET_fuzzer-4778526102585344

Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2025-10-19 02:07:03 +02:00
Michael Niedermayer
388e6fb3be avcodec/ffv1enc: Consider variation in slice sizes
When splitting a 5-line image into 2 slices, one slice will be 3 lines and thus needs more space

Fixes: Assertion sc->slice_coding_mode == 0 failed at libavcodec/ffv1enc.c:1668
Fixes: 422811239/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_FFV1_fuzzer-4933405139861504

Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2025-10-19 01:37:26 +02:00
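The arithmetic behind the example is a ceiling division: the tallest slice determines the worst-case space. A minimal sketch, assuming slices split the height as evenly as possible (the encoder's actual boundary computation may differ, and `max_slice_lines` is a hypothetical name):

```c
/* Height of the tallest slice when `height` lines are split into
 * `nb_slices` slices as evenly as possible (ceiling division). */
static int max_slice_lines(int height, int nb_slices)
{
    return (height + nb_slices - 1) / nb_slices;
}
```

Sizing buffers from `height / nb_slices` would under-allocate for the 3-line slice in the 5-line case.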
Michael Niedermayer
56ef66d350 tools/target_dec_fuzzer: Adjust threshold for CRI
Fixes: Timeout
Fixes: 421997576/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_CRI_fuzzer-5335057265131520

Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2025-10-19 01:37:17 +02:00
Michael Niedermayer
b132c1755a tools/target_dec_fuzzer: Adjust threshold for qdraw
Fixes: Timeout
Fixes: 421954735/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_QDRAW_fuzzer-4515776981172224

Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2025-10-19 01:37:14 +02:00
Michael Niedermayer
8988734d09 tools/target_dec_fuzzer: Adjust threshold for CAVS
Fixes: Timeout
Fixes: 421951267/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_CAVS_fuzzer-4766360421072896

Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpe
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2025-10-19 01:37:10 +02:00
Michael Niedermayer
51f0f2d2cf tools/target_dec_fuzzer: Adjust threshold for interplay video
Fixes: Timeout
Fixes: 421945523/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_INTERPLAY_VIDEO_fuzzer-4776910965506048

Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2025-10-19 01:37:06 +02:00
Michael Niedermayer
d43f19064e MAINTAINERS: libtheoraenc seems unmaintained
See: [FFmpeg-devel] libtheora maintainer ?

Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2025-10-19 01:37:00 +02:00
Michael Niedermayer
4666c1eed3 libavcodec/cbs_apv_syntax_template: limit tile to 2gb
We do not support larger tiles as we use signed int
Alternatively we can check this in apv_decode_tile_component() or init_get_bits*()
or support bitstreams above 2gb length

Fixes: init_get_bits() failure later
Fixes: 421817631/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_APV_fuzzer-4957386534354944

Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2025-10-19 01:32:42 +02:00
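The kind of bound involved can be sketched as a range check before a size is stored in a signed int. This is an illustration under assumptions: `tile_size_ok` is a hypothetical helper, and the exact limit and location of the check in the CBS template are not shown in the message.

```c
#include <limits.h>
#include <stdint.h>

/* Reject sizes that cannot be represented in a signed int. */
static int tile_size_ok(uint64_t tile_size_bytes)
{
    return tile_size_bytes <= (uint64_t)INT_MAX;
}
```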
Araz Iusubov
d19b7c283c avcodec/d3d12va_encode: D3D12 H264 encoding support
This patch introduces hardware-accelerated H.264 encoding
using Direct3D 12 Video API (D3D12VA).
2025-10-18 12:20:11 +00:00
Andreas Rheinhardt
ed007ad427 avcodec/x86/fpel: Port ff_put_pixels8_mmx() to SSE2
This has the advantage of not violating the ABI by using
MMX registers without issuing emms; e.g. it allows
removing an emms_c from bink.c.

This commit uses GP registers on Unix64 (there are not
enough volatile registers to do likewise on Win64) which
reduces codesize and is faster on some CPUs.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2025-10-17 13:27:56 +02:00
Andreas Rheinhardt
d91b1559e0 avcodec/x86/me_cmp: Replace MMXEXT size 16 funcs by unaligned SSE2 funcs
Snow calls some of the me_cmp_funcs with insufficient alignment
for the first pointer (see get_block_rd() in snowenc.c);
therefore SSE2 functions which really need this alignment
don't get set for Snow and 542765ce3e
consequently didn't remove MMXEXT functions which are overridden
by these SSE2 functions for normal codecs.

For reference, here is a command line which would segfault
if one simply used the ordinary SSE2 functions for Snow:

./ffmpeg -i mm-short.mpg -an -vcodec snow -t 0.2 -pix_fmt yuv444p \
-vstrict -2 -qscale 2 -flags +qpel -motion_est iter 444iter.avi

This commit adds unaligned SSE2 versions of these functions
and removes the MMXEXT ones. In particular, this implies that
sad 16x16 now never uses MMX, which allows removing an emms_c
from ac3enc.c.

Benchmarks (u means unaligned version):
sad_0_c:                                                 8.2 ( 1.00x)
sad_0_mmxext:                                           10.8 ( 0.76x)
sad_0_sse2:                                              6.2 ( 1.33x)
sad_0_sse2u:                                             6.7 ( 1.23x)

vsad_0_c:                                               44.7 ( 1.00x)
vsad_0_mmxext (approx):                                 12.2 ( 3.68x)
vsad_0_sse2 (approx):                                    7.8 ( 5.75x)

vsad_4_c:                                               88.4 ( 1.00x)
vsad_4_mmxext:                                           7.1 (12.46x)
vsad_4_sse2:                                             4.2 (21.15x)
vsad_4_sse2u:                                            5.5 (15.96x)

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2025-10-17 13:05:07 +02:00
Andreas Rheinhardt
69a700043d avcodec/x86/me_cmp: Remove MMXEXT functions overridden by SSE2
The SSE2 functions overriding them are currently only set
if the SSE2SLOW flag is not set and if the codec is not Snow.
The former affects only outdated processors (AMDs from
before Barcelona (i.e. before 2007)) and is therefore irrelevant.
Snow does not use the pix_abs function pointers at all,
so this is also no obstacle.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2025-10-17 13:05:07 +02:00
Andreas Rheinhardt
20c4608af8 avcodec/x86/me_cmp: Add SSE2 sad 8,16 xy2 functions
The new functions are faster than the existing exact
functions, yet get beaten by the nonexact functions
(they can avoid unpacking to words and back).
The exact (slow) MMX functions have therefore been
removed, which was actually beneficial size-wise
(416B of new functions, 619B of functions removed).

pix_abs_0_3_c:                                         216.8 ( 1.00x)
pix_abs_0_3_mmx:                                        71.8 ( 3.02x)
pix_abs_0_3_mmxext (approximative):                     17.6 (12.34x)
pix_abs_0_3_sse2:                                       23.5 ( 9.23x)
pix_abs_0_3_sse2 (approximative):                        9.9 (21.94x)

pix_abs_1_3_c:                                          98.4 ( 1.00x)
pix_abs_1_3_mmx:                                        36.9 ( 2.66x)
pix_abs_1_3_mmxext (approximative):                      9.2 (10.73x)
pix_abs_1_3_sse2:                                       14.8 ( 6.63x)

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2025-10-17 13:05:07 +02:00
Michael Yang
20051ed3af avcodec/vulkan_encode_av1: fix level index 2025-10-16 21:59:24 +00:00
Michael Yang
62d43ba2e3 libavfilter/vf_nlmeans_vulkan: fix str defaults
Revert to NAN, as -1.0 was erroneously clamped to 0.0 to fit in the options
range.

Add special handling of str as requested.
2025-10-16 21:32:43 +00:00
Michael Yang
e8213f766f libavfilter/vf_nlmeans_vulkan: amend doc 2025-10-16 21:32:43 +00:00
Michael Yang
7d65ce7763 libavfilter/vf_nlmeans_vulkan: clean up defaults
Change per-plane strength defaults to -1.0.
2025-10-16 21:32:43 +00:00
Michael Yang
26dee5b43e libavfilter/vf_nlmeans_vulkan: reverse img_bar 2025-10-16 21:32:43 +00:00
Michael Yang
71ff349cc1 libavfilter/vf_nlmeans_vulkan: lower strength min
Lower (per-component) strength minimum from 1.0 to 0.0, with 0.0 skipping
integral and weights calculations.
2025-10-16 21:32:43 +00:00
Michael Yang
2e12b3251d libavfilter/vf_nlmeans_vulkan: clean up naming
Add `nb_components` to push data.

Rename `ws_total_*` to `ws_*`.
2025-10-16 21:32:43 +00:00
Michael Yang
3fac2d8593 avfilter/vf_nlmeans_vulkan: rewrite filter
This is a major rewrite of the existing nlmeans vulkan code, with bug
fixes and major performance improvement.

Fix visual artifacts found in tickets #10661 and #10733. Add OOB checks for
image loading and patch sized area around the border. Correct chroma
plane height, strength and buffer barrier index.

Improve parallelism with component workgroup axis and more but smaller
workgroups. Split weights pass into vertical/horizontal (integral) and
weights passes. Remove h/v order logic to always calculate sum on
vertical pass. Remove atomic float requirement, which causes high memory
locking contentions, at the cost of higher memory usage of w/s buffer.
Use cache blocking in h pass to reduce memory bandwidth usage.
2025-10-16 21:32:43 +00:00
Martin Storsjö
36896af64a movenc: Make the hybrid_fragmented mode more robust
Write the moov tag at the end first, before overwriting the mdat size
at the start of the file.

In case writing the final moov box fails (e.g. due to being out
of disk), we haven't broken the initial moov box yet.

Thus if writing stops between these steps, we could end up with
a file with two moov boxes - which arguably is more feasible to
recover from, than from a file with no moov boxes at all.
2025-10-16 18:58:54 +00:00
Niklas Haas
a45d30a675 avutil/hwcontext_vulkan: always enable baseline usage flags
The documentation states that this field is for enabling "extra" usage
flags. This conflicts with the implementation, and the rest of the comment,
though.

In resolving this ambiguity, I think it's better to lean towards the first
sentence and treat this field purely as specifying *extra* usage flags to
enable. Otherwise, this may break vulkan encoding or subsequent hwdownload
if the upstream filter did not specifically advertise this.

Change the default behavior and update the documentation slightly to more
clearly document the semantics.
2025-10-16 17:40:25 +00:00
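Under the semantics chosen here, the user-supplied field only adds to a fixed baseline. A minimal sketch with hypothetical flag bits (real code would use `VkImageUsageFlagBits` values, and the actual baseline set is not listed in the message):

```c
#include <stdint.h>

/* Hypothetical usage bits for illustration only. */
#define USAGE_SAMPLED      (1u << 0)
#define USAGE_STORAGE      (1u << 1)
#define USAGE_TRANSFER_SRC (1u << 2)
#define USAGE_TRANSFER_DST (1u << 3)
#define USAGE_ENCODE_SRC   (1u << 4)

#define BASELINE_USAGE (USAGE_SAMPLED | USAGE_STORAGE | \
                        USAGE_TRANSFER_SRC | USAGE_TRANSFER_DST)

/* The baseline is always enabled; `extra` can only add flags. */
static uint32_t effective_usage(uint32_t extra)
{
    return BASELINE_USAGE | extra;
}
```

Because the baseline can no longer be switched off by leaving flags out, a downstream hwdownload or encoder still finds the usage bits it relies on.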
Andreas Rheinhardt
b1f2eea1cd avfilter/vf_noise: Deduplicate option flags
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2025-10-16 19:10:51 +02:00
Andreas Rheinhardt
3ba570de8b avfilter/x86/vf_noise: Port line_noise funcs to SSE2
This avoids having to fix up ABI violations via emms_c and
also leads to a 73% speedup for the line noise average version
here.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2025-10-16 19:09:45 +02:00
Andreas Rheinhardt
adfec0f52e avfilter/x86/vf_noise: Make line_noise_avg_mmx() match C function
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2025-10-16 18:41:19 +02:00
Andreas Rheinhardt
214b52df43 avfilter/vf_noise: Avoid cast
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2025-10-16 18:41:19 +02:00
Andreas Rheinhardt
ece623b1b3 avfilter/vf_noise: Fix race with very tall images
When using averaged noise with height > MAX_RES (i.e. 4096),
multiple threads would access the same prev_shift slot,
leading to races. Fix this by disabling slice threading
in such scenarios.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2025-10-16 18:41:19 +02:00
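The fix amounts to choosing a single job whenever the per-row state could be shared between slices. A rough sketch, in which only the `MAX_RES` value of 4096 comes from the message; `nb_slice_jobs` and its parameters are hypothetical:

```c
#define MAX_RES 4096 /* value stated in the commit message */

/* Number of slice jobs to run. With averaged noise and more rows than
 * MAX_RES, fall back to one job so no two threads share a slot. */
static int nb_slice_jobs(int height, int averaged, int nb_threads)
{
    if (averaged && height > MAX_RES)
        return 1;
    return height < nb_threads ? height : nb_threads;
}
```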
Andreas Rheinhardt
6a53a4e341 avfilter/vf_noise: Don't write beyond end-of-array
This is not only UB, but also leads to races and nondeterministic
output, because the write one past the end of the buffer actually
conflicts with accesses by the thread that actually owns it.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2025-10-16 18:41:18 +02:00
Andreas Rheinhardt
94948bd6b9 avfilter/vf_noise: Make private context smaller
"all" only exists to set options; it does not need the big arrays
contained in FilterParams.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2025-10-16 18:41:18 +02:00
Zhao Zhili
cd4b01707d Revert "avformat/movenc: sidx earliest_presentation_time is applied after editlist"
This reverts commit 301141b576.

cluster[0].dts, pts and frag_info[0].time are already in presentation
timeline, so they shouldn't be shifted by start_pts.

Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
2025-10-16 11:22:37 +08:00
Zhao Zhili
0de3b1f358 avformat/mov: don't shift sidx_pts
sidx_pts is already in presentation time, so it shouldn't be shifted
by sc->time_offset again.

Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
2025-10-16 11:22:37 +08:00
James Almer
2e1d702cfc avformat/dump: fix log level passed to av_log when printing stream group side data
Signed-off-by: James Almer <jamrial@gmail.com>
2025-10-15 17:49:11 -03:00
Andreas Rheinhardt
74a3c1ddb6 avfilter/x86/vf_pullup: Port pullup functions to SSE2, SSSE3
The diff and var functions benefit from psadbw; comb benefits from wider
registers, which avoid reloading values and reduce the number
of loads from 48 to 10. Performance increased by 117% (the loop
in compute_metric() has been timed); codesize decreased by 144B.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2025-10-15 19:43:37 +02:00
Andreas Rheinhardt
dcb28ed860 avfilter/x86/vf_spp: Port store_slice to SSE2
This allows removing an emms_c from the filter. It also gives a
25% speedup here (when timing the calls to store_slice using
START/STOP_TIMER).

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2025-10-15 19:43:37 +02:00
Andreas Rheinhardt
f4a87d8ca4 avcodec/x86/mpegvideoencdsp_init: Use xmm registers in SSSE3 functions
Improves performance and no longer breaks the ABI (by forgetting
to call emms).

Old benchmarks:
add_8x8basis_c:                                         43.6 ( 1.00x)
add_8x8basis_ssse3:                                     12.3 ( 3.55x)

New benchmarks:
add_8x8basis_c:                                         43.0 ( 1.00x)
add_8x8basis_ssse3:                                      6.3 ( 6.79x)

Notice that the output of try_8x8basis_ssse3 changes a bit:
Before this commit, it computes certain values and adds the values
for i,i+1,i+4 and i+5 before right shifting them; now it adds
the values for i,i+1,i+8,i+9. The second pair in these lists
could be avoided (by shifting xmm0 and xmm1 before adding both together
instead of only shifting xmm0 after adding them), but the former
i,i+1 is inherent in using pmaddwd. This is the reason that this
function is not bitexact.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2025-10-15 08:55:13 +02:00
Andreas Rheinhardt
cffd029e98 avcodec/x86/mpegvideoencdsp_init: Don't use slow path unnecessarily
The only requirement of this code (and essentially the pmulhrsw
instruction) is that the scaled scale fits into an int16_t.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2025-10-15 08:55:13 +02:00
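For reference, pmulhrsw multiplies signed 16-bit lanes and returns the product rounded and scaled down by 2^15, which is why the scale only has to fit in an int16_t. A scalar model of one lane is sketched below; the function names are hypothetical, and the right shift of a negative value is assumed to be arithmetic, as it is on common compilers.

```c
#include <stdint.h>

/* Scalar model of one pmulhrsw lane: (a * b + 2^14) >> 15. */
static int16_t pmulhrsw_scalar(int16_t a, int16_t b)
{
    int32_t prod = (int32_t)a * b;
    return (int16_t)((prod + 0x4000) >> 15);
}

/* The only precondition named in the message: the scale fits in int16_t. */
static int fits_int16(int v)
{
    return v >= INT16_MIN && v <= INT16_MAX;
}
```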
Andreas Rheinhardt
ce499ebf96 tests/checkasm/mpegvideoencdsp: Add test for add_8x8basis
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2025-10-15 08:55:13 +02:00