mirror of https://github.com/FFmpeg/FFmpeg.git synced 2025-08-04 22:03:09 +02:00
118628 Commits

Author SHA1 Message Date
db8546dff7 avcodec/mpeg12dec: Remove write-only assignments
This decoder unquantizes while parsing blocks
and does not use dct_unquantize_mpeg1_intra
(which uses *_dc_scale) at all.

repeat_field was unused since
e0a3d744a0.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2025-03-04 12:53:15 +01:00
3474475e58 avcodec/h261dec: Inline constant
The value here has been set in ff_set_qscale() from
ff_mpeg1_dc_scale_table, all of whose entries are 8.

Reviewed-by: Ramiro Polla <ramiro.polla@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2025-03-04 12:51:54 +01:00
9c16d54a16 avcodec/h261dec: Remove dead check
H.261 does not have non-reference frames.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2025-03-04 12:51:54 +01:00
a5d590963c tests/fate/vcodec: Test H.261 loop-filter
Increases coverage.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2025-03-04 12:51:54 +01:00
57ade06ffe tests/fate/vcodec: Test using mpeg2-quantizers for MPEG-4
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2025-03-04 12:51:54 +01:00
c960b42efc tests/fate/vcodec: Test alternate_scan
Encoding was untested before this.
Notice that the filesize degradation is partially due to
mpegvideo no longer using progressive_sequence and
progressive_frame.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2025-03-04 12:51:54 +01:00
a885351ad0 avcodec/vc1dec: Reenable debug-info output for field pictures
Effectively reverts c59b5e3d1e.
This is possible now that ff_print_debug_info2() uses
the MPVPicture dimensions.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2025-03-04 12:51:54 +01:00
973c7a0c65 avcodec/mpegvideo_dec: Use picture-dimensions in ff_print_debug_info()
This will make it possible to avoid the special case for VC-1 field pictures.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2025-03-04 12:51:54 +01:00
d4fd475005 avcodec/motion_est: Avoid branches for put(_no_rnd) selection
MotionEstContext contains pointers (to function pointers) that
have been set on a per-frame basis depending upon no_rounding
in ff_me_init_pic() to avoid branches like these. Also makes
MotionEstContext more independent of MpegEncContext.

Reviewed-by: Ramiro Polla <ramiro.polla@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2025-03-04 12:45:33 +01:00
b3ba961df6 avcodec/mpegvideo_enc: Add AV_CODEC_CAP_DR1
The mpegvideo-based encoders do one uncommon thing with
the packet's data given by ff_alloc_packet(): They potentially
reallocate it. But this only affects the internal buffer
and is not user-facing at all, so one can nevertheless
use the AV_CODEC_CAP_DR1 for them.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2025-03-04 12:45:33 +01:00
ab768b88e0 avcodec/mpegvideo_enc: Don't set qscale_table value prematurely
When there are multiple candidates for macroblock type, the encoder
tries them all. In order to do so, it keeps several sets of states
containing the variables that get modified when encoding
the macroblock and in the end uses the best of these.

Yet one variable was set, but not included in this state:
The current macroblock's qscale value in the current picture's
qscale_table. This may currently be set multiple times in
mpv_reconstruct_mb(), yet it is read when adaptive_quant is true.
Currently, the value read can be the value set by the last attempt
to write the current macroblock and not the initial value.

Fix this by only setting the qscale_table value in one place
outside of mpv_reconstruct_mb() (where it does not belong at all).

Reviewed-by: Ramiro Polla <ramiro.polla@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2025-03-04 12:44:18 +01:00
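
The failure mode described in ab768b88e0 can be illustrated with a toy model. This is only a sketch, not FFmpeg code: `try_candidates_buggy`, `try_candidates_fixed`, the list-based `qscale_table` and the `chosen` index are hypothetical names invented for the illustration. It assumes each trial encode reads the macroblock's table entry (as adaptive quantization would) and, in the buggy variant, also writes to it.

```python
def try_candidates_buggy(qscale_table, mb_index, candidate_qscales):
    """Each trial writes its qscale into the shared table as a side effect."""
    observed = []
    for qscale in candidate_qscales:
        # Read the entry the way a later stage would during this trial.
        observed.append(qscale_table[mb_index])
        # Premature write: this leaks the trial's value into the next trial.
        qscale_table[mb_index] = qscale
    return observed


def try_candidates_fixed(qscale_table, mb_index, candidate_qscales, chosen):
    """Trials leave the table alone; it is written once, after the choice."""
    observed = []
    for _ in candidate_qscales:
        observed.append(qscale_table[mb_index])
    # Single write in one place, outside the per-trial code path.
    qscale_table[mb_index] = candidate_qscales[chosen]
    return observed


if __name__ == "__main__":
    print(try_candidates_buggy([4], 0, [6, 8, 5]))
    print(try_candidates_fixed([4], 0, [6, 8, 5], chosen=1))
```

In the buggy variant every trial after the first sees the previous trial's value instead of the initial one; in the fixed variant all trials see the same initial value.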
3e9777dc75 aarch64/hevcdsp_idct_neon: Add implementation for idct dc 12
Reduce binary size at the same time. The performance compared to clang -O3
is the same.

Reviewed-by: Martin Storsjö <martin@martin.st>
Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
2025-03-04 17:01:58 +08:00
5977bff569 aarch64/hevcdsp_idct_neon: Optimize idct dc
clang does better than the assembly code before the patch, especially
for small sizes:

hevc_idct_4x4_dc_8_c:                                   11.2 ( 1.00x)
hevc_idct_4x4_dc_8_neon:                                15.5 ( 0.73x)
hevc_idct_4x4_dc_10_c:                                  12.0 ( 1.00x)
hevc_idct_4x4_dc_10_neon:                               15.2 ( 0.79x)
hevc_idct_8x8_dc_8_c:                                   13.2 ( 1.00x)
hevc_idct_8x8_dc_8_neon:                                18.2 ( 0.73x)
hevc_idct_8x8_dc_10_c:                                  13.5 ( 1.00x)
hevc_idct_8x8_dc_10_neon:                               17.2 ( 0.78x)
hevc_idct_16x16_dc_8_c:                                 41.8 ( 1.00x)
hevc_idct_16x16_dc_8_neon:                              37.8 ( 1.11x)
hevc_idct_16x16_dc_10_c:                                41.8 ( 1.00x)
hevc_idct_16x16_dc_10_neon:                             37.8 ( 1.11x)
hevc_idct_32x32_dc_8_c:                                130.2 ( 1.00x)
hevc_idct_32x32_dc_8_neon:                             132.2 ( 0.98x)
hevc_idct_32x32_dc_10_c:                               130.2 ( 1.00x)
hevc_idct_32x32_dc_10_neon:                            132.2 ( 0.98x)

This patch basically clones what the compiler does, so the performance
is the same.

Reviewed-by: Martin Storsjö <martin@martin.st>
Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
2025-03-04 17:01:58 +08:00
5a32496962 avcodec/libuavs3d: process extradata
Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
2025-03-04 17:01:58 +08:00
2d7966aee1 avformat/isom_tags: Add tag for AVS3
Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
2025-03-04 17:01:58 +08:00
a053516e64 avformat/movenc: Add AVS3 support
'avs3' is registered at mp4ra.org. The Avs3ConfigurationBox 'av3c'
inside 'avs3' hasn't been registered yet, but is specified by the
AVS3 spec.

Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
2025-03-04 17:01:58 +08:00
71a91485fa avcodec/aarch64/vvc: Optimize NEON version of vvc_dmvr
This patch replaces blocks of instructions performing rounding and
widening shifts with one-liners achieving the same result.

Before and after on A78
dmvr_8_12x20_neon:                                      86.2 ( 6.90x)
dmvr_8_20x12_neon:                                      94.8 ( 5.93x)
dmvr_8_20x20_neon:                                     141.5 ( 6.50x)
dmvr_12_12x20_neon:                                    158.0 ( 3.76x)
dmvr_12_20x12_neon:                                    151.2 ( 3.73x)
dmvr_12_20x20_neon:                                    247.2 ( 3.71x)
dmvr_hv_8_12x20_neon:                                  423.2 ( 3.75x)
dmvr_hv_8_20x12_neon:                                  434.0 ( 3.69x)
dmvr_hv_8_20x20_neon:                                  706.0 ( 3.69x)

dmvr_8_12x20_neon:                                      77.2 ( 7.70x)
dmvr_8_20x12_neon:                                      66.5 ( 8.49x)
dmvr_8_20x20_neon:                                      92.2 ( 9.90x)
dmvr_12_12x20_neon:                                     80.2 ( 7.38x)
dmvr_12_20x12_neon:                                     58.2 ( 9.59x)
dmvr_12_20x20_neon:                                     90.0 (10.15x)
dmvr_hv_8_12x20_neon:                                  369.0 ( 4.34x)
dmvr_hv_8_20x12_neon:                                  355.8 ( 4.49x)
dmvr_hv_8_20x20_neon:                                  574.2 ( 4.51x)

Signed-off-by: Martin Storsjö <martin@martin.st>
2025-03-04 10:35:31 +02:00
d765e5f043 swscale/aarch64: dotprod implementation of rgba32_to_Y
The idea is to split the 16-bit coefficients into lower and upper halves,
invoke udot for the lower half, shift by 8, and follow with udot for the
upper half.

Benchmark on A78:
bgra_to_y_128_c:                                       682.0 ( 1.00x)
bgra_to_y_128_neon:                                    181.2 ( 3.76x)
bgra_to_y_128_dotprod:                                 117.8 ( 5.79x)
bgra_to_y_1080_c:                                     5742.5 ( 1.00x)
bgra_to_y_1080_neon:                                  1472.5 ( 3.90x)
bgra_to_y_1080_dotprod:                                906.5 ( 6.33x)
bgra_to_y_1920_c:                                    10194.0 ( 1.00x)
bgra_to_y_1920_neon:                                  2589.8 ( 3.94x)
bgra_to_y_1920_dotprod:                               1573.8 ( 6.48x)

Signed-off-by: Martin Storsjö <martin@martin.st>
2025-03-04 10:16:44 +02:00
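
The coefficient split described in d765e5f043 rests on a simple identity: if c = 256*hi + lo, then sum(p*c) = sum(p*lo) + 256*sum(p*hi). The sketch below checks that identity in plain Python. It is not a model of the actual assembly: the function names and the coefficient values are hypothetical, coefficients are assumed to be non-negative 16-bit integers, and the exact point at which the real code applies its shift is not reproduced here.

```python
def dot_direct(pixels, coeffs):
    """Reference: one dot product with full 16-bit coefficients."""
    return sum(p * c for p, c in zip(pixels, coeffs))


def dot_split(pixels, coeffs):
    """Two byte-wise dot products, one per coefficient half."""
    low_bytes = [c & 0xFF for c in coeffs]   # lower 8 bits of each coefficient
    high_bytes = [c >> 8 for c in coeffs]    # upper 8 bits of each coefficient
    low_sum = sum(p * c for p, c in zip(pixels, low_bytes))
    high_sum = sum(p * c for p, c in zip(pixels, high_bytes))
    # The upper-half sum carries a weight of 2**8.
    return low_sum + (high_sum << 8)


if __name__ == "__main__":
    bgra = [10, 200, 30, 255]           # one pixel, 8 bits per channel
    weights = [3000, 16000, 8000, 0]    # illustrative values only
    print(dot_direct(bgra, weights), dot_split(bgra, weights))
```

Each half fits in a byte, which is what lets a byte-oriented dot-product instruction handle 16-bit coefficients in two passes.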
081c865867 avformat/riffdec: change declaration of ff_get_wav_header()
Change the type of logctx from void* to AVFormatContext*. This is in
preparation for the next commit.

Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2025-03-04 02:07:01 +01:00
25c439296b avformat/mov: fix overflow in corrected_dts calculation
Fixes: Integer-overflow
Fixes: 400093647/clusterfuzz-testcase-minimized-media_metadata_parser_fuzzer-4794341562187776

Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: James Almer <jamrial@gmail.com>
2025-03-03 18:10:10 -03:00
01f63ef0b4 fftools/ffmpeg_filter: also remove display matrix side data from buffered frames
Some frames may be buffered before a complex filtergraph can be configured.
This change ensures the side data removal in the cases where autorotation is
enabled also applies to them.

Fixes ticket #11487

Signed-off-by: James Almer <jamrial@gmail.com>
2025-03-03 18:10:10 -03:00
848576b4df fftools/ffmpeg_dec: remove side data copy block
It's no longer needed now that lavc handles this.

Signed-off-by: James Almer <jamrial@gmail.com>
2025-03-03 18:10:10 -03:00
603334e86f avcodec/decode: inject missing global side data to output frames
ff_decode_frame_props() injects global side data passed by the caller (usually
coming from the container) but ignores the global side data the decoder
gathered from the bitstream itself.
This commit amends this.

Signed-off-by: James Almer <jamrial@gmail.com>
2025-03-03 18:10:10 -03:00
964d28e83c avutil/frame: move side data helpers to a new file
Should reduce clutter in frame.c, plus allow us to make opaque changes to
side data handling.

Signed-off-by: James Almer <jamrial@gmail.com>
2025-03-03 18:10:10 -03:00
8631990f22 vulkan: take refs of frames using the regular buffer ref path
This simplifies the code, reduces allocations, and, critically, no longer
stores references to frames alongside references to hw_frames_ctx.
The issue was that storing refs to frames during transfers also stored
refs to the frames' hw_frames_ctx, which created a circular dependency
and caused the Vulkan device to never be terminated.

This only stores what it strictly needs as a dependency, and enables
the frames context to be freed, even while doing asynchronous transfers.
2025-03-03 19:43:57 +01:00
d21ed2298e avfilter/dnn_detect: fail on filter if mandatory anchor option is missing
This prevents the filter from running when the option is missing:
it now fails early, during init(), instead of merely logging an error
at runtime.

Signed-off-by: Leandro Santiago <leandrosansilva@gmail.com>
Reviewed-by: Wenbin Chen <wenbin.chen@intel.com>
Reviewed-by: Guo Yejun <yejun.guo@intel.com>
2025-03-03 18:30:18 +08:00
90fbb40da5 avfilter/dnn: do not manually parse anchors filter option
Instead, rely on AV_OPT_TYPE_FLAG_ARRAY, which will automatically
perform the parsing.

Signed-off-by: Leandro Santiago <leandrosansilva@gmail.com>
Reviewed-by: Wenbin Chen <wenbin.chen@intel.com>
Reviewed-by: Guo Yejun <yejun.guo@intel.com>
2025-03-03 18:29:33 +08:00
5c5be37daf avcodec/nvenc: factor out mastering display data into its own function
2025-03-02 18:43:53 +01:00
e567dca12f avcodec/nvenc: add time code writing for h264
2025-03-02 18:43:53 +01:00
31ca6c1bfe avcodec/utils: use new ff_timecode_set_smpte function
2025-03-02 18:43:53 +01:00
9d5d51bd12 avutil/timecode: add ff_timecode_set_smpte
2025-03-02 18:43:08 +01:00
600ad36949 lavc/vvc: Fix pps_single_slice_picture
Signed-off-by: Frank Plowman <post@frankplowman.com>
2025-03-02 20:42:13 +08:00
dede00f003 lavc/vvcdec: fix undefined reference to ff_h2645_pixel_aspect
This issue was introduced by commit bb8e95b650

Reproduce steps:
./configure --enable-ffmpeg --disable-everything --enable-decoder=vvc --enable-parser=vvc --enable-demuxer=vvc --enable-protocol=file,pipe --enable-encoder=rawvideo,wrapped_avframe --enable-muxer=rawvideo,md5,null && make -j
2025-03-02 16:16:39 +08:00
e8d4c55987 avcodec/aarch64/ac3dsp_neon.S: Optimize ac3_sum_square_butterfly_int32_neon
Instead of calculating a^2, b^2, (a+b)^2 and (a-b)^2, calculate only
a^2, b^2 and 2*a*b in each iteration and derive the latter parts from
these three at the end.

Before and after:

A78
ac3_sum_square_bufferfly_int32_neon:                   484.8 ( 2.00x)
ac3_sum_square_bufferfly_int32_neon:                   468.2 ( 2.08x)

A72
ac3_sum_square_bufferfly_int32_neon:                   793.6 ( 1.26x)
ac3_sum_square_bufferfly_int32_neon:                   527.3 ( 1.92x)

Signed-off-by: Martin Storsjö <martin@martin.st>
2025-03-02 01:17:53 +02:00
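
The rewrite in e8d4c55987 relies on the expansions (a+b)^2 = a^2 + b^2 + 2ab and (a-b)^2 = a^2 + b^2 - 2ab, applied to the accumulated sums rather than per element. A minimal sketch follows to check that both approaches agree. The function names and the order of the four returned sums are assumptions made for the illustration, and Python integers do not overflow, so accumulator width is not modeled.

```python
def sum_square_butterfly_direct(a, b):
    """Accumulate a^2, b^2, (a+b)^2 and (a-b)^2 element by element."""
    sum_a = sum(x * x for x in a)
    sum_b = sum(y * y for y in b)
    sum_plus = sum((x + y) ** 2 for x, y in zip(a, b))
    sum_minus = sum((x - y) ** 2 for x, y in zip(a, b))
    return sum_a, sum_b, sum_plus, sum_minus


def sum_square_butterfly_derived(a, b):
    """Accumulate only a^2, b^2 and 2ab; derive the other two at the end."""
    sum_a = sum_b = cross = 0
    for x, y in zip(a, b):
        sum_a += x * x
        sum_b += y * y
        cross += 2 * x * y
    # sum((a+b)^2) = sum(a^2) + sum(b^2) + sum(2ab), and likewise with a minus.
    return sum_a, sum_b, sum_a + sum_b + cross, sum_a + sum_b - cross


if __name__ == "__main__":
    left = [1, -2, 3]
    right = [4, 5, -6]
    print(sum_square_butterfly_direct(left, right))
    print(sum_square_butterfly_derived(left, right))
```

The derived version does three multiply-accumulates per element instead of four, plus a constant amount of work at the end.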
38929b824b swscale/aarch64: Refactor hscale_16_to_15__fs_4
This patch removes the use of stack for temporary state and replaces
interleaved ld4 loads with ld1.

Before/after:

A78
hscale_16_to_15__fs_4_dstW_8_neon:                      86.8 ( 1.72x)
hscale_16_to_15__fs_4_dstW_24_neon:                    147.5 ( 2.73x)
hscale_16_to_15__fs_4_dstW_128_neon:                   614.0 ( 3.14x)
hscale_16_to_15__fs_4_dstW_144_neon:                   680.5 ( 3.18x)
hscale_16_to_15__fs_4_dstW_256_neon:                  1193.2 ( 3.19x)
hscale_16_to_15__fs_4_dstW_512_neon:                  2305.0 ( 3.27x)

hscale_16_to_15__fs_4_dstW_8_neon:                      86.0 ( 1.74x)
hscale_16_to_15__fs_4_dstW_24_neon:                    106.8 ( 3.78x)
hscale_16_to_15__fs_4_dstW_128_neon:                   404.0 ( 4.81x)
hscale_16_to_15__fs_4_dstW_144_neon:                   451.8 ( 4.80x)
hscale_16_to_15__fs_4_dstW_256_neon:                   760.5 ( 5.06x)
hscale_16_to_15__fs_4_dstW_512_neon:                  1520.0 ( 5.01x)

A72
hscale_16_to_15__fs_4_dstW_8_neon:                     156.8 ( 1.52x)
hscale_16_to_15__fs_4_dstW_24_neon:                    217.8 ( 2.52x)
hscale_16_to_15__fs_4_dstW_128_neon:                   906.8 ( 2.90x)
hscale_16_to_15__fs_4_dstW_144_neon:                  1014.5 ( 2.91x)
hscale_16_to_15__fs_4_dstW_256_neon:                  1751.5 ( 2.96x)
hscale_16_to_15__fs_4_dstW_512_neon:                  3469.3 ( 2.97x)

hscale_16_to_15__fs_4_dstW_8_neon:                     151.2 ( 1.54x)
hscale_16_to_15__fs_4_dstW_24_neon:                    173.4 ( 3.15x)
hscale_16_to_15__fs_4_dstW_128_neon:                   660.0 ( 3.98x)
hscale_16_to_15__fs_4_dstW_144_neon:                   735.7 ( 4.00x)
hscale_16_to_15__fs_4_dstW_256_neon:                  1273.5 ( 4.09x)
hscale_16_to_15__fs_4_dstW_512_neon:                  2488.2 ( 4.16x)

Signed-off-by: Martin Storsjö <martin@martin.st>
2025-03-02 01:17:29 +02:00
76b1810017 libswscale/arm/swscale_unscaled: Fix function prototype
Constify the dstStride argument of rgbx_to_nv12_neon_16_wrapper to be
compatible with the other functions used in the function assignment.

Signed-off-by: Adam Lackorzynski <adam@l4re.org>
Signed-off-by: Martin Storsjö <martin@martin.st>
2025-03-02 01:10:38 +02:00
0245e9382c lavu: bump minor and add APIChanges entry for new GRAY32 pixfmts
2025-03-01 20:16:00 +01:00
0ef678f5c5 APIChanges: add entry for new AMD AMF pixfmt
2025-03-01 20:16:00 +01:00
a73760da53 APIChanges: add entries for new planar GBR and float gray pixfmts
Was not done when the patches were pushed.
2025-03-01 20:15:59 +01:00
ded6772359 fate-sws-pixdesc-query: update ref for new pixfmts
2025-03-01 20:15:59 +01:00
e41b45509b fate-imgutils: update reference for new pixel formats
2025-03-01 20:15:55 +01:00
629e8a2425 vulkan: add support for AV_PIX_FMT_GRAY32
2025-03-01 13:11:13 +01:00
300b82c3ea pixfmt: add AV_PIX_FMT_GRAY32
This is a useful format for high-precision intermediates.
2025-03-01 13:11:12 +01:00
469b7a0ee4 doc/developer: Better {} style rule
This makes developer.texi consistent with tools/patcheck

Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2025-03-01 02:47:33 +01:00
ace9f03a6c avformat/hls: Partially revert "reduce default max reload to 3"
(setting to 100 as a reasonable compromise)

The change has caused regressions for many users and consumers.
Playlist reloads only happen when a playlist doesn't indicate that it
has ended (via #EXT-X-ENDLIST), which means that the addition of future
segments is still expected.
It is entirely possible that an HLS server is temporarily unable to serve
further segments but resumes after some time, either indicating a
discontinuity or even fully catching up.
With a segment length of 3s, a max_reload value of 1000 corresponds to
a duration of 50 minutes which appears to be a reasonable default.

Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2025-03-01 02:47:20 +01:00
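
The arithmetic in ace9f03a6c can be reproduced with a short calculation. The sketch assumes, as the message's own figure implies, that each reload attempt waits roughly one segment duration; `reload_window_seconds` is a hypothetical helper, not an FFmpeg function.

```python
def reload_window_seconds(max_reload: int, segment_seconds: float) -> float:
    """Approximate time spent retrying before giving up on the playlist."""
    return max_reload * segment_seconds


if __name__ == "__main__":
    # Compare the values mentioned in the commit title and message.
    for max_reload in (3, 100, 1000):
        seconds = reload_window_seconds(max_reload, 3.0)
        print(f"max_reload={max_reload}: {seconds:.0f} s ({seconds / 60:.2f} min)")
```

Under this model, with 3 s segments, a limit of 1000 gives the 50 minutes quoted in the message, while the limit of 100 named in the title gives about 5 minutes.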
b137347278 aarch64: Fix a few misindented lines
Signed-off-by: Martin Storsjö <martin@martin.st>
2025-02-28 23:23:09 +02:00
b37ce9b016 libavutil/vulkan: Expose ff_vk_set_descriptor_image
Useful when creating a descriptor array of separate images
2025-02-28 13:44:49 +01:00
9993a64d7b avutil/aarch64/tx_float_neon.S: clean up FFT4_X2
2025-02-28 13:42:54 +01:00
bb87d19cd9 ffv1enc_vulkan: disable autodetection of async_depth
The issue is that this could consume gigabytes of VRAM at higher
resolutions for not that much of a speedup.
Automatic detection was not a good idea as we can't know how much
VRAM is actually free.
Just remove it.
2025-02-27 19:08:42 +01:00
85d81dcfd6 hwcontext_vulkan: enable read/write without storage
BGR formats in Vulkan cannot be used in storage images, as the
pixel labels on storage images are always ordered as RGB, and
swizzling is not an option due to old hardware limitations.
This means that you must always use an RGB format and manually
swizzle when reading or writing to BGR images, or simply not use
a format in the shader itself.
This adds support for the latter.
2025-02-27 19:06:41 +01:00