mirror of https://github.com/FFmpeg/FFmpeg.git synced 2025-01-19 05:49:09 +02:00

112897 Commits

Author SHA1 Message Date
Rémi Denis-Courmont
adc87a5f7c lavc/opusdsp: rewrite R-V V postfilter
This uses a more traditional approach, allowing processing of up to
period minus two elements per iteration. This also allows the algorithm
to work for any vector length.

As the T-Head C908 device under test can load 16 elements per loop, there is
unsurprisingly a little performance drop when the period is minimal and
the parallelism is capped at 13 elements:

Before:
postfilter_15_c:         21222.2
postfilter_15_rvv_f32:   22007.7
postfilter_512_c:        20189.7
postfilter_512_rvv_f32:  22004.2
postfilter_1022_c:       20189.7
postfilter_1022_rvv_f32: 22004.2

After:
postfilter_15_c:         20189.5
postfilter_15_rvv_f32:    7057.2
postfilter_512_c:        20189.5
postfilter_512_rvv_f32:   5667.2
postfilter_1022_c:       20192.7
postfilter_1022_rvv_f32:  5667.2
2023-11-06 22:09:30 +02:00
Rémi Denis-Courmont
02594c8c01 lavc/pixblockdsp: rework R-V V get_pixels_unaligned
As in the aligned case, we can use VLSE64.V, though the way of doing so
gets more convoluted, so the performance gains are more modest:

get_pixels_unaligned_c:       126.7
get_pixels_unaligned_rvv_i32: 145.5 (before)
get_pixels_unaligned_rvv_i64:  62.2 (after)

For reference, these are the aligned benchmarks (unchanged) on the
same T-Head C908 hardware:

get_pixels_c:                 126.7
get_pixels_rvi:                85.7
get_pixels_rvv_i64:            33.2
2023-11-06 19:42:49 +02:00
Rémi Denis-Courmont
f68ad5d2de lavc/sbrdsp: R-V V sbr_hf_g_filt
hf_g_filt_c:      1552.5
hf_g_filt_rvv_f32: 679.5
2023-11-06 19:42:49 +02:00
Paul B Mahol
44a0148fad avfilter/af_adynamicequalizer: do detection of threshold first
Produces better results in the final output if multiple filters are cascaded at once.
2023-11-05 16:00:29 +01:00
Paul B Mahol
799fad1828 avfilter/af_adynamicequalizer: always start filtering from unit gain
2023-11-05 16:00:28 +01:00
Anton Khirnov
f9fdaa2ca9 configure: warn when threading is disabled
Explicitly state what the implications of this are.
2023-11-05 11:30:13 +01:00
Anton Khirnov
ad3df6bf35 lavf/smacker: export sample_aspect_ratio
Partially fixes #10617
2023-11-05 11:30:13 +01:00
Rob Hall
1a7a85137e ffbuild: Add gzip -n flag to fix reproducible builds
Without this flag, timestamps were embedded into the final
binary if CUDA was enabled.

Signed-off-by: Rob Hall <robxnanocode@outlook.com>
Signed-off-by: Anton Khirnov <anton@khirnov.net>
2023-11-05 11:30:13 +01:00
Paul B Mahol
fd1712b6fb avfilter/af_adynamicequalizer: merge direction option with mode option
More user-friendly, and it is now self-explanatory what each mode does.
2023-11-04 15:39:24 +01:00
Paul B Mahol
43226efc21 avfilter/af_adynamicequalizer: add new structure to hold filtering state
2023-11-04 15:39:23 +01:00
Andreas Rheinhardt
3f890fbfd9 avcodec/cbs_h2645: Fix leak of SPS VUI extension data
Fixes: VUI extension leak
Fixes: 63004/clusterfuzz-testcase-minimized-ffmpeg_BSF_VVC_METADATA_fuzzer-4928832253329408

Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2023-11-04 01:27:41 +01:00
Andreas Rheinhardt
de4846dd18 avfilter/deshake: Merge header into its only user
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2023-11-04 01:26:25 +01:00
Andreas Rheinhardt
2fdaeec41b avfilter/vf_deshake: Remove unnecessary emms_c
Redundant since ea043cc53ed3506775ec6239ed5f8a20718b1098
(which made 16x16 no longer use MMX).

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2023-11-04 01:26:25 +01:00
Andreas Rheinhardt
392ab35db1 avfilter/vf_mpdecimate: Remove emms_c
Unnecessary now that the pixelutils API abides by the ABI.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2023-11-04 01:26:25 +01:00
Andreas Rheinhardt
5b85ca5317 avutil/x86/pixelutils: Empty MMX state in ff_pixelutils_sad_8x8_mmxext
We currently mostly do not empty the MMX state in our MMX
DSP functions; instead we only do so before code that might
be using x87 code. This is a violation of the System V i386 ABI
(and maybe of other ABIs, too):
"The CPU shall be in x87 mode upon entry to a function. Therefore,
every function that uses the MMX registers is required to issue an
emms or femms instruction after using MMX registers, before returning
or calling another function." (See 2.2.1 in [1])
This patch does not intend to change all these functions to abide
by the ABI; it only does so for ff_pixelutils_sad_8x8_mmxext, as this
function can be called by external users, because it is exported
via the pixelutils API. Without this, the following fragment will
assert (on x86/x64):
    uint8_t src1[8 * 8], src2[8 * 8];
    av_pixelutils_sad_fn fn = av_pixelutils_get_sad_fn(3, 3, 0, NULL);
    fn(src1, 8, src2, 8);
    av_assert0_fpu();

[1]: https://raw.githubusercontent.com/wiki/hjl-tools/x86-psABI/intel386-psABI-1.1.pdf

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2023-11-04 01:26:03 +01:00
Andreas Rheinhardt
8661b5e8f9 avfilter/vf_format: Deduplicate inputs
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2023-11-04 01:24:09 +01:00
Andreas Rheinhardt
c32c1a18b9 avfilter/vsrc_testsrc: Deduplicate outputs
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2023-11-04 01:24:09 +01:00
Andreas Rheinhardt
748c168f8e avfilter/vf_xmedian: Deduplicate outputs
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2023-11-04 01:24:09 +01:00
Andreas Rheinhardt
93abb9b560 avfilter/vf_hsvkey: Deduplicate inputs and outputs
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2023-11-04 01:24:09 +01:00
Andreas Rheinhardt
50d3c5bd8c avfilter/vf_convolve: Deduplicate outputs
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2023-11-04 01:24:09 +01:00
Andreas Rheinhardt
e557d89ac1 avfilter/vf_chromakey: Deduplicate inputs and outputs
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2023-11-04 01:24:09 +01:00
Andreas Rheinhardt
2e2c28119f avfilter/vf_blend: Deduplicate outputs
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2023-11-04 01:24:09 +01:00
Andreas Rheinhardt
1d33a310df avfilter/vf_aspect: Deduplicate inputs
Also avoid using the avfilter-prefix for static objects.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2023-11-04 01:24:09 +01:00
Andreas Rheinhardt
a40f833bac avfilter/f_graphmonitor: Deduplicate outputs
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2023-11-04 01:24:09 +01:00
Andreas Rheinhardt
a02670ded7 avfilter/f_drawgraph: Deduplicate outputs
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2023-11-04 01:24:09 +01:00
Andreas Rheinhardt
5935423e1e avcodec/aactab: Deduplicate swb_offset_960 tabs
swb_offset_960_48 and swb_offset_960_32 coincide.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2023-11-04 01:24:09 +01:00
Michael Niedermayer
4fb9d94688 avformat/lafdec: Check for 0 parameters
Fixes: Timeout
Fixes: 63661/clusterfuzz-testcase-minimized-ffmpeg_dem_LAF_fuzzer-6615365234589696

Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Reviewed-by: Sean McGovern <gseanmcg@gmail.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2023-11-03 22:16:33 +01:00
Michael Niedermayer
03a4aa9699 avcodec/flicvideo: consider width in copy loops
Fixes: out of array write
Fixes: 63520/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_FLIC_fuzzer-4876198087622656
Regression since: c7f8d42c12582b0626ea38117df6c9aea9fcf5b1 (was not posted to ffmpeg-devel)

Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Reviewed-by: Sean McGovern <gseanmcg@gmail.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2023-11-03 22:16:33 +01:00
Michael Niedermayer
c0a18e884c avfilter/buffersink: fix order of operation with = and <0
Reviewed-by: Sean McGovern <gseanmcg@gmail.com>
Reviewed-by: Nicolas George <george@nsup.org>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2023-11-03 22:17:18 +01:00
Michael Niedermayer
9450a4a7fe avfilter/framesync: fix order of operation with = and <0
Reviewed-by: Sean McGovern <gseanmcg@gmail.com>
Reviewed-by: Nicolas George <george@nsup.org>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2023-11-03 22:16:33 +01:00
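
The precedence pitfall named in the two "order of operation" fixes above can be illustrated with a minimal sketch. The helper `fake_call()` and both wrapper functions are hypothetical and are not taken from buffersink or framesync; they only show why `ret = f() < 0` stores the result of the comparison rather than the error code.

```c
/* Hypothetical helper standing in for any call that returns a negative
 * error code on failure. Not an FFmpeg function. */
static int fake_call(int result)
{
    return result;
}

/* Buggy form: '<' binds tighter than '=', so this is parsed as
 * ret = (fake_call(result) < 0) and ret becomes 0 or 1. */
static int store_buggy(int result)
{
    int ret;
    ret = fake_call(result) < 0;
    return ret;
}

/* Fixed form: the extra parentheses make the assignment happen first,
 * so ret holds the value returned by the call. */
static int store_fixed(int result)
{
    int ret;
    if ((ret = fake_call(result)) < 0)
        return ret;   /* the original negative error code */
    return 0;
}
```

With a failing call that returns -22, `store_buggy()` yields 1 while `store_fixed()` yields -22, so only the fixed form lets the caller see the actual error.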
Andreas Rheinhardt
155f0c8ef7 avformat/webpenc: Check seeks
Addresses the issue reported in ticket #4609.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2023-11-03 14:47:57 +01:00
Zhao Zhili
e920a84801 mailmap: remap my email accounts
Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
2023-11-03 20:57:49 +08:00
Reimar Döffinger
9dd49c8b52 libavutil/log.c: only include valgrind header when used.
This is cleaner, but it is also a workaround for when
the header exists, but cannot be compiled.
This will happen when the compiler has no inline asm
support.
Possibly the configure check should be improved as well.
2023-11-02 21:03:43 +01:00
Reimar Döffinger
0ea184fc39 libavutil/aarch64/cpu.c: HWCAPS requires inline asm support.
Fixes compilation with tcc, which does not have aarch64
inline asm support.
2023-11-02 21:03:43 +01:00
Reimar Döffinger
a31992634f configure: fix _Pragma check.
The test can currently pass when _Pragma is not supported, since
_Pragma might be treated as an implicitly declared function.
This happens e.g. with tinycc.
Extending the check to 2 pragmas both matches the actual use
better and avoids this misdetection.
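
A minimal sketch of why two consecutive pragmas make a stricter probe, assuming the probe body is placed inside a function and followed by a semicolon. The function name and the pragma strings are illustrative and are not necessarily the ones used by configure.

```c
/* If _Pragma is a real operator, each use below is replaced by a pragma
 * directive and only an empty statement remains before the return.
 * A compiler that instead treats _Pragma("...") as a call to an
 * implicitly declared function sees two call expressions in a row with
 * nothing between them, which is a syntax error, so the probe fails.
 * A single _Pragma("...") followed by ';' would have parsed as one
 * ordinary function call and let the probe pass by mistake. */
int pragma_probe(void)
{
    _Pragma("GCC diagnostic push") _Pragma("GCC diagnostic pop");
    return 0;
}
```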
2023-11-02 21:03:43 +01:00
Andreas Rheinhardt
02064ba3a3 fftools/ffmpeg_mux_init: Restrict disabling automatic copying of metadata
Fixes ticket #10638 (and should also fix ticket #10482)
by restoring the behaviour from before
3c7dd5ed37da6d2de06c4850de5a319ca9cdd47f.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2023-11-02 13:24:21 +01:00
zheng qian
4dbfb52230 doc/decoders: correctly note an option's default in libaribcaption
The `-caption_encoding` option was reported as having a default value of
'ass', whereas it's actually 'auto'.

Signed-off-by: zheng qian <xqq@xqq.im>
Signed-off-by: Gyan Doshi <ffmpeg@gyani.pro>
2023-11-02 14:07:00 +05:30
Rémi Denis-Courmont
d06fd18f8f lavc/sbrdsp: R-V V neg_odd_64
With 128-bit vectors, this is mostly pointless but also harmless.
Performance gains should be more noticeable with larger vector sizes.

neg_odd_64_c:       76.2
neg_odd_64_rvv_i64: 74.7
2023-11-01 22:53:26 +02:00
Rémi Denis-Courmont
b0aba7dd0c lavc/sbrdsp: R-V V sum_square
sum_square_c:       803.5
sum_square_rvv_f32: 283.2
2023-11-01 22:53:26 +02:00
Rémi Denis-Courmont
86bee42473 lavc/sbrdsp: R-V V sum64x5
sum64x5_c:       385.0
sum64x5_rvv_f32: 116.0
2023-11-01 22:53:26 +02:00
Andreas Rheinhardt
eba73142ad avcodec/vp9: Join extradata buffer pools
Up until now each thread had its own buffer pool for extradata
buffers when using frame-threading. Each thread can have at most
three references to extradata and in the long run, each thread's
bufferpool seems to fill up with three entries. But given
that at any given time there can be at most 2 + number of threads
entries used (the oldest thread can have two references to preceding
frames that are not currently decoded and each thread has its own
current frame, but there can be no references to any other frames),
this is wasteful. This commit therefore uses a single buffer pool
that is synced across threads.
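
The bound argued above can be written out as a small worked example. This only restates the arithmetic of the commit message; the function names are hypothetical and nothing here is taken from the VP9 decoder.

```c
/* Entries retained in the long run when every frame thread owns its own
 * pool and each pool fills up with three entries. */
static int per_thread_pool_entries(int nb_threads)
{
    return 3 * nb_threads;
}

/* Upper bound on entries in use at any time with one shared pool: two
 * references held by the oldest thread to preceding frames, plus one
 * current frame per thread. */
static int shared_pool_entries(int nb_threads)
{
    return 2 + nb_threads;
}
```

With 8 threads this gives 24 retained entries for per-thread pools against at most 10 in use for a shared pool.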

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2023-11-01 20:16:02 +01:00
Andreas Rheinhardt
0c44f63b02 avcodec/refstruct: Allow to share pools
To do this, make FFRefStructPool itself refcounted according
to the RefStruct API.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2023-11-01 20:15:54 +01:00
Andreas Rheinhardt
92abc7266b avcodec/vaapi_encode: Use RefStruct pool API, stop abusing AVBuffer API
Up until now, the VAAPI encoder uses fake data with the
AVBuffer-API: The data pointer does not point to real memory,
but is instead just a VABufferID converted to a pointer.
This has probably been copied from the VAAPI-hwcontext-API
(which presumably does it to avoid allocations).

This commit changes this without causing additional allocations
by switching to the RefStruct-pool API. This also fixes an
unchecked av_buffer_ref().

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2023-11-01 20:14:22 +01:00
Andreas Rheinhardt
8c0350f57e avcodec/vp9: Use RefStruct-pool API for extradata
It avoids allocations and corresponding error checks.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2023-11-01 20:14:06 +01:00
Andreas Rheinhardt
090d9956fd avcodec/refstruct: Allow to always return zeroed pool entries
This is in preparation for the following commit.

Reviewed-by: Anton Khirnov <anton@khirnov.net>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2023-11-01 20:13:40 +01:00
Andreas Rheinhardt
e01e30ede1 avcodec/nvdec: Use RefStruct-pool API for decoder pool
It involves fewer allocations, in particular no allocations
after the entry has been created. Therefore creating a new
reference from an existing one can't fail and therefore
need not be checked. It also avoids indirections and casts.

Also note that nvdec_decoder_frame_init() (the callback
to initialize new entries from the pool) does not use
atomics to read and replace the number of entries
currently used by the pool. This relies on nvdec (like
most other hwaccels) not being run in a truly frame-threaded
way.

Tested-by: Timo Rothenpieler <timo@rothenpieler.org>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2023-11-01 20:13:01 +01:00
Andreas Rheinhardt
fd2e65871c avcodec/hevcdec: Use RefStruct-pool API instead of AVBufferPool API
It involves fewer allocations and therefore has the nice property
that deriving a reference from a reference can't fail,
simplifying hevc_ref_frame().

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2023-11-01 20:10:20 +01:00
Andreas Rheinhardt
736b510fcc avcodec/h264dec: Use RefStruct-pool API instead of AVBufferPool API
It involves fewer allocations and therefore has the nice property
that deriving a reference from a reference can't fail.
This allows for considerable simplifications in
ff_h264_(ref|replace)_picture().
Switching to the RefStruct API also allows making H264Picture
smaller, because some AVBufferRef* pointers could be removed
without replacement.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2023-11-01 20:07:56 +01:00
Andreas Rheinhardt
26c0a7321f avcodec/refstruct: Add RefStruct pool API
Very similar to the AVBufferPool API, but with some differences:
1. Reusing an already existing entry does not incur an allocation
at all any more (the AVBufferPool API needs to allocate an AVBufferRef).
2. The tasks done while holding the lock are smaller; e.g.
allocating new entries is now performed without holding the lock.
The same goes for freeing.
3. The entries are freed as soon as possible (the AVBufferPool API
frees them in two batches: The first in av_buffer_pool_uninit() and
the second immediately before the pool is freed when the last
outstanding entry is returned to the pool).
4. The API is designed for objects and not naked buffers and
therefore has a reset callback. This is called whenever an object
is returned to the pool.
5. Just like with the RefStruct API, custom allocators are not
supported.

(If desired, the FFRefStructPool struct itself could be made
reference counted via the RefStruct API; an FFRefStructPool
would then be freed via ff_refstruct_unref().)
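
Points 1 and 4 of the list above (reuse without allocation, and a reset callback run when an object is returned) can be sketched with a toy pool. Everything below is hypothetical: `ToyPool`, `toy_pool_get()`, `toy_pool_put()` and `toy_pool_uninit()` are not the FFmpeg RefStruct pool API, and the sketch is single-threaded and not reference counted.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Header placed in front of every object. The union with max_align_t
 * keeps the object stored behind the header suitably aligned. */
typedef union ToyEntry {
    union ToyEntry *next;    /* free-list link, only used while idle */
    max_align_t     align;
} ToyEntry;

typedef struct ToyPool {
    size_t    size;                /* size of one user object */
    void    (*reset)(void *obj);   /* run when an object is returned */
    ToyEntry *free_list;           /* idle entries available for reuse */
} ToyPool;

/* Get an object: reuse an idle entry if there is one (no allocation),
 * otherwise allocate a zeroed header plus object. */
static void *toy_pool_get(ToyPool *pool)
{
    ToyEntry *e = pool->free_list;
    if (e)
        pool->free_list = e->next;
    else if (!(e = calloc(1, sizeof(*e) + pool->size)))
        return NULL;
    return e + 1;
}

/* Return an object: run the reset callback, then put the entry back on
 * the free list so that the next get can reuse it. */
static void toy_pool_put(ToyPool *pool, void *obj)
{
    ToyEntry *e;
    assert(pool && obj);
    e = (ToyEntry *)obj - 1;
    if (pool->reset)
        pool->reset(obj);
    e->next = pool->free_list;
    pool->free_list = e;
}

/* Free all idle entries. Objects still in use are not tracked here. */
static void toy_pool_uninit(ToyPool *pool)
{
    while (pool->free_list) {
        ToyEntry *e = pool->free_list;
        pool->free_list = e->next;
        free(e);
    }
}

/* Example reset callback for a pool of ints. */
static void reset_counter(void *obj)
{
    *(int *)obj = 0;
}
```

A pool of ints would be set up as `ToyPool pool = { sizeof(int), reset_counter, NULL };`. An object that is returned and requested again comes back at the same address with its value reset to 0. Locking and reference counting, which the commit message discusses, are deliberately left out to keep the sketch short.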

Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2023-11-01 20:07:23 +01:00
Rémi Denis-Courmont
92bcc6703a lavc/pixblockdsp: remove R-V V get_pixels_16
In the aligned case, the existing RVI assembler is actually much
faster. In the unaligned case, there is nothing much to gain over C.
2023-11-01 19:27:22 +02:00