1
0
mirror of https://github.com/FFmpeg/FFmpeg.git synced 2024-11-26 19:01:44 +02:00
Commit Graph

116556 Commits

Author SHA1 Message Date
Fei Wang
eab4a9e9f8 lavu/hwcontext_qsv: Use vendor id to create device
New kernel driver "xe" will be supported from Lunar Lake instead of
"i915".

"xe" kernel driver:
https://github.com/torvalds/linux/tree/master/drivers/gpu/drm/xe

Signed-off-by: Fei Wang <fei.w.wang@intel.com>
2024-08-09 13:40:26 +08:00
Fei Wang
dbd74ba3c8 lavu/hwcontext_vaapi: Add option to allow to specify vendor id when init hw device
Vendor id will help to select desired device in case of kernel driver is
unknow or unsupported, for vendor may support different kernel driver on
different platforms.

Signed-off-by: Fei Wang <fei.w.wang@intel.com>
2024-08-09 13:40:24 +08:00
Michael Niedermayer
c390234da2
avformat/wtvdec: Check length of read mpeg2_descriptor
Fixes: Use of uninitialized value
Fixes: 70900/clusterfuzz-testcase-minimized-ffmpeg_dem_WTV_fuzzer-6286909377150976

Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Reviewed-by: Peter Ross <pross@xvid.org>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2024-08-08 19:10:05 +02:00
Michael Niedermayer
c95ea03104
avformat/wtvdec: clear sectors
The code can leave uninitialized holes in the array.
Fixes: use of uninitialized values
Fixes: 70883/clusterfuzz-testcase-minimized-ffmpeg_dem_WTV_fuzzer-6698694567591936

Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Reviewed-by: Peter Ross <pross@xvid.org>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2024-08-08 18:24:31 +02:00
Kacper Michajłow
b534e402d8
avformat/mov: ensure required number of bytes is read
Fixes: use-of-uninitialized-value

Found by OSS-Fuzz.

Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2024-08-08 18:23:39 +02:00
Michael Niedermayer
d2a25dc2bf
add tools/target_swr_fuzzer
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2024-08-08 15:26:52 +02:00
James Almer
94165d1b79 avformat/iamf: use aligned intreadwrite macros where possible
Signed-off-by: James Almer <jamrial@gmail.com>
2024-08-07 00:16:21 -03:00
James Almer
49a6e448d7 avformat/movenc: use stream indexes when generating track ids
In some scenarios nb_tracks isn't the same as nb_streams, so a given id may end
up being used for two separate streams.

e.g. when muxing an IAMF track followed by a video track, if the IAMF track
consists of several streams, the video track would end up having an id of 2,
which may also be used by one of the IAMF streams.

Signed-off-by: James Almer <jamrial@gmail.com>
2024-08-07 00:16:21 -03:00
James Almer
210740b4ed avutil/frame: use the maximum compile time supported alignment for strides
This puts lavu frame buffer allocator helpers in sync with lavc's decoder frame
buffer allocator's STRIDE_ALIGN define.

Remove the comment about av_cpu_max_align() while at it as using it is not
ideal when CPU flags can be changed mid process.

Should fix ticket #11116.

Signed-off-by: James Almer <jamrial@gmail.com>
2024-08-07 00:16:21 -03:00
Kacper Michajłow
792a9979eb
avformat/rtpproto: free ip filters on open error
Found by OSS-Fuzz.

Signed-off-by: Kacper Michajłow <kasper93@gmail.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2024-08-07 00:59:19 +02:00
Kacper Michajłow
8485f7a378
avformat/srtpproto: pass options to nested protocol
This fixes passing options dict.

Fixes some timeouts found by OSS-Fuzz.

Signed-off-by: Kacper Michajłow <kasper93@gmail.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2024-08-07 00:59:19 +02:00
Kacper Michajłow
9876158ee2
avcodec/wmavoice: use av_clipd for double values
Fixes Clang warning.

Signed-off-by: Kacper Michajłow <kasper93@gmail.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2024-08-07 00:59:18 +02:00
Kacper Michajłow
1165c14444
avcodec/vp9mvs: fix misaligned access when clearing VP9mv
Fixes runtime error: member access within misaligned address
<addr> for type 'av_alias64', which requires 8 byte alignment.

VP9mv is aligned to 4 bytes, so instead doing 8 bytes clear, let's do
2 times 4 bytes.

Signed-off-by: Kacper Michajłow <kasper93@gmail.com>
Reviewed-by: "Ronald S. Bultje" <rsbultje@gmail.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2024-08-07 00:59:18 +02:00
Andreas Rheinhardt
bfcee368e2 avcodec/cbs_sei: Always zero-initialize SEI payload
Fixes: Use-of-uninitialized value
Fixes: clusterfuzz-testcase-minimized-ffmpeg_BSF_H264_METADATA_fuzzer-5458626041413632

Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2024-08-06 20:25:23 +02:00
Kacper Michajłow
5dfc0cc841
avcodec/parser: ensure input padding is zeroed
Fixes use of uninitialized value, reported by MSAN.

Found by OSS-Fuzz.

Signed-off-by: Kacper Michajłow <kasper93@gmail.com>

Fixes: 70852/clusterfuzz-testcase-minimized-ffmpeg_IO_DEMUXER_fuzzer-5179190066872320
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2024-08-05 23:17:46 +02:00
Kacper Michajłow
2b5f000d3f
avformat/jpegxl_anim_dec: ensure input padding is zeroed
Fixes use of uninitialized value, reported by MSAN.

Found by OSS-Fuzz.

Signed-off-by: Kacper Michajłow <kasper93@gmail.com>

Fixes: 70837/clusterfuzz-testcase-minimized-ffmpeg_dem_JPEGXL_ANIM_fuzzer-5089407768526848
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2024-08-05 23:17:46 +02:00
Michael Niedermayer
3978e81809
avformat/img2dec: Clear padding data after EOF
Fixes: use-of-uninitialized-value
Fixes: 70852/clusterfuzz-testcase-minimized-ffmpeg_IO_DEMUXER_fuzzer-5179190066872320

Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Reviewed-by: Kacper Michajlow <kasper93@gmail.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2024-08-05 23:17:46 +02:00
Michael Niedermayer
79a1cf30d1
avformat/wavdec: Check if there are 16 bytes before testing them
Fixes: use-of-uninitialized-value
Fixes: 70839/clusterfuzz-testcase-minimized-ffmpeg_dem_W64_fuzzer-5212907590189056

Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2024-08-05 23:17:45 +02:00
Rémi Denis-Courmont
e0f9f4d491 lavu/cpu: deprecate RISC-V F, D and zba CPU flags 2024-08-05 21:16:26 +03:00
Rémi Denis-Courmont
d1326b6347 lavu/riscv: drop probing for zba CPU capability 2024-08-05 21:16:26 +03:00
Rémi Denis-Courmont
210877c5fd sws/riscv: depend on RVB and simplify accordingly 2024-08-05 21:16:26 +03:00
Rémi Denis-Courmont
f30e5bf1f5 lavfi/riscv: depend on RVB and simplify accordingly 2024-08-05 21:16:26 +03:00
Rémi Denis-Courmont
616fdeaea3 lavc/riscv: depend on RVB and simplify accordingly
There is no known (real) hardware with V and without the complete B
extension. B was indeed required in the RISC-V application profile from
2022, earlier than V. There should not be any relevant hardware in the
future either.

In practice, different R-V Vector optimisations in FFmpeg already depend on
every constituent of the B extension anyhow, so it would not work well.
2024-08-05 21:16:26 +03:00
Rémi Denis-Courmont
cb31f17ca8 lavu/riscv: depend on RVB and simplify accordingly 2024-08-05 21:16:26 +03:00
Nathan E. Egge
ba88e8174a lavu: Set default FF_TIMER_UNITS to "ns"
Signed-off-by: Nathan E. Egge <unlord@xiph.org>
Signed-off-by: Rémi Denis-Courmont <remi@remlab.net>
2024-08-05 21:16:26 +03:00
Rémi Denis-Courmont
4edfc11a28 lavc/h264dsp: R-V V idct4_add8 (all depths)
These are really just wrappers for idct4_add16intra functions, which are in
turn mostly wrappers for idct4_add and idct4_dc_add functions.

For benchmarks refer to the later two sets.
2024-08-05 21:16:26 +03:00
James Almer
eb3cc508d8 fate/mov: add an IAMF+video muxing test
Signed-off-by: James Almer <jamrial@gmail.com>
2024-08-04 12:09:40 -03:00
James Almer
5b87869c09 avformat/mov: fix track handling when mixing IAMF and video tracks
Fixes crashes when muxing the two together.

Signed-off-by: James Almer <jamrial@gmail.com>
2024-08-04 12:09:40 -03:00
Timo Rothenpieler
9a2171318d avcodec/nvenc: fix signedness of timing fields 2024-08-03 20:04:31 +02:00
James Almer
4a56b5f3d8 avcodec/cbs_h265: don't attempt to read 0 length elements in sei_3d_reference_displays_info
Fixes: 70458/clusterfuzz-testcase-minimized-ffmpeg_BSF_TRACE_HEADERS_fuzzer-5259339779080192
Fixes: Assertion width > 0 && width <= 32 failed at libavcodec/cbs.c:608

Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: James Almer <jamrial@gmail.com>
2024-08-03 11:59:14 -03:00
Rémi Denis-Courmont
de7f999481 lavc/videodsp: work-around LLVM-as
For some reason, it can't handle the normal syntax for an address operand
without an offset, so add a dummy zero offset.
2024-08-02 21:24:01 +03:00
Rémi Denis-Courmont
677f28b310 lavc/h264dsp: stick R-V V weight to 16-bit precision
T-Head C908 (ns):
h264_weight2_8_c:        1607.8
h264_weight2_8_rvv_i32:   515.0 (before)
h264_weight2_8_rvv_i32:   348.5 (after)
h264_weight4_8_c:        2255.8
h264_weight4_8_rvv_i32:  1015.0 (before)
h264_weight4_8_rvv_i32:   691.0 (after)
h264_weight8_8_c:        3857.5
h264_weight8_8_rvv_i32:  2218.8 (before)
h264_weight8_8_rvv_i32:  1561.3 (after)
h264_weight16_8_c:       7431.5
h264_weight16_8_rvv_i32: 2737.3 (before)
h264_weight16_8_rvv_i32: 1848.3 (after)

SpacemiT X60 (ns):
h264_weight2_8_c:        1624.1
h264_weight2_8_rvv_i32:   352.6 (before)
h264_weight2_8_rvv_i32:   259.3 (after)
h264_weight4_8_c:        2259.3
h264_weight4_8_rvv_i32:   685.8 (before)
h264_weight4_8_rvv_i32:   530.3 (after)
h264_weight8_8_c:        4103.3
h264_weight8_8_rvv_i32:  1581.8 (before)
h264_weight8_8_rvv_i32:  1238.6 (after)
h264_weight16_8_c:       7624.3
h264_weight16_8_rvv_i32: 2738.1 (before)
h264_weight16_8_rvv_i32: 1853.3 (after)
2024-08-02 21:24:01 +03:00
Rémi Denis-Courmont
afd45c7ff7 lavc/h264dsp: stick R-V V biweight to 16-bit
T-Head C908 (ns):
h264_biweight2_8_c:        2414.5
h264_biweight2_8_rvv_i32:   701.8 (before)
h264_biweight2_8_rvv_i32:   468.5 (after)
h264_biweight4_8_c:        4655.3
h264_biweight4_8_rvv_i32:  1377.5 (before)
h264_biweight4_8_rvv_i32:   931.8 (after)
h264_biweight8_8_c:        9701.5
h264_biweight8_8_rvv_i32:  2896.0 (before)
h264_biweight8_8_rvv_i32:  2070.5 (after)
h264_biweight16_8_c:      18025.0
h264_biweight16_8_rvv_i32: 3460.8 (before)
h264_biweight16_8_rvv_i32: 1978.0 (after)

SpacemiT X60 (ns):
h264_biweight2_8_c:        2415.5
h264_biweight2_8_rvv_i32:   478.2 (before)
h264_biweight2_8_rvv_i32:   362.8 (after)
h264_biweight4_8_c:        4655.3
h264_biweight4_8_rvv_i32:   946.7 (before)
h264_biweight4_8_rvv_i32:   727.3 (after)
h264_biweight8_8_c:        9061.8
h264_biweight8_8_rvv_i32:  2071.7 (before)
h264_biweight8_8_rvv_i32:  1685.8 (after)
h264_biweight16_8_c:      18020.5
h264_biweight16_8_rvv_i32: 3457.2 (before)
h264_biweight16_8_rvv_i32: 1935.8 (after)
2024-08-02 21:24:01 +03:00
Zhao Zhili
670ff6c7ce avcodec/nvenc: rework on DTS generation
Before the patch, the method to generate DTS only works with
timebase equal to 1/fps. With timebase like 1/1000

./ffmpeg -i foo.mp4 -an -c:v h264_nvenc -enc_time_base 1/1000 bar.mp4

pts 0    dts -3
pts 160  dts 37
pts 80   dts 77
pts 40   dts 117 <-- invalid
pts 120  dts 157
pts 320  dts 197
pts 240  dts 237
pts 200  dts 277 <-- invalid
pts 280  dts 317 <-- invalid

The generated DTS can be larger than PTS, since it only reorder the
input PTS and minus the number of frame delay, which doesn't take
timebase into account. It should minus the "time" of frame delay.

9a245bd trying to fix the issue, but the implementation is incomplete,
which only use time_base.num. Then it got reverted by ac7c265b33.

After this patch:

pts 0    dts -120
pts 160  dts -80
pts 80   dts -40
pts 40   dts 0
pts 120  dts 40
pts 320  dts 80
pts 240  dts 120
pts 200  dts 160
pts 280  dts 200

Signed-off-by: Timo Rothenpieler <timo@rothenpieler.org>
2024-08-02 17:57:19 +02:00
Roman Arzumanyan
bcea693f75 avcodec/cuviddec: more accurately guess probed sw pixel format
Signed-off-by: Timo Rothenpieler <timo@rothenpieler.org>
2024-08-02 17:38:46 +02:00
Gnattu OC
d50f9701b6 avutil/hwcontext_videotoolbox: Correctly set trc
The color trc key was assigned a color primaries value which causes
the resulting colorspace is always SDR.

Fixes #10884.

Signed-off-by: Gnattu OC <gnattuoc@me.com>
Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
2024-08-02 10:24:09 +08:00
Rémi Denis-Courmont
1b2a925e94 lavc/riscv: drop probing for F & D extensions
F and D extensions are included in all RISC-V application profiles ever
made (so starting from RV64GC a.k.a. RVA20). Realistically they need to be
selected at compilation time.

Currently, there are no consumers for these two flags. If there is ever a
need to reintroduce F- or D-specific optimisations, we can always use
__riscv_f or __riscv_d compiler predefined macros respectively.
2024-08-01 22:56:50 +03:00
Rémi Denis-Courmont
2f083fd581 lavc/audiodsp: drop R-V F vector_clipf
This is now firmly slower than C.

SiFive-U74 (cycles):
audiodsp.vector_clipf_c:   31.2
audiodsp.vector_clipf_rvf: 39.5
2024-08-01 19:29:40 +03:00
Rémi Denis-Courmont
c48213b2dc lavc/audiodsp: drop opposite sign optimisation
This was added along side the original SSE(one) DSP function in
0a68cd876e without rationale. This was
presumably faster on x87, which is no longer relevant since we pretty
much assume SSE2 or later on x86.

Meanwhile this function is ~2.5x slower than the normal floating point
one on SiFive-U74.
2024-08-01 19:29:40 +03:00
Rémi Denis-Courmont
d86b6767ce lavc/audiodsp: properly unroll vector_clipf
Given that source and destination can alias, the compiler was forced to
perform each read-modify-write sequentially. We cannot use the `restrict`
qualifier to avoid this here because the AC-3 encoder uses the function
in-place. Instead this commit provides an explicit guarantee to the
compiler that batches of 8 elements will not overlap, so that it can
interleave calculations.

In practice contemporary optimising compilers are able to unroll and keep
the temporary array in FPU registers (without spilling).

On SiFive-U74, this speeds the same signs branch by 4x, and the
opposite signs branch 1.5x.
2024-08-01 19:29:40 +03:00
Rémi Denis-Courmont
d527d23872 lavc/pixblockdsp: specialise aligned 16-bit get_pixels
The current code assumes that we have unaligned rows, which hurts on
platforms with slower unaligned accesses. (Also, this lets the compiler
unroll manually, which it seems to do in practice.)
2024-08-01 18:44:01 +03:00
Rémi Denis-Courmont
656a9664bf checkasm/riscv: preserve T1 whilst calling...
This preserves T1 whilst calling the instrumented function. In a Sci-Fi
setting where type-based Control Flow Integrity (CFI) is supported, the
calling code (i.e., the `checkasm` test case) will set T1 to the expected
value of the landing pad label (LPL) of the instrumented function.

The call wrapper will always use LPL zero which is a wild card. We should
preserve the value of T1 at least until the indirect call to the
instrumented function. Of course this is Sci-Fi, because:
1) there is no hardware (or even QEMU) support yet,
2) all our assembler functions currently use LPL zero anyway.

This uses T3 rather than T2 because indirect branches with T2 is reserved
for notionally direct calls made with an indirect call instruction (e.g.
due to GOT indirection), and are exempted from forward-edge CFI checks.
2024-08-01 18:44:01 +03:00
Rémi Denis-Courmont
54b1970c60 lavu/riscv: fix return type 2024-08-01 18:44:01 +03:00
Rémi Denis-Courmont
54ae270213 lavc/rv34dsp: use saturating add/sub for R-V V DC add
T-Head C908 (cycles):
rv34_idct_dc_add_c:      113.2
rv34_idct_dc_add_rvv_i32: 48.5 (before)
rv34_idct_dc_add_rvv_i32: 39.5 (after)
2024-08-01 18:43:04 +03:00
Rémi Denis-Courmont
952b426f3b lavc/bswapdsp: add RV Zvbb bswap16 and bswap32 2024-08-01 18:43:04 +03:00
James Almer
f4daf633b2 avcodec/aacps_tablegen_template: don't redefine CONFIG_HARDCODED_TABLES
Fixes relevant warnings when compiling with --enable-hardcoded-tables

Signed-off-by: James Almer <jamrial@gmail.com>
2024-08-01 12:13:53 -03:00
James Almer
6f8e365a2a avutil/hwcontext_vaapi: use the correct type for VASurfaceAttribExternalBuffers.buffers
Should fix ticket #11115.

Signed-off-by: James Almer <jamrial@gmail.com>
2024-08-01 12:13:53 -03:00
Marvin Scholz
ca7fcf5089 avutil/hwcontext_videotoolbox: Fix build with older SDKs
The previous fix was not sufficient.
To make things easier to reason about, split the function and
add the guards there instead of complicating the call site more.

Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
2024-08-01 20:58:27 +08:00
Anton Khirnov
bcf08c1171 lavc/ffv1: change FFV1SliceContext.plane into a RefStruct object
Frame threading in the FFV1 decoder works in a very unusual way - the
state that needs to be propagated from the previous frame is not decoded
pixels(¹), but each slice's entropy coder state after decoding the slice.

For that purpose, the decoder's update_thread_context() callback stores
a pointer to the previous frame thread's private data. Then, when
decoding each slice, the frame thread uses the standard progress
mechanism to wait for the corresponding slice in the previous frame to
be completed, then copies the entropy coder state from the
previously-stored pointer.

This approach is highly dubious, as update_thread_context() should be
the only point where frame-thread contexts come into direct contact.
There are no guarantees that the stored pointer will be valid at all, or
will contain any particular data after update_thread_context() finishes.

More specifically, this code can break due to the fact that keyframes
reset entropy coder state and thus do not need to wait for the previous
frame. As an example, consider a decoder process with 2 frame threads -
thread 0 with its context 0, and thread 1 with context 1 - decoding a
previous frame P, current frame F, followed by a keyframe K. Then
consider concurrent execution consistent with the following sequence of
events:
* thread 0 starts decoding P
* thread 0 reads P's slice header, then calls
  ff_thread_finish_setup() allowing next frame thread to start
* main thread calls update_thread_context() to transfer state from
  context 0 to context 1; context 1 stores a pointer to context 0's private
  data
* thread 1 starts decoding F
* thread 1 reads F's slice header, then calls
  ff_thread_finish_setup() allowing the next frame thread to start
  decoding
* thread 0 finishes decoding P
* thread 0 starts decoding K; since K is a keyframe, it does not
  wait for F and reallocates the arrays holding entropy coder state
* thread 0 finishes decoding K
* thread 1 reads entropy coder state from its stored pointer to context
  0, however it finds state from K rather than from P

This execution is currently prevented by special-casing FFV1 in the
generic frame threading code, however that is supremely ugly. It also
involves unnecessary copies of the state arrays, when in fact they can
only be used by one thread at a time.

This commit addresses these deficiencies by changing the array of
PlaneContext (each of which contains the allocated state arrays)
embedded in FFV1SliceContext into a RefStruct object. This object can
then be propagated across frame threads in standard manner. Since the
code structure guarantees only one thread accesses it at a time, no
copies are necessary. It is also re-created for keyframes, solving the
above issue cleanly.

Special-casing of FFV1 in the generic frame threading code will be
removed in a later commit.

(¹) except in the case of a damaged slice, when previous frame's pixels
    are used directly
2024-08-01 10:09:26 +02:00
Anton Khirnov
c335218a81 lavc/ffv1dec: inline copy_fields() into update_thread_context()
It is now only called from a single place, so there is no point in it
being a separate function.
2024-08-01 10:09:26 +02:00