1
0
mirror of https://github.com/FFmpeg/FFmpeg.git synced 2024-12-12 19:18:44 +02:00
Commit Graph

107804 Commits

Author SHA1 Message Date
J. Dekker
ea6ecb12aa checkasm/hevc_add_res: add 12bit test
Also fix the bug where in every other byte only the lower 2 bits were
used in the 8bit test.

Signed-off-by: J. Dekker <jdek@itanimul.li>
2022-08-16 14:00:34 +02:00
Swinney, Jonathan
0d7caa5b09 swscale/aarch64: add vscale specializations
This commit adds new code paths for vscale when filterSize is 2, 4, or
8. By using specialized code with unrolling to match the filterSize we
can improve performance.

On AWS c7g (Graviton 3, Neoverse V1) instances:
                                 before   after
yuv2yuvX_2_0_512_accurate_neon:  558.8    268.9
yuv2yuvX_4_0_512_accurate_neon:  637.5    434.9
yuv2yuvX_8_0_512_accurate_neon:  1144.8   806.2
yuv2yuvX_16_0_512_accurate_neon: 2080.5   1853.7

Signed-off-by: Jonathan Swinney <jswinney@amazon.com>
Signed-off-by: Martin Storsjö <martin@martin.st>
2022-08-16 13:40:42 +03:00
Swinney, Jonathan
3e708722a2 swscale/aarch64: vscale optimization
Use scalar times vector multiply accumlate instructions instead of
vector times vector to remove the need for replicating load instructions
which are slightly slower.

On AWS c7g (Graviton 3, Neoverse V1) instances:
yuv2yuvX_8_0_512_accurate_neon:  1144.8  987.4
yuv2yuvX_16_0_512_accurate_neon: 2080.5 1869.4

Signed-off-by: Jonathan Swinney <jswinney@amazon.com>
Signed-off-by: Martin Storsjö <martin@martin.st>
2022-08-16 13:40:42 +03:00
Swinney, Jonathan
4dcd191a50 checkasm: updated tests for sw_scale
Change the reference to exactly match the C reference in swscale,
instead of exactly matching the x86 SIMD implementations (which
differs slightly). Test with and without SWS_ACCURATE_RND - if this
flag isn't set, the output must match the C reference exactly,
otherwise it is allowed to be off by 2.

Mark a couple x86 functions as unavailable when SWS_ACCURATE_RND
is set - apparently this discrepancy hasn't been noticed in other
exact tests before.

Add a test for yuv2plane1.

Signed-off-by: Jonathan Swinney <jswinney@amazon.com>
Signed-off-by: Martin Storsjö <martin@martin.st>
2022-08-16 13:40:42 +03:00
Timo Rothenpieler
317f5252c0 doc/APIchanges: add missing rgbaf16 pixfmt entry 2022-08-16 12:31:03 +02:00
Anton Khirnov
ab31473830 fftools/ffmpeg: store a separate copy of input codec parameters
Use it instead of AVStream.codecpar in the main thread. While
AVStream.codecpar is documented to only be updated when the stream is
added or avformat_find_stream_info(), it is actually updated during
demuxing. Accessing it from a different thread then constitutes a race.

Ideally, some mechanism should eventually be provided for signalling
parameter updates to the user. Then the demuxing thread could pick up
the changes and propagate them to the decoder.
2022-08-16 11:09:09 +02:00
Swinney, Jonathan
75ffca7eef libswscale/aarch64: add another hscale specialization
This specialization handles the case where filtersize is 4 mod 8, e.g.
12, 20, etc. Aarch64 was previously using the c function for this case.
This implementation speeds up that case significantly.

hscale_8_to_15__fs_12_dstW_512_c: 6234.1
hscale_8_to_15__fs_12_dstW_512_neon: 1505.6

Signed-off-by: Jonathan Swinney <jswinney@amazon.com>
Signed-off-by: Martin Storsjö <martin@martin.st>
2022-08-16 12:08:38 +03:00
Zhao Zhili
1af7797d21 avformat/mov: fix encryption index in the case of multiple trun
frag_stream_info->index_entry isn't the first sample/trun index.
cenc.frag_index_entry_base failed to catch the case since
current_index > 0.

Fix ticket #9807.

Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
2022-08-16 18:47:40 +08:00
Zhao Zhili
98dcdd1868 avformat/mov: fix frag_index.current out of sync
frag_index.current is used by cenc_filter, and is updated inside
mov_read_moof. It can out of sync regarding to mov_read_packet.

Partly fix ticket #9807.

Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
2022-08-16 18:47:33 +08:00
Lynne
ae66a9db7b
lavu/tx: optimize and simplify inverse MDCTs
Convert the input from a scatter to a gather instead,
which is faster and better for SIMD.
Also, add a pre-shuffled exptab version to avoid
gathering there at all. This doubles the exptab size,
but the speedup makes it worth it. In SIMD, the
exptab will likely be purged to a higher cache
anyway because of the FFT in the middle, and
the amount of loads stays identical.

For a 960-point inverse MDCT, the speedup is 10%.

This makes it possible to write sane and fast SIMD
versions of inverse MDCTs.
2022-08-16 01:22:38 +02:00
Derek Buitenhuis
412922cc6f ipfsgateway: Remove default gateway
A gateway can see everything, and we should not be shipping a hardcoded
default from a third party company; it's a security risk.

Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
2022-08-15 20:09:13 +01:00
Andreas Rheinhardt
6789b73a81 avcodec/mpegvideo: Don't zero unnecessarily
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2022-08-15 18:19:19 +02:00
Andreas Rheinhardt
9703f5d87d avcodec/mpegvideodec: Constify some functions
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2022-08-15 18:19:19 +02:00
Andreas Rheinhardt
b645138a34 avcodec/mpegpicture: Don't copy unnecessarily, fix race
mpegvideo uses an array of Pictures and when it is done with using
them, it only unreferences them incompletely: Some buffers are kept
so that they can be reused lateron if the same slot in the Picture
array is reused, making this a sort of a bufferpool.
(Basically, a Picture is considered used if the AVFrame's buf is set.)
Yet given that other pieces of the decoder may have a reference to
these buffers, they need not be writable and are made writable using
av_buffer_make_writable() when preparing a new Picture. This involves
reading the buffer's data, although the old content of the buffer
need not be retained.

Worse, this read can be racy, because the buffer can be used by another
thread at the same time. This happens for Real Video 3 and 4.

This commit fixes this race by no longer copying the data;
instead the old buffer is replaced by a new, zero-allocated buffer.

(Here are the details of what happens with three or more decoding threads
when decoding rv30.rm from the FATE-suite as happens in the rv30 test:
The first decoding thread uses the first slot of its picture array
to store its current pic; update_thread_context copies this for the
second thread that decodes a P-frame. It uses the second slot in its
Picture array to store its P-frame. This arrangement is then copied
to the third decode thread, which decodes a B-frame. It uses the third
slot in its Picture array for its current frame.
update_thread_context copies this to the next thread. It unreferences
the third slot containing the other B-frame and then it reuses this
slot for its current frame. Because the pic array slots are only
incompletely unreferenced, the buffers of the previous B-frame are
still in there and they are not writable; in fact the previous
thread is concurrently writing to them, causing races when making
the buffer writable.)

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2022-08-15 18:10:31 +02:00
Andreas Rheinhardt
70f3035482 avcodec/avcodec: Remove redundant check
At this point active_thread_type is set iff active_thread_type
is set to FF_THREAD_FRAME iff AVCodecInternal.frame_thread_encoder
is set.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2022-08-15 18:10:31 +02:00
Andreas Rheinhardt
3040876833 avcodec/avcodec: Move initializing frame-thrd encoder to encode_preinit
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2022-08-15 18:10:31 +02:00
Timo Rothenpieler
c469c3c3b1 avfilter/vsrc_ddagrab: add options for more control over output format fallback 2022-08-13 15:22:14 +02:00
Timo Rothenpieler
6a574e3901 avfilter/vsrc_ddagrab: add rgbaf16 output support 2022-08-13 15:21:59 +02:00
Timo Rothenpieler
dd94a03468 avutil/hwcontext_d3d11va: add support for rgbaf16 pixel format 2022-08-13 15:21:59 +02:00
Timo Rothenpieler
e95b08a7dd lavu/pixfmt: add packed RGBA float16 format
This is the default format of the Windows compositor and what DXGI
Desktop Duplication will give you for any kind of HDR output.
2022-08-13 15:21:46 +02:00
Timo Rothenpieler
9ca3b8b7cd compat: add msvc windres wrapper
This is by no means a complete wrapper. It's only designed to fit the
usecase ffmpegs build system has.
2022-08-13 14:42:52 +02:00
Timo Rothenpieler
f85e0673c3 fftools: add DPI awareness manifest
Some filters, like gdigrab, rely on this to be set to see and report
proper dimensions.
2022-08-13 14:42:52 +02:00
Timo Rothenpieler
b77fff47d0 configure: always enable gnu_windres if available
Use the appropiate Makefile variable to ensure the resource file is
only built into shared libraries instead.
2022-08-13 14:42:36 +02:00
Anton Khirnov
6ded80af92 fftools/ffmpeg: move packet timestamp processing to demuxer thread
Discontinuity detection/correction is left in the main thread, as it is
entangled with InputStream.next_dts and related variables, which may be
set by decoding code.

Fixes races e.g. in fate-ffmpeg-streamloop after
aae9de0cb2.
2022-08-13 12:41:05 +02:00
Anton Khirnov
3b2beceae1 fftools/ffmpeg: use a separate variable for discontinuity offset
This will allow to move normal offset handling to demuxer thread, since
discontinuities currently have to be processed in the main thread, as
the code uses some decoder-produced values.
2022-08-13 12:41:05 +02:00
Anton Khirnov
ca38fe927e fftools/ffmpeg: simplify conditions in ts_discontinuity_process 2022-08-13 12:41:05 +02:00
Anton Khirnov
aa6d4a53e3 fftools/ffmpeg: move inter-stream ts discontinuity handling to ts_discontinuity_process() 2022-08-13 12:41:05 +02:00
Anton Khirnov
e2d784a5b7 fftools/ffmpeg: move timestamp discontinuity correction out of process_input() 2022-08-13 12:41:05 +02:00
Anton Khirnov
274c8d5882 fftools/ffmpeg: pre-compute the streamcopy start pts before transcoding starts
InputFile.ts_offset can change during transcoding, due to discontinuity
correction. This should not affect the streamcopy starting timestamp.

Cf. bf2590aed3
2022-08-13 12:41:05 +02:00
Anton Khirnov
86e9cef77b fftools/ffmpeg: move stream-dependent starttime correction to transcode_init()
Currently this code is located in the discontinuity handling block,
where it does not belong.
2022-08-13 12:41:05 +02:00
Anton Khirnov
ee2092ddec fftools/ffmpeg_mux: avoid leaking pkt on errors 2022-08-13 12:41:05 +02:00
Anton Khirnov
5d499d3250 fftools/ffmpeg: mark all encode sync queues as done before flushing encoders 2022-08-13 12:41:05 +02:00
Stephen Hutchinson
f6a36c7cf9 avformat/avisynth: add missing avs_release_video_frame
The AviSynth C API requires using avs_release_video_frame
whenever avs_get_frame has been used, but the recent addition
of frameprop reading to the demuxer was missing this in
avisynth_create_stream_video.

Signed-off-by: Stephen Hutchinson <qyot27@gmail.com>
2022-08-12 17:21:44 -04:00
Andreas Rheinhardt
c1b966a189 avcodec/mimic: Fix undefined pointer arithmetic
NULL + anything is UB.

Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2022-08-12 19:37:06 +02:00
Pierre-Anthony Lemieux
d5b46fa07d avformat/imfdec: preserve stream information
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2022-08-12 18:55:00 +02:00
Pierre-Anthony Lemieux
f2403d1530 avformat: refactor ff_stream_encode_params_copy() to stream_params_copy()
Addresses http://ffmpeg.org/pipermail/ffmpeg-devel/2022-August/299726.html

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2022-08-12 18:54:19 +02:00
Haihao Xiang
7158f1e64d configure: add --enable-libvpl option
This allows user to build FFmpeg against Intel oneVPL. oneVPL 2.6
is the required minimum version when building Intel oneVPL code.

It will fail to run configure script if both libmfx and libvpl are
enabled.

It is recommended to use oneVPL for new work, even for currently available
hardwares [1]

Note the preferred child device type is d3d11va for libvpl on Windows.
The commands below will use d3d11va if d3d11va is available on Windows.
$ ffmpeg -hwaccel qsv -c:v h264_qsv ...
$ ffmpeg -qsv_device 0 -hwaccel qsv -c:v h264_qsv ...
$ ffmpeg -init_hw_device qsv=qsv:hw_any -hwaccel qsv -c:v h264_qsv ...
$ ffmpeg -init_hw_device qsv=qsv:hw_any,child_device=0 -hwaccel qsv -c:v h264_qsv ...

User may use child_device_type option to specify child device type to
dxva2 or derive a qsv device from a dxva2 device
$ ffmpeg -init_hw_device qsv=qsv:hw_any,child_device=0,child_device_type=dxva2 -hwaccel qsv -c:v h264_qsv ...
$ ffmpeg -init_hw_device dxva2=d3d9:0 -init_hw_device qsv=qsv@d3d9 -hwaccel qsv -c:v h264_qsv ...

[1] https://www.intel.com/content/www/us/en/develop/documentation/upgrading-from-msdk-to-onevpl/top.html
2022-08-12 10:43:39 +08:00
Haihao Xiang
54c4196d56 lavfi/qsv: create mfx session using oneVPL for qsv filters
Use the mfxLoader handle in qsv hwdevice to create mfx session for qsv
filters.

This is in preparation for oneVPL support
2022-08-12 10:43:39 +08:00
Haihao Xiang
6900feef06 lavc/qsv: create mfx session using oneVPL for decoding/encoding
If qsv hwdevice is available, use the mfxLoader handle in qsv hwdevice
to create mfx session. Otherwise create mfx session with a new mfxLoader
handle.

This is in preparation for oneVPL support
2022-08-12 10:43:39 +08:00
Haihao Xiang
05bd88dca2 lavu/hwcontext_qsv: make qsv hwdevice works with oneVPL
In oneVPL, MFXLoad() and MFXCreateSession() are required to create a
workable mfx session[1]

Add config filters for D3D9/D3D11 session (galinart)

The default device is changed to d3d11va for oneVPL when both d3d11va
and dxva2 are enabled on Microsoft Windows

This is in preparation for oneVPL support

[1] https://spec.oneapi.io/versions/latest/elements/oneVPL/source/programming_guide/VPL_prg_session.html#onevpl-dispatcher

Co-authored-by: galinart <artem.galin@intel.com>
Signed-off-by: galinart <artem.galin@intel.com>
Signed-off-by: Haihao Xiang <haihao.xiang@intel.com>
2022-08-12 10:43:39 +08:00
Haihao Xiang
e0bbdbe0a6 lavu/hwcontext_qsv: add loader field to AVQSVDeviceContext
In oneVPL, a valid mfxLoader handle is needed when creating mfx session
for decoding, encoding and processing[1], so add loader field to
AVQSVDeviceContext. User should fill this field before calling
av_hwdevice_ctx_init() if using oneVPL

This is in preparation for oneVPL support

[1]https://spec.oneapi.io/versions/latest/elements/oneVPL/source/programming_guide/VPL_prg_session.html#onevpl-dispatcher
2022-08-12 10:43:39 +08:00
Haihao Xiang
c77149bc37 qsv: restrict OPAQUE memory to MFX_VERSION < 2.0
OPAQUE memory isn't supported for MFX_VERSION >= 2.0[1][2]. This is in
preparation for oneVPL support

[1] https://spec.oneapi.io/versions/latest/elements/oneVPL/source/VPL_intel_media_sdk.html#msdk-full-name-feature-removals
[2] https://github.com/oneapi-src/oneVPL
2022-08-12 10:43:39 +08:00
Haihao Xiang
63cda40930 qsvenc: restrict MFX_RATECONTROL_LA_EXT to MFX_VERSION < 2.0
MFX_RATECONTROL_LA_EXT isn't supported for MFX_VERSION >= 2.0[1][2].
This is in preparation for oneVPL support

[1] https://spec.oneapi.io/versions/latest/elements/oneVPL/source/VPL_intel_media_sdk.html#msdk-full-name-feature-removals
[2] https://github.com/oneapi-src/oneVPL
2022-08-12 10:43:39 +08:00
Haihao Xiang
fdfab65583 qsvenc: restrict multi-frame encode to MFX_VERSION < 2.0
Multi-frame encode isn't supported for MFX_VERSION >= 2.0[1][2]. This is
in preparation for oneVPL support

[1] https://spec.oneapi.io/versions/latest/elements/oneVPL/source/VPL_intel_media_sdk.html#msdk-full-name-feature-removals
[2] https://github.com/oneapi-src/oneVPL
2022-08-12 10:43:39 +08:00
Haihao Xiang
40684899e8 qsv: restrict audio related code to MFX_VERSION < 2.0
Audio isn't supported for MFX_VERSION >= 2.0[1][2]. This is in
preparation for oneVPL support

[1] https://spec.oneapi.io/versions/latest/elements/oneVPL/source/VPL_intel_media_sdk.html#msdk-full-name-feature-removals
[2] https://github.com/oneapi-src/oneVPL
2022-08-12 10:43:39 +08:00
Haihao Xiang
6aea224382 qsv: restrict user plugin to MFX_VERSION < 2.0
User plugin isn't supported for MFX_VERSION >= 2.0[1][2]. This is in
preparation for oneVPL Support

[1] https://spec.oneapi.io/versions/latest/elements/oneVPL/source/VPL_intel_media_sdk.html#msdk-full-name-feature-removals
[2] https://github.com/oneapi-src/oneVPL
2022-08-12 10:43:39 +08:00
Haihao Xiang
3e61b7dd7f qsv: remove mfx/ prefix from mfx headers
The following Cflags has been added to libmfx.pc, so mfx/ prefix is no
longer needed when including mfx headers in FFmpeg.
   Cflags: -I${includedir} -I${includedir}/mfx

Some old versions of libmfx have the following Cflags in libmfx.pc
   Cflags: -I${includedir}

We may add -I${includedir}/mfx to CFLAGS when running 'configure
--enable-libmfx' for old versions of libmfx, if so, mfx headers without
mfx/ prefix can be included too.

If libmfx comes without pkg-config support, we may do a small change to
the settings of the environment(e.g. set -I/opt/intel/mediasdk/include/mfx
instead of -I/opt/intel/mediasdk/include to CFLAGS), then the build can
find the mfx headers without mfx/ prefix

After applying this change, we won't need to change #include for mfx
headers when mfx headers are installed under a new directory.

This is in preparation for oneVPL support (mfx headers in oneVPL are
installed under vpl directory)
2022-08-12 10:43:39 +08:00
Haihao Xiang
7c713ab42c configure: fix the check for MFX_CODEC_VP9
The data structures for VP9 in mfxvp9.h is wrapped by
MFX_VERSION_NEXT, which means those data structures have never been used
in a public release. Actually MFX_CODEC_VP9 and other VP9 stuffs are
added in mfxstructures.h. In addition, mfxdefs.h is included in
mfxvp9.h, so we may use the check in this patch for MFX_CODEC_VP9

This is in preparation for oneVPL support because mfxvp9.h is removed
from oneVPL [1]

[1]: https://github.com/oneapi-src/oneVPL
2022-08-12 10:43:39 +08:00
Haihao Xiang
fea5aed279 configure: ensure --enable-libmfx uses libmfx 1.x
Intel's oneVPL is a successor to MediaSDK, but removed some obsolete
features of MediaSDK[1], some early versions of oneVPL still use libmfx
as library name[2]. However some of obsolete features, including OPAQUE
memory, multi-frame encode, user plugins and LA_EXT rate control mode
etc, have been enabled in QSV, so user can not use --enable-libmfx to
enable QSV if using an early version of oneVPL SDK. In order to ensure
user builds FFmpeg against a right version of libmfx, this patch added a
check for version < 2.0 and warning message about the used obsolete
features.

[1] https://spec.oneapi.io/versions/latest/elements/oneVPL/source/VPL_intel_media_sdk.html
[2] https://github.com/oneapi-src/oneVPL
2022-08-12 10:43:39 +08:00
Andreas Rheinhardt
3443330b17 avcodec/mpegvideo: Move setting mb_height to ff_mpv_init_context_frame
It is the proper place to set it, directly besides mb_width and
mb_stride. The reason for doing it the way it is done now seems
to be that the code does not create more slice contexts than necessary
(i.e. not more than one per row), so that this number needs to be
known before setting the number of slices. But this can always be
arranged by just moving the code that sets the number of slices.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2022-08-10 18:49:35 +02:00