mirror of https://github.com/FFmpeg/FFmpeg.git synced 2025-01-03 05:10:03 +02:00
Commit Graph

5883 Commits

Author SHA1 Message Date
Lynne
d0f1d937fe
hwcontext_vulkan: free temporary array once unneeded
Fixes a small memory leak.
This also prevents leaks on malloc/mutex init errors.
2023-06-15 22:00:41 +02:00
Lynne
b4d5baa8b0
hwcontext_vulkan: call ff_vk_uninit() on device uninit
This fixes three memory leaks from ff_vk_load_props().
2023-06-15 22:00:41 +02:00
Philip Langdale
41be6a5593
lavu/hwcontext_cuda: declare support for rgb32/bgr32
nvenc declares support for these formats, but if hwcontext_cuda doesn't
do that as well, then it's not possible to hwupload them for use in a
possible cuda pipeline before encoding.
2023-06-15 12:29:52 -07:00
Martin Storsjö
d78bffbf3d
libavutil: Add version bump for new aarch64 cpu flags
This was missed in 397cb623c8.

Signed-off-by: Martin Storsjö <martin@martin.st>
2023-06-10 00:21:58 +03:00
Lynne
eff565dc19
hwcontext_vulkan: tune execution pools
Having fewer in-flight resources is better in this case.
2023-06-07 23:59:17 +02:00
Lynne
5f1be341c2
vulkan: discard dependencies when explicitly waiting for execution
This reduces memory needed dramatically, as unneeded resources
can be immediately returned to the pool.
Although vkWaitForFences is thread-safe, we add a mutex lock around
it, as the fence mutex in combination with vkWaitForFences assures
us that no other thread will reset the fence in the meantime
while the mutex is locked. This allows us to call
ff_vk_exec_discard_deps.
2023-06-07 23:59:16 +02:00
Lynne
975cd48bb3
vulkan: synchronize access to execution pool fences
vkResetFences is specified as being user-synchronized
(yet vkWaitForFences is not).
2023-06-07 23:59:16 +02:00
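The locking pattern described in the two commits above can be sketched roughly as follows. This is a minimal mock, not the FFmpeg code: MockExec, mock_exec_submit(), mock_exec_wait() and mock_exec_reset() are hypothetical names, the fence is a plain boolean, and the point where a real implementation would call vkWaitForFences or vkResetFences is only marked in comments.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for an execution context: a fence, the mutex
 * guarding it, and a count of resources held until execution completes. */
typedef struct MockExec {
    pthread_mutex_t lock;   /* serializes wait/reset on the fence */
    bool signaled;          /* stand-in for the VkFence state */
    int  nb_deps;           /* dependencies kept alive for this submission */
} MockExec;

static void mock_exec_init(MockExec *e)
{
    pthread_mutex_init(&e->lock, NULL);
    e->signaled = false;
    e->nb_deps  = 0;
}

/* Pretend to submit work; in this mock the "GPU" finishes immediately. */
static void mock_exec_submit(MockExec *e, int nb_deps)
{
    pthread_mutex_lock(&e->lock);
    e->nb_deps  = nb_deps;
    e->signaled = true;
    pthread_mutex_unlock(&e->lock);
}

/* Wait and discard dependencies under the same lock, so no other thread
 * can reset the fence between the wait and the discard.
 * Returns the number of dependencies released. */
static int mock_exec_wait(MockExec *e)
{
    int released = 0;
    pthread_mutex_lock(&e->lock);
    /* A real implementation would block in vkWaitForFences() here. */
    if (e->signaled) {
        released   = e->nb_deps;
        e->nb_deps = 0;
    }
    pthread_mutex_unlock(&e->lock);
    return released;
}

/* Resetting is user-synchronized, so it takes the same lock. */
static void mock_exec_reset(MockExec *e)
{
    pthread_mutex_lock(&e->lock);
    /* A real implementation would call vkResetFences() here. */
    e->signaled = false;
    pthread_mutex_unlock(&e->lock);
}
```

Because the wait and the reset share one mutex, a thread that has just observed the fence as signaled can release the dependencies before any other thread gets a chance to reset it.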
Martin Storsjö
c76643021e
aarch64: Add Windows runtime detection of the dotprod instructions
For Windows, there's no publicly defined constant for checking for
the i8mm extension yet.

Signed-off-by: Martin Storsjö <martin@martin.st>
2023-06-06 12:50:15 +03:00
Martin Storsjö
9b0052200a
aarch64: Add Apple runtime detection of dotprod and i8mm using sysctl
For now, there's not much value in this since Clang doesn't support
enabling the dotprod or i8mm features with either .arch_extension
or .arch (it has to be enabled by the base arch flags passed to
the compiler). But it may be supported in the future.

Signed-off-by: Martin Storsjö <martin@martin.st>
2023-06-06 12:41:20 +03:00
Martin Storsjö
493fcde50a
aarch64: Add Linux runtime cpu feature detection using HWCAP_CPUID
Based partially on code by Janne Grunau.

Signed-off-by: Martin Storsjö <martin@martin.st>
2023-06-06 12:40:57 +03:00
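As a rough illustration of HWCAP_CPUID-based detection, the sketch below decodes the DotProd field (bits 47:44 of ID_AA64ISAR0_EL1) and the I8MM field (bits 55:52 of ID_AA64ISAR1_EL1). The SKETCH_FLAG_* values and the function names are hypothetical and are not FFmpeg's AV_CPU_FLAG_* constants. The register reads only compile on Linux/aarch64; elsewhere the sketch falls back to reporting no features.

```c
#include <stdint.h>

/* Hypothetical flag values for this sketch only. */
#define SKETCH_FLAG_DOTPROD (1 << 0)
#define SKETCH_FLAG_I8MM    (1 << 1)

/* Decode feature flags from raw ID register values.
 * A non-zero 4-bit field means the feature is implemented. */
static int decode_isar_flags(uint64_t isar0, uint64_t isar1)
{
    int flags = 0;
    if (((isar0 >> 44) & 0xf) >= 1)   /* ID_AA64ISAR0_EL1.DP */
        flags |= SKETCH_FLAG_DOTPROD;
    if (((isar1 >> 52) & 0xf) >= 1)   /* ID_AA64ISAR1_EL1.I8MM */
        flags |= SKETCH_FLAG_I8MM;
    return flags;
}

#if defined(__linux__) && defined(__aarch64__)
#include <sys/auxv.h>
#ifndef HWCAP_CPUID
#define HWCAP_CPUID (1 << 11)
#endif

/* With HWCAP_CPUID set, the kernel emulates userspace reads of the
 * ID registers, so the mrs instructions below are safe to execute. */
static int detect_flags(void)
{
    uint64_t isar0, isar1;
    if (!(getauxval(AT_HWCAP) & HWCAP_CPUID))
        return 0;
    __asm__("mrs %0, ID_AA64ISAR0_EL1" : "=r"(isar0));
    __asm__("mrs %0, ID_AA64ISAR1_EL1" : "=r"(isar1));
    return decode_isar_flags(isar0, isar1);
}
#else
/* Other platforms need their own detection; report nothing here. */
static int detect_flags(void)
{
    return 0;
}
#endif
```

Keeping the bit decoding separate from the register reads makes the decoding testable on any host.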
Martin Storsjö
397cb623c8
aarch64: Add cpu flags for the dotprod and i8mm extensions
Set these flags as available if the extensions are unconditionally
available to the compiler.

Signed-off-by: Martin Storsjö <martin@martin.st>
2023-06-06 12:40:42 +03:00
Martin Storsjö
fb1b88af77
configure: aarch64: Support assembling the dotprod and i8mm arch extensions
These are available since ARMv8.4-a and ARMv8.6-a respectively,
but can also be available optionally since ARMv8.2-a.

Check if ".arch armv8.2-a" and ".arch_extension {dotprod,i8mm}" are
supported, and check if the instructions can be assembled.

Current clang versions fail to support the dotprod and i8mm
features in the .arch_extension directive, but do support them
if enabled with -march=armv8.4-a on the command line. (Curiously,
lowering the arch level with ".arch armv8.2-a" doesn't make the
extensions unavailable if they were enabled with -march; if that
changes, Clang should also learn to support these extensions via
.arch_extension for them to remain usable here.)

Signed-off-by: Martin Storsjö <martin@martin.st>
2023-06-06 12:40:26 +03:00
Philip Langdale
378fb40282
avutil/hwcontext_vulkan: disable multiplane when deriving from cuda
Today, cuda is not able to import multiplane images, and cuda requires
images to be imported whether you are trying to import to cuda or export
from cuda (in the latter case, the image is imported and the data is then
copied into it on the cuda side). So any interop between cuda and vulkan
requires that multiplane be disabled.

The existing option for this is not sufficient, because when deriving
devices it is not possible to specify any options.

And, it is necessary to derive the Vulkan device, because any pipeline
that involves uploading from cuda to vulkan and then back to cuda must
use the same cuda context on both sides, and the only way to propagate
the cuda context all the way through is to derive the device at each
stage.

ie:

-vf hwupload=derive_device=vulkan,<filters>,hwupload=derive_device=cuda
2023-06-03 16:29:38 -07:00
Lynne
58f82fc26a
vulkan: replace usage of %lu with %"SIZE_SPECIFIER" 2023-05-29 03:22:58 +02:00
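A likely motivation, stated here as an assumption since the commit has no body: %lu expects an unsigned long, which is 32 bits wide on 64-bit Windows while size_t is 64 bits. A minimal sketch of the portable form follows. SKETCH_SIZE_SPECIFIER and format_size() are hypothetical names, and the macro here simply expands to the C99 "zu" length modifier rather than reproducing FFmpeg's own definition.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical macro for this sketch; expands to the C99 size_t specifier. */
#define SKETCH_SIZE_SPECIFIER "zu"

/* Format a size_t without assuming it has the width of unsigned long.
 * Returns the number of characters snprintf would have written. */
static int format_size(char *buf, size_t buf_len, size_t value)
{
    return snprintf(buf, buf_len, "size=%" SKETCH_SIZE_SPECIFIER, value);
}
```

The string-literal concatenation is what allows a per-platform macro to be spliced into the format string.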
Michael Niedermayer
75918016ab
Move bessel_i0() from swresample/resample to avutil/mathematics
0th order modified Bessel functions of the first kind are used in multiple
places; let's avoid having 3+ different implementations.
I picked this one as it's accurate and quite fast; it can be replaced if
a better one is found.

Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2023-05-29 00:45:28 +02:00
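For reference, the 0th order modified Bessel function of the first kind has the power series I0(x) = sum over k >= 0 of ((x/2)^(2k)) / (k!)^2. The sketch below evaluates that series directly. It is not the implementation moved by this commit, and sketch_bessel_i0() is a hypothetical name. It is adequate for moderate |x| only, since the iteration count is capped.

```c
/* Evaluate I0(x) by summing its power series.
 * Each term is derived from the previous one:
 *   term_k = term_(k-1) * (x/2)^2 / k^2,  term_0 = 1. */
static double sketch_bessel_i0(double x)
{
    const double q = x * x / 4.0;   /* (x/2)^2 */
    double term = 1.0;
    double sum  = 1.0;

    for (int k = 1; k < 500; k++) {
        term *= q / ((double)k * (double)k);
        sum  += term;
        /* All terms are non-negative; stop once they no longer matter. */
        if (term < sum * 1e-17)
            break;
    }
    return sum;
}
```

The function is even in x, so negative arguments give the same result as their absolute value.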
Lynne
db1d022781
APIchanges: add hwcontext_vulkan changes and bump lavu minor 2023-05-29 00:42:02 +02:00
Lynne
bef86ba86c
APIchanges: add new pixel formats supported and bump lavu minor 2023-05-29 00:42:02 +02:00
Lynne
160a415e22
lavfi: add nlmeans_vulkan filter 2023-05-29 00:42:01 +02:00
Lynne
dfff3877b7
vulkan: add support for the atomic float ops extension 2023-05-29 00:42:01 +02:00
Lynne
77478f6793
av1dec: add Vulkan hwaccel 2023-05-29 00:42:00 +02:00
Niklas Haas
9675e54b02
avutil/hwcontext_vulkan: add libplacebo required features
For compatibility with vf_libplacebo
2023-05-29 00:41:55 +02:00
Lynne
05ce6473ac
lavfi: add lavfi-only Vulkan infrastructure 2023-05-29 00:41:51 +02:00
Lynne
51b7fe81be
hwcontext_vulkan: enable additional device properties 2023-05-29 00:41:51 +02:00
Lynne
33fc919bb7
hwcontext_vulkan: remove duplicate code, port to use generic vulkan utils
The temporary AVFrame on stack enables us to use the common
dependency/dispatch code in prepare_frame().
The prepare_frame() function is used for both frame initialization
and frame import/export queue family transfer operations.
In the former case, no AVFrame exists yet, so, as this is purely
libavutil code, we create a temporary frame on stack. Otherwise,
we'd need to allocate multiple frames somewhere, one for each
possible command buffer dispatch.
2023-05-29 00:41:51 +02:00
Lynne
94e17a63a4
hwcontext_vulkan: don't change properties if prepare_frame fails 2023-05-29 00:41:50 +02:00
Lynne
32fc36ee61
hwcontext_vulkan: remove linear+host_visible "fast" path
The idea was that it's faster to map linear images and copy them
via regular memcpy. This is a very niche use, plus very inconsistently
useful, as it would only really be faster on a few Intel GPUs.
Even then, using the non-cached memcpy would've been better.

Instead, scrap this code. Drivers are better at figuring out
what copy to use, and if we're host-mapping, it should actually be
just as fast, if not faster.
2023-05-29 00:41:50 +02:00
Lynne
48f85de0e7
hwcontext_vulkan: rewrite to support multiplane surfaces
This commit adds proper handling of multiplane images throughout
all of the hwcontext code. To avoid breakage of individual
components, the change is performed as a single commit.
2023-05-29 00:41:49 +02:00
Lynne
a4d63b46d9
vulkan: make GLSL macro functions semicolon-safe 2023-05-29 00:41:49 +02:00
Lynne
83024beec2
vulkan: enable forcing of full subgroups 2023-05-29 00:41:49 +02:00
Lynne
758f8b26b9
vulkan: add ff_vk_count_images() 2023-05-29 00:41:48 +02:00
Lynne
b5eaeb1f13
vulkan: rewrite to support all necessary features
This commit rewrites the majority of vulkan.c to enable its use
as a general-purpose high-level utility code, usable for decoding,
encoding, and filtering of video frames.

The dependency system was rewritten to simplify management of
execution.

The image handling system was rewritten to accommodate multiplane
images.

Due to how related all the new features were, this is a single
commit.
2023-05-29 00:41:48 +02:00
Lynne
721b71da4a
vulkan: return current queue index from ff_vk_qf_rotate() 2023-05-29 00:41:48 +02:00
Lynne
b15104ed97
vulkan: add support for retrieving queue, query and video properties 2023-05-29 00:41:47 +02:00
Lynne
6eaf3fe69c
vulkan: add support for queries 2023-05-29 00:41:47 +02:00
Lynne
f3fb1b50bb
vulkan: minor indent fix, add support for synchronous submission/waiting 2023-05-29 00:41:46 +02:00
Lynne
d386988c39
vulkan: use device properties 2 and add a convenience loader function 2023-05-29 00:41:46 +02:00
Lynne
bf69a64135
vulkan: add size tracking to buffer structs 2023-05-29 00:41:46 +02:00
Lynne
b18e20a4ee
vulkan: do not wait for device idle when destroying buffers
This should be done explicitly.
2023-05-29 00:41:45 +02:00
Lynne
15de0af8f0
vulkan: allow alloc pNext in ff_vk_create_buf 2023-05-29 00:41:45 +02:00
Lynne
af48790465
vulkan: support ignoring memory properties when allocating 2023-05-29 00:41:45 +02:00
Lynne
3c2f43d8ee
vulkan: expose ff_vk_alloc_mem() 2023-05-29 00:41:44 +02:00
Lynne
fa67ccee37
vulkan: add ff_vk_image_create() 2023-05-29 00:41:44 +02:00
Lynne
e8fce74abf
vulkan: add ff_vk_qf_fill() 2023-05-29 00:41:43 +02:00
Lynne
b5e333bba7
vulkan: add pNext argument to ff_vk_create_buf() 2023-05-29 00:41:43 +02:00
Lynne
a0d47a2ad9
vulkan: fix comment statement about exec_queue blocking 2023-05-29 00:41:43 +02:00
Lynne
619b1265a2
vulkan: add additional error codes 2023-05-29 00:41:42 +02:00
Lynne
0c9c0e40fb
vulkan: define VK_NO_PROTOTYPES
This just prevents the Vulkan headers from declaring any symbols
like vkCmdPipelineBarrier2(). Instead, all functions must be loaded
via the loader and used as function pointers as vk->CmdPipelineBarrier2.

Mostly just forces developers to write correct code, as using the
symbols can be undesirable in case API users define their own
function wrappers via the loader API.
2023-05-29 00:41:42 +02:00
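The function-pointer style this commit enforces can be illustrated with a generic mock. Nothing below is Vulkan or FFmpeg API: mock_get_proc_addr() stands in for a loader lookup by name, and MockFunctions, mock_load_functions() and mock_add() are hypothetical names chosen for the sketch.

```c
#include <stddef.h>
#include <string.h>

typedef void (*generic_fn)(void);
typedef int  (*add_fn)(int, int);

/* Hypothetical "driver" entry point that callers never reference directly. */
static int mock_add(int a, int b)
{
    return a + b;
}

/* Hypothetical loader: resolves an entry point by name, or returns NULL. */
static generic_fn mock_get_proc_addr(const char *name)
{
    if (!strcmp(name, "mockAdd"))
        return (generic_fn)mock_add;
    return NULL;
}

/* Table of loaded function pointers, used as vk->Add(...) by callers. */
typedef struct MockFunctions {
    add_fn Add;
} MockFunctions;

/* Fill the table through the loader; returns 0 on success, -1 if any
 * entry point could not be resolved. */
static int mock_load_functions(MockFunctions *vk)
{
    vk->Add = (add_fn)mock_get_proc_addr("mockAdd");
    return vk->Add ? 0 : -1;
}
```

Since every call goes through the table, an API user who supplies a different loader gets their own wrappers called instead of directly linked symbols.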
Lynne
92ddd415bc
vulkan: lock queues before submitting operations 2023-05-29 00:41:42 +02:00
Lynne
9b385b480f
hwcontext_vulkan: enable GPU-assisted validation when debugging 2023-05-29 00:41:41 +02:00
Lynne
e5e12c5078
hwcontext_vulkan: load query-related functions
Needed for both encoding and decoding.
2023-05-29 00:41:41 +02:00