1
0
mirror of https://github.com/FFmpeg/FFmpeg.git synced 2025-01-19 05:49:09 +02:00

169 Commits

Author SHA1 Message Date
Lynne
e25667f9f1
hwcontext_vulkan: ignore false positive validation errors
Issue ref:
https://github.com/KhronosGroup/Vulkan-ValidationLayers/issues/6627
2024-08-11 05:13:18 +02:00
Lynne
ef11a6456d
hwcontext_vulkan: do not chain structs of unsupported extensions in vkCreateDevice
Fixes:

vkCreateDevice(): pCreateInfo->pNext<VkPhysicalDeviceOpticalFlowFeaturesNV> includes a
pointer to a VkPhysicalDeviceOpticalFlowFeaturesNV, but when creating VkDevice, the
parent extension (VK_NV_optical_flow) was not included in ppEnabledExtensionNames.
The Vulkan spec states: Each pNext member of any structure (including this one) in
the pNext chain must be either NULL or a pointer to a valid struct for extending
VkDeviceCreateInfo.
2024-08-11 05:13:17 +02:00
Lynne
5f0f1f7b7a
libavutil: deprecate the old Vulkan queue API, add doc/APIchanges entries 2024-08-11 05:13:15 +02:00
Lynne
2ce0e51503
hwcontext_vulkan: add support for Vulkan encoding 2024-08-11 05:13:14 +02:00
Lynne
55adcb4fc5
hwcontext_vulkan: add support for VK_EXT_shader_object
We'd like to use it eventually, and its already covered by
the minimum version of the headers we require.
2024-08-11 05:13:13 +02:00
Lynne
c19af16f8d
hwcontext_vulkan: enable storageBuffer16BitAccess if available 2024-08-11 05:13:12 +02:00
Lynne
957d34784a
hwcontext_vulkan: constify validation layer features table
The struct data seem to get corrupted otherwise.
Possibly a validation layer or libvulkan issue.
2024-08-11 05:13:11 +02:00
Lynne
9e606b33a8
hwcontext_vulkan: add HOST_CACHED flag to transfer buffer
Significantly speeds up downloads on devices without host mapping.
2024-08-11 05:13:11 +02:00
Lynne
aea4d4b423
hwcontext_vulkan: rewrite upload/download
This commit was long overdue. The old transfer dubiously tried to
merge as much code as possible, and had very little in the way
of optimizations, apart from basic host-mapping.

The new code uses buffer pools for any temporary bufflers, and
handles falling back to buffer-based uploads if host-mapping fails.

Roundtrip performance difference:
ffmpeg -init_hw_device "vulkan=vk:0,debug=0,disable_multiplane=1" -f lavfi \
-i color=red:s=3840x2160 -vf hwupload,hwdownload,format=yuv420p -f null -

7900XTX:
Before: 224fps
After: 502fps

Ada, with proprietary drivers:
Before: 29fps
After: 54fps

Alder Lake:
Before: 85fps
After: 108fps

With the host-mapping codepath disabled:
Before: 32fps
After: 51fps
2024-08-11 05:13:11 +02:00
Lynne
81c5d4ea0e
hwcontext_vulkan: remove unused struct 2024-08-11 05:13:10 +02:00
Lynne
a30b7c0158
hwcontext_vulkan: initialize optical flow queues if available
Lets us implement FPS conversion.
2024-08-11 05:13:10 +02:00
Lynne
8790a30882
hwcontext_vulkan: rewrite queue picking system for the new API
This allows us to support different video ops on different queues,
as well as any other arbitrary queues we need.
2024-08-11 05:13:09 +02:00
Lynne
13489c8a21
hwcontext_vulkan: add a new mechanism to expose used queue families
The issue with the old mechanism is that we had to introduce new
API each time we needed a new queue family, and all the queue families
were functionally fixed to a given purpose.

Nvidia's GPUs are able to handle video encoding and compute on the
same queue, which results in a speedup when pre-processing is required.

Also, this enables us to expose optical flow queues for frame interpolation.
2024-08-11 05:13:03 +02:00
Haihao Xiang
a4630d479a lavu/hwcontext_vulkan: Support write on drm frame
Otherwise nothing is written into the destination when a write mapping
is requested.

For example, a vulkan frame mapped from a drm frame (which is wrapped as
a vaapi frame in the example) is used as the output of scale_vulkan
filter, it always gets a green screen without this patch.

ffmpeg -init_hw_device vaapi=va -init_hw_device vulkan=vulkan@va
-filter_hw_device vulkan -f lavfi -i testsrc=size=352x288,format=nv12
-vf
"hwupload,scale_vulkan,hwmap=derive_device=vaapi:reverse=1,format=vaapi,hwdownload,format=nv12"
-f nut - | ffplay -

Signed-off-by: Haihao Xiang <haihao.xiang@intel.com>
2024-06-12 01:53:18 +02:00
Andreas Rheinhardt
790f793844 avutil/common: Don't auto-include mem.h
There are lots of files that don't need it: The number of object
files that actually need it went down from 2011 to 884 here.

Keep it for external users in order to not cause breakages.

Also improve the other headers a bit while just at it.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2024-03-31 00:08:43 +01:00
Lynne
ecdc94b97f
vulkan_av1: port to the new stable API
Co-Authored-by: Dave Airlie <airlied@redhat.com>
2024-03-25 08:54:40 +01:00
Haihao Xiang
d296c8689d lavu/hwcontext_vulkan: check PCI ID if possible
Otherwise the derived device and the source device might have different
PCI ID in a multiple-device system.

Reviewed-by: Lynne <dev@lynne.ee>
Signed-off-by: Haihao Xiang <haihao.xiang@intel.com>
2024-03-19 09:37:39 +08:00
Andreas Rheinhardt
3e669b24e2 avutil/hwcontext: Allocate AVHWFramesCtx jointly with its internals
This is possible because the lifetime of these structures coincide.
It has the advantage of allowing to remove AVHWFramesInternal
from the public header; given that AVHWFramesInternal.priv is no more,
most accesses to AVHWFramesInternal are no more; indeed, the only
field accessed of it outside of hwcontext.c is the internal frame pool,
making this commit very simple.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2024-03-07 08:53:31 -03:00
Andreas Rheinhardt
e70e9b6554 avutil/hwcontext_vulkan: Allocate pub and priv frames hwctx together
This is possible because the lifetime of both coincide.
Besides reducing the number of allocations this also simplifies
access to VulkanFramesPriv as one no longer has to
go through AVHWFramesInternal.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2024-03-01 18:48:06 +01:00
Andreas Rheinhardt
2d63379cae avutil/hwcontext_vulkan: Allocate public and priv device hwctx together
This is possible because the lifetime of both coincide.
Besides reducing the number of allocations this also simplifies
access to VulkanDevicePriv as one no longer has to
go through AVHWDeviceInternal.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2024-03-01 18:48:06 +01:00
Zhao Zhili
74e27d9e31 avutil/hwcontext_vulkan: Fix memleaks when transfer to vulkan
Without ff_vk_exec_discard_deps which is called by ff_vk_exec_wait,
the reference count of hwframe context cannot reach zero due to
circular reference created by ff_vk_exec_add_dep_frame.

Fix #10873

Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
2024-03-01 17:22:14 +08:00
Zhao Zhili
03275b0f09 avutil/hwcontext_vulkan: Fix leaks in map_from_drm
Also simplify error handing.

Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
2024-03-01 17:20:29 +08:00
Zhao Zhili
6f9730cb28 avutil/hwcontext_vulkan: Fix leaks when semaphore creation fails
Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
2024-03-01 17:20:21 +08:00
Benjamin Cheng
185871fdd3 hwcontext_vulkan: guard unistd.h include
win32 typically doesn't have unistd.h, so always including it will break
MSVC builds. The usage of those POSIX functions are already guarded by
_WIN32, so use that to guard unistd.h include as well.
2023-12-11 16:36:56 +01:00
Diederik de Haas via ffmpeg-devel
c07ed10b0e apply spelling fixes
Fix spelling issue as reported by Debian's lintian tool:
accomodate -> accommodate
addtional -> additional
auxillary -> auxiliary
bellow -> below
betweeen -> between
Calulate -> Calculate
coefficents -> coefficients
Defalt -> Default
defaul -> default
higer -> higher
neccesary -> necessary
orignal -> original
ouput -> output
precison -> precision
processsing -> processing
substract -> subtract
Transfered -> Transferred
upto -> up to

Also add several of them to the 'common typos' check in patcheck.

Signed-off-by: Diederik de Haas <didi.debian@cknow.org>
2023-11-18 19:55:42 +01:00
Víctor Manuel Jáquez Leal
854012ec59 avutil/hwcontext_vulkan: get VkFormatFeatureFlagBits2
Rather than the VkFormatFeatureFlagBits enum

Signed-off-by: Víctor Manuel Jáquez Leal <vjaquez@igalia.com>
2023-11-09 09:13:47 +01:00
Zhao Zhili
6f39dee974 avutil/hwcontext_vulkan: fix run on macOS
VK_KHR_PORTABILITY_ENUMERATION_EXTENSION_NAME is required on macOS,
and VK_INSTANCE_CREATE_ENUMERATE_PORTABILITY_BIT_KHR flag should
be set.

Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
2023-11-09 19:23:01 +08:00
Zhao Zhili
63078b4599 avutil/hwcontext_vulkan: cuda doesn't belong to valid_sw_formats
Move it to transfer_get_formats.

Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
2023-10-29 13:58:30 +08:00
Zhao Zhili
891f70c6d5 avutil/hwcontext_vulkan: fix memleak when device_create is skipped
Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
2023-10-29 13:57:43 +08:00
Lynne
c258623c0a
hwcontext_vulkan: improve queue family init code
When users zero-init'd the struct, or left it as-is, the encode
queue family matched the graphics queue family, which led it to be
incorrectly logged as being used for encode.

This just improves the logging so this isn't printed anymore.
2023-10-24 06:07:09 +02:00
Lynne
81cc0e1345
hwcontext_vulkan: properly support STORAGE usage for mutliplane images
Fixes multiplane support on Nvidia.

Also, remove the ENCODE usage, even if the driver signals it as supported.
Currently, it's not used, and when it is used, it'll be gated behind
two extension checks.
2023-10-05 23:50:30 +02:00
Andreas Rheinhardt
dfac782b13 avutil/hwcontext_vulkan: Cosmetics
The alignment in vulkan_unmap_from_drm() (formerly the clone
of vulkan_frame_free()) is nicer than the in vulkan_frame_free(),
let's preserve it.

Reviewed-by: Lynne <dev@lynne.ee>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2023-09-15 02:38:14 +02:00
Andreas Rheinhardt
677635cd04 avutil/hwcontext_vulkan: Deduplicate code
Reviewed-by: Lynne <dev@lynne.ee>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2023-09-15 02:38:05 +02:00
Andreas Rheinhardt
47b1c0d0db avutil/hwcontext_vulkan: Improve type-safety
The AVBuffer API uses uint8_t as base type for buffers
and therefore its free callbacks need to abide by this.
Therefore vulkan_frame_free() used an inappropriate signature
which caused casts whenever this function has been called
manually.

This commit changes this by making vulkan_frame_free()
use the proper type and a vulkan_frame_free_cb() that
is used as free callback for the AVBuffer API.

Reviewed-by: Lynne <dev@lynne.ee>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2023-09-15 02:37:56 +02:00
Andreas Rheinhardt
a6bd2ee759 avutil/hwcontext_vulkan: Remove redundant resetting
vulkan_free_internal() already resets the AVVkFrame.internal
pointer.

Reviewed-by: Lynne <dev@lynne.ee>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2023-09-15 02:37:36 +02:00
Lynne
358919506d
vulkan: enable VK_KHR_cooperative_matrix
It's of interest to API users, and of interest to us,
as a DCT/DST can be implemented via matrix multiplies.
2023-08-26 23:14:53 +02:00
Chris Spencer
f0b1cab538 hwcontext_vulkan: always use create_pnext in vulkan_pool_alloc
Currently, create_pnext is only used if an applicable external memory
extension is enabled. This will usually the case when used from the command
line, but may not be when the Vulkan context is created manually.

For images used in video decoding, create_pnext contains the video profile
list, which is mandatory.[1] This fixes a GPU crash when using RADV.

[1] https://registry.khronos.org/vulkan/specs/1.3-extensions/man/html/VkImageCreateInfo.html#VUID-VkImageCreateInfo-usage-04815

Signed-off-by: Chris Spencer <spencercw@gmail.com>
2023-08-20 22:47:09 +02:00
Jan Beich
e6bd8b1323 hwcontext_vulkan: hide Linux-only header after 571756bf2fe2
major/minor are in <sys/types.h> on BSDs and <sys/mkdev.h> on Solaris-like.

libavutil/hwcontext_vulkan.c:55:10: fatal error: 'sys/sysmacros.h' file not found
#include <sys/sysmacros.h>
^~~~~~~~~~~~~~~~~
2023-07-21 20:04:10 +02:00
Lynne
d0f1d937fe
hwcontext_vulkan: free temporary array once unneeded
Fixes a small memory leak.
This also prevents leaks on malloc/mutex init errors.
2023-06-15 22:00:41 +02:00
Lynne
b4d5baa8b0
hwcontext_vulkan: call ff_vk_uninit() on device uninit
This fixes three memory leaks from ff_vk_load_props().
2023-06-15 22:00:41 +02:00
Lynne
eff565dc19
hwcontext_vulkan: tune execution pools
Having less in-flight resources is better in this case.
2023-06-07 23:59:17 +02:00
Philip Langdale
378fb40282 avutil/hwcontext_vulkan: disable multiplane when deriving from cuda
Today, cuda is not able to import multiplane images, and cuda requires
images to be imported whether you trying to import to cuda or export
from cuda (in the later case, the image is imported and then copied
into on the cuda side). So any interop between cuda and vulkan requires
that multiplane be disabled.

The existing option for this is not sufficient, because when deriving
devices it is not possible to specify any options.

And, it is necessary to derive the Vulkan device, because any pipeline
that involves uploading from cuda to vulkan and then back to cuda must
use the same cuda context on both sides, and the only way to propagate
the cuda context all the way through is to derive the device at each
stage.

ie:

-vf hwupload=derive_device=vulkan,<filters>,hwupload=derive_device=cuda
2023-06-03 16:29:38 -07:00
Lynne
dfff3877b7
vulkan: add support for the atomic float ops extension 2023-05-29 00:42:01 +02:00
Lynne
77478f6793
av1dec: add Vulkan hwaccel 2023-05-29 00:42:00 +02:00
Niklas Haas
9675e54b02
avutil/hwcontext_vulkan: add libplacebo required features
For compatibility with vf_libplacebo
2023-05-29 00:41:55 +02:00
Lynne
51b7fe81be
hwcontext_vulkan: enable additional device properties 2023-05-29 00:41:51 +02:00
Lynne
33fc919bb7
hwcontext_vulkan: remove duplicate code, port to use generic vulkan utils
The temporary AVFrame on staack enables us to use the common
dependency/dispatch code in prepare_frame().
The prepare_frame() function is used for both frame initialization
and frame import/export queue family transfer operations.
In the former case, no AVFrame exists yet, so, as this is purely
libavutil code, we create a temporary frame on stack. Otherwise,
we'd need to allocate multiple frames somewhere, one for each
possible command buffer dispatch.
2023-05-29 00:41:51 +02:00
Lynne
94e17a63a4
hwcontext_vulkan: don't change properties if prepare_frame fails 2023-05-29 00:41:50 +02:00
Lynne
32fc36ee61
hwcontext_vulkan: remove linear+host_visible "fast" path
The idea was that it's faster to map linear images and copy them
via regular memcpy. This is a very niche use, plus very inconsistently
useful, as it would only really be faster on a few Intel GPUs.
Even then, using the non-cached memcpy would've been better.

Instead, scrap this code. Drivers are better at figuring out
what copy to use, and if we're host-mapping, it should actually be
just as fast, if not faster.
2023-05-29 00:41:50 +02:00
Lynne
48f85de0e7
hwcontext_vulkan: rewrite to support multiplane surfaces
This commit adds proper handling of multiplane images throughout
all of the hwcontext code. To avoid breakage of individual
components, the change is performed as a single commit.
2023-05-29 00:41:49 +02:00