mirror of https://github.com/FFmpeg/FFmpeg.git synced 2025-06-30 22:24:04 +02:00
175 Commits

Author SHA1 Message Date
604dfdb44c hwcontext_vulkan: align host mapping size to minImportedHostPointerAlignment
This was left out of the recent rewrite of the system.
2024-08-16 01:22:16 +02:00
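To illustrate the alignment fix in 604dfdb44c, here is a minimal sketch of rounding a host mapping size up to a multiple of the required alignment. The Vulkan specification requires minImportedHostPointerAlignment to be a power of two, which is what makes the mask arithmetic valid. The helper name align_up is hypothetical and not taken from the FFmpeg sources; FFmpeg's own FFALIGN macro performs the same arithmetic.

```c
#include <stddef.h>

/*
 * Round `size` up to the next multiple of `alignment`.
 * Assumes `alignment` is a non-zero power of two, as Vulkan guarantees
 * for minImportedHostPointerAlignment. `align_up` is a hypothetical name.
 */
static size_t align_up(size_t size, size_t alignment)
{
    return (size + alignment - 1) & ~(alignment - 1);
}
```

With an alignment of 4096, a 4097-byte mapping would be padded to 8192 bytes, while a size that is already a multiple of 4096 is left unchanged.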
18d964fc2c vulkan: enable encoding of images if video_maintenance1 is enabled
Vulkan encoding was designed in a very... consolidated way.
You had to know the exact codec and profile that the image was going to
eventually be encoded as at... image creation time. Unfortunately, as good
as our code is, glimpsing into the exact future isn't what it's capable of.

video_maintenance1 removed that requirement, which only then made encoding
images practically possible.
2024-08-16 01:22:16 +02:00
46c13834b6 hwcontext_vulkan: enable VK_KHR_video_maintenance1
We require it for encoding.
2024-08-16 01:22:15 +02:00
97e947a2a7 hwcontext_vulkan: setup extensions before features
The issue is that enabling features requires that the device
extension is supported. The extensions bitfield was set later,
so it was always 0, leading to no features being added.
2024-08-16 01:22:15 +02:00
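The ordering problem described in 97e947a2a7 can be sketched with a toy model. Everything below (the DeviceSetup struct, the flag names, and both functions) is hypothetical and only mirrors the shape of the bug: feature selection reads an extension bitfield, so the bitfield has to be filled in first.

```c
#include <stdint.h>

/* Hypothetical extension flags; not the identifiers used in FFmpeg. */
enum {
    EXT_VIDEO_MAINT1  = 1u << 0,
    EXT_SHADER_OBJECT = 1u << 1,
};

typedef struct DeviceSetup {
    uint32_t extensions;  /* bitfield of enabled extensions */
    int      nb_features; /* number of feature structs that were added */
} DeviceSetup;

/* Record which extensions are enabled. */
static void setup_extensions(DeviceSetup *s, uint32_t supported)
{
    s->extensions = supported;
}

/* Add one feature per enabled extension; relies on s->extensions being set. */
static void setup_features(DeviceSetup *s)
{
    s->nb_features = 0;
    if (s->extensions & EXT_VIDEO_MAINT1)
        s->nb_features++;
    if (s->extensions & EXT_SHADER_OBJECT)
        s->nb_features++;
}
```

Calling setup_features() before setup_extensions() sees a bitfield of 0 and adds nothing, which matches the symptom in the commit message; swapping the two calls adds both features.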
c3cbaf39bb hwcontext_vulkan: don't enable deprecated VK_KHR_sampler_ycbcr_conversion extension
It was added to Vulkan 1.1 a long time ago.
Validation layer will warn if this is enabled.
2024-08-16 01:22:15 +02:00
3f65d24075 hwcontext_vulkan: fix user layers, add support for different debug modes
The validation layer option only supported GPU-assisted validation.
This is mutually exclusive with shader debug printfs, so we need to
differentiate between the two.

This also fixes issues with user-given layers, and leaks in case of
errors.
2024-08-16 01:22:14 +02:00
e25667f9f1 hwcontext_vulkan: ignore false positive validation errors
Issue ref:
https://github.com/KhronosGroup/Vulkan-ValidationLayers/issues/6627
2024-08-11 05:13:18 +02:00
ef11a6456d hwcontext_vulkan: do not chain structs of unsupported extensions in vkCreateDevice
Fixes:

vkCreateDevice(): pCreateInfo->pNext<VkPhysicalDeviceOpticalFlowFeaturesNV> includes a
pointer to a VkPhysicalDeviceOpticalFlowFeaturesNV, but when creating VkDevice, the
parent extension (VK_NV_optical_flow) was not included in ppEnabledExtensionNames.
The Vulkan spec states: Each pNext member of any structure (including this one) in
the pNext chain must be either NULL or a pointer to a valid struct for extending
VkDeviceCreateInfo.
2024-08-11 05:13:17 +02:00
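The rule behind ef11a6456d can be sketched without the Vulkan headers: a feature struct is linked into the pNext chain only when its parent extension is enabled. BaseStruct stands in for any Vulkan struct that begins with sType and pNext, and chain_if_enabled and chain_length are hypothetical helpers, not FFmpeg functions.

```c
#include <stddef.h>

/* Stand-in for a Vulkan struct header (sType + pNext). */
typedef struct BaseStruct {
    int                type; /* stand-in for sType */
    struct BaseStruct *next; /* stand-in for pNext */
} BaseStruct;

/* Prepend `node` to the chain at *head, but only if its extension is enabled. */
static void chain_if_enabled(BaseStruct **head, BaseStruct *node, int ext_enabled)
{
    if (!ext_enabled)
        return;
    node->next = *head;
    *head      = node;
}

/* Count the structs currently linked into the chain. */
static int chain_length(const BaseStruct *head)
{
    int n = 0;
    for (; head; head = head->next)
        n++;
    return n;
}
```

A struct whose extension is disabled never reaches the chain, so the validation layer has nothing to complain about when the device is created.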
5f0f1f7b7a libavutil: deprecate the old Vulkan queue API, add doc/APIchanges entries
2024-08-11 05:13:15 +02:00
2ce0e51503 hwcontext_vulkan: add support for Vulkan encoding
2024-08-11 05:13:14 +02:00
55adcb4fc5 hwcontext_vulkan: add support for VK_EXT_shader_object
We'd like to use it eventually, and it's already covered by
the minimum version of the headers we require.
2024-08-11 05:13:13 +02:00
c19af16f8d hwcontext_vulkan: enable storageBuffer16BitAccess if available
2024-08-11 05:13:12 +02:00
957d34784a hwcontext_vulkan: constify validation layer features table
The struct data seem to get corrupted otherwise.
Possibly a validation layer or libvulkan issue.
2024-08-11 05:13:11 +02:00
9e606b33a8 hwcontext_vulkan: add HOST_CACHED flag to transfer buffer
Significantly speeds up downloads on devices without host mapping.
2024-08-11 05:13:11 +02:00
aea4d4b423 hwcontext_vulkan: rewrite upload/download
This commit was long overdue. The old transfer dubiously tried to
merge as much code as possible, and had very little in the way
of optimizations, apart from basic host-mapping.

The new code uses buffer pools for any temporary buffers, and
handles falling back to buffer-based uploads if host-mapping fails.

Roundtrip performance difference:
ffmpeg -init_hw_device "vulkan=vk:0,debug=0,disable_multiplane=1" -f lavfi \
-i color=red:s=3840x2160 -vf hwupload,hwdownload,format=yuv420p -f null -

7900XTX:
Before: 224fps
After: 502fps

Ada, with proprietary drivers:
Before: 29fps
After: 54fps

Alder Lake:
Before: 85fps
After: 108fps

With the host-mapping codepath disabled:
Before: 32fps
After: 51fps
2024-08-11 05:13:11 +02:00
81c5d4ea0e hwcontext_vulkan: remove unused struct
2024-08-11 05:13:10 +02:00
a30b7c0158 hwcontext_vulkan: initialize optical flow queues if available
Lets us implement FPS conversion.
2024-08-11 05:13:10 +02:00
8790a30882 hwcontext_vulkan: rewrite queue picking system for the new API
This allows us to support different video ops on different queues,
as well as any other arbitrary queues we need.
2024-08-11 05:13:09 +02:00
13489c8a21 hwcontext_vulkan: add a new mechanism to expose used queue families
The issue with the old mechanism is that we had to introduce new
API each time we needed a new queue family, and all the queue families
were functionally fixed to a given purpose.

Nvidia's GPUs are able to handle video encoding and compute on the
same queue, which results in a speedup when pre-processing is required.

Also, this enables us to expose optical flow queues for frame interpolation.
2024-08-11 05:13:03 +02:00
a4630d479a lavu/hwcontext_vulkan: Support write on drm frame
Otherwise nothing is written into the destination when a write mapping
is requested.

For example, when a vulkan frame mapped from a drm frame (which is wrapped
as a vaapi frame in the example) is used as the output of the scale_vulkan
filter, the result is always a green screen without this patch.

ffmpeg -init_hw_device vaapi=va -init_hw_device vulkan=vulkan@va \
-filter_hw_device vulkan -f lavfi -i testsrc=size=352x288,format=nv12 \
-vf "hwupload,scale_vulkan,hwmap=derive_device=vaapi:reverse=1,format=vaapi,hwdownload,format=nv12" \
-f nut - | ffplay -

Signed-off-by: Haihao Xiang <haihao.xiang@intel.com>
2024-06-12 01:53:18 +02:00
790f793844 avutil/common: Don't auto-include mem.h
There are lots of files that don't need it: The number of object
files that actually need it went down from 2011 to 884 here.

Keep it for external users in order to not cause breakages.

Also improve the other headers a bit while just at it.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2024-03-31 00:08:43 +01:00
ecdc94b97f vulkan_av1: port to the new stable API
Co-Authored-by: Dave Airlie <airlied@redhat.com>
2024-03-25 08:54:40 +01:00
d296c8689d lavu/hwcontext_vulkan: check PCI ID if possible
Otherwise the derived device and the source device might have different
PCI IDs in a multi-device system.

Reviewed-by: Lynne <dev@lynne.ee>
Signed-off-by: Haihao Xiang <haihao.xiang@intel.com>
2024-03-19 09:37:39 +08:00
3e669b24e2 avutil/hwcontext: Allocate AVHWFramesCtx jointly with its internals
This is possible because the lifetimes of these structures coincide.
It has the advantage of allowing AVHWFramesInternal to be removed
from the public header; given that AVHWFramesInternal.priv is no more,
most accesses to AVHWFramesInternal are no more; indeed, the only
field accessed of it outside of hwcontext.c is the internal frame pool,
making this commit very simple.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2024-03-07 08:53:31 -03:00
e70e9b6554 avutil/hwcontext_vulkan: Allocate pub and priv frames hwctx together
This is possible because the lifetime of both coincide.
Besides reducing the number of allocations this also simplifies
access to VulkanFramesPriv as one no longer has to
go through AVHWFramesInternal.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2024-03-01 18:48:06 +01:00
2d63379cae avutil/hwcontext_vulkan: Allocate public and priv device hwctx together
This is possible because the lifetime of both coincide.
Besides reducing the number of allocations this also simplifies
access to VulkanDevicePriv as one no longer has to
go through AVHWDeviceInternal.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2024-03-01 18:48:06 +01:00
74e27d9e31 avutil/hwcontext_vulkan: Fix memleaks when transfer to vulkan
Without ff_vk_exec_discard_deps which is called by ff_vk_exec_wait,
the reference count of hwframe context cannot reach zero due to
circular reference created by ff_vk_exec_add_dep_frame.

Fix #10873

Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
2024-03-01 17:22:14 +08:00
03275b0f09 avutil/hwcontext_vulkan: Fix leaks in map_from_drm
Also simplify error handling.

Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
2024-03-01 17:20:29 +08:00
6f9730cb28 avutil/hwcontext_vulkan: Fix leaks when semaphore creation fails
Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
2024-03-01 17:20:21 +08:00
185871fdd3 hwcontext_vulkan: guard unistd.h include
win32 typically doesn't have unistd.h, so always including it will break
MSVC builds. The usage of those POSIX functions is already guarded by
_WIN32, so use that to guard unistd.h include as well.
2023-12-11 16:36:56 +01:00
c07ed10b0e apply spelling fixes
Fix spelling issues as reported by Debian's lintian tool:
accomodate -> accommodate
addtional -> additional
auxillary -> auxiliary
bellow -> below
betweeen -> between
Calulate -> Calculate
coefficents -> coefficients
Defalt -> Default
defaul -> default
higer -> higher
neccesary -> necessary
orignal -> original
ouput -> output
precison -> precision
processsing -> processing
substract -> subtract
Transfered -> Transferred
upto -> up to

Also add several of them to the 'common typos' check in patcheck.

Signed-off-by: Diederik de Haas <didi.debian@cknow.org>
2023-11-18 19:55:42 +01:00
854012ec59 avutil/hwcontext_vulkan: get VkFormatFeatureFlagBits2
Rather than the VkFormatFeatureFlagBits enum

Signed-off-by: Víctor Manuel Jáquez Leal <vjaquez@igalia.com>
2023-11-09 09:13:47 +01:00
6f39dee974 avutil/hwcontext_vulkan: fix run on macOS
VK_KHR_PORTABILITY_ENUMERATION_EXTENSION_NAME is required on macOS,
and VK_INSTANCE_CREATE_ENUMERATE_PORTABILITY_BIT_KHR flag should
be set.

Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
2023-11-09 19:23:01 +08:00
63078b4599 avutil/hwcontext_vulkan: cuda doesn't belong to valid_sw_formats
Move it to transfer_get_formats.

Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
2023-10-29 13:58:30 +08:00
891f70c6d5 avutil/hwcontext_vulkan: fix memleak when device_create is skipped
Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
2023-10-29 13:57:43 +08:00
c258623c0a hwcontext_vulkan: improve queue family init code
When users zero-init'd the struct, or left it as-is, the encode
queue family matched the graphics queue family, which led it to be
incorrectly logged as being used for encode.

This just improves the logging so this isn't printed anymore.
2023-10-24 06:07:09 +02:00
81cc0e1345 hwcontext_vulkan: properly support STORAGE usage for multiplane images
Fixes multiplane support on Nvidia.

Also, remove the ENCODE usage, even if the driver signals it as supported.
Currently, it's not used, and when it is used, it'll be gated behind
two extension checks.
2023-10-05 23:50:30 +02:00
dfac782b13 avutil/hwcontext_vulkan: Cosmetics
The alignment in vulkan_unmap_from_drm() (formerly the clone
of vulkan_frame_free()) is nicer than the one in vulkan_frame_free(),
let's preserve it.

Reviewed-by: Lynne <dev@lynne.ee>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2023-09-15 02:38:14 +02:00
677635cd04 avutil/hwcontext_vulkan: Deduplicate code
Reviewed-by: Lynne <dev@lynne.ee>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2023-09-15 02:38:05 +02:00
47b1c0d0db avutil/hwcontext_vulkan: Improve type-safety
The AVBuffer API uses uint8_t as base type for buffers
and therefore its free callbacks need to abide by this.
Therefore vulkan_frame_free() used an inappropriate signature
which caused casts whenever this function has been called
manually.

This commit changes this by making vulkan_frame_free()
use the proper type and a vulkan_frame_free_cb() that
is used as free callback for the AVBuffer API.

Reviewed-by: Lynne <dev@lynne.ee>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2023-09-15 02:37:56 +02:00
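The split described in 47b1c0d0db can be sketched as follows. AVBuffer free callbacks have the signature void (*)(void *opaque, uint8_t *data); the rest is a simplified stand-in: Frame, frame_free, frame_free_cb and the frames_freed counter are hypothetical and do not reproduce the real vulkan_frame_free() code.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for the frame type being freed. */
typedef struct Frame {
    int nb_planes;
} Frame;

/* Counter kept only so the sketch has something observable. */
static int frames_freed;

/* Typed free function: direct callers pass a Frame * and need no casts. */
static void frame_free(void *ctx, Frame *f)
{
    (void)ctx; /* unused in this sketch */
    free(f);
    frames_freed++;
}

/* Thin wrapper with the uint8_t-based signature the buffer API expects. */
static void frame_free_cb(void *opaque, uint8_t *data)
{
    frame_free(opaque, (Frame *)data);
}
```

Only the wrapper performs a cast, and it does so in one place; every manual call goes through the typed function.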
a6bd2ee759 avutil/hwcontext_vulkan: Remove redundant resetting
vulkan_free_internal() already resets the AVVkFrame.internal
pointer.

Reviewed-by: Lynne <dev@lynne.ee>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2023-09-15 02:37:36 +02:00
358919506d vulkan: enable VK_KHR_cooperative_matrix
It's of interest to API users, and of interest to us,
as a DCT/DST can be implemented via matrix multiplies.
2023-08-26 23:14:53 +02:00
f0b1cab538 hwcontext_vulkan: always use create_pnext in vulkan_pool_alloc
Currently, create_pnext is only used if an applicable external memory
extension is enabled. This will usually be the case when used from the command
line, but may not be when the Vulkan context is created manually.

For images used in video decoding, create_pnext contains the video profile
list, which is mandatory.[1] This fixes a GPU crash when using RADV.

[1] https://registry.khronos.org/vulkan/specs/1.3-extensions/man/html/VkImageCreateInfo.html#VUID-VkImageCreateInfo-usage-04815

Signed-off-by: Chris Spencer <spencercw@gmail.com>
2023-08-20 22:47:09 +02:00
e6bd8b1323 hwcontext_vulkan: hide Linux-only header after 571756bf2f
major/minor are in <sys/types.h> on BSDs and <sys/mkdev.h> on Solaris-like.

libavutil/hwcontext_vulkan.c:55:10: fatal error: 'sys/sysmacros.h' file not found
#include <sys/sysmacros.h>
^~~~~~~~~~~~~~~~~
2023-07-21 20:04:10 +02:00
d0f1d937fe hwcontext_vulkan: free temporary array once unneeded
Fixes a small memory leak.
This also prevents leaks on malloc/mutex init errors.
2023-06-15 22:00:41 +02:00
b4d5baa8b0 hwcontext_vulkan: call ff_vk_uninit() on device uninit
This fixes three memory leaks from ff_vk_load_props().
2023-06-15 22:00:41 +02:00
eff565dc19 hwcontext_vulkan: tune execution pools
Having less in-flight resources is better in this case.
2023-06-07 23:59:17 +02:00
378fb40282 avutil/hwcontext_vulkan: disable multiplane when deriving from cuda
Today, cuda is not able to import multiplane images, and cuda requires
images to be imported whether you are trying to import to cuda or export
from cuda (in the latter case, the image is imported and then copied
into on the cuda side). So any interop between cuda and vulkan requires
that multiplane be disabled.

The existing option for this is not sufficient, because when deriving
devices it is not possible to specify any options.

And, it is necessary to derive the Vulkan device, because any pipeline
that involves uploading from cuda to vulkan and then back to cuda must
use the same cuda context on both sides, and the only way to propagate
the cuda context all the way through is to derive the device at each
stage.

ie:

-vf hwupload=derive_device=vulkan,<filters>,hwupload=derive_device=cuda
2023-06-03 16:29:38 -07:00
dfff3877b7 vulkan: add support for the atomic float ops extension
2023-05-29 00:42:01 +02:00
77478f6793 av1dec: add Vulkan hwaccel
2023-05-29 00:42:00 +02:00