1
0
mirror of https://github.com/FFmpeg/FFmpeg.git synced 2025-02-04 06:08:26 +02:00

194 Commits

Author SHA1 Message Date
Lynne
ac092c6707
hwcontext_vulkan: guard all uses of new spec defines and fix stray bracket
This fixes compilation with less recent Vulkan headers.
2024-10-04 10:41:03 +02:00
Lynne
a304cbeb8d
vulkan: add profiling debug setting
This simply keeps all shader optimizations, but allows debug
data to be generated.
2024-10-04 10:10:46 +02:00
Lynne
356d1cc8ff
vulkan: parse instance list and add the DEBUG_UTILS extension
Required to let users know whether debugging is active.
2024-10-04 10:10:44 +02:00
Lynne
e3676d96cb
hwcontext_vulkan: move device feature struct setup to a new function 2024-10-04 10:10:43 +02:00
Lynne
535e5eb7f3
hwcontext_vulkan: enable VK_KHR_shader_relaxed_extended_instruction 2024-10-04 10:10:43 +02:00
Lynne
d80f9f55c8
vulkan: always enable GL_EXT_scalar_block_layout
This makes std430 (which we use everywhere already) fully match C
layout.
Extension was made mandatory in 1.2.
2024-10-04 10:10:42 +02:00
Lynne
37d5cb84e8
vulkan: check if current buffer has finished execution before picking another
This saves resources, as dependencies are freed/reclaimed with a lower latency,
and provies a speedup.
2024-10-04 10:10:42 +02:00
Lynne
bc36fe6f1f
vulkan: use push descriptors where possible
Push descriptors are in theory slightly faster, but come with
limitations for which we have to check.

Either way, they're not difficult to implement, so even though
no one should be using peasant-tier descriptors, do it anyway.
2024-09-23 13:41:07 +02:00
Lynne
8a7af4aa49
vulkan: add support for regular descriptor pools
This permits:
 - The use of Vulkan filtering on many more devices
 - Better debugging due to lack of descriptor buffer support in layers

Much of the changes here are due to a requirement that updates to
descriptors must happen between the command buffer being waited on,
and the pipeline not being bound.

We routinely did it the other way around, by updating only after
we bind the pipeline.
2024-09-23 13:40:38 +02:00
Lynne
2942ca802a
hwcontext_vulkan: forward debug_mode to check_extensions() for devices
This allows disabling of certain extensions when debug mode is turned on.
2024-09-23 13:40:37 +02:00
Lynne
b5184c5d45
hwcontext_vulkan: add the PROFILE_INDEPENDENT only when needed 2024-09-23 13:40:36 +02:00
Lynne
a577d313b2
hwcontext_vulkan: add support for implicit DRM synchronization
More recent kernel versions allow for users to extract a sync_file
handle from a DMA-BUF, which can then be imported into Vulkan as a
binary semaphore.

This finally allows for synchronization between Vulkan and DMA-BUF
images, such as those from screen capture software, or VAAPI,
avoiding any corruption artifacts.

This is done fully asynchronously, where we use the kernel's
given binary semaphores as a dependency to increment the image's
usual VkSemaphores we allocate. The old imported binary semaphores
are cleaned up after execution as usual.

In the future, hwcontext_drm should receive support for explicitly
synchronized images as well, which would make the synchronization
more robust and portable.
2024-09-22 02:11:08 +02:00
Lynne
1445102e68
hwcontext_vulkan: fix VUID-VkPhysicalDeviceImageFormatInfo2-usage-requiredbitmask
fmt_props.usage was initialized to 0 as create_info.usage was set later.
2024-09-22 02:11:04 +02:00
Lynne
8875da347a
hwcontext_vulkan: consider encode DBPs as always independent 2024-09-18 05:59:46 +02:00
Lynne
32d47166c8
hwcontext_vulkan: fix p->img_qfs
The array was tied to our old queue API, which meant that if users
set it, it was never set.
2024-09-18 05:59:41 +02:00
Lynne
0fa6f33875
hwcontext_vulkan: add support for x2bgr10 and proper DRM mappings for 10-bit images
This allows mapping of 10-bit DRM images.
2024-09-16 14:03:56 +02:00
Lynne
3415e0533f
hwcontext_vulkan: disable more false positive validation checks
Both of these are fundamentally incompatible with video decoding.

The third one can be tracked in the following validation layer issue:
https://github.com/KhronosGroup/Vulkan-ValidationLayers/issues/6627
2024-09-09 07:05:46 +02:00
Lynne
0ffa967170
hwcontext_vulkan: ask for storage images by default
The issue is that we ask for storage images by default if
available, but because that is gated by the format supporting
storage images, and the check for the format supporting storage
images is gated by the usage, this resulted in a catch-22.
2024-09-09 07:05:45 +02:00
Lynne
c41ef7f2ff
hwcontext_vulkan: add PREP_MODE_GENERAL for non-transfer_dst images
Vulkan filters don't need images which can be transferred into.
2024-09-09 07:05:44 +02:00
Lynne
604dfdb44c
hwcontext_vulkan: align host mapping size to minImportedHostPointerAlignment
This was left out of the recent rewrite of the system.
2024-08-16 01:22:16 +02:00
Lynne
18d964fc2c
vulkan: enable encoding of images if video_maintenance1 is enabled
Vulkan encoding was designed in a very... consolidated way.
You had to know the exact codec and profile that the image was going to
eventually be encoded as at... image creation time. Unfortunately, as good
as our code is, glimpsing into the exact future isn't what its capable of.

video_maintenance1 removed that requirement, which only then made encoding
images practically possible.
2024-08-16 01:22:16 +02:00
Lynne
46c13834b6
hwcontext_vulkan: enable VK_KHR_video_maintenance1
We require it for encoding.
2024-08-16 01:22:15 +02:00
Lynne
97e947a2a7
hwcontext_vulkan: setup extensions before features
The issue is that enabling features requires that the device
extension is supported. The extensions bitfield was set later,
so it was always 0, leading to no features being added.
2024-08-16 01:22:15 +02:00
Lynne
c3cbaf39bb
hwcontext_vulkan: don't enable deprecated VK_KHR_sampler_ycbcr_conversion extension
It was added to Vulkan 1.1 a long time ago.
Validation layer will warn if this is enabled.
2024-08-16 01:22:15 +02:00
Lynne
3f65d24075
hwcontext_vulkan: fix user layers, add support for different debug modes
The validation layer option only supported GPU-assisted validation.
This is mutually exclusive with shader debug printfs, so we need to
differentiate between the two.

This also fixes issues with user-given layers, and leaks in case of
errors.
2024-08-16 01:22:14 +02:00
Lynne
e25667f9f1
hwcontext_vulkan: ignore false positive validation errors
Issue ref:
https://github.com/KhronosGroup/Vulkan-ValidationLayers/issues/6627
2024-08-11 05:13:18 +02:00
Lynne
ef11a6456d
hwcontext_vulkan: do not chain structs of unsupported extensions in vkCreateDevice
Fixes:

vkCreateDevice(): pCreateInfo->pNext<VkPhysicalDeviceOpticalFlowFeaturesNV> includes a
pointer to a VkPhysicalDeviceOpticalFlowFeaturesNV, but when creating VkDevice, the
parent extension (VK_NV_optical_flow) was not included in ppEnabledExtensionNames.
The Vulkan spec states: Each pNext member of any structure (including this one) in
the pNext chain must be either NULL or a pointer to a valid struct for extending
VkDeviceCreateInfo.
2024-08-11 05:13:17 +02:00
Lynne
5f0f1f7b7a
libavutil: deprecate the old Vulkan queue API, add doc/APIchanges entries 2024-08-11 05:13:15 +02:00
Lynne
2ce0e51503
hwcontext_vulkan: add support for Vulkan encoding 2024-08-11 05:13:14 +02:00
Lynne
55adcb4fc5
hwcontext_vulkan: add support for VK_EXT_shader_object
We'd like to use it eventually, and its already covered by
the minimum version of the headers we require.
2024-08-11 05:13:13 +02:00
Lynne
c19af16f8d
hwcontext_vulkan: enable storageBuffer16BitAccess if available 2024-08-11 05:13:12 +02:00
Lynne
957d34784a
hwcontext_vulkan: constify validation layer features table
The struct data seem to get corrupted otherwise.
Possibly a validation layer or libvulkan issue.
2024-08-11 05:13:11 +02:00
Lynne
9e606b33a8
hwcontext_vulkan: add HOST_CACHED flag to transfer buffer
Significantly speeds up downloads on devices without host mapping.
2024-08-11 05:13:11 +02:00
Lynne
aea4d4b423
hwcontext_vulkan: rewrite upload/download
This commit was long overdue. The old transfer dubiously tried to
merge as much code as possible, and had very little in the way
of optimizations, apart from basic host-mapping.

The new code uses buffer pools for any temporary bufflers, and
handles falling back to buffer-based uploads if host-mapping fails.

Roundtrip performance difference:
ffmpeg -init_hw_device "vulkan=vk:0,debug=0,disable_multiplane=1" -f lavfi \
-i color=red:s=3840x2160 -vf hwupload,hwdownload,format=yuv420p -f null -

7900XTX:
Before: 224fps
After: 502fps

Ada, with proprietary drivers:
Before: 29fps
After: 54fps

Alder Lake:
Before: 85fps
After: 108fps

With the host-mapping codepath disabled:
Before: 32fps
After: 51fps
2024-08-11 05:13:11 +02:00
Lynne
81c5d4ea0e
hwcontext_vulkan: remove unused struct 2024-08-11 05:13:10 +02:00
Lynne
a30b7c0158
hwcontext_vulkan: initialize optical flow queues if available
Lets us implement FPS conversion.
2024-08-11 05:13:10 +02:00
Lynne
8790a30882
hwcontext_vulkan: rewrite queue picking system for the new API
This allows us to support different video ops on different queues,
as well as any other arbitrary queues we need.
2024-08-11 05:13:09 +02:00
Lynne
13489c8a21
hwcontext_vulkan: add a new mechanism to expose used queue families
The issue with the old mechanism is that we had to introduce new
API each time we needed a new queue family, and all the queue families
were functionally fixed to a given purpose.

Nvidia's GPUs are able to handle video encoding and compute on the
same queue, which results in a speedup when pre-processing is required.

Also, this enables us to expose optical flow queues for frame interpolation.
2024-08-11 05:13:03 +02:00
Haihao Xiang
a4630d479a lavu/hwcontext_vulkan: Support write on drm frame
Otherwise nothing is written into the destination when a write mapping
is requested.

For example, a vulkan frame mapped from a drm frame (which is wrapped as
a vaapi frame in the example) is used as the output of scale_vulkan
filter, it always gets a green screen without this patch.

ffmpeg -init_hw_device vaapi=va -init_hw_device vulkan=vulkan@va
-filter_hw_device vulkan -f lavfi -i testsrc=size=352x288,format=nv12
-vf
"hwupload,scale_vulkan,hwmap=derive_device=vaapi:reverse=1,format=vaapi,hwdownload,format=nv12"
-f nut - | ffplay -

Signed-off-by: Haihao Xiang <haihao.xiang@intel.com>
2024-06-12 01:53:18 +02:00
Andreas Rheinhardt
790f793844 avutil/common: Don't auto-include mem.h
There are lots of files that don't need it: The number of object
files that actually need it went down from 2011 to 884 here.

Keep it for external users in order to not cause breakages.

Also improve the other headers a bit while just at it.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2024-03-31 00:08:43 +01:00
Lynne
ecdc94b97f
vulkan_av1: port to the new stable API
Co-Authored-by: Dave Airlie <airlied@redhat.com>
2024-03-25 08:54:40 +01:00
Haihao Xiang
d296c8689d lavu/hwcontext_vulkan: check PCI ID if possible
Otherwise the derived device and the source device might have different
PCI ID in a multiple-device system.

Reviewed-by: Lynne <dev@lynne.ee>
Signed-off-by: Haihao Xiang <haihao.xiang@intel.com>
2024-03-19 09:37:39 +08:00
Andreas Rheinhardt
3e669b24e2 avutil/hwcontext: Allocate AVHWFramesCtx jointly with its internals
This is possible because the lifetime of these structures coincide.
It has the advantage of allowing to remove AVHWFramesInternal
from the public header; given that AVHWFramesInternal.priv is no more,
most accesses to AVHWFramesInternal are no more; indeed, the only
field accessed of it outside of hwcontext.c is the internal frame pool,
making this commit very simple.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2024-03-07 08:53:31 -03:00
Andreas Rheinhardt
e70e9b6554 avutil/hwcontext_vulkan: Allocate pub and priv frames hwctx together
This is possible because the lifetime of both coincide.
Besides reducing the number of allocations this also simplifies
access to VulkanFramesPriv as one no longer has to
go through AVHWFramesInternal.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2024-03-01 18:48:06 +01:00
Andreas Rheinhardt
2d63379cae avutil/hwcontext_vulkan: Allocate public and priv device hwctx together
This is possible because the lifetime of both coincide.
Besides reducing the number of allocations this also simplifies
access to VulkanDevicePriv as one no longer has to
go through AVHWDeviceInternal.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2024-03-01 18:48:06 +01:00
Zhao Zhili
74e27d9e31 avutil/hwcontext_vulkan: Fix memleaks when transfer to vulkan
Without ff_vk_exec_discard_deps which is called by ff_vk_exec_wait,
the reference count of hwframe context cannot reach zero due to
circular reference created by ff_vk_exec_add_dep_frame.

Fix #10873

Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
2024-03-01 17:22:14 +08:00
Zhao Zhili
03275b0f09 avutil/hwcontext_vulkan: Fix leaks in map_from_drm
Also simplify error handing.

Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
2024-03-01 17:20:29 +08:00
Zhao Zhili
6f9730cb28 avutil/hwcontext_vulkan: Fix leaks when semaphore creation fails
Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
2024-03-01 17:20:21 +08:00
Benjamin Cheng
185871fdd3 hwcontext_vulkan: guard unistd.h include
win32 typically doesn't have unistd.h, so always including it will break
MSVC builds. The usage of those POSIX functions are already guarded by
_WIN32, so use that to guard unistd.h include as well.
2023-12-11 16:36:56 +01:00
Diederik de Haas via ffmpeg-devel
c07ed10b0e apply spelling fixes
Fix spelling issue as reported by Debian's lintian tool:
accomodate -> accommodate
addtional -> additional
auxillary -> auxiliary
bellow -> below
betweeen -> between
Calulate -> Calculate
coefficents -> coefficients
Defalt -> Default
defaul -> default
higer -> higher
neccesary -> necessary
orignal -> original
ouput -> output
precison -> precision
processsing -> processing
substract -> subtract
Transfered -> Transferred
upto -> up to

Also add several of them to the 'common typos' check in patcheck.

Signed-off-by: Diederik de Haas <didi.debian@cknow.org>
2023-11-18 19:55:42 +01:00