The issue is that vulkan_device_create_internal() is only called for
devices that lavu creates by itself.
For external devices, this was never done.
This also solves some mid-function declaration warnings.
This feature fundamentally relies on host-visible VRAM, which restricts the
set of available memory types to (typically) host-visible device-local ones.
When resizable BAR is disabled, this memory type is usually limited to
e.g. 256 MiB in size, which is just plain insufficient for allocation of
general purpose GPU images, causing OOM errors on even the simplest of
commands.
The easiest solution is to disable host transfers entirely on machines
without host-addressable VRAM. In theory, we could try and recover the use
of host transfers for images which are *not* restricted to device-local
memory types, but this is rarely the case in practice, and the effort
required would exceed the benefit, especially since ReBAR is a standard
feature on all platforms recent enough to have Vulkan drivers, and only
occasionally disabled in the UEFI for by default for some hare-brained
notion of "backwards compatibiility" with ancient software.
On NVIDIA, there's a global maximum limit of approximately 112 queues,
which means it takes ONLY 7 total programs using the maximum amount of
queues to cause the driver to error out/*segfault* during initialization.
Also, each queue takes about 30ms to allocate, which quickly adds up.
This reduces the queues allocate to the minimum that we would be happy
with. Its not worth limiting decode/encode queues as they're generally
not a lot, and do help.
In the call to vkGetPhysicalDeviceImageFormatProperties2(), we were
previously requesting the properties of the first fallback format (e.g.
VK_FORMAT_R8_UNORM for VK_FORMAT_G8_B8R8_2PLANE_420_UNORM) instead of
the actual format in use.
We don’t do anything with it afterwards, but there is no reason to keep
querying the wrong format.
We violated the spec, which, despite the actual command buffer pool
*not* being involved in any functions which require external synchronization
of the pool, *require* external synchronization even if only the
command buffers are used.
This also has the effect of *significantly* speeding up execution
in case command buffers are contended.
This commit adds support for compiler hints.
While on AMD these are not used/needed, Nvidia benefits from them, and gives
a sizeable 10% speedup on 4k.
This commit uses the recently exported code for host mapping images back
where it was exported from.
The function also had broken download code for image downloading since its
recent refactor.
This patch refactors the CUDA import code to allow for Vulkan images
with multiple planes to be mapped.
Currently, a driver bug exists which causes NV12 images to be mapped
incorrectly when the memory being mapped contains both planes, the
issue has been reported to NVIDIA.
yuv420p does work correctly, however.
This is still an improvement, as the code used to crash when trying to
map the memory, unless disable_multiplane=1 was given as an option.
BGR formats in Vulkan cannot be used in storage images, as the
pixel labels on storage images are always ordered as RGB, and
swizzling is not an option due to old hardware limitations.
This means that you must always use an RGB format and manually
swizzle when reading or writing to BGR images, or simply not use
a format in the shader itself.
This adds support for the latter.
lavapipe recently added support for external_semaphore_fd, but only for syncfiles,
not for opaque file descriptors.
The code is written to allow using syncfiles later on.
Ref: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12422
The issue is that some compilers complain if a struct or array
is empty.
This extension does nothing by default, and can be useful, so just add it
to keep the array non-empty.
We recently introduced a public field which was a superset
of the queue context we used to have.
Switch to using it entirely.
This also allows us to get rid of the NIH function which was
valid only for video queues.
This function dates back a long time ago, before vkfmt_from_pixfmt2.
When it was converted over, the thought was that this was far too
restrictive to demand storage images for each format.
With the new clever function, it makes sure to check that the compatible
subformats a format can be used as support storage capabilities.
This gets rid of fake support for RGB48/RGB96 which some implementations
offer but don't support using as storage images.