We violated the spec, which *requires* external synchronization of the
command buffer pool even if only its command buffers are used, despite
the pool itself *not* being passed to any function which requires
external synchronization of the pool.
This also has the effect of *significantly* speeding up execution
in case command buffers are contended.
This commit adds support for compiler hints.
While these are not used or needed on AMD, Nvidia benefits from them,
with a sizeable 10% speedup at 4k.
This commit uses the recently exported code for host-mapping images back
in the place it was exported from.
The function's image download path had also been broken since its
recent refactor.
This patch refactors the CUDA import code to allow for Vulkan images
with multiple planes to be mapped.
Currently, a driver bug exists which causes NV12 images to be mapped
incorrectly when the memory being mapped contains both planes. The
issue has been reported to NVIDIA.
yuv420p does work correctly, however.
This is still an improvement, as the code used to crash when trying to
map the memory, unless disable_multiplane=1 was given as an option.
BGR formats in Vulkan cannot be used in storage images, as the
pixel labels on storage images are always ordered as RGB, and
swizzling is not an option due to old hardware limitations.
This means that you must always use an RGB format and manually
swizzle when reading or writing to BGR images, or simply not use
a format in the shader itself.
This adds support for the latter.
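
For illustration only, the manual-swizzle alternative mentioned above
amounts to swapping the first and third channel of every pixel. Below is
a minimal sketch in C, assuming packed 8-bit, 4-channel pixels; the name
swizzle_rb is hypothetical, and a real shader would do this per texel on
the GPU rather than on the host:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: swap channels 0 and 2 of each 4-byte pixel in place.
 * Applied to RGBA data this yields BGRA, and vice versa. */
static void swizzle_rb(uint8_t *pixels, size_t nb_pixels)
{
    for (size_t i = 0; i < nb_pixels; i++) {
        uint8_t *p  = pixels + 4 * i;
        uint8_t tmp = p[0];
        p[0] = p[2];
        p[2] = tmp;
    }
}
```

Not using a format in the shader avoids this extra step entirely, which
is why this commit takes that route.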
lavapipe recently added support for external_semaphore_fd, but only for syncfiles,
not for opaque file descriptors.
The code is written to allow using syncfiles later on.
Ref: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12422
The issue is that some compilers complain if a struct or array
is empty.
This extension does nothing by default, and can be useful, so just add it
to keep the array non-empty.
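
A rough sketch of the pattern, assuming the array is a list of extension
names where every other entry may be compiled out. The extension strings
and the HAVE_OPTIONAL_EXT switch are placeholders, not real identifiers:

```c
#include <stddef.h>

/* Hypothetical feature switch, not a real build flag */
#ifndef HAVE_OPTIONAL_EXT
#define HAVE_OPTIONAL_EXT 0
#endif

static const char *const wanted_exts[] = {
    /* Always listed: harmless by default, and keeps the array non-empty
     * even when every conditional entry below is compiled out */
    "EXAMPLE_always_listed_extension",
#if HAVE_OPTIONAL_EXT
    "EXAMPLE_optional_extension",
#endif
};

/* Number of entries actually compiled into the array */
#define NB_WANTED_EXTS (sizeof(wanted_exts) / sizeof(wanted_exts[0]))
```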
We recently introduced a public field which was a superset
of the queue context we used to have.
Switch to using it entirely.
This also allows us to get rid of the NIH function which was
valid only for video queues.
This function dates back a long time, to before vkfmt_from_pixfmt2.
When it was converted over, the thought was that demanding storage
image support for each format was far too restrictive.
The new, clever function makes sure to check that the compatible
subformats a format can be used as support storage capabilities.
This gets rid of fake support for RGB48/RGB96 which some implementations
offer but don't support using as storage images.
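
The check described above can be sketched roughly as follows. This is
not the actual vkfmt_from_pixfmt2 code; the SubFormat type, the FEAT_*
bits and the function name are all hypothetical stand-ins:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical feature bits, standing in for the real format features */
#define FEAT_SAMPLED (1u << 0)
#define FEAT_STORAGE (1u << 1)

/* Hypothetical description of one compatible subformat */
typedef struct SubFormat {
    const char *name;
    uint32_t    features;
} SubFormat;

/* Returns true only if every subformat offers all required features */
static bool all_subformats_support(const SubFormat *sub, int nb_sub,
                                   uint32_t required)
{
    if (nb_sub <= 0)
        return false;
    for (int i = 0; i < nb_sub; i++)
        if ((sub[i].features & required) != required)
            return false;
    return true;
}
```

A format whose subformats are only sampleable is rejected when storage
is requested, instead of being advertised and failing later.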
We do uploads asynchronously, and we map the software frames in
order to avoid 2-stage copying. However, whilst we added a dependency
upon the mapped buffers, we did not add the original frame backing
those buffers as a dependency.
This caused issues on RADV, particularly with RGB images.
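
To illustrate the fix, here is a simplified model of dependency
tracking using plain reference counts. The Ref and ExecCtx types and the
exec_* functions are hypothetical and only mirror the idea, not the real
execution context API:

```c
#include <stddef.h>

#define MAX_DEPS 8

/* Hypothetical refcounted object (a mapped buffer or a frame) */
typedef struct Ref {
    int refcount;
} Ref;

/* Hypothetical execution context holding references until the GPU is done */
typedef struct ExecCtx {
    Ref *deps[MAX_DEPS];
    int  nb_deps;
} ExecCtx;

/* Take a reference which is held until the submission finishes */
static int exec_add_dep(ExecCtx *e, Ref *r)
{
    if (e->nb_deps >= MAX_DEPS)
        return -1;
    r->refcount++;
    e->deps[e->nb_deps++] = r;
    return 0;
}

/* Called once the asynchronous work has completed */
static void exec_finish(ExecCtx *e)
{
    for (int i = 0; i < e->nb_deps; i++)
        e->deps[i]->refcount--;
    e->nb_deps = 0;
}
```

With only the mapped buffers added, the frame backing them could be
released by the caller while the upload was still in flight. Adding the
frame as a dependency as well keeps it alive until exec_finish().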
Push descriptors are in theory slightly faster, but come with
limitations for which we have to check.
Either way, they're not difficult to implement, so even though
no one should be using peasant-tier descriptors, do it anyway.
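
A sketch of the kind of check involved, assuming the limitation is a
device-reported cap on the number of descriptors in a pushed set. The
DeviceCaps struct and the function name are hypothetical:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical summary of what the device reports */
typedef struct DeviceCaps {
    bool     has_push_descriptors;  /* extension available and enabled */
    uint32_t max_push_descriptors;  /* assumed cap on descriptors per set */
} DeviceCaps;

/* Use push descriptors only if supported and the set fits the cap;
 * otherwise the caller falls back to regular descriptor sets. */
static bool can_use_push_descriptors(const DeviceCaps *caps,
                                     uint32_t nb_descriptors)
{
    return caps->has_push_descriptors &&
           nb_descriptors <= caps->max_push_descriptors;
}
```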