I want to start adding more data layouts, like semiplanar formats (nv12), or
palette formats. I made an effort to distinguish existing checks for rw.packed
into "mode != PLANAR" and "mode == PACKED", based on the intent of the
surrounding code, in anticipation of these new layouts.
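
As a rough sketch of why the two checks stop being interchangeable once more
layouts exist (the enum and helper names below are hypothetical, not the
actual swscale definitions):

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical layout modes; names are illustrative only. */
enum layout_mode {
    MODE_PLANAR,     /* one component per plane */
    MODE_PACKED,     /* all components interleaved in a single plane */
    MODE_SEMIPLANAR, /* e.g. nv12: a luma plane plus an interleaved chroma plane */
    MODE_PALETTE,    /* indices into a color table */
    MODE_NB
};

/* For code that only cares whether the layout deviates from plain planar. */
static bool is_not_planar(enum layout_mode mode)
{
    assert(mode < MODE_NB);
    return mode != MODE_PLANAR;
}

/* For code that really requires every component to live in one plane. */
static bool is_fully_packed(enum layout_mode mode)
{
    assert(mode < MODE_NB);
    return mode == MODE_PACKED;
}
```

With only two modes both helpers agree; for a semiplanar layout the first
returns true and the second returns false.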
Sponsored-by: Sovereign Tech Fund
Signed-off-by: Niklas Haas <git@haasn.dev>
This is a minor cosmetic improvement that allows me to use more
convenient names for filter-related metadata fields, without
confusion.
Sponsored-by: Sovereign Tech Fund
Signed-off-by: Niklas Haas <git@haasn.dev>
The old x86 backend was the only backend that actually mutated the ops list.
With this gone, we can constify this parameter.
Signed-off-by: Niklas Haas <git@haasn.dev>
This allows constraining the set of available backends, and serves as a
better replacement for the somewhat ambiguous "unstable" flag. Users can,
for example, opt into the memcpy or x86 backends while excluding e.g. the
upcoming JIT backends.
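
A minimal sketch of the idea, assuming the constraint is expressed as a
bitmask (the flag names, struct and priority order are all hypothetical and
do not reflect the actual API):

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical backend identifiers, one bit each. */
enum {
    BACKEND_MEMCPY = 1 << 0,
    BACKEND_X86    = 1 << 1,
    BACKEND_JIT    = 1 << 2,
    BACKEND_ALL    = BACKEND_MEMCPY | BACKEND_X86 | BACKEND_JIT,
};

struct backend {
    const char *name;
    unsigned id;
};

/* Listed in (assumed) priority order. */
static const struct backend backends[] = {
    { "jit",    BACKEND_JIT    },
    { "x86",    BACKEND_X86    },
    { "memcpy", BACKEND_MEMCPY },
};

/* Return the first backend permitted by the mask, or NULL if none is. */
static const struct backend *pick_backend(unsigned allowed)
{
    assert((allowed & ~(unsigned)BACKEND_ALL) == 0); /* reject unknown bits */
    for (size_t i = 0; i < sizeof(backends) / sizeof(backends[0]); i++) {
        if (backends[i].id & allowed)
            return &backends[i];
    }
    return NULL;
}
```

Passing BACKEND_MEMCPY | BACKEND_X86 skips the JIT entry and selects x86 in
this sketch.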
Signed-off-by: Niklas Haas <git@haasn.dev>
The issue is that while Vulkan already does the decomposition for us,
swscale assumes that the pixels will be in bitstream order, rather than
in their decomposed form.
This is valid for all packed formats for which these instructions are
issued (XV30 and X2RGB10).
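
To illustrate the distinction between the two forms, here is a sketch for
X2RGB10, assuming the (msb) 2X 10R 10G 10B (lsb) layout within a 32-bit word.
The struct and helper names are hypothetical:

```c
#include <assert.h>
#include <stdint.h>

/* Decomposed form: one value per component. */
struct rgb10 {
    uint16_t r, g, b;
};

/* Bitstream form -> decomposed form. The two padding bits are dropped. */
static struct rgb10 unpack_x2rgb10(uint32_t word)
{
    struct rgb10 px = {
        .r = (word >> 20) & 0x3FF,
        .g = (word >> 10) & 0x3FF,
        .b =  word        & 0x3FF,
    };
    return px;
}

/* Decomposed form -> bitstream form, with the padding bits set to zero. */
static uint32_t pack_x2rgb10(struct rgb10 px)
{
    assert(px.r <= 0x3FF && px.g <= 0x3FF && px.b <= 0x3FF);
    return (uint32_t)px.r << 20 | (uint32_t)px.g << 10 | px.b;
}
```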
This allows us to support the formats in Vulkan.
Sponsored-by: Sovereign Tech Fund
Instead of implicitly testing for NaN values. This is mostly a straightforward
translation, but we need some slight extra boilerplate to ensure the mask
is correctly updated when e.g. commuting past a swizzle.
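
A sketch of the kind of boilerplate meant here, assuming the mask is a 4-bit
field with one bit per component and that a swizzle is described by
order[i] = source component of output i. Both representations are
assumptions, and the direction of the update depends on which way the op is
being commuted:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Permute a per-component mask through a swizzle: output component i is
 * flagged if and only if its source component order[i] was flagged.
 */
static uint8_t swizzle_mask(uint8_t mask, const uint8_t order[4])
{
    uint8_t out = 0;
    for (int i = 0; i < 4; i++) {
        assert(order[i] < 4);
        if (mask & (1u << order[i]))
            out |= (uint8_t)(1u << i);
    }
    return out;
}
```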
Signed-off-by: Niklas Haas <git@haasn.dev>
A failure while preparing a dither buffer leaves the newly allocated
buffer outside the cleanup range, leaking Vulkan resources. Make the
failure path cover the current buffer as well.
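
The pattern can be sketched without Vulkan as follows. Plain malloc() stands
in for the real buffer allocation, and every name here is hypothetical:

```c
#include <assert.h>
#include <stdlib.h>

#define MAX_BUFS 4

static int live_buffers; /* outstanding allocations, for illustration only */

static void *buf_alloc(void)
{
    void *p = malloc(16);
    if (p)
        live_buffers++;
    return p;
}

static void buf_free(void *p)
{
    if (p) {
        free(p);
        live_buffers--;
    }
}

/* fail_at: index whose preparation step fails, or -1 for no failure. */
static int setup_buffers(void *bufs[MAX_BUFS], int fail_at)
{
    int i;
    assert(fail_at < MAX_BUFS);
    for (i = 0; i < MAX_BUFS; i++) {
        bufs[i] = buf_alloc();
        if (!bufs[i] || i == fail_at) /* allocation or preparation failed */
            goto fail;
    }
    return 0;

fail:
    /* Start at i, not i - 1: buffer i may be allocated but unprepared. */
    for (int j = i; j >= 0; j--) {
        buf_free(bufs[j]);
        bufs[j] = NULL;
    }
    return -1;
}
```

Starting the cleanup loop at i - 1 instead would reproduce the leak described
above.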
Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
swscale gets runtime-defined assembly once again!
This commit splits the Vulkan backend into two, SPIR-V and GLSL,
enabling falling back onto the GLSL implementation if an instruction
is unavailable, or simply for testing.
Sponsored-by: Sovereign Tech Fund
This commit adds a SPIR-V assembler header file. It was partially generated
from the SPIR-V header file JSON definition, then edited by hand to template
and reduce its size as much as possible.
It implements only the essentials of SPIR-V assembly that swscale
requires.
Sponsored-by: Sovereign Tech Fund
Uniform buffers are much simpler to index, and require no work from
the driver compiler to optimize.
In SPIR-V, large 2D shader constants can be spilled into scratch memory,
since you need to create a function variable to index them during runtime.
Sponsored-by: Sovereign Tech Fund
The issue is that very often, hardware has limited support for BGRA
formats.
As this is a limitation of Vulkan itself, we cannot work around this
in a compatible way.
Sponsored-by: Sovereign Tech Fund
The issue is that with multiplane images, or packed images,
there may be a mismatch between what .elems has and what
we need.
Descriptors are cheap, so just always reserve 4.
Sponsored-by: Sovereign Tech Fund
The issue is that the main Vulkan context is shared between possibly
multiple shaders, and registering a new shader requires allocating
descriptors.
Sponsored-by: Sovereign Tech Fund
It was a bit clunky, lacked semantic contextual information, and made it
harder to reason about the effects of extending this struct. There should be
zero runtime overhead, since this is already a big union.
I made the changes in this commit by hand, but due to the length and noise
level of the commit, I used Opus 4.6 to verify that I did not accidentally
introduce any bugs or typos.
Signed-off-by: Niklas Haas <git@haasn.dev>
Just define these directly as integer arrays; there's really no point in
having them re-use SwsSwizzleOp; the only place this was ever even remotely
relevant was in the no-op check, which any decent compiler should already
be capable of optimizing into a single 32-bit comparison.
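
For illustration, a no-op check over a plain array could look like the sketch
below. The element type (uint8_t) and the function name are assumptions; with
four one-byte entries the comparison covers exactly 32 bits:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* A swizzle is a no-op when every component maps to itself. */
static bool swizzle_is_noop(const uint8_t order[4])
{
    static const uint8_t identity[4] = { 0, 1, 2, 3 };
    assert(order != NULL);
    /* Four bytes; a compiler is free to lower this to one 32-bit compare. */
    return memcmp(order, identity, sizeof(identity)) == 0;
}
```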
Signed-off-by: Niklas Haas <git@haasn.dev>
> packed = load all components from a single plane (the index given by order_src[0])
> planar = load one component each from separate planes (the index given by order_src[i])
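
The two definitions above can be sketched for 8-bit data as follows. Only
order_src comes from the description; the function names, the planes[] array
and the interleaved indexing are assumptions:

```c
#include <assert.h>
#include <stdint.h>

#define MAX_COMPS 4

/* packed: all components of pixel x come from plane order_src[0]. */
static void read_packed(const uint8_t *const planes[], const int order_src[],
                        int num_comps, int x, uint8_t out[MAX_COMPS])
{
    assert(num_comps >= 1 && num_comps <= MAX_COMPS);
    const uint8_t *plane = planes[order_src[0]];
    for (int i = 0; i < num_comps; i++)
        out[i] = plane[x * num_comps + i]; /* interleaved components */
}

/* planar: component i of pixel x comes from plane order_src[i]. */
static void read_planar(const uint8_t *const planes[], const int order_src[],
                        int num_comps, int x, uint8_t out[MAX_COMPS])
{
    assert(num_comps >= 1 && num_comps <= MAX_COMPS);
    for (int i = 0; i < num_comps; i++)
        out[i] = planes[order_src[i]][x];
}
```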
Sponsored-by: Sovereign Tech Fund
This allows reads to directly embed filter kernels. This is because, in
practice, a filter needs to be combined with a read anyways. To accomplish
this, we define filter ops as their semantic high-level operation types, and
then have the optimizer fuse them with the corresponding read/write ops
(where possible).
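
As a very rough sketch of the fusion step, here is a pass that folds a filter
op into an immediately preceding read. The op types, the has_filter field and
the adjacency rule are all hypothetical simplifications, and fusing with
write ops is omitted:

```c
#include <assert.h>

enum op_type { OP_READ, OP_FILTER, OP_CONVERT, OP_WRITE };

struct op {
    enum op_type type;
    int has_filter; /* only meaningful for OP_READ in this sketch */
};

/* Fuse in place and return the new number of ops. */
static int fuse_filters(struct op *ops, int num_ops)
{
    int out = 0;
    assert(num_ops >= 0);
    for (int i = 0; i < num_ops; i++) {
        if (ops[i].type == OP_FILTER && out > 0 &&
            ops[out - 1].type == OP_READ && !ops[out - 1].has_filter) {
            ops[out - 1].has_filter = 1; /* embed the kernel in the read */
            continue;                    /* drop the standalone filter op */
        }
        ops[out++] = ops[i];
    }
    return out;
}
```

A filter that is not directly preceded by a read is left in the list
untouched, matching the "where possible" caveat.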
Ultimately, something like this will be needed anyway for subsampled formats,
and doing it here is much cleaner than each of the several alternative
designs I explored.
Sponsored-by: Sovereign Tech Fund
Signed-off-by: Niklas Haas <git@haasn.dev>
This reverts commit 32554fc107.
Accidentally pushed this commit twice, with the wrong location.
Correct version is 97682155e6.
Signed-off-by: Niklas Haas <git@haasn.dev>
Avoids some unnecessary round-trips through the execution harness, as well
as removing one unnecessary layer of abstraction (SwsOpExec).
It's a bit unfortunate that we have to cast away the const on the AVFrame,
since the Vulkan functions take non-const everywhere, even though all they're
doing is modifying frame internal metadata, but alas.
Sponsored-by: Sovereign Tech Fund
Signed-off-by: Niklas Haas <git@haasn.dev>
And call it on the read/write ops directly, rather than this awkward loop.
Sponsored-by: Sovereign Tech Fund
Signed-off-by: Niklas Haas <git@haasn.dev>