Otherwise the derived device and the source device might have different
PCI ID in a multiple-device system.
Reviewed-by: Lynne <dev@lynne.ee>
Signed-off-by: Haihao Xiang <haihao.xiang@intel.com>
At least on latest Win 11 and Visual Studio 2022, that DLL does not
exist anymore and can't be installed via any of the usual means.
However, debugging works just fine regardless, so this check makes
debugging impossible.
D3D11CreateDevice will fail anyway if debugging is not supported, so
let's rely on that instead.
These strings are so short (longest takes 11B) that using
pointers is wasteful. Avoiding them also moves hashdesc
into .rodata (from .data.rel.ro).
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Also make initialization/uninitialization behaviour more explicit in the docs,
and make sure we do not leak a channel map on error.
Signed-off-by: Marton Balint <cus@passwd.hu>
Also make use of the av_channel_from_string() function to determine the channel
id. This fixes some parse issues in av_channel_layout_from_string().
Signed-off-by: Marton Balint <cus@passwd.hu>
We lacked tests which supposed to fail, and there are some which should fail
but right now it does not. This will be fixed in a later commit.
Signed-off-by: Marton Balint <cus@passwd.hu>
Deduplicates a lot of code.
Some minor differences (mostly white space and inconsistent use of quotes) are
expected in the fate tests, there was no point aiming for exactly the same
formatting.
Signed-off-by: Marton Balint <cus@passwd.hu>
All versions of MSVC that support C11 (namely >= v19.27)
also support the restrict keyword, therefore av_restrict
is no longer necessary since 75697836b1.
Reviewed-by: Martin Storsjö <martin@martin.st>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Native access is from the code that declared the options, foreign access
is from code that is using the options. Forbid foreign access to
AVOption.offset/default_val, for which there is no good reason, and
which should allow us more freedom in extending their semantics in a
compatible way.
Replace the opt_size() function, currently only called from
av_opt_copy(), with
* a constant array of element sizes
* a function that signals whether an option type is POD (i.e.
memcpyable) or not
Will be useful in following commits.
This is possible because the lifetime of these structures coincide.
It has the advantage of allowing to remove AVHWFramesInternal
from the public header; given that AVHWFramesInternal.priv is no more,
most accesses to AVHWFramesInternal are no more; indeed, the only
field accessed of it outside of hwcontext.c is the internal frame pool,
making this commit very simple.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
It is no longer used by any hwcontext, as they all allocate
their private data together with their public data and access
it via AVHWFramesContext.hwctx.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
This is possible because the lifetime of both coincide.
Besides reducing the number of allocations this also simplifies
access to D3D12VAFramesContext as one no longer has to
go through AVHWFramesInternal.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
This is possible because the lifetime of these structures coincide.
It has the advantage of allowing to remove the AVHWDeviceInternal
from the public header; given that AVHWDeviceInternal.priv is no more,
all accesses to it happen in hwcontext.c, so that this commit moves
the joint structure there.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
It is no longer used by any hwcontext, as they all allocate
their private data together with their public data and access
it via AVHWDeviceContext.hwctx.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
This is possible because the lifetime of both coincide.
Besides reducing the number of allocations this also simplifies
access to D3D12VADevicePriv as one no longer has to
go through AVHWDeviceInternal.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
This is possible because the lifetime of both coincide.
Besides reducing the number of allocations this also simplifies
access to VTFramesContext as one no longer has to
go through AVHWFramesInternal.
Tested-by: Jan Ekström <jeebjp@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
This is possible because the lifetime of both coincide.
Besides reducing the number of allocations this also simplifies
access to QSVFramesContext as one no longer has to
go through AVHWFramesInternal.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
This is possible because the lifetime of both coincide.
Besides reducing the number of allocations this also simplifies
access to QSVDeviceContext as one no longer has to
go through AVHWDeviceInternal.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
This is possible because the lifetime of both coincide.
Besides reducing the number of allocations this also simplifies
access to D3D11VAFramesContext as one no longer has to
go through AVHWFramesInternal.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
This is possible because the lifetime of both coincide.
Besides reducing the number of allocations this also simplifies
access to DXVA2FramesContext as one no longer has to
go through AVHWFramesInternal.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Use AVHWFramesContext.hwctx instead.
This simplifies access to VDPAUFramesContext as one no longer has
to go through AVHWFramesInternal.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
This is possible because the lifetime of both coincide.
Besides reducing the number of allocations this also simplifies
access to VDPAUDeviceContext as one no longer has to
go through AVHWDeviceInternal.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
This makes the code much simpler (especially for adding support
for other instruction set extensions), avoids needing inline
assembly for this feature, and generally is more of the canonical
way to do this.
The CPU feature detection was added in
493fcde50a, using HWCAP_CPUID.
The argument for using that, was that HWCAP_CPUID was added much
earlier in the kernel (in Linux v4.11), while the HWCAP flags for
individual features always come later. This allows detecting support
for new CPU extensions before the kernel exposes information about
them via hwcap flags.
However in practice, there's probably quite little advantage in this.
E.g. HWCAP2_I8MM was added in Linux v5.10 - long after HWCAP_CPUID,
but there's probably very little practical cases where one would
run a kernel older than that on a CPU that supports those instructions.
Additionally, we provide our own definitions of the flag values to
check (as they are fixed constants anyway), with names not conflicting
with the ones from system headers. This reduces the number of ifdefs
needed, and allows detecting those features even if building with
userland headers that are lacking the definitions of those flags.
Also, slightly older versions of QEMU, e.g. 6.2 in Ubuntu 22.04,
do expose support for these features via HWCAP flags, but the
emulated cpuid registers are missing the bits for exposing e.g. I8MM.
(This issue is fixed in later versions of QEMU though.)
Signed-off-by: Martin Storsjö <martin@martin.st>
This is possible because the lifetime of both coincide.
Besides reducing the number of allocations this also simplifies
access to OpenCLFramesContext as one no longer has to
go through AVHWFramesInternal.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
This is possible because the lifetime of both coincide.
Besides reducing the number of allocations this also simplifies
access to OpenCLDeviceContext as one no longer has to
go through AVHWDeviceInternal.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
To do so, concatenate all the names together to one big string
name1\0name2\0....lastname\0\0. This avoids the pointer in
the FunctionLoadInfo structure and thereby moves vk_load_info
into .rodata (and makes it smaller by 888B).
Reviewed-by: Lynne <dev@lynne.ee>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Saves 16B per entry here (four of these 16 bytes are padding);
leads to 1776 B of savings in each file that uses
ff_vk_load_functions().
Reviewed-by: Lynne <dev@lynne.ee>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
There are three possible names for the functions requested;
they only differ in an extension: "", "EXT" or "KHR".
Yet vk_load_info contained pointers to all these strings.
This is wasteful and this commit changes it to avoid
the latter two strings. This saves 6353B of strings,
1776 B of .data.rel.ro as well as 5328 B due to the removed
relocations (corresponding to 2 * 111 removed pointers)
in lavc/vulkan_decode.o alone (ff_vk_load_functions()
is inlined in lavfi/vulkan_filter.c, lavu/hwcontext_vulkan.c
and lavc_vulkan_decode.c, so the savings are three times
this for shared builds; for static builds, the number may
be smaller depending upon whether strings are deduplicated).
Reviewed-by: Lynne <dev@lynne.ee>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
This is possible because the lifetime of both coincide.
Besides reducing the number of allocations this also simplifies
access to VulkanFramesPriv as one no longer has to
go through AVHWFramesInternal.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
This is possible because the lifetime of both coincide.
Besides reducing the number of allocations this also simplifies
access to VulkanDevicePriv as one no longer has to
go through AVHWDeviceInternal.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Use AVHWFramesContext.hwctx instead.
This simplifies accesses to VDPAUFramesContext as one no longer has
to go through AVHWFramesInternal.
Tested-by: Timo Rothenpieler <timo@rothenpieler.org>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Correct the names of the format-specific headers (not hwframe_*.h)
and clarify that the user shall ignore this field if there is no
public context associated with it.
In particular, this allows to use this field for the private context
alone if there is no public context. This can't break conforming
API users, because they always have to live with the possibility
that a new public context has been introduced.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Without ff_vk_exec_discard_deps which is called by ff_vk_exec_wait,
the reference count of hwframe context cannot reach zero due to
circular reference created by ff_vk_exec_add_dep_frame.
Fix#10873
Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
FFmpeg has instances of DECLARE_ALIGNED(32, ...) in a lot of structs,
which then end up heap-allocated.
By declaring any variable in a struct, or tree of structs, to be 32 byte
aligned, it allows the compiler to safely assume the entire struct
itself is also 32 byte aligned.
This might make the compiler emit code which straight up crashes or
misbehaves in other ways, and at least in one instances is now
documented to actually do (see ticket 10549 on trac).
The issue there is that an unrelated variable in SingleChannelElement is
declared to have an alignment of 32 bytes. So if the compiler does a copy
in decode_cpe() with avx instructions, but ffmpeg is built with
--disable-avx, this results in a crash, since the memory is only 16 byte
aligned.
Mind you, even if the compiler does not emit avx instructions, the code
is still invalid and could misbehave. It just happens not to. Declaring
any variable in a struct with a 32 byte alignment promises 32 byte
alignment of the whole struct to the compiler.
This patch limits the maximum alignment to the maximum possible simd
alignment according to configure.
While not perfect, it at the very least gets rid of a lot of UB, by
matching up the maximum DECLARE_ALIGNED value with the alignment of heap
allocations done by lavu.
av_get_sample/pix_fmt() return their respective enums
and are therefore not of the type int (*)(const char*),
yet they are called as-if they were of this type.
This works in practice, but is actually undefined behaviour.
With Clang 17 UBSan these violations are flagged, affecting lots
of tests. The number of failing tests went down from 3363 to 164
here with this patch.
Reviewed-by: Mark Thompson <sw@jkqxz.net>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
The output of TX is extremely verbose and makes it harder to find other debug
log messages, so print most of it at trace level.
Signed-off-by: James Almer <jamrial@gmail.com>
This is possible because the lifetime of both coincide.
Besides reducing the number of allocations this also simplifies
access to VAAPIFramesContext as one no longer has to
go through AVHWFramesInternal.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
This is possible because the lifetime of both coincide.
Besides reducing the number of allocations this also simplifies
access to VAAPIDeviceContext as one no longer has to
go through AVHWDeviceInternal.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
This fixes the previous commit and adds more cases (DCT-I and DST-I).
I am holding off on defining a scale parameter for FFTs as I'd like
to use a complex value for them.
If a custom layout is equivalent to a native one, check if it matches one of the
known layout names and print that instead.
Signed-off-by: James Almer <jamrial@gmail.com>
Make av_tx_init() agree with documentation:
* Real to complex and complex to real DFTs.
* For the float and int32 variants, the scale type is 'float', while for
* the double variant, it's a 'double'. If scale is NULL, 1.0 will be used
* as a default.
Reviewed-by: Lynne <dev@lynne.ee>
Signed-off-by: Peter Ross <pross@xvid.org>
Makes it robust against adding fields before it, which will be useful in
following commits.
Majority of the patch generated by the following Coccinelle script:
@@
typedef AVOption;
identifier arr_name;
initializer list il;
initializer list[8] il1;
expression tail;
@@
AVOption arr_name[] = { il, { il1,
- tail
+ .unit = tail
}, ... };
with some manual changes, as the script:
* has trouble with options defined inside macros
* sometimes does not handle options under an #else branch
* sometimes swallows whitespace
The currently used pointer when unmapping DXVA2 and D3D11
actually points to an OpenCLDeviceContext.
Reviewed-by: Mark Thompson <sw@jkqxz.net>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
These inline implementations of AV_COPY64, AV_SWAP64 and AV_ZERO64
are known to clobber the FPU state - which has to be restored
with the 'emms' instruction afterwards.
This was known and signaled with the FF_COPY_SWAP_ZERO_USES_MMX
define, which calling code seems to have been supposed to check,
in order to call emms_c() after using them. See
0b1972d409,
29c4c0886d and
df215e5758 for history on earlier
fixes in the same area.
However, new code can use these AV_*64() macros without knowing
about the need to call emms_c().
Just get rid of these dangerous inline assembly snippets; this
doesn't make any difference for 64 bit architectures anyway.
Signed-off-by: Martin Storsjö <martin@martin.st>
FFmpeg has instances of DECLARE_ALIGNED(32, ...) in a lot of structs,
which then end up heap-allocated.
By declaring any variable in a struct, or tree of structs, to be 32 byte
aligned, it allows the compiler to safely assume the entire struct
itself is also 32 byte aligned.
This might make the compiler emit code which straight up crashes or
misbehaves in other ways, and at least in one instances is now
documented to actually do (see ticket 10549 on trac).
The issue there is that an unrelated variable in SingleChannelElement is
declared to have an alignment of 32 bytes. So if the compiler does a copy
in decode_cpe() with avx instructions, but ffmpeg is built with
--disable-avx, this results in a crash, since the memory is only 16 byte
aligned.
Mind you, even if the compiler does not emit avx instructions, the code
is still invalid and could misbehave. It just happens not to. Declaring
any variable in a struct with a 32 byte alignment promises 32 byte
alignment of the whole struct to the compiler.
This patch limits the maximum alignment to the maximum possible simd
alignment according to configure.
While not perfect, it at the very least gets rid of a lot of UB, by
matching up the maximum DECLARE_ALIGNED value with the alignment of heap
allocations done by lavu.
Forgot to do this with the previous commit.
Actually makes the assembly being used.
Still the fastest FFT in the world, 15% faster than FFTW on the
largest available size.
It uses the int64_t instead of the double member.
(This code can currently not be reached: av_opt_get() calls
av_opt_find2() with NULL as unit in which case AV_OPT_TYPE_CONST
options are never returned, leading av_opt_get() to always
return AVERROR_OPTION_NOT_FOUND when searching for AV_OPT_TYPE_CONST*.
For the same reason the code read_number() will never be called
from get_number() when searching for an option of type
AV_OPT_TYPE_CONST. The other callers of read_number() also only
call it with types other than AV_OPT_TYPE_CONST.)
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Fixes parsing small timebases from expressions (where the expression API
converts the result to double), like in this command line:
ffprobe -f lavfi -i testsrc=d=1,settb=1/2000000000 -show_streams -show_entries stream=time_base
Before the patch timebase was parsed as 1/1999999999.
Signed-off-by: Marton Balint <cus@passwd.hu>
Avoids casts all over the place; in this case, it also
replaces the unsafe cast uint8_t**->const uint8_t **
by the safe cast uint8_t**->const uint8_t * const*.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
AV_OPT_TYPE_INT64 should not be used for ints.
Should fix warnings about store to misaligned address for type 'int64_t'
Signed-off-by: James Almer <jamrial@gmail.com>
Previously AV_PIX_FMT_RGB8 was documented as "RGB 3:3:2,
(msb)2R 3G 3B(lsb)". While the RGB 3:3:2 part is correct, the latter
part should be: (msb)3R 3G 2B(lsb). This commit also updates the
format's pixdesc description to be (msb)3R 3G 2B(lsb).
Signed-off-by: Jeffrey Knockel <jeff@jeffreyknockel.com>
Reviewed-by: "Diederick C. Niehorster" <dcnieho@gmail.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
This is the 64bit version of Chris Doty-Humphreys SFC64
Compared to the LCGs these produce much better quality numbers.
Compared to LFGs this needs less state. (our LFG has 224 byte
state for its 32bit version) this has 32byte state
Also the initialization for our LFG is slower.
This is also much faster than KISS or PCG.
This commit replaces the broken LCG used before.
(broken as it had only a period ~200M due to being put in a double)
This changes the output from random() which is why libswresample.mak
is updated, update was done using the command in libswresample.mak
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
VideoToolbox use different identifiers for the same pixel format
with different color range, like
kCVPixelFormatType_420YpCbCr8BiPlanarVideoRange
kCVPixelFormatType_420YpCbCr8BiPlanarFullRange.
Before the patch, vt_pool_alloc() always use limited range, and it
will fail for pixel format AV_PIX_FMT_BGRA since there is no limited
range kCVPixelFormatType_32BGRA.
Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
Unnecessary since acf63d5350adeae551d412db699f8ca03f7e76b9;
also avoids relocations.
Reviewed-by: Anton Khirnov <anton@khirnov.net>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
It caused lacking a public declaration build error with
-Werror=missing-prototypes.
Since DXGI_FORMAT is moved to public since patch set V10, this function
is no longer useful. Now remove it.
Signed-off-by: Tong Wu <tong1.wu@intel.com>
Signed-off-by: James Almer <jamrial@gmail.com>
The names of the cpu flags, when parsed from a string with
av_parse_cpu_caps, are parsed by the libavutil eval functions. These
interpret dashes as subtractions. Therefore, these previous cpu flag
names haven't been possible to set.
Use the official names for these extensions, as the previous ad-hoc
names wasn't parseable.
libavutil/tests/cpu tests that the cpu flags can be set, and prints
the detected flags.
Acked-by: Rémi Denis-Courmont <remi@remlab.net>
Signed-off-by: Martin Storsjö <martin@martin.st>
- Fixes YA formats, because previous code always assumed alpha as the 4th
component.
- Fixes PAL format (as long as 0 is black, as in a systematic palette), because
previous code assumed it as limited Y.
- Fixes XYZ format because it does not need nonzero chroma components
- Fixes xv30be as the bitstream mode got merged to the non-bitstream mode.
Signed-off-by: Marton Balint <cus@passwd.hu>