No need to write a custom string parser when we can just use an integer
option with preset values. The various bits of fallback logic are wholly
redundant with equivalent logic already inside sws_getCoefficients.
Note: I disallowed setting 'out_color_matrix=auto', because this does
not do anything meaningful in the current code (just hard-codes
AVCOL_SPC_BT470BG fallback).
The documentation states that invalid entries default to SWS_CS_DEFAULT.
A value of 0 is not a valid SWS_CS_*, yet the code incorrectly
hard-codes it to BT.709 coefficients instead of SWS_CS_DEFAULT.
This was a complete hack seemingly designed to work around a different
bug, which was fixed in the previous commit. As such, there is no more
reason not to do this, as it simply breaks changing color range in
sws_setColorspaceDetails for no reason.
More commonly, this fixes the case of sws_setColorspaceDetails after
sws_getContext, since the latter implies sws_init_context.
The problem here is that sws_init_context sets up the range conversion
and fast path tables based on the values of srcRange/dstRange at init
time. This may result in locking in a "wrong" path (either using
unscaled fast path when range conversion later required, or using
scaled slow path when range conversion becomes no longer required).
There are two way outs:
1. Always initialize range conversion and unscaled converters, even if
they will be unused, and extend the runtime check.
2. Re-do initialization if the values change after
sws_setColorspaceDetails.
I opted for approach 1 because it was simpler and easier to reason
about.
Reword the av_log message to make it clear that this special converter
is not necessarily used, depending on whether or not there is range
conversion or YUV matrix conversion going on.
The check if drawing needs to be initialized and supported formats
should be drawable ones was flawed, as pad_stop/pad_start is only
populated from stop_duration/start_duration after these checks.
To fix that, check the _duration variants as well and for better
readability and maintainability break the check out into its own
helper.
Fixes a regression from 86b252ea9dFix#10621
Signed-off-by: Anton Khirnov <anton@khirnov.net>
We calculated offsets as pairs, but addressed them in the shader
as single float values, while reading them as ivec2s.
Also removes unused code (was provisionally added if cooperative matrices
could be used, but that turned out to be impossible).
VK_KHR_PORTABILITY_ENUMERATION_EXTENSION_NAME is required on macOS,
and VK_INSTANCE_CREATE_ENUMERATE_PORTABILITY_BIT_KHR flag should
be set.
Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
Texinfo 7.0 produces quite different HTML to Texinfo 6.8. Without
this change, enumerated option flags (i.e. Possible values of x
are...) render as white text on a white background with Texinfo 7.0
and are unreadable. This change removes a style for the selector
`.table .table` which causes the background to turn white for these
elements. As far as I can tell, it is not actually used anywhere in
files generated by Texinfo 6.8.
Signed-off-by: Frank Plowman <post@frankplowman.com>
Resolves trac ticket #10636 (http://trac.ffmpeg.org/ticket/10636).
Texinfo 7.0, released in November 2022, changed the names of various
functions. Compiling docs with Texinfo 7.0 resulted in warnings and
improperly formatted documentation. More old names appear to have
been removed in Texinfo 7.1, released October 2023, which causes docs
compilation to fail.
This commit addresses the issue by adding logic to switch between the old
and new function names depending on the Texinfo version. Texinfo 6.8
produces identical documentation before and after the patch.
CC
https://www.mail-archive.com/debian-bugs-dist@lists.debian.org/msg1938238.htmlhttps://bugs.gentoo.org/916104
Signed-off-by: Frank Plowman <post@frankplowman.com>
By intent, and in practice, the "active contributor" criterion applies
to the person authoring the commits, not the one pushing them into the
repository.
When operating on large blocks of data it's common to repeatedly use
an instruction on multiple registers. Using the REPX macro makes it
easy to quickly write dense code to achieve this without having to
explicitly duplicate the same instruction over and over.
For example,
REPX {paddw x, m4}, m0, m1, m2, m3
REPX {mova [r0+16*x], m5}, 0, 1, 2, 3
will expand to
paddw m0, m4
paddw m1, m4
paddw m2, m4
paddw m3, m4
mova [r0+16*0], m5
mova [r0+16*1], m5
mova [r0+16*2], m5
mova [r0+16*3], m5
Commit taken from x264:
6d10612ab0
Signed-off-by: Frank Plowman <post@frankplowman.com>
Signed-off-by: Anton Khirnov <anton@khirnov.net>
Partially fixes ticket #798
Reviewed-by: James Almer <jamrial@gmail.com>
Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
Reviewed-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: Peter Ross <pross@xvid.org>
This uses a more traditional approach allowing up processing of up to
period minus two elements per iteration. This also allows the algorithm
to work for all and any vector length.
As the T-Head C908 device under test can load 16 elements loop, there is
unsurprisingly a little performance drop when the period is minimal and
the parallelism is capped at 13 elements:
Before:
postfilter_15_c: 21222.2
postfilter_15_rvv_f32: 22007.7
postfilter_512_c: 20189.7
postfilter_512_rvv_f32: 22004.2
postfilter_1022_c: 20189.7
postfilter_1022_rvv_f32: 22004.2
After:
postfilter_15_c: 20189.5
postfilter_15_rvv_f32: 7057.2
postfilter_512_c: 20189.5
postfilter_512_rvv_f32: 5667.2
postfilter_1022_c: 20192.7
postfilter_1022_rvv_f32: 5667.2
As in the aligned case, we can use VLSE64.V, though the way of doing so
gets more convoluted, so the performance gains are more modest:
get_pixels_unaligned_c: 126.7
get_pixels_unaligned_rvv_i32: 145.5 (before)
get_pixels_unaligned_rvv_i64: 62.2 (after)
For the reference, those are the aligned benchmarks (unchanged) on the
same T-Head C908 hardware:
get_pixels_c: 126.7
get_pixels_rvi: 85.7
get_pixels_rvv_i64: 33.2
Without this flag, timestamps were embedded into the final
binary if CUDA was enabled.
Signed-off-by: Rob Hall <robxnanocode@outlook.com>
Signed-off-by: Anton Khirnov <anton@khirnov.net>
We currently mostly do not empty the MMX state in our MMX
DSP functions; instead we only do so before code that might
be using x87 code. This is a violation of the System V i386 ABI
(and maybe of other ABIs, too):
"The CPU shall be in x87 mode upon entry to a function. Therefore,
every function that uses the MMX registers is required to issue an
emms or femms instruction after using MMX registers, before returning
or calling another function." (See 2.2.1 in [1])
This patch does not intend to change all these functions to abide
by the ABI; it only does so for ff_pixelutils_sad_8x8_mmxext, as this
function can by called by external users, because it is exported
via the pixelutils API. Without this, the following fragment will
assert (on x86/x64):
uint8_t src1[8 * 8], src2[8 * 8];
av_pixelutils_sad_fn fn = av_pixelutils_get_sad_fn(3, 3, 0, NULL);
fn(src1, 8, src2, 8);
av_assert0_fpu();
[1]: https://raw.githubusercontent.com/wiki/hjl-tools/x86-psABI/intel386-psABI-1.1.pdf
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>