RVV defines a total of 12 different extensions, including:
- 5 different instruction subsets:
- Zve32x: 8-, 16- and 32-bit integers,
- Zve32f: Zve32x plus single precision floats,
- Zve64x: Zve32x plus 64-bit integers,
- Zve64f: Zve32f plus Zve64x,
- Zve64d: Zve64f plus double precision floats.
- 6 different vector lengths:
- Zvl32b (embedded only),
- Zvl64b (embedded only),
- Zvl128b,
- Zvl256b,
- Zvl512b,
- Zvl1024b,
- and the V extension proper: equivalent to Zve64f and Zvl128b.
In total, there are 6 different possible sets of supported instructions
(including the empty set), but for convenience we allocate one bit for
each type sets: up-to-32-bit ints (RVV_I32), floats (RVV_F32),
64-bit ints (RVV_I64) and doubles (RVV_F64).
Whence the vector size is needed, it can be retrieved by reading the
unprivileged read-only vlenb CSR. This should probably be a separate
helper macro if needed at a later point.
This introduces compile-time and run-time CPU detection on RISC-V. In
practice, I doubt that FFmpeg will ever see a RISC-V CPU without all of
I, F and D extensions, and if it does, it probably won't have run-time
detection. So the flags are essentially always set.
But as things stand, checkasm wants them that way. Compare the ARMV8
flag on AArch64. We are nowhere near running short on CPU flag bits.
LSX and LASX is loongarch SIMD extention.
They are enabled by default if compiler support it, and can be disabled
with '--disable-lsx' '--disable-lasx'.
Change-Id: Ie2608ea61dbd9b7fffadbf0ec2348bad6c124476
Reviewed-by: Shiyou Yin <yinshiyou-hf@loongson.cn>
Reviewed-by: guxiwei <guxiwei-hf@loongson.cn>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
av_set_cpu_flags_mask() has been deprecated in the commit which merged
it: 6df42f98746be06c883ce683563e07c9a2af983f; av_parse_cpu_flags() has
been deprecated in 4b529edff8.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Add MMI & MSA runtime detection for MIPS.
Basically there are two code pathes. For systems that
natively support CPUCFG instruction or kernel emulated
that instruction, we'll sense this feature from HWCAP and
report the flags according to values grab from CPUCFG. For
systems that have no CPUCFG (or not export it in HWCAP),
we'll parse /proc/cpuinfo instead.
Signed-off-by: Jiaxun Yang <jiaxun.yang@flygoat.com>
Reviewed-by: Shiyou Yin <yinshiyou-hf@loongson.cn>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
* commit 'e6bff23f1e11aefb16a2b5d6ee72bf7469c5a66e':
cpu: add a function for querying maximum required data alignment
Adapted to work with the arbitrary runtime cpuflag changes av_force_cpu_flags()
can generate.
Merged-by: James Almer <jamrial@gmail.com>
* commit '7d7355aa92bb36ca0765c49a569a999bcb96f332':
x86: Add SSSE3_SLOW CPU flag and related convenience macros
Merged-by: James Almer <jamrial@gmail.com>
Make the one-time initialization in av_get_cpu_flags() thread-safe. The
static variable |cpu_flags| in libavutil/cpu.c is read and written using
normal load and store operations. These are considered as data races.
The fix is to use atomic load and store operations.
The fix can be verified by running the libavutil/tests/cpu_init.c test
program under ThreadSanitizer:
./configure --toolchain=clang-tsan
make libavutil/tests/cpu_init
libavutil/tests/cpu_init
There should be no warnings from ThreadSanitizer.
Co-author: Dmitry Vyukov of Google, who suggested the data race fix.
Signed-off-by: Wan-Teh Chang <wtc@google.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Make the one-time initialization in av_get_cpu_flags() thread-safe. The
static variables |flags|, |cpuflags_mask|, and |checked| in
libavutil/cpu.c are read and written using normal load and store
operations. These are considered as data races. The fix is to use atomic
load and store operations.
Remove the |checked| variable because the invalid value of -1 for
|flags| can be used to indicate the same condition. Rename |flags| to
|cpu_flags| and move it to file scope.
The fix can be verified by running the libavutil/tests/cpu_init.c test
program under ThreadSanitizer:
./configure --toolchain=clang-tsan
make libavutil/tests/cpu_init
libavutil/tests/cpu_init
There should be no warnings from ThreadSanitizer.
Co-author: Dmitry Vyukov of Google, who suggested the data race fix.
Signed-off-by: Wan-Teh Chang <wtc@google.com>
The vector mode was deprecated in ARMv7-A/VFPv3 and various cpu
implementations do not support it in hardware. Vector mode code will
depending the OS either be emulated in software or result in an illegal
instruction on cpus which does not support it. This was not really
problem in practice since NEON implementations of the same functions are
preferred. It will however become a problem for checkasm which tests
every cpu flag separately.
Since this is a cpu feature newer cpu do not support anymore the
behaviour of this flag differs from the other flags. It can be only
activated by runtime cpu feature selection.
* commit '7d07ee5a9bd170a06d26fd967cf8de5d3b1ce331':
ppc: cpu: Add support for VSX and POWER8 extensions
Merged-by: Michael Niedermayer <michaelni@gmx.at>
this allows disabling and enabling it
it also prevents crashes if vfpv3 and neon are disabled which previously
would have enabled the flag
And last but not least one can enable setend on cpus like cortex-a8 where
its fast but disabled by default
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* commit 'b78b10c4b78b696927f2801cf2d9f193b4eff28b':
avutil: Move internal CPU detection function declarations to private header
Merged-by: Michael Niedermayer <michaelni@gmx.at>
This function is problematic in several ways, its also quite
unpredictable which flags it ends up turning on
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* qatar/master:
lavr: fix handling of custom mix matrices
fate: force pix_fmt in lagarith-rgb32 test
fate: add tests for lagarith lossless video codec.
ARMv6: vp8: fix stack allocation with Apple's assembler
ARM: vp56: allow inline asm to build with clang
fft: 3dnow: fix register name typo in DECL_IMDCT macro
x86: dct32: port to cpuflags
x86: build: replace mmx2 by mmxext
Revert "wmapro: prevent division by zero when sample rate is unspecified"
wmapro: prevent division by zero when sample rate is unspecified
lagarith: fix color plane inversion for YUY2 output.
lagarith: pad RGB buffer by 1 byte.
dsputil: make add_hfyu_left_prediction_sse4() support unaligned src.
Conflicts:
doc/APIchanges
libavcodec/lagarith.c
libavfilter/x86/gradfun.c
libavutil/cpu.h
libavutil/version.h
libswscale/utils.c
libswscale/version.h
libswscale/x86/yuv2rgb.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
Refactoring mmx2/mmxext YASM code with cpuflags will force renames.
So switching to a consistent naming scheme beforehand is sensible.
The name "mmxext" is more official and widespread and also the name
of the CPU flag, as reported e.g. by the Linux kernel.
* qatar/master: (29 commits)
lavfi: reclassify showfiltfmts as a TESTPROG
graph2dot: fix printf format specifier
swscale: yuv2planeX 8bit >=sse2 functions need aligned stack on x86-32.
vp8: loopfilter >=sse2 functions need aligned stack on x86-32.
amr: remove shift out of the AMR_BIT() macro.
dsputilenc: group yasm and inline asm function pointer assignment.
mov: use forward declaration of a function instead of a table.
Clarify Doxygen comment for FF_API_* #defines.
configure: simplify get_version()
Create version.h headers for libraries that lack them
gitignore: Use full path instead of relative path to specify patterns
mpegvideo: remove VLAs
Add XTEA encryption support in libavutil
Add Blowfish encryption support in libavutil
eval: Add the isinf() function and tests for it
flacdec: move lpc filter to flacdsp
flacdec: split off channel decorrelation as flacdsp
avplay: Add an option for not limiting the input buffer size
FATE: add a test for WMA cover art.
FATE: add a test for apetag cover art
...
Conflicts:
.gitignore
configure
ffplay.c
libavcodec/Makefile
libavcodec/error_resilience.c
libavcodec/mpegvideo.c
libavcodec/ratecontrol.c
libavdevice/avdevice.h
libavfilter/Makefile
libavfilter/filtfmts.c
libavfilter/version.h
libavformat/mov.c
libavformat/version.h
libavutil/Makefile
libavutil/avutil.h
libavutil/version.h
libswscale/swscale.h
libswscale/x86/swscale_mmx.c
tests/fate/libavutil.mak
tests/lavfi-regression.sh
tools/graph2dot.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>