mirror of https://github.com/FFmpeg/FFmpeg.git synced 2025-01-03 05:10:03 +02:00
FFmpeg/libavutil
Timo Rothenpieler ef2c2a2220 avutil/half2float: use native _Float16 if available
_Float16 support has been available on arm/aarch64 for a while, and with
gcc 12 it was enabled on x86 as long as SSE2 is supported.

If the target arch supports f16c, gcc emits fairly efficient assembly
that takes advantage of it. This is the case on x86-64-v3 or higher.
The same goes for arm, which has native float16 support.
On x86 without f16c, gcc emulates _Float16 in software using SSE2
instructions.

This has been shown to perform rather poorly:

_Float16 full SSE2 emulation:
frame=50074 fps=848 q=-0.0 size=N/A time=00:33:22.96 bitrate=N/A speed=33.9x

_Float16 f16c accelerated (Zen2, --cpu=znver2):
frame=50636 fps=1965 q=-0.0 Lsize=N/A time=00:33:45.40 bitrate=N/A speed=78.6x

classic half2float full software implementation:
frame=49926 fps=1605 q=-0.0 Lsize=N/A time=00:33:17.00 bitrate=N/A speed=64.2x

Hence an additional check was introduced that only enables use of
_Float16 on x86 if f16c is being utilized.

On aarch64, a similar uplift in performance is seen:

RPi4 half2float full software implementation:
frame= 6088 fps=126 q=-0.0 Lsize=N/A time=00:04:03.48 bitrate=N/A speed=5.06x

RPi4 _Float16:
frame= 6103 fps=158 q=-0.0 Lsize=N/A time=00:04:04.08 bitrate=N/A speed=6.32x

Since arm/aarch64 always natively support 16 bit floats, _Float16 can
always be considered fast there.

I'm not aware of any additional platforms that currently support
_Float16. And if there are, they should be considered non-fast until
proven fast.
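
The policy described above can be sketched roughly as follows. This is only an illustration, not the code from the commit: the names SKETCH_FAST_FLOAT16 and sketch_half2float are hypothetical, the detection uses compiler-predefined macros (__FLT16_MANT_DIG__, __F16C__, __aarch64__) as an assumed stand-in for whatever check the build system performs, and the fallback is a plain bit-manipulation conversion rather than FFmpeg's own software implementation.

```c
#include <stdint.h>
#include <string.h>

/* Assumption: treat _Float16 as "fast" only where the compiler provides it
 * AND the target converts in hardware (f16c on x86, always on aarch64).
 * Any other platform stays on the software path until proven fast. */
#if defined(__FLT16_MANT_DIG__) && (defined(__F16C__) || defined(__aarch64__))
#define SKETCH_FAST_FLOAT16 1
#else
#define SKETCH_FAST_FLOAT16 0
#endif

/* Convert an IEEE 754 binary16 bit pattern to float. */
static float sketch_half2float(uint16_t h)
{
#if SKETCH_FAST_FLOAT16
    /* Native path: reinterpret the bits and let the compiler convert. */
    _Float16 f;
    memcpy(&f, &h, sizeof(f));
    return (float)f;
#else
    /* Software path: rebuild the binary32 bit pattern by hand. */
    uint32_t sign = (uint32_t)(h & 0x8000) << 16;
    uint32_t exp  = (h >> 10) & 0x1f;
    uint32_t mant = h & 0x3ff;
    uint32_t bits;
    float out;

    if (exp == 0x1f) {               /* Inf or NaN */
        bits = sign | 0x7f800000u | (mant << 13);
    } else if (exp != 0) {           /* normal: rebias 15 -> 127 */
        bits = sign | ((exp + 112) << 23) | (mant << 13);
    } else if (mant == 0) {          /* signed zero */
        bits = sign;
    } else {                         /* subnormal: normalize mantissa */
        uint32_t shift = 0;
        while (!(mant & 0x400)) {
            mant <<= 1;
            shift++;
        }
        mant &= 0x3ff;               /* drop the now-implicit leading 1 */
        bits = sign | ((113 - shift) << 23) | (mant << 13);
    }
    memcpy(&out, &bits, sizeof(out));
    return out;
#endif
}
```

Both paths are meant to return the same values, so a caller does not need to know which one was compiled in; for example, sketch_half2float(0x3c00) yields 1.0f either way.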
2022-08-19 22:09:36 +02:00
aarch64 aarch64: Only emit the PAC/BTI note section when targeting ELF 2022-03-15 00:44:28 +02:00
arm avutil: use getauxval(3) for CPU capabilities on linux/android ARM 2022-02-07 13:42:40 -08:00
avr32
bfin
loongarch avcodec/loongarch/h264chroma, vc1dsp_lasx: Add wrapper for __lasx_xvldx 2022-08-05 02:59:58 +02:00
mips avutil/mips: Use $at as MMI macro temporary register 2021-07-28 23:31:48 +02:00
ppc avutil/ppc/cpu: Use proper header for OpenBSD PPC CPU detection 2022-06-25 12:16:51 +02:00
sh4
tests avutil/test/pixfmt_best: test the VUYA pixel format 2022-08-07 09:33:16 -03:00
tomi
x86 x86/tx_float: save a branch during coefficient deinterleaving 2022-08-09 03:35:12 +02:00
.gitignore
adler32.c Remove unnecessary libavutil/(avutil|common|internal).h inclusions 2022-02-24 12:56:49 +01:00
adler32.h avutil: Switch crypto APIs to size_t 2021-04-27 10:43:13 -03:00
aes_ctr.c Remove unnecessary libavutil/(avutil|common|internal).h inclusions 2022-02-24 12:56:49 +01:00
aes_ctr.h Remove obsolete version.h inclusions 2021-07-22 14:34:31 +02:00
aes_internal.h All: update names in copyright headers 2021-01-20 01:02:56 -06:00
aes.c Remove unnecessary libavutil/(avutil|common|internal).h inclusions 2022-02-24 12:56:49 +01:00
aes.h Remove obsolete version.h inclusions 2021-07-22 14:34:31 +02:00
attributes.h avutil/attributes: add support for clang in AV_NOWARN_DEPRECATED 2022-03-16 12:29:37 -03:00
audio_fifo.c avutil/audio_fifo: Avoid avutil.h inclusion 2022-02-24 12:56:49 +01:00
audio_fifo.h avutil/audio_fifo: Avoid avutil.h inclusion 2022-02-24 12:56:49 +01:00
avassert.h avutil/avassert: Don't include avutil.h 2022-02-24 12:56:49 +01:00
avsscanf.c Remove unnecessary libavutil/(avutil|common|internal).h inclusions 2022-02-24 12:56:49 +01:00
avstring.c Remove unnecessary libavutil/(avutil|common|internal).h inclusions 2022-02-24 12:56:49 +01:00
avstring.h avutil/{avstring,bprint}: add XML escaping from ffprobe to avutil 2021-03-05 19:45:00 +02:00
avutil.h libavutil: Deprecate av_fopen_utf8, provide an avpriv version 2022-05-23 13:52:26 +03:00
avutilres.rc
base64.c Remove unnecessary libavutil/(avutil|common|internal).h inclusions 2022-02-24 12:56:49 +01:00
base64.h
blowfish.c Remove unnecessary libavutil/(avutil|common|internal).h inclusions 2022-02-24 12:56:49 +01:00
blowfish.h
bprint.c Remove unnecessary libavutil/(avutil|common|internal).h inclusions 2022-02-24 12:56:49 +01:00
bprint.h
bswap.h
buffer_internal.h Remove unnecessary libavutil/(avutil|common|internal).h inclusions 2022-02-24 12:56:49 +01:00
buffer.c avutil/buffer: Never poison returned buffers 2022-08-10 18:49:35 +02:00
buffer.h avutil/buffer: constify some function parameters 2021-09-17 13:28:09 -03:00
camellia.c Remove unnecessary libavutil/(avutil|common|internal).h inclusions 2022-02-24 12:56:49 +01:00
camellia.h
cast5.c Remove unnecessary libavutil/(avutil|common|internal).h inclusions 2022-02-24 12:56:49 +01:00
cast5.h
channel_layout.c Revert "avutil/channel_layout: av_channel_layout_describe_bprint: Check for buffer end" 2022-07-04 14:04:54 -03:00
channel_layout.h channel_layout: add support for Ambisonic 2022-03-15 09:42:47 -03:00
color_utils.c Remove unnecessary libavutil/(avutil|common|internal).h inclusions 2022-02-24 12:56:49 +01:00
color_utils.h
colorspace.h
common.h Remove obsolete version.h inclusions 2022-02-24 12:56:49 +01:00
cpu_internal.h avutil/cpu_internal: Fix check for SSE2SLOW 2022-06-18 19:25:03 +02:00
cpu.c all: Replace if (ARCH_FOO) checks by #if ARCH_FOO 2022-06-15 04:56:37 +02:00
cpu.h avutil/cpu: add AVX512 Icelake flag 2022-03-10 16:45:48 -03:00
crc.c Remove unnecessary libavutil/(avutil|common|internal).h inclusions 2022-02-24 12:56:49 +01:00
crc.h Remove obsolete version.h inclusions 2021-07-22 14:34:31 +02:00
csp.c avutil/csp: create public API for colorspace structs 2022-06-01 13:52:38 -04:00
csp.h avutil/csp: create public API for colorspace structs 2022-06-01 13:52:38 -04:00
cuda_check.h avutil/log: Don't include avutil.h 2022-02-24 12:56:49 +01:00
des.c Remove unnecessary libavutil/(avutil|common|internal).h inclusions 2022-02-24 12:56:49 +01:00
des.h
detection_bbox.c avutil/detection_bbox: Fix av_detection_bbox_alloc failed if nb_bboxes == 0 2021-10-08 10:11:59 +08:00
detection_bbox.h lavu/detection_bbox.h: use AV_NUM_DETECTION_BBOX_CLASSIFY to replace AV_NUM_BBOX_CLASSIFY 2021-04-18 10:41:17 +08:00
dict.c avutil/dict: av_realloc -> av_realloc_array() 2020-06-06 10:32:07 +08:00
dict.h Remove obsolete version.h inclusions 2021-07-22 14:34:31 +02:00
display.c avutil/display: Don't include avutil.h 2022-02-24 12:56:49 +01:00
display.h avutil/display: Don't include avutil.h 2022-02-24 12:56:49 +01:00
dovi_meta.c lavu/frame: Add Dolby Vision metadata side data type 2022-01-04 11:59:02 +01:00
dovi_meta.h lavu/frame: Add Dolby Vision metadata side data type 2022-01-04 11:59:02 +01:00
downmix_info.c
downmix_info.h
dynarray.h
encryption_info.c Replace all occurences of av_mallocz_array() by av_calloc() 2021-09-20 01:03:52 +02:00
encryption_info.h
error.c Remove unnecessary libavutil/(avutil|common|internal).h inclusions 2022-02-24 12:56:49 +01:00
error.h avutil/error: Include macros.h for MKTAG 2021-07-29 22:02:05 +02:00
eval.c libavutil/eval: Remove CONFIG_TRAPV special handling 2021-02-10 12:28:29 +01:00
eval.h avutil/eval: Don't include avutil.h 2022-02-24 12:56:49 +01:00
ffmath.h
fifo.c avutil/fifo: Don't include avutil.h 2022-02-24 12:56:49 +01:00
fifo.h avutil/fifo: Don't include avutil.h 2022-02-24 12:56:49 +01:00
file_open.c avutil/wchar_filename,file_open: Support long file names on Windows 2022-06-09 13:03:47 +03:00
file.c
file.h avutil/file: Don't include avutil.h 2022-02-24 12:56:49 +01:00
film_grain_params.c libavutil: introduce AVFilmGrainParams side data 2020-11-25 23:06:33 +01:00
film_grain_params.h avcodec/h264_slice: compute and export film grain seed 2021-08-24 09:58:52 -03:00
fixed_dsp.c all: Replace if (ARCH_FOO) checks by #if ARCH_FOO 2022-06-15 04:56:37 +02:00
fixed_dsp.h Remove unnecessary libavutil/(avutil|common|internal).h inclusions 2022-02-24 12:56:49 +01:00
float2half.c avutil/half2float: use native _Float16 if available 2022-08-19 22:09:36 +02:00
float2half.h avutil/half2float: use native _Float16 if available 2022-08-19 22:09:36 +02:00
float_dsp.c all: Replace if (ARCH_FOO) checks by #if ARCH_FOO 2022-06-15 04:56:37 +02:00
float_dsp.h
frame.c lavu/frame: allow calling av_frame_make_writable() on non-refcounted frames 2022-08-02 10:44:37 +02:00
frame.h lavu/frame: allow calling av_frame_make_writable() on non-refcounted frames 2022-08-02 10:44:37 +02:00
getenv_utf8.h libavutil: Add wchartoutf8(), wchartoansi(), utf8toansi(), getenv_utf8(), freeenv_utf8() and getenv_dup() 2022-06-21 13:27:46 +03:00
half2float.c avutil/half2float: use native _Float16 if available 2022-08-19 22:09:36 +02:00
half2float.h avutil/half2float: use native _Float16 if available 2022-08-19 22:09:36 +02:00
hash.c avutil: Switch crypto APIs to size_t 2021-04-27 10:43:13 -03:00
hash.h Remove obsolete version.h inclusions 2021-07-22 14:34:31 +02:00
hdr_dynamic_metadata.c
hdr_dynamic_metadata.h
hdr_dynamic_vivid_metadata.c avutil: support for CUVA Vivid HDR metadata 2022-03-01 09:08:43 +08:00
hdr_dynamic_vivid_metadata.h avutil: support for CUVA Vivid HDR metadata 2022-03-01 09:08:43 +08:00
hmac.c Remove obsolete version.h inclusions 2021-07-22 14:34:31 +02:00
hmac.h Remove obsolete version.h inclusions 2021-07-22 14:34:31 +02:00
hwcontext_cuda_internal.h
hwcontext_cuda.c avutil/hwcontext_cuda: return more useful error codes from init functions 2021-11-22 23:03:21 +01:00
hwcontext_cuda.h
hwcontext_d3d11va.c avutil/hwcontext_d3d11va: add support for rgbaf16 pixel format 2022-08-13 15:21:59 +02:00
hwcontext_d3d11va.h libavutil/hwcontext_d3d11va: adding more texture information to the D3D11 hwcontext API 2021-09-08 17:48:02 -03:00
hwcontext_drm.c hwcontext_drm: make dependency on Linux kernel headers optional 2020-12-30 23:14:46 +01:00
hwcontext_drm.h
hwcontext_dxva2.c avutil/hwcontext_dxva2: add ARGB format 2021-11-13 19:22:57 +01:00
hwcontext_dxva2.h
hwcontext_internal.h Revert "avutils/hwcontext: When deriving a hwdevice, search for existing device in both directions" 2022-01-05 11:56:58 +08:00
hwcontext_mediacodec.c
hwcontext_mediacodec.h
hwcontext_opencl.c qsv: remove mfx/ prefix from mfx headers 2022-08-12 10:43:39 +08:00
hwcontext_opencl.h
hwcontext_qsv.c lavu/hwcontext_qsv: make qsv hwdevice works with oneVPL 2022-08-12 10:43:39 +08:00
hwcontext_qsv.h lavu/hwcontext_qsv: add loader field to AVQSVDeviceContext 2022-08-12 10:43:39 +08:00
hwcontext_stub.c hwcontext: add a stub implementation for Vulkan functions 2022-07-05 15:20:08 +02:00
hwcontext_vaapi.c lavu/hwcontext_vaapi: Map the AYUV format 2022-08-03 14:10:12 -07:00
hwcontext_vaapi.h
hwcontext_vdpau.c avutil/buffer: Switch AVBuffer API to size_t 2021-04-27 10:43:13 -03:00
hwcontext_vdpau.h
hwcontext_videotoolbox.c avutil/hwcontext_videotoolbox: create real buffer pool 2022-04-29 17:27:37 +08:00
hwcontext_videotoolbox.h avutil/hwcontext_videotoolbox: add missing include for AVFrame 2022-08-08 11:08:55 +08:00
hwcontext_vulkan.c avutil/hwcontext_vulkan: fix typo in undef 2022-03-14 17:50:07 +01:00
hwcontext_vulkan.h hwcontext_vulkan: stricter semaphore number requirements 2021-12-10 17:04:22 +01:00
hwcontext.c lavu/hwcontext: clarify behavior on av_hwframe_map() failure 2022-02-17 11:05:44 +01:00
hwcontext.h lavu/hwcontext: clarify behavior on av_hwframe_map() failure 2022-02-17 11:05:44 +01:00
imgutils_internal.h
imgutils.c imgutils: expose av_image_copy_plane_uc_from() 2021-08-14 00:27:43 +02:00
imgutils.h avutil/imgutils: Don't include avutil.h 2022-02-24 12:56:49 +01:00
integer.c avutil/integer: Don't include common.h 2022-02-24 12:56:49 +01:00
integer.h avutil/integer: Don't include common.h 2022-02-24 12:56:49 +01:00
internal.h libavutil: Deprecate av_fopen_utf8, provide an avpriv version 2022-05-23 13:52:26 +03:00
intfloat.h
intmath.c
intmath.h
intreadwrite.h
lfg.c
lfg.h
libavutil.v
libm.h
lls.c all: Replace if (ARCH_FOO) checks by #if ARCH_FOO 2022-06-15 04:56:37 +02:00
lls.h Remove obsolete version.h inclusions 2021-07-22 14:34:31 +02:00
log2_tab.c
log.c Remove unnecessary libavutil/(avutil|common|internal).h inclusions 2022-02-24 12:56:49 +01:00
log.h avutil/log: Don't include avutil.h 2022-02-24 12:56:49 +01:00
lzo.c Remove unnecessary libavutil/(avutil|common|internal).h inclusions 2022-02-24 12:56:49 +01:00
lzo.h
macos_kperf.c lavu/kperf: use ff_thread_once() 2021-07-21 16:35:27 +02:00
macos_kperf.h lavu/kperf: use ff_thread_once() 2021-07-21 16:35:27 +02:00
macros.h avutil/common, macros: Move several macros from common.h to macros.h 2021-07-29 22:02:05 +02:00
Makefile configure: always enable gnu_windres if available 2022-08-13 14:42:36 +02:00
mastering_display_metadata.c
mastering_display_metadata.h
mathematics.c avutil/avassert: Don't include avutil.h 2022-02-24 12:56:49 +01:00
mathematics.h avutil/mathematics: Document av_rescale_rnd() behavior on non int64 results 2021-10-21 14:13:03 +02:00
md5.c avutil/md5: Avoid av_unused variable 2021-10-02 17:13:57 +02:00
md5.h Remove obsolete version.h inclusions 2021-07-22 14:34:31 +02:00
mem_internal.h avutil/mem: make ff_fast_malloc() internal to mem.c 2021-05-27 10:29:52 -03:00
mem.c avutil/mem: Handle fast allocations near UINT_MAX properly 2022-07-06 22:53:15 +02:00
mem.h avutil/mem: fix doc for reallocs 2022-05-26 17:18:23 +08:00
motion_vector.h
murmur3.c avutil: Switch crypto APIs to size_t 2021-04-27 10:43:13 -03:00
murmur3.h Remove obsolete version.h inclusions 2021-07-22 14:34:31 +02:00
objc.h avutil: add obj-c helpers into header-only include 2021-12-18 11:55:47 -08:00
opt.c avutil/opt: Combine multiple av_log statements 2022-08-03 21:09:24 +02:00
opt.h lavu: support AVChannelLayout AVOptions 2022-03-15 09:42:29 -03:00
parseutils.c avutil/parseutils: use quadhd for Quad HD 2022-01-12 13:42:26 +08:00
parseutils.h
pca.c
pca.h
pixdesc.c lavu/pixfmt: add packed RGBA float16 format 2022-08-13 15:21:46 +02:00
pixdesc.h Remove obsolete version.h inclusions 2021-07-22 14:34:31 +02:00
pixelutils.c Remove unnecessary libavutil/(avutil|common|internal).h inclusions 2022-02-24 12:56:49 +01:00
pixelutils.h avutil/pixelutils: Don't include common.h 2022-02-24 12:56:49 +01:00
pixfmt.h lavu/pixfmt: add packed RGBA float16 format 2022-08-13 15:21:46 +02:00
qsort.h Remove unnecessary libavutil/(avutil|common|internal).h inclusions 2022-02-24 12:56:49 +01:00
random_seed.c
random_seed.h
rational.c lavu: add av_gcd_q(). 2020-05-23 15:51:44 +02:00
rational.h lavu: add av_gcd_q(). 2020-05-23 15:51:44 +02:00
rc4.c Remove unnecessary libavutil/(avutil|common|internal).h inclusions 2022-02-24 12:56:49 +01:00
rc4.h
replaygain.h
reverse.c
reverse.h
ripemd.c Remove unnecessary libavutil/(avutil|common|internal).h inclusions 2022-02-24 12:56:49 +01:00
ripemd.h Remove obsolete version.h inclusions 2021-07-22 14:34:31 +02:00
samplefmt.c avutil/samplefmt: Don't include attributes.h, avutil.h 2022-02-24 12:56:49 +01:00
samplefmt.h avutil/samplefmt: Don't include attributes.h, avutil.h 2022-02-24 12:56:49 +01:00
sha512.c Remove unnecessary libavutil/(avutil|common|internal).h inclusions 2022-02-24 12:56:49 +01:00
sha512.h Remove obsolete version.h inclusions 2021-07-22 14:34:31 +02:00
sha.c Remove unnecessary libavutil/(avutil|common|internal).h inclusions 2022-02-24 12:56:49 +01:00
sha.h Remove obsolete version.h inclusions 2021-07-22 14:34:31 +02:00
slicethread.c avutil/avassert: Don't include avutil.h 2022-02-24 12:56:49 +01:00
slicethread.h
softfloat_ieee754.h
softfloat_tables.h
softfloat.h
spherical.c avutil/spherical: Use av_strstart instead of strncmp 2021-02-28 17:14:21 +01:00
spherical.h
stereo3d.c Remove unnecessary libavutil/(avutil|common|internal).h inclusions 2022-02-24 12:56:49 +01:00
stereo3d.h
tablegen.h
tea.c Remove unnecessary libavutil/(avutil|common|internal).h inclusions 2022-02-24 12:56:49 +01:00
tea.h
thread.h avutil/log: Don't include avutil.h 2022-02-24 12:56:49 +01:00
threadmessage.c avutil/fifo: Don't include avutil.h 2022-02-24 12:56:49 +01:00
threadmessage.h
time_internal.h
time.c lavu: use address-of operator checking clock_gettime 2020-12-28 01:12:26 -03:00
time.h
timecode.c avutil/timecode: use timecode fps for number of frame digits 2022-04-22 22:54:58 +02:00
timecode.h avutil/timecode: add av_timecode_init_from_components 2020-12-03 18:32:54 +01:00
timer.h avutil/log: Don't include avutil.h 2022-02-24 12:56:49 +01:00
timestamp.h
tree.c
tree.h Remove obsolete version.h inclusions 2021-07-22 14:34:31 +02:00
twofish.c Remove unnecessary libavutil/(avutil|common|internal).h inclusions 2022-02-24 12:56:49 +01:00
twofish.h
tx_double.c
tx_float.c
tx_int32.c lavu/tx: implement 32 bit fixed point FFT and MDCT 2020-02-13 17:10:34 +00:00
tx_priv.h lavu/tx: optimize and simplify inverse MDCTs 2022-08-16 01:22:38 +02:00
tx_template.c lavu/tx: optimize and simplify inverse MDCTs 2022-08-16 01:22:38 +02:00
tx.c lavu/tx: optimize and simplify inverse MDCTs 2022-08-16 01:22:38 +02:00
tx.h avutil/tx: Fix documentation of av_tx_uninit() 2022-02-11 19:38:41 +01:00
utils.c lib*/version: Move library version functions into files of their own 2022-05-10 06:49:32 +02:00
uuid.c avutil/uuid: add utility library for manipulating UUIDs as specified in RFC 4122 2022-06-12 18:34:28 +10:00
uuid.h avutil/uuid: add utility library for manipulating UUIDs as specified in RFC 4122 2022-06-12 18:34:28 +10:00
version_major.h Fix libversion.sh for split version headers, to unbreak shared library builds 2022-03-17 11:11:17 +02:00
version.c lib*/version: Move library version functions into files of their own 2022-05-10 06:49:32 +02:00
version.h lavu/pixfmt: add packed RGBA float16 format 2022-08-13 15:21:46 +02:00
video_enc_params.c Remove unnecessary libavutil/(avutil|common|internal).h inclusions 2022-02-24 12:56:49 +01:00
video_enc_params.h mpegvideo: use the AVVideoEncParams API for exporting QP tables 2021-01-01 14:23:19 +01:00
vulkan_functions.h hwcontext_vulkan: avoid using 64-bit enums 2022-01-27 10:27:09 +01:00
vulkan_glslang.c avutil/vulkan_glslang: fix compiling failure issue 2021-11-19 16:47:48 +01:00
vulkan_loader.h vulkan_loader: fix typo in error message 2021-11-18 06:40:52 +01:00
vulkan_shaderc.c lavu/vulkan: add support for using libshaderc as a GLSL compiler 2021-11-19 16:47:30 +01:00
vulkan.c lavu/vulkan: avoid using strlen as a loop condition 2022-02-22 06:30:12 +01:00
vulkan.h vulkan: fix checkheaders 2021-11-19 16:47:28 +01:00
wchar_filename.h avutil/wchar_filename: Make the header C++ compatible 2022-06-28 10:59:31 +02:00
xga_font_data.c
xga_font_data.h
xtea.c Remove unnecessary libavutil/(avutil|common|internal).h inclusions 2022-02-24 12:56:49 +01:00
xtea.h