FFmpeg

mirror of https://github.com/FFmpeg/FFmpeg.git synced 2024-12-18 03:19:31 +02:00

Author	SHA1	Message	Date
Andreas Rheinhardt	75c6a253a4	avcodec/vp3: Avoid complete VLC struct, only use VLCElem* Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2023-10-31 20:47:00 +01:00
Andreas Rheinhardt	6c7a344b65	avcodec/vp3: Share coefficient VLCs between threads These VLCs are very big: The VP3 ones have 164382 elements, but due to overallocation enough memory for 313344 elements is allocated (1.195 MiB with sizeof(VLCElem) == 4); for VP4 the numbers are very similar, namely 311296 and 164392 elements. Since `1f4cf92cfb`, each frame thread has its own copy of these VLCs. This commit fixes this by sharing these VLCs across threads. The approach used here will also make it easier to support stream reconfigurations in case of frame-multithreading in the future. Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2023-10-31 20:47:00 +01:00
Andreas Rheinhardt	7fee90efac	avcodec/imc: Avoid superfluous VLC structures Of all these VLCs here, only VLC.table was really used after init, so use the ff_vlc_init_tables API to get rid of them. Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2023-10-31 20:47:00 +01:00
Andreas Rheinhardt	6fb96ef755	avcodec/atrac9dec: Avoid superfluous VLC structures Of all these VLCs here, only VLC.table was really used after init, so use the ff_vlc_init_tables API to get rid of them. Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2023-10-31 20:47:00 +01:00
Andreas Rheinhardt	4c7e8b969e	avcodec/clearvideo: Avoid superfluous VLC structures Of all these VLCs here, only VLC.table was really used after init, so use the ff_vlc_init_tables API to get rid of them. Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2023-10-31 20:47:00 +01:00
Andreas Rheinhardt	c95e123e8c	avcodec/intrax8: Avoid superfluous VLC structures Of all these VLCs here, only VLC.table was really used after init, so use the ff_vlc_init_tables API to get rid of them. Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2023-10-31 20:47:00 +01:00
Andreas Rheinhardt	886fbec82f	avcodec/mpc7: Avoid superfluous VLC structures Of all these VLCs here, only VLC.table was really used after init, so use the ff_vlc_init_tables API to get rid of them. Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2023-10-31 20:47:00 +01:00
Andreas Rheinhardt	e5e05fd3c8	avcodec/rv40: Avoid superfluous VLC structures Of all these VLCs here, only VLC.table was really used after init, so use the ff_vlc_init_tables API to get rid of them. Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2023-10-31 20:47:00 +01:00
Andreas Rheinhardt	460c6ae597	avcodec/svq1dec: Increase size of VLC This reduces the maximum number of reloads by one. Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2023-10-31 20:47:00 +01:00
Andreas Rheinhardt	7d542e26a9	avcodec/svq1dec: Avoid superfluous VLC structures Of all these VLCs here, only VLC.table was really used after init, so use the ff_vlc_init_tables API to get rid of them. Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2023-10-31 20:47:00 +01:00
Andreas Rheinhardt	7902c0df4c	avcodec/msmpeg4_vc1_data: Avoid superfluous VLC structures Of all these VLCs here, only VLC.table was really used after init, so use the ff_vlc_init_tables API to get rid of them. Also combine the ff_msmp4_dc_(luma|chroma)_vlcs as well as the tables used to generate them to simplify the code. Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2023-10-31 20:46:59 +01:00
Andreas Rheinhardt	ff886fc282	avcodec/ituh263dec: Avoid superfluous VLC structures Of all these VLCs here, only VLC.table was really used after init, so use the ff_vlc_init_tables API to get rid of them. Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2023-10-31 20:46:59 +01:00
Andreas Rheinhardt	5a9e185dfc	avcodec/h261dec: Avoid superfluous VLC structures Of all these VLCs here, only VLC.table was really used after init, so use the ff_vlc_init_tables API to get rid of them. Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2023-10-31 20:46:59 +01:00
Andreas Rheinhardt	363837de0e	avcodec/faxcompr: Avoid superfluous VLC structures Of all these VLCs here, only VLC.table was really used after init, so use the ff_vlc_init_tables API to get rid of them. Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2023-10-31 20:46:59 +01:00
Andreas Rheinhardt	a99285aedf	avcodec/asvdec: Avoid superfluous VLC structures Of all these VLCs here, only VLC.table was really used after init, so use the ff_vlc_init_tables API to get rid of them. Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2023-10-31 20:46:59 +01:00
Andreas Rheinhardt	ab8a8246c8	avcodec/h264_cavlc: Remove code duplication Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2023-10-31 20:46:59 +01:00
Andreas Rheinhardt	bd4c778e19	avcodec/h264_cavlc: Avoid indirection for coefficient table VLCs Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2023-10-31 20:46:59 +01:00
Andreas Rheinhardt	fe748ddf62	avcodec/h264_cavlc: Avoid superfluous VLC structures Of all these VLCs here, only VLC.table was really used after init, so use the ff_vlc_init_tables API to get rid of them. Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2023-10-31 20:46:59 +01:00
Andreas Rheinhardt	c630d76b27	avcodec/vp3: Increase some VLC tables These are quite small and therefore force reloads that can be avoided by modest increases in the number of bits used. Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2023-10-31 20:46:59 +01:00
Andreas Rheinhardt	1fee3a3dce	avcodec/vp3: Make VLC tables static where possible This is especially important for frame-threaded decoders like this one, because up until now each thread had an identical copy of all these VLC tables. Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2023-10-31 20:46:59 +01:00
Andreas Rheinhardt	edc50658d9	avcodec/vlc: Add functions to init static VLCElem[] without VLC For lots of static VLCs, the number of bits is not read from VLC.bits, but rather a compile-constant that is hardcoded at the callsite of get_vlc2(). Only VLC.table is ever used and not using it directly is just an unnecessary indirection. This commit adds helper functions and macros to avoid the VLC structure when initializing VLC tables; there are 2x2 functions: Two choices for init_sparse or from_lengths and two choices for "overlong" initialization (as used when multiple VLCs are initialized that share the same underlying table). Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2023-10-31 20:46:59 +01:00
Rémi Denis-Courmont	424c8ceb08	lavc/huffyuvdsp: R-V V add_int16 add_int16_128_c: 2390.5 add_int16_128_rvv_i32: 832.0 add_int16_rnd_width_c: 2390.2 add_int16_rnd_width_rvv_i32: 832.5	2023-10-31 21:33:25 +02:00
Rémi Denis-Courmont	7e1cdc69fb	lavc/utvideodsp: R-V V restore_rgb_planes10 restore_rgb_planes10_c: 185852.2 restore_rgb_planes10_rvv_i32: 90130.5	2023-10-31 21:33:25 +02:00
Rémi Denis-Courmont	4aea0da230	lavc/utvideodsp: R-V V restore_rgb_planes restore_rgb_planes_c: 133065.7 restore_rgb_planes_rvv_i32: 33317.2	2023-10-31 21:33:25 +02:00
Niklas Haas	6aff17a451	avformat/vf_vapoursynth: simplify xyz format check	2023-10-31 15:46:38 +01:00
Niklas Haas	93f07d98d9	avutil/pixdesc: simplify xyz pixfmt check	2023-10-31 15:46:38 +01:00
Niklas Haas	d312a33ed2	avfilter/drawutils: remove redundant xyz format check The code above this does a whitelist on desc->flags, which now includes the (disallowed) AV_PIX_FMT_FLAG_XYZ for XYZ formats. So there is no more need for a separate check, here.	2023-10-31 15:46:38 +01:00
Niklas Haas	57c16323f2	avutil/pixdesc: add AV_PIX_FMT_FLAG_XYZ There are already several places in the codebase that match desc->name against "xyz", and many downstream clients replicate this behavior. I have no idea why this is not just a flag. Motivated by my desire to add yet another check for XYZ to the codebase, and I'd rather not keep copy/pasting a string comparison hack.	2023-10-31 15:46:07 +01:00
Niklas Haas	96dfc4481b	avfilter/drawutils: ban XYZ formats These are not supported by the drawing functions at all, and were incorrectly advertised as supported in the past. Note: This check is added only to separate the logic change from the API change in the following commit, and will be removed again after it becomes redundant.	2023-10-31 15:43:30 +01:00
Logan Lyu	55f28eb627	lavc/aarch64: new optimization for 8-bit hevc_qpel_hv checkasm bench: put_hevc_qpel_hv4_8_c: 422.1 put_hevc_qpel_hv4_8_i8mm: 101.6 put_hevc_qpel_hv6_8_c: 756.4 put_hevc_qpel_hv6_8_i8mm: 225.9 put_hevc_qpel_hv8_8_c: 1189.9 put_hevc_qpel_hv8_8_i8mm: 296.6 put_hevc_qpel_hv12_8_c: 2407.4 put_hevc_qpel_hv12_8_i8mm: 552.4 put_hevc_qpel_hv16_8_c: 4021.4 put_hevc_qpel_hv16_8_i8mm: 886.6 put_hevc_qpel_hv24_8_c: 8992.1 put_hevc_qpel_hv24_8_i8mm: 1968.9 put_hevc_qpel_hv32_8_c: 15197.9 put_hevc_qpel_hv32_8_i8mm: 3209.4 put_hevc_qpel_hv48_8_c: 32811.1 put_hevc_qpel_hv48_8_i8mm: 7442.1 put_hevc_qpel_hv64_8_c: 58106.1 put_hevc_qpel_hv64_8_i8mm: 12423.9 Co-Authored-By: J. Dekker <jdek@itanimul.li> Signed-off-by: Martin Storsjö <martin@martin.st>	2023-10-31 14:14:21 +02:00
Logan Lyu	97a9d12657	lavc/aarch64: new optimization for 8-bit hevc_qpel_v checkasm bench: put_hevc_qpel_v4_8_c: 138.1 put_hevc_qpel_v4_8_neon: 41.1 put_hevc_qpel_v6_8_c: 276.6 put_hevc_qpel_v6_8_neon: 60.9 put_hevc_qpel_v8_8_c: 478.9 put_hevc_qpel_v8_8_neon: 72.9 put_hevc_qpel_v12_8_c: 1072.6 put_hevc_qpel_v12_8_neon: 203.9 put_hevc_qpel_v16_8_c: 1852.1 put_hevc_qpel_v16_8_neon: 264.1 put_hevc_qpel_v24_8_c: 4137.6 put_hevc_qpel_v24_8_neon: 586.9 put_hevc_qpel_v32_8_c: 7579.1 put_hevc_qpel_v32_8_neon: 1036.6 put_hevc_qpel_v48_8_c: 16355.6 put_hevc_qpel_v48_8_neon: 2326.4 put_hevc_qpel_v64_8_c: 33545.1 put_hevc_qpel_v64_8_neon: 4126.4 Co-Authored-By: J. Dekker <jdek@itanimul.li> Signed-off-by: Martin Storsjö <martin@martin.st>	2023-10-31 14:14:21 +02:00
Logan Lyu	265450b89e	lavc/aarch64: new optimization for 8-bit hevc_epel_hv checkasm bench: put_hevc_epel_hv4_8_c: 213.7 put_hevc_epel_hv4_8_i8mm: 59.4 put_hevc_epel_hv6_8_c: 350.9 put_hevc_epel_hv6_8_i8mm: 130.2 put_hevc_epel_hv8_8_c: 548.7 put_hevc_epel_hv8_8_i8mm: 136.9 put_hevc_epel_hv12_8_c: 1126.7 put_hevc_epel_hv12_8_i8mm: 302.2 put_hevc_epel_hv16_8_c: 1925.2 put_hevc_epel_hv16_8_i8mm: 459.9 put_hevc_epel_hv24_8_c: 4301.9 put_hevc_epel_hv24_8_i8mm: 1024.9 put_hevc_epel_hv32_8_c: 7509.2 put_hevc_epel_hv32_8_i8mm: 1680.4 put_hevc_epel_hv48_8_c: 16566.9 put_hevc_epel_hv48_8_i8mm: 3945.4 put_hevc_epel_hv64_8_c: 29134.2 put_hevc_epel_hv64_8_i8mm: 6567.7 Co-Authored-By: J. Dekker <jdek@itanimul.li> Signed-off-by: Martin Storsjö <martin@martin.st>	2023-10-31 14:02:53 +02:00
Logan Lyu	22c7291506	lavc/aarch64: new optimization for 8-bit hevc_epel_v checkasm bench: put_hevc_epel_v4_8_c: 79.9 put_hevc_epel_v4_8_neon: 25.7 put_hevc_epel_v6_8_c: 151.4 put_hevc_epel_v6_8_neon: 46.4 put_hevc_epel_v8_8_c: 250.9 put_hevc_epel_v8_8_neon: 41.7 put_hevc_epel_v12_8_c: 542.7 put_hevc_epel_v12_8_neon: 108.7 put_hevc_epel_v16_8_c: 939.4 put_hevc_epel_v16_8_neon: 169.2 put_hevc_epel_v24_8_c: 2104.9 put_hevc_epel_v24_8_neon: 307.9 put_hevc_epel_v32_8_c: 3713.9 put_hevc_epel_v32_8_neon: 524.2 put_hevc_epel_v48_8_c: 8175.2 put_hevc_epel_v48_8_neon: 1197.2 put_hevc_epel_v64_8_c: 16049.4 put_hevc_epel_v64_8_neon: 2094.9 Co-Authored-By: J. Dekker <jdek@itanimul.li> Signed-off-by: Martin Storsjö <martin@martin.st>	2023-10-31 14:02:53 +02:00
Logan Lyu	772865717b	lavc/aarch64: new optimization for 8-bit hevc_epel_pixels and hevc_qpel_pixels checkasm bench: put_hevc_pel_pixels4_8_c: 33.7 put_hevc_pel_pixels4_8_neon: 20.2 put_hevc_pel_pixels6_8_c: 61.4 put_hevc_pel_pixels6_8_neon: 25.4 put_hevc_pel_pixels8_8_c: 121.4 put_hevc_pel_pixels8_8_neon: 16.9 put_hevc_pel_pixels12_8_c: 199.9 put_hevc_pel_pixels12_8_neon: 40.2 put_hevc_pel_pixels16_8_c: 355.9 put_hevc_pel_pixels16_8_neon: 43.4 put_hevc_pel_pixels24_8_c: 774.7 put_hevc_pel_pixels24_8_neon: 78.9 put_hevc_pel_pixels32_8_c: 1345.2 put_hevc_pel_pixels32_8_neon: 152.2 put_hevc_pel_pixels48_8_c: 2963.7 put_hevc_pel_pixels48_8_neon: 309.4 put_hevc_pel_pixels64_8_c: 5236.2 put_hevc_pel_pixels64_8_neon: 514.2 Co-Authored-By: J. Dekker <jdek@itanimul.li> Signed-off-by: Martin Storsjö <martin@martin.st>	2023-10-31 14:02:53 +02:00
Martin Storsjö	2c3d2a0245	configure: Improve aarch64 feature detection on older, broken Clang versions Clang versions before 17 (Xcode versions up to and including 15.0) had a very annoying bug in their handling of the ".arch" directive in assembly. If the directive only contained a level, such as ".arch armv8.2-a", it did validate the name of the level, but it didn't apply the level to what instructions are allowed. The level was applied if the directive contained an extra feature enabled, such as ".arch armv8.2-a+crc" though. It was also applied on the next ".arch_extension" directive. This bug, combined with the fact that the same versions of Clang didn't support the dotprod/i8mm extension names in either ".arch <level>+<feature>" or in ".arch_extension", could lead to unexpected build failures. As the dotprod/i8mm extensions couldn't be enabled dynamically via the ".arch_extension" directive, someone building ffmpeg could try to enable them by configuring their build with --extra-cflags="-march=armv8.6-a". During configure, we test for support for the i8mm instructions like this: # Built with -march=armv8.6-a .arch armv8.2-a # Has no visible effect here #.arch_extension i8mm # Omitted as the extension name isn't known usdot v0.4s, v0.16b, v0.16b # Successfully assembled as armv8.6-a is the effective level, # and i8mm is enabled implicitly in armv8.6-a. Thus, we would enable assembling those instructions. However if we later check for another extension, such as sve (which those versions of Clang actually do support), we can later run into the following situation when building actual code: # Built with -march=armv8.6-a .arch armv8.2-a # Has no visible effect here #.arch_extension i8mm # Omitted as the extension name isn't known .arch_extension sve # Included as "sve" is a supported extension name # .arch_extension effectively activates the previous .arch directive, # so the effective level is armv8.2-a+sve now.
usdot v0.4s, v0.16b, v0.16b # Fails to build the instructions that require i8mm. Despite the # configure check, the unrelated ".arch_extension sve" directive # breaks the functionality of the i8mm feature. This patch avoids this situation: - By adding a dummy feature such as "+crc" on the .arch directive (if supported), we make sure that it does get applied immediately, avoiding it taking effect spuriously at a later unrelated ".arch_extension" directive. - By checking for higher arch levels such as armv8.4-a and armv8.6-a, we can assemble the dotprod and i8mm extensions without the user needing to pass -march=armv8.6-a. This allows using the dotprod/i8mm codepaths via runtime detection while keeping the binary runnable on older versions. I.e. this enables the i8mm codepaths on Apple M2 machines while built with Xcode's Clang. TL;DR: Enable the I8MM extensions for Apple M2 without the user needing to do a custom configuration; avoid potential build breakage if a user does such a custom configuration. Once Xcode versions that have these issues fixed are prevalent, we can consider reverting this change. Signed-off-by: Martin Storsjö <martin@martin.st>	2023-10-31 12:23:31 +02:00
Martin Storsjö	f05948ada4	aarch64: Simplify the linux runtime cpu detection code Skip doing the whole getauxval(AT_HWCAP) if HWCAP_CPUID isn't defined. Signed-off-by: Martin Storsjö <martin@martin.st>	2023-10-31 12:23:27 +02:00
Rémi Denis-Courmont	ae72412aa8	lavc/idctdsp: improve R-V V put_pixels_clamped	2023-10-30 18:14:16 +02:00
Rémi Denis-Courmont	d48810f3a5	lavc/idctdsp: improve R-V V add_pixels_clamped	2023-10-30 18:14:16 +02:00
Rémi Denis-Courmont	600c6f1b55	lavc/idctdsp: improve R-V V put_signed_pixels_clamped This follows the same idea as with pixblockdsp, but applied at the other end, whilst writing data at the end of the function.	2023-10-30 18:14:16 +02:00
Rémi Denis-Courmont	3ea2310e89	lavc/idctdsp: require Zve64x for R-V V functions This will be required for the following changesets.	2023-10-30 18:14:16 +02:00
Rémi Denis-Courmont	300ee8b02d	lavc/pixblockdsp: aligned R-V V 8-bit functions If the scan lines are aligned, we can load each row as a 64-bit value, thus avoiding segmentation. And then we can factor the conversion or subtraction. In principle, the same optimisation should be possible for high depth, but would require 128-bit elements, for which no FFmpeg CPU flag exists.	2023-10-30 18:14:16 +02:00
Rémi Denis-Courmont	722765687b	lavc/pixblockdsp: rename unaligned R-V V functions	2023-10-30 18:14:16 +02:00
Paul B Mahol	6323ca5902	avfilter/vf_feedback: add timeline support	2023-10-30 16:06:46 +01:00
Paul B Mahol	2f268505b9	doc/filters: add one more example for feedback filter	2023-10-30 15:12:12 +01:00
James Almer	4cba3e0f07	avutil/video_enc_params: fix doxy for av_video_enc_params_block() Reviewed-by: Anton Khirnov <anton@khirnov.net> Signed-off-by: James Almer <jamrial@gmail.com>	2023-10-30 10:30:05 -03:00
Kieran Kunhya	2532e832d2	libavcodec/mpeg12: Reindent	2023-10-29 22:12:05 +00:00
Kieran Kunhya	7d497a1119	libavcodec/mpeg12: Remove "fast" mode	2023-10-29 22:12:02 +00:00
Rémi Denis-Courmont	04b49fb3c5	lavu/riscv: fix typo	2023-10-29 22:15:15 +02:00
TADANO Tokumei	a824c6f2f6	lavc/libaribcaption: rename `replace_fullwidth_ascii` to `replace_msz_ascii` This should hopefully clarify that the option only affects MSZ full-width characters, and not all full-width ASCII. Additionally, this matches the prefix with the upstream option. Signed-off-by: TADANO Tokumei <aimingoff@pc.nifty.jp>	2023-10-29 18:21:05 +02:00
TADANO Tokumei	21bfadd9b4	lavc/libaribcaption: add MSZ character related options This patch adds two MSZ (Middle Size; half width) character related options, mapping against newly added upstream functionality: * `replace_msz_japanese`, which was introduced in version 1.0.1 of libaribcaption. * `replace_msz_glyph`, which was introduced in version 1.1.0 of libaribcaption. The latter option improves bitmap type rendering if specified fonts contain half-width glyphs (e.g., BIZ UDGothic), even if both ASCII and Japanese MSZ replacement options are set to false. As these options require newer versions of libaribcaption, the configure requirement has been bumped accordingly. Signed-off-by: TADANO Tokumei <aimingoff@pc.nifty.jp>	2023-10-29 18:20:43 +02:00

112895 Commits