mirror of https://github.com/FFmpeg/FFmpeg.git synced 2024-12-07 11:13:41 +02:00

48939 Commits

Author SHA1 Message Date
Andreas Rheinhardt
a99285aedf avcodec/asvdec: Avoid superfluous VLC structures
Of all these VLCs here, only VLC.table was really used
after init, so use the ff_vlc_init_tables API
to get rid of them.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2023-10-31 20:46:59 +01:00
Andreas Rheinhardt
ab8a8246c8 avcodec/h264_cavlc: Remove code duplication
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2023-10-31 20:46:59 +01:00
Andreas Rheinhardt
bd4c778e19 avcodec/h264_cavlc: Avoid indirection for coefficient table VLCs
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2023-10-31 20:46:59 +01:00
Andreas Rheinhardt
fe748ddf62 avcodec/h264_cavlc: Avoid superfluous VLC structures
Of all these VLCs here, only VLC.table was really used
after init, so use the ff_vlc_init_tables API
to get rid of them.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2023-10-31 20:46:59 +01:00
Andreas Rheinhardt
c630d76b27 avcodec/vp3: Increase some VLC tables
These are quite small and therefore force reloads
that can be avoided by modest increases in the number of bits used.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2023-10-31 20:46:59 +01:00
Andreas Rheinhardt
1fee3a3dce avcodec/vp3: Make VLC tables static where possible
This is especially important for frame-threaded decoders like
this one, because up until now each thread had an identical
copy of all these VLC tables.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2023-10-31 20:46:59 +01:00
Andreas Rheinhardt
edc50658d9 avcodec/vlc: Add functions to init static VLCElem[] without VLC
For lots of static VLCs, the number of bits is not read from
VLC.bits, but rather a compile-constant that is hardcoded
at the callsite of get_vlc2(). Only VLC.table is ever used
and not using it directly is just an unnecessary indirection.

This commit adds helper functions and macros to avoid the VLC
structure when initializing VLC tables; there are 2x2 functions:
Two choices for init_sparse or from_lengths and two choices
for "overlong" initialization (as used when multiple VLCs are
initialized that share the same underlying table).

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2023-10-31 20:46:59 +01:00
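The indirection described in the commit above can be illustrated with a minimal sketch. Everything here is hypothetical: `Elem`, `Wrapper`, `demo_table`, `peek_bits` and the two decode functions are simplified stand-ins invented for illustration, not FFmpeg's actual VLC types or the new init helpers. The point is only that, when the bit count is a compile-time constant, the table pointer is the sole piece of state the decoder needs.

```c
#include <assert.h>
#include <stdint.h>

#define TABLE_BITS 2 /* compile-time constant, hardcoded at the call site */

/* Hypothetical stand-ins, not FFmpeg's real types. */
typedef struct Elem { int16_t sym; int16_t len; } Elem;
typedef struct Wrapper { int bits; const Elem *table; } Wrapper;

/* Prefix code: 0 -> 'A', 10 -> 'B', 11 -> 'C'; indexed by the next 2 bits. */
const Elem demo_table[1 << TABLE_BITS] = {
    { 'A', 1 }, { 'A', 1 }, { 'B', 2 }, { 'C', 2 },
};

/* Peek TABLE_BITS bits at bit position pos (MSB first), zero-padded at the end. */
unsigned peek_bits(const uint8_t *buf, unsigned total_bits, unsigned pos)
{
    unsigned v = 0;
    for (unsigned i = 0; i < TABLE_BITS; i++) {
        unsigned p   = pos + i;
        unsigned bit = p < total_bits ? (buf[p >> 3] >> (7 - (p & 7))) & 1u : 0u;
        v = (v << 1) | bit;
    }
    return v;
}

/* Indirect variant: the wrapper is dereferenced only to reach the table. */
int decode_via_wrapper(const Wrapper *w, const uint8_t *buf,
                       unsigned total_bits, unsigned *pos)
{
    const Elem *e = &w->table[peek_bits(buf, total_bits, *pos)];
    assert(e->len > 0);
    *pos += (unsigned)e->len;
    return e->sym;
}

/* Direct variant: the static table is used without any wrapper. */
int decode_direct(const Elem *table, const uint8_t *buf,
                  unsigned total_bits, unsigned *pos)
{
    const Elem *e = &table[peek_bits(buf, total_bits, *pos)];
    assert(e->len > 0);
    *pos += (unsigned)e->len;
    return e->sym;
}
```

Both variants decode the same symbols; the direct one simply skips one pointer load per lookup and needs no wrapper object to exist after initialization.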
Rémi Denis-Courmont
424c8ceb08 lavc/huffyuvdsp: R-V V add_int16
add_int16_128_c:      2390.5
add_int16_128_rvv_i32: 832.0
add_int16_rnd_width_c:      2390.2
add_int16_rnd_width_rvv_i32: 832.5
2023-10-31 21:33:25 +02:00
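For reading benchmark pairs like the one above, a small helper that turns a C timing and an optimized timing into a speedup factor may be handy. The function name `speedup` is an assumption made for this sketch; it is not part of checkasm.

```c
#include <assert.h>

/* Ratio of the reference (C) timing to the optimized timing.
 * A result above 1.0 means the optimized version is faster. */
double speedup(double c_time, double opt_time)
{
    assert(opt_time > 0.0);
    return c_time / opt_time;
}
```

With the add_int16_128 figures above, `speedup(2390.5, 832.0)` comes out at roughly 2.87.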
Rémi Denis-Courmont
7e1cdc69fb lavc/utvideodsp: R-V V restore_rgb_planes10
restore_rgb_planes10_c:      185852.2
restore_rgb_planes10_rvv_i32: 90130.5
2023-10-31 21:33:25 +02:00
Rémi Denis-Courmont
4aea0da230 lavc/utvideodsp: R-V V restore_rgb_planes
restore_rgb_planes_c:      133065.7
restore_rgb_planes_rvv_i32: 33317.2
2023-10-31 21:33:25 +02:00
Logan Lyu
55f28eb627 lavc/aarch64: new optimization for 8-bit hevc_qpel_hv
checkasm bench:
put_hevc_qpel_hv4_8_c: 422.1
put_hevc_qpel_hv4_8_i8mm: 101.6
put_hevc_qpel_hv6_8_c: 756.4
put_hevc_qpel_hv6_8_i8mm: 225.9
put_hevc_qpel_hv8_8_c: 1189.9
put_hevc_qpel_hv8_8_i8mm: 296.6
put_hevc_qpel_hv12_8_c: 2407.4
put_hevc_qpel_hv12_8_i8mm: 552.4
put_hevc_qpel_hv16_8_c: 4021.4
put_hevc_qpel_hv16_8_i8mm: 886.6
put_hevc_qpel_hv24_8_c: 8992.1
put_hevc_qpel_hv24_8_i8mm: 1968.9
put_hevc_qpel_hv32_8_c: 15197.9
put_hevc_qpel_hv32_8_i8mm: 3209.4
put_hevc_qpel_hv48_8_c: 32811.1
put_hevc_qpel_hv48_8_i8mm: 7442.1
put_hevc_qpel_hv64_8_c: 58106.1
put_hevc_qpel_hv64_8_i8mm: 12423.9

Co-Authored-By: J. Dekker <jdek@itanimul.li>
Signed-off-by: Martin Storsjö <martin@martin.st>
2023-10-31 14:14:21 +02:00
Logan Lyu
97a9d12657 lavc/aarch64: new optimization for 8-bit hevc_qpel_v
checkasm bench:

put_hevc_qpel_v4_8_c: 138.1
put_hevc_qpel_v4_8_neon: 41.1
put_hevc_qpel_v6_8_c: 276.6
put_hevc_qpel_v6_8_neon: 60.9
put_hevc_qpel_v8_8_c: 478.9
put_hevc_qpel_v8_8_neon: 72.9
put_hevc_qpel_v12_8_c: 1072.6
put_hevc_qpel_v12_8_neon: 203.9
put_hevc_qpel_v16_8_c: 1852.1
put_hevc_qpel_v16_8_neon: 264.1
put_hevc_qpel_v24_8_c: 4137.6
put_hevc_qpel_v24_8_neon: 586.9
put_hevc_qpel_v32_8_c: 7579.1
put_hevc_qpel_v32_8_neon: 1036.6
put_hevc_qpel_v48_8_c: 16355.6
put_hevc_qpel_v48_8_neon: 2326.4
put_hevc_qpel_v64_8_c: 33545.1
put_hevc_qpel_v64_8_neon: 4126.4

Co-Authored-By: J. Dekker <jdek@itanimul.li>
Signed-off-by: Martin Storsjö <martin@martin.st>
2023-10-31 14:14:21 +02:00
Logan Lyu
265450b89e lavc/aarch64: new optimization for 8-bit hevc_epel_hv
checkasm bench:
put_hevc_epel_hv4_8_c: 213.7
put_hevc_epel_hv4_8_i8mm: 59.4
put_hevc_epel_hv6_8_c: 350.9
put_hevc_epel_hv6_8_i8mm: 130.2
put_hevc_epel_hv8_8_c: 548.7
put_hevc_epel_hv8_8_i8mm: 136.9
put_hevc_epel_hv12_8_c: 1126.7
put_hevc_epel_hv12_8_i8mm: 302.2
put_hevc_epel_hv16_8_c: 1925.2
put_hevc_epel_hv16_8_i8mm: 459.9
put_hevc_epel_hv24_8_c: 4301.9
put_hevc_epel_hv24_8_i8mm: 1024.9
put_hevc_epel_hv32_8_c: 7509.2
put_hevc_epel_hv32_8_i8mm: 1680.4
put_hevc_epel_hv48_8_c: 16566.9
put_hevc_epel_hv48_8_i8mm: 3945.4
put_hevc_epel_hv64_8_c: 29134.2
put_hevc_epel_hv64_8_i8mm: 6567.7

Co-Authored-By: J. Dekker <jdek@itanimul.li>
Signed-off-by: Martin Storsjö <martin@martin.st>
2023-10-31 14:02:53 +02:00
Logan Lyu
22c7291506 lavc/aarch64: new optimization for 8-bit hevc_epel_v
checkasm bench:
put_hevc_epel_v4_8_c: 79.9
put_hevc_epel_v4_8_neon: 25.7
put_hevc_epel_v6_8_c: 151.4
put_hevc_epel_v6_8_neon: 46.4
put_hevc_epel_v8_8_c: 250.9
put_hevc_epel_v8_8_neon: 41.7
put_hevc_epel_v12_8_c: 542.7
put_hevc_epel_v12_8_neon: 108.7
put_hevc_epel_v16_8_c: 939.4
put_hevc_epel_v16_8_neon: 169.2
put_hevc_epel_v24_8_c: 2104.9
put_hevc_epel_v24_8_neon: 307.9
put_hevc_epel_v32_8_c: 3713.9
put_hevc_epel_v32_8_neon: 524.2
put_hevc_epel_v48_8_c: 8175.2
put_hevc_epel_v48_8_neon: 1197.2
put_hevc_epel_v64_8_c: 16049.4
put_hevc_epel_v64_8_neon: 2094.9

Co-Authored-By: J. Dekker <jdek@itanimul.li>
Signed-off-by: Martin Storsjö <martin@martin.st>
2023-10-31 14:02:53 +02:00
Logan Lyu
772865717b lavc/aarch64: new optimization for 8-bit hevc_epel_pixels and hevc_qpel_pixels
checkasm bench:
put_hevc_pel_pixels4_8_c: 33.7
put_hevc_pel_pixels4_8_neon: 20.2
put_hevc_pel_pixels6_8_c: 61.4
put_hevc_pel_pixels6_8_neon: 25.4
put_hevc_pel_pixels8_8_c: 121.4
put_hevc_pel_pixels8_8_neon: 16.9
put_hevc_pel_pixels12_8_c: 199.9
put_hevc_pel_pixels12_8_neon: 40.2
put_hevc_pel_pixels16_8_c: 355.9
put_hevc_pel_pixels16_8_neon: 43.4
put_hevc_pel_pixels24_8_c: 774.7
put_hevc_pel_pixels24_8_neon: 78.9
put_hevc_pel_pixels32_8_c: 1345.2
put_hevc_pel_pixels32_8_neon: 152.2
put_hevc_pel_pixels48_8_c: 2963.7
put_hevc_pel_pixels48_8_neon: 309.4
put_hevc_pel_pixels64_8_c: 5236.2
put_hevc_pel_pixels64_8_neon: 514.2

Co-Authored-By: J. Dekker <jdek@itanimul.li>
Signed-off-by: Martin Storsjö <martin@martin.st>
2023-10-31 14:02:53 +02:00
Rémi Denis-Courmont
ae72412aa8 lavc/idctdsp: improve R-V V put_pixels_clamped 2023-10-30 18:14:16 +02:00
Rémi Denis-Courmont
d48810f3a5 lavc/idctdsp: improve R-V V add_pixels_clamped 2023-10-30 18:14:16 +02:00
Rémi Denis-Courmont
600c6f1b55 lavc/idctdsp: improve R-V V put_signed_pixels_clamped
This follows the same idea as with pixblockdsp, but applied at the
other end, whilst writing data at the end of the function.
2023-10-30 18:14:16 +02:00
Rémi Denis-Courmont
3ea2310e89 lavc/idctdsp: require Zve64x for R-V V functions
This will be required for the following changesets.
2023-10-30 18:14:16 +02:00
Rémi Denis-Courmont
300ee8b02d lavc/pixblockdsp: aligned R-V V 8-bit functions
If the scan lines are aligned, we can load each row as a 64-bit value,
thus avoiding segmentation. And then we can factor the conversion or
subtraction.

In principle, the same optimisation should be possible for high depth,
but would require 128-bit elements, for which no FFmpeg CPU flag
exists.
2023-10-30 18:14:16 +02:00
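The idea in the commit above can be sketched in scalar C, under stated assumptions: `get_pixels_rows64` is a hypothetical name, and this is not the R-V V assembly. Each 8-pixel row is fetched as one 64-bit value, and the 8-bit to 16-bit conversion is then done in a single pass over all 64 samples instead of once per row.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Widen an 8x8 block of 8-bit samples to 16 bits.
 * Step 1 gathers each row as one 64-bit value; memcpy keeps the sketch
 * valid for any pointer, whereas real code relying on aligned scan
 * lines could use a genuine 64-bit load.
 * Step 2 performs the conversion once over the 64 gathered bytes. */
void get_pixels_rows64(int16_t block[64], const uint8_t *pixels, ptrdiff_t stride)
{
    uint64_t rows[8];
    const uint8_t *bytes = (const uint8_t *)rows;

    assert(stride >= 8);

    for (int y = 0; y < 8; y++)
        memcpy(&rows[y], pixels + y * stride, sizeof(rows[y]));

    for (int i = 0; i < 64; i++)
        block[i] = bytes[i];
}
```

Because the rows are copied and read back bytewise, the sketch does not depend on endianness.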
Rémi Denis-Courmont
722765687b lavc/pixblockdsp: rename unaligned R-V V functions 2023-10-30 18:14:16 +02:00
Kieran Kunhya
2532e832d2 libavcodec/mpeg12: Reindent 2023-10-29 22:12:05 +00:00
Kieran Kunhya
7d497a1119 libavcodec/mpeg12: Remove "fast" mode 2023-10-29 22:12:02 +00:00
TADANO Tokumei
a824c6f2f6 lavc/libaribcaption: rename replace_fullwidth_ascii to replace_msz_ascii
This should hopefully clarify that the option only affects MSZ
full-width characters, and not all full-width ASCII. Additionally,
this matches the prefix with the upstream option.

Signed-off-by: TADANO Tokumei <aimingoff@pc.nifty.jp>
2023-10-29 18:21:05 +02:00
TADANO Tokumei
21bfadd9b4 lavc/libaribcaption: add MSZ character related options
This patch adds two MSZ (Middle Size; half width) character
related options, mapping against newly added upstream
functionality:

* `replace_msz_japanese`, which was introduced in version 1.0.1
  of libaribcaption.
* `replace_msz_glyph`, which was introduced in version 1.1.0
  of libaribcaption.

The latter option improves bitmap type rendering if specified
fonts contain half-width glyphs (e.g., BIZ UDGothic), even
if both ASCII and Japanese MSZ replacement options are set
to false.

As these options require newer versions of libaribcaption, the
configure requirement has been bumped accordingly.

Signed-off-by: TADANO Tokumei <aimingoff@pc.nifty.jp>
2023-10-29 18:20:43 +02:00
TADANO Tokumei
82faba8a6c lavc/libaribcaption: switch all bool context variables to int
On some environments, a `bool` variable is of smaller size than `int`.
As AV_OPT_TYPE_BOOL is internally handled as sizeof(int), if a `bool`
option was set on such an environment, the memory of following
variables would be filled. Additionally, set values may be destroyed
by av_opt_copy().

Signed-off-by: TADANO Tokumei <aimingoff@pc.nifty.jp>
2023-10-29 18:19:58 +02:00
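The size mismatch described in the commit above can be made concrete with a layout check. This is a sketch only: the structs and field names are hypothetical, not the actual libaribcaption wrapper context. It asks whether a store of sizeof(int) bytes at the first field would reach into the field that follows it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical option contexts; field names are illustrative only. */
struct OptsBool { bool first; bool second; };
struct OptsInt  { int  first; int  second; };

/* A setter that stores sizeof(int) bytes at `first` touches the bytes
 * [off_first, off_first + sizeof(int)). Return 1 if that range reaches
 * into the field starting at off_second. */
int store_reaches_next(size_t off_first, size_t off_second)
{
    assert(off_second > off_first);
    return off_first + sizeof(int) > off_second;
}

int bool_layout_clobbers(void)
{
    return store_reaches_next(offsetof(struct OptsBool, first),
                              offsetof(struct OptsBool, second));
}

int int_layout_clobbers(void)
{
    return store_reaches_next(offsetof(struct OptsInt, first),
                              offsetof(struct OptsInt, second));
}
```

On platforms where `bool` is smaller than `int`, `bool_layout_clobbers()` reports an overlap, while the `int` layout never does, which is why switching the fields to `int` avoids the problem.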
Michael Niedermayer
47e784f881 Bump versions after 6.1
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2023-10-29 16:19:14 +01:00
Michael Niedermayer
9d3a7d30c4 Bump versions prior to 6.1
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2023-10-29 15:34:05 +01:00
Michael Niedermayer
88453250db avcodec/jpeg2000dec: Check image offset
Fixes: left shift of negative value -538967841
Fixes: 62447/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_JPEG2000_fuzzer-6427134337613824

Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Reviewed-by: Tomas Härdin <git@haerdin.se>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2023-10-27 18:10:47 +02:00
Michael Niedermayer
9690d71f11 avcodec/vlc: don't pass nb_elems into multi vlc code
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2023-10-27 18:10:46 +02:00
Michael Niedermayer
9b546a0717 avcodec/vlc: merge lost 16bit end of array check
Also cleanup related code

Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2023-10-27 18:10:46 +02:00
Michael Niedermayer
a23d527ec5 avcodec/magicyuv: remove redundant check in inner loop
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2023-10-27 18:10:46 +02:00
Michael Niedermayer
4ddf4f5001 avcodec/magicyuv: correct end of array check in multi VLC parsing
Fixes: out of array write
Fixes: 63390/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_MAGICYUV_fuzzer-5144552979431424.fuzz

Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2023-10-27 18:10:45 +02:00
Michael Niedermayer
ffac64a270 avcodec/bitstream_template: Basic documentation for read_vlc_multi()
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2023-10-27 18:10:28 +02:00
Paul B Mahol
36eb774ad4 avcodec/mlpenc: try different filter parameters in case of out of range output from LPC 2023-10-27 12:45:23 +02:00
Paul B Mahol
567af48fba avcodec/mlpenc: add support for 4.0/4.1 ch layout 2023-10-27 12:45:23 +02:00
Paul B Mahol
210e844def avcodec/mlpdec: support for truehd with channels not representable with 5bit field in second stream
Fixes decoding for 4.0/4.1 layouts.
2023-10-27 12:45:23 +02:00
Paul B Mahol
deb4c28dcc avcodec/mlpenc: add 3.1 ch layout support for truehd 2023-10-27 12:45:23 +02:00
Andreas Rheinhardt
ba6a5e7a3d avcodec/hevcdec: Move collocated_ref to HEVCContext
Only the collocated_ref of the current frame (i.e. HEVCContext.ref)
is ever used*, so move it to HEVCContext directly after ref.

*: This goes so far that collocated_ref was not even synced across
threads in case of frame-threading.

Reviewed-by: Anton Khirnov <anton@khirnov.net>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2023-10-26 13:18:01 +02:00
Lynne
70864e6adb vulkan_decode: correct flipped condition in image layout
Changed by the previous commit.
Caused validation issues on hardware with !reuse_dpb_dst but not layered_dpb.
2023-10-25 22:01:21 +02:00
Lynne
0b3616231d vulkan_decode: fix another validation issue
Surprising no one, the insane usage rule has a catch.
2023-10-25 20:51:55 +02:00
Lynne
467e411839 vulkan_decode: fix pedantic validation issue
"Validation Error: [ VUID-VkImageViewCreateInfo-imageViewType-04974 ] Object 0: handle = 0x9f9b41000000003c, type = VK_OBJECT_TYPE_IMAGE; | MessageID = 0xc120e150 | vkCreateImageView():
Using pCreateInfo->viewType VK_IMAGE_VIEW_TYPE_2D and the subresourceRange.layerCount VK_REMAINING_ARRAY_LAYERS=(17) and must 1 (try looking into VK_IMAGE_VIEW_TYPE_*_ARRAY).
The Vulkan spec states: If viewType is VK_IMAGE_VIEW_TYPE_1D, VK_IMAGE_VIEW_TYPE_2D, or VK_IMAGE_VIEW_TYPE_3D; and subresourceRange.layerCount is VK_REMAINING_ARRAY_LAYERS,
then the remaining number of layers must be 1"
2023-10-25 20:51:54 +02:00
Lynne
9ee4f47c94 vulkan_decode: use coded_width/height instead of the non-coded width and height
Partially fixes https://streams.videolan.org/issues/19938/20000_20180305-15.04.59.ts
The video is coded as 1920x1080, meant to be rendered at 1440x1080 with cropping,
or 1680x1080 before cropping. Currently, the created DPB is 1440x1080, which results
in the image being decoded incorrectly, as the decoder overwrites output memory.
This commit fixes this.
2023-10-25 20:51:05 +02:00
Martin Storsjö
a4877f1ec1 aarch64: Only enable extensions in the intended files/regions
This eases actual development of the assembly functions, by only
allowing extension instructions within the sections that explicitly
enable them, instead of having all extensions enabled everywhere.

Signed-off-by: Martin Storsjö <martin@martin.st>
2023-10-24 14:46:20 +03:00
Martin Storsjö
1762975ba1 libavcodec/aarch64/hevc: Require consistent use of trailing semicolon
Signed-off-by: Martin Storsjö <martin@martin.st>
2023-10-23 10:39:12 +03:00
Andreas Rheinhardt
6e4030a07b avcodec/av1dec, vaapi_av1: Remove excessive logmessages
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2023-10-22 22:11:37 +02:00
Andreas Rheinhardt
315c956cbd avcodec/pthread_frame: Remove ff_thread_release_buffer()
It is unnecessary since the removal of non-thread-safe callbacks
in e0786a8eeb. Since then, the
AVCodecContext has only been used as logcontext.

Removing ff_thread_release_buffer() allowed removing AVCodecContext*
parameters from several other functions (not only unref functions,
but also e.g. ff_h264_ref_picture() which calls ff_h264_unref_picture()
on error).

Reviewed-by: Anton Khirnov <anton@khirnov.net>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2023-10-22 22:09:59 +02:00
Leo Izen
86ed68420d avcodec/librsvgdec: fix memory leaks and deprecated functions
At various points in the function librsvg_decode_frame, errors are
returned immediately without deallocating any allocated structs.
This patch both fixes those leaks, and also fixes the use of functions
that are deprecated since librsvg version 2.52.0. The older calls are
still used, guarded by #ifdefs while the newer replacements are used if
librsvg >= 2.52.0. One of the deprecated functions is used as a check
for the configure shell script, so it was replaced with a different
function.

Signed-off-by: Leo Izen <leo.izen@gmail.com>
2023-10-22 15:18:13 -04:00
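The leak pattern described in the commit above, and one common way to avoid it, can be sketched with a single cleanup label. This is a generic C illustration under stated assumptions: `decode_sketch`, `tracked_alloc`, `tracked_free` and `live_allocs` are hypothetical names, and none of it is the librsvg API or the actual patch.

```c
#include <assert.h>
#include <stdlib.h>

/* Counter used only to make leaks observable in this sketch. */
int live_allocs = 0;

void *tracked_alloc(size_t n)
{
    void *p = malloc(n);
    if (p)
        live_allocs++;
    return p;
}

void tracked_free(void *p)
{
    if (p) {
        assert(live_allocs > 0);
        live_allocs--;
        free(p);
    }
}

/* Every error path jumps to `end`, so whatever was allocated so far is
 * released before returning. fail_at_step simulates an error after the
 * first (1) or second (2) allocation; 0 means success. */
int decode_sketch(int fail_at_step)
{
    int ret = 0;
    void *handle = NULL, *surface = NULL;

    handle = tracked_alloc(16);
    if (!handle || fail_at_step == 1) {
        ret = -1;
        goto end;
    }

    surface = tracked_alloc(64);
    if (!surface || fail_at_step == 2) {
        ret = -1;
        goto end;
    }

end:
    tracked_free(surface);
    tracked_free(handle);
    return ret;
}
```

Returning directly from either failure branch instead of jumping to `end` would leave `live_allocs` above zero, which is the kind of leak the commit message describes.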
Martin Storsjö
a76b409dd0 aarch64: Reindent all assembly to 8/24 column indentation
libavcodec/aarch64/vc1dsp_neon.S is skipped here, as it intentionally
uses a layered indentation style to visually show how different
unrolled/interleaved phases fit together.

Signed-off-by: Martin Storsjö <martin@martin.st>
2023-10-21 23:25:54 +03:00
Martin Storsjö
7f905f3672 aarch64: Make the indentation more consistent
Some functions have slightly different indentation styles; try
to match the surrounding code.

libavcodec/aarch64/vc1dsp_neon.S is skipped here, as it intentionally
uses a layered indentation style to visually show how different
unrolled/interleaved phases fit together.

Signed-off-by: Martin Storsjö <martin@martin.st>
2023-10-21 23:25:29 +03:00