mirror of https://github.com/FFmpeg/FFmpeg.git synced 2024-11-26 19:01:44 +02:00
Commit Graph

49254 Commits

Author SHA1 Message Date
Connor Worley
dfbbd11a4b lavc/dxvenc: add DXV encoder with support for DXT1 texture format
Signed-off-by: Vittorio Giovara <vittorio.giovara@gmail.com>
2024-01-23 21:31:22 +01:00
James Almer
1496ce8f6b avcodec/vvc_ctu: align motion vector fields
Should fix "member access within misaligned address 0xf00 for type 'const union
av_alias64', which requires 8 byte alignment" errors as reported by GCC ubsan.

Signed-off-by: James Almer <jamrial@gmail.com>
2024-01-23 17:24:15 -03:00
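The alignment fix in 1496ce8f6b can be illustrated with a small sketch. The struct and field names below are hypothetical and are not copied from vvc_ctu; the only point is that a member made of two 32-bit values has a natural alignment of 4, so it has to be aligned explicitly before it can safely be accessed as one 8-byte unit.

```c
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical motion vector: two 32-bit components, natural alignment 4. */
typedef struct ExampleMv {
    int32_t x, y;
} ExampleMv;

/* Hypothetical container. alignas(8) forces the mv array (and therefore the
 * whole struct) onto an 8-byte boundary, so an 8-byte access to mv[i] is
 * always aligned. */
typedef struct ExampleMvField {
    alignas(8) ExampleMv mv[2];
    int8_t ref_idx[2];
} ExampleMvField;

/* Compile-time checks of the layout assumed above. */
_Static_assert(alignof(ExampleMvField) == 8, "struct must be 8-byte aligned");
_Static_assert(offsetof(ExampleMvField, mv) % 8 == 0, "mv must start on an 8-byte boundary");
```

Because the struct size is padded to a multiple of its alignment, every element of an array of ExampleMvField also starts on an 8-byte boundary.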
Frank Plowman
8157b5d405 lavc/vvc: Remove left shifts of negative values
VVC specifies << as arithmetic left shift, i.e. x << y is equivalent to
x * pow2(y).  C's << on the other hand has UB if x is negative.  This
patch removes all UB resulting from this, mostly by replacing x << y
with x * (1 << y), but there are also a couple of places where the order of
operations was changed instead.

Signed-off-by: Frank Plowman <post@frankplowman.com>
Signed-off-by: James Almer <jamrial@gmail.com>
2024-01-23 11:17:05 -03:00
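A minimal sketch of the replacement described in 8157b5d405. The helper name shl_arith is hypothetical and does not appear in the FFmpeg tree; it only shows why x * (1 << y) is preferred over x << y when x may be negative.

```c
/* Hypothetical helper: arithmetic left shift as the VVC specification
 * defines it, i.e. x * 2^y.
 *
 * In C, x << y is undefined behaviour when x is negative. Multiplying by
 * (1 << y) is well defined as long as y is smaller than the bit width of
 * int minus one and the product fits in an int. */
static inline int shl_arith(int x, unsigned y)
{
    return x * (1 << y);
}
```

For example, shl_arith(-3, 4) evaluates -3 * 16 and yields -48, whereas -3 << 4 is formally undefined.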
James Almer
ab39cc36c7 avcodec/speexdec: fix setting frame_size from extradata
Finishes fixing vp5/potter512-400-partial.avi

The fate-matroska-ms-mode test ref is updated to reflect that the Speex decoder
can now read the stream.

Signed-off-by: James Almer <jamrial@gmail.com>
2024-01-22 10:58:12 -03:00
James Almer
cad35f0a77 avcodec/speexdec: relax the extradata check for the speex string
There could be bogus bytes at the start, as is the case of
vp5/potter512-400-partial.avi from the FATE suite, which could be a case of bad
remuxing from an OGG source.

Partially fixes decoding of vp5/potter512-400-partial.avi

Signed-off-by: James Almer <jamrial@gmail.com>
2024-01-22 10:58:12 -03:00
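The relaxed check in cad35f0a77 can be pictured as a search for the Speex signature instead of a comparison at offset 0. This is a sketch under assumptions, not the actual patch: the function name find_speex_header is hypothetical, and the log does not show how the decoder itself locates the string. The 8-byte signature "Speex   " (the word followed by three spaces) is the marker that opens a Speex header.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: return a pointer to the first occurrence of the
 * 8-byte Speex signature inside buf, or NULL if it is not present.
 * Leading bogus bytes are skipped instead of causing a rejection. */
static const uint8_t *find_speex_header(const uint8_t *buf, size_t size)
{
    static const char sig[] = "Speex   ";     /* "Speex" plus three spaces */
    const size_t sig_len = sizeof(sig) - 1;   /* 8, without the trailing NUL */

    for (size_t i = 0; i + sig_len <= size; i++)
        if (!memcmp(buf + i, sig, sig_len))
            return buf + i;
    return NULL;
}
```

A strict check would only accept a match at buf + 0; the sketch accepts the signature at any offset, which is what allows extradata with junk bytes at the start to be read.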
Anton Khirnov
08bebeb1be Revert "all: Don't set AVClass.item_name to its default value"
Some callers assume that item_name is always set, so this may be
considered an API break.

This reverts commit 0c6203c97a.
2024-01-20 10:34:48 +01:00
James Almer
0a5813fc68 avcodec/vvcdec: allocate and store structs on their own within the table list
Fixes "runtime error: member access within misaligned address 0xf00 for type
'struct bar', which requires # byte alignment" errors under GCC ubsan.

Reviewed-by: Nuo Mi <nuomi2021@gmail.com>
Signed-off-by: James Almer <jamrial@gmail.com>
2024-01-19 08:53:32 -03:00
sunyuechi
8e23ebe6f9 lavc/svq1enc: R-V V ssd_int8_vs_int16
C908
ssd_int8_vs_int16_c: 207.7
ssd_int8_vs_int16_rvv_i32: 14.2

Signed-off-by: Rémi Denis-Courmont <remi@remlab.net>
2024-01-17 17:49:54 +02:00
Nuo Mi
d595e0a0b6 avcodec/vvcdec: misc, constify hor_ctu_edge
Signed-off-by: James Almer <jamrial@gmail.com>
2024-01-17 10:14:50 -03:00
Nuo Mi
375dcf469e avcodec/vvcdec: deblock, fix uninitialized values
see https://fate.ffmpeg.org/report.cgi?slot=x86_64-archlinux-gcc-valgrind&time=20240105201935
If tc is zero, the max_len_q, max_len_p are uninitialized.

Reported-by: James Almer <jamrial@gmail.com>
Signed-off-by: James Almer <jamrial@gmail.com>
2024-01-17 10:14:35 -03:00
aybe aybe
36b402f80d avcodec/mdec: DC reading for STRv1 is like STRv2
As I understand it, support for .STR files has been broken for almost 10 years now (since 161442ff2c, it seems).

Currently, ffmpeg fails with tons of errors like this on version 1 STRs, e.g. Wipeout 1:
[mdec @ 00000000027c72c0] ac-tex damaged at 1 9

What happens is that only the audio is present in the video file.

Anyway, this one-character patch fixes the problem; video is now rendered.

Signed-off-by: aybe <aybe@users.noreply.github.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2024-01-16 01:34:56 +01:00
sunyuechi
0befc1fca7 lvac/svqenc: add ff_svq1enc_init
This is for clarity and use in testing, consistent with other parts of the code

Signed-off-by: Rémi Denis-Courmont <remi@remlab.net>
2024-01-15 19:03:03 +02:00
Rémi Denis-Courmont
278b4b60d6 lavc/takdsp: R-V V decorrelate_sf
decorrelate_sf_c:      259.2
decorrelate_sf_rvv_i32: 45.5
2024-01-15 19:00:25 +02:00
yuanhecai
a87a52ed0b avcodec/hevc: Add ff_hevc_idct_32x32_lasx asm opt
tests/checkasm/checkasm:

                          C          LSX       LASX
hevc_idct_32x32_8_c:      1243.0     211.7     101.7

Speedup of decoding H265 4K 30FPS 30Mbps on
3A6000 with 8 threads is 1fps(56fps-->57fps).

Reviewed-by: yinshiyou-hf@loongson.cn
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2024-01-12 23:35:40 +01:00
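The checkasm rows in this and the following entries are timings, so lower is better. A small sketch of how a speedup factor can be read off a row; the function name speedup is hypothetical and the numbers are the ones reported for hevc_idct_32x32_8 above.

```c
/* Hypothetical helper: how many times faster the optimized version is,
 * given the C reference timing and the optimized timing from checkasm. */
static double speedup(double c_time, double opt_time)
{
    return c_time / opt_time;
}
```

With the figures from a87a52ed0b, speedup(1243.0, 211.7) is a little under 6 for LSX and speedup(1243.0, 101.7) is a little over 12 for LASX.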
jinbo
9239081db3 avcodec/hevc: Add asm opt for the following functions
tests/checkasm/checkasm:           C       LSX     LASX
put_hevc_qpel_uni_h4_8_c:          5.7     1.2
put_hevc_qpel_uni_h6_8_c:          12.2    2.7
put_hevc_qpel_uni_h8_8_c:          21.5    3.2
put_hevc_qpel_uni_h12_8_c:         47.2    9.2     7.2
put_hevc_qpel_uni_h16_8_c:         87.0    11.7    9.0
put_hevc_qpel_uni_h24_8_c:         188.2   27.5    21.0
put_hevc_qpel_uni_h32_8_c:         335.2   46.7    28.5
put_hevc_qpel_uni_h48_8_c:         772.5   104.5   65.2
put_hevc_qpel_uni_h64_8_c:         1383.2  142.2   109.0

put_hevc_epel_uni_w_v4_8_c:        5.0     1.5
put_hevc_epel_uni_w_v6_8_c:        10.7    3.5     2.5
put_hevc_epel_uni_w_v8_8_c:        18.2    3.7     3.0
put_hevc_epel_uni_w_v12_8_c:       40.2    10.7    7.5
put_hevc_epel_uni_w_v16_8_c:       70.2    13.0    9.2
put_hevc_epel_uni_w_v24_8_c:       158.2   30.2    22.5
put_hevc_epel_uni_w_v32_8_c:       281.0   52.0    36.5
put_hevc_epel_uni_w_v48_8_c:       631.7   116.7   82.7
put_hevc_epel_uni_w_v64_8_c:       1108.2  207.5   142.2

put_hevc_epel_uni_w_h4_8_c:        4.7     1.2
put_hevc_epel_uni_w_h6_8_c:        9.7     3.5     2.7
put_hevc_epel_uni_w_h8_8_c:        17.2    4.2     3.5
put_hevc_epel_uni_w_h12_8_c:       38.0    11.5    7.2
put_hevc_epel_uni_w_h16_8_c:       69.2    14.5    9.2
put_hevc_epel_uni_w_h24_8_c:       152.0   34.7    22.5
put_hevc_epel_uni_w_h32_8_c:       271.0   58.0    40.0
put_hevc_epel_uni_w_h48_8_c:       597.5   136.7   95.0
put_hevc_epel_uni_w_h64_8_c:       1074.0  252.2   168.0

put_hevc_epel_bi_h4_8_c:           4.5     0.7
put_hevc_epel_bi_h6_8_c:           9.0     1.5
put_hevc_epel_bi_h8_8_c:           15.2    1.7
put_hevc_epel_bi_h12_8_c:          33.5    4.2     3.7
put_hevc_epel_bi_h16_8_c:          59.7    5.2     4.7
put_hevc_epel_bi_h24_8_c:          132.2   11.0
put_hevc_epel_bi_h32_8_c:          232.7   20.2    13.2
put_hevc_epel_bi_h48_8_c:          521.7   45.2    31.2
put_hevc_epel_bi_h64_8_c:          949.0   71.5    51.0

After this patch, the performance of decoding H265 4K 30FPS
30Mbps on 3A6000 with 8 threads improves 1fps(55fps-->56fps).

Change-Id: I8cc1e41daa63ca478039bc55d1ee8934a7423f51
Reviewed-by: yinshiyou-hf@loongson.cn
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2024-01-12 23:35:40 +01:00
jinbo
1f642b99af avcodec/hevc: Add epel_uni_w_hv4/6/8/12/16/24/32/48/64 asm opt
tests/checkasm/checkasm:           C       LSX     LASX
put_hevc_epel_uni_w_hv4_8_c:       9.5     2.2
put_hevc_epel_uni_w_hv6_8_c:       18.5    5.0     3.7
put_hevc_epel_uni_w_hv8_8_c:       30.7    6.0     4.5
put_hevc_epel_uni_w_hv12_8_c:      63.7    14.0    10.7
put_hevc_epel_uni_w_hv16_8_c:      107.5   22.7    17.0
put_hevc_epel_uni_w_hv24_8_c:      236.7   50.2    31.7
put_hevc_epel_uni_w_hv32_8_c:      414.5   88.0    53.0
put_hevc_epel_uni_w_hv48_8_c:      917.5   197.7   118.5
put_hevc_epel_uni_w_hv64_8_c:      1617.0  349.5   203.0

After this patch, the performance of decoding H265 4K 30FPS 30Mbps
on 3A6000 with 8 threads improves 3fps (52fps-->55fps).

Change-Id: If067e394cec4685c62193e7adb829ac93ba4804d
Reviewed-by: yinshiyou-hf@loongson.cn
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2024-01-12 23:35:40 +01:00
jinbo
6c6bf18ce8 avcodec/hevc: Add qpel_uni_w_v|h4/6/8/12/16/24/32/48/64 asm opt
tests/checkasm/checkasm:           C       LSX     LASX
put_hevc_qpel_uni_w_h4_8_c:        6.5     1.7     1.2
put_hevc_qpel_uni_w_h6_8_c:        14.5    4.5     3.7
put_hevc_qpel_uni_w_h8_8_c:        24.5    5.7     4.5
put_hevc_qpel_uni_w_h12_8_c:       54.7    17.5    12.0
put_hevc_qpel_uni_w_h16_8_c:       96.5    22.7    13.2
put_hevc_qpel_uni_w_h24_8_c:       216.0   51.2    33.2
put_hevc_qpel_uni_w_h32_8_c:       385.7   87.0    53.2
put_hevc_qpel_uni_w_h48_8_c:       860.5   192.0   113.2
put_hevc_qpel_uni_w_h64_8_c:       1531.0  334.2   200.0

put_hevc_qpel_uni_w_v4_8_c:        8.0     1.7
put_hevc_qpel_uni_w_v6_8_c:        17.2    4.5
put_hevc_qpel_uni_w_v8_8_c:        29.5    6.0     5.2
put_hevc_qpel_uni_w_v12_8_c:       65.2    16.0    11.7
put_hevc_qpel_uni_w_v16_8_c:       116.5   20.5    14.0
put_hevc_qpel_uni_w_v24_8_c:       259.2   48.5    37.2
put_hevc_qpel_uni_w_v32_8_c:       459.5   80.5    56.0
put_hevc_qpel_uni_w_v48_8_c:       1028.5  180.2   126.5
put_hevc_qpel_uni_w_v64_8_c:       1831.2  319.2   224.2

Speedup of decoding H265 4K 30FPS 30Mbps on
3A6000 with 8 threads is 4fps(48fps-->52fps).

Change-Id: I1178848541d90083869225ba98a02e6aa8bb8c5a
Reviewed-by: yinshiyou-hf@loongson.cn
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2024-01-12 23:35:40 +01:00
jinbo
a28eea2a27 avcodec/hevc: Add pel_uni_w_pixels4/6/8/12/16/24/32/48/64 asm opt
tests/checkasm/checkasm:           C       LSX     LASX
put_hevc_pel_uni_w_pixels4_8_c:    2.7     1.0
put_hevc_pel_uni_w_pixels6_8_c:    6.2     2.0     1.5
put_hevc_pel_uni_w_pixels8_8_c:    10.7    2.5     1.7
put_hevc_pel_uni_w_pixels12_8_c:   23.0    5.5     5.0
put_hevc_pel_uni_w_pixels16_8_c:   41.0    8.2     5.0
put_hevc_pel_uni_w_pixels24_8_c:   91.0    19.7    13.2
put_hevc_pel_uni_w_pixels32_8_c:   161.7   32.5    16.2
put_hevc_pel_uni_w_pixels48_8_c:   354.5   73.7    43.0
put_hevc_pel_uni_w_pixels64_8_c:   641.5   130.0   64.2

Speedup of decoding H265 4K 30FPS 30Mbps on 3A6000 with
8 threads is 1fps(47fps-->48fps).

Reviewed-by: yinshiyou-hf@loongson.cn
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2024-01-12 23:35:40 +01:00
jinbo
cfbdda607d avcodec/hevc: Add add_residual_4/8/16/32 asm opt
After this patch, the performance of decoding H265 4K 30FPS 30Mbps
on 3A6000 with 8 threads improves 2fps (45fps-->47fps).

Reviewed-by: yinshiyou-hf@loongson.cn
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2024-01-12 23:35:40 +01:00
Zhao Zhili
13c1fea92f avcodec/videotoolboxenc: fix setting avctx color_range doesn't work
Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
2024-01-12 10:49:18 +08:00
Nuo Mi
8d0dda8260 vvcdec: reuse h26x/h2656_deblock_template.c
2024-01-11 22:53:05 +08:00
Nuo Mi
ae0a83477b hevcdec: move deblock template to h26x/h2656_deblock_template.c
2024-01-11 22:53:05 +08:00
Nuo Mi
69e179e8bf vvcdec: reuse h26x/h2656_sao_template.c
2024-01-11 22:53:05 +08:00
Nuo Mi
d2fe23b835 hevcdec: move sao template to h26x/h2656_sao_template.c
2024-01-11 22:53:05 +08:00
Clément Bœsch
af509f9957 avcodec/proresenc_anatoliy: do not write into chroma reserved bitfields
The layout for the frame flags is as follows:

   chroma_format  u(2)
   reserved       u(2)
   interlace_mode u(2)
   reserved       u(2)

chroma_format has 2 allowed values:
   0: reserved
   1: reserved
   2: 4:2:2
   3: 4:4:4

interlace_mode has 3 allowed values:
   0: progressive
   1: tff
   2: bff
   3: reserved

0x80 is what we expect for "422 not interlaced", and the extra 0x2 from
0x82 is actually writing into the reserved bits.
2024-01-10 23:33:02 +01:00
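The frame flags layout described in af509f9957 can be sketched as a bit-packing helper. The function and constant names are hypothetical (they are not the encoder's own identifiers), and the sketch assumes the four fields are listed from the most significant bits down, which is consistent with 0x80 meaning "422 not interlaced".

```c
#include <stdint.h>

/* Hypothetical names for the values listed in the commit message. */
enum { EX_CHROMA_422 = 2, EX_CHROMA_444 = 3 };
enum { EX_PROGRESSIVE = 0, EX_TFF = 1, EX_BFF = 2 };

/* Bits 5-4 and 1-0 are the two reserved fields and must stay zero. */
#define EX_FRAME_FLAGS_RESERVED_MASK 0x33

/* Pack chroma_format into bits 7-6 and interlace_mode into bits 3-2. */
static uint8_t ex_prores_frame_flags(unsigned chroma_format, unsigned interlace_mode)
{
    return (uint8_t)(((chroma_format & 3) << 6) | ((interlace_mode & 3) << 2));
}
```

ex_prores_frame_flags(EX_CHROMA_422, EX_PROGRESSIVE) gives 0x80, while the old value 0x82 has bit 1 set, and 0x82 & EX_FRAME_FLAGS_RESERVED_MASK is 0x02, i.e. a write into the reserved bits.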
Clément Bœsch
21f7a814ea avcodec/proresenc_anatoliy: do not write into alpha reserved bitfields
This byte represents 4 reserved bits followed by 4 alpha_channel_type bits.

alpha_channel_type currently has 3 different defined values: 0 (no
alpha), 1 (8b alpha), and 2 (16b alpha), all the other values are
reserved. The 4 initial reserved bits are expected to be 0.
2024-01-10 23:33:02 +01:00
Clément Bœsch
6d35911667 avcodec/proresenc_kostya: do not write into alpha reserved bitfields
This byte represents 4 reserved bits followed by 4 alpha_channel_type bits.

alpha_channel_type currently has 3 different defined values: 0 (no
alpha), 1 (8b alpha), and 2 (16b alpha), all the other values are
reserved. This part is correctly written (alpha_bits>>3 does the correct
thing), but the 4 initial bits are reserved.
2024-01-10 23:33:02 +01:00
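A sketch of the alpha byte described in the two commits above, assuming alpha_bits is 0, 8 or 16 as the message implies. The function name is hypothetical; only the alpha_bits>>3 mapping comes from the commit text.

```c
#include <stdint.h>

/* Hypothetical helper: build the byte made of 4 reserved bits (high nibble,
 * kept at zero) followed by the 4-bit alpha_channel_type (low nibble).
 * alpha_bits >> 3 maps 0 -> 0 (no alpha), 8 -> 1 (8b alpha), 16 -> 2 (16b alpha). */
static uint8_t ex_prores_alpha_byte(int alpha_bits)
{
    return (uint8_t)((alpha_bits >> 3) & 0x0F);
}
```

The mask keeps the high nibble clear, so nothing is ever written into the reserved bits regardless of the input.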
Clément Bœsch
aa7ccd0ce9 avcodec/proresenc_kostya: use a compatible bitstream version
Quoting SMPTE RDD 36:2015:
  A decoder shall abort if it encounters a bitstream with an unsupported
  bitstream_version value. If 0, the value of the chroma_format syntax
  element shall be 2 (4:2:2 sampling) and the value of the
  alpha_channel_type element shall be 0 (no encoded alpha); if 1, any
  permissible value may be used for those syntax elements.

So if we're not in 4:2:2 or if there is alpha, we are not allowed to use
version 0.
2024-01-10 23:33:02 +01:00
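The rule quoted from SMPTE RDD 36:2015 reduces to a one-line decision. The sketch below uses a hypothetical function name and takes the raw syntax element values as arguments; it is not the code from either encoder.

```c
/* Hypothetical helper: pick the lowest bitstream_version the quoted rule allows.
 * Version 0 is only valid for chroma_format == 2 (4:2:2) combined with
 * alpha_channel_type == 0 (no encoded alpha); anything else needs version 1. */
static int ex_prores_bitstream_version(int chroma_format, int alpha_channel_type)
{
    return (chroma_format != 2 || alpha_channel_type != 0) ? 1 : 0;
}
```

So 4:2:2 without alpha stays at version 0, while 4:4:4, or 4:2:2 with an alpha channel, is written as version 1.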
Clément Bœsch
85cb1b9b20 avcodec/proresenc_anatoliy: use a compatible bitstream version
Quoting SMPTE RDD 36:2015:
  A decoder shall abort if it encounters a bitstream with an unsupported
  bitstream_version value. If 0, the value of the chroma_format syntax
  element shall be 2 (4:2:2 sampling) and the value of the
  alpha_channel_type element shall be 0 (no encoded alpha); if 1, any
  permissible value may be used for those syntax elements.

So if we're not in 4:2:2 or if there is alpha, we are not allowed to use
version 0.
2024-01-10 23:33:02 +01:00
Clément Bœsch
1081bae94d avcodec/proresenc_kostya: make a few cosmetics in encode_acs()
Unify cosmetics with encode_acs() from proresenc_anatoliy.
2024-01-10 14:08:00 +01:00
Clément Bœsch
cc2206d142 avcodec/proresenc_anatoliy: make a few cosmetics in encode_acs()
This makes the function pretty much identical to the function of the
same name in proresenc_kostya.
2024-01-10 14:08:00 +01:00
Clément Bœsch
8fb2e96d7e avcodec/proresenc_anatoliy: execute AC run/level FFMIN() at assignment
This matches the logic from the function of the same name in proresenc_kostya.
2024-01-10 14:08:00 +01:00
Clément Bœsch
096a69ad43 avcodec/proresenc_anatoliy: rework inner loop in encode_acs()
This matches the logic from the function of the same name in proresenc_kostya.
2024-01-10 14:08:00 +01:00
Clément Bœsch
25f28b9308 avcodec/proresenc_anatoliy: avoid using ff_ prefix in function arguments
2024-01-10 14:08:00 +01:00
Clément Bœsch
29fd3f75fe avcodec/proresenc_anatoliy: rework encode_ac_coeffs() prototype
This makes the prototype closer to the function of the same name in
proresenc_kostya.
2024-01-10 14:08:00 +01:00
Clément Bœsch
3543100a05 avcodec/proresenc_anatoliy: replace get_level() with FFABS()
This matches the code from proresenc_kostya.
2024-01-10 14:08:00 +01:00
Clément Bœsch
ed8692446c avcodec/proresenc_anatoliy: cosmetics to make encode_dcs() identical to the one in Kostya encoder
2024-01-10 14:08:00 +01:00
Clément Bœsch
e87bc5641c avcodec/proresenc_anatoliy: remove TO_GOLOMB2()
A few cosmetics aside, this makes the function identical to the one with
the same name in proresenc_kostya.
2024-01-10 14:08:00 +01:00
Clément Bœsch
a026f98f29 avcodec/proresenc_anatoliy: only pass down the first scale to encode_dcs()
This matches encode_dcs() prototype from proresenc_kostya.
2024-01-10 14:08:00 +01:00
Clément Bœsch
1aa7d504ec avcodec/proresenc_anatoliy: shuffle declarations around in encode_dcs()
This makes the function closer to the same function in proresenc_kostya.
2024-01-10 14:08:00 +01:00
Clément Bœsch
87ba89281c avcodec/proresenc_anatoliy: rename TO_GOLOMB() to MAKE_CODE()
This matches the name in proresenc_kostya.
2024-01-10 14:08:00 +01:00
Clément Bœsch
7af42088d7 avcodec/proresenc_kostya: add Anatoliy copyright
Both encoders share a lot of code from both authors.
2024-01-10 14:08:00 +01:00
Clément Bœsch
d269f84199 avcodec/proresenc_anatoliy: remove IS_NEGATIVE() macro
This makes the function closer to encode_acs() in proresenc_kostya.
2024-01-10 14:08:00 +01:00
Clément Bœsch
9c7f6d89fd avcodec/proresenc_anatoliy: rename new_dc to dc
This makes the function closer to encode_dcs() in proresenc_kostya.
2024-01-10 14:08:00 +01:00
Clément Bœsch
9258f4eaf9 avcodec/proresenc_anatoliy: compute sign only once
This makes the function closer to encode_dcs() in proresenc_kostya.
2024-01-10 14:08:00 +01:00
Clément Bœsch
17392ca84f avcodec/proresenc_anatoliy: import GET_SIGN() macro from Kostya encoder and use it
2024-01-10 14:08:00 +01:00
Clément Bœsch
273f591a3d avcodec/proresenc_anatoliy: directly work with blocks in encode_dcs()
This makes the function closer to encode_dcs() in proresenc_kostya.
2024-01-10 14:08:00 +01:00
Clément Bœsch
dadc5ac24a avcodec/proresenc_anatoliy: reduce DC encoding function prototype differences with Kostya encoder
2024-01-10 14:08:00 +01:00
Clément Bœsch
8e42d3aba0 avcodec/proresenc_anatoliy: execute codebook FFMIN() at assignment
This makes the function closer to encode_dcs() in proresenc_kostya.
2024-01-10 14:08:00 +01:00
Clément Bœsch
43baba4647 avcodec/proresenc_anatoliy: rename new_code/code to code/codebook
This makes the function closer to encode_dcs() in proresenc_kostya.
2024-01-10 14:08:00 +01:00