0e59675698
avutil/hwcontext_vulkan: use the typedef'd name for the expect_assume struct
...
Signed-off-by: James Almer <jamrial@gmail.com>
2025-04-15 16:52:51 -03:00
f29475a89e
avutil/hwcontext_vulkan: check if expect_assume is supported by the header
...
Reviewed-by: Lynne <dev@lynne.ee>
Signed-off-by: James Almer <jamrial@gmail.com>
2025-04-15 16:44:45 -03:00
0040d7e608
lavfi/src_movie: set pkt_timebase
...
Fix “Could not update timestamps for skipped samples” warning
and associated misfeature.
2025-04-15 15:49:38 +02:00
2229b74127
avformat/movenc: fix setting elst/stss for IAMF with Opus
2025-04-14 17:25:39 -03:00
d36883f119
avformat/iamf_writer: fix setting skip_samples and discard_padding for OPUS
2025-04-14 17:25:34 -03:00
3b2a9410ef
avcodec/decode: Only use ff_progress_frame_get_buffer() with blank input
...
All users (namely HEVC) that use ff_progress_frame_alloc()
should just use ff_thread_get_buffer(). Using
ff_progress_frame_get_buffer() is not a must; it is merely
a convenience wrapper.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2025-04-14 10:15:08 +02:00
29b85cd4b8
vulkan_ffv1: add cached symbol reader for AMD
...
Speeds up everything on AMD by 3x.
This uses 32 local invocations to load state into cache, as well
as to do the RCT faster.
2025-04-14 06:10:43 +02:00
e040c087c7
vulkan: add support for expect/assume
...
This commit adds support for compiler hints.
AMD neither uses nor needs these, but Nvidia benefits from them, with
a sizeable 10% speedup on 4k.
2025-04-14 06:10:43 +02:00
985a26be28
vulkan_ffv1: shortcut +-1 coeffs in symbol reading
...
Slightly faster, and allows for further optimizations.
2025-04-14 06:10:43 +02:00
4d561e6a1e
vulkan_ffv1: remove need for scratch data during setup
...
This saves on some VRAM, but mainly allows for a more unified path.
2025-04-14 06:10:43 +02:00
8ceabb677c
vulkan_ffv1: externalize extended lookup check
...
8% speedup on Nvidia on 4k.
2025-04-14 06:10:43 +02:00
77f777d925
ffv1/vulkan: redo context count tracking and quant_table_idx management
...
This commit also makes it possible for the encoder to choose a different
quantization table on a per-slice basis, as well as adding this capability
to the decoder.
Also, this commit fully fixes decoding of context=1 encoded files.
2025-04-14 06:10:42 +02:00
66b8c92df2
vulkan_ffv1: cache only 2 lines when decoding RGB
...
This reduces the intermediate VRAM used for RGB decoding by a
factor of 100 for 6k video.
This also speeds the decoder up by 16% for 4k RGB24 and 31% for 6k video.
This is equivalent to what the software decoder does, but with fewer pointers.
2025-04-14 06:10:42 +02:00
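The two-line idea in the commit above can be illustrated in plain C. This is only a sketch under stated assumptions: the names (`toy_decode_rows`, `TOY_W`) are hypothetical, and the predictor is a simple left + top - topleft gradient, not the predictor FFV1 actually uses. The point is that a spatial predictor only ever looks at the current and the previous row, so two row buffers selected by `y & 1` are enough.

```c
#include <assert.h>
#include <string.h>

enum { TOY_W = 4 }; /* hypothetical row width */

/* Reconstruct h rows from residuals while keeping only two rows in the
 * working buffer. Column 0 of each row buffer is a zero left border. */
static void toy_decode_rows(const int *resid, int h, int *out)
{
    int lines[2][TOY_W + 1];

    assert(h > 0);
    memset(lines, 0, sizeof(lines));

    for (int y = 0; y < h; y++) {
        int *cur        = lines[y & 1];       /* row being decoded */
        const int *prev = lines[(y & 1) ^ 1]; /* row decoded just before */

        cur[0] = 0;
        for (int x = 1; x <= TOY_W; x++) {
            /* gradient predictor: left + top - topleft */
            int pred = cur[x - 1] + prev[x] - prev[x - 1];
            cur[x] = pred + resid[y * TOY_W + (x - 1)];
            out[y * TOY_W + (x - 1)] = cur[x];
        }
    }
}
```

The working set stays at two rows regardless of the frame height, which is where a large saving over caching a whole plane would come from.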
72953477a4
vulkan_ffv1: fix left-2 sample addressing
...
Typo.
Not enough to fix context=1, but it's a start.
2025-04-14 06:10:42 +02:00
694ebe890c
vulkan_ffv1: improve buffer barrier correctness for slice state
...
This is likely a nanooptimization, but it's more correct.
2025-04-14 06:10:42 +02:00
6111aef533
vulkan_ffv1: fix reset shader dependencies
...
Without a barrier upfront, the reset shader may read data fields not
yet set by the setup shader.
2025-04-14 06:10:42 +02:00
b72ada0a96
vulkan_ffv1: fallback to upload if mapping packet fails, fix fallback
...
The commit which added support for host mapping accidentally broke the
original, upload route.
For drivers without host-mapping (very few), fix it.
2025-04-14 06:10:42 +02:00
45d7abf6d9
vulkan_ffv1: init overread/corrupt fields
...
Forgotten.
2025-04-14 06:10:42 +02:00
1f09b55c94
vulkan_ffv1: allocate just as much memory for slice state as needed
...
Rather than always using the maximum allowed slices, just use the number
of slices present in this frame.
2025-04-14 06:10:41 +02:00
fc960dafef
vulkan_ffv1: optimize symbol reader
...
This was the fastest variant tested.
2025-04-14 06:10:41 +02:00
defebd74c0
vulkan_ffv1: slightly optimize the range decoder
...
GPUs have cmovs as standard.
2025-04-14 06:10:41 +02:00
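As a rough illustration of the remark about conditional moves, here is a toy binary range-decoder step written twice in C: once with a branch and once with selects only. The struct and function names are hypothetical, renormalization and state updates are left out, and this is not the shader code from the commit. It only shows that both forms compute the same result, while the select form gives the compiler an easy path to conditional moves.

```c
#include <assert.h>
#include <stdint.h>

typedef struct ToyRac {
    uint32_t low;
    uint32_t range;
} ToyRac;

/* Branchy form: the two outcomes live in separate code paths. */
static int toy_get_bit_branchy(ToyRac *c, uint8_t state)
{
    uint32_t range1;

    assert(c->range <= 0xFFFFFF); /* keeps range * state within 32 bits */
    range1    = (c->range * state) >> 8;
    c->range -= range1;
    if (c->low < c->range)
        return 0;
    c->low  -= c->range;
    c->range = range1;
    return 1;
}

/* Select form: compute both candidates, then pick with ?: so the
 * compiler can emit conditional moves instead of a jump. */
static int toy_get_bit_select(ToyRac *c, uint8_t state)
{
    uint32_t range1, range0;
    int bit;

    assert(c->range <= 0xFFFFFF);
    range1 = (c->range * state) >> 8;
    range0 = c->range - range1;
    bit    = c->low >= range0;

    c->low  -= bit ? range0 : 0;
    c->range = bit ? range1 : range0;
    return bit;
}
```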
d7772da728
vulkan_ffv1: remove unused define
...
Leftover debug macro.
2025-04-14 06:10:41 +02:00
d077e00f3e
vulkan_ffv1: enable acceleration on Intel
...
Fixed by previous commit.
2025-04-14 06:10:41 +02:00
a1137f9214
hwcontext_vulkan: disable descriptor buffer extension on Intel
...
Temporary workaround. Will be replaced with a version check once a fix is
in the works and the Mesa version that will carry it is known.
2025-04-14 06:10:41 +02:00
79ff1f21c4
vulkan_decode: only create sequence params in end_frame
...
We tried to create sequence params in both start_frame and end_frame.
This was redundant.
Just always create them in end_frame.
2025-04-14 06:10:40 +02:00
4f64df2928
vulkan: remove unused field from exec pools
...
This used to be involved in a mechanism to switch between queue indices,
but since the rewrite of the rewrite of the rewrite, it was rewritten out.
2025-04-14 06:10:40 +02:00
11911aef46
vulkan_shaderc/glslang: print full shaders on TRACE rather than VERBOSE
...
Way too spammy.
2025-04-14 06:10:40 +02:00
7b0156201b
vulkan: fix logging level when erroring upon creating shader module
2025-04-14 06:10:34 +02:00
75d5672b4b
avcodec/mpegaudioenc: Rename MPA_encode_* -> mpa_encode_*
...
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2025-04-13 22:49:21 +02:00
7915e2a095
avcodec/mpegaudioenc: Move PutBitContext to stack
...
Avoids keeping dangling pointers in the context.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2025-04-13 22:49:21 +02:00
62e1abcf0d
avcodec/mpegaudioenc: Don't pad one bit at a time
...
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2025-04-13 22:49:21 +02:00
87f3e20931
avcodec/mpegaudioenc: Avoid intermediate buffer
...
We know the final size before encoding, so we can switch to
ff_get_encode_buffer() which avoids an implicit memcpy().
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2025-04-13 22:49:21 +02:00
6f7ebeff70
avcodec/mpegaudioenc: Combine writing scale factors
...
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2025-04-13 22:49:21 +02:00
db75955d60
avcodec/mpegaudioenc_{fixed,float}: Merge encoders
...
Most of the code in the two encoders is the same, so deduplicate it.
This reduces code size from 22410B to 12637B here.
The data in mpegaudiotab.h is also automatically deduplicated.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2025-04-13 22:49:21 +02:00
13a0d0ade1
avcodec/mpegaudioenc_template: Remove always-false branch
...
The sample rates here have already been checked generically
via CODEC_SAMPLERATES().
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2025-04-13 22:49:21 +02:00
66310f8a22
avformat/asf_tags: Deduplicate tags
...
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2025-04-13 22:49:21 +02:00
413905bff2
avcodec/opus/tab: Deduplicate arrays
...
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2025-04-13 22:49:21 +02:00
35fcdb2132
swscale/x86/rgb2rgb: Deduplicate ASM constants
...
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2025-04-13 22:49:21 +02:00
044bfc7785
avcodec/aac{enc,}tab: Deduplicate swb tables
...
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2025-04-13 22:49:21 +02:00
ab1bc2f745
avcodec/aacenc: Remove always-false check
...
The sample rates have already been checked generically
via AVCodec.supported_samplerates.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2025-04-13 22:49:21 +02:00
516bcfc169
avutil/aes: Use #if checks instead of if (ARCH_X86)
...
Reviewed-by: James Almer <jamrial@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2025-04-13 22:47:32 +02:00
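The commit message above gives no rationale, so the following is only a sketch of the general difference between the two styles, with hypothetical names (`ToyAES`, `toy_aes_init`, `toy_aes_init_x86`). With a runtime-looking `if (ARCH_X86)`, the guarded call is still parsed and compiled, so the function must be declared everywhere and the build relies on dead-code elimination to drop the reference. With `#if ARCH_X86`, the guarded code is removed by the preprocessor and the compiler never sees it on other architectures.

```c
#include <assert.h>

/* Assumption: the build system defines ARCH_X86 as 0 or 1. */
#ifndef ARCH_X86
#define ARCH_X86 0
#endif

typedef struct ToyAES {
    int uses_asm;
} ToyAES;

#if ARCH_X86
/* Hypothetical arch-specific init, only declared (and linked) on x86. */
void toy_aes_init_x86(ToyAES *a);
#endif

static void toy_aes_init(ToyAES *a)
{
    assert(a);
    a->uses_asm = 0;
#if ARCH_X86
    /* Not compiled at all when ARCH_X86 is 0, so no declaration or
     * symbol is needed on other architectures. */
    toy_aes_init_x86(a);
#endif
}
```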
f81ace52f8
avutil/aes: Make aes_init_static() av_cold
...
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2025-04-13 22:47:26 +02:00
c8c4e55b2b
avcodec/motionpixels: Avoid av_unused
...
Easily possible now that -Wdeclaration-after-statement is gone.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2025-04-13 22:47:26 +02:00
bf327ac676
avcodec/hq_hqa: Check size before initializing GetByteContext
...
Enables the compiler to optimize the buf_size assert
in bytestream2_init() away.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2025-04-13 08:49:26 +02:00
16943876f8
avcodec/hq_hqa: Remove implicit always-false checks
...
This decoder has explicit checks.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2025-04-13 08:48:11 +02:00
c39e23cc91
avcodec/hq_hqa: Check available data before allocating frame
...
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2025-04-13 08:26:37 +02:00
c1f124f3f0
avcodec/hq_hqa: Use ff_vlc_init_from_lengths()
...
This makes it possible to avoid the codes table; furthermore, given
that the runs fit into seven bits and the levels into nine,
one can put them into one int16_t and use these as the symbols table
in ff_vlc_init_from_lengths().
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2025-04-13 08:26:37 +02:00
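The packing described in the commit above can be sketched as follows. The bit layout is an assumption made for illustration (run in the low seven bits, a signed nine-bit level above it); the commit message does not say which layout the decoder uses, and the helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Pack a 7-bit run (0..127) and a signed 9-bit level (-256..255) into
 * one int16_t. level * 128 + run spans exactly -32768..32767. */
static int16_t toy_pack_rl(int run, int level)
{
    assert(run >= 0 && run < 128);
    assert(level >= -256 && level < 256);
    return (int16_t)(level * 128 + run);
}

/* The low seven bits hold the run. The cast to uint16_t is modular,
 * so this is well defined for negative symbols too. */
static int toy_unpack_run(int16_t sym)
{
    return (uint16_t)sym & 0x7F;
}

/* Removing the run leaves an exact multiple of 128. */
static int toy_unpack_level(int16_t sym)
{
    return (sym - toy_unpack_run(sym)) / 128;
}
```

A table of such 16-bit symbols is half the size of separate run and level tables stored as `int16_t` each, and a single lookup yields both values.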
9c0d6145c9
avcodec/hq_hqa: Include implicit +1 run in RL VLC table
...
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2025-04-13 08:25:43 +02:00
ce0074f97b
avcodec/hq_hqa: Use RL-VLC table
...
This moves indirections to init.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2025-04-13 08:25:29 +02:00
18309fba3c
avcodec/hq_hqadata: Avoid relocations
...
Initialize a list of 128 pointers at decoder init
instead of using a const list of pointers (which
will be initialized at runtime when libavcodec
is loaded when using PIC code with ELF); the former
takes only 128 bytes (+ a bit of initialization code),
the latter 1KiB on 64-bit systems (+3KiB on x64 ELF
for relocation information).
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2025-04-13 08:24:39 +02:00
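One way to read the commit above is sketched below, with toy sizes and hypothetical names; the actual tables in hq_hqadata differ. A `const` array of pointers needs one relocation per entry in position-independent code, because the addresses are only known at load time. A `const` array of one-byte indices needs none, and the pointer array can live in zero-initialized storage and be filled once at init.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { TOY_NB_BLOCKS = 3, TOY_NB_ENTRIES = 8 };

static const int toy_blocks[TOY_NB_BLOCKS][4] = {
    { 0, 1, 2, 3 }, { 4, 5, 6, 7 }, { 8, 9, 10, 11 },
};

/* One byte per entry and no relocations: plain read-only data. */
static const uint8_t toy_block_idx[TOY_NB_ENTRIES] = {
    0, 1, 1, 2, 0, 2, 1, 0,
};

/* Not const: zero-initialized storage, filled by toy_init_ptrs(). */
static const int *toy_block_ptr[TOY_NB_ENTRIES];

static void toy_init_ptrs(void)
{
    for (size_t i = 0; i < TOY_NB_ENTRIES; i++) {
        assert(toy_block_idx[i] < TOY_NB_BLOCKS);
        toy_block_ptr[i] = toy_blocks[toy_block_idx[i]];
    }
}
```

In real code the init function would be run once, for example from a one-time static initializer, before any lookup through `toy_block_ptr`.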