This option flag only carries nontrivial information for options that
call a function, in all other cases its presence can be inferred from
the option type (bool options do not have arguments, all other types do)
and is thus nothing but useless clutter.
Change the option parsing code to infer its value when it can, and drop
the flag from options where it's not needed.
It causes those options to be parsed as either
* -autofoo 0/1 (with an argument)
* -noautofoo (without an argument)
This is unnecessary, confusing, and against the documentation; these are
also the only two bool options that take an argument.
This should not affect the users, as these options are on by default,
and are supposed to be used as -nofoo per the documentation.
Otherwise an unitialized stack value would be copied to FPSConvContext.
As it's then never used, it tends not to be a problem in practice,
however it is UB and some compilers warn about it.
The check for UWP mode was duplicated from right above, in
d54127c41a81cf2078a3504f78e0e4232cfe11b7.
Also, instead of several lines with "enabled uwp && ...", make one
"if enabled uwp; then" block.
Signed-off-by: Martin Storsjö <martin@martin.st>
This frame will be freed in the next line.
Reviewed-by: Zhao Zhili <quinkblack@foxmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
The current pack_output function pointer is a property of the decoder,
rather than a constant method provided by the DSP code. Indeed, except
for an unused initialisation, the field is never used in DSP code.
The penultimate loop iteration could pick any vl such that:
vlenb/4 < vl <= vlenb/2
Thus if the total length is not a multiple of vlenb/2, the vfadd.vf
on the penultimate iteration would yield corrupt values for the last
iteration.
To avoid this, force vl = vlen/2 until the last iteration. Unfortunately
this latent bug is not reproducible with either hardware or QEMU as of now.
This skips the round-trip to scalar register for the sliding 'x'
coefficients, improving performance by about 5%. The trick here is that
the vector slide-up instruction preserves elements in destination vector
until the slide offset.
The switch from vfslide1up.vf to vslideup.vi also allows the elimination
of data dependencies on consecutive slides. Since the specifications
recommend sticking to power of two offsets, we could slide as follows:
vslideup.vi v8, v0, 2
vslideup.vi v4, v0, 1
vslideup.vi v12, v8, 1
vslideup.vi v16, v8, 2
However in the device under test, this seems to make performance slightly
worse, so this is left for (in)validation with future better hardware.
Long is 32 bits signed on Windows, and nb_stream{s,_groups} are both unsigned
int. In a realistic scenario this wont make a difference, but it's still
proper.
Also ensure the parsed string is an integer while at it.
Signed-off-by: James Almer <jamrial@gmail.com>
It caused lacking a public declaration build error with
-Werror=missing-prototypes.
Since DXGI_FORMAT is moved to public since patch set V10, this function
is no longer useful. Now remove it.
Signed-off-by: Tong Wu <tong1.wu@intel.com>
Signed-off-by: James Almer <jamrial@gmail.com>
This fixes the following build error:
src/libavcodec/d3d12va_decode.c:49:10: error: no previous prototype for function
'ff_d3d12va_get_surface_index' [-Werror,-Wmissing-prototypes]
49 | unsigned ff_d3d12va_get_surface_index(const AVCodecContext *avctx,
| ^
Signed-off-by: Martin Storsjö <martin@martin.st>
This disables regenerating ffversion.h whenever the checked out
git commit changes, speeding up development rebuilds.
Whenever this option is set, force the version to be printed as
"unknown" rather than showing potentially stale information.
Signed-off-by: Martin Storsjö <martin@martin.st>
Same as d3d11va, this flag enables main still picture profile for
d3d12va. User should add this flag when decoding main still picture
profile.
Signed-off-by: Tong Wu <tong1.wu@intel.com>
The implementation is based on:
https://learn.microsoft.com/en-us/windows/win32/medfound/direct3d-12-video-overview
With the Direct3D 12 video decoding support, we can render or process
the decoded images by the pixel shaders or compute shaders directly
without the extra copy overhead, which is beneficial especially if you
are trying to render or post-process a 4K or 8K video.
The command below is how to enable d3d12va:
ffmpeg -hwaccel d3d12va -i input.mp4 output.mp4
Signed-off-by: Wu Jianhua <toqsxw@outlook.com>
Signed-off-by: Tong Wu <tong1.wu@intel.com>
In close_output(), a dummy frame is created with format NONE passed
to enc_open(), which isn't prepared for it. The NULL pointer
dereference happened at
av_pix_fmt_desc_get(enc_ctx->pix_fmt)->comp[0].depth.
When fgt.graph is NULL, skip fg_output_frame() since there is
nothing to output.
frame #0: 0x0000005555bc34a4 ffmpeg_g`enc_open(opaque=0xb400007efe2db690, frame=0xb400007efe2d9f70) at ffmpeg_enc.c:235:44
frame #1: 0x0000005555bef250 ffmpeg_g`enc_open(sch=0xb400007dde2d4090, enc=0xb400007e4e2daad0, frame=0xb400007efe2d9f70) at ffmpeg_sched.c:1462:11
frame #2: 0x0000005555bee094 ffmpeg_g`send_to_enc(sch=0xb400007dde2d4090, enc=0xb400007e4e2daad0, frame=0xb400007efe2d9f70) at ffmpeg_sched.c:1571:19
frame #3: 0x0000005555bee01c ffmpeg_g`sch_filter_send(sch=0xb400007dde2d4090, fg_idx=0, out_idx=0, frame=0xb400007efe2d9f70) at ffmpeg_sched.c:2154:12
frame #4: 0x0000005555bcf124 ffmpeg_g`close_output(ofp=0xb400007e4e2d85b0, fgt=0x0000007d1790eb08) at ffmpeg_filter.c:2225:15
frame #5: 0x0000005555bcb000 ffmpeg_g`fg_output_frame(ofp=0xb400007e4e2d85b0, fgt=0x0000007d1790eb08, frame=0x0000000000000000) at ffmpeg_filter.c:2317:16
frame #6: 0x0000005555bc7e48 ffmpeg_g`filter_thread(arg=0xb400007eae2ce7a0) at ffmpeg_filter.c:2836:15
frame #7: 0x0000005555bee568 ffmpeg_g`task_wrapper(arg=0xb400007d8e2db478) at ffmpeg_sched.c:2200:21
Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
From AOSP doc, these values are device and codec specific, but lower
values generally result in more efficient (smaller-sized) encoding.
For example, global_quality 50 on Pixel 6 results a 1080P 30 FPS
HEVC with 3744 kb/s, while global_quality 80 results 28178 kb/s.
Fix#10689
Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
The ffmpeg coding style doesn't usually use const on scalar
parameters (or on the pointer values - as opposed to the type
that is pointed to, where it has a semantic meaning), contrary
to the dav1d coding style (where this was imported from).
This avoids warnings about differences in the type signatures
between declaration and definition of this function, with older
versions of MSVC.
The issue was observed with one version of MSVC 2017,
19.16.27024.1, with warnings like these:
src/tests/checkasm/checkasm.c(969): warning C4028: formal parameter 3 different from declaration
The warning itself is bogus as the const here is harmless, and
newer versions of MSVC no longer warn about this.
Signed-off-by: Martin Storsjö <martin@martin.st>
This can be used to run tests multple times, with e.g. differing
QEMU settings, by adding something like this to the FATE configuration
file:
target_exec="qemu-aarch64-static"
fate_targets="fate-checkasm fate-cpu"
fate_environments="sve128 sve256 sve512"
sve128_env="QEMU_CPU=max,sve128=on"
sve256_env="QEMU_CPU=max,sve256=on"
sve512_env="QEMU_CPU=max,sve512=on"
It's also possible to customize the target_exec command further
by injecting a sufficiently quoted variable into it, which then can
be updated for each run, e.g. target_exec="\$(CUR_EXEC_CMD)".
For each of the environment names in fate_environments, the tests
that are run get the name suffixed on the fate tests in the
test log and fate report, e.g. "fate-checkasm-h264dsp_sve128".
Signed-off-by: Martin Storsjö <martin@martin.st>
This can be useful if doing testing of uncommon CPU extensions by
running tests with QEMU (by configuring with e.g.
"target_exec=qemu-aarch64"), by only running the checkasm tests,
to get a reasonable test coverage without excessive test runtime.
For such a config, setting fate_targets="fate-checkasm fate-cpu"
can be a good tradeoff.
Signed-off-by: Martin Storsjö <martin@martin.st>