From x86inc:
> On AMD cpus <=K10, an ordinary ret is slow if it immediately follows either
> a branch or a branch target. So switch to a 2-byte form of ret in that case.
> We can automatically detect "follows a branch", but not a branch target.
> (SSSE3 is a sufficient condition to know that your cpu doesn't have this problem.)
x86inc can automatically determine whether to use REP_RET rather than
REP in most of these cases, so impact is minimal. Additionally, a few
REP_RETs were used unnecessary, despite the return being nowhere near a
branch.
The only CPUs affected were AMD K10s, made between 2007 and 2011, 16
years ago and 12 years ago, respectively.
In the future, everyone involved with x86inc should consider dropping
REP_RETs altogether.
libjxl only accepts 16-bit buffers with its API, but it can
accept 9-bit to 15-bit input via a 16-bit buffer, provided the flag
is set declaring the buffer to be of the respective significant depth.
Likewise, it can only provide pixel data on decode as a 16-bit buffer
(if higher than 8) but does provide the metadata tagging the actual bit
depth.
This commit causes libjxlenc.c and libjxldec.c to respect this metadata
and tag/read it accordingly from AVCodecContext->bits_per_raw_sample.
Signed-off-by: Leo Izen <leo.izen@gmail.com>
mfenc sets FF_CODEC_CAP_INIT_CLEANUP, so calling mf_close() on
failure inside mf_init() results in a double-free.
Signed-off-by: Cameron Gutman <aicommander@gmail.com>
Signed-off-by: Martin Storsjö <martin@martin.st>
Only warn if the advanced_editlist option is enabled (it is enabled
by default though) so we don't print one warning for each track, and
demote the warning to AV_LOG_LEVEL_VERBOSE; this message does get
generated whenever parsing a fragmented MP4 file, regardless of
whether the file actually uses multiple edits or not.
Later when parsing the mov structures, the demuxer does warn if
the file did contain multiple edits which would require the
advanced_editlist option enabled for decoding correctly.
Adjust the warning message for the case when the file seemed like it
actually would have needed handling of advanced edit lists, to
reflect the fact that it doesn't help to try set the option as
it has been automatically disabled.
Signed-off-by: Martin Storsjö <martin@martin.st>
The construct of using offsetof on a (potentially anonymous) struct
defined within the offsetof expression, while supported by all
current compilers, has been declared explicitly undefined by the
C standards committee [1].
Clang recently got a change to identify this as an issue [2];
initially it was treated as a hard error, but it was soon after
softened into a warning under the -Wgnu-offsetof-extensions option
(not enabled automatically as part of -Wall though).
Nevertheless - in this particular case, it's trivial to fix the
code not to rely on the construct that the standards committee has
explicitly called out as undefined.
[1] https://www.open-std.org/jtc1/sc22/wg14/www/docs/n2350.htm
[2] https://reviews.llvm.org/D133574
Signed-off-by: Martin Storsjö <martin@martin.st>
If the dictionary provided on input contains multiple entries for an
option (relevant for flags modifying the previous value with '+' or
'-') and the option is not found in the target object, only the last
entry would be returned to the caller.
Pass AV_DICT_MULTIKEY to av_dict_set() to make sure all such entries are
returned.
QSVScaleContext and VPPContext have the same base context, and all
features in scale_qsv are implemented in vpp_qsv filter, so scale_qsv
can be taken as a special case of vpp_qsv filter, and we may use
VPPContext with a different option array, preinit callback and supported
pixel formats to implement scale_qsv then remove QSVScaleContext
Signed-off-by: Haihao Xiang <haihao.xiang@intel.com>
Use QSVVPPContext as a base context of QSVScaleContext, hence we may
re-use functions defined for QSVVPPContext to manage MFX session for
scale_qsv filter.
In addition, system memory has been taken into account in
QSVVVPPContext, we may add support for non-QSV pixel formats in the
future.
Signed-off-by: Haihao Xiang <haihao.xiang@intel.com>
Hyper Encode uses Intel integrated and discrete graphics on one system
to accelerate encoding of a single video stream.
Depending on the selected parameters and codecs, performance gain on AlderLake iGPU + ARC Gfx up to 1.6x.
More information: https://www.intel.co.uk/content/www/uk/en/architecture-and-technology/adaptix/deep-link.html
Developer guide: https://github.com/oneapi-src/oneVPL-intel-gpu/blob/main/doc/HyperEncode_FeatureDeveloperGuide.md
Hyper Encode is supported only on Windows and requires D3D11 and oneVPL.
To enable Hyper Encode need to specify:
-Hyper Encode mode (-dual_gfx on or dual_gfx adaptive)
-Encoder: h264_qsv or hevc_qsv
-BRC: VBR, CQP or ICQ
-Lowpower mode (-low_power 1)
-Closed GOP for AVC or strict GOP for HEVC -idr_interval = 0 used by default
Depending on the encoding parameters, the following parameters may need
to be adjusted:
-g recommended >= 30 for better performance
-async_depth recommended >= 30 for better performance
-extra_hw_frames recommended equal to async_depth value
-bf recommended = 0 for better performance
In the cases with fast encoding (-preset veryfast) there may be no
performance gain due to the fact that the decode is slower than the encode.
Command line examples:
ffmpeg.exe -init_hw_device qsv:hw,child_device_type=d3d11va,child_device=0 -v verbose -y -hwaccel qsv -extra_hw_frames 60 -async_depth 60 -c:v h264_qsv -i bbb_sunflower_2160p_60fps_normal.mp4
-async_depth 60 -c:v h264_qsv -preset medium -g 60 -low_power 1 -bf 0 -dual_gfx on output.h265
Signed-off-by: galinart <artem.galin@intel.com>
Let's ignore the index table if the number of index entries does not match the
index duration (or the special AVID index entry counts).
Fixes: OOM
Fixes: 50551/clusterfuzz-testcase-minimized-ffmpeg_dem_MXF_fuzzer-6607795234930688
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Marton Balint <cus@passwd.hu>
This is intended to be a more convenient replacement for
reordered_opaque.
Add support for it in the two encoders that offer
AV_CODEC_CAP_ENCODER_REORDERED_OPAQUE: libx264 and libx265. Other
encoders will be supported in future commits.
Some encoders (ffv1, flac, adx) are marked with AV_CODEC_CAP_DELAY onky
in order to be flushed at the end, otherwise they behave as no-delay
encoders.
Add a capability to mark these encoders. Use it for setting pts
generically.