Encoding was untested before this.
Note that the filesize degradation is partially due to
mpegvideo no longer using progressive_sequence and
progressive_frame.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Effectively reverts c59b5e3d1e.
This is possible now that ff_print_debug_info2() uses
the MPVPicture dimensions.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
MotionEstContext contains pointers (to function pointers) that
have been set on a per-frame basis depending upon no_rounding
in ff_me_init_pic() to avoid branches like these. Also makes
MotionEstContext more independent of MpegEncContext.
Reviewed-by: Ramiro Polla <ramiro.polla@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
The mpegvideo-based encoders do one uncommon thing with
the packet's data given by ff_alloc_packet(): They potentially
reallocate it. But this only affects the internal buffer
and is not user-facing at all, so one can nevertheless
use the AV_CODEC_CAP_DR1 for them.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
When there are multiple candidates for macroblock type, the encoder
tries them all. In order to do so, it keeps several sets of states
containing the variables that get modified when encoding
the macroblock and in the end uses the best of these.
Yet one variable was set, but not included in this state:
The current macroblock's qscale value in the current picture's
qscale_table. This may currently be set multiple times in
mpv_reconstruct_mb(), yet it is read when adaptive_quant is true.
Currently, the value read can be the value set by the last attempt
to write the current macroblock and not the initial value.
Fix this by only setting the qscale_table value in one place
outside of mpv_reconstruct_mb() (where it does not belong at all).
Reviewed-by: Ramiro Polla <ramiro.polla@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Reduce binary size at the same time. The performance compared to clang -O3
is the same.
Reviewed-by: Martin Storsjö <martin@martin.st>
Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
'avs3' is registered at mp4ra.org. The Avs3ConfigurationBox 'av3c'
inside 'avs3' hasn't been registered yet, but is specified by the
AVS3 spec.
Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
The idea is to split the 16 bit coefficients into lower and upper half,
invoke udot for the lower half, shift by 8, and follow by udot for the
upper half.
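
The identity behind this split can be sketched in scalar C. This is only
an illustration, not the NEON code from this commit: udot4(), dot_split(),
dot_ref() and the coefficient values are hypothetical, and the real
assembly may order the shift and the accumulation differently.

```c
#include <assert.h>
#include <stdint.h>

/* Scalar stand-in for one lane of udot: four u8*u8 products
 * accumulated into a 32-bit sum. */
static uint32_t udot4(uint32_t acc, const uint8_t a[4], const uint8_t b[4])
{
    for (int i = 0; i < 4; i++)
        acc += (uint32_t)a[i] * b[i];
    return acc;
}

/* Reference: multiply 8-bit samples by 16-bit unsigned coefficients. */
static uint32_t dot_ref(const uint8_t px[4], const uint16_t coef[4])
{
    uint32_t acc = 0;
    for (int i = 0; i < 4; i++)
        acc += (uint32_t)px[i] * coef[i];
    return acc;
}

/* Same result using only 8-bit dot products:
 * coef = lo + (hi << 8), so dot(px, coef) = dot(px, lo) + (dot(px, hi) << 8). */
static uint32_t dot_split(const uint8_t px[4], const uint16_t coef[4])
{
    uint8_t lo[4], hi[4];
    for (int i = 0; i < 4; i++) {
        lo[i] = coef[i] & 0xff;
        hi[i] = coef[i] >> 8;
    }
    return udot4(0, px, lo) + (udot4(0, px, hi) << 8);
}

/* Small self-check with made-up sample and coefficient values. */
static void dot_split_selfcheck(void)
{
    const uint8_t  px[4]   = { 255, 128, 7, 255 };
    const uint16_t coef[4] = { 0x0c88, 0x4087, 0x20de, 0 };
    assert(dot_split(px, coef) == dot_ref(px, coef));
}
```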
Benchmark on A78:
bgra_to_y_128_c: 682.0 ( 1.00x)
bgra_to_y_128_neon: 181.2 ( 3.76x)
bgra_to_y_128_dotprod: 117.8 ( 5.79x)
bgra_to_y_1080_c: 5742.5 ( 1.00x)
bgra_to_y_1080_neon: 1472.5 ( 3.90x)
bgra_to_y_1080_dotprod: 906.5 ( 6.33x)
bgra_to_y_1920_c: 10194.0 ( 1.00x)
bgra_to_y_1920_neon: 2589.8 ( 3.94x)
bgra_to_y_1920_dotprod: 1573.8 ( 6.48x)
Signed-off-by: Martin Storsjö <martin@martin.st>
Change the type of logctx from void* to AVFormatContext*. This is in
preparation for the next commit.
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Some frames may be buffered before a complex filtergraph can be configured.
This change ensures that the side data removal performed when autorotation
is enabled also applies to those buffered frames.
Fixes ticket #11487
Signed-off-by: James Almer <jamrial@gmail.com>
ff_decode_frame_props() injects global side data passed by the caller (usually
coming from the container) but ignores the global side data the decoder
gathered from the bitstream itself.
This commit amends this.
Signed-off-by: James Almer <jamrial@gmail.com>
This simplifies the code, reduces allocations, and critically, does
not store references of frames, along with references to hw_frames_ctx.
The issue was that storing refs to frames during a transfer also stored
refs to the frames' hw_frames_ctx, which created a circular dependency
that caused the Vulkan device to never be terminated.
This only stores what it strictly needs as a dependency, and enables
the frames context to be freed, even while doing asynchronous transfers.
It prevents the filter from running if such an option is missing,
failing early during init() instead of merely logging an error
at runtime.
Signed-off-by: Leandro Santiago <leandrosansilva@gmail.com>
Reviewed-by: Wenbin Chen <wenbin.chen@intel.com>
Reviewed-by: Guo Yejun <yejun.guo@intel.com>
Instead of calculating a^2, b^2, (a+b)^2 and (a-b)^2, calculate only
a^2, b^2 and 2*a*b in each iteration and derive the latter parts from
these three at the end.
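
The algebra can be sketched in plain C as follows. This is a scalar
illustration only, not the NEON implementation: the function names are
hypothetical, and it assumes inputs small enough that the 64-bit sums do
not overflow.

```c
#include <assert.h>
#include <stdint.h>

/* Direct form: four squares per element. */
static void sum_square_direct(int64_t sum[4], const int32_t *a,
                              const int32_t *b, int len)
{
    assert(len >= 0);
    sum[0] = sum[1] = sum[2] = sum[3] = 0;
    for (int i = 0; i < len; i++) {
        int64_t x = a[i], y = b[i];
        sum[0] += x * x;
        sum[1] += y * y;
        sum[2] += (x + y) * (x + y);
        sum[3] += (x - y) * (x - y);
    }
}

/* Derived form: accumulate a^2, b^2 and 2*a*b, then use
 * (a+b)^2 = a^2 + b^2 + 2ab and (a-b)^2 = a^2 + b^2 - 2ab at the end. */
static void sum_square_derived(int64_t sum[4], const int32_t *a,
                               const int32_t *b, int len)
{
    int64_t aa = 0, bb = 0, ab2 = 0;
    assert(len >= 0);
    for (int i = 0; i < len; i++) {
        int64_t x = a[i], y = b[i];
        aa  += x * x;
        bb  += y * y;
        ab2 += 2 * x * y;
    }
    sum[0] = aa;
    sum[1] = bb;
    sum[2] = aa + bb + ab2;
    sum[3] = aa + bb - ab2;
}
```

The derived form needs three multiply-accumulates per element instead of
four; the remaining additions happen once, after the loop.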
Before and after:
A78
ac3_sum_square_bufferfly_int32_neon: 484.8 ( 2.00x)
ac3_sum_square_bufferfly_int32_neon: 468.2 ( 2.08x)
A72
ac3_sum_square_bufferfly_int32_neon: 793.6 ( 1.26x)
ac3_sum_square_bufferfly_int32_neon: 527.3 ( 1.92x)
Signed-off-by: Martin Storsjö <martin@martin.st>
Constify dstStride argument of rgbx_to_nv12_neon_16_wrapper to be
compatible with other functions as used in function assignment.
Signed-off-by: Adam Lackorzynski <adam@l4re.org>
Signed-off-by: Martin Storsjö <martin@martin.st>
(setting to 100 as a reasonable compromise)
The change has caused regressions for many users and consumers.
Playlist reloads only happen when a playlist doesn't indicate that it
has ended (via #EXT-X-ENDLIST), which means that the addition of future
segments is still expected.
It is quite possible that an HLS server is temporarily unable to serve
further segments but resumes after some time, either indicating a
discontinuity or even by fully catching up.
With a segment length of 3s, a max_reload value of 1000 corresponds to
a duration of 50 minutes which appears to be a reasonable default.
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
The issue is that this could consume gigabytes of VRAM at higher
resolutions for not that much of a speedup.
Automatic detection was not a good idea as we can't know how much
VRAM is actually free.
Just remove it.
BGR formats in Vulkan cannot be used in storage images, as the
pixel labels on storage images are always ordered as RGB, and
swizzling is not an option due to old hardware limitations.
This means that you must always use an RGB format and manually
swizzle when reading or writing to BGR images, or simply not use
a format in the shader itself.
This adds support for the latter.
&pl_alpha_overlay expects straight alpha, but the alpha output may be
premultiplied as a result of the target alpha mode (or in the absence of an
alpha channel). Fix it by requesting independent alpha explicitly when
blending.
There is no reason to only do this on the first input. It doesn't actually
matter for now given that we don't constrain the color space list, but it
may matter when that changes.
Signed-off-by: Niklas Haas <git@haasn.dev>