This cannot beat the Zbb implementation, and it is unlikely that a real
meaningful CPU design would support V and not Zbb. The best loop rewrite
that I could come up with (4 shifts, 2 ands, 3 ors) is still ~40% slower
than Zbb.
A proper faster vector implementation should be feasible with the
cryptographic vector extensions, but that is a story for another time.
The code was blindly assuming that Zbb or V implied Zba. While the
earlier is practically always true, the later broke some QEMU setups,
as V was introduced earlier than Zba.
Even though they have the same size, and typically the same alignment,
uint32_t and float are under no circumstances compatible types in C.
The casts from float * to uint32_t * are invalid here. Insofar as the
resulting pointers are dereferenced, this is undefined behaviour.
This patch uses av_float2int() / av_int2float() instead.
Except for add_squares, telling the compiler that the output vector(s)
cannot alias helps quite a bit (cycles on SiFive U74-MC):
ps_add_squares_c: 98277.7
ps_add_squares_r: 98320.2
ps_hybrid_analysis_c: 3731.2
ps_hybrid_analysis_r: 2495.7
ps_hybrid_analysis_ileave_c: 20478.0
ps_hybrid_analysis_ileave_r: 16092.2
ps_hybrid_synthesis_deint_c: 19051.5
ps_hybrid_synthesis_deint_r: 15420.0
ps_mul_pair_single_c: 122941.2
ps_mul_pair_single_r: 91035.0
Encoders (usually) have no business modifying frame->data
(which need not be writable), so they should use the appropriate
pointers.
Reviewed-by: Tomas Härdin <git@haerdin.se>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Also allocate the AVFrame during init and use av_frame_replace()
to replace it later.
Reviewed-by: Tomas Härdin <git@haerdin.se>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
A private class for an encoder without options is useless.
Reviewed-by: Tomas Härdin <git@haerdin.se>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
This increases the group multiplier as per T-Head C910 benchmarks:
inverse_coupling_c: 4597.0
inverse_coupling_rvv_i32: 1312.7 (m1)
inverse_coupling_rvv_i32: 1116.7 (m2)
inverse_coupling_rvv_i32: 732.2 (m4)
inverse_coupling_rvv_i32: 898.0 (m8)
The IMDCT offset is only relevant for NEON optimisations. There are no
VFP optimisations here that would justify the HAVE_VFP flag. In
practice, this makes no difference because HAVE_NEON is practically
always true for standard Armv8 platforms.
This decoding flag makes decoders drop all frames after a parameter
change, but what exactly constitutes a parameter change is not well
defined and will typically depend on the exact use case.
This functionality then does not belong in libavcodec, but rather in
user code
Copy packet side data to the output frame in ff_decode_frame_props_from_pkt()
instead of in discard_samples(), having the latter only applying the skip if
required.
This will be useful for the following commit.
Signed-off-by: James Almer <jamrial@gmail.com>
Prevent the fifo used in encoding VPx videos from filling up and
stopping encode when it reaches 21845 items, which happens when the
video has more than that number of frames.
Incorporated suggestion from James Zern to prevent calling
frame_data_submit() at all when performing the first pass of a 2-pass
encode so the fifo is not filled at all; replaces original patch which
drained the fifo after filling to prevent it from becoming full.
Fixes the regression originally introduced in
5bda4ec6c3
Co-authored-by: James Zern <jzern@google.com>
Signed-off-by: David Lemler <david@lemler.family>
Signed-off-by: James Zern <jzern@google.com>
In some versions of libsvtav1, setting intra_period_length to 0
does not produce the intended result (i.e.) all frames produced
are not keyframes.
Instead handle the gop_size == 1 as a special case by setting
the pic_type to EB_AV1_KEY_PICTURE when encoding each frame so
that all the output frames are keyframes.
SVT-AV1 Bug: https://gitlab.com/AOMediaCodec/SVT-AV1/-/issues/2076
Example command: ffmpeg -f lavfi -i testsrc=duration=1:size=64x64:rate=30 -c:v libsvtav1 -g 1 -y test.webm
Before: Only first frame is keyframe, rest are intraonly.
After: All frames are keyframes.
Signed-off-by: Vignesh Venkatasubramanian <vigneshv@google.com>
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
Mention encoder name in the message to emphasize that the value in
question is not supported by this specific encoder, not necessarily by
libavcodec in general.
Print a list of values supported by the encoder.
Git master libjxl changed several function signatures, so this commit
adds some #ifdefs to handle the new signatures without breaking old
releases. Do note that old git master development versions of libjxl
will be broken, but no releases will be.
Signed-off-by: Leo Izen <leo.izen@gmail.com>
Should fix#10457, a regression caused by
69516ab3e9.
Reviewed-by: Anton Khirnov <anton@khirnov.net>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>