I spotted a pattern I had not noticed before that makes the implementation faster.
The bit-shifting table I was using is no longer needed, so I was able to remove quite a few lines.
I also added FMA to the AVX2 version.
f32 1920x1080 1 thread with prelut
c impl
1434012700 UNITS in lut3d->interp, 1 runs, 0 skips
1434035335 UNITS in lut3d->interp, 2 runs, 0 skips
1423615347 UNITS in lut3d->interp, 4 runs, 0 skips
1426268863 UNITS in lut3d->interp, 8 runs, 0 skips
sse2
905484420 UNITS in lut3d->interp, 1 runs, 0 skips
905659010 UNITS in lut3d->interp, 2 runs, 0 skips
915167140 UNITS in lut3d->interp, 4 runs, 0 skips
915834222 UNITS in lut3d->interp, 8 runs, 0 skips
avx
574794860 UNITS in lut3d->interp, 1 runs, 0 skips
581035090 UNITS in lut3d->interp, 2 runs, 0 skips
584116720 UNITS in lut3d->interp, 4 runs, 0 skips
581460290 UNITS in lut3d->interp, 8 runs, 0 skips
avx2
301698880 UNITS in lut3d->interp, 1 runs, 0 skips
301982880 UNITS in lut3d->interp, 2 runs, 0 skips
306962430 UNITS in lut3d->interp, 4 runs, 0 skips
305472025 UNITS in lut3d->interp, 8 runs, 0 skips
gbrap16 1920x1080 1 thread with prelut
c impl
1480894840 UNITS in lut3d->interp, 1 runs, 0 skips
1502922990 UNITS in lut3d->interp, 2 runs, 0 skips
1496114307 UNITS in lut3d->interp, 4 runs, 0 skips
1492554551 UNITS in lut3d->interp, 8 runs, 0 skips
sse2
980777180 UNITS in lut3d->interp, 1 runs, 0 skips
986121520 UNITS in lut3d->interp, 2 runs, 0 skips
986489840 UNITS in lut3d->interp, 4 runs, 0 skips
998832248 UNITS in lut3d->interp, 8 runs, 0 skips
avx
622212360 UNITS in lut3d->interp, 1 runs, 0 skips
622981160 UNITS in lut3d->interp, 2 runs, 0 skips
645396315 UNITS in lut3d->interp, 4 runs, 0 skips
641057075 UNITS in lut3d->interp, 8 runs, 0 skips
avx2
321336400 UNITS in lut3d->interp, 1 runs, 0 skips
321268920 UNITS in lut3d->interp, 2 runs, 0 skips
323459895 UNITS in lut3d->interp, 4 runs, 0 skips
324949967 UNITS in lut3d->interp, 8 runs, 0 skips
This gets rid of rist_receiver_data_read, rist_receiver_data_block_free and rist_parse_address.
These functions have been deprecated since librist release v0.2.1 and are replaced with functions
suffixed with 2.
I added a version macro check at the top of the file to ensure ffmpeg can still be compiled against
older versions.
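A compile-time version gate of this kind can be sketched as follows. All DEMO_* macro names, the demo_* functions and the 4.1.0 threshold are hypothetical placeholders, not librist's actual identifiers; the sketch only illustrates selecting between an old call and its "2"-suffixed replacement at compile time.

```c
/* Hypothetical version components; a real library would provide these in its header. */
#define DEMO_API_VERSION_MAJOR 4
#define DEMO_API_VERSION_MINOR 1
#define DEMO_API_VERSION_PATCH 0

/* Pack major/minor/patch into one integer that can be compared with >=. */
#define DEMO_MAKE_VERSION(major, minor, patch) \
    (((major) << 16) | ((minor) << 8) | (patch))
#define DEMO_VERSION DEMO_MAKE_VERSION(DEMO_API_VERSION_MAJOR, \
                                       DEMO_API_VERSION_MINOR, \
                                       DEMO_API_VERSION_PATCH)

/* Hypothetical first version that ships the "2"-suffixed functions. */
#define DEMO_VERSION_WITH_V2 DEMO_MAKE_VERSION(4, 1, 0)

#if DEMO_VERSION >= DEMO_VERSION_WITH_V2
/* Stand-in for the replacement function. */
static int demo_read2(void) { return 2; }
#define demo_read_compat() demo_read2()
#else
/* Stand-in for the deprecated function, used only with older versions. */
static int demo_read(void) { return 1; }
#define demo_read_compat() demo_read()
#endif
```

Callers use demo_read_compat() everywhere, so the rest of the file does not need further #if blocks.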
Signed-off-by: Gijs Peskens <gijs@peskens.net>
Signed-off-by: Marton Balint <cus@passwd.hu>
The maximum allowed useful PES payload data was set to PES_packet_length, but
it is in fact smaller by the length of the PES header.
This changes how corrupt streams are packetized:
- If PES header length is bigger than PES_packet_length then the PES packet
payload will be handled as an unbound packet
- PES packets with payload across multiple MPEGTS packets will always be
split if the next chunk of data would make the payload exceed
PES_packet_length; previously a PES_header_length amount of excess was
allowed.
Fixes ticket #9355.
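The payload limit described above can be sketched as follows. This is a minimal illustration, not the demuxer's code: the function name and the -1 "unbound" return value are assumptions, and it relies on PES_packet_length counting the bytes that follow the 6-byte packet start (start code prefix, stream_id and the length field itself).

```c
/* Start code prefix (3) + stream_id (1) + PES_packet_length field (2). */
#define PES_START_SIZE 6

/*
 * Return the maximum number of payload bytes of a PES packet,
 * or -1 if the packet has to be treated as unbound.
 *
 * pes_packet_length: value of the PES_packet_length field (0 = unbound)
 * pes_header_size:   full header size in bytes, counted from the start code
 */
static int pes_max_payload(int pes_packet_length, int pes_header_size)
{
    int max_payload;

    if (pes_packet_length == 0)
        return -1;                 /* length not signalled */

    max_payload = pes_packet_length + PES_START_SIZE - pes_header_size;
    if (max_payload < 0)
        return -1;                 /* corrupt: header exceeds the packet */

    return max_payload;
}
```

For example, a PES_packet_length of 100 with a 14-byte header (9 fixed bytes plus a 5-byte PTS) leaves 92 bytes of payload, not 100.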
Signed-off-by: Marton Balint <cus@passwd.hu>
This renames PESContext->total_size to PESContext->PES_packet_length and keeps
it 0 for unbound packets, so its name and semantics will match the standard.
There should be no change in functionality.
Signed-off-by: Marton Balint <cus@passwd.hu>
(Inside a function a stray ';' is an empty statement; outside of
a function it is actually invalid, but compilers happen to accept
it without complaint (unless e.g. using -pedantic).)
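A small illustration of the two cases, with made-up function names. The invalid file-scope form is shown only in a comment so that the snippet itself stays valid C.

```c
/*
 * At file scope a lone ';' after a function body is not valid ISO C,
 * although many compilers only diagnose it with -pedantic:
 *
 *     static int twice(int x) { return 2 * x; };    <- stray ';'
 */
static int twice(int x) { return 2 * x; }

static int count_to(int n)
{
    int i = 0;

    while (i < n)
        i++;
    ;               /* inside a function: a harmless empty statement */

    return i;
}
```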
Reviewed-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
The voice registration system in libflite is broken: It is not
thread-safe and also not based on internal counters; instead
any call to unregister a voice frees said voice even if there are still
many other users of said voice who have also registered said voice.
While there is no way to guard against another library unregistering
voices behind our back, we can at least be correct in the absence of
other users of libflite. The current code already tried this by using
a reference count of our own for each voice; but the implementation
of this is not thread-safe at all.
Fix this by using a mutex to guard all of libavfilter's libflite
registration and unregistration calls, thereby being thread-safe
in the absence of other libflite users.
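The locking scheme can be sketched as follows. Every demo_* name is a hypothetical stand-in (including the register/unregister stubs, which do not model libflite's real API); the point is only that the reference count and the registration calls are touched exclusively while holding one mutex.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for a registered voice. */
struct demo_voice { int id; };

struct demo_voice_entry {
    struct demo_voice *voice;   /* non-NULL while the voice is registered */
    int usage_count;            /* our own reference count */
};

static struct demo_voice       demo_storage = { 1 };
static struct demo_voice_entry demo_entry   = { NULL, 0 };
static pthread_mutex_t         demo_mutex   = PTHREAD_MUTEX_INITIALIZER;

/* Stubs standing in for the external library's register/unregister calls. */
static struct demo_voice *demo_lib_register(void) { return &demo_storage; }
static void demo_lib_unregister(struct demo_voice *v) { (void)v; }

/* Take a reference; registers the voice on first use. */
static struct demo_voice *demo_voice_ref(void)
{
    struct demo_voice *v;

    pthread_mutex_lock(&demo_mutex);
    if (!demo_entry.voice)
        demo_entry.voice = demo_lib_register();
    if (demo_entry.voice)
        demo_entry.usage_count++;
    v = demo_entry.voice;
    pthread_mutex_unlock(&demo_mutex);

    return v;
}

/* Drop a reference; unregisters the voice when the last user is gone. */
static void demo_voice_unref(void)
{
    pthread_mutex_lock(&demo_mutex);
    if (demo_entry.usage_count > 0 && --demo_entry.usage_count == 0) {
        demo_lib_unregister(demo_entry.voice);
        demo_entry.voice = NULL;   /* forget the now-invalid pointer */
    }
    pthread_mutex_unlock(&demo_mutex);
}
```

As the message notes, this only helps when no other user of the library unregisters voices behind our back.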
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
When an flite filter instance is uninitialized and the refcount
of the corresponding voice_entry reaches zero, the voice is
unregistered, yet the voice_entry's pointer to the voice is not reset.
(Whereas some other pointers are needlessly reset.)
Because of this a new flite filter instance will believe said voice
to already be registered, leading to use-after-frees.
Fix this by resetting the right pointer instead of the wrong ones.
Reviewed-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Could also happen if initializing flite failed* or if an unknown voice
has been selected or if registering the voice failed.
*: which it currently can't, because it is a no-op.
Reviewed-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Fixes segfaults with filters that either return AVERROR(EAGAIN)
(or another error) or that do not set everything and rely on
filter_query_formats() to set the rest.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
The calling code does not handle failures and will fail with assertion failures later.
Seeking can always fail even when the position was previously read.
Fixes: Assertion failure
Fixes: 35253/clusterfuzz-testcase-minimized-ffmpeg_dem_MATROSKA_fuzzer-4693059982983168
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Some packages may not define custom cflags, in which case a simple
"pkg-config --cflags" call will return an empty string.
This change will be useful to get a valid include path that can be
used in library checks.
Reviewed-by: Haihao Xiang <haihao.xiang@intel.com>
Signed-off-by: James Almer <jamrial@gmail.com>
Fixes memleaks in case the trailer is never written.
Reviewed-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
av_image_copy() expects an array of four pointers according to its
declaration; although it currently only touches pointers that
are actually in use (depending upon the pixel format) this might
change at any time (as has already happened for the linesizes
in d7bc52bf45).
This fixes ticket #9264 as well as a warning from GCC 11.
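The idea of the fix can be sketched without libavutil. The callee below is a hypothetical function that, like the declaration described above, takes an array of four plane pointers and is free to look at all of them; the caller therefore pads its three pointers to a four-element array instead of passing a shorter one.

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical callee declared with four plane pointers; it inspects every slot. */
static int demo_count_planes(uint8_t *const data[4])
{
    int n = 0;

    for (int i = 0; i < 4; i++)
        if (data[i])
            n++;
    return n;
}

/* Caller with only three planes: pad to four entries, unused slot is NULL. */
static int demo_call_with_three(uint8_t *y, uint8_t *u, uint8_t *v)
{
    uint8_t *data[4] = { y, u, v, NULL };

    return demo_count_planes(data);
}
```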
Reviewed-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
(It is actually UB if a declaration and its definition differ wrt
their types like they do in this case (the declaration in allfilters
is const).)
Reviewed-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Up until now setting the input and output device lists is guarded
by a mutex. This prevents data races emanating from multiple concurrent
calls to avpriv_register_devices() (triggered by multiple concurrent
calls to avdevice_register_all()). Yet reading the list pointers was
done without any lock and with nonatomic variables. This means that
there are data races in case of concurrent calls to
av_(de)muxer_iterate() and avdevice_register_all() (but only if the
iteration in av_(de)muxer_iterate() exhausts the non-device (de)muxers).
This commit fixes this by putting said pointers into atomic objects.
Due to the unavailability of _Atomic the object is an atomic_uintptr_t,
leading to ugly casts. Switching to atomics also made it possible to remove
the mutex currently used in avpriv_register_devices().
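A minimal sketch of storing a list pointer in an atomic integer, using C11 <stdatomic.h>. The demo_* names and the element type are hypothetical; only the pattern (store and load through atomic_uintptr_t with casts on both sides) is the point.

```c
#include <stdatomic.h>
#include <stdint.h>
#include <stddef.h>

/* Hypothetical list element type. */
struct demo_format { const char *name; };

/* The list pointer lives in an atomic integer, hence the casts below.
 * Static storage duration means it starts out as 0, i.e. "no list". */
static atomic_uintptr_t demo_list_intptr;

/* Writer: publish a NULL-terminated list. */
static void demo_set_list(const struct demo_format *const list[])
{
    atomic_store(&demo_list_intptr, (uintptr_t)list);
}

/* Reader: load the pointer atomically; NULL if nothing was registered. */
static const struct demo_format *const *demo_get_list(void)
{
    return (const struct demo_format *const *)atomic_load(&demo_list_intptr);
}
```

Because both the store and the load are atomic, a reader sees either no list or the complete published pointer, and the writer no longer needs a mutex for this.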
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Fixes: left shift of negative value -1
Fixes: 39223/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_H264_fuzzer-5498831521841152
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
avcodec_receive_packet() already unreferences the packet on its own.
Reviewed-by: James Almer <jamrial@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
The output stream's packet may not have been allocated
at that point. This happens when quitting with the following command line:
$ ./ffmpeg -lavfi abuffer=sample_fmt=u8:sample_rate=48000:channel_layout=stereo -f null -
Reviewed-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
When inserting an auto-resampler, it may be that the configuration
of the filters that the auto-resampler is supposed to connect is
already partially merged, i.e. converter->inputs[0].incfg.foo and
converter->outputs[0].outcfg.foo (where foo is one of formats,
samplerates, channel_layouts) can coincide. Therefore merging
the converter filter's input link might modify the outcfg of the
converter's outlink. Yet the current code in avfiltergraph.c used
pointers from before merging the inlink for merging the outlink,
leading to a use-after-free in command lines like:
$ ffmpeg -f lavfi -i anullsrc=cl=stereo -lavfi channelsplit,axcorrelate -f null -
Fix this by not using outdated values when merging the outlink.
This is a regression since 85a6404d7e.
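The class of bug can be illustrated with a much simplified model. The demo_* types and the merge function below are hypothetical and do not mirror libavfilter's data structures; they only show why a pointer captured before a merge must not be used afterwards.

```c
#include <stdlib.h>

/* Hypothetical, heavily simplified stand-ins for a format list and a link config. */
struct demo_formats { int value; };
struct demo_cfg     { struct demo_formats *formats; };

/* Merge b into a: afterwards both share a's list and b's old list is freed. */
static void demo_merge(struct demo_cfg *a, struct demo_cfg *b)
{
    if (a->formats == b->formats)
        return;                 /* already merged */
    free(b->formats);
    b->formats = a->formats;
}

static int demo_value_after_merge(struct demo_cfg *a, struct demo_cfg *b)
{
    /*
     * Wrong: caching the pointer before the merge leaves it dangling.
     *     struct demo_formats *stale = b->formats;
     *     demo_merge(a, b);
     *     return stale->value;     <- use-after-free
     */
    demo_merge(a, b);
    return b->formats->value;   /* re-read the pointer after merging */
}
```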
Found-by: Paul B Mahol <onemda@gmail.com>
Reviewed-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>