Found via ASAN with the dnn-layer-conv2d FATE-test.
Reviewed-by: Guo, Yejun <yejun.guo@intel.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
buffer_length is a power-of-two and modulo is buffer_length - 1, so that
buffer_length & modulo is zero.
Reviewed-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
Also unify incrementing the variable containing the pointer
to the currently used HRIR data.
Reviewed-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
The headphone filter uses an array with as many elements as the
filter has inputs to store some per-input information; yet actually it
only stores information for all inputs except the very first one (which
is special for this filter). Therefore this commit modifies the code to
remove this unused element.
Reviewed-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
Despite the headphone filter only using one AVFrame at a time, it kept
an array each of whose entries contained a pointer to an AVFrame at all
times; the pointers were mostly NULL. This commit instead replaces them
by using a single pointer to an AVFrame on the stack of the only
function that actually uses them.
Reviewed-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
The headphone filter allocates a pair of buffers to be used as
intermediate buffers lateron: Before every use they are zeroed, then
some elements of the buffer are set and lateron the complete buffers are
copied into another, bigger buffer. These intermediate buffers are
unnecessary as the data can be directly written into the bigger buffer.
Furthermore, the whole buffer has been zeroed initially and because no
piece of this buffer is set twice (due to the fact that duplicate
channel map entries are skipped), it is unnecessary to rezero the part
of the big buffer that is about to be written to.
Reviewed-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
Before this commit, the headphone filter called
av_channel_layout_extract_channel() in a loop in order to find out
the index of a channel (given via its AV_CH_* value) in a channel layout.
This commit changes this to av_get_channel_layout_channel_index()
instead.
Reviewed-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
The documentation of the map argument of the headphone filter states:
"Set mapping of input streams for convolution. The argument is a
’|’-separated list of channel names in order as they are given as
additional stream inputs for filter."
Yet this has not been honoured at all. Instead for the kth given HRIR
channel pair it was checked whether there was a kth mapping and if the
channel position so given was present in the channel layout of the main
input; if so, then the given HRIR channel pair was matched to the kth
channel of the main input. It should actually have been matched to the
channel given by the kth mapping. A consequence of the current algorithm
is that if N additional HRIR channel pairs are given, a permutation of
the first N entries of the mapping does not affect the output at all.
The old code might even set arrays belonging to streams that don't exist
(i.e. whose index is >= the number of channels of the main input
stream); these parts were not read lateron at all. The new code doesn't
do this any longer and therefore the number of elements of some of the
allocated arrays has been reduced (in case the number of mappings was
bigger than the number of channels of the first input stream).
Reviewed-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
When the headphone filter is configured to perform its processing in the
frequency domain, it allocates (among other things) two pairs of
buffers, all of the same size. One pair is used to store data in it
during the initialization of the filter; the other pair is only
allocated lateron. It is zero-initialized and yet its data is
immediately overwritten by the content of the other pair of buffers
mentioned above; the latter pair is then freed.
This commit eliminates the pair of intermediate buffers.
Reviewed-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
The headphone filter has two modes; in one of them (say A), it needs
certain buffers to store data. But it allocated them in both modes.
Furthermore when in mode A it also allocated intermediate buffers of the
same size, initialized them, copied their contents into the permanent
buffers and freed them.
This commit changes this: The permanent buffer is only allocated when
needed; the temporary buffer has been completely avoided.
Reviewed-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
They seem to exist for an option that was never implemented.
Reviewed-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
The delay arrays were never properly initialized, only zero-initialized;
furthermore these arrays duplicate fields in the headphone_inputs
struct. So remove them.
(Btw: The allocations for them have not been checked.)
Reviewed-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
The string given by an AVOption that contains the channel assignment
is used only once; ergo it doesn't matter that parsing the string via
av_strtok() is destructive. There is no need to make a copy.
Reviewed-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
When parsing the channel mapping string (a string containing '|'
delimited tokens each of which is supposed to contain a channel name
like "FR"), the old code would at each step read up to seven uppercase
characters from the input string and give this to
av_get_channel_layout() to parse. The returned layout is then checked
for being a layout with a single channel set by computing its logarithm.
Besides being overtly complicated this also has the drawback of relying
on the assumption that every channel name consists of at most seven
uppercase letters only; but said assumption is wrong: The abbreviation
of the second low frequency channel is LFE2. Furthermore it treats
garbage like "FRfoo" as valid channel.
This commit changes this by using av_get_channel_layout() directly;
furthermore, av_get_channel_layout_nb_channels() (which uses popcount)
is used to find out the number of channels instead of the custom code
to calculate the logarithm.
(As a consequence, certain other formats to specify the channel layouts
are now accepted (like the hex versions of av_get_channel_layout()); but
this is actually not bad at all.)
Reviewed-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
The headphone filter has an option for the user to specify an assignment
of inputs to channels (or from pairs of channels of the second input to
channels). Up until now, these channels were stored in an int containing
the logarithm of the channel layout. Yet it is not the logarithm that is
used lateron and so a retransformation was necessary. Therefore this
commit simply stores the uint64_t as is, avoiding the retransformation.
This also has the advantage that unset channels (whose corresponding
entry is zero) can't be mistaken for valid channels any more; the old
code had to initialize the channels to -1 to solve this problem and had
to check for whether a channel is set before the retransformation
(because 1 << -1 is UB).
The only downside of this approach is that the size of the context
increases (by 256 bytes); but this is not exceedingly much.
Finally, the array has been moved to the end of the context; it is only
used a few times during the initialization process and moving it
decreased the offsets of lots of other entries, reducing codesize.
Reviewed-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
The headphone filter does most of its initialization after its init
function, because it can perform certain tasks only after all but its
first input streams have reached eof. When this happens, it allocates
certain buffers and errors out if an allocation fails.
Yet the filter didn't check whether some of these buffers already exist
(which may happen if an earlier attempt has been interrupted in the
middle (due to an allocation error)) in which case the old buffers leak.
This commit makes sure that initializing the buffers is only attempted
once; if not successfull at the first attempt, future calls to the
filter will error out. Trying to support resuming initialization doesn't
seem worthwhile.
Notice that some allocations were freed before a new allocation was
performed; yet this effort was incomplete. Said code has been removed.
Reviewed-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
The number of channels can be up to 64, not only 16.
Reviewed-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
The headphone filter stores the channel position of the ith HRIR stream
in the ith element of an array of 64 elements; but because there is no
check for duplicate channels, it is easy to write beyond the end of the
array by simply repeating channels.
This commit adds a check for duplicate channels to rule this out.
Reviewed-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
When the headphone filter does its processing in the time domain,
the lengths of the buffers involved are determined by three parameters,
only two of which are relevant here: ir_len and air_len. The former is
the length (in samples) of the longest HRIR input stream and the latter
is the smallest power-of-two bigger than ir_len.
Using optimized functions to calculate the convolution places
restrictions on the alignment of the length of the vectors whose scalar
product is calculated. Therefore said length, namely ir_len, is aligned
on 32; but the number of elements of the buffers used is given by air_len
and for ir_len < 16 a buffer overflow happens.
This commit fixes this by ensuring that air_len is always >= 32 if
processing happens in the time domain.
Reviewed-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
Not providing any samples makes no sense at all. And if no samples
were provided for one of the HRIR streams, one would either run into
an av_assert1 in ff_inlink_consume_samples() or into a segfault in
take_samples() in avfilter.c.
Reviewed-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
This buffer was supposed to be initialized by sscanf(input, "%7[A-Z]%n",
buf, &len), yet if the first input character is not in the A-Z range,
buf is not touched (in particular it needn't be zero-terminated if the
failure happened when parsing the first channel and it still contains
the last channel name if the failure happened when one channel name
could be successfully parsed). This is treated as error in which case
buf is used directly in the log message. This commit fixes this by
actually using the string that could not be matched in the log message
instead.
Reviewed-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
Use pthread to multithread dnn_execute_layer_conv2d.
Can be tested with command "./ffmpeg_g -i input.png -vf \
format=yuvj420p,dnn_processing=dnn_backend=native:model= \
espcn.model:input=x:output=y:options=conv2d_threads=23 \
-y sr_native.jpg -benchmark"
before patch: utime=11.238s stime=0.005s rtime=11.248s
after patch: utime=20.817s stime=0.047s rtime=1.051s
on my 3900X 12c24t @4.2GHz
About the increase of utime, it's because that CPU HyperThreading
technology makes logical cores twice of physical cores while cpu's
counting performance improves less than double. And utime sums
all cpu's logical cores' runtime. As a result, using threads num
near cpu's logical core's number will double utime, while reduce
rtime less than half for HyperThreading CPUs.
Signed-off-by: Xu Jun <xujunzz@sjtu.edu.cn>
Signed-off-by: Guo, Yejun <yejun.guo@intel.com>
This reverts commit abc884bcc0.
This patch was pushed without actual review.
An actual review would have revealed that the switch to activate
was not done correctly because the logic between request_frame()
and frame_wanted is not as direct with filters with multiple
outputs than with a single output.
Allow to set the EOF timestamp.
Also: doc/filters/testsrc*: specify the rounding of the duration option.
The changes in the ref files are right.
For filter-fps-down, the graph is testsrc2=r=7:d=3.5,fps=3.
3.5=24.5/7, so the EOF of testsrc2 will have PTS 25/7.
25/7=(10+5/7)/3, so the EOF PTS for fps should be 11/7,
and the output should contain a frame at PTS 10.
For filter-fps-up, the graph is testsrc2=r=3:d=2,fps=7,
for filter-fps-up-round-down and filter-fps-up-round-up
it is the same with explicit rounding options.
But there is no rounding: testsrc2 produces exactly 6 frames
and 2 seconds, fps converts it into exactly 14 frames.
The tests should probably be adjusted to restore them to
a useful coverage.
ff_set_common_formats() is currently only called after
graph_check_validity(), guaranteeing that inputs and outputs
are connected.
If we want to support configuring partially-connected graphs,
we will have a lot of redesign to do anyway.
Fix CID 1466262 / 1466263.
Explicitly insert the scale or aresample filter where it would
have been inserted by the negotiation.
Re-enable conversions if it cannot be done easily.
If a conversion is needed in a test, we want to know about it.
If the negotiation changes and makes new conversion necessary,
we want to know about it even more.
The channel_layouts and channel_counts options set what buffersink
is supposed to accept. If channel_counts contains 2, then stereo is
already accepted, there is no point in having it in channel_layouts
too. This was not properly documented until now, so only print a
warning.
Part of the code expects valid lists, in particular no duplicates.
These tests allow to catch bugs in filters (unlikely but possible)
and to give a clear message when the error comes from the user
((a)formats) or the application (buffersink).
If we decide to switch to a more efficient merging algorithm,
possibly sorting the lists, these functions will be the preferred
place for pre-processing, and can be renamed accordingly.
It will allow to refernce it as a whole without clunky macros.
Most of the changes have been automatically made with sed:
sed -i '
s/-> *in_formats/->incfg.formats/g;
s/-> *out_formats/->outcfg.formats/g;
s/-> *in_channel_layouts/->incfg.channel_layouts/g;
s/-> *out_channel_layouts/->outcfg.channel_layouts/g;
s/-> *in_samplerates/->incfg.samplerates/g;
s/-> *out_samplerates/->outcfg.samplerates/g;
' src/libavfilter/*(.)
Fixes: signed integer overflow: -1429092 * -32596 cannot be represented in type 'int'
Fixes: 24419/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_FFWAVESYNTH_fuzzer-5157849974702080
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Reviewed-by: Nicolas George <george@nsup.org>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>