The headphone filter has two modes; in one of them (say A), it needs
certain buffers to store data. But it allocated them in both modes.
Furthermore when in mode A it also allocated intermediate buffers of the
same size, initialized them, copied their contents into the permanent
buffers and freed them.
This commit changes this: The permanent buffer is only allocated when
needed; the temporary buffer has been completely avoided.
Reviewed-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
They seem to exist for an option that was never implemented.
Reviewed-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
The delay arrays were never properly initialized, only zero-initialized;
furthermore these arrays duplicate fields in the headphone_inputs
struct. So remove them.
(Btw: The allocations for them have not been checked.)
Reviewed-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
The string given by an AVOption that contains the channel assignment
is used only once; ergo it doesn't matter that parsing the string via
av_strtok() is destructive. There is no need to make a copy.
Reviewed-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
When parsing the channel mapping string (a string containing '|'
delimited tokens each of which is supposed to contain a channel name
like "FR"), the old code would at each step read up to seven uppercase
characters from the input string and give this to
av_get_channel_layout() to parse. The returned layout is then checked
for being a layout with a single channel set by computing its logarithm.
Besides being overtly complicated this also has the drawback of relying
on the assumption that every channel name consists of at most seven
uppercase letters only; but said assumption is wrong: The abbreviation
of the second low frequency channel is LFE2. Furthermore it treats
garbage like "FRfoo" as valid channel.
This commit changes this by using av_get_channel_layout() directly;
furthermore, av_get_channel_layout_nb_channels() (which uses popcount)
is used to find out the number of channels instead of the custom code
to calculate the logarithm.
(As a consequence, certain other formats to specify the channel layouts
are now accepted (like the hex versions of av_get_channel_layout()); but
this is actually not bad at all.)
Reviewed-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
The headphone filter has an option for the user to specify an assignment
of inputs to channels (or from pairs of channels of the second input to
channels). Up until now, these channels were stored in an int containing
the logarithm of the channel layout. Yet it is not the logarithm that is
used lateron and so a retransformation was necessary. Therefore this
commit simply stores the uint64_t as is, avoiding the retransformation.
This also has the advantage that unset channels (whose corresponding
entry is zero) can't be mistaken for valid channels any more; the old
code had to initialize the channels to -1 to solve this problem and had
to check for whether a channel is set before the retransformation
(because 1 << -1 is UB).
The only downside of this approach is that the size of the context
increases (by 256 bytes); but this is not exceedingly much.
Finally, the array has been moved to the end of the context; it is only
used a few times during the initialization process and moving it
decreased the offsets of lots of other entries, reducing codesize.
Reviewed-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
The headphone filter does most of its initialization after its init
function, because it can perform certain tasks only after all but its
first input streams have reached eof. When this happens, it allocates
certain buffers and errors out if an allocation fails.
Yet the filter didn't check whether some of these buffers already exist
(which may happen if an earlier attempt has been interrupted in the
middle (due to an allocation error)) in which case the old buffers leak.
This commit makes sure that initializing the buffers is only attempted
once; if not successfull at the first attempt, future calls to the
filter will error out. Trying to support resuming initialization doesn't
seem worthwhile.
Notice that some allocations were freed before a new allocation was
performed; yet this effort was incomplete. Said code has been removed.
Reviewed-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
The number of channels can be up to 64, not only 16.
Reviewed-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
The headphone filter stores the channel position of the ith HRIR stream
in the ith element of an array of 64 elements; but because there is no
check for duplicate channels, it is easy to write beyond the end of the
array by simply repeating channels.
This commit adds a check for duplicate channels to rule this out.
Reviewed-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
When the headphone filter does its processing in the time domain,
the lengths of the buffers involved are determined by three parameters,
only two of which are relevant here: ir_len and air_len. The former is
the length (in samples) of the longest HRIR input stream and the latter
is the smallest power-of-two bigger than ir_len.
Using optimized functions to calculate the convolution places
restrictions on the alignment of the length of the vectors whose scalar
product is calculated. Therefore said length, namely ir_len, is aligned
on 32; but the number of elements of the buffers used is given by air_len
and for ir_len < 16 a buffer overflow happens.
This commit fixes this by ensuring that air_len is always >= 32 if
processing happens in the time domain.
Reviewed-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
Not providing any samples makes no sense at all. And if no samples
were provided for one of the HRIR streams, one would either run into
an av_assert1 in ff_inlink_consume_samples() or into a segfault in
take_samples() in avfilter.c.
Reviewed-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
This buffer was supposed to be initialized by sscanf(input, "%7[A-Z]%n",
buf, &len), yet if the first input character is not in the A-Z range,
buf is not touched (in particular it needn't be zero-terminated if the
failure happened when parsing the first channel and it still contains
the last channel name if the failure happened when one channel name
could be successfully parsed). This is treated as error in which case
buf is used directly in the log message. This commit fixes this by
actually using the string that could not be matched in the log message
instead.
Reviewed-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
Use pthread to multithread dnn_execute_layer_conv2d.
Can be tested with command "./ffmpeg_g -i input.png -vf \
format=yuvj420p,dnn_processing=dnn_backend=native:model= \
espcn.model:input=x:output=y:options=conv2d_threads=23 \
-y sr_native.jpg -benchmark"
before patch: utime=11.238s stime=0.005s rtime=11.248s
after patch: utime=20.817s stime=0.047s rtime=1.051s
on my 3900X 12c24t @4.2GHz
About the increase of utime, it's because that CPU HyperThreading
technology makes logical cores twice of physical cores while cpu's
counting performance improves less than double. And utime sums
all cpu's logical cores' runtime. As a result, using threads num
near cpu's logical core's number will double utime, while reduce
rtime less than half for HyperThreading CPUs.
Signed-off-by: Xu Jun <xujunzz@sjtu.edu.cn>
Signed-off-by: Guo, Yejun <yejun.guo@intel.com>
This reverts commit abc884bcc0.
This patch was pushed without actual review.
An actual review would have revealed that the switch to activate
was not done correctly because the logic between request_frame()
and frame_wanted is not as direct with filters with multiple
outputs than with a single output.
Allow to set the EOF timestamp.
Also: doc/filters/testsrc*: specify the rounding of the duration option.
The changes in the ref files are right.
For filter-fps-down, the graph is testsrc2=r=7:d=3.5,fps=3.
3.5=24.5/7, so the EOF of testsrc2 will have PTS 25/7.
25/7=(10+5/7)/3, so the EOF PTS for fps should be 11/7,
and the output should contain a frame at PTS 10.
For filter-fps-up, the graph is testsrc2=r=3:d=2,fps=7,
for filter-fps-up-round-down and filter-fps-up-round-up
it is the same with explicit rounding options.
But there is no rounding: testsrc2 produces exactly 6 frames
and 2 seconds, fps converts it into exactly 14 frames.
The tests should probably be adjusted to restore them to
a useful coverage.
ff_set_common_formats() is currently only called after
graph_check_validity(), guaranteeing that inputs and outputs
are connected.
If we want to support configuring partially-connected graphs,
we will have a lot of redesign to do anyway.
Fix CID 1466262 / 1466263.
Explicitly insert the scale or aresample filter where it would
have been inserted by the negotiation.
Re-enable conversions if it cannot be done easily.
If a conversion is needed in a test, we want to know about it.
If the negotiation changes and makes new conversion necessary,
we want to know about it even more.
The channel_layouts and channel_counts options set what buffersink
is supposed to accept. If channel_counts contains 2, then stereo is
already accepted, there is no point in having it in channel_layouts
too. This was not properly documented until now, so only print a
warning.
Part of the code expects valid lists, in particular no duplicates.
These tests allow to catch bugs in filters (unlikely but possible)
and to give a clear message when the error comes from the user
((a)formats) or the application (buffersink).
If we decide to switch to a more efficient merging algorithm,
possibly sorting the lists, these functions will be the preferred
place for pre-processing, and can be renamed accordingly.
It will allow to refernce it as a whole without clunky macros.
Most of the changes have been automatically made with sed:
sed -i '
s/-> *in_formats/->incfg.formats/g;
s/-> *out_formats/->outcfg.formats/g;
s/-> *in_channel_layouts/->incfg.channel_layouts/g;
s/-> *out_channel_layouts/->outcfg.channel_layouts/g;
s/-> *in_samplerates/->incfg.samplerates/g;
s/-> *out_samplerates/->outcfg.samplerates/g;
' src/libavfilter/*(.)
Fixes: signed integer overflow: -1429092 * -32596 cannot be represented in type 'int'
Fixes: 24419/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_FFWAVESYNTH_fuzzer-5157849974702080
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Reviewed-by: Nicolas George <george@nsup.org>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
The latest builds of glslang introduce new libraries that need to be
linked for all symbols to be fully resolved.
This change will break building against older installations of glslang
and it's very hard to tell them apart as the library change upstream
was not accompanied by any version bump and no official release has
been made with this change it - just lots of people packaging up git
snapshots. So, apologies in advance.
Version 1.1 (FX Fighter) files all have a sample rate of 44100
in the header, but only play back correctly at 22050.
Force the sample rate to 22050 when reading, and restrict it
when muxing.
Since bae8844e35, the AVPacket that is
intended to be used to return the demuxed packet is automatically
unreferenced when the demuxer returns an error. This makes an
av_packet_unref() in the lavfi demuxer redundant.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
Although the ICC specifications say to check for this, libtiff doesn't
and neither does any other TIFF implementation, and the TIFF specs
say that Photoshop has a different way to encapsulate ICC profiles,
and are asking for advice on how to deal with it.
So basically, photoshop puts a different type than what's specified,
no other implementation checks for this, we do because we tried to
follow the specs although its harmless to not, and ran into this bug
because we didn't know about it.
Fixes: signed integer overflow: 998938090 + 1169275991 cannot be represented in type 'int'
Fixes: 23411/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_VP9_fuzzer-4644692330545152
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Fixes: signed integer overflow: 7958120835074169528 * 9 cannot be represented in type 'long long'
Fixes: 23382/clusterfuzz-testcase-minimized-ffmpeg_DEMUXER_fuzzer-6230683226996736
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
If encoding fails, the AVPacket that ought to contain the encoded packet
is already unreferenced generically.
Reviewed-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
Currently the utilized AVBPrint API is internally limited to unsigned
integers, so if we limit the file size as well as the amount to read
to UINT_MAX - 1, we do not require additional limiting to be performed
on the values.
This change is based on the fact that initially the 8*1024 value added
in 96d70694ae was only for the case where
the file size was not known. It was not a maximum file size limit.
In 2912118898 this was reworked to be
a maximum manifest file size limit, while its commit message appears
to only note that it added support for larger manifest file sizes.
This should enable various unfortunately large MPEG-DASH manifests,
such as Youtube's multi-megabyte live stream archives to load up
as well as bring back the original intent of the logic.