1
0
mirror of https://github.com/FFmpeg/FFmpeg.git synced 2024-11-26 19:01:44 +02:00
Commit Graph

5181 Commits

Author SHA1 Message Date
Lynne
8e94b7cff0
lavu/tx: invert permutation lookups
out[lut[i]] = in[i] lookups were 4.04 times(!) slower than
out[i] = in[lut[i]] lookups for an out-of-place FFT of length 4096.

The permutes remain unchanged for anything but out-of-place monolithic
FFT, as those benefit quite a lot from the current order (it means
there's only 1 lookup necessary to add to an offset, rather than
a full gather).

The code was based around non-power-of-two FFTs, so this wasn't
benchmarked early on.
2021-02-27 04:21:05 +01:00
Lynne
9ddaf0c9f0
lavu/tx: simplify in-place permute search function 2021-02-27 04:21:03 +01:00
Lynne
10341743d2
lavu/tx: require output argument to match input for inplace transforms
This simplifies some assembly code by a lot, by either saving a branch
or saving an entire duplicated function.
2021-02-26 05:42:24 +01:00
James Almer
45a2902976 avutil/buffer: free all pooled buffers immediately after uninitializing the pool
No buffer will be fetched from the pool after it's uninitialized, so there's
no benefit from waiting until every single buffer has been returned to it
before freeing them all.
This should free some memory in certain scenarios, which can be beneficial in
low memory systems.

Based on a patch by Jonas Karlman.

Reviewed-by: Anton Khirnov <anton@khirnov.net>
Signed-off-by: James Almer <jamrial@gmail.com>
2021-02-24 10:45:30 -03:00
Andreas Rheinhardt
c7c7aa85b7 avutil/tx: Fix declaration after statement
Reviewed-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
2021-02-22 10:03:32 +01:00
Martin Storsjö
ee040a7fc2 arm/aarch64: Use mach_absolute_time as timer on apple platforms
This is much less precise than the cycle counter register, but
the cycle counter register is not available on apple platforms
(and on linux, it requires a kernel module for allowing user mode
access).

Signed-off-by: Martin Storsjö <martin@martin.st>
2021-02-21 22:41:34 +02:00
Lynne
5ca40d6d94
lavu/tx: support in-place FFT transforms
This commit adds support for in-place FFT transforms. Since our
internal transforms were all in-place anyway, this only changes
the permutation on the input.

Unfortunately, research papers were of no help here. All focused
on dry hardware implementations, where permutes are free, or on
software implementations where binary bloat is of no concern so
storing dozen times the transforms for each permutation and version
is not considered bad practice.
Still, for a pure C implementation, it's only around 28% slower
than the multi-megabyte FFTW3 in unaligned mode.

Unlike a closed permutation like with PFA, split-radix FFT bit-reversals
contain multiple NOPs, multiple simple swaps, and a few chained swaps,
so regular single-loop single-state permute loops were not possible.
Instead, we filter out parts of the input indices which are redundant.
This allows for a single branch, and with some clever AVX512 asm,
could possibly be SIMD'd without refactoring.

The inplace_idx array is guaranteed to never be larger than the
revtab array, and in practice only requires around log2(len) entries.

The power-of-two MDCTs can be done in-place as well. And it's
possible to eliminate a copy in the compound MDCTs too, however
it'll be slower than doing them out of place, and we'd need to dirty
the input array.
2021-02-21 17:05:16 +01:00
Lynne
aa34e99f88
lavu/tx: space out enum AVTXType values with newlines
Makes separation clearer.
2021-02-21 17:05:04 +01:00
Andreas Rheinhardt
c9d9c60746 avutil/video_enc_params: Check for truncation before creating buffer
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
2021-02-19 07:45:39 +01:00
Andreas Rheinhardt
39df279c74 avutil/video_enc_params: Combine overflow checks
This patch also fixes a -Wtautological-constant-out-of-range-compare
warning from Clang and a -Wtype-limits warning from GCC on systems
where size_t is 64bits and unsigned 32bits. The reason for this seems
to be that variable (whose value derives from sizeof() and can therefore
be known at compile-time) is used instead of using sizeof() directly in
the comparison.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
2021-02-19 07:45:39 +01:00
Andreas Rheinhardt
bd50e715a9 avutil/common: Move everything inside inclusion guards
libavutil/common.h is a public header that provides generic math
functions whereas libavutil/intmath.h is a private header that contains
plattform-specific optimized versions of said math functions. common.h
includes intmath.h (when building the FFmpeg libraries) so that the
optimized versions are used for them.

This interdependency sometimes causes trouble: intmath.h once contained
an inlined ff_sqrt function that relied upon av_log2_16bit. In case there
was no optimized logarithm available on this plattform, intmath.h needed
to include common.h to get the generic implementation and this has been
done after the optimized versions (if any) have been provided so that
common.h used the optimized versions; it also needed to be done before
ff_sqrt. Yet when intmath.h was included from common.h and if an ordinary
inclusion guard was used by common.h, the #include "common.h" in intmath.h
was a no-op and therefore av_log2_16bit was still unknown at the end of
intmath.h (and also in ff_sqrt) if no optimized version was available.

Before a955b59658 this was solved by
duplicating the #ifndef av_log2_16bit check after the inclusion of
common.h in intmath.h; said commit instead moved these checks to the
end of common.h, outside the inclusion guards and made common.h include
itself to get these unguarded defines. This is still the current
state of affairs.

Yet this is unnecessary since 9734b8ba56
as said commit removed ff_sqrt as well as the #include "common.h" from
intmath.h. Therefore this commit moves everything inside the inclusion
guards and makes common.h not include itself.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
2021-02-11 09:07:10 +01:00
Michael Niedermayer
7a23952614 avutil/mathematics: Fix undefined negation in av_compare_ts()
Fixes: negation of -9223372036854775808 cannot be represented in type 'int64_t' (aka 'long'); cast to an unsigned type to negate this value to itself
Fixes: 29437/clusterfuzz-testcase-minimized-ffmpeg_dem_MOV_fuzzer-4748510022991872

Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2021-02-10 12:28:29 +01:00
Michael Niedermayer
1bda9bb68a libavutil/common: Add FFABS64U()
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2021-02-10 12:28:29 +01:00
Michael Niedermayer
8574fcbfc7 libavutil/eval: Remove CONFIG_TRAPV special handling
Fixes: division by zero
Fixes: 29555/clusterfuzz-testcase-minimized-ffmpeg_dem_VIVO_fuzzer-5149951447400448

Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2021-02-10 12:28:29 +01:00
Guo, Yejun
149bfc2445 libavutil/frame.h: correct typo for AVFilmGrainParams in comment 2021-01-27 13:13:47 +08:00
Michael Niedermayer
5dd9567080 avutil/common: Add FFABSU() for a signed -> unsigned ABS
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2021-01-26 18:37:12 +01:00
Marton Balint
16766bf8a8 Revert "avutil/timecode: fix sscanf format string with garbage at the end"
This reverts commit 6696a07ac6.

It is wrong to restrict timecodes to always contain leading zeros or for hours
or frames to be 2 chars only.

Signed-off-by: Marton Balint <cus@passwd.hu>
2021-01-23 19:54:14 +01:00
Xu Guangxin
ae97f69ce1 avutils/vulkan: hwmap, respect src frame resolution
fixes http://trac.ffmpeg.org/ticket/9055

The hw decoder may allocate a large frame from AVHWFramesContext, and adjust width and height based on bitstream.
We need to use resolution from src frame instead of AVHWFramesContext.

test command:
ffmpeg -loglevel debug -hide_banner -hwaccel vaapi -init_hw_device vaapi=va:/dev/dri/renderD128 -hwaccel_device va -hwaccel_output_format vaapi -init_hw_device vulkan=vulk -filter_hw_device vulk -i 1920x1080.264 -c:v libx264 -r:v 30 -profile:v high -preset veryfast -vf "hwmap,chromaber_vulkan=0:0,hwdownload,format=nv12" -map 0 -y vaapiouts.mkv

expected:
No green bar at bottom.
2021-01-22 04:30:42 +01:00
rcombs
eabf5e6d6b All: update names in copyright headers 2021-01-20 01:02:56 -06:00
Michael Niedermayer
1b19057396 avutil/timecode: Avoid undefined behavior with large framenum
Fixes: signed integer overflow: 2147462079 + 2149596 cannot be represented in type 'int'
Fixes: 27565/clusterfuzz-testcase-minimized-ffmpeg_DEMUXER_fuzzer-5091972813160448

Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2021-01-19 00:05:50 +01:00
Limin Wang
6696a07ac6 avutil/timecode: fix sscanf format string with garbage at the end
Signed-off-by: Limin Wang <lance.lmwang@gmail.com>
2021-01-16 08:51:11 +08:00
James Almer
f6477ac9f4 avutil/tx: use ENOSYS instead of ENOTSUP
It's the standard error code used across the codebase to signal unimplemented
or unsupported features.

Signed-off-by: James Almer <jamrial@gmail.com>
2021-01-13 23:02:47 -03:00
Lynne
06a8596825
lavu: support arbitrary-point FFTs and all even (i)MDCT transforms
This patch adds support for arbitrary-point FFTs and all even MDCT
transforms.
Odd MDCTs are not supported yet as they're based on the DCT-II and DCT-III
and they're very niche.

With this we can now write tests.
2021-01-13 17:34:13 +01:00
Michael Niedermayer
90e4862ffa avutil/eval: Unconditionally check argument of e_div
Fixes: division by zero
Fixes: 26451/clusterfuzz-testcase-minimized-ffmpeg_dem_VIVO_fuzzer-4756955832516608

Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2021-01-11 23:21:05 +01:00
Lynne
91e1625db1
lavu/tx: clip when converting table values to fixed-point
INT32_MAX (2147483647) isn't exactly representable by a floating point
value, with the closest being 2147483648.0. So when rescaling a value
of 1.0, this could overflow when casting the 64-bit value returned from
lrintf() into 32 bits.
Unfortunately the properties of integer overflows don't match up well
with how a Fourier Transform operates. So clip the value before
casting to a 32-bit int.

Should be noted we don't have overflows with the table values we're
currently using. However, converting a Kaiser-Bessel window function
with a length of 256 and a parameter of 5.0 to fixed point did create
overflows. So this is more of insurance to save debugging time
in case something changes in the future.
The macro is only used during init, so it being a little slower is
not a problem.
2021-01-09 20:54:56 +01:00
Andreas Rheinhardt
2c6f532e0a Mark some pointers as const
Reviewed-by: Lynne <dev@lynne.ee>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
2021-01-01 15:25:48 +01:00
Anton Khirnov
baecaa16c1 mpegvideo: use the AVVideoEncParams API for exporting QP tables
Do it only when requested with the AV_CODEC_EXPORT_DATA_VIDEO_ENC_PARAMS
flag.

Drop previous code using the long-deprecated AV_FRAME_DATA_QP_TABLE*
API. Temporarily disable fate-filter-pp, fate-filter-pp7,
fate-filter-spp. They will be reenabled once these filters are converted
in following commits.
2021-01-01 14:23:19 +01:00
Anton Khirnov
e15371061d lavu/mem: move the DECLARE_ALIGNED macro family to mem_internal on next+1 bump
They are not properly namespaced and not intended for public use.
2021-01-01 14:14:57 +01:00
Anton Khirnov
c8c2dfbc37 lavu: move LOCAL_ALIGNED from internal.h to mem_internal.h
That is a more appropriate place for it.
2021-01-01 14:11:01 +01:00
Lynne
53c56585a6
hwcontext_drm: make dependency on Linux kernel headers optional
This was introduced in version 4.6. And may not exist all without an
optional package. So to prevent a hard dependency on needing the Linux
kernel headers to compile, make this optional.

Also ignore the status of the ioctl, since it may fail on older kernels
which don't support it. It's okay to ignore as its not fatal and any
serious errors will be caught later by the mmap call.
2020-12-30 23:14:46 +01:00
Marvin Scholz
d67c6c7f6f lavu: use address-of operator checking clock_gettime
When targeting a recent enough macOS/iOS version that has clock_gettime
it won't be a weak symbol, in which case clang warns for this check
as it's always true:

  warning: address of function 'clock_gettime' will always
  evaluate to 'true'

This warning is silenced by using the address-of operator to make
the intent explicit.
2020-12-28 01:12:26 -03:00
Lynne
b51b9bbd42
hwcontext_vulkan: wait and signal semaphores when transferring to CUDA
Same as when downloading. Not sure why this isn't done, probably
because the CUDA code predates the sync mechanism we settled on.
2020-12-05 23:53:23 +01:00
Limin Wang
48235c8263 avutil/opt: add AV_OPT_FLAG_DEPRECATED option
Signed-off-by: Limin Wang <lance.lmwang@gmail.com>
2020-12-05 09:00:53 +08:00
Marton Balint
eca12f4d5a avutil/timecode: add av_timecode_init_from_components
Signed-off-by: Marton Balint <cus@passwd.hu>
2020-12-03 18:32:54 +01:00
Marton Balint
2d90d51c56 avutil/timecode: allow drop frame timecodes for multiples of 30000/1001 fps
Signed-off-by: Marton Balint <cus@passwd.hu>
2020-12-03 18:32:13 +01:00
James Almer
53b4550bdd avutil/film_grain_params: fix doxy for ar_coeff_* fields
Signed-off-by: James Almer <jamrial@gmail.com>
2020-12-03 13:25:21 -03:00
Paul B Mahol
13df9bfbcb avutil/avsscanf: fix possible overreads when dealing with %c or %s 2020-12-02 13:54:53 +01:00
Lynne
659e6e9c88
hwcontext_vulkan: reduce priority for PACK32 formats
Due to some endian-dependent overlap, these should be used last.
2020-11-27 02:58:02 +01:00
James Almer
ee61d4dc68 avutil/film_grain_params: add more details to some AVFilmGrainAOMParams fields
Signed-off-by: James Almer <jamrial@gmail.com>
2020-11-26 22:35:40 -03:00
Lynne
2ba04670c3
lavu/film_grain_params: fix typo in type enum
Ref: xkcd #1015
2020-11-27 02:30:01 +01:00
Niklas Haas
3fbc74582f
hwcontext_vulkan: optionally enable more functionality
These two extensions and two features are both optionally used by
libplacebo to speed up rendering, so it makes sense for libavutil to
automatically enable them as well.
2020-11-25 23:14:47 +01:00
Lynne
2aeb152653
hwcontext_vulkan: support additional pixel formats
We support every single packed format possible now.
There are some fringe leftover mappings which are uninteresting.
2020-11-25 23:14:47 +01:00
Lynne
48b3532183
hwcontext_vulkan: fix incorrect A/0BGR mapping
Vulkan formats with a PACK suffix define native endianess.
Vulkan formats without a PACK suffix are in bytestream order.

Pixel formats with a LE/BE suffix define endianess.
Pixel formats without LE/BE suffix are in bytestream order.
2020-11-25 23:14:46 +01:00
Lynne
3aa8de12ab
hwcontext_vulkan: simplify plane size calculations and support 4-plane formats
Needed to support YUVA.
2020-11-25 23:14:46 +01:00
Lynne
7b274a9b89
hwcontext_vulkan: do not segfault when failing to init a AVHWFramesContext
frames_uninit is always called on failure, and the free_exec_ctx function
did not zero the pool when freeing it, so it resulted in a double free.
2020-11-25 23:14:46 +01:00
Lynne
18a6535b08
hwcontext_vulkan: always attempt to map host memory when transferring
This relies on the fact that host memory is always going to be required
to be aligned to the platform's page size, which means we can adjust
the pointers when we map them to buffers and therefore skip an entire
copy. This has already had extensive testing in libplacebo without
problems, so its safe to use here as well.

Speeds up downloads and uploads on platforms which do not pool their
memory hugely, but less so on platforms that do.

We can pool the buffers ourselves, but that can come as a later patch
if necessary.
2020-11-25 23:14:01 +01:00
Lynne
9cf1811d3d
hwcontext_vulkan: check for memory size before choosing type
It makes allocation a bit more robust in case some weird device with
weird drivers which segments memory in weird ways appears.
2020-11-25 23:06:36 +01:00
Lynne
ff29ca2f1f
hwcontext_vulkan: correctly access the p->extensions bitmask
Its a 64-bit bitfield being put directly into an int.
2020-11-25 23:06:36 +01:00
Lynne
b83e0560f7
hwcontext_vulkan: unify download/upload functions
They were identical, save for variable names and order.
2020-11-25 23:06:35 +01:00
Lynne
b4f9d05301
hwcontext_vulkan: add VkExternalMemoryBufferCreateInfo to imported buffers
Its a validation layer thing.
2020-11-25 23:06:35 +01:00