mirror of https://github.com/FFmpeg/FFmpeg.git synced 2025-03-03 14:32:16 +02:00

36963 Commits

Author SHA1 Message Date
Martin Storsjö
ecd343aa1f arm: vp9itxfm: Only reload the idct coeffs for the iadst_idct combination
This avoids reloading them if they haven't been clobbered, if the
first pass also was idct.

This is similar to what was done in the aarch64 version.

This is cherrypicked from libav commit
3c87039a404c5659ae9bf7454a04e186532eb40b.

Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2017-01-14 21:13:27 +01:00
Martin Storsjö
37cb224e3e aarch64: vp9itxfm: Don't repeatedly set x9 when nothing overwrites it
This is cherrypicked from libav commit
2f99117f6ff24ce5be2abb9e014cb8b86c2aa0e0.

Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2017-01-14 21:13:25 +01:00
Martin Storsjö
f69dd26df5 arm: vp9itxfm: Rename a macro parameter to fit better
Since the same parameter is used for both input and output,
the name inout is more fitting.

This matches the naming used below in the dmbutterfly macro.

This is cherrypicked from libav commit
79566ec8c77969d5f9be533de04b1349834cca62.

Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2017-01-14 21:13:21 +01:00
Martin Storsjö
4a5874ea8d arm/aarch64: vp9itxfm: Fix indentation of macro arguments
This is cherrypicked from libav commit
721bc37522c5c1d6a8c3cea5e9c3fcde8d256c05.

Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2017-01-14 21:13:19 +01:00
Martin Storsjö
a95e7de41d aarch64: vp9itxfm: Use w3 instead of x3 for the int eob parameter
The clobbering tests in checkasm are only invoked when testing
correctness, so this bug didn't show up when benchmarking the
dc-only version.

This is cherrypicked from libav commit
4d960a11855f4212eb3a4e470ce890db7f01df29.

Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2017-01-14 21:13:16 +01:00
Janne Grunau
a71cd8439f arm: vp9itxfm: Simplify the stack alignment code
This is one instruction less for thumb, and there are only
1/2 arm/thumb specific instructions.

This is cherrypicked from libav commit
e5b0fc170f85b00f7dd0ac514918fb5c95253d39.

Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2017-01-14 21:13:12 +01:00
Janne Grunau
cb220eeef9 aarch64: vp9: loop filter: replace 'orr; cbn?z' with 'adds; b.{eq,ne}'
The latter is 1 cycle faster on a cortex-a53, and since the operands are
bytewise (or larger) bitmasks (impossible to overflow to zero), both are
equivalent.

This is cherrypicked from libav commit
e7ae8f7a715843a5089d18e033afb3ee19ab3057.

Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2017-01-14 21:13:10 +01:00
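The equivalence claimed in the cb220eeef9 entry above can be checked with a small brute-force sketch. It assumes that "bytewise bitmask" means a 64-bit value in which every byte is either 0x00 or 0xFF (the shape a NEON compare produces); the helper names `byte_mask` and `tests_agree` are hypothetical and are not taken from the FFmpeg source.

```c
#include <assert.h>
#include <stdint.h>

/* Expand an 8-bit selector into a 64-bit mask whose bytes are 0x00 or 0xFF.
 * Hypothetical helper, used only for this illustration. */
static uint64_t byte_mask(unsigned sel)
{
    uint64_t m = 0;

    assert(sel < 256);
    for (int i = 0; i < 8; i++)
        if (sel & (1u << i))
            m |= (uint64_t)0xFF << (8 * i);
    return m;
}

/* Returns 1 if the tests "a | b == 0" (orr + cbz/cbnz) and "a + b == 0"
 * (adds + b.eq/b.ne) give the same answer for every pair of byte masks. */
static int tests_agree(void)
{
    for (unsigned i = 0; i < 256; i++)
        for (unsigned j = 0; j < 256; j++) {
            uint64_t a = byte_mask(i), b = byte_mask(j);
            int or_is_zero  = (a | b) == 0;
            int sum_is_zero = (uint64_t)(a + b) == 0;  /* wraps modulo 2^64 */

            if (or_is_zero != sum_is_zero)
                return 0;
        }
    return 1;
}
```

Under that assumption both operands are multiples of 255, so their sum can never equal 2^64, which is why the wrapped sum is zero only when both inputs are zero.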
Janne Grunau
62ea07d797 aarch64: vp9: use alternative returns in the core loop filter function
Since aarch64 has enough free general purpose registers, use them to
branch to the appropriate storage code. 1-2 cycles faster for the
functions using loop_filter 8/16, ... on a cortex-a53. Mixed results
(up to 2 cycles faster/slower) on a cortex-a57.

This is cherrypicked from libav commit
d7595de0b25e7064fd9e06dea5d0425536cef6dc.

Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2017-01-14 21:13:06 +01:00
Paul B Mahol
743052ec5b avcodec/cinepakenc: remove CVID from long description
Signed-off-by: Paul B Mahol <onemda@gmail.com>
2017-01-14 16:56:47 +01:00
Martin Vignali
31e722e9da libavcodec/psd : add test for channel depth/channel count in bitmap mode
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2017-01-14 04:52:43 +01:00
Paul B Mahol
2eaee6e79b avcodec/qdrw: skip long comment for now
Fixes part of #5918.

Signed-off-by: Paul B Mahol <onemda@gmail.com>
2017-01-13 21:19:17 +01:00
Steinar H. Gunderson
d68d7198be speedhq: Align blocks variable properly.
Seemingly ff_clear_block_sse assumed that the block array is aligned,
so make sure it is.

Fixes ticket #6079

Signed-off-by: James Almer <jamrial@gmail.com>
2017-01-13 16:47:53 -03:00
James Almer
6596b34954 avcodec/lossless_videodsp: add missing call to ff_llviddsp_init_ppc()
Signed-off-by: James Almer <jamrial@gmail.com>
2017-01-12 22:56:50 -03:00
James Almer
6d4c9f2ade lossless_videodsp: rename add_hfyu_left_pred_int16 to add_left_pred_int16
Signed-off-by: James Almer <jamrial@gmail.com>
2017-01-12 22:53:05 -03:00
James Almer
47f212329e huffyuvdsp: move functions only used by huffyuv from lossless_videodsp
Signed-off-by: James Almer <jamrial@gmail.com>
2017-01-12 22:53:05 -03:00
James Almer
cf9ef83960 huffyuvencdsp: move shared functions to a new lossless_videoencdsp context
Signed-off-by: James Almer <jamrial@gmail.com>
2017-01-12 22:53:04 -03:00
James Almer
30c1f27299 huffyuvencdsp: move functions only used by huffyuv from lossless_videodsp
Signed-off-by: James Almer <jamrial@gmail.com>
2017-01-12 22:53:04 -03:00
James Almer
5ac1dd8e23 lossless_videodsp: move shared functions from huffyuvdsp
Several codecs other than huffyuv use them.

Signed-off-by: James Almer <jamrial@gmail.com>
2017-01-12 22:53:04 -03:00
James Almer
1d4d0ee4b0 avutil/reverse: move the ff_reverse declaration to a separate header
Fixes compilation with hardcoded tables after eaff1aa09e90e2711207c9463db8bf8e8dec8178
and e71b8119e7db675dd2dac3f7fb069b0df2943c38

Reviewed-by: Timo Rothenpieler <timo@rothenpieler.org>
Signed-off-by: James Almer <jamrial@gmail.com>
2017-01-12 19:59:37 -03:00
James Almer
e71b8119e7 avcodec/mathops: add missing header for ff_reverse
Signed-off-by: James Almer <jamrial@gmail.com>
2017-01-11 21:18:03 -03:00
Derek Buitenhuis
14b9060160 hevc: Mark as having threadsafe init
Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
2017-01-11 12:21:43 -05:00
Steinar H. Gunderson
2a293ec7ac avcodec: add Newtek SpeedHQ decoder
This decoder can decode all existing SpeedHQ formats (SHQ0–5, 7, and 9),
including correct decoding of the alpha channel.

1080p is decoded in 142 fps on one core of my i7-4600U (2.1 GHz Haswell),
about evenly split between bitstream reader and IDCT. There is currently
no attempt at slice or frame threading, even though the format trivially
supports both.

NewTek very helpfully provided a full set of SHQ samples, as well as
source code for an SHQ2 encoder (not included) and assistance with
understanding some details of the format.
2017-01-11 16:02:10 +01:00
Steinar H. Gunderson
eaff1aa09e avcodec: move bitswap_32() into a header file
Allows more codecs than mpeg12video to make use of it.
2017-01-11 15:40:01 +01:00
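For readers unfamiliar with the helper moved in eaff1aa09e: bitswap_32() reverses the bit order of a 32-bit word. The sketch below shows the same operation using shifts and masks. This is a swapped-in technique chosen so the example needs no lookup table; it is not a copy of the FFmpeg implementation, and the name `reverse_bits32` is hypothetical.

```c
#include <stdint.h>

/* Reverse the bit order of a 32-bit word: bit i moves to bit 31 - i.
 * Each step swaps neighbouring groups of 1, 2, 4, 8 and 16 bits. */
static uint32_t reverse_bits32(uint32_t x)
{
    x = ((x >> 1) & 0x55555555u) | ((x & 0x55555555u) << 1);
    x = ((x >> 2) & 0x33333333u) | ((x & 0x33333333u) << 2);
    x = ((x >> 4) & 0x0F0F0F0Fu) | ((x & 0x0F0F0F0Fu) << 4);
    x = ((x >> 8) & 0x00FF00FFu) | ((x & 0x00FF00FFu) << 8);
    return (x >> 16) | (x << 16);
}
```

Applying the function twice returns the original value, which is a convenient sanity check for any bit-reversal routine.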
Paul B Mahol
107b3064d8 avcodec/wmaprodec: do not force extradata presence for XMA
Mainly useful for supporting decoding of headerless files.

Signed-off-by: Paul B Mahol <onemda@gmail.com>
2017-01-11 11:48:07 +01:00
Paul B Mahol
45cd50e5e2 avcodec/psd: fix ugly typo
Signed-off-by: Paul B Mahol <onemda@gmail.com>
2017-01-11 11:42:50 +01:00
Martin Vignali
658e626cc0 libavcodec/psd : add support for psd bitmap mode
Fixes ticket #6044

Based on patch by Carl Eugen Hoyos
2017-01-11 00:22:25 +01:00
Carl Eugen Hoyos
4313ed511a lavc/psd: Interpret DUOTONE as GRAYSCALE.
This is what gimp, ImageMagick and FreeImage do and what the
Adobe Photoshop file format specification suggests.
Fixes a sample from ticket #6045.

Reviewed-by: Martin Vignali
2017-01-11 00:17:59 +01:00
Steven Liu
d9c2cfd316 avcodec/bsf: fix resource leak in av_bsf_list_parse_str
cid: 1396268
When av_strdup(str) fails, lst needs to be released.

Reviewed-by: James Almer <jamrial@gmail.com>
Signed-off-by: Steven Liu <lq@chinaffmpeg.org>
2017-01-11 04:09:47 +08:00
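The leak pattern fixed in d9c2cfd316 can be sketched in plain C. Everything here is hypothetical scaffolding (`struct list`, `list_alloc`, `list_free`, `parse_str`, the injectable allocator and the `live_lists` counter); it is not the libavcodec code, and unlike the real function it frees the list on success as well, to keep the example short.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef void *(*alloc_fn)(size_t);

struct list { int unused; };
static int live_lists;              /* lists allocated and not yet freed */

static struct list *list_alloc(void)
{
    struct list *l = malloc(sizeof(*l));
    if (l)
        live_lists++;
    return l;
}

static void list_free(struct list **l)
{
    if (!*l)
        return;
    free(*l);
    *l = NULL;
    live_lists--;
}

/* Allocate a list, duplicate the string, and release the list on every
 * exit path, including the one where the duplication fails. */
static int parse_str(const char *str, alloc_fn alloc)
{
    struct list *lst;
    char *dup;
    int ret = 0;

    assert(str);
    lst = list_alloc();
    if (!lst)
        return -1;

    dup = alloc(strlen(str) + 1);
    if (!dup) {
        ret = -1;
        goto end;                   /* a plain "return" here leaks lst */
    }
    strcpy(dup, str);
    /* ... parsing of dup would happen here ... */
    free(dup);
end:
    list_free(&lst);
    return ret;
}

/* Allocator that always fails, to exercise the error path. */
static void *failing_alloc(size_t n)
{
    (void)n;
    return NULL;
}
```

Routing the error path through a single cleanup label is one common way to keep such allocations paired with their releases.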
Michael Niedermayer
f48b6b8b91 avcodec/tiff: Perform multiply in tiff_unpack_lzma() as 64bit
This should make no difference, as the value should not be able to be that large,
but it's more correct this way.

Fixes CID1348138

Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2017-01-10 00:55:15 +01:00
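The difference between a 32-bit and a 64-bit multiply, as addressed in f48b6b8b91, can be shown with a minimal sketch. The function names and parameters are hypothetical; unsigned types are used so that the wraparound in the narrow version is well defined in C, and the comments assume a platform where `int` is 32 bits wide.

```c
#include <stdint.h>

/* The product is evaluated in 32 bits and only then widened, so any
 * bits above bit 31 are already lost when the result is converted. */
static uint64_t product_narrow(uint32_t width, uint32_t lines)
{
    return width * lines;
}

/* Widening one operand first makes the multiplication itself 64-bit. */
static uint64_t product_wide(uint32_t width, uint32_t lines)
{
    return (uint64_t)width * lines;
}
```

With both arguments set to 65536 the narrow version wraps to 0, while the wide version returns 4294967296.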
Paul B Mahol
24d31a8074 avcodec/qdm2: make use of bytestream2
Signed-off-by: Paul B Mahol <onemda@gmail.com>
2017-01-09 18:43:19 +01:00
Jun Zhao
b53b3a4f6a lavc/vaapi_encode_h264: disable B frames in baseline profile
Disable B frames when using baseline/constrained baseline profile,
following H.264 spec Annex A.2.1.

Signed-off-by: Jun Zhao <jun.zhao@intel.com>
Signed-off-by: Yi A Wang <yi.a.wang@intel.com>
Signed-off-by: Mark Thompson <sw@jkqxz.net>
2017-01-09 00:28:08 +00:00
Michael Niedermayer
762bf6f4af avcodec/bsf: Fix av_bsf_list_free()
Negate null check
Fixes CID1396248

Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2017-01-08 15:26:01 +01:00
Michael Niedermayer
bd83c295fc avcodec/omx: Do not pass negative value into av_malloc()
Fixes CID1396849

Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2017-01-08 15:25:14 +01:00
foo86
000638431c avcodec/dca: add support for 20-bit XLL
Fixes ticket #6063.

Reviewed-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: James Almer <jamrial@gmail.com>
2017-01-07 11:28:12 -03:00
Paul B Mahol
90ac9f4094 avcodec: add QDMC decoder
Signed-off-by: Paul B Mahol <onemda@gmail.com>
2017-01-06 22:05:45 +01:00
Paul B Mahol
49633f9f74 avcodec/iff: add support for vertical word compression in ILBM
Signed-off-by: Paul B Mahol <onemda@gmail.com>
2017-01-06 21:45:52 +01:00
Kevin Wheatley
09905c412d libavcodec/exr: Fix blank output when data window != display window
There appears to be a bug in commit
1a08758e7c4e14a9ea8d2fef6c33ad411b2d3c40 relating to the handling of
ptr in decode_frame after decode_block is called. Before that commit,
ptr would have been incremented for each line in the data window. After
it, ptr is left at the start of the first included line rather than at
the line after the data window, so the code that sets the remaining
lines to 0 overwrites the whole image.

Fix by adjusting ptr to the correct line after decode_block returns.

Signed-off-by: Kevin Wheatley <kevin.j.wheatley@gmail.com>
2017-01-06 18:01:12 +01:00
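The pointer handling described in 09905c412d can be illustrated with a simplified sketch. It assumes a single 8-bit plane and a data window given as an inclusive line range; `decode_lines` and `fill_frame` are hypothetical stand-ins and do not reproduce the actual exr.c code.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for the block decoder: it writes the data-window lines
 * straight into the frame and does not advance the caller's pointer. */
static void decode_lines(uint8_t *frame, int stride, int ymin, int ymax)
{
    for (int y = ymin; y <= ymax; y++)
        memset(frame + (size_t)y * stride, 0xAB, stride);
}

static void fill_frame(uint8_t *frame, int stride, int height,
                       int ymin, int ymax)
{
    uint8_t *ptr = frame;

    assert(0 <= ymin && ymin <= ymax && ymax < height);

    /* Zero the lines above the data window. */
    for (int y = 0; y < ymin; y++) {
        memset(ptr, 0, stride);
        ptr += stride;
    }

    decode_lines(frame, stride, ymin, ymax);

    /* The adjustment the fix describes: move ptr past the decoded lines.
     * Without it, the loop below starts at ymin and erases the image. */
    ptr += (size_t)(ymax - ymin + 1) * stride;

    /* Zero the lines below the data window. */
    for (int y = ymax + 1; y < height; y++) {
        memset(ptr, 0, stride);
        ptr += stride;
    }
}
```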
Rostislav Pehlivanov
2d208aaabe imdct15: replace the FFT with a faster PFA FFT algorithm
This commit replaces the current inefficient non-power-of-two FFT with a
much faster FFT based on the Prime Factor Algorithm.
Although it is already much faster than the old algorithm without SIMD,
the new algorithm makes use of the already very thoroughly SIMD'd power
of two FFT, which improves performance even more across all platforms
which we have SIMD support for.

Most of the work was done by Peter Barfuss, who passed the code to me to
implement into the iMDCT and the current codebase. The code for a
5-point and 15-point FFT was derived from the previous implementation,
although it was optimized and simplified, which will make its future
SIMD easier. The 15-point FFT is currently using 6% of the current
overall decoder overhead.

The FFT can now easily be used as a forward transform by simply not
multiplying the 5-point FFT's imaginary component by -1 (which comes
from the fact that changing the complex exponential's angle by -1 also
changes the output by that) and by multiplying the "theta" angle of the
main exptab by -1. Hence the deliberately left multiplication by -1 at
the end.

FATE passes, and performance reports on other platforms/CPUs are
welcome.

Performance comparisons:

iMDCT, PFA:
101127 decicycles in speed,   32765 runs,      3 skips
iMDCT, Old:
211022 decicycles in speed,   32768 runs,      0 skips

Standalone FFT, 300000 transforms of size 960:
    PFA        Old FFT     kiss_fft    libfftw3f
    3.659695s, 15.726912s, 13.300789s, 1.182222s

Being only 3x slower than libfftw3f is a big achievement by itself.

There appears to be something capping the performance in the iMDCT side
of things, possibly during the pre-stage reindexing. However, it is
certainly fast enough for now.

Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
2017-01-05 22:32:02 +00:00
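The Prime Factor Algorithm mentioned in 2d208aaabe relies on index maps that remove the twiddle factors between the sub-transforms when the factor lengths are coprime. The sketch below checks the textbook Good-Thomas maps for 15 = 3 x 5 using integer arithmetic only. It illustrates the general technique, not the FFmpeg implementation; the function names are hypothetical, and the constants 10 and 6 follow from the Chinese remainder theorem (10 is 1 mod 3 and 0 mod 5, 6 is 0 mod 3 and 1 mod 5).

```c
enum { N1 = 3, N2 = 5, N = N1 * N2 };

/* Input map: n = (N2 * n1 + N1 * n2) mod N. */
static int in_index(int n1, int n2)
{
    return (N2 * n1 + N1 * n2) % N;
}

/* Output map from the Chinese remainder theorem:
 * k mod 3 == k1 and k mod 5 == k2. */
static int out_index(int k1, int k2)
{
    return (10 * k1 + 6 * k2) % N;
}

/* Returns 1 if both maps are permutations of 0..14 and the exponent n*k
 * of the 15-point transform splits as 5*n1*k1 + 3*n2*k2 (mod 15), i.e.
 * into a pure 3-point term and a pure 5-point term with no cross term. */
static int pfa_maps_ok(void)
{
    int seen_in[N] = { 0 }, seen_out[N] = { 0 };

    for (int a = 0; a < N1; a++)
        for (int b = 0; b < N2; b++) {
            seen_in[in_index(a, b)]++;
            seen_out[out_index(a, b)]++;
        }
    for (int i = 0; i < N; i++)
        if (seen_in[i] != 1 || seen_out[i] != 1)
            return 0;

    for (int n1 = 0; n1 < N1; n1++)
        for (int n2 = 0; n2 < N2; n2++)
            for (int k1 = 0; k1 < N1; k1++)
                for (int k2 = 0; k2 < N2; k2++) {
                    int full  = (in_index(n1, n2) * out_index(k1, k2)) % N;
                    int split = (N2 * n1 * k1 + N1 * n2 * k2) % N;

                    if (full != split)
                        return 0;
                }
    return 1;
}
```

Because the exponent splits cleanly, a 15-point transform can be computed as 3-point transforms followed by 5-point transforms without intermediate multiplications. The same idea applies when one factor is 15 and the other is a power of two.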
Rostislav Pehlivanov
4fdacf4cdb imdct15: remove the AArch64 assembly
Prep work for the next commit, which will add a new FFT algorithm
which makes the iMDCT over 3x faster than it is currently (standalone,
the FFT is with some framesizes over 10x faster).

The new FFT algorithm uses the already thoroughly SIMD'd power of two
FFT which already has SIMD for AArch64, so users of that platform will
still see an improvement.

The previous FFT+SIMD was barely 2.5x faster than the C versions on these
platforms.

Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
2017-01-05 22:32:02 +00:00
Steve Lhomme
fd0716b364 dxva2: make ff_dxva2_get_surface() static and rename it
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2017-01-05 23:18:36 +01:00
Carl Eugen Hoyos
e6050d81b0 lavc/Makefile: Clean up the amv encoder dependencies.
2017-01-05 12:17:54 +01:00
Michael Niedermayer
7ca2a23aaa avcodec/bitstream: Document the values supported for *_size in ff_init_vlc_sparse()
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2017-01-05 12:08:24 +01:00
Michael Niedermayer
8f1d18a91b avcodec/bitstream: assert that *_size in ff_init_vlc_sparse() is valid
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2017-01-05 12:08:23 +01:00
Andreas Cadhalpun
e8651f51aa wmavoice: validate block alignment
This prevents a division by zero crash in wmavoice_decode_packet.

The problem was introduced by commit
3deb4b54a24f8cddce463d9f5751b01efeb976af.

Reviewed-by: Ronald S. Bultje <rsbultje@gmail.com>
Signed-off-by: Andreas Cadhalpun <Andreas.Cadhalpun@googlemail.com>
2017-01-03 00:52:55 +01:00
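The kind of validation described in e8651f51aa can be sketched as follows. The function names, the -1 error value and the accepted range are assumptions made for illustration; the actual decoder's error codes and limits are not reproduced here.

```c
/* Hypothetical init-time check: reject a block alignment that would
 * later be used as a divisor. */
static int validate_block_align(int block_align)
{
    return block_align > 0 ? 0 : -1;
}

/* Hypothetical packet-time use of the value: the division is reached
 * only after validation has succeeded. */
static int blocks_in_packet(int packet_size, int block_align, int *n_blocks)
{
    if (validate_block_align(block_align) < 0)
        return -1;
    *n_blocks = packet_size / block_align;
    return 0;
}
```

Rejecting the value once, where it is first read, keeps later code paths free of repeated zero checks.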
Andreas Cadhalpun
91e6a64d2e wmavoice: truncate spillover_nbits if too large
This fixes triggering the av_assert0(ret <= tmp.size).

The problem was reintroduced by commit
7b27dd5c16de785297ce4de4b88afa0b6685f61d and originally fixed by
2a4700a4f03280fa8ba4fc0f8a9987bb550f0d1e.

Reviewed-by: Ronald S. Bultje <rsbultje@gmail.com>
Signed-off-by: Andreas Cadhalpun <Andreas.Cadhalpun@googlemail.com>
2017-01-03 00:51:58 +01:00
Michael Niedermayer
aa95292043 avcodec/x86/vc1dsp_mc: Fix build with NASM 2.09.10
make fate passes

Reviewed-by: "Ronald S. Bultje" <rsbultje@gmail.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2017-01-02 22:37:55 +01:00
John Comeau
d06518752b avcodec/x86/imdct36: fix building with nasm 2.11.05
fixes `operation size not specified` errors as described here:
http://stackoverflow.com/questions/36854583/compiling-ffmpeg-for-kali-linux-2

I rebuilt again with yasm and made sure it didn't break that.

Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2017-01-02 20:44:16 +01:00
Carl Eugen Hoyos
28307ef7e6 lavc/psd: Support indexed files.
Fixes ticket #6045.
2017-01-02 11:39:21 +01:00
Michael Niedermayer
68cdeb06de avcodec/tests/fft: Fix indention of dct_init()
Fixes CID1396253

Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2017-01-01 23:04:31 +01:00
Carl Eugen Hoyos
4acea512f3 lavc/mjpegdec: Do not overread too short JFIF tags.
Fixes ticket #6055.
2017-01-01 18:53:27 +01:00