1
0
mirror of https://github.com/FFmpeg/FFmpeg.git synced 2025-01-13 21:28:01 +02:00
Commit Graph

83120 Commits

Author SHA1 Message Date
Martin Storsjö
0ba0187535 aarch64: vp9mc: Fix a comment to refer to a register with the right name
This is cherrypicked from libav commit
85ad5ea72c.

Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2017-01-14 21:13:43 +01:00
Martin Storsjö
02cfb9a16e aarch64: vp9dsp: Fix vertical alignment in the init file
This is cherrypicked from libav commit
65074791e8.

Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2017-01-14 21:13:40 +01:00
Martin Storsjö
656d910981 arm: vp9mc: Fix vertical alignment of operands
This is cherrypicked from libav commit
c536e5e869.

Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2017-01-14 21:13:37 +01:00
Martin Storsjö
8b11a89c06 aarch64: vp9itxfm: Skip empty slices in the first pass of idct_idct 16x16 and 32x32
This work is sponsored by, and copyright, Google.

Previously all subpartitions except the eob=1 (DC) case ran with
the same runtime:

vp9_inv_dct_dct_16x16_sub16_add_neon:   1373.2
vp9_inv_dct_dct_32x32_sub32_add_neon:   8089.0

By skipping individual 8x16 or 8x32 pixel slices in the first pass,
we reduce the runtime of these functions like this:

vp9_inv_dct_dct_16x16_sub1_add_neon:     235.3
vp9_inv_dct_dct_16x16_sub2_add_neon:    1036.7
vp9_inv_dct_dct_16x16_sub4_add_neon:    1036.7
vp9_inv_dct_dct_16x16_sub8_add_neon:    1036.7
vp9_inv_dct_dct_16x16_sub12_add_neon:   1372.1
vp9_inv_dct_dct_16x16_sub16_add_neon:   1372.1
vp9_inv_dct_dct_32x32_sub1_add_neon:     555.1
vp9_inv_dct_dct_32x32_sub2_add_neon:    5190.2
vp9_inv_dct_dct_32x32_sub4_add_neon:    5180.0
vp9_inv_dct_dct_32x32_sub8_add_neon:    5183.1
vp9_inv_dct_dct_32x32_sub12_add_neon:   6161.5
vp9_inv_dct_dct_32x32_sub16_add_neon:   6155.5
vp9_inv_dct_dct_32x32_sub20_add_neon:   7136.3
vp9_inv_dct_dct_32x32_sub24_add_neon:   7128.4
vp9_inv_dct_dct_32x32_sub28_add_neon:   8098.9
vp9_inv_dct_dct_32x32_sub32_add_neon:   8098.8

I.e. in general a very minor overhead for the full subpartition case due
to the additional cmps, but a significant speedup for the cases when we
only need to process a small part of the actual input data.

This is cherrypicked from libav commits
cad42fadcd and
a0c443a398.

Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2017-01-14 21:13:32 +01:00
Martin Storsjö
388f6e6715 arm: vp9itxfm: Skip empty slices in the first pass of idct_idct 16x16 and 32x32
This work is sponsored by, and copyright, Google.

Previously all subpartitions except the eob=1 (DC) case ran with
the same runtime:

                                     Cortex A7       A8       A9      A53
vp9_inv_dct_dct_16x16_sub16_add_neon:   3188.1   2435.4   2499.0   1969.0
vp9_inv_dct_dct_32x32_sub32_add_neon:  18531.7  16582.3  14207.6  12000.3

By skipping individual 4x16 or 4x32 pixel slices in the first pass,
we reduce the runtime of these functions like this:

vp9_inv_dct_dct_16x16_sub1_add_neon:     274.6    189.5    211.7    235.8
vp9_inv_dct_dct_16x16_sub2_add_neon:    2064.0   1534.8   1719.4   1248.7
vp9_inv_dct_dct_16x16_sub4_add_neon:    2135.0   1477.2   1736.3   1249.5
vp9_inv_dct_dct_16x16_sub8_add_neon:    2446.7   1828.7   1993.6   1494.7
vp9_inv_dct_dct_16x16_sub12_add_neon:   2832.4   2118.3   2266.5   1735.1
vp9_inv_dct_dct_16x16_sub16_add_neon:   3211.7   2475.3   2523.5   1983.1
vp9_inv_dct_dct_32x32_sub1_add_neon:     756.2    456.7    862.0    553.9
vp9_inv_dct_dct_32x32_sub2_add_neon:   10682.2   8190.4   8539.2   6762.5
vp9_inv_dct_dct_32x32_sub4_add_neon:   10813.5   8014.9   8518.3   6762.8
vp9_inv_dct_dct_32x32_sub8_add_neon:   11859.6   9313.0   9347.4   7514.5
vp9_inv_dct_dct_32x32_sub12_add_neon:  12946.6  10752.4  10192.2   8280.2
vp9_inv_dct_dct_32x32_sub16_add_neon:  14074.6  11946.5  11001.4   9008.6
vp9_inv_dct_dct_32x32_sub20_add_neon:  15269.9  13662.7  11816.1   9762.6
vp9_inv_dct_dct_32x32_sub24_add_neon:  16327.9  14940.1  12626.7  10516.0
vp9_inv_dct_dct_32x32_sub28_add_neon:  17462.7  15776.1  13446.2  11264.7
vp9_inv_dct_dct_32x32_sub32_add_neon:  18575.5  17157.0  14249.3  12015.1

I.e. in general a very minor overhead for the full subpartition case due
to the additional loads and cmps, but a significant speedup for the cases
when we only need to process a small part of the actual input data.

In common VP9 content in a few inspected clips, 70-90% of the non-dc-only
16x16 and 32x32 IDCTs only have nonzero coefficients in the upper left
8x8 or 16x16 subpartitions respectively.

This is cherrypicked from libav commit
9c8bc74c2b.

Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2017-01-14 21:13:30 +01:00
Martin Storsjö
ecd343aa1f arm: vp9itxfm: Only reload the idct coeffs for the iadst_idct combination
This avoids reloading them if they haven't been clobbered, if the
first pass also was idct.

This is similar to what was done in the aarch64 version.

This is cherrypicked from libav commit
3c87039a40.

Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2017-01-14 21:13:27 +01:00
Martin Storsjö
37cb224e3e aarch64: vp9itxfm: Don't repeatedly set x9 when nothing overwrites it
This is cherrypicked from libav commit
2f99117f6f.

Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2017-01-14 21:13:25 +01:00
Martin Storsjö
f69dd26df5 arm: vp9itxfm: Rename a macro parameter to fit better
Since the same parameter is used for both input and output,
the name inout is more fitting.

This matches the naming used below in the dmbutterfly macro.

This is cherrypicked from libav commit
79566ec8c7.

Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2017-01-14 21:13:21 +01:00
Martin Storsjö
4a5874ea8d arm/aarch64: vp9itxfm: Fix indentation of macro arguments
This is cherrypicked from libav commit
721bc37522.

Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2017-01-14 21:13:19 +01:00
Martin Storsjö
a95e7de41d aarch64: vp9itxfm: Use w3 instead of x3 for the int eob parameter
The clobbering tests in checkasm are only invoked when testing
correctness, so this bug didn't show up when benchmarking the
dc-only version.

This is cherrypicked from libav commit
4d960a1185.

Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2017-01-14 21:13:16 +01:00
Janne Grunau
a71cd8439f arm: vp9itxfm: Simplify the stack alignment code
This is one instruction less for thumb, and only have got
1/2 arm/thumb specific instructions.

This is cherrypicked from libav commit
e5b0fc170f.

Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2017-01-14 21:13:12 +01:00
Janne Grunau
cb220eeef9 aarch64: vp9: loop filter: replace 'orr; cbn?z' with 'adds; b.{eq,ne};
The latter is 1 cycle faster on a cortex-53 and since the operands are
bytewise (or larger) bitmask (impossible to overflow to zero) both are
equivalent.

This is cherrypicked from libav commit
e7ae8f7a71.

Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2017-01-14 21:13:10 +01:00
Janne Grunau
62ea07d797 aarch64: vp9: use alternative returns in the core loop filter function
Since aarch64 has enough free general purpose registers use them to
branch to the appropiate storage code. 1-2 cycles faster for the
functions using loop_filter 8/16, ... on a cortex-a53. Mixed results
(up to 2 cycles faster/slower) on a cortex-a57.

This is cherrypicked from libav commit
d7595de0b2.

Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2017-01-14 21:13:06 +01:00
Michael Bradshaw
3ac46a0a62 ffmpeg: Add -time_base option to hint the time base
Signed-off-by: Michael Bradshaw <mjbshaw@google.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2017-01-14 20:03:56 +01:00
Paul B Mahol
743052ec5b avcodec/cinepakenc: remove CVID from long description
Signed-off-by: Paul B Mahol <onemda@gmail.com>
2017-01-14 16:56:47 +01:00
Carl Eugen Hoyos
935404923d Cosmetics: Reindent after last commit. 2017-01-14 06:07:06 +01:00
Carl Eugen Hoyos
c723108e25 lavf/matroskaenc: Do not write two CodecID elements for rawvideo.
Fixes ticket #6068.
2017-01-14 06:06:05 +01:00
Martin Vignali
1412e5a004 fate/psd : add test for bitmap and duotone
The duotone file is interpreted as gray

Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2017-01-14 04:52:43 +01:00
Martin Vignali
31e722e9da libavcodec/psd : add test for channel depth/channel count in bitmap mode
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2017-01-14 04:52:43 +01:00
Matthieu Bouron
e109c54a69 swresample/arm: cosmetic fixes 2017-01-13 21:24:25 +01:00
Matthieu Bouron
0265aec565 swresample/aarch64: add ff_resample_common_apply_filter_{x4,x8}_{float,s16}_neon 2017-01-13 21:24:19 +01:00
Paul B Mahol
2eaee6e79b avcodec/qdrw: skip long comment for now
Fixes part of #5918.

Signed-off-by: Paul B Mahol <onemda@gmail.com>
2017-01-13 21:19:17 +01:00
Steinar H. Gunderson
d68d7198be speedhq: Align blocks variable properly.
Seemingly ff_clear_block_sse assumed that the block array is aligned,
so make sure it is.

Fixes ticket #6079

Signed-off-by: James Almer <jamrial@gmail.com>
2017-01-13 16:47:53 -03:00
James Almer
6596b34954 avcodec/lossless_videodsp: add missing call to ff_llviddsp_init_ppc()
Signed-off-by: James Almer <jamrial@gmail.com>
2017-01-12 22:56:50 -03:00
James Almer
6d4c9f2ade lossless_videodsp: rename add_hfyu_left_pred_int16 to add_left_pred_int16
Signed-off-by: James Almer <jamrial@gmail.com>
2017-01-12 22:53:05 -03:00
James Almer
47f212329e huffyuvdsp: move functions only used by huffyuv from lossless_videodsp
Signed-off-by: James Almer <jamrial@gmail.com>
2017-01-12 22:53:05 -03:00
James Almer
cf9ef83960 huffyuvencdsp: move shared functions to a new lossless_videoencdsp context
Signed-off-by: James Almer <jamrial@gmail.com>
2017-01-12 22:53:04 -03:00
James Almer
30c1f27299 huffyuvencdsp: move functions only used by huffyuv from lossless_videodsp
Signed-off-by: James Almer <jamrial@gmail.com>
2017-01-12 22:53:04 -03:00
James Almer
5ac1dd8e23 lossless_videodsp: move shared functions from huffyuvdsp
Several codecs other than huffyuv use them.

Signed-off-by: James Almer <jamrial@gmail.com>
2017-01-12 22:53:04 -03:00
Steven Liu
3222786c5a avformat/hlsenc: refine the hlsenc code
because the oc have been  potint to hls->avf or hls->vtt_avf
here is not needed point once again

Signed-off-by: Steven Liu <lq@chinaffmpeg.org>
2017-01-13 07:59:48 +08:00
Steven Liu
b97e9cba0b avformat/hlsenc: fix hlsenc bug at windows system
when hlsenc use flag second_level_segment_index,
second_level_segment_size and second_level_segment_duration,
the rename is ok but the output filename always use the old filename
so move the rename operation after the close the ts file and
before open new segment

Reported-by: Christian Johannesen <chrisjohannesen@gmail.com>
Reviewed-by: Bodecs Bela <bodecsb@vivanet.hu>
Signed-off-by: Steven Liu <lq@chinaffmpeg.org>
2017-01-13 07:57:22 +08:00
Steven Liu
aa7982577c cmdutils_opencl: fix resource_leak cid 1396852
CID: 1396852
check the devices_list alloc status,
and release the devices_list when alloc devices error

Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: Steven Liu <lq@chinaffmpeg.org>
2017-01-13 07:54:49 +08:00
Thomas Turner
08fdf965c9 avutil/tests/audio_fifo.c: pass by reference for efficiency and change datatype to const
Signed-off-by: Thomas Turner <thomastdt@googlemail.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2017-01-13 00:17:10 +01:00
James Almer
1d4d0ee4b0 avutil/reverse: move the ff_reverse declaration to a separate header
Fixes compilation with hardcoded tables after eaff1aa09e
and e71b8119e7

Reviewed-by: Timo Rothenpieler <timo@rothenpieler.org>
Signed-off-by: James Almer <jamrial@gmail.com>
2017-01-12 19:59:37 -03:00
Carl Eugen Hoyos
2f94b305ac lavf/mxf: Add a universal label for ProRes used in FCP.
Fixes ticket #6075.
2017-01-12 23:24:39 +01:00
Sergey Kudryashov
a9b33b5a37 libavfilter/af_biquads: warn about clipping only after frame with clipping 2017-01-12 19:52:29 +01:00
Nicolas George
f7191ccad6 lavfi: remove stray semicolons.
Hopefully fix compilation with suncc.
2017-01-12 15:07:18 +01:00
Carl Eugen Hoyos
f31bac596f lavf/dss: Do not fail randomly if dss_sp input contains 0xff.
Fixes decoding the sample from ticket #6072 with ffmpeg.
2017-01-12 15:02:42 +01:00
Nicolas George
aaae459a85 lavfi: reindent after previous commit. 2017-01-12 14:06:16 +01:00
Nicolas George
912969a33e lavfi/buffersink: move to the new design. 2017-01-12 14:06:16 +01:00
Nicolas George
32c59a115d lavfi: do not call ff_filter_frame() with activate.
avfilter_graph_request_oldest() does work that should be done by
either the filter or the application.

The principle of this function, calling ff_request_frame() from
outside the filter was always shaky. This version is less elegant
since it requires making special cases for each filter, but it
is more robust since it no longer calls ff_request_frame()
directly without notifying the filter.

Eventually, avfilter_graph_request_oldest() will be deprecated
for a function to just run the graph.
2017-01-12 14:06:16 +01:00
Nicolas George
c619a4e525 lavfi: make two functions static.
ff_request_frame_to_filter() and ff_filter_frame_to_filter()
are only used in avfilter.c.
2017-01-12 14:06:16 +01:00
Nicolas George
ae4650f0b9 lavfi: disallow ff_request_frame for filters using activate.
Having two different functions allows to have stricter tests
and detect errors earlier.
2017-01-12 14:06:16 +01:00
Nicolas George
9eb4c79afd lavfi: add ff_inlink_request_frame(). 2017-01-12 14:06:16 +01:00
Nicolas George
d3cb140433 lavfi: move ff_update_link_current_pts() into the utility functions.
It does not change anything for the existing filters and makes
better code fatrorization when future code will use the utility
functions.
2017-01-12 14:06:16 +01:00
Nicolas George
7910127a8e lavfi: cosmetic: remove forward declaration. 2017-01-12 14:06:16 +01:00
Nicolas George
3ff01feda3 lavfi: add AVFilter.activate. 2017-01-12 14:06:16 +01:00
Nicolas George
db4a71c0ff lavfi: use the consume helpers in ff_filter_frame_to_filter(). 2017-01-12 14:06:16 +01:00
Nicolas George
d360ddf03b lavfi: add helpers to consume frames from link FIFOs. 2017-01-12 14:06:16 +01:00
Nicolas George
2e5af443c3 lavfi: pass min explicitly to samples_ready(). 2017-01-12 14:06:16 +01:00