1
0
mirror of https://github.com/FFmpeg/FFmpeg.git synced 2025-01-08 13:22:53 +02:00
Commit Graph

93617 Commits

Author SHA1 Message Date
Martin Storsjö
fecf75a5c4 aarch64: vp8: Remove superfluous includes
This fixes building with MSVC, which lacks unistd.h.

Signed-off-by: Martin Storsjö <martin@martin.st>
(cherry picked from commit ad32f7b126)
2019-02-19 23:42:16 +02:00
Martin Storsjö
7ddfa5e908 aarch64: vp8: Fix assembling with armasm64
Signed-off-by: Martin Storsjö <martin@martin.st>
(cherry picked from commit 2eeac79936)
2019-02-19 23:42:03 +02:00
Martin Storsjö
c950beb68d aarch64: vp8: Fix assembling with clang
This also partially fixes assembling with MS armasm64 (via
gas-preprocessor).

The movrel macro invocations need to pass the offset via a separate
parameter. Mach-o and COFF relocations don't allow a negative
offset to a symbol, which is handled properly if the offset is passed
via the parameter. If no offset parameter is given, the macro
evaluates to something like "adrp x17, subpel_filters-16+(0)", which
older clang versions also fail to parse (the older clang versions
only support one single offset term, although it can be a parenthesis.

Signed-off-by: Martin Storsjö <martin@martin.st>
(cherry picked from commit 26d7af4c38)
2019-02-19 23:41:47 +02:00
Tomas Härdin
abc5ac3cf5 palettegen: Fill with last color, not black
If we fill with black then the generated palette will have one color more
than what the user requested. This also resulted in unwanted black specks in
the output of paletteuse, especially when generating small palettes.
2019-02-19 21:29:03 +01:00
Matthew Fearnley
f2e89fe4d3 libavcodec/zmbvenc: motion estimation improvements/bug fixes:
- Clamp ME range to -64..63 (prevents corruption when me_range is too high)
- Allow MV's up to *and including* the positive range limit
- Allow out-of-edge ME by padding the prev buffer with a border of 0's
- Try previous MV before checking the rest (improves speed in some cases)
- More robust logic in code - ensure *mx,*my,*xored are updated together
2019-02-19 21:25:14 +01:00
Matthew Fearnley
2d80b56ce0 libavcodec/zmbvenc: block scoring improvements/bug fixes
- Improve block choices by counting 0-bytes in the entropy score
- Make histogram use uint16_t type, to allow byte counts from 16*16
(current block size) up to 255*255 (maximum allowed 8bpp block size)
- Make sure score table is big enough for a full block's worth of bytes
- Calculate *xored without using code in inner loop
2019-02-19 21:25:14 +01:00
Martin Storsjö
7e42d5f0ab aarch64: vp8: Optimize vp8_idct_add_neon for aarch64
The previous version was a pretty exact translation of the arm
version. This version does do some unnecessary arithemetic (it does
more operations on vectors that are only half filled; it does 4
uaddw and 4 sqxtun instead of 2 of each), but it reduces the overhead
of packing data together (which could be done for free in the arm
version).

This gives a decent speedup on Cortex A53, a minor speedup on
A72 and a very minor slowdown on Cortex A73.

Before:        Cortex A53    A72    A73
vp8_idct_add_neon:   79.7   67.5   65.0
After:
vp8_idct_add_neon:   67.7   64.8   66.7

Signed-off-by: Martin Storsjö <martin@martin.st>
2019-02-19 11:46:28 +02:00
Martin Storsjö
49f9c4272c aarch64: vp8: Skip saturating in shrn in ff_vp8_idct_add_neon
The original arm version didn't do saturation here. This probably
doesn't make any difference for performance, but reduces the
differences.

Signed-off-by: Martin Storsjö <martin@martin.st>
2019-02-19 11:46:24 +02:00
Martin Storsjö
37394ef01b aarch64: vp8: Optimize put_epel16_h6v6 with vp8_epel8_v6_y2
This makes it similar to put_epel16_v6, and gives a large speedup
on Cortex A53, a minor speedup on A72 and a very minor slowdown on
A73.

Before:                 Cortex A53     A72     A73
vp8_put_epel16_h6v6_neon:   2211.4  1586.5  1431.7
After:
vp8_put_epel16_h6v6_neon:   1736.9  1522.0  1448.1

Signed-off-by: Martin Storsjö <martin@martin.st>
2019-02-19 11:46:21 +02:00
Martin Storsjö
cef914e083 arm: vp8: Optimize put_epel16_h6v6 with vp8_epel8_v6_y2
This makes it similar to put_epel16_v6, and gives a 10-25%
speedup of this function.

Before:                   Cortex A7       A8       A9      A53     A72
vp8_put_epel16_h6v6_neon:    3058.0   2218.5   2459.8   2183.0  1572.2
After:
vp8_put_epel16_h6v6_neon:    2670.8   1934.2   2244.4   1729.4  1503.9

Signed-off-by: Martin Storsjö <martin@martin.st>
2019-02-19 11:46:18 +02:00
Martin Storsjö
e39a9212ab aarch64: vp8: Port bilin functions from arm version
Cortex A53     A72     A73
vp8_put_bilin4_h_c:        303.8   102.2   161.8
vp8_put_bilin4_h_neon:     100.0    40.9    41.2
vp8_put_bilin4_hv_c:       322.8   201.0   305.9
vp8_put_bilin4_hv_neon:    156.8    72.6    77.0
vp8_put_bilin4_v_c:        304.7   101.7   166.5
vp8_put_bilin4_v_neon:      82.7    41.2    33.0
vp8_put_bilin8_h_c:       1192.7   352.5   623.8
vp8_put_bilin8_h_neon:     213.5    70.2    87.8
vp8_put_bilin8_hv_c:      1098.6   769.2  1041.9
vp8_put_bilin8_hv_neon:    324.0   123.5   146.0
vp8_put_bilin8_v_c:       1193.9   350.4   617.7
vp8_put_bilin8_v_neon:     183.9    60.7    64.7
vp8_put_bilin16_h_c:      2353.1   671.2  1223.3
vp8_put_bilin16_h_neon:    261.9   140.7   145.0
vp8_put_bilin16_hv_c:     2453.2  1470.9  2355.2
vp8_put_bilin16_hv_neon:   383.9   196.0   217.0
vp8_put_bilin16_v_c:      2349.3   669.8  1251.2
vp8_put_bilin16_v_neon:    202.9   110.7    96.2

Signed-off-by: Martin Storsjö <martin@martin.st>
2019-02-19 11:46:14 +02:00
Martin Storsjö
58d1549227 aarch64: vp8: Port epel4 functions from arm version
Cortex A53    A72    A73
vp8_put_epel4_h4_c:        631.4  291.7  367.8
vp8_put_epel4_h4_neon:     241.0  131.0  155.7
vp8_put_epel4_h4v4_c:      967.5  529.3  667.7
vp8_put_epel4_h4v4_neon:   429.3  241.8  279.7
vp8_put_epel4_h4v6_c:     1374.7  657.5  864.5
vp8_put_epel4_h4v6_neon:   515.5  295.5  334.7
vp8_put_epel4_h6_c:        851.0  421.0  486.0
vp8_put_epel4_h6_neon:     321.5  195.0  217.7
vp8_put_epel4_h6v4_c:     1111.3  621.1  781.2
vp8_put_epel4_h6v4_neon:   539.2  328.0  365.3
vp8_put_epel4_h6v6_c:     1561.3  763.3  999.7
vp8_put_epel4_h6v6_neon:   645.5  401.0  434.7
vp8_put_epel4_v4_c:        663.8  298.3  357.0
vp8_put_epel4_v4_neon:     116.0   81.5   72.5
vp8_put_epel4_v6_c:        870.5  437.0  507.4
vp8_put_epel4_v6_neon:     147.7  108.8   92.0

Signed-off-by: Martin Storsjö <martin@martin.st>
2019-02-19 11:46:11 +02:00
Martin Storsjö
cc7ba00c35 aarch64: vp8: Port missing epel8 functions from arm version
Cortex A53     A72     A73
vp8_put_epel8_h4_c:       2594.8  1159.6  1374.8
vp8_put_epel8_h4_neon:     506.4   244.2   314.0
vp8_put_epel8_h6_c:       3445.8  1677.1  1811.3
vp8_put_epel8_h6_neon:     634.4   371.7   433.0
vp8_put_epel8_v4_c:       2614.0  1174.8  1378.0
vp8_put_epel8_v4_neon:     321.0   221.7   235.8
vp8_put_epel8_v6_c:       3635.5  1703.0  2079.2
vp8_put_epel8_v6_neon:     416.9   317.0   295.5

Signed-off-by: Martin Storsjö <martin@martin.st>
2019-02-19 11:46:08 +02:00
Martin Storsjö
52c9b0a6c0 aarch64: vp8: Port vp8_luma_dc_wht and vp8_idct_dc_add4uv from arm version
Cortex A53    A72    A73
vp8_luma_dc_wht_c:        115.7   75.7   90.7
vp8_luma_dc_wht_neon:      60.7   41.2   45.7
vp8_idct_dc_add4uv_c:     376.1  262.9  282.5
vp8_idct_dc_add4uv_neon:   52.0   29.0   37.0

Signed-off-by: Martin Storsjö <martin@martin.st>
2019-02-19 11:46:04 +02:00
Martin Storsjö
c513fcd7d2 aarch64: vp8: Fix a typo in a comment
Signed-off-by: Martin Storsjö <martin@martin.st>
2019-02-19 11:46:00 +02:00
Martin Storsjö
f1011ea28a aarch64: vp8: Reorder the function pointer inits to match the arm original
Signed-off-by: Martin Storsjö <martin@martin.st>
2019-02-19 11:45:56 +02:00
Martin Storsjö
b4b27dce95 aarch64: vp8: Move the vp8dsp makefile entries to the right places
Even if NEON would be disabled, the init functions should be built
as they are called as long as ARCH_AARCH64 is set.

These functions are part of a generic DSP subsytem, not tied directly
to one decoder. (They should be built if the vp7 decoder is enabled,
even if the vp8 decoder is disabled.)

Signed-off-by: Martin Storsjö <martin@martin.st>
2019-02-19 11:45:53 +02:00
Martin Storsjö
ad32f7b126 aarch64: vp8: Remove superfluous includes
This fixes building with MSVC, which lacks unistd.h.

Signed-off-by: Martin Storsjö <martin@martin.st>
2019-02-19 11:45:50 +02:00
Martin Storsjö
85bfaa4949 aarch64: vp8: Use the proper aarch64 form for conditional branches
The previous form also does seem to assemble on current tools,
but I think it might fail on some older aarch64 tools.

Signed-off-by: Martin Storsjö <martin@martin.st>
2019-02-19 11:45:47 +02:00
Martin Storsjö
2eeac79936 aarch64: vp8: Fix assembling with armasm64
Signed-off-by: Martin Storsjö <martin@martin.st>
2019-02-19 11:45:44 +02:00
Martin Storsjö
26d7af4c38 aarch64: vp8: Fix assembling with clang
This also partially fixes assembling with MS armasm64 (via
gas-preprocessor).

The movrel macro invocations need to pass the offset via a separate
parameter. Mach-o and COFF relocations don't allow a negative
offset to a symbol, which is handled properly if the offset is passed
via the parameter. If no offset parameter is given, the macro
evaluates to something like "adrp x17, subpel_filters-16+(0)", which
older clang versions also fail to parse (the older clang versions
only support one single offset term, although it can be a parenthesis.

Signed-off-by: Martin Storsjö <martin@martin.st>
2019-02-19 11:45:41 +02:00
Magnus Röös
0801853e64 libavcodec: vp8 neon optimizations for aarch64
Partial port of the ARM Neon for aarch64.

Benchmarks from fate:

benchmarking with Linux Perf Monitoring API
nop: 58.6
checkasm: using random seed 1760970128
NEON:
 - vp8dsp.idct       [OK]
 - vp8dsp.mc         [OK]
 - vp8dsp.loopfilter [OK]
checkasm: all 21 tests passed
vp8_idct_add_c: 201.6
vp8_idct_add_neon: 83.1
vp8_idct_dc_add_c: 107.6
vp8_idct_dc_add_neon: 33.8
vp8_idct_dc_add4y_c: 426.4
vp8_idct_dc_add4y_neon: 59.4
vp8_loop_filter8uv_h_c: 688.1
vp8_loop_filter8uv_h_neon: 216.3
vp8_loop_filter8uv_inner_h_c: 649.3
vp8_loop_filter8uv_inner_h_neon: 195.3
vp8_loop_filter8uv_inner_v_c: 544.8
vp8_loop_filter8uv_inner_v_neon: 131.3
vp8_loop_filter8uv_v_c: 706.1
vp8_loop_filter8uv_v_neon: 141.1
vp8_loop_filter16y_h_c: 668.8
vp8_loop_filter16y_h_neon: 242.8
vp8_loop_filter16y_inner_h_c: 647.3
vp8_loop_filter16y_inner_h_neon: 224.6
vp8_loop_filter16y_inner_v_c: 647.8
vp8_loop_filter16y_inner_v_neon: 128.8
vp8_loop_filter16y_v_c: 721.8
vp8_loop_filter16y_v_neon: 154.3
vp8_loop_filter_simple_h_c: 387.8
vp8_loop_filter_simple_h_neon: 187.6
vp8_loop_filter_simple_v_c: 384.1
vp8_loop_filter_simple_v_neon: 78.6
vp8_put_epel8_h4v4_c: 3971.1
vp8_put_epel8_h4v4_neon: 855.1
vp8_put_epel8_h4v6_c: 5060.1
vp8_put_epel8_h4v6_neon: 989.6
vp8_put_epel8_h6v4_c: 4320.8
vp8_put_epel8_h6v4_neon: 1007.3
vp8_put_epel8_h6v6_c: 5449.3
vp8_put_epel8_h6v6_neon: 1158.1
vp8_put_epel16_h6_c: 6683.8
vp8_put_epel16_h6_neon: 831.8
vp8_put_epel16_h6v6_c: 11110.8
vp8_put_epel16_h6v6_neon: 2214.8
vp8_put_epel16_v6_c: 7024.8
vp8_put_epel16_v6_neon: 799.6
vp8_put_pixels8_c: 112.8
vp8_put_pixels8_neon: 78.1
vp8_put_pixels16_c: 131.3
vp8_put_pixels16_neon: 129.8

This contains a fix to include guards by Carl Eugen Hoyos.

Signed-off-by: Martin Storsjö <martin@martin.st>
2019-02-19 11:45:33 +02:00
Luca Barbato
899ee03088 Unbreak travis on macos 2019-02-19 10:05:10 +01:00
hwren
ff03418348 lavc/libdavs2: fix parameter setting error
Signed-off-by: hwrenx <hwrenx@126.com>
2019-02-19 13:21:04 +08:00
hwren
11751f6252 lavc/libxavs2: use upper layer qp parameters first
Signed-off-by: hwrenx <hwrenx@126.com>
2019-02-19 13:21:04 +08:00
hwren
8754147db6 lavc/libxavs2: remove unused context parameter
Signed-off-by: hwrenx <hwrenx@126.com>
2019-02-19 13:21:04 +08:00
Xiaofeng Wang
821791caa0 lavf/mpeg: fix indent
Signed-off-by: Xiaofeng Wang <xiaofeng.wang@bqvision.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2019-02-18 23:08:38 +01:00
Steve Lhomme
9326117bf6 avformat/matroskadec: Check parents remaining length
This was found through the Hacker One program on VLC but is not a security issue in libavformat
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2019-02-17 10:29:42 +01:00
Michael Niedermayer
b687b549aa avformat/webmdashenc: Check id in adaption_sets
Fixes: out of array access

Found-by: Wenxiang Qian
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2019-02-17 10:29:42 +01:00
Wenxiang Qian
85f91ed760 avformat/http: Fix Out-of-Bounds access in process_line()
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2019-02-17 10:29:42 +01:00
Wenxiang Qian
a142ffdcae avformat/ftp: Fix Out-of-Bounds Access and Information Leak in ftp.c:393
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2019-02-17 10:29:42 +01:00
Kevin Backhouse via RT
894995c41e avcodec/htmlsubtitles: Fixes denial of service due to use of sscanf in inner loop for handling braces
Fixes: [Semmle Security Reports #19439]
Fixes: dos_sscanf2.mkv

Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2019-02-17 10:29:42 +01:00
Kevin Backhouse via RT
1f00c97bc3 avcodec/htmlsubtitles: Fixes denial of service due to use of sscanf in inner loop for tag scaning
Fixes: [Semmle Security Reports #19438]
Fixes: dos_sscanf1.mkv

Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2019-02-17 10:29:42 +01:00
Michael Niedermayer
7246bf365a avcodec/mpegvideo_enc: Use av_assert1() instead of assert()
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2019-02-17 10:29:42 +01:00
Michael Niedermayer
d1afa7284c avformat/matroskadec: Do not leak queued packets on sync errors
Fixes: memleak
Fixes: clusterfuzz-testcase-minimized-audio_decoder_fuzzer-5649187601121280

Reported-by: Chris Cunningham <chcunningham@google.com>
Tested-by: Chris Cunningham <chcunningham@google.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2019-02-17 09:12:06 +01:00
Gyan Doshi
a9452fe6dc doc/fftools-common-opts: add example for codec option stream specifiers 2019-02-17 11:28:31 +05:30
Reto Kromer
7a51fed0f0 doc/muxers: grammar fix
Signed-off-by: Gyan Doshi <ffmpeg@gyani.pro>
2019-02-17 10:56:38 +05:30
Carl Eugen Hoyos
fe7d8c993f lavc/libgsmenc: Force mono and use 13k as default bitrate. 2019-02-17 01:04:18 +01:00
gxw
1466dc140d avcodec/mips: [loongson] optimize theora decoding with mmi.
Optimize theora decoding with mmi in functions:
1. ff_vp3_idct_add_mmi
2. ff_vp3_idct_put_mmi
3. ff_vp3_idct_dc_add_mmi
4. ff_put_no_rnd_pixels_l2_mmi

Theora decoding speed improved about 32%(from 88fps to 116fps, Tested on loongson 3A3000).

Reviewed-by: Shiyou Yin <yinshiyou-hf@loongson.cn>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2019-02-16 19:56:57 +01:00
Michael Niedermayer
1f686d023b avcodec/mpeg4videodec: Clear interlaced_dct for studio profile
Fixes: Out of array access
Fixes: 13090/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_MPEG4_fuzzer-5408668986638336

Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Reviewed-by: Kieran Kunhya <kierank@obe.tv>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2019-02-16 19:56:57 +01:00
Philip Langdale
d6fc5dc24a avcodec/version: Bump micro-version for nvdec/cuviddec changes
I forgot to add the version bump and changelog within the changes.
2019-02-16 10:43:24 -08:00
Philip Langdale
317b7b06fd avcodec/cuviddec: Add support for decoding HEVC 4:4:4 content
This is the equivalent change for cuviddec after the previous change
for nvdec. I made similar changes to the copying routines to handle
pixel formats in a more generic way.

Note that unlike with nvdec, there is no confusion about the ability
of a codec to output 444 formats. This is because the cuvid parser is
used, meaning that 444 JPEG content is still indicated as using a 420
output format.
2019-02-16 10:21:26 -08:00
Diego Biurrun
f8df5e2f31 tests: Add a convenience function for video-only lavf tests
Rename a test in the process for consistency and simplicity and
remove the remnants of the now-unused lavf regression test scripts.
2019-02-16 18:15:55 +01:00
Diego Biurrun
618d02c1fa tests: Convert lavf container tests to non-legacy test scripts
Rename some tests in the process for consistency and simplicity.
2019-02-16 18:15:46 +01:00
Diego Biurrun
896fe15dbb tests: Convert lavf pixfmt conversion tests to non-legacy test scripts
Also split monolithic lavf-pixfmt test into individual tests.
2019-02-16 18:15:38 +01:00
Diego Biurrun
a957e9379d tests: Convert lavf image tests to non-legacy test scripts
Rename some tests in the process for consistency and simplicity.
2019-02-16 18:15:30 +01:00
Diego Biurrun
eb8a811599 tests: Convert audio-only lavf tests to non-legacy test scripts
Rename some tests in the process for consistency and simplicity.
2019-02-16 18:15:22 +01:00
Diego Biurrun
a70eac7a9b tests: Convert image2pipe tests to non-legacy test scripts 2019-02-16 18:15:11 +01:00
Philip Langdale
83c7ac2e47 avcodec/nvdec: Explicitly mark codecs that support 444 output formats
With the introduction of HEVC 444 support, we technically have two
codecs that can handle 444 - HEVC and MJPEG. In the case of MJPEG,
it can decode, but can only output one of the semi-planar formats.

That means we need additional logic to decide whether to use a
444 output format or not.
2019-02-16 08:47:36 -08:00
Philip Langdale
e06ccfbe1d avcodec/nvdec: Add support for decoding HEVC 4:4:4 content
The latest generation video decoder on the Turing chips supports
decoding HEVC 4:4:4. Supporting this is relatively straight-forward;
we need to account for the different chroma format and pick the
right output and sw formats at the right times.

There was one bug which was the hard-coded assumption that the
first chroma plane would be half-height; I fixed this to use the
actual shift value on the plane.

We also need to pass the SPS and PPS range extension flags.
2019-02-16 08:47:36 -08:00