1
0
mirror of https://github.com/FFmpeg/FFmpeg.git synced 2024-11-26 19:01:44 +02:00
Commit Graph

44507 Commits

Author SHA1 Message Date
Diego Biurrun
57b753b445 build: Prefer NASM assembler over YASM
NASM is more actively maintained and permits generating dependency information
as a sideeffect of assembling, thus cutting build times in half.
2017-03-07 08:32:37 +01:00
Diego Biurrun
f54037da8a build: Make x86 assembler commandline-selectable 2017-03-07 08:32:37 +01:00
Diego Biurrun
51411eb7ff build: Special-case handling of SDL CFLAGS
SDL adds some "special" CFLAGS that interfere with building normal
binaries. Capture those CFLAGS separately and avoid adding them to
the general CFLAGS.
2017-03-07 08:32:37 +01:00
Diego Biurrun
003124ebf4 build: Fix logic of clock_gettime() check
We should only check for clock_gettime() if _POSIX_MONOTONIC_CLOCK is
available and do a full link check for clock_gettime() in all cases.
2017-03-07 08:32:37 +01:00
Vittorio Giovara
b44bd7ee7f pixlet: Fix architecture-dependent code and values
The constants used in the decoder used floating point precision,
and this caused different values to be generated on different
architectures. Additionally on big endian machines, the fate test
would output bytes in native order, which is different from the one
hardcoded in the test.

So, eradicate floating point numbers and use fixed point (32.32)
arithmetics everywhere, replacing constants with precomputed integer
values, and force the pixel format output to be the same in the fate
test.

Signed-off-by: Vittorio Giovara <vittorio.giovara@gmail.com>
2017-03-06 18:15:02 -05:00
Diego Biurrun
808ef43597 build: Explicitly set 32-bit/64-bit object formats for nasm/yasm
Consistently use object format names with "32" suffix and set object format
to "win64" on Windows x86_64, which fixes assembling with nasm.
2017-03-05 14:38:56 +01:00
Diego Biurrun
6eef263aca x86: Merge align directives into SECTION_RODATA declarations where possible 2017-03-05 14:26:06 +01:00
Ganapathy Kasi
3303f86467 nvenc: Remove qmin and qmax constraints for nvenc vbr
qmin and qmax are not necessary for nvenc vbr.

Also fix for using 2 pass vbr mode for slow preset through ctx->flag NVENC_TWO_PASSES.

Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
2017-03-04 08:23:28 +01:00
Paul B Mahol
aba5b94859 Add Apple Pixlet decoder
Signed-off-by: Vittorio Giovara <vittorio.giovara@gmail.com>
2017-03-01 11:52:29 -05:00
James Almer
19d57ca62e libavutil: add av_mod_uintp2
Signed-off-by: James Almer <jamrial@gmail.com>
2017-03-01 11:23:19 -05:00
Ganesh Ajjanagadde
7bfda7d157 intmath: add faster clz support 2017-03-01 11:23:19 -05:00
Diego Biurrun
5ff3b5cafc build: Add pthreads to list of avutil extralibs
libavutil uses pthreads in the buffer code (abstracted through a header).
2017-03-01 12:02:11 +01:00
Diego Biurrun
db869f4ea4 fate: Add build-only targets to FATE 2017-03-01 11:06:52 +01:00
Diego Biurrun
3c0efbd033 build: Allow generating dependencies as a side-effect of assembling 2017-03-01 10:18:15 +01:00
Diego Biurrun
39e208f4d4 build: Generalize yasm/nasm-related variable names
None of them are specific to the YASM assembler.
2017-03-01 10:18:15 +01:00
Diego Biurrun
d1d6230ea3 build: Add "build" shorthand target that depends on all compile targets 2017-03-01 10:13:26 +01:00
Diego Biurrun
4d1f7e8bc7 build: Skip generating .version files when cleaning 2017-03-01 09:42:39 +01:00
Diego Biurrun
58407b4d74 configure: Fix typo in objcc default setting
Also drop stray duplicate OBJCC config.mak entry.
2017-03-01 09:23:42 +01:00
Diego Biurrun
fde7ee8710 x86: hevc: Add missing colons after assembly labels
This fixes several warnings of the sort
warning: label alone on a line without a colon might be in error
2017-03-01 09:23:42 +01:00
Diego Biurrun
7cb1d9e2db build: Fine-grained link-time dependency settings
Previously, all link-time dependencies were added for all libraries,
resulting in bogus link-time dependencies since not all dependencies
are shared across libraries. Also, in some cases like libavutil, not
all dependencies were taken into account, resulting in some cases of
underlinking.

To address all this mess a machinery is added for tracking which
dependency belongs to which library component and then leveraged
to determine correct dependencies for all individual libraries.
2017-03-01 09:00:40 +01:00
Diego Biurrun
d154bdd3d0 configure: Simplify dlopen check 2017-03-01 09:00:40 +01:00
Michael Niedermayer
d7b2bb5391 h264_sei: Check actual presence of picture timing SEI message
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: Vittorio Giovara <vittorio.giovara@gmail.com>
2017-02-28 10:32:50 -05:00
Diego Biurrun
21cca00dfe build: Explicitly disable external libraries when not explicitly enabled
Leaving those variables in an undefined state allows them getting implicitly
enabled when they are declared as weak dependencies of other components.
In that case, the library check is not run and required linker flags are not
added, resulting in a failing build.

Fixes linking when enabling libfreetype without libfontconfig.
2017-02-28 13:00:20 +01:00
Diego Biurrun
e1a6d63c7e fate: Rename WMV8_DRM decoder tests to WMV3_DRM
The codec used in those files is WMV3/WMV9, not WMV2/WMV8.
2017-02-28 13:00:20 +01:00
Luca Barbato
79331df362 rtsp: Lazily set up the pollfd array once 2017-02-28 12:54:04 +01:00
Ben Chang
d8f36a6aa3 nvenc: Fix the preset mapping list
The map is a sparse array and does not need a empty element to terminate
it.

The empty element is stored after the last one inserted in the list,
overwriting whichever element was next with zeros.

Bug-Id: 1029

Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
2017-02-28 11:54:02 +01:00
Diego Biurrun
698ac8f9ca fate: Make null comparison method more useful
This allows dropping /dev/null as reference value when no output is generated.
2017-02-27 13:57:35 +01:00
Diego Biurrun
c483398bb7 build: Drop DOC_ prefix from EXAMPLES-related variables 2017-02-27 13:57:35 +01:00
Luca Barbato
5263f464db rtsp: Lazily allocate the pollfd array
And use av_malloc_array.
2017-02-27 13:51:53 +01:00
Luca Barbato
b9b82151a1 rtsp: Move the pollfd setup out of the for loop 2017-02-27 13:51:53 +01:00
Luca Barbato
150e99d694 rtsp: Factor out packet reading 2017-02-27 13:51:53 +01:00
Diego Biurrun
4141a5a240 Use modern avconv syntax for codec selection in documentation and tests 2017-02-27 10:36:45 +01:00
Diego Biurrun
da8093f712 fate: Use bitexact optimizations in the svq3-2 test
This fixes the test with mmxext disabled because the current reference
frame hashes correspond to the non-bitexact mmxext optimizations.
2017-02-27 10:36:44 +01:00
Anton Khirnov
984736dd9e lavc: make sure not to return EAGAIN from codecs
This error is treated specially by the API.

CC: libav-stable@libav.org
2017-02-25 09:57:44 +01:00
James Almer
4cc0227040 apetag: account for header size if present when returning the start position
The size field in the header/footer accounts for the entire APE tag
structure except the 32 bytes from header, for compatibility with
APEv1.

Signed-off-by: James Almer <jamrial@gmail.com>

CC: libav-stable@libav.org

Signed-off-by: Anton Khirnov <anton@khirnov.net>
2017-02-25 09:57:44 +01:00
James Almer
3f258f5ee0 apetag: fix flag value to signal footer presence
According to the spec[1], a value of 0 means the footer is present and a value
of 1 means it's absent, the exact opposite of header presence flag where 1
means present and 0 absent.
The reason for this is compatibility with APEv1 tags, where there's no header,
footer presence was mandatory for all files, and the flags field was a zeroed
reserved field.

[1] http://wiki.hydrogenaud.io/index.php?title=Ape_Tags_Flags

Signed-off-by: James Almer <jamrial@gmail.com>

CC: libav-stable@libav.org

Signed-off-by: Anton Khirnov <anton@khirnov.net>
2017-02-25 09:57:43 +01:00
Anton Khirnov
b2788fe934 svq3: fix the slice size check
Currently it incorrectly compares bits with bytes.

Also, move the check right before where it's relevant, so that the
correct number of remaining bits is used.

CC: libav-stable@libav.org
2017-02-25 09:57:43 +01:00
John Stebbins
cd7a2e1502 asfdec: fix reading files larger than 2GB
avio_skip returns file position and overflows int
2017-02-24 11:41:33 -07:00
John Stebbins
248dc5c164 h264dec: fix dropped initial SEI recovery point 2017-02-24 08:24:13 -07:00
Diego Biurrun
8e4d4efc67 fate: Add another SVQ3 test to increase coverage 2017-02-24 10:59:42 +01:00
Martin Storsjö
b8f66c0838 aarch64: vp9itxfm: Reorder iadst16 coeffs
This matches the order they are in the 16 bpp version.

There they are in this order, to make sure we access them in the
same order they are declared, easing loading only half of the
coefficients at a time.

This makes the 8 bpp version match the 16 bpp version better.

Signed-off-by: Martin Storsjö <martin@martin.st>
2017-02-24 00:04:34 +02:00
Martin Storsjö
08074c092d arm: vp9itxfm: Reorder iadst16 coeffs
This matches the order they are in the 16 bpp version.

There they are in this order, to make sure we access them in the
same order they are declared, easing loading only half of the
coefficients at a time.

This makes the 8 bpp version match the 16 bpp version better.

Signed-off-by: Martin Storsjö <martin@martin.st>
2017-02-24 00:04:33 +02:00
Martin Storsjö
09eb88a12e aarch64: vp9itxfm: Reorder the idct coefficients for better pairing
All elements are used pairwise, except for the first one.
Previously, the 16th element was unused. Move the unused element
to the second slot, to make the later element pairs not split
across registers.

This simplifies loading only parts of the coefficients,
reducing the difference to the 16 bpp version.

Signed-off-by: Martin Storsjö <martin@martin.st>
2017-02-24 00:04:32 +02:00
Martin Storsjö
de06bdfe6c arm: vp9itxfm: Reorder the idct coefficients for better pairing
All elements are used pairwise, except for the first one.
Previously, the 16th element was unused. Move the unused element
to the second slot, to make the later element pairs not split
across registers.

This simplifies loading only parts of the coefficients,
reducing the difference to the 16 bpp version.

Signed-off-by: Martin Storsjö <martin@martin.st>
2017-02-24 00:04:31 +02:00
Martin Storsjö
65aa002d54 aarch64: vp9itxfm: Avoid reloading the idct32 coefficients
The idct32x32 function actually pushed d8-d15 onto the stack even
though it didn't clobber them; there are plenty of registers that
can be used to allow keeping all the idct coefficients in registers
without having to reload different subsets of them at different
stages in the transform.

After this, we still can skip pushing d12-d15.

Before:
vp9_inv_dct_dct_32x32_sub32_add_neon: 8128.3
After:
vp9_inv_dct_dct_32x32_sub32_add_neon: 8053.3

Signed-off-by: Martin Storsjö <martin@martin.st>
2017-02-24 00:03:44 +02:00
Martin Storsjö
402546a172 arm: vp9itxfm: Avoid reloading the idct32 coefficients
The idct32x32 function actually pushed q4-q7 onto the stack even
though it didn't clobber them; there are plenty of registers that
can be used to allow keeping all the idct coefficients in registers
without having to reload different subsets of them at different
stages in the transform.

Since the idct16 core transform avoids clobbering q4-q7 (but clobbers
q2-q3 instead, to avoid needing to back up and restore q4-q7 at all
in the idct16 function), and the lanewise vmul needs a register in
the q0-q3 range, we move the stored coefficients from q2-q3 into q4-q5
while doing idct16.

While keeping these coefficients in registers, we still can skip pushing
q7.

Before:                              Cortex A7       A8       A9      A53
vp9_inv_dct_dct_32x32_sub32_add_neon:  18553.8  17182.7  14303.3  12089.7
After:
vp9_inv_dct_dct_32x32_sub32_add_neon:  18470.3  16717.7  14173.6  11860.8

Signed-off-by: Martin Storsjö <martin@martin.st>
2017-02-24 00:03:43 +02:00
Martin Storsjö
575e31e931 arm: vp9lpf: Implement the mix2_44 function with one single filter pass
For this case, with 8 inputs but only changing 4 of them, we can fit
all 16 input pixels into a q register, and still have enough temporary
registers for doing the loop filter.

The wd=8 filters would require too many temporary registers for
processing all 16 pixels at once though.

Before:                          Cortex A7      A8     A9     A53
vp9_loop_filter_mix2_v_44_16_neon:   289.7   256.2  237.5   181.2
After:
vp9_loop_filter_mix2_v_44_16_neon:   221.2   150.5  177.7   138.0

Signed-off-by: Martin Storsjö <martin@martin.st>
2017-02-24 00:03:09 +02:00
Martin Storsjö
3bf9c48320 aarch64: vp9lpf: Use dup+rev16+uzp1 instead of dup+lsr+dup+trn1
This is one cycle faster in total, and three instructions fewer.

Before:
vp9_loop_filter_mix2_v_44_16_neon: 123.2
After:
vp9_loop_filter_mix2_v_44_16_neon: 122.2

Signed-off-by: Martin Storsjö <martin@martin.st>
2017-02-24 00:03:00 +02:00
Martin Storsjö
c582cb8537 arm/aarch64: vp9lpf: Keep the comparison to E within 8 bit
The theoretical maximum value of E is 193, so we can just
saturate the addition to 255.

Before:                     Cortex A7      A8      A9     A53  A53/AArch64
vp9_loop_filter_v_4_8_neon:     143.0   127.7   114.8    88.0         87.7
vp9_loop_filter_v_8_8_neon:     241.0   197.2   173.7   140.0        136.7
vp9_loop_filter_v_16_8_neon:    497.0   419.5   379.7   293.0        275.7
vp9_loop_filter_v_16_16_neon:   965.2   818.7   731.4   579.0        452.0
After:
vp9_loop_filter_v_4_8_neon:     136.0   125.7   112.6    84.0         83.0
vp9_loop_filter_v_8_8_neon:     234.0   195.5   171.5   136.0        133.7
vp9_loop_filter_v_16_8_neon:    490.0   417.5   377.7   289.0        271.0
vp9_loop_filter_v_16_16_neon:   951.2   814.7   732.3   571.0        446.7

Signed-off-by: Martin Storsjö <martin@martin.st>
2017-02-24 00:02:36 +02:00
Diego Biurrun
ed6a891c36 Place attribute_deprecated in the right position for struct declarations
libavcodec/vaapi.h:58:1: warning: attribute 'deprecated' is ignored, place it after "struct" to apply attribute to type declaration [-Wignored-attributes]
2017-02-23 12:23:20 +01:00