mirror of https://github.com/FFmpeg/FFmpeg.git synced 2024-12-18 03:19:31 +02:00
417 Commits

Author SHA1 Message Date
Michael Niedermayer
4159f702a7 avutil/timer: Fix units for x86 after c708b54033
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-03-09 15:22:02 +01:00
James Almer
3f3d748cab x86: Move XOP emulation to x86util
We need the emulation to support the cases where the first
argument is the same as the fourth. To achieve this a fifth
argument working as a temporary may be needed.
Emulation that doesn't obey the original instruction semantics
can't be in x86inc.

Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-02-24 08:30:19 +01:00
Michael Niedermayer
bd8d73ea8b Merge remote-tracking branch 'qatar/master'
* qatar/master:
  x86: add detection for Bit Manipulation Instruction sets

Conflicts:
	libavutil/x86/cpu.c

See: 0bc3de19ff
Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-02-23 22:52:58 +01:00
Michael Niedermayer
d9574069c1 Merge commit '1b932eb1508f550fac9e911923a0383efda53aa3'
* commit '1b932eb1508f550fac9e911923a0383efda53aa3':
  x86: add detection for FMA3 instruction set

Conflicts:
	configure
	libavutil/cpu.h
	libavutil/x86/cpu.c

See: a2af8eddab
Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-02-23 22:43:08 +01:00
James Almer
d59fcdaff3 x86: add detection for Bit Manipulation Instruction sets
Based on x264 code

Signed-off-by: James Almer <jamrial@gmail.com>
2014-02-23 15:29:36 +01:00
James Almer
1b932eb150 x86: add detection for FMA3 instruction set
Based on x264 code

Signed-off-by: James Almer <jamrial@gmail.com>
2014-02-23 15:29:36 +01:00
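
As a rough illustration of what the FMA3 and BMI detection commits above have to decode, the sketch below maps raw CPUID register values to feature bits. The bit positions are the documented ones (leaf 1 ECX bit 12 for FMA; leaf 7, sub-leaf 0 EBX bits 3 and 8 for BMI1 and BMI2). The `SKETCH_CPU_*` names and `sketch_decode_cpuid` are hypothetical and are not FFmpeg's `AV_CPU_FLAG_*` API; the sketch also assumes the caller has already confirmed that leaf 7 exists and that the OS saves YMM state, which a real detector must check before reporting FMA3.

```c
#include <stdint.h>

/* Hypothetical flag names for this sketch; not FFmpeg's AV_CPU_FLAG_* values. */
#define SKETCH_CPU_FMA3 (1u << 0)
#define SKETCH_CPU_BMI1 (1u << 1)
#define SKETCH_CPU_BMI2 (1u << 2)

/*
 * Decode feature bits from raw CPUID output.
 *   leaf1_ecx: ECX after CPUID with EAX=1
 *   leaf7_ebx: EBX after CPUID with EAX=7, ECX=0
 * Assumption: leaf 7 availability and OS support for YMM state
 * have been verified by the caller.
 */
static unsigned sketch_decode_cpuid(uint32_t leaf1_ecx, uint32_t leaf7_ebx)
{
    unsigned flags = 0;

    if (leaf1_ecx & (1u << 12)) /* CPUID.1:ECX.FMA */
        flags |= SKETCH_CPU_FMA3;
    if (leaf7_ebx & (1u << 3))  /* CPUID.7.0:EBX.BMI1 */
        flags |= SKETCH_CPU_BMI1;
    if (leaf7_ebx & (1u << 8))  /* CPUID.7.0:EBX.BMI2 */
        flags |= SKETCH_CPU_BMI2;

    return flags;
}
```

Keeping the decoding separate from the CPUID instruction itself means the bit logic can be exercised with fixed register values on any host.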
James Almer
10b0161d78 x86: add missing XOP checks and macros
Signed-off-by: James Almer <jamrial@gmail.com>
2014-02-23 15:29:36 +01:00
James Almer
0bc3de19ff x86: add detection for Bit Manipulation Instruction sets
Based on x264 code

Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-02-22 17:26:00 +01:00
James Almer
a2af8eddab x86: add detection for FMA3 instruction set
Based on x264 code

Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-02-22 17:25:52 +01:00
Christophe Gisquet
996697e266 x86: float dsp: unroll SSE versions
vector_fmul and vector_fmac_scalar are guaranteed to process data in
batches of 16 elements, but their SSE versions only do 8 at a time.

Therefore, unroll them a bit.
299 to 261c for 256 elements in vector_fmac_scalar on Arrandale/Win64.

Signed-off-by: Janne Grunau <janne-libav@jannau.net>
2014-02-20 14:18:05 +01:00
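
The unrolling described in the commit above can be pictured with a scalar C sketch, assuming the operation is `dst[i] += src[i] * mul` and that `len` is a multiple of 16 as the message states. The function name is hypothetical; the actual change is in the SSE assembly, which this sketch does not reproduce.

```c
#include <assert.h>

/*
 * Scalar sketch of dst[i] += src[i] * mul, handling 16 elements per
 * outer iteration instead of 8. Hypothetical name; not FFmpeg's code.
 * Precondition taken from the commit message: len is a multiple of 16.
 */
static void sketch_vector_fmac_scalar(float *dst, const float *src,
                                      float mul, int len)
{
    assert(len >= 0 && len % 16 == 0);

    for (int i = 0; i < len; i += 16) {
        /* 16 floats correspond to four 128-bit SSE registers per pass. */
        for (int j = 0; j < 16; j++)
            dst[i + j] += src[i + j] * mul;
    }
}
```

Doubling the work per iteration halves the number of loop-counter updates and branches, which is consistent with the modest cycle gain the commit reports.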
Christophe Gisquet
133b34207c x86: float dsp: unroll SSE versions
vector_fmul and vector_fmac_scalar are guaranteed to process data in
batches of 16 elements, but their SSE versions only do 8 at a time.

Therefore, unroll them a bit.
299 to 261c for 256 elements in vector_fmac_scalar on Arrandale/Win64.

Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-02-15 18:54:21 +01:00
James Almer
23a8c63452 x86inc: Extend FMA_INSTR functionality
Support the cases where the first and last operand of
the XOP instruction are the same.

Also add vpmacsdql emulation.

Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-02-13 22:14:24 +01:00
James Almer
6c12b1de06 x86: add missing XOP checks and macros
Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-02-11 03:46:52 +01:00
Loren Merritt
b7d0d10a1d x86inc: Speed up assembling with Yasm
Work around Yasm's inefficiency with handling large numbers of variables
in the global scope.

Signed-off-by: Diego Biurrun <diego@biurrun.de>
2014-01-26 18:40:08 +01:00
Loren Merritt
4d55fe7204 x86inc: speed up compilation with yasm
Work around yasm's inefficiency with handling large numbers of variables
in the global scope.
2014-01-18 01:19:16 +01:00
Michael Niedermayer
c3814ab654 rename new lls code to lls2 to avoid conflict with the old which has a different ABI
also remove failed attempt at a compatibility layer, the code simply cannot work

Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2013-11-17 16:41:08 +01:00
Michael Niedermayer
bbe66ef912 avutil: rename lls to lls2
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2013-11-17 16:30:23 +01:00
Michael Niedermayer
a665704402 Merge commit '4d6ee0725553a43ba88d6f8327ebcf8f1c5ae8d4'
* commit '4d6ee0725553a43ba88d6f8327ebcf8f1c5ae8d4':
  libavutil: x86: Add AVX2 capable CPU detection.

Conflicts:
	libavutil/cpu.c
	libavutil/cpu.h
	libavutil/x86/cpu.c

See: 865b70bc5d
Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-10-26 02:36:36 +02:00
Kieran Kunhya
865b70bc5d Add AVX2 capable CPU detection. Patch based on x264's AVX2 detection
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2013-10-26 02:34:22 +02:00
Kieran Kunhya
4d6ee07255 libavutil: x86: Add AVX2 capable CPU detection.
Patch based on x264's AVX2 detection

Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
2013-10-25 19:36:55 +01:00
Michael Niedermayer
f9bef2bec9 Merge remote-tracking branch 'qatar/master'
* qatar/master:
  x86: more AVX2 framework

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-10-14 16:13:57 +02:00
Michael Niedermayer
e3e0e3d0c9 Merge commit 'c6908d6b4b377a04a5d055ba874bdbcf06c80497'
* commit 'c6908d6b4b377a04a5d055ba874bdbcf06c80497':
  x86inc: FMA3/4 Support

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-10-14 16:06:22 +02:00
Michael Niedermayer
9ac124c889 Merge commit '206895708ea2b464755d340e44501daf9a07c310'
* commit '206895708ea2b464755d340e44501daf9a07c310':
  x86inc: Remove our FMA4 support

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-10-14 15:54:23 +02:00
Michael Niedermayer
12e4493f9c Merge commit 'c108ba0175d4fc3a3253a8b0f782fbfb96ba5098'
* commit 'c108ba0175d4fc3a3253a8b0f782fbfb96ba5098':
  x86inc: Use VEX-encoded instructions in AVX functions

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-10-14 15:48:34 +02:00
Jason Garrett-Glaser
a3fabc6cb3 x86: more AVX2 framework
Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
2013-10-14 12:41:56 +01:00
Jason Garrett-Glaser
c6908d6b4b x86inc: FMA3/4 Support
Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
2013-10-14 12:41:54 +01:00
Derek Buitenhuis
206895708e x86inc: Remove our FMA4 support
This is so we can sync to x264's version of FMA4 support.

This partially reverts commit 79687079a9.

Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
2013-10-14 12:39:29 +01:00
Henrik Gramner
c108ba0175 x86inc: Use VEX-encoded instructions in AVX functions
Automatically use VEX-encoding in AVX/AVX2/XOP/FMA3/FMA4
functions for all instructions that exist in a VEX-encoded
version.

This change makes it easier to extend existing code to use AVX2.

Also add support for AVX emulation of a few instructions that
were missing before.

Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
2013-10-14 12:36:11 +01:00
Michael Niedermayer
31d0d35560 Merge remote-tracking branch 'qatar/master'
* qatar/master:
  x86inc: Remove .rodata kludges

Conflicts:
	libavutil/x86/x86inc.asm

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-10-09 14:29:42 +02:00
Henrik Gramner
ad7d7d4f6a x86inc: Remove .rodata kludges
The Mach-O bug was fixed in yasm 0.8.0 and we don't
support versions that old anymore.

Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
2013-10-09 07:44:30 -04:00
Michael Niedermayer
19c3890819 Merge commit '3e2fa991db7ef172579422accd61624d52777e5a'
* commit '3e2fa991db7ef172579422accd61624d52777e5a':
  x86inc: remove misaligned cpu flag

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-10-08 12:02:21 +02:00
Michael Niedermayer
31d9aa6b2e Merge commit '71155665414b551ad350622d5abed20e58371fbf'
* commit '71155665414b551ad350622d5abed20e58371fbf':
  x86inc: various minor backports from x264

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-10-08 11:57:39 +02:00
Michael Niedermayer
3f965ab95d Merge commit '47f9d7ce5493e119e09d1227d017414feaaf8d97'
* commit '47f9d7ce5493e119e09d1227d017414feaaf8d97':
  x86inc: Check for __OUTPUT_FORMAT__ having a value of "x64"

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-10-08 11:37:22 +02:00
Michael Niedermayer
1f17619fe4 Merge commit 'bbe4a6db44f0b55b424a5cc9d3e89cd88e250450'
* commit 'bbe4a6db44f0b55b424a5cc9d3e89cd88e250450':
  x86inc: Utilize the shadow space on 64-bit Windows

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-10-08 11:23:00 +02:00
Michael Niedermayer
17d9c7c208 Merge commit '3fb78e99a04d0ed8db834d813d933eb86c37142a'
* commit '3fb78e99a04d0ed8db834d813d933eb86c37142a':
  x86inc: create xm# and ym#, analagous to m#

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-10-08 11:15:17 +02:00
Michael Niedermayer
3352fdb292 Merge commit '49ebe3f9fe02174ae7e14548001fd146ed375cc2'
* commit '49ebe3f9fe02174ae7e14548001fd146ed375cc2':
  x86inc: fix some corner cases of SWAP

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-10-08 11:07:03 +02:00
Michael Niedermayer
006c0fcfea Merge commit '63f0d623100bdb0c6081456127f4b6713e83d3db'
* commit '63f0d623100bdb0c6081456127f4b6713e83d3db':
  x86inc: Use SSE instead of SSE2 for copying data

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-10-08 11:01:40 +02:00
Michael Niedermayer
faafffaf82 Merge commit 'ad76e6e7e193b98e7335156422d35467816f9ef1'
* commit 'ad76e6e7e193b98e7335156422d35467816f9ef1':
  x86inc: Set ELF hidden visibility for global constants

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-10-08 10:52:51 +02:00
Michael Niedermayer
c1488fab3d Merge commit '25cb0c1a1e66edacc1667acf6818f524c0997f10'
* commit '25cb0c1a1e66edacc1667acf6818f524c0997f10':
  x86inc: activate REP_RET automatically

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-10-08 10:27:30 +02:00
Henrik Gramner
3e2fa991db x86inc: remove misaligned cpu flag
Prevents a crash if the misaligned exception mask bit is
cleared for some reason.

Misaligned SSE functions are only used on AMD Phenom CPUs
and the benefit is minuscule. They also require modifying
the MXCSR control register and by removing those functions
we can get rid of that complexity altogether.

Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
2013-10-07 06:27:38 -04:00
Jason Garrett-Glaser
7115566541 x86inc: various minor backports from x264
Small backports that sneaked into other asm commits in x264.

Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
2013-10-07 06:27:22 -04:00
Derek Buitenhuis
47f9d7ce54 x86inc: Check for __OUTPUT_FORMAT__ having a value of "x64"
This is also a valid value for WIN64.

Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
2013-10-07 06:27:08 -04:00
Henrik Gramner
bbe4a6db44 x86inc: Utilize the shadow space on 64-bit Windows
Store XMM6 and XMM7 in the shadow space in functions that
clobber them. This way we don't have to adjust the stack
pointer as often, reducing the number of instructions as
well as code size.

Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
2013-10-07 06:25:35 -04:00
Loren Merritt
3fb78e99a0 x86inc: create xm# and ym#, analagous to m#
For when we want to mix simd sizes within one function.

Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
2013-10-07 06:25:19 -04:00
Loren Merritt
49ebe3f9fe x86inc: fix some corner cases of SWAP
SWAP with >=3 named (rather than numbered) args
PERMUTE followed by SWAP with 2 named args
used to produce the wrong permutation

Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
2013-10-07 06:25:06 -04:00
Henrik Gramner
63f0d62310 x86inc: Use SSE instead of SSE2 for copying data
Reduces code size because movaps/movups is one byte
shorter than movdqa/movdqu.

Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
2013-10-07 06:24:33 -04:00
Henrik Gramner
ad76e6e7e1 x86inc: Set ELF hidden visibility for global constants
Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
2013-10-07 06:24:13 -04:00
Loren Merritt
25cb0c1a1e x86inc: activate REP_RET automatically
Now RET checks whether it immediately follows a branch, so the
programmer doesn't have to keep track of that condition. REP_RET
is still needed manually when it's a branch target, but that's
much rarer.

The implementation involves lots of spurious labels, but that's OK
because we strip them.

Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
2013-10-07 06:17:59 -04:00
Ronald S. Bultje
c07ac8d467 VP9 MC (ssse3) optimizations.
Decoding time of ped1080p.webm goes from 20.7sec to 11.3sec.
2013-10-02 21:03:15 -04:00
Michael Niedermayer
361bc70731 Merge remote-tracking branch 'qatar/master'
* qatar/master:
  avutil: Fix compilation with inline asm disabled on mingw

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-09-22 11:51:38 +02:00
Alex Smith
08fa828b3f avutil: Fix compilation with inline asm disabled on mingw
Because of -Werror=implicit-function-declaration the build will fail.

Signed-off-by: Martin Storsjö <martin@martin.st>
2013-09-22 00:50:32 +03:00
Thilo Borgmann
d814a839ac Reinstate proper FFmpeg license for all files. 2013-08-30 15:47:38 +00:00
Michael Niedermayer
f0a3562382 Merge commit '79aec43ce813a3e270743ca64fa3f31fa43df80b'
* commit '79aec43ce813a3e270743ca64fa3f31fa43df80b':
  x86: Add and use more convenience macros to check CPU extension availability

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-08-30 11:57:35 +02:00
Michael Niedermayer
2a60666d1d Merge commit '8410d6e93c2e074881f1c7b7e4cdefd2e497d52e'
* commit '8410d6e93c2e074881f1c7b7e4cdefd2e497d52e':
  avutil: Refactor CPU extension availability macros

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-08-29 14:15:10 +02:00
Michael Niedermayer
c83d794936 Merge commit 'b78b10c4b78b696927f2801cf2d9f193b4eff28b'
* commit 'b78b10c4b78b696927f2801cf2d9f193b4eff28b':
  avutil: Move internal CPU detection function declarations to private header

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-08-29 14:05:15 +02:00
Diego Biurrun
79aec43ce8 x86: Add and use more convenience macros to check CPU extension availability 2013-08-29 13:07:37 +02:00
Diego Biurrun
8410d6e93c avutil: Refactor CPU extension availability macros 2013-08-28 23:54:14 +02:00
Diego Biurrun
b78b10c4b7 avutil: Move internal CPU detection function declarations to private header 2013-08-28 23:54:14 +02:00
Michael Niedermayer
9d01bf7d66 Merge remote-tracking branch 'qatar/master'
* qatar/master:
  Consistently use "cpu_flags" as variable/parameter name for CPU flags

Conflicts:
	libavcodec/x86/dsputil_init.c
	libavcodec/x86/h264dsp_init.c
	libavcodec/x86/hpeldsp_init.c
	libavcodec/x86/motion_est.c
	libavcodec/x86/mpegvideo.c
	libavcodec/x86/proresdsp_init.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-07-18 09:53:47 +02:00
Diego Biurrun
3ac7fa81b2 Consistently use "cpu_flags" as variable/parameter name for CPU flags 2013-07-18 00:31:35 +02:00
Michael Niedermayer
a478e99a60 avutil/x86: reenable ff_update_lls_avx()
The bug has been fixed in c8b920a9b7 by Loren Merritt

Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2013-07-02 12:02:08 +02:00
Michael Niedermayer
d1fa671895 Merge commit 'c8b920a9b7fa534a6141695ace4e8c2dfcd56cee'
* commit 'c8b920a9b7fa534a6141695ace4e8c2dfcd56cee':
  lls/x86: use 3-operator vaddpd in ADDPD_MEM

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-07-02 11:40:44 +02:00
Loren Merritt
c8b920a9b7 lls/x86: use 3-operator vaddpd in ADDPD_MEM
Fixes build with yasm-1.1

Signed-off-by: Anton Khirnov <anton@khirnov.net>
2013-07-02 10:15:09 +02:00
Michael Niedermayer
a6e46ed51a Revert "avutil/x86: disable ff_evaluate_lls_sse2() for 32bit"
This reverts commit 247425241c.
2013-07-01 02:27:47 +02:00
Michael Niedermayer
4e488ac5f5 Merge remote-tracking branch 'qatar/master'
* qatar/master:
  x86: lpc: fix a segfault in av_evaluate_lls_sse2()

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-07-01 02:26:22 +02:00
Loren Merritt
1221bb6239 x86: lpc: fix a segfault in av_evaluate_lls_sse2() 2013-06-30 23:11:19 +00:00
Michael Niedermayer
247425241c avutil/x86: disable ff_evaluate_lls_sse2() for 32bit
It just segfaults on 32bit, thus it's disabled until someone fixes it.

Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2013-06-30 19:03:57 +02:00
Michael Niedermayer
6e76e6a05a Merge commit 'b545179fdff1ccfbbb9d422e4e9720cb6c6d9191'
* commit 'b545179fdff1ccfbbb9d422e4e9720cb6c6d9191':
  x86: lpc: simd av_evaluate_lls

Conflicts:
	libavutil/x86/lls.asm

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-06-30 12:15:12 +02:00
Michael Niedermayer
a285079bc7 lls.asm: disable ff_update_lls_avx
The code doesn't build with yasm from Ubuntu 12.04

Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2013-06-30 12:12:11 +02:00
Michael Niedermayer
0b40c50508 lls.asm: put avx code under if HAVE_AVX_EXTERNAL
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2013-06-30 12:12:01 +02:00
Michael Niedermayer
78b5479633 Merge commit '502ab21af0ca68f76d6112722c46d2f35c004053'
* commit '502ab21af0ca68f76d6112722c46d2f35c004053':
  x86: lpc: simd av_update_lls

The versions are bumped due to changes in lls.h which is used across
libraries affecting intra library ABI
(This version bump also covers changes to lls.h in the immediately previous
 commits)

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-06-30 11:35:52 +02:00
Loren Merritt
b545179fdf x86: lpc: simd av_evaluate_lls
1.5x-1.8x faster on sandybridge

Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
2013-06-29 13:23:57 +02:00
Loren Merritt
502ab21af0 x86: lpc: simd av_update_lls
4x-6x faster on sandybridge

Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
2013-06-29 13:23:57 +02:00
Michael Niedermayer
3c200aa693 Merge commit '1fda184a85178cfd7b98d9e308d18e1ded76a511'
* commit '1fda184a85178cfd7b98d9e308d18e1ded76a511':
  avutil: Add av_cold attributes to init functions missing them

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-05-05 12:53:50 +02:00
Diego Biurrun
1fda184a85 avutil: Add av_cold attributes to init functions missing them 2013-05-04 22:48:05 +02:00
Michael Niedermayer
e91339cde2 Merge commit '566b7a20fd0cab44d344329538d314454a0bcc2f'
* commit '566b7a20fd0cab44d344329538d314454a0bcc2f':
  x86: float dsp: butterflies_float SSE

Conflicts:
	libavutil/x86/float_dsp.asm

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-05-03 11:57:59 +02:00
Christophe Gisquet
566b7a20fd x86: float dsp: butterflies_float SSE
97c -> 49c
Some codecs could benefit from more unrolling, but AAC doesn't.
2013-05-03 08:08:02 +02:00
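
For readers unfamiliar with the operation being optimized, a butterfly on two float vectors can be sketched in plain C as below. This assumes the usual in-place definition (the sum goes to the first vector, the difference to the second); the function name is hypothetical and the sketch says nothing about how the SSE version arranges its loads and stores.

```c
/*
 * In-place butterfly sketch: v1[i] becomes v1[i] + v2[i] and
 * v2[i] becomes the original v1[i] - v2[i].
 * Hypothetical name; a scalar illustration, not the SSE routine.
 */
static void sketch_butterflies_float(float *v1, float *v2, int len)
{
    for (int i = 0; i < len; i++) {
        float diff = v1[i] - v2[i]; /* compute before v1[i] is overwritten */
        v1[i] += v2[i];
        v2[i] = diff;
    }
}
```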
Michael Niedermayer
92218aad00 butterflies_float: replace 2 lea by 2 add
adds are simpler instructions and should be faster or equally fast
on all cpus

Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2013-04-17 00:10:06 +02:00
Christophe Gisquet
1a4007964c x86: float dsp: butterflies_float SSE
97c -> 49c
Some codecs could benefit from more unrolling, but AAC doesn't.

Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2013-04-17 00:03:25 +02:00
Ronald S. Bultje
b93b27edb0 dsputil: Make dsputil selectable
Signed-off-by: Martin Storsjö <martin@martin.st>
2013-04-10 11:04:05 +03:00
Christophe Gisquet
2e81acc687 x86inc: Fix number of operands for cmp* instructions
cmp{p,s}{s,d} instructions do take an imm8 operand.

Signed-off-by: Diego Biurrun <diego@biurrun.de>
2013-04-09 23:55:30 +02:00
Christophe Gisquet
0b467a6e83 x264asm: fix cmp* number of arguments
cmp{p,s}{s,d} instructions do take an imm8 operand.

Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2013-04-05 16:42:12 +02:00
Michael Niedermayer
63a97d5674 Merge commit 'b6649ab5037fb55f78c2606f3d23cea0867cdeaa'
* commit 'b6649ab5037fb55f78c2606f3d23cea0867cdeaa':
  cosmetics: Remove unnecessary extern keywords from function declarations

Conflicts:
	libswscale/x86/swscale.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-03-28 11:20:41 +01:00
Diego Biurrun
b6649ab503 cosmetics: Remove unnecessary extern keywords from function declarations 2013-03-27 14:21:45 +01:00
Ronald S. Bultje
6a701306db dsputil: make selectable.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2013-03-12 19:56:58 +01:00
Ronald S. Bultje
0c0828ecc5 x86: Use simple nop codes for <= sse (rather than <= mmx)
The "CentaurHauls family 6 model 9 stepping 8" family of CPUs
(flags: fpu vme de pse tsc msr cx8 sep mtrr pge mov pat mmx fxsr sse
up rng rng_en ace ace_en) SIGILLs on long nop codes.

Signed-off-by: Martin Storsjö <martin@martin.st>
2013-02-19 22:33:19 +02:00
James Almer
a56fd9edab lavu: Fix checkheaders for x86/emms.h
internal.h doesn't need to include cpu.h anymore since
the relevant code was moved to x86/emms.h

Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2013-02-17 00:18:16 +01:00
Michael Niedermayer
61fbb4cd57 Merge commit '4db96649ca700db563d9da4ebe70bf9fc4c7a6ba'
* commit '4db96649ca700db563d9da4ebe70bf9fc4c7a6ba':
  avutil: Ensure that emms_c is always defined, even on non-x86
  configure: Move MinGW CPPFLAGS setting to libc section, where it belongs
  avutil: Move emms code to x86-specific header

Conflicts:
	configure
	libavutil/internal.h

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-02-15 12:10:08 +01:00
Diego Biurrun
4db96649ca avutil: Ensure that emms_c is always defined, even on non-x86 2013-02-14 19:29:04 +01:00
Diego Biurrun
ab441e20ff avutil: Move emms code to x86-specific header 2013-02-14 17:37:34 +01:00
Ronald S. Bultje
b582af1ed7 Use simple nop codes for <= sse (rather than <= mmx).
The "CPU: CentaurHauls family 6 model 9 stepping 8" family of CPUs
(flags: fpu vme de pse tsc msr cx8 sep mtrr pge mov pat mmx fxsr sse
up rng rng_en ace ace_en) SIGILLs on long nop codes.

Change-Id: I7e7c52a2191006df30a9aadbc40d481a1db89106
2013-02-11 23:38:57 +01:00
Michael Niedermayer
8102f27b5b Merge commit '73b704ac609d83e0be124589f24efd9b94947cf9'
* commit '73b704ac609d83e0be124589f24efd9b94947cf9':
  arm: Add some missing header #includes
  floatdsp: move scalarproduct_float from dsputil to avfloatdsp.

Conflicts:
	libavcodec/acelp_pitch_delay.c
	libavcodec/amrnbdec.c
	libavcodec/amrwbdec.c
	libavcodec/ra288.c
	libavcodec/x86/dsputil_mmx.c
	libavutil/x86/float_dsp.asm

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-01-23 14:31:55 +01:00
Michael Niedermayer
6e6e170898 Merge commit '42d324694883cdf1fff1612ac70fa403692a1ad4'
* commit '42d324694883cdf1fff1612ac70fa403692a1ad4':
  floatdsp: move vector_fmul_reverse from dsputil to avfloatdsp.

Conflicts:
	libavcodec/arm/dsputil_init_vfp.c
	libavcodec/arm/dsputil_vfp.S
	libavcodec/dsputil.c
	libavcodec/ppc/float_altivec.c
	libavcodec/x86/dsputil.asm
	libavutil/x86/float_dsp.asm

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-01-23 14:04:50 +01:00
Michael Niedermayer
b1b870fbd7 Merge commit '55aa03b9f8f11ebb7535424cc0e5635558590f49'
* commit '55aa03b9f8f11ebb7535424cc0e5635558590f49':
  floatdsp: move vector_fmul_add from dsputil to avfloatdsp.

Conflicts:
	libavcodec/dsputil.c
	libavcodec/x86/dsputil.asm

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-01-23 13:54:34 +01:00
Ronald S. Bultje
42d3246948 floatdsp: move vector_fmul_reverse from dsputil to avfloatdsp.
Now, nellymoserenc and aacenc no longer depend on dsputil. Independent
of this patch, wmaprodec also does not depend on dsputil, so I removed
it from there also.
2013-01-22 11:55:42 -08:00
Ronald S. Bultje
55aa03b9f8 floatdsp: move vector_fmul_add from dsputil to avfloatdsp. 2013-01-22 11:55:42 -08:00
Ronald S. Bultje
d56668bd80 floatdsp: move scalarproduct_float from dsputil to avfloatdsp.
This makes the aac decoder and all voice codecs independent of dsputil.
2013-01-22 11:55:42 -08:00
Michael Niedermayer
ed8ff70d9e Merge remote-tracking branch 'qatar/master'
* qatar/master:
  x86: dsputil: Drop some unused macro definitions
  x86: Add a Yasm-based emms() replacement

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-01-19 13:20:25 +01:00
Michael Niedermayer
b45e0c2573 Merge commit 'd633d12b2cc999cee3ac25bf9a810fe7ff03726d'
* commit 'd633d12b2cc999cee3ac25bf9a810fe7ff03726d':
  x86inc: Add cvisible macro for C functions with public prefix

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-01-19 13:11:41 +01:00
Michael Niedermayer
1b03e09198 Merge commit 'ef5d41a5534b65f03d02f2e11a503ab8416bfc3b'
* commit 'ef5d41a5534b65f03d02f2e11a503ab8416bfc3b':
  x86inc: Rename "program_name" to "private_prefix"
  configure: Run SHFLAGS through ldflags_filter()

Conflicts:
	configure

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-01-19 13:01:06 +01:00
Martin Storsjö
f4facd2ce7 x86: Add a Yasm-based emms() replacement
This provides a fallback when building with Yasm enabled, but neither
inline assembly, nor the _mm_empty intrinsic are available or enabled.

Signed-off-by: Diego Biurrun <diego@biurrun.de>
2013-01-18 22:02:13 +01:00
Diego Biurrun
d633d12b2c x86inc: Add cvisible macro for C functions with public prefix
This allows defining externally visible library symbols.

Signed-off-by: Diego Biurrun <diego@biurrun.de>
2013-01-18 22:02:03 +01:00
Diego Biurrun
ef5d41a553 x86inc: Rename "program_name" to "private_prefix"
The new name is more descriptive and will allow defining a separate
public prefix for externally visible library symbols.

Signed-off-by: Diego Biurrun <diego@biurrun.de>
2013-01-18 20:29:53 +01:00
Michael Niedermayer
17596198ca Merge commit '80ac87c13dc8c6c063e26a464c5c542357c0583f'
* commit '80ac87c13dc8c6c063e26a464c5c542357c0583f':
  lavc: support ZenoXVID custom tag
  libcdio: support recent cdio-paranoia
  float_dsp: Add #ifdef HAVE_INLINE_ASM around vector_fmul_window
  theora: Skip zero-sized headers

Conflicts:
	configure

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-01-18 13:36:39 +01:00
Martin Storsjö
973b4d44f1 float_dsp: Add #ifdef HAVE_INLINE_ASM around vector_fmul_window
This fixes builds on 64bit MSVC.

Signed-off-by: Martin Storsjö <martin@martin.st>
2013-01-17 19:07:35 +02:00
Michael Niedermayer
5c7e9e16c9 Merge remote-tracking branch 'qatar/master'
* qatar/master:
  lavc: Move vector_fmul_window to AVFloatDSPContext
  rtpdec_mpeg4: Check the remaining amount of data before reading

Conflicts:
	libavcodec/dsputil.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-01-16 12:38:41 +01:00
Michael Niedermayer
68f92a70f1 Merge commit 'dae1d507af94261bafd3b11549884e5d1eca590e'
* commit 'dae1d507af94261bafd3b11549884e5d1eca590e':
  x86: Add PAVGB macro to abstract pavgb/pavgusb instruction via cpuflags
  vf_fps: add final flushed frames to the dropped frame count
  rv34_parser: Adjust #if for disabling individual parsers

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-01-16 11:44:45 +01:00
Justin Ruggles
e034cc6c60 lavc: Move vector_fmul_window to AVFloatDSPContext
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
2013-01-16 10:45:45 +01:00
Diego Biurrun
dae1d507af x86: Add PAVGB macro to abstract pavgb/pavgusb instruction via cpuflags 2013-01-15 17:29:43 +01:00
Michael Niedermayer
b7ede94bbd Merge remote-tracking branch 'qatar/master'
* qatar/master:
  x86: ABSB2: port to cpuflags

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-01-15 16:16:18 +01:00
Michael Niedermayer
77041e2474 Merge commit '094a7405e5d8463d7d167d893e04934ec1a84ecd'
* commit '094a7405e5d8463d7d167d893e04934ec1a84ecd':
  x86: ABSB: port to cpuflags
  sdp: Include SRTP crypto params if using the srtp protocol

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-01-15 16:12:24 +01:00
Michael Niedermayer
cfc40a6aff Merge commit 'd8c772de53d29afb1bada88afa859fce8489c668'
* commit 'd8c772de53d29afb1bada88afa859fce8489c668':
  nutdec: Always return a value from nut_read_timestamp()
  configure: Make warnings from -Wreturn-type fatal errors
  x86: ABS2: port to cpuflags
  vdpau: Remove av_unused attribute from function declaration
  h264: fix ff_generate_sliding_window_mmcos() prototype.

Conflicts:
	configure
	libavformat/nutdec.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-01-15 15:23:20 +01:00
Diego Biurrun
320e1d0df3 x86: ABSB2: port to cpuflags 2013-01-15 11:18:51 +01:00
Diego Biurrun
094a7405e5 x86: ABSB: port to cpuflags 2013-01-15 11:18:51 +01:00
Diego Biurrun
51969a652c x86: ABS2: port to cpuflags 2013-01-14 21:56:55 +01:00
Michael Niedermayer
ea93ccf079 Merge commit '5b4dfbffc258f90a7d2540d21209ac23afcf7cd0'
* commit '5b4dfbffc258f90a7d2540d21209ac23afcf7cd0':
  x86: ABS1: port to cpuflags
  v210x: cosmetics, reformat

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-01-07 01:35:18 +01:00
Diego Biurrun
5b4dfbffc2 x86: ABS1: port to cpuflags 2013-01-06 13:57:01 +01:00
Michael Niedermayer
7e90053822 Merge remote-tracking branch 'qatar/master'
* qatar/master:
  mpegvideo: increase edge_emu_buffer size for VC1
  lavc: merge latest x86inc.asm fixes with x264

Conflicts:
	libavcodec/mpegvideo.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-12-20 02:51:35 +01:00
Ronald S. Bultje
a34d9ad969 lavc: merge latest x86inc.asm fixes with x264
Unbreak NASM support.

Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
2012-12-19 07:27:33 +01:00
Michael Niedermayer
a01fe55077 Merge commit 'c0dc57f1264dad1e121772d03abdb9e14ed8857f'
* commit 'c0dc57f1264dad1e121772d03abdb9e14ed8857f':
  asyncts: merge two conditions
  x86inc: fully concatenate tokens to fix macro expansion for nasm
  h264: initialize frame-mt context copies properly

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-12-14 15:43:46 +01:00
Janne Grunau
0995ad8db4 x86inc: fully concatenate tokens to fix macro expansion for nasm
Fixes build errors with nasm introduced in 6f40e9f070 for stack
memory alignment. Noticed by BugMaster.
2012-12-13 23:57:09 +01:00
Michael Niedermayer
7897919a88 Merge remote-tracking branch 'qatar/master'
* qatar/master:
  aacdec: Fix an off-by-one overwrite when switching to LTP profile from MAIN.
  x86inc: fix stack alignment on win64
  rtpproto: Remove unused defines

Conflicts:
	libavcodec/aacdec.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-12-13 12:23:48 +01:00
Ronald S. Bultje
140367aff9 x86inc: fix stack alignment on win64
Signed-off-by: Martin Storsjö <martin@martin.st>
2012-12-12 21:30:49 +02:00
Ronald S. Bultje
ce58642ed0 x86inc: support stack mem allocation and re-alignment in PROLOGUE.
Use this in VP8/H264-8bit loopfilter functions so they can be used if
there is no aligned stack (e.g. MSVC 32bit or ICC 10.x).

Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2012-12-12 10:37:52 +01:00
Ronald S. Bultje
6f40e9f070 x86inc: support stack mem allocation and re-alignment in PROLOGUE
Use this in VP8/H264-8bit loopfilter functions so they can be used if
there is no aligned stack (e.g. MSVC 32bit or ICC 10.x).

Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
2012-12-12 05:23:46 +01:00
Michael Niedermayer
5c076205a6 Merge remote-tracking branch 'qatar/master'
* qatar/master:
  golomb: use unsigned arithmetics in svq3_get_ue_golomb()
  x86: float_dsp: fix loading of the len parameter on x86-32
  takdec: fix initialisation of LOCAL_ALIGNED array
  takdec: fix initialisation of LOCAL_ALIGNED array

Conflicts:
	libavcodec/rv30.c
	libavcodec/svq3.c
	libavcodec/takdec.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-12-08 16:36:47 +01:00
Justin Ruggles
1c012e6bfb x86: float_dsp: fix loading of the len parameter on x86-32 2012-12-07 21:19:29 -05:00
Michael Niedermayer
af164d7d9f Merge commit 'c25fc5c2bb6ae8c93541c9427df3e47206d95152'
* commit 'c25fc5c2bb6ae8c93541c9427df3e47206d95152':
  fate: dpcm: Add dependencies
  SBR DSP x86: implement SSE sbr_hf_gen
  AAC SBR: use AVFloatDSPContext's vector_fmul
  fate: image: Add dependencies
  Changelog: add an entry for deprecating the avconv -vol option
  x86: float_dsp: fix compilation of ff_vector_dmul_scalar_avx() on x86-32

Conflicts:
	Changelog
	libavutil/x86/float_dsp.asm
	tests/fate/image.mak

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-12-07 15:21:41 +01:00
Michael Niedermayer
54a71f2e6c Merge commit 'b519298a1578e0c895d53d4b4ed8867b1c031a56'
* commit 'b519298a1578e0c895d53d4b4ed8867b1c031a56':
  pixdesc: fix yuva 10bit bit depth
  avconv: deprecate the -vol option
  x86: af_volume: add SSE2/SSSE3/AVX-optimized s32 volume scaling
  x86: af_volume: add SSE2-optimized s16 volume scaling

Conflicts:
	ffmpeg.c
	tests/ref/lavfi/pixdesc
	tests/ref/lavfi/pixfmts_copy
	tests/ref/lavfi/pixfmts_null
	tests/ref/lavfi/pixfmts_scale
	tests/ref/lavfi/pixfmts_vflip

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-12-06 15:55:47 +01:00
Michael Niedermayer
15784c2bab Merge commit '9d5c62ba5b586c80af508b5914934b1c439f6652'
* commit '9d5c62ba5b586c80af508b5914934b1c439f6652':
  lavu/opt: do not filter out the initial sign character except for flags
  eval: treat dB as decibels instead of decibytes
  float_dsp: add vector_dmul_scalar() to multiply a vector of doubles

Conflicts:
	libavutil/eval.c
	tests/ref/fate/eval

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-12-06 14:33:38 +01:00
Justin Ruggles
ecc8b02194 x86: float_dsp: fix compilation of ff_vector_dmul_scalar_avx() on x86-32
Signed-off-by: Janne Grunau <janne-libav@jannau.net>
2012-12-06 14:11:15 +01:00
Justin Ruggles
b30a363331 x86: af_volume: add SSE2/SSSE3/AVX-optimized s32 volume scaling
2012-12-05 11:23:37 -05:00
Justin Ruggles
ac7eb4cb20 float_dsp: add vector_dmul_scalar() to multiply a vector of doubles
Include x86-optimized versions for SSE2 and AVX.
2012-12-05 11:23:36 -05:00
Michael Niedermayer
42d3fea65f Merge commit 'af7d13ee4a4bf8d708f9b0598abb8f6e22b76de1'
* commit 'af7d13ee4a4bf8d708f9b0598abb8f6e22b76de1':
  asink_nullsink: plug a memory leak.
  x86: h264_idct: port to cpuflags
  x86: cpu: Drop unused HAVE_RWEFLAGS condition

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-11-28 13:32:17 +01:00
Diego Biurrun
490df522c7 x86: cpu: Drop unused HAVE_RWEFLAGS condition
The test for rweflags was dropped in a previous commit.
2012-11-28 00:28:09 +01:00
Michael Niedermayer
b4d4e51027 Merge commit '3c370f5abc55739a261534b9f9bdc739cedbbbb9'
* commit '3c370f5abc55739a261534b9f9bdc739cedbbbb9':
  riff: only warn on a bad INFO chunk code size instead of failing
  configure: Add separate list for libraries and use where appropriate
  x86: float_dsp: add SSE version of vector_fmul_scalar()

Conflicts:
	configure
	libavformat/riff.c
	libavutil/x86/float_dsp.asm

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-11-27 14:10:05 +01:00
Justin Ruggles
947f933687 x86: float_dsp: add SSE version of vector_fmul_scalar()
2012-11-26 11:30:19 -05:00
Michael Niedermayer
e6d81ce22e Merge remote-tracking branch 'qatar/master'
* qatar/master:
  x86: h264_intrapred: Fix C function names in comments
  x86: SPLATD: port to cpuflags

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-11-19 14:24:20 +01:00
Diego Biurrun
87af05c575 x86: SPLATD: port to cpuflags
2012-11-18 18:34:05 +01:00
Michael Niedermayer
a1b5c9634e Merge remote-tracking branch 'qatar/master'
* qatar/master:
  x86: mmx2 ---> mmxext in asm constructs

Conflicts:
	libavcodec/x86/h264_chromamc_10bit.asm
	libavcodec/x86/h264_deblock.asm
	libavcodec/x86/h264dsp_init.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-11-14 12:34:30 +01:00
Diego Biurrun
26301caaa1 x86: mmx2 ---> mmxext in asm constructs
2012-11-14 00:58:51 +01:00
Michael Niedermayer
da501ea857 Merge commit '802713c4e7b41bc2deed754d78649945c3442063'
* commit '802713c4e7b41bc2deed754d78649945c3442063':
  mss2: prevent potential uninitialized reads
  mss2: reindent after last commit
  mss2: fix handling of unmasked implicit WMV9 rectangles
  configure: add lavu dependency to lavr/lavfi .pc files
  x86inc: Set program_name outside of x86inc.asm

Conflicts:
	configure

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-11-12 10:57:06 +01:00
Diego Biurrun
2b479bcab0 build: Drop AVX assembly ifdefs
An assembler able to cope with AVX instructions is now required.
2012-11-11 20:43:28 +01:00
Diego Biurrun
f0d124f005 x86inc: Set program_name outside of x86inc.asm
This reduces the local difference to the x264 upstream version.
2012-11-11 11:06:19 +01:00
Michael Niedermayer
2ce64413e2 Merge remote-tracking branch 'qatar/master'
* qatar/master:
  x86: PALIGNR: port to cpuflags
  x86: h264_qpel_10bit: port to cpuflags

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-11-10 12:44:39 +01:00
Diego Biurrun
4b60fac419 x86: PALIGNR: port to cpuflags
2012-11-09 21:31:31 +01:00
Michael Niedermayer
e859339e7a Merge commit '930e26a3ea9d223e04bac4cdde13697cec770031'
* commit '930e26a3ea9d223e04bac4cdde13697cec770031':
  x86: h264qpel: Only define mmxext QPEL functions if H264QPEL is enabled
  x86: PABSW: port to cpuflags
  x86: vc1dsp: port to cpuflags
  rtmp: Use av_strlcat instead of strncat

Conflicts:
	libavcodec/x86/h264_qpel.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-11-05 22:36:05 +01:00
Diego Biurrun
dbb37e7711 x86: PABSW: port to cpuflags
2012-11-05 14:51:10 +01:00
Michael Niedermayer
37e81996dc Merge commit '9221efef7968463f3e3d9ce79ea72eaca082e73f'
* commit '9221efef7968463f3e3d9ce79ea72eaca082e73f':
  lavf: fix av_interleaved_write_frame() doxy.
  lavf: clarify the lifetime of demuxed packets.
  avconv: do not free muxed packet on streamcopy.
  crc: move doxy to the header
  vf_drawtext: do not use deprecated av_tree_node_size
  x86: Refactor PSWAPD fallback implementations and port to cpuflags

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-11-03 14:24:11 +01:00
Michael Niedermayer
1885ffb03d Merge commit '9a07c1332cfe092b57b5758f22b686ca58806c60'
* commit '9a07c1332cfe092b57b5758f22b686ca58806c60':
  parser: Move Doxygen documentation to the header files
  PGS subtitles: Expose forced flag
  x86: PMINUB: port to cpuflags

Conflicts:
	libavcodec/avcodec.h
	libavcodec/pgssubdec.c
	libavcodec/version.h
	libavcodec/x86/ac3dsp.asm

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-11-03 14:13:45 +01:00