Michael Niedermayer
1f17619fe4
Merge commit 'bbe4a6db44f0b55b424a5cc9d3e89cd88e250450'
...
* commit 'bbe4a6db44f0b55b424a5cc9d3e89cd88e250450':
x86inc: Utilize the shadow space on 64-bit Windows
Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-10-08 11:23:00 +02:00
Michael Niedermayer
17d9c7c208
Merge commit '3fb78e99a04d0ed8db834d813d933eb86c37142a'
...
* commit '3fb78e99a04d0ed8db834d813d933eb86c37142a':
x86inc: create xm# and ym#, analagous to m#
Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-10-08 11:15:17 +02:00
Michael Niedermayer
3352fdb292
Merge commit '49ebe3f9fe02174ae7e14548001fd146ed375cc2'
...
* commit '49ebe3f9fe02174ae7e14548001fd146ed375cc2':
x86inc: fix some corner cases of SWAP
Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-10-08 11:07:03 +02:00
Michael Niedermayer
006c0fcfea
Merge commit '63f0d623100bdb0c6081456127f4b6713e83d3db'
...
* commit '63f0d623100bdb0c6081456127f4b6713e83d3db':
x86inc: Use SSE instead of SSE2 for copying data
Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-10-08 11:01:40 +02:00
Michael Niedermayer
faafffaf82
Merge commit 'ad76e6e7e193b98e7335156422d35467816f9ef1'
...
* commit 'ad76e6e7e193b98e7335156422d35467816f9ef1':
x86inc: Set ELF hidden visibility for global constants
Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-10-08 10:52:51 +02:00
Michael Niedermayer
c1488fab3d
Merge commit '25cb0c1a1e66edacc1667acf6818f524c0997f10'
...
* commit '25cb0c1a1e66edacc1667acf6818f524c0997f10':
x86inc: activate REP_RET automatically
Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-10-08 10:27:30 +02:00
Henrik Gramner
bbe4a6db44
x86inc: Utilize the shadow space on 64-bit Windows
...
Store XMM6 and XMM7 in the shadow space in functions that
clobber them. This way we don't have to adjust the stack
pointer as often, reducing the number of instructions as
well as code size.
Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
2013-10-07 06:25:35 -04:00
Loren Merritt
3fb78e99a0
x86inc: create xm# and ym#, analagous to m#
...
For when we want to mix simd sizes within one function.
Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
2013-10-07 06:25:19 -04:00
Loren Merritt
49ebe3f9fe
x86inc: fix some corner cases of SWAP
...
SWAP with >=3 named (rather than numbered) args
PERMUTE followed by SWAP with 2 named args
used to produce the wrong permutation
Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
2013-10-07 06:25:06 -04:00
Henrik Gramner
63f0d62310
x86inc: Use SSE instead of SSE2 for copying data
...
Reduces code size because movaps/movups is one byte
shorter than movdqa/movdqu.
Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
2013-10-07 06:24:33 -04:00
Henrik Gramner
ad76e6e7e1
x86inc: Set ELF hidden visibility for global constants
...
Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
2013-10-07 06:24:13 -04:00
Loren Merritt
25cb0c1a1e
x86inc: activate REP_RET automatically
...
Now RET checks whether it immediately follows a branch, so the
programmer doesn't have to keep track of that condition. REP_RET
is still needed manually when it's a branch target, but that's
much rarer.
The implementation involves lots of spurious labels, but that's OK
because we strip them.
Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
2013-10-07 06:17:59 -04:00
Ronald S. Bultje
c07ac8d467
VP9 MC (ssse3) optimizations.
...
Decoding time of ped1080p.webm goes from 20.7sec to 11.3sec.
2013-10-02 21:03:15 -04:00
Michael Niedermayer
361bc70731
Merge remote-tracking branch 'qatar/master'
...
* qatar/master:
avutil: Fix compilation with inline asm disabled on mingw
Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-09-22 11:51:38 +02:00
Alex Smith
08fa828b3f
avutil: Fix compilation with inline asm disabled on mingw
...
Because of -Werror=implicit-function-declaration the build will fail.
Signed-off-by: Martin Storsjö <martin@martin.st>
2013-09-22 00:50:32 +03:00
Thilo Borgmann
d814a839ac
Reinstate proper FFmpeg license for all files.
2013-08-30 15:47:38 +00:00
Michael Niedermayer
f0a3562382
Merge commit '79aec43ce813a3e270743ca64fa3f31fa43df80b'
...
* commit '79aec43ce813a3e270743ca64fa3f31fa43df80b':
x86: Add and use more convenience macros to check CPU extension availability
Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-08-30 11:57:35 +02:00
Michael Niedermayer
2a60666d1d
Merge commit '8410d6e93c2e074881f1c7b7e4cdefd2e497d52e'
...
* commit '8410d6e93c2e074881f1c7b7e4cdefd2e497d52e':
avutil: Refactor CPU extension availability macros
Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-08-29 14:15:10 +02:00
Michael Niedermayer
c83d794936
Merge commit 'b78b10c4b78b696927f2801cf2d9f193b4eff28b'
...
* commit 'b78b10c4b78b696927f2801cf2d9f193b4eff28b':
avutil: Move internal CPU detection function declarations to private header
Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-08-29 14:05:15 +02:00
Diego Biurrun
79aec43ce8
x86: Add and use more convenience macros to check CPU extension availability
2013-08-29 13:07:37 +02:00
Diego Biurrun
8410d6e93c
avutil: Refactor CPU extension availability macros
2013-08-28 23:54:14 +02:00
Diego Biurrun
b78b10c4b7
avutil: Move internal CPU detection function declarations to private header
2013-08-28 23:54:14 +02:00
Michael Niedermayer
9d01bf7d66
Merge remote-tracking branch 'qatar/master'
...
* qatar/master:
Consistently use "cpu_flags" as variable/parameter name for CPU flags
Conflicts:
libavcodec/x86/dsputil_init.c
libavcodec/x86/h264dsp_init.c
libavcodec/x86/hpeldsp_init.c
libavcodec/x86/motion_est.c
libavcodec/x86/mpegvideo.c
libavcodec/x86/proresdsp_init.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-07-18 09:53:47 +02:00
Diego Biurrun
3ac7fa81b2
Consistently use "cpu_flags" as variable/parameter name for CPU flags
2013-07-18 00:31:35 +02:00
Michael Niedermayer
a478e99a60
avutil/x86: reenable ff_update_lls_avx()
...
The bug has been fixed in c8b920a9b7 by Loren Merritt
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2013-07-02 12:02:08 +02:00
Michael Niedermayer
d1fa671895
Merge commit 'c8b920a9b7fa534a6141695ace4e8c2dfcd56cee'
...
* commit 'c8b920a9b7fa534a6141695ace4e8c2dfcd56cee':
lls/x86: use 3-operator vaddpd in ADDPD_MEM
Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-07-02 11:40:44 +02:00
Loren Merritt
c8b920a9b7
lls/x86: use 3-operator vaddpd in ADDPD_MEM
...
Fixes build with yasm-1.1
Signed-off-by: Anton Khirnov <anton@khirnov.net>
2013-07-02 10:15:09 +02:00
Michael Niedermayer
a6e46ed51a
Revert "avutil/x86: disable ff_evaluate_lls_sse2() for 32bit"
...
This reverts commit 247425241c.
2013-07-01 02:27:47 +02:00
Michael Niedermayer
4e488ac5f5
Merge remote-tracking branch 'qatar/master'
...
* qatar/master:
x86: lpc: fix a segfault in av_evaluate_lls_sse2()
Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-07-01 02:26:22 +02:00
Loren Merritt
1221bb6239
x86: lpc: fix a segfault in av_evaluate_lls_sse2()
2013-06-30 23:11:19 +00:00
Michael Niedermayer
247425241c
avutil/x86: disable ff_evaluate_lls_sse2() for 32bit
...
It just segfaults on 32bit, thus it's disabled until someone fixes it.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2013-06-30 19:03:57 +02:00
Michael Niedermayer
6e76e6a05a
Merge commit 'b545179fdff1ccfbbb9d422e4e9720cb6c6d9191'
...
* commit 'b545179fdff1ccfbbb9d422e4e9720cb6c6d9191':
x86: lpc: simd av_evaluate_lls
Conflicts:
libavutil/x86/lls.asm
Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-06-30 12:15:12 +02:00
Michael Niedermayer
a285079bc7
lls.asm: disable ff_update_lls_avx
...
The code doesn't build with yasm from ubuntu 12.04
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2013-06-30 12:12:11 +02:00
Michael Niedermayer
0b40c50508
lls.asm: put avx code under if HAVE_AVX_EXTERNAL
...
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2013-06-30 12:12:01 +02:00
Michael Niedermayer
78b5479633
Merge commit '502ab21af0ca68f76d6112722c46d2f35c004053'
...
* commit '502ab21af0ca68f76d6112722c46d2f35c004053':
x86: lpc: simd av_update_lls
The versions are bumped due to changes in lls.h, which is used across
libraries, affecting the intra-library ABI
(This version bump also covers changes to lls.h in the immediately previous
commits)
Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-06-30 11:35:52 +02:00
Loren Merritt
b545179fdf
x86: lpc: simd av_evaluate_lls
...
1.5x-1.8x faster on sandybridge
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
2013-06-29 13:23:57 +02:00
Loren Merritt
502ab21af0
x86: lpc: simd av_update_lls
...
4x-6x faster on sandybridge
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
2013-06-29 13:23:57 +02:00
Michael Niedermayer
3c200aa693
Merge commit '1fda184a85178cfd7b98d9e308d18e1ded76a511'
...
* commit '1fda184a85178cfd7b98d9e308d18e1ded76a511':
avutil: Add av_cold attributes to init functions missing them
Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-05-05 12:53:50 +02:00
Diego Biurrun
1fda184a85
avutil: Add av_cold attributes to init functions missing them
2013-05-04 22:48:05 +02:00
Michael Niedermayer
e91339cde2
Merge commit '566b7a20fd0cab44d344329538d314454a0bcc2f'
...
* commit '566b7a20fd0cab44d344329538d314454a0bcc2f':
x86: float dsp: butterflies_float SSE
Conflicts:
libavutil/x86/float_dsp.asm
Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-05-03 11:57:59 +02:00
Christophe Gisquet
566b7a20fd
x86: float dsp: butterflies_float SSE
...
97c -> 49c
Some codecs could benefit from more unrolling, but AAC doesn't.
2013-05-03 08:08:02 +02:00
Michael Niedermayer
92218aad00
butterflies_float: replace 2 lea by 2 add
...
adds are simpler instructions and should be faster or equally fast
on all cpus
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2013-04-17 00:10:06 +02:00
Christophe Gisquet
1a4007964c
x86: float dsp: butterflies_float SSE
...
97c -> 49c
Some codecs could benefit from more unrolling, but AAC doesn't.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2013-04-17 00:03:25 +02:00
Ronald S. Bultje
b93b27edb0
dsputil: Make dsputil selectable
...
Signed-off-by: Martin Storsjö <martin@martin.st>
2013-04-10 11:04:05 +03:00
Christophe Gisquet
2e81acc687
x86inc: Fix number of operands for cmp* instructions
...
cmp{p,s}{s,d} instructions do take an imm8 operand.
Signed-off-by: Diego Biurrun <diego@biurrun.de>
2013-04-09 23:55:30 +02:00
Christophe Gisquet
0b467a6e83
x264asm: fix cmp* number of arguments
...
cmp{p,s}{s,d} instructions do take an imm8 operand.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2013-04-05 16:42:12 +02:00
Michael Niedermayer
63a97d5674
Merge commit 'b6649ab5037fb55f78c2606f3d23cea0867cdeaa'
...
* commit 'b6649ab5037fb55f78c2606f3d23cea0867cdeaa':
cosmetics: Remove unnecessary extern keywords from function declarations
Conflicts:
libswscale/x86/swscale.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-03-28 11:20:41 +01:00
Diego Biurrun
b6649ab503
cosmetics: Remove unnecessary extern keywords from function declarations
2013-03-27 14:21:45 +01:00
Ronald S. Bultje
6a701306db
dsputil: make selectable.
...
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2013-03-12 19:56:58 +01:00
Ronald S. Bultje
0c0828ecc5
x86: Use simple nop codes for <= sse (rather than <= mmx)
...
The "CentaurHauls family 6 model 9 stepping 8" family of CPUs
(flags: fpu vme de pse tsc msr cx8 sep mtrr pge mov pat mmx fxsr sse
up rng rng_en ace ace_en) SIGILLs on long nop codes.
Signed-off-by: Martin Storsjö <martin@martin.st>
2013-02-19 22:33:19 +02:00