Mans Rullgard
0b711ca3f3
dsputil: drop non-compliant "fast" qpel mc functions
...
Signed-off-by: Diego Biurrun <diego@biurrun.de>
2013-01-20 14:50:42 +01:00
Ronald S. Bultje
fef906c77c
Move vorbis_inverse_coupling from dsputil to vorbisdspcontext.
...
Conveniently (together with Justin's earlier patches), this makes
our vorbis decoder entirely independent of dsputil.
2013-01-19 22:21:10 -08:00
Ronald S. Bultje
aeaf268e52
vp3: integrate clear_blocks with idct of previous block.
...
This is identical to what e.g. vp8 does, and avoids the function call
overhead (plus the dependency on dsputil for this particular function).
Arm asm updated by Janne Grunau <janne-libav@jannau.net>.
Signed-off-by: Janne Grunau <janne-libav@jannau.net>
2013-01-19 22:04:55 -08:00
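The pattern described in the vp3 commit above can be sketched as follows. This is a minimal illustration, not the libav code: idct_dc_add_and_clear() is a hypothetical name, and the transform is reduced to a DC-only add so the example stays short. The point is only that the coefficient block is zeroed by the same function that consumes it, so the caller needs no separate clear_blocks() call.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Clamp an int to the 0..255 pixel range. */
static uint8_t clip_uint8(int v)
{
    return v < 0 ? 0 : v > 255 ? 255 : (uint8_t)v;
}

/* Hypothetical DC-only stand-in for a real 8x8 IDCT-and-add.
 * After the coefficients have been used, the block is cleared here,
 * so the next block can be decoded into it without a separate call. */
static void idct_dc_add_and_clear(uint8_t *dst, ptrdiff_t stride, int16_t *block)
{
    int dc = (block[0] + 15) >> 5;   /* assumed rounding, for illustration only */

    for (int y = 0; y < 8; y++) {
        for (int x = 0; x < 8; x++)
            dst[x] = clip_uint8(dst[x] + dc);
        dst += stride;
    }

    /* Integrated clear: replaces a separate clear_blocks() call. */
    memset(block, 0, 64 * sizeof(*block));
}
```

A caller would decode coefficients into block, call the function once, and reuse block for the next 8x8 region right away.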
Diego Biurrun
822b0728f0
x86: dsputil: Drop some unused macro definitions
2013-01-18 22:24:58 +01:00
Justin Ruggles
e034cc6c60
lavc: Move vector_fmul_window to AVFloatDSPContext
...
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
2013-01-16 10:45:45 +01:00
Diego Biurrun
dae1d507af
x86: Add PAVGB macro to abstract pavgb/pavgusb instruction via cpuflags
2013-01-15 17:29:43 +01:00
Diego Biurrun
51969a652c
x86: ABS2: port to cpuflags
2013-01-14 21:56:55 +01:00
Diego Biurrun
a0c5917f86
Drop Snow codec
...
Snow is a toy codec with no real-world use and horrible code.
2013-01-06 16:30:02 +01:00
Christophe Gisquet
4f50646697
x86: sbrdsp: Implement SSE qmf_post_shuffle
...
255 to 174 cycles on Arrandale / Win64. Unrolling yields no gain.
Signed-off-by: Diego Biurrun <diego@biurrun.de>
2013-01-06 13:57:01 +01:00
Christophe Gisquet
44a0036d10
x86: sbrdsp: Implement SSE sum64x5
...
698 to 174 cycles on Arrandale. Unrolling gains another 6 cycles.
Signed-off-by: Diego Biurrun <diego@biurrun.de>
2013-01-06 13:57:01 +01:00
Diego Biurrun
5b4dfbffc2
x86: ABS1: port to cpuflags
2013-01-06 13:57:01 +01:00
Ronald S. Bultje
8c53d39e7f
lavc: introduce VideoDSPContext
...
Move some functions from dsputil. The idea is that videodsp contains
functions that are useful for a large and varied set of video decoders.
Currently, it contains emulated_edge_mc() and prefetch().
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
2012-12-20 13:40:45 +01:00
Ronald S. Bultje
6f40e9f070
x86inc: support stack mem allocation and re-alignment in PROLOGUE
...
Use this in VP8/H264-8bit loopfilter functions so they can be used if
there is no aligned stack (e.g. MSVC 32bit or ICC 10.x).
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
2012-12-12 05:23:46 +01:00
Mans Rullgard
30b3916425
ac3dec: make downmix() take array of pointers to channel data
2012-12-09 15:52:01 +00:00
Christophe Gisquet
2aef3d66c9
SBR DSP x86: implement SSE sbr_hf_gen
...
The start and end indices are multiples of 2, which guarantees aligned access.
This also allows generating 4 floats per loop iteration while keeping the
alignment throughout.
Timing:
- 32 bits: 326c -> 172c
- 64 bits: 323c -> 156c
Signed-off-by: Diego Biurrun <diego@biurrun.de>
2012-12-07 11:04:26 +01:00
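The alignment argument in the sbr_hf_gen commit above can be checked with a little arithmetic. The sketch below assumes 4-byte floats, complex samples stored as float[2] pairs, and a buffer whose base address is 16-byte aligned. The helper names are hypothetical and are not part of libav.

```c
#include <stddef.h>

/* Byte offset of complex sample `idx` in an array of float[2] pairs. */
static size_t complex_offset(int idx)
{
    return (size_t)idx * 2 * sizeof(float);
}

/* An even index lands on a multiple of 16 bytes (assuming 4-byte floats),
 * which is what an aligned 128-bit SSE load or store requires. */
static int offset_is_16_aligned(int idx)
{
    return complex_offset(idx) % 16 == 0;
}

/* One SSE register holds 4 floats, i.e. 2 complex samples, so a range
 * with even start and end splits into whole iterations with no tail. */
static int simd_iterations(int start, int end)
{
    return (end - start) / 2;
}
```

With start = 4 and end = 12, every iteration begins at an even index, so each 4-float access stays aligned.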
Diego Biurrun
9b15c0a9b3
x86: dsputilenc: port to cpuflags
2012-11-28 16:05:44 +01:00
Diego Biurrun
89145fbbfe
x86: h264dsp: Fix linking with yasm and optimizations disabled
...
Some optimized functions reference optimized symbols, so the functions
must be explicitly disabled when those symbols are unavailable.
2012-11-28 14:45:28 +01:00
Diego Biurrun
2e89aeed65
x86: h264_idct: port to cpuflags
2012-11-28 00:28:09 +01:00
Diego Biurrun
28e1cf19aa
x86: h264_weight: port to cpuflags
2012-11-27 21:10:38 +01:00
Diego Biurrun
7ee4071362
x86: fix build without inline asm
...
The qpel functions referenced here are not related to h264 and should
thus never have been under CONFIG_H264QPEL.
Signed-off-by: Mans Rullgard <mans@mansr.com>
Signed-off-by: Diego Biurrun <diego@biurrun.de>
2012-11-26 01:50:47 +01:00
Justin Ruggles
2d3993ce8c
x86: h264 qpel: use the correct number of utilized xmm regs in cglobal
...
Fixes xmm register clobbering on win64.
2012-11-25 18:48:43 -05:00
Daniel Kang
610e00b359
x86: h264: Convert 8-bit QPEL inline assembly to YASM
...
Signed-off-by: Diego Biurrun <diego@biurrun.de>
2012-11-25 20:38:35 +01:00
Daniel Kang
ad01ba6cea
x86: h264: Remove 3dnow QPEL code
...
The only CPUs that have 3dnow and don't have mmxext are 12 years old.
Moreover, AMD has dropped 3dnow extensions from newer CPUs.
Signed-off-by: Diego Biurrun <diego@biurrun.de>
2012-11-25 20:32:55 +01:00
Diego Biurrun
28c8e288fa
x86: h264_chromamc: port to cpuflags
2012-11-25 17:25:10 +01:00
Diego Biurrun
89923fce70
x86: h264_intrapred: Fix C function names in comments
...
Function names changed after switching to declaration with
PRED4x4/8x8/8x8L/16x16 macros in the C code.
2012-11-18 18:34:05 +01:00
Diego Biurrun
87af05c575
x86: SPLATD: port to cpuflags
2012-11-18 18:34:05 +01:00
Diego Biurrun
8c3849bc76
x86: dsputil: port to cpuflags
2012-11-16 10:38:23 +01:00
Diego Biurrun
26301caaa1
x86: mmx2 ---> mmxext in asm constructs
2012-11-14 00:58:51 +01:00
Diego Biurrun
5e9c6ef8f3
x86: h264_weight_10bit: port to cpuflags
2012-11-13 19:07:09 +01:00
Diego Biurrun
2b479bcab0
build: Drop AVX assembly ifdefs
...
An assembler able to cope with AVX instructions is now required.
2012-11-11 20:43:28 +01:00
Diego Biurrun
6cd796049d
x86: h264_qpel_10bit: drop unused parameter from MC10/MC20/MC30 macros
2012-11-10 14:49:09 +01:00
Diego Biurrun
4b60fac419
x86: PALIGNR: port to cpuflags
2012-11-09 21:31:31 +01:00
Diego Biurrun
4d1f69f244
x86: h264_qpel_10bit: port to cpuflags
2012-11-09 21:17:05 +01:00
Diego Biurrun
6ca60d4ddd
x86: h264_intrapred: port to cpuflags
2012-11-08 18:05:23 +01:00
Diego Biurrun
930e26a3ea
x86: h264qpel: Only define mmxext QPEL functions if H264QPEL is enabled
...
This fixes compilation with --disable-everything and components enabled.
2012-11-05 20:48:43 +01:00
Diego Biurrun
dbb37e7711
x86: PABSW: port to cpuflags
2012-11-05 14:51:10 +01:00
Diego Biurrun
6c104826bd
x86: vc1dsp: port to cpuflags
2012-11-05 14:51:10 +01:00
Diego Biurrun
0a7a94f2e5
x86: Refactor PSWAPD fallback implementations and port to cpuflags
2012-11-02 17:05:29 +01:00
Diego Biurrun
26f01bd106
x86: PMINUB: port to cpuflags
2012-11-02 15:38:15 +01:00
Diego Biurrun
9ce02e14f0
x86: ac3dsp: port to cpuflags
2012-11-02 15:24:50 +01:00
Diego Biurrun
c37322e68c
x86: Move optimization suffix to end of function names
...
This simplifies cpuflags porting.
2012-10-31 18:21:55 +01:00
Diego Biurrun
fa8fcab1e0
x86: h264_chromamc_10bit: drop pointless PAVG %define
...
It is only used in one place so there is no need for the abstraction.
2012-10-31 18:21:55 +01:00
Diego Biurrun
d8eda37080
x86: mmx2 ---> mmxext in function names
2012-10-31 17:53:57 +01:00
Diego Biurrun
be2c456e96
x86: fmtconvert: Refactor cvtps2pi emulation through cpuflags
2012-10-31 01:05:03 +01:00
Diego Biurrun
be923ed659
x86: fmtconvert: port to cpuflags
2012-10-31 01:05:03 +01:00
Diego Biurrun
588fafe7f3
x86: MMX2 ---> MMXEXT in macro names
2012-10-31 01:04:55 +01:00
Diego Biurrun
652f518594
x86: mmx2 ---> mmxext in comments and messages
2012-10-31 00:37:42 +01:00
Diego Biurrun
04581c8c77
x86: yasm: Use complete source path for macro helper %includes
...
This is more consistent with the way we handle C #includes and
it simplifies the build system.
2012-10-31 00:37:42 +01:00
Diego Biurrun
6860b4081d
x86: include x86inc.asm in x86util.asm
...
This is necessary to allow refactoring some x86util macros with cpuflags.
2012-10-31 00:37:42 +01:00
Ronald S. Bultje
95c89da36e
Use ptrdiff_t instead of int for intra pred "stride" function parameter.
...
This way, SIMD-optimized functions don't have to sign-extend their
stride argument manually to be able to do pointer arithmetic.
2012-10-29 17:49:13 -07:00
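A small sketch of why a pointer-sized stride is convenient, as the intra pred commit above describes. The predictor pred4x4_dc_top() is a hypothetical example and not one of libav's functions. With a ptrdiff_t stride, expressions such as dst - stride and dst += stride are plain pointer arithmetic at full pointer width, so on 64-bit targets there is no 32-bit int argument to sign-extend first.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 4x4 predictor: fills the block with the rounded average
 * of the four pixels in the row directly above it. */
static void pred4x4_dc_top(uint8_t *dst, ptrdiff_t stride)
{
    const uint8_t *top = dst - stride;   /* row above the block */
    unsigned dc = (top[0] + top[1] + top[2] + top[3] + 2) >> 2;

    for (int y = 0; y < 4; y++) {
        for (int x = 0; x < 4; x++)
            dst[x] = (uint8_t)dc;
        dst += stride;                   /* stride may also be negative */
    }
}
```

A SIMD version with the same signature can use the stride register directly in address calculations.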