mirror of https://github.com/FFmpeg/FFmpeg.git synced 2024-12-23 12:43:46 +02:00
Commit Graph

9 Commits

Author SHA1 Message Date
Ronald S. Bultje
0e80265b0a vp9: refactor 10/12bpp dc-only code in 4x4/8x8 and add to 16x16. 2015-10-13 11:06:00 -04:00
Ronald S. Bultje
1338fb79d4 vp9: add 10/12bpp sse2 SIMD version for idct_idct_16x16. 2015-10-13 11:06:00 -04:00
Ronald S. Bultje
cb054d061a vp9: add 10/12bpp sse2 SIMD versions of iadst8x8. 2015-10-13 11:05:59 -04:00
Ronald S. Bultje
e0610787b2 vp9: add 10/12bpp sse2 SIMD for idct_idct_8x8. 2015-10-13 11:05:59 -04:00
Ronald S. Bultje
a35f6bdb38 vp9: add 12bpp sse2 versions of iadst4. 2015-10-13 11:05:59 -04:00
Ronald S. Bultje
235e76aeb8 vp9: initial attempt at an idct_idct_4x4 12bpp x86 simd (sse2) impl.
The trouble with this function is that intermediates overflow 31+sign
bits, so I've added some helpers (that will also be used in 10/12bpp
8x8, 16x16 and 32x32) to make that easier, basically emulating a
half-assed pmaddqd using 2xpmaddwd. It's currently sse2-only; if anyone
sees potential in adding ssse3, I'd love to hear it.
2015-10-13 11:05:58 -04:00
Ronald S. Bultje
f76423d097 vp9: add x86 simd (sse2/ssse3) for iadst4 10bpp functions. 2015-10-13 11:05:58 -04:00
Ronald S. Bultje
6b579cf547 vp9: add 10bpp simd (mmxext/ssse3) for idct_idct_4x4. 2015-10-13 11:05:58 -04:00
Ronald S. Bultje
1c3be32533 vp9: add 10/12bpp mmxext-optimized iwht_iwht_4x4 function. 2015-10-13 11:05:57 -04:00
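
The overflow workaround described in 235e76aeb8 can be illustrated with a scalar model. This is only a sketch under stated assumptions, not the assembly from the commit: it assumes the transform constants are 14-bit fixed-point values (such as 11585), that products are rounded and shifted right by 14, and that the wide coefficient is split at bit 14. The names mul14_ref, mul14_split, asr32 and asr64 are hypothetical. pmaddwd multiplies signed 16-bit words and sums adjacent pairs into 32-bit lanes, so each half of the split fits one such multiply.

```c
#include <assert.h>
#include <stdint.h>

/* Portable arithmetic (floor) right shift for signed values. */
static int32_t asr32(int32_t x, int n) { return x >= 0 ? x >> n : ~((~x) >> n); }
static int64_t asr64(int64_t x, int n) { return x >= 0 ? x >> n : ~((~x) >> n); }

/* Reference: full 64-bit product, then round and shift by 14. */
static int32_t mul14_ref(int32_t x, int16_t c)
{
    return (int32_t)asr64((int64_t)x * c + 8192, 14);
}

/*
 * Split version using only 32-bit arithmetic (assumed split point: bit 14).
 * With x = hi * 2^14 + lo and 0 <= lo < 2^14:
 *   (x*c + 8192) >> 14  ==  hi*c + ((lo*c + 8192) >> 14)
 * because hi*c*2^14 is an exact multiple of 2^14.
 */
static int32_t mul14_split(int32_t x, int16_t c)
{
    int32_t hi = asr32(x, 14);   /* signed upper part */
    int32_t lo = x & 0x3fff;     /* non-negative low 14 bits (two's complement) */

    /* hi must fit a signed 16-bit word to be usable as a pmaddwd operand. */
    assert(hi >= INT16_MIN && hi <= INT16_MAX);

    return hi * c + asr32(lo * c + 8192, 14);
}
```

Both partial products stay within 32 bits (|hi*c| is below 2^30 and |lo*c| is below 2^29), whereas the direct product x*c would not for inputs such as x = 262143, c = 15137.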