Zuxy Meng
82eb4b0f1b
3DNow! & Extended 3DNow! versions of FFT
...
Patch by Zuxy Meng, zuxy <<dot>> meng >>at<< gmail <<dot>> com
Minor non-functional diff-related fixes by me.
Originally committed as revision 5125 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-03-08 04:13:55 +00:00
Loren Merritt
548a1c8a35
h264_idct8_add_mmx
...
Originally committed as revision 5123 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-03-07 22:45:56 +00:00
Loren Merritt
6da971f160
h264_idct_add only needs mmx1
...
Originally committed as revision 5122 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-03-07 22:33:32 +00:00
Zuxy Meng
2ffb22d2ad
use xorps instead of mulps to toggle the sign of a float, as suggested by Software Optimization Guide for AMD64 Processors.
...
Patch by Zuxy Meng < zuxy POIS meng AH gmail POIS com > OKed by Michael
Original thread:
Date: Mar 5, 2006 8:15 PM
Subject: [Ffmpeg-devel] [PATCH] Little optimization to fft_sse.c
Originally committed as revision 5112 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-03-05 20:25:18 +00:00
Loren Merritt
d84f7c61ee
gcc2.95 workaround
...
Originally committed as revision 5111 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-03-05 19:02:35 +00:00
Loren Merritt
7a5b2fa812
remove some useless instructions
...
Originally committed as revision 5109 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-03-04 19:56:01 +00:00
Loren Merritt
6a8eb0f45a
4% faster h264_qpel_mc
...
Originally committed as revision 5094 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-03-02 08:21:08 +00:00
Loren Merritt
ef9d1d1575
h264: special case dc-only idct. ~1% faster overall
...
Originally committed as revision 4971 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-02-10 06:55:25 +00:00
Loren Merritt
4e295993ba
10l in 1.12
...
Originally committed as revision 4965 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-02-09 02:43:23 +00:00
Loren Merritt
6ee669732d
10l (x86_64)
...
Originally committed as revision 4952 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-02-07 16:10:48 +00:00
Loren Merritt
e545f37527
18% faster put_h264_qpel16_mc[13]2_mmx2
...
Originally committed as revision 4951 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-02-07 10:52:25 +00:00
Loren Merritt
c03ce51dfb
11% faster put_h264_qpel16_v_lowpass_mmx2
...
Originally committed as revision 4950 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-02-07 07:35:03 +00:00
Loren Merritt
0331f09237
15% faster put_h264_qpel16_hv_lowpass_mmx2
...
Originally committed as revision 4949 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-02-07 06:25:14 +00:00
Steve L'Homme
68b51e58ce
MSVC-compatible __align8/__align16 declaration
...
patch by Steve Lhomme, steve .dot. lhomme .at. free .dot. fr
Originally committed as revision 4942 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-02-05 13:35:17 +00:00
Diego Biurrun
5509bffa88
Update licensing information: The FSF changed postal address.
...
Originally committed as revision 4842 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-01-12 22:43:26 +00:00
Loren Merritt
e8b562087d
tweak h264_biweight
...
Originally committed as revision 4835 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-01-09 03:38:37 +00:00
Loren Merritt
cec9395977
fix some potential arithmetic overflows in pred_direct_motion() and ff_h264_weight_WxH_mmx2().
...
Originally committed as revision 4795 to svn://svn.ffmpeg.org/ffmpeg/trunk
2005-12-30 23:47:41 +00:00
Diego Biurrun
bb270c0896
COSMETICS: tabs --> spaces, some prettyprinting
...
Originally committed as revision 4764 to svn://svn.ffmpeg.org/ffmpeg/trunk
2005-12-22 01:10:11 +00:00
Diego Biurrun
115329f160
COSMETICS: Remove all trailing whitespace.
...
Originally committed as revision 4749 to svn://svn.ffmpeg.org/ffmpeg/trunk
2005-12-17 18:14:38 +00:00
Guillaume Poirier
f6d1338cb5
Add the rest of missing Reg_* macros to support both AMD-64 style regs and IA32 regs.
...
Not used yet, but should be once the SIMD code to accelerate Snow decoding is merged.
Originally committed as revision 4731 to svn://svn.ffmpeg.org/ffmpeg/trunk
2005-12-10 22:53:44 +00:00
Loren Merritt
ea15df8048
use sse16_sse2() in nsse
...
Originally committed as revision 4688 to svn://svn.ffmpeg.org/ffmpeg/trunk
2005-11-12 05:23:25 +00:00
Loren Merritt
a6624e21cb
faster h264_chroma_mc8_mmx, added h264_chroma_mc4_mmx.
...
2-4% overall speedup.
Originally committed as revision 4666 to svn://svn.ffmpeg.org/ffmpeg/trunk
2005-10-27 06:45:29 +00:00
Loren Merritt
b926572aa9
h264 mmx weighted prediction. up to 3% overall speedup.
...
Originally committed as revision 4630 to svn://svn.ffmpeg.org/ffmpeg/trunk
2005-10-09 23:38:52 +00:00
Loren Merritt
5693c08356
sse2 16x16 sum squared diff (306=>268 cycles on a K8)
...
faster 8x8 mmx ssd (77=>70 cycles)
Originally committed as revision 4623 to svn://svn.ffmpeg.org/ffmpeg/trunk
2005-09-30 02:31:47 +00:00
Michael Niedermayer
12e9668119
replace a few mov + psrlq with pshufw, there are more cases which could benefit from this but they would require us to duplicate some functions ...
...
the trick is from various places (my own code in libpostproc, a patch on the x264 list, ...)
Originally committed as revision 4608 to svn://svn.ffmpeg.org/ffmpeg/trunk
2005-09-21 21:17:09 +00:00
Reimar Döffinger
cd7af76d9e
Fix compile without CONFIG_GPL, misplaced #endif caused a missing }.
...
Originally committed as revision 4575 to svn://svn.ffmpeg.org/ffmpeg/trunk
2005-09-10 19:30:40 +00:00
Michael Niedermayer
9f211bc6d7
remove unused table entries
...
change non portable table access
Originally committed as revision 4574 to svn://svn.ffmpeg.org/ffmpeg/trunk
2005-09-10 19:03:37 +00:00
Michael Niedermayer
84740d5980
xvids mmx&mmx2 idcts
...
needed to decode xvid without some minor artefacts
under #ifdef CONFIG_GPL of course
Originally committed as revision 4572 to svn://svn.ffmpeg.org/ffmpeg/trunk
2005-09-10 17:01:30 +00:00
Måns Rullgård
79396ac685
Kill some compiler warnings. Compiled code verified identical after changes.
...
Originally committed as revision 4567 to svn://svn.ffmpeg.org/ffmpeg/trunk
2005-09-06 21:25:35 +00:00
Michael Niedermayer
d3a9f79871
simplify (d&a) and (d&~a) calculation, hint by skal
...
Originally committed as revision 4552 to svn://svn.ffmpeg.org/ffmpeg/trunk
2005-09-03 09:17:30 +00:00
Michael Niedermayer
b5b65df7a9
add consts (this was in my local tree, dunno where it came from, probably forgotten from some const patch)
...
Originally committed as revision 4551 to svn://svn.ffmpeg.org/ffmpeg/trunk
2005-09-02 21:13:19 +00:00
Måns Rullgård
bf4e3bd2d0
kill a bunch of compiler warnings
...
Originally committed as revision 4522 to svn://svn.ffmpeg.org/ffmpeg/trunk
2005-08-14 15:42:40 +00:00
Alexander Strasser
c11c2bc20b
libavutil: Utility code from libavcodec moved to a separate library.
...
Originally committed as revision 4489 to svn://svn.ffmpeg.org/ffmpeg/trunk
2005-08-01 20:07:05 +00:00
Loren Merritt
d2bb7db135
sort H.264 mmx dsp functions into their own file
...
Originally committed as revision 4338 to svn://svn.ffmpeg.org/ffmpeg/trunk
2005-06-02 20:45:35 +00:00
Michael Niedermayer
c26ae41db2
adding a few const
...
Originally committed as revision 4337 to svn://svn.ffmpeg.org/ffmpeg/trunk
2005-06-01 21:19:00 +00:00
Michael Niedermayer
435b0720a8
100l for myself (breaking amd64)
...
Originally committed as revision 4336 to svn://svn.ffmpeg.org/ffmpeg/trunk
2005-06-01 18:04:01 +00:00
Michael Niedermayer
6510f43cf3
merge a few asm blocks so gcc cant unoptimize it (658->631 dezicycles on duron)
...
Originally committed as revision 4334 to svn://svn.ffmpeg.org/ffmpeg/trunk
2005-06-01 11:56:58 +00:00
Michael Niedermayer
987ae784e6
get rid of 2 movq (680 -> 658 dezicycles on duron)
...
Originally committed as revision 4333 to svn://svn.ffmpeg.org/ffmpeg/trunk
2005-06-01 11:36:32 +00:00
Michael Niedermayer
e4b36d4434
avoid one transpose (730->680 dezicycles on duron)
...
Originally committed as revision 4332 to svn://svn.ffmpeg.org/ffmpeg/trunk
2005-06-01 08:43:40 +00:00
Loren Merritt
85bbfcd4ee
10l (symbol mangling)
...
Originally committed as revision 4331 to svn://svn.ffmpeg.org/ffmpeg/trunk
2005-06-01 04:51:46 +00:00
Michael Niedermayer
1f3dbc09b1
add rounding bias before the horizontal idct (765->730 dezicycles on duron)
...
Originally committed as revision 4330 to svn://svn.ffmpeg.org/ffmpeg/trunk
2005-06-01 01:18:41 +00:00
Loren Merritt
1d62fc8560
MMX for H.264 iDCT (adapted from x264)
...
Originally committed as revision 4329 to svn://svn.ffmpeg.org/ffmpeg/trunk
2005-05-31 22:48:33 +00:00
Zoltán Hidvégi
3072f0cb2e
MMX code for (put|avg)_h264_chroma_mc8
...
Originally committed as revision 4305 to svn://svn.ffmpeg.org/ffmpeg/trunk
2005-05-25 08:05:41 +00:00
Loren Merritt
5cf08f2393
H.264 deblocking optimizations (mmx for chroma_bS4 case, convert existing cases to 8-bit math)
...
Originally committed as revision 4271 to svn://svn.ffmpeg.org/ffmpeg/trunk
2005-05-18 09:17:22 +00:00
Michael Niedermayer
5773a74669
porting the mmx&sse2 (sse2 untested) vp3 idcts to the lavc idct API
...
Originally committed as revision 4260 to svn://svn.ffmpeg.org/ffmpeg/trunk
2005-05-17 18:28:40 +00:00
Michael Niedermayer
b178f758fa
disabling vp3 mmx&mmx2 idcts, they must be ported over to the lavc idct API, I'll port the vp3 C idct
...
Originally committed as revision 4255 to svn://svn.ffmpeg.org/ffmpeg/trunk
2005-05-17 09:11:48 +00:00
Michael Niedermayer
c998bdd9a0
fix PIC
...
Originally committed as revision 4204 to svn://svn.ffmpeg.org/ffmpeg/trunk
2005-05-08 18:48:19 +00:00
Loren Merritt
42251a2a4f
MMX for H.264 deblocking filter
...
Originally committed as revision 4158 to svn://svn.ffmpeg.org/ffmpeg/trunk
2005-04-25 01:01:41 +00:00
Michael Niedermayer
4e492bf107
read 32bit instead of 64bit to avoid overreading and misalignments
...
Originally committed as revision 4133 to svn://svn.ffmpeg.org/ffmpeg/trunk
2005-04-17 13:30:45 +00:00
Martin Drab
4d9ae03b09
optimization and gcc 4.0 bug workaround patch by (Martin Drab >drab kepler.fjfi.cvut cz<)
...
Originally committed as revision 3945 to svn://svn.ffmpeg.org/ffmpeg/trunk
2005-02-07 17:09:48 +00:00