mirror of https://github.com/FFmpeg/FFmpeg.git synced 2025-11-23 21:54:53 +02:00
Andreas Rheinhardt ade54335b2 avcodec/x86/simple_idct: Port to SSE2
Before this commit, the (32-bit only) simple idct came in three
versions: a pure MMX IDCT, plus idct-put and idct-add versions
that use SSE2 for the put and add stages but still use pure MMX
for the actual IDCT.

This commit ports said IDCT to SSE2; this was entirely trivial
for the IDCT1-5 and IDCT7 parts (where one can directly use
the full register width) and was easy for IDCT6 and IDCT8
(involving a few movhps and pshufds). Unfortunately, DC_COND_INIT
and Z_COND_INIT still use only the lower half of the registers.

This saved 4658B here; the benchmarking option of the dct test tool
showed a 15% speedup.

Reviewed-by: Lynne <dev@lynne.ee>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2025-11-08 18:48:54 +01:00
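
For readers unfamiliar with the "put" and "add" stages named in the commit message, the following is a minimal scalar sketch in plain C of what such stages conventionally do to an 8x8 block once the IDCT itself has run. It is not the code from this commit, which is x86 assembly; the function names `clip_uint8`, `put_block_8x8` and `add_block_8x8` are hypothetical and chosen only for illustration, and 8-bit pixels are assumed.

```c
#include <stddef.h>
#include <stdint.h>

/* Clamp a signed value to the 0..255 range of an 8-bit pixel. */
static uint8_t clip_uint8(int v)
{
    return v < 0 ? 0 : v > 255 ? 255 : (uint8_t)v;
}

/* Hypothetical "put" stage: overwrite the destination pixels with the
 * clamped IDCT output (block holds 64 already-transformed values). */
static void put_block_8x8(uint8_t *dst, ptrdiff_t stride, const int16_t *block)
{
    for (int y = 0; y < 8; y++)
        for (int x = 0; x < 8; x++)
            dst[y * stride + x] = clip_uint8(block[y * 8 + x]);
}

/* Hypothetical "add" stage: add the IDCT output to the pixels already
 * in the destination (for example a prediction) and clamp the sum. */
static void add_block_8x8(uint8_t *dst, ptrdiff_t stride, const int16_t *block)
{
    for (int y = 0; y < 8; y++)
        for (int x = 0; x < 8; x++)
            dst[y * stride + x] = clip_uint8(dst[y * stride + x] + block[y * 8 + x]);
}
```

A SIMD version of these loops processes several pixels per instruction using packed saturating operations, which is why the put and add stages are natural candidates for wider registers.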