Janne Grunau
dfe224f377
aarch64: get_cabac inline asm
...
Based on the x86 branchless get_cabac asm. get_cabac_noinline() is
approximately 20% faster (no cycle counts available) than the code
generated by clang from Xcode 5.1 beta5. The overall speedup is more
than 6%. Part of it might be explained by additional inlining of
get_cabac().
2014-03-09 00:45:33 +01:00
Janne Grunau
9c029f67ca
aarch64: use EXTERN_ASM consistently for exported symbols
...
Based on e3fec3f095 for arm.
2014-02-20 15:24:35 +01:00
Janne Grunau
fe96769bed
aarch64: port neon clobber test from arm
2014-01-15 12:31:07 +01:00
Janne Grunau
f896bca03f
aarch64: h264 (bi)weight NEON optimizations
...
Ported from ARMv7 NEON.
2014-01-15 12:31:07 +01:00
Janne Grunau
36e3b1f2fd
aarch64: h264 loop filter NEON optimizations
...
Ported from ARMv7 NEON.
2014-01-15 12:31:04 +01:00
Janne Grunau
c65d67ef50
aarch64: hpeldsp NEON optimizations
...
Ported from ARMv7 NEON.
2014-01-15 12:30:24 +01:00
Janne Grunau
d5dd8c7bf0
aarch64: h264 qpel NEON optimizations
...
Ported from ARMv7 NEON.
2014-01-15 12:17:49 +01:00
Janne Grunau
8438b3f09f
aarch64: h264 idct NEON assembler optimizations
...
Ported from ARMv7 NEON.
2014-01-15 12:13:41 +01:00
Janne Grunau
71617884a2
aarch64: h264 chroma motion compensation NEON optimizations
...
RV40 and VC-1 use almost the same algorithm, so optimizations for
those two decoders are easy to do and are included.
2014-01-15 12:07:18 +01:00