In high bit depth the pixels will not be stored in uint8_t like in the
normal case, but in uint16_t. The pixel size is thus 1 in normal bit
depth and 2 in high bit depth.
Preparatory patch for high bit depth h264 decoding support.
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
Passing an explicit filename to this command is only necessary if the
documentation in the @file block refers to a file different from the
one the block resides in.
Originally committed as revision 22921 to svn://svn.ffmpeg.org/ffmpeg/trunk
This moves the H264-specific functions from DSPContext to the new
H264DSPContext. The code is made conditional on CONFIG_H264DSP
which is set by the codecs requiring it.
The qpel and chroma MC functions are not moved as these are used by
non-h264 code.
Originally committed as revision 22565 to svn://svn.ffmpeg.org/ffmpeg/trunk
These macros are redundant. All uses are replaced with the generic
DECLARE_ALIGNED macro instead.
Originally committed as revision 22233 to svn://svn.ffmpeg.org/ffmpeg/trunk
This eliminates all aliasing violation warnings in h264 code.
No measurable speed difference with gcc-4.4.3 on i7.
Originally committed as revision 21881 to svn://svn.ffmpeg.org/ffmpeg/trunk
1% faster MBAFF decoding overall, maybe ~0.1% faster for the cathedral sample.
Originally committed as revision 21507 to svn://svn.ffmpeg.org/ffmpeg/trunk
This is a hair slower (0.15% maybe) but i really dont want to have the
identical code duplicated 3 times because gcc adds odd threaded jumps with
register reshuffling and register safe/restore.
Originally committed as revision 21502 to svn://svn.ffmpeg.org/ffmpeg/trunk
This prevents the issue with having to switch between slow and
fast code paths in each row.
0.5% faster loopfilter for cathedral
Originally committed as revision 21495 to svn://svn.ffmpeg.org/ffmpeg/trunk
This allows many things to be simplified away.
h264 decoder is overall 1% faster with a mbaff sample and
0.1% slower with the cathedral sample, probably because the slow loop
filter code must be loaded into the code cache for each first MB of each
row but isnt used for the following MBs.
Originally committed as revision 21493 to svn://svn.ffmpeg.org/ffmpeg/trunk
This should fix a segfault, also it might be faster on systems where the
+52 wasnt free.
Originally committed as revision 21406 to svn://svn.ffmpeg.org/ffmpeg/trunk
This is faster for videos that have lots of MBs that fall in this category.
Originally committed as revision 21400 to svn://svn.ffmpeg.org/ffmpeg/trunk
Change order of operands as gcc uses a hardcoded register per operand it seems
even for static functions
thus reducing unneeded moved (now functions try to pass the same argument in
the same spot).
Change signed int to unsigned int for array indexes as signed requires signed
extension while unsigned is free.
move the +52 up and merge it where it will end as a lea instruction, gcc always
splits the 52 out there turning the free +52 into an expensive one otherwise.
The changed code becomes a little faster.
Originally committed as revision 21375 to svn://svn.ffmpeg.org/ffmpeg/trunk