While pshufb allows emulating bswap on XMM registers for SSSE3, more
shuffling is needed for SSE2. Alignment is critical, so specific codepaths
are provided for this case.
For the huffyuv sequence "angels_480-huffyuvcompress.avi":
C (using bswap instruction): ~ 55k cycles
SSE2: ~ 40k cycles
SSSE3 using unaligned loads: ~ 35k cycles
SSSE3 using aligned loads: ~ 30k cycles
Signed-off-by: Diego Biurrun <diego@biurrun.de>
The functions are already av_ prefixed and intfloat header is already provided.
Install libavutil/intfloat.h
Signed-off-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: Anton Khirnov <anton@khirnov.net>
-vbsf doesn't exist anymore. It got renamed to -bsf somewhere along the
line. Update print statement accordingly.
Signed-off-by: Anton Khirnov <anton@khirnov.net>
Current demuxer recognizes several colorspace formats that begin with 'C420'
but does not yet recognize plain 'C420'. GStreamer's y4menc component
generates .y4m files with a 'C420' colorspace. This new comparison is
placed after the other 'C420' checks so that it doesn't interfere with
them.
Signed-off-by: Anton Khirnov <anton@khirnov.net>
* qatar/master:
png: add missing #if HAVE_SSSE3 around function pointer assignment.
imdct36: mark SSE functions as using all 16 XMM registers.
png: move DSP functions to their own DSP context.
sunrast: Add a sample request for TIFF, IFF, and Experimental Rastfile formats.
sunrast: Cosmetics
sunrast: Remove if (unsigned int < 0) check.
sunrast: Replace magic number by a macro.
Conflicts:
libavcodec/dsputil.c
libavcodec/dsputil.h
libavcodec/pngdec.c
libavcodec/sunrast.c
libavcodec/x86/Makefile
libavcodec/x86/dsputil_mmx.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
Offsets are relative to the end of the header, not the
start of the buffer, thus the buffer size needs to be subtracted.
Signed-off-by: Reimar Döffinger <Reimar.Doeffinger@gmx.de>
Codec is too simple to gain much from it at lower resolutions,
but should help at very high resolutions, particularly for
v3 and v5 where a not too optimized pseudo-YUV to RGB
is done in the codec.
Signed-off-by: Reimar Döffinger <Reimar.Doeffinger@gmx.de>
This reverts e6e7bfc1 and 365e1ec2.
The code may be incorrect both before and after the revert, but we
do not have any samples that were fixed by the original commits.
Fixes ticket #871.
With gcc 4.6 this part of the code is ca. 4x faster, resulting
in an overall speedup of around 5% for fate-fraps-v5 sample.
Signed-off-by: Reimar Döffinger <Reimar.Doeffinger@gmx.de>
On x86-64, it indeed uses all 16 registers (and on x86-32, this gets
clipped to 8). Not marking it properly causes callers of this function
to fail randomly because of XMM register clobbering.
Note: This fixes the following GCC warning :-
libavcodec/sunrast.c:94: warning: comparison of unsigned expression < 0 is always false.
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
Codec has only I- and skip-frames, so there is no
need for reget_buffer, change it so it works with
get_buffer.
Signed-off-by: Reimar Döffinger <Reimar.Doeffinger@gmx.de>
Causes FFmpeg to pass through the correct pts values,
instead of clobbering all to AV_NOPTS_VALUE (the av_init_packet
default) to then make up new ones based on only fps when muxing.
Included are also the related FATE ref changes, which all
some reasonable on quick investigation.
Also set all H.264 references to us -vsync drop to reduce the
diff for the ref files.
Otherwise almost all H.264 references need to change, mostly due
to now starting with negative pts values.
About 20 additional H.264 conformance tests needed -vsync
drop anyway because they create pts values that are out of
order and thus not possible to mux otherwise.
Signed-off-by: Reimar Döffinger <Reimar.Doeffinger@gmx.de>
* qatar/master:
aacenc: Fix LONG_START windowing.
aacenc: Fix a bug where deinterleaved samples were stored in the wrong place.
avplay: use the correct array size for stride.
lavc: extend doxy for avcodec_alloc_context3().
APIchanges: mention avcodec_alloc_context()/2/3
avcodec_align_dimensions2: set only 4 linesizes, not AV_NUM_DATA_POINTERS.
aacsbr: ARM NEON optimised sbrdsp functions
aacsbr: align some arrays
aacsbr: move some simdable loops to function pointers
cosmetics: Remove extra newlines at EOF
Conflicts:
libavcodec/utils.c
libavfilter/formats.c
libavutil/mem.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
Move libgsm_encode_close before its first use and call it
with the correct number of arguments.
Signed-off-by: Reimar Döffinger <Reimar.Doeffinger@gmx.de>