Only two functions that use xop multiply-accumulate instructions where the
first operand is the same as the fourth actually took advantage of the macros.
This further reduces differences with x264's x86inc.
Reviewed-by: Ronald S. Bultje <rsbultje@gmail.com>
Signed-off-by: James Almer <jamrial@gmail.com>
There's no benefit from using blendps here except on CPUs with AVX, where
it's faster than shufps according to Intel's documentation.
As such, rename the sse4 functions to sse/sse2 and use shufps instead.
Reviewed-by: Michael Niedermayer <michaelni@gmx.at>
Signed-off-by: James Almer <jamrial@gmail.com>
The swresample_ prefix is not for internal functions
Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
Linear interpolation goes from 63 (llvm) or 58 (gcc) to 48 (yasm)
cycles/sample on 64bit, or from 66 (llvm/gcc) to 52 (yasm) cycles/
sample on 32bit. Bon-linear goes from 43 (llvm) or 38 (gcc) to
32 (yasm) cycles/sample on 64bit, or from 46 (llvm) or 44 (gcc) to
38 (yasm) cycles/sample on 32bit (all testing on OSX 10.9.2, llvm
5.1 and gcc 4.8/9).
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
Should fix compilation failures with MSVC and any other compiler
without inline asm support.
Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
DSP bits of swri_resample go into their own mini-DSP functions; DSP
init goes from a per-call branch in multiple_resample to a proper
DSP init routine; x86 bits go into x86/; swri_resample() moves out of
resample_template.c into resample.c because it's independent of DSP
code or sample type; multiple_resample() is simplified.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
pshuf+paddd is slightly faster than phaddd.
The real gain is in pre-ssse3 processors like AMD K8 and K10, which get
a big boost in performance compared to the mmxext version
Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
These are not supported by all compilers (gcc 2.95 but also older SPARC
compilers, see gcc bug #33304 for example), and there is no real need for them.
One use of this feature remains in libavdevice/v4l2.c which can't be
replaced quite as easily.
Signed-off-by: Reimar Döffinger <Reimar.Doeffinger@gmx.de>
* commit '6860b4081d046558c44b1b42f22022ea341a2a73':
x86: include x86inc.asm in x86util.asm
cng: Reindent some incorrectly indented lines
cngdec: Allow flushing the decoder
cngdec: Make the dbov variable have the right unit
cngdec: Fix the memset size to cover the full array
cngdec: Update the LPC coefficients after averaging the reflection coefficients
configure: fix print_config() with broke awks
Conflicts:
libavcodec/x86/ac3dsp.asm
libavcodec/x86/dct32.asm
libavcodec/x86/deinterlace.asm
libavcodec/x86/dsputil.asm
libavcodec/x86/dsputilenc.asm
libavcodec/x86/fft.asm
libavcodec/x86/fmtconvert.asm
libavcodec/x86/h264_chromamc.asm
libavcodec/x86/h264_deblock.asm
libavcodec/x86/h264_deblock_10bit.asm
libavcodec/x86/h264_idct.asm
libavcodec/x86/h264_idct_10bit.asm
libavcodec/x86/h264_intrapred.asm
libavcodec/x86/h264_intrapred_10bit.asm
libavcodec/x86/h264_weight.asm
libavcodec/x86/vc1dsp.asm
libavcodec/x86/vp3dsp.asm
libavcodec/x86/vp56dsp.asm
libavcodec/x86/vp8dsp.asm
Merged-by: Michael Niedermayer <michaelni@gmx.at>
* qatar/master:
mpegvideo: reduce excessive inlining of mpeg_motion()
mpegvideo: convert mpegvideo_common.h to a .c file
build: factor out mpegvideo.o dependencies to CONFIG_MPEGVIDEO
Move MASK_ABS macro to libavcodec/mathops.h
x86: move MANGLE() and related macros to libavutil/x86/asm.h
x86: rename libavutil/x86_cpu.h to libavutil/x86/asm.h
aacdec: Don't fall back to the old output configuration when no old configuration is present.
rtmp: Add message tracking
rtsp: Support mpegts in raw udp packets
rtsp: Support receiving plain data over UDP without any RTP encapsulation
rtpdec: Remove an unused include
rtpenc: Remove an av_abort() that depends on user-supplied data
vsrc_movie: discourage its use with avconv.
avconv: allow no input files.
avconv: prevent invalid reads in transcode_init()
avconv: rename OutputStream.is_past_recording_time to finished.
Conflicts:
configure
doc/filters.texi
ffmpeg.c
ffmpeg.h
libavcodec/Makefile
libavcodec/aacdec.c
libavcodec/mpegvideo.c
libavformat/version.h
Merged-by: Michael Niedermayer <michaelni@gmx.at>