mirror of https://github.com/FFmpeg/FFmpeg.git synced 2024-12-23 12:43:46 +02:00
FFmpeg/libavutil/x86
Jun Zhao d36b8394f4 avutil/pixelutils: sad_32x32 sse2/avx2 optimizations.
Add ff_pixelutils_sad_32x32_sse2, ff_pixelutils_sad_{a,u}_32x32_sse2,
ff_pixelutils_sad_32x32_avx2, ff_pixelutils_sad_{a,u}_32x32_avx2.
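
For orientation, here is a minimal scalar sketch of what a 32x32 SAD (sum of absolute differences) computes. It is not the code from this commit or from libavutil: the names block_sad_ref and block_sad_32x32_ref are hypothetical, and the argument order (two source pointers, each with its own stride) is an assumption chosen for illustration.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical reference helper, not FFmpeg's implementation:
 * sums |src1[x] - src2[x]| over a w x h block of 8-bit pixels.
 * stride1/stride2 are the byte distances between consecutive rows. */
static int block_sad_ref(const uint8_t *src1, ptrdiff_t stride1,
                         const uint8_t *src2, ptrdiff_t stride2,
                         int w, int h)
{
    int sum = 0;

    for (int y = 0; y < h; y++) {
        for (int x = 0; x < w; x++)
            sum += abs(src1[x] - src2[x]);
        src1 += stride1;   /* advance both blocks to the next row */
        src2 += stride2;
    }
    return sum;
}

/* 32x32 case: at most 32 * 32 * 255 = 261120, which fits in an int. */
static int block_sad_32x32_ref(const uint8_t *src1, ptrdiff_t stride1,
                               const uint8_t *src2, ptrdiff_t stride2)
{
    return block_sad_ref(src1, stride1, src2, stride2, 32, 32);
}
```

A SIMD version computes the same sum but handles 16 (SSE2) or 32 (AVX2) bytes of a row per instruction; the scalar loop above is the kind of baseline such code is checked against.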

Profiling with perf record/report (event instructions:u) gives the following for the avx2 sad_32x32:

  72.05%  pixelutils  pixelutils     [.] block_sad_32x32_c
  18.50%  pixelutils  pixelutils     [.] block_sad_16x16_c
   4.78%  pixelutils  pixelutils     [.] block_sad_8x8_c
   2.69%  pixelutils  pixelutils     [.] block_sad_4x4_c
   0.89%  pixelutils  pixelutils     [.] block_sad_2x2_c
   0.16%  pixelutils  pixelutils     [.] ff_pixelutils_sad_32x32_avx2
   0.16%  pixelutils  pixelutils     [.] ff_pixelutils_sad_u_32x32_avx2
   0.12%  pixelutils  pixelutils     [.] ff_pixelutils_sad_a_32x32_avx2

The same instructions:u profile for the sse2 sad_32x32:

  71.86%  pixelutils  pixelutils     [.] block_sad_32x32_c
  18.42%  pixelutils  pixelutils     [.] block_sad_16x16_c
   4.81%  pixelutils  pixelutils     [.] block_sad_8x8_c
   2.68%  pixelutils  pixelutils     [.] block_sad_4x4_c
   0.88%  pixelutils  pixelutils     [.] block_sad_2x2_c
   0.29%  pixelutils  pixelutils     [.] ff_pixelutils_sad_32x32_sse2
   0.26%  pixelutils  pixelutils     [.] ff_pixelutils_sad_u_32x32_sse2
   0.23%  pixelutils  pixelutils     [.] ff_pixelutils_sad_a_32x32_sse2

Signed-off-by: Jun Zhao <mypopydev@gmail.com>
2018-07-31 19:17:51 +08:00
asm.h
bswap.h
cpu.c lavu/x86/cpu: Fix aesni detection 2018-07-19 20:17:44 +02:00
cpu.h Merge commit '4cf84e254ae75b524e1cacae499a97d7cc9e5906' 2018-02-11 23:08:48 -03:00
cpuid.asm
emms.asm
emms.h
fixed_dsp_init.c
fixed_dsp.asm
float_dsp_init.c
float_dsp.asm
imgutils_init.c
imgutils.asm
intmath.h Don't use _tzcnt intrinsics with clang for windows w/o BMI. 2017-10-25 21:50:37 +02:00
intreadwrite.h
lls_init.c
lls.asm
Makefile
pixelutils_init.c avutil/pixelutils: sad_32x32 sse2/avx2 optimizations. 2018-07-31 19:17:51 +08:00
pixelutils.asm avutil/pixelutils: sad_32x32 sse2/avx2 optimizations. 2018-07-31 19:17:51 +08:00
pixelutils.h
timer.h
w64xmmtest.h
x86inc.asm x86inc: Drop cpuflags_slowctz 2018-01-20 19:23:37 +01:00
x86util.asm avutil/x86util: add macro for loading a 128-bit constant into an xmm or into each part of a ymm, in order to simplify avx2 asm functions 2017-12-02 18:25:15 +01:00