1
0
mirror of https://github.com/FFmpeg/FFmpeg.git synced 2024-12-18 03:19:31 +02:00
Commit Graph

14 Commits

Author SHA1 Message Date
Michael Niedermayer
516c213f08 avcodec/x86/vp9dsp_init_16bpp: Fix linking to missing ff_vp9_ipred_dr_32x32_16_avx2() on 32bit
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2017-06-28 00:31:33 +02:00
Ilia Valiakhmetov
35a5d9715d avcodec/vp9: add 64-bit ipred_dr_32x32_16 avx2 implementation
vp9_diag_downright_32x32_12bpp_c: 429.7
vp9_diag_downright_32x32_12bpp_sse2: 158.9
vp9_diag_downright_32x32_12bpp_ssse3: 144.6
vp9_diag_downright_32x32_12bpp_avx: 141.0
vp9_diag_downright_32x32_12bpp_avx2: 73.8

Almost 50% faster than avx implementation

Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
2017-06-27 16:10:50 -04:00
Diego Biurrun
fd502f4f5f build: Generalize yasm/nasm-related variable names
None of them are specific to the YASM assembler.

(Cherry-picked from libav commit 39e208f4d4)

Signed-off-by: James Almer <jamrial@gmail.com>
2017-06-21 17:00:29 -03:00
Ilia Valiakhmetov
81fc617c12 avcodec/vp9: ipred_dr_16x16_16 avx2 implementation
Signed-off-by: Ilia Valiakhmetov <zakne0ne@gmail.com>
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
2017-06-12 12:40:58 -04:00
Ilia Valiakhmetov
73d9a9a6af libavcodec/vp9: ipred_dl_32x32_16 avx2 implementation
vp9_diag_downleft_32x32_8bpp_c: 580.2
vp9_diag_downleft_32x32_8bpp_sse2: 75.6
vp9_diag_downleft_32x32_8bpp_ssse3: 73.7
vp9_diag_downleft_32x32_8bpp_avx: 72.7
vp9_diag_downleft_32x32_10bpp_c: 1101.2
vp9_diag_downleft_32x32_10bpp_sse2: 145.4
vp9_diag_downleft_32x32_10bpp_ssse3: 137.5
vp9_diag_downleft_32x32_10bpp_avx: 134.8
vp9_diag_downleft_32x32_10bpp_avx2: 94.0
vp9_diag_downleft_32x32_12bpp_c: 1108.5
vp9_diag_downleft_32x32_12bpp_sse2: 145.5
vp9_diag_downleft_32x32_12bpp_ssse3: 137.3
vp9_diag_downleft_32x32_12bpp_avx: 135.2
vp9_diag_downleft_32x32_12bpp_avx2: 94.0

~30% faster than avx implementation

Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
2017-06-06 08:05:03 -04:00
Ronald S. Bultje
f8c019944d vp9: re-split the decoder/format/dsp interface header files.
The advantage here is that the internal software decoder interface is
not exposed to the DSP functions or the hardware accelerations.
2017-03-28 18:04:26 -04:00
Clément Bœsch
1c9f4b5078 lavc/vp9: split into vp9{block,data,mvs}
This is following Libav layout to ease merges.
2017-03-27 21:38:21 +02:00
Ilia
2f3d10a01a avcodec/vp9: avx2 implementation of ipred_dl_16x16_16
vp9_diag_downleft_16x16_10bpp_c: 263.0
vp9_diag_downleft_16x16_10bpp_sse2: 44.7
vp9_diag_downleft_16x16_10bpp_ssse3: 32.5
vp9_diag_downleft_16x16_10bpp_avx: 31.9
vp9_diag_downleft_16x16_10bpp_avx2: 25.7
vp9_diag_downleft_16x16_12bpp_c: 264.7
vp9_diag_downleft_16x16_12bpp_sse2: 44.4
vp9_diag_downleft_16x16_12bpp_ssse3: 32.0
vp9_diag_downleft_16x16_12bpp_avx: 32.4
vp9_diag_downleft_16x16_12bpp_avx2: 25.5

Benchmarked with 10000 runs

Signed-off-by: Ilia <zakne0ne@gmail.com>
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
2017-03-20 09:47:43 -04:00
James Almer
70d685a77f x86: use the new helper macros where useful
Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: James Almer <jamrial@gmail.com>
2016-02-14 20:00:21 -03:00
Ronald S. Bultje
061b67fb50 vp9: 10/12bpp SIMD (sse2/ssse3/avx) for directional intra prediction. 2015-10-03 14:42:39 -04:00
Ronald S. Bultje
26ece7a511 vp9: 16bpp tm/dc/h/v intra pred simd (mostly sse2) functions. 2015-10-03 14:42:39 -04:00
Ronald S. Bultje
344d519040 vp9: add subpel MC SIMD for 10/12bpp. 2015-09-16 21:11:34 -04:00
Ronald S. Bultje
77f359670f vp9: add fullpel (avg) MC SIMD for 10/12bpp. 2015-09-16 21:11:34 -04:00
Ronald S. Bultje
6354ff0383 vp9: add fullpel (put) MC SIMD for 10/12bpp. 2015-09-16 21:11:34 -04:00