Michael Niedermayer
|
516c213f08
|
avcodec/x86/vp9dsp_init_16bpp: Fix linking to missing ff_vp9_ipred_dr_32x32_16_avx2() on 32bit
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
|
2017-06-28 00:31:33 +02:00 |
|
Ilia Valiakhmetov
|
35a5d9715d
|
avcodec/vp9: add 64-bit ipred_dr_32x32_16 avx2 implementation
vp9_diag_downright_32x32_12bpp_c: 429.7
vp9_diag_downright_32x32_12bpp_sse2: 158.9
vp9_diag_downright_32x32_12bpp_ssse3: 144.6
vp9_diag_downright_32x32_12bpp_avx: 141.0
vp9_diag_downright_32x32_12bpp_avx2: 73.8
Almost 50% faster than the avx implementation (73.8 vs. 141.0)
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
|
2017-06-27 16:10:50 -04:00 |
|
Diego Biurrun
|
fd502f4f5f
|
build: Generalize yasm/nasm-related variable names
None of them are specific to the YASM assembler.
(Cherry-picked from libav commit 39e208f4d4)
Signed-off-by: James Almer <jamrial@gmail.com>
|
2017-06-21 17:00:29 -03:00 |
|
Ilia Valiakhmetov
|
81fc617c12
|
avcodec/vp9: ipred_dr_16x16_16 avx2 implementation
Signed-off-by: Ilia Valiakhmetov <zakne0ne@gmail.com>
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
|
2017-06-12 12:40:58 -04:00 |
|
Ilia Valiakhmetov
|
73d9a9a6af
|
libavcodec/vp9: ipred_dl_32x32_16 avx2 implementation
vp9_diag_downleft_32x32_8bpp_c: 580.2
vp9_diag_downleft_32x32_8bpp_sse2: 75.6
vp9_diag_downleft_32x32_8bpp_ssse3: 73.7
vp9_diag_downleft_32x32_8bpp_avx: 72.7
vp9_diag_downleft_32x32_10bpp_c: 1101.2
vp9_diag_downleft_32x32_10bpp_sse2: 145.4
vp9_diag_downleft_32x32_10bpp_ssse3: 137.5
vp9_diag_downleft_32x32_10bpp_avx: 134.8
vp9_diag_downleft_32x32_10bpp_avx2: 94.0
vp9_diag_downleft_32x32_12bpp_c: 1108.5
vp9_diag_downleft_32x32_12bpp_sse2: 145.5
vp9_diag_downleft_32x32_12bpp_ssse3: 137.3
vp9_diag_downleft_32x32_12bpp_avx: 135.2
vp9_diag_downleft_32x32_12bpp_avx2: 94.0
~30% faster than the avx implementation at 10 and 12 bpp (94.0 vs. 134.8 and 135.2)
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
|
2017-06-06 08:05:03 -04:00 |
|
Ronald S. Bultje
|
f8c019944d
|
vp9: re-split the decoder/format/dsp interface header files.
The advantage here is that the internal software decoder interface is
not exposed to the DSP functions or the hardware accelerations.
|
2017-03-28 18:04:26 -04:00 |
|
Clément Bœsch
|
1c9f4b5078
|
lavc/vp9: split into vp9{block,data,mvs}
This is following Libav layout to ease merges.
|
2017-03-27 21:38:21 +02:00 |
|
Ilia
|
2f3d10a01a
|
avcodec/vp9: avx2 implementation of ipred_dl_16x16_16
vp9_diag_downleft_16x16_10bpp_c: 263.0
vp9_diag_downleft_16x16_10bpp_sse2: 44.7
vp9_diag_downleft_16x16_10bpp_ssse3: 32.5
vp9_diag_downleft_16x16_10bpp_avx: 31.9
vp9_diag_downleft_16x16_10bpp_avx2: 25.7
vp9_diag_downleft_16x16_12bpp_c: 264.7
vp9_diag_downleft_16x16_12bpp_sse2: 44.4
vp9_diag_downleft_16x16_12bpp_ssse3: 32.0
vp9_diag_downleft_16x16_12bpp_avx: 32.4
vp9_diag_downleft_16x16_12bpp_avx2: 25.5
Benchmarked with 10000 runs
Signed-off-by: Ilia <zakne0ne@gmail.com>
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
|
2017-03-20 09:47:43 -04:00 |
|
James Almer
|
70d685a77f
|
x86: use the new helper macros where useful
Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: James Almer <jamrial@gmail.com>
|
2016-02-14 20:00:21 -03:00 |
|
Ronald S. Bultje
|
061b67fb50
|
vp9: 10/12bpp SIMD (sse2/ssse3/avx) for directional intra prediction.
|
2015-10-03 14:42:39 -04:00 |
|
Ronald S. Bultje
|
26ece7a511
|
vp9: 16bpp tm/dc/h/v intra pred simd (mostly sse2) functions.
|
2015-10-03 14:42:39 -04:00 |
|
Ronald S. Bultje
|
344d519040
|
vp9: add subpel MC SIMD for 10/12bpp.
|
2015-09-16 21:11:34 -04:00 |
|
Ronald S. Bultje
|
77f359670f
|
vp9: add fullpel (avg) MC SIMD for 10/12bpp.
|
2015-09-16 21:11:34 -04:00 |
|
Ronald S. Bultje
|
6354ff0383
|
vp9: add fullpel (put) MC SIMD for 10/12bpp.
|
2015-09-16 21:11:34 -04:00 |
|