Martin Vignali
9b8c1224d7
libavcodec/exr : add X86 SIMD for reorder_pixels
...
Signed-off-by: James Almer <jamrial@gmail.com>
2017-09-17 17:53:57 -03:00
Ivan Kalvachev
7205513f8f
SIMD opus pvq_search implementation
...
An explanation of the workings and methods used by the
Pyramid Vector Quantization Search function
can be found in the following work-in-progress mail threads:
http://ffmpeg.org/pipermail/ffmpeg-devel/2017-June/212146.html
http://ffmpeg.org/pipermail/ffmpeg-devel/2017-June/212816.html
http://ffmpeg.org/pipermail/ffmpeg-devel/2017-July/213030.html
http://ffmpeg.org/pipermail/ffmpeg-devel/2017-July/213436.html
Signed-off-by: Ivan Kalvachev <ikalvachev@gmail.com>
2017-08-18 17:18:32 +01:00
Paul B Mahol
4ed7c2bbc3
avcodec/utvideodec: add SIMD for restore_rgb_planes
...
Signed-off-by: Paul B Mahol <onemda@gmail.com>
2017-06-27 09:54:10 +02:00
Rostislav Pehlivanov
e1120b1c54
mdct15: add assembly optimizations for the 15-point FFT
...
c: 1802 decicycles in fft15,16774635 runs, 2581 skips
avx: 865 decicycles in fft15,16776378 runs, 838 skips
Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
2017-06-23 23:45:37 +01:00
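The two timer lines quoted in the mdct15 commit above can be turned into a speedup figure. The following is a minimal sketch, not part of the commit: the regular expression and the helper name `parse_timer_line` are assumptions made for illustration, matched only against the two lines quoted in that message. A decicycle is taken to be one tenth of a CPU cycle.

```python
import re

# Hypothetical pattern for lines such as
# "c: 1802 decicycles in fft15,16774635 runs, 2581 skips".
TIMER_LINE = re.compile(
    r"^\s*(?P<label>\w+):\s*(?P<decicycles>\d+) decicycles in "
    r"(?P<name>\w+),\s*(?P<runs>\d+) runs,\s*(?P<skips>\d+) skips\s*$"
)


def parse_timer_line(line):
    """Return (label, function name, average cycles) for one timer line."""
    m = TIMER_LINE.match(line)
    if m is None:
        raise ValueError(f"unrecognized timer line: {line!r}")
    # Convert decicycles to cycles.
    return m["label"], m["name"], int(m["decicycles"]) / 10.0


# The two lines exactly as quoted in the commit message.
lines = [
    "c: 1802 decicycles in fft15,16774635 runs, 2581 skips",
    "avx: 865 decicycles in fft15,16776378 runs, 838 skips",
]

cycles = {label: avg for label, _, avg in map(parse_timer_line, lines)}
speedup = cycles["c"] / cycles["avx"]
print(f"avx speedup over c: {speedup:.2f}x")
```

On the quoted numbers the AVX version comes out at a little over twice as fast as the C version.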
Diego Biurrun
fd502f4f5f
build: Generalize yasm/nasm-related variable names
...
None of them are specific to the YASM assembler.
(Cherry-picked from libav commit 39e208f4d4)
Signed-off-by: James Almer <jamrial@gmail.com>
2017-06-21 17:00:29 -03:00
James Darnley
8e89f6fd37
avcodec/x86: move simple_idct to external assembly
2017-05-30 13:20:42 +02:00
Ronald S. Bultje
c9d98c5649
cavs: convert idct from inline asm to yasm.
2017-04-06 10:03:27 -04:00
Clément Bœsch
40ac226014
lavc/x86/hevc: rename hevc_res_add to hevc_add_res
...
This will simplify the incoming merge.
2017-03-24 11:45:23 +01:00
Clément Bœsch
c66bd8f3ff
Merge commit 'b57e38f52cc3f31a27105c28887d57cd6812c3eb'
...
* commit 'b57e38f52cc3f31a27105c28887d57cd6812c3eb':
ac3dsp: x86: Replace inline asm for in-decoder downmixing with standalone asm
Merged-by: Clément Bœsch <u@pkh.me>
2017-03-22 12:49:29 +01:00
James Almer
ca8a3978e5
Merge commit '1dfc3cf89d0eb026af28be46294b85d79499ffb5'
...
* commit '1dfc3cf89d0eb026af28be46294b85d79499ffb5':
x86: hpeldsp: Split off VP3-specific bits into a separate file
Merged-by: James Almer <jamrial@gmail.com>
2017-01-31 14:49:29 -03:00
James Almer
cf9ef83960
huffyuvencdsp: move shared functions to a new lossless_videoencdsp context
...
Signed-off-by: James Almer <jamrial@gmail.com>
2017-01-12 22:53:04 -03:00
Rostislav Pehlivanov
d2ae5f77c6
aacenc: add SIMD optimizations for abs_pow34 and quantization
...
Performance improvements:
quant_bands:
with: 681 decicycles in quant_bands, 8388453 runs, 155 skips
without: 1190 decicycles in quant_bands, 8388386 runs, 222 skips
Around 42% for the function
Twoloop coder:
abs_pow34:
with/without: 7.82s/8.17s
Around 4% for the entire encoder
Both:
with/without: 7.15s/8.17s
Around 12% for the entire encoder
Fast coder:
abs_pow34:
with/without: 3.40s/3.77s
Around 10% for the entire encoder
Both:
with/without: 3.02s/3.77s
Around 20% faster for the entire encoder
Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
Tested-by: Michael Niedermayer <michael@niedermayer.cc>
Reviewed-by: James Almer <jamrial@gmail.com>
2016-10-18 21:41:18 +01:00
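The percentages quoted in the aacenc commit above can be cross-checked from the raw measurements. This is a minimal sketch, assuming each percentage is the reduction relative to the unoptimized run; the commit does not state the formula, and the helper name `reduction_percent` is hypothetical.

```python
def reduction_percent(with_simd, without_simd):
    """Reduction in cost (decicycles or seconds) relative to the run without SIMD."""
    return (without_simd - with_simd) / without_simd * 100.0


# (with, without) pairs copied from the commit message.
measurements = {
    "quant_bands (decicycles)": (681, 1190),
    "twoloop coder, abs_pow34 (s)": (7.82, 8.17),
    "twoloop coder, both (s)": (7.15, 8.17),
    "fast coder, abs_pow34 (s)": (3.40, 3.77),
    "fast coder, both (s)": (3.02, 3.77),
}

for name, (with_simd, without_simd) in measurements.items():
    print(f"{name}: {reduction_percent(with_simd, without_simd):.1f}% less")
```

Under that assumption the computed values land close to the rounded figures given in the message.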
Justin Ruggles
b57e38f52c
ac3dsp: x86: Replace inline asm for in-decoder downmixing with standalone asm
...
Adds a wrapper function for downmixing which detects channel count changes
and updates the selected downmix function accordingly.
Simplification and porting to current x86inc infrastructure by Diego Biurrun.
Signed-off-by: Diego Biurrun <diego@biurrun.de>
2016-10-01 00:46:25 +02:00
Anton Khirnov
12004a9a7f
audiodsp/x86: yasmify vector_clipf_sse
2016-09-22 09:47:52 +02:00
Anton Khirnov
89466de4ae
vp9/x86: rename vp9dsp to vp9mc
...
It only contains the MC SIMD, other SIMD will go into different files.
2016-08-03 10:57:50 +02:00
James Almer
efc9d5c4bc
x86/ttaenc: add ff_ttaenc_filter_process_{ssse3,sse4}
...
Signed-off-by: James Almer <jamrial@gmail.com>
2016-08-02 15:48:04 -03:00
Diego Biurrun
1dfc3cf89d
x86: hpeldsp: Split off VP3-specific bits into a separate file
2016-07-20 18:33:25 +02:00
James Almer
fca3c3b619
hevc: Add AVX2 DC IDCT
...
Originally written by Pierre Edouard Lepere <pierre-edouard.lepere@insa-rennes.fr>.
Integrated to Libav by Josh de Kock <josh@itanimul.li>.
Signed-off-by: Alexandra Hájková <alexandra@khirnov.net>
2016-07-18 15:27:13 +02:00
Diego Biurrun
01621202aa
build: miscellaneous cosmetics
...
Restore alphabetical order in lists, break overly long lines, do some
prettyprinting, add some explanatory section comments, group parts
together that belong together logically.
2016-04-07 15:26:08 +02:00
Diego Biurrun
1a094af638
fft: Split MDCT bits off from FFT
2016-03-01 10:18:28 +01:00
Timothy Gu
e3461197b1
x86/vc1dsp: Split the file into MC and loopfilter
2016-02-29 08:46:53 -08:00
Derek Buitenhuis
b056482ef3
Merge commit '15a24614aef5836af3cd2c7cc3b2b737eee6bf3c'
...
* commit '15a24614aef5836af3cd2c7cc3b2b737eee6bf3c':
build: Add vc1dsp component for more fine-grained dependencies
Merged-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
2016-02-24 18:21:38 +00:00
Diego Biurrun
15a24614ae
build: Add vc1dsp component for more fine-grained dependencies
2016-02-19 20:38:18 +01:00
James Almer
8ae7447941
x86/dcadec: add ff_lfe_fir0_float_{sse,sse2,avx,fma3}
...
Up to ~4 times faster on x86_64, ~8 times on x86_32 if compiling using x87 fp math.
Reviewed-by: Ronald S. Bultje <rsbultje@gmail.com>
Signed-off-by: James Almer <jamrial@gmail.com>
2016-02-06 01:36:55 -03:00
Timothy Gu
9fd6ea933f
dirac_dwt: Make x86 files/functions names consistent
2016-02-05 19:30:23 -08:00
Timothy Gu
17ab8f7e68
diracdsp: Make x86 files/functions names consistent
2016-02-05 19:29:43 -08:00
foo86
ae5b2c5250
avcodec/dca: add new decoder based on libdcadec
2016-01-31 17:09:38 +01:00
foo86
4608996772
avcodec/dca: remove old decoder
...
Remove all files and functions which are not going to be reused,
and temporarily disable all functions and FATE tests which will be reused.
2016-01-31 17:09:38 +01:00
James Almer
209f50e16b
avcodec/synth_filter: split off remaining code from dcadec files
...
Signed-off-by: James Almer <jamrial@gmail.com>
2016-01-25 14:57:38 -03:00
Diego Biurrun
03ef89faf2
x86: build: Group all encoder objects together
2016-01-18 14:47:58 +01:00
Anton Khirnov
e7078e842d
hevcdsp: add x86 SIMD for MC
2015-12-05 21:11:52 +01:00
James Almer
73353af6e5
x86/Makefile: move decoder/encoder objects out of the subsystems section
...
Signed-off-by: James Almer <jamrial@gmail.com>
2015-10-22 03:55:18 -03:00
Timothy Gu
6b41b44149
huffyuvencdsp: Convert ff_diff_bytes_mmx to yasm
...
Heavily based upon ff_add_bytes by Christophe Gisquet.
Reviewed-by: James Almer <jamrial@gmail.com>
Signed-off-by: Timothy Gu <timothygu99@gmail.com>
2015-10-20 18:24:54 -07:00
Ronald S. Bultje
1c3be32533
vp9: add 10/12bpp mmxext-optimized iwht_iwht_4x4 function.
2015-10-13 11:05:57 -04:00
Christophe Gisquet
4369b9dc7b
x86: simple_idct(_put): 10bits versions
...
Modeled on the prores version. Clips to [0;1023] and is bit-exact.
Bit-exactness requires adding offsets in different places compared to
prores or C, which makes the function approximately 2% slower.
For 16 frames of a DNxHD 4:2:2 10bits test sequence:
C: 60861 decicycles in idct, 1048205 runs, 371 skips
sse2: 27567 decicycles in idct, 1048216 runs, 360 skips
avx: 26272 decicycles in idct, 1048171 runs, 405 skips
The add version is not implemented, so the corresponding dsp
function is set to NULL to make the omission obvious to any code that tries to execute it.
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2015-10-13 13:32:21 +02:00
Paul B Mahol
35af7add6f
avcodec/takdec: add x86 SIMD for rest of decorrelation modes
...
Signed-off-by: Paul B Mahol <onemda@gmail.com>
2015-10-09 21:38:15 +02:00
James Almer
72254b19b8
x86/alacdsp: add simd optimized functions
...
Signed-off-by: James Almer <jamrial@gmail.com>
2015-10-06 20:22:00 -03:00
Ronald S. Bultje
26ece7a511
vp9: 16bpp tm/dc/h/v intra pred simd (mostly sse2) functions.
2015-10-03 14:42:39 -04:00
Ronald S. Bultje
db7786e8ff
vp9: sse2/ssse3/avx 16bpp loopfilter x86 simd.
2015-10-03 14:42:39 -04:00
James Almer
3178931a14
x86/hevc_sao: move 10/12bit functions into a separate file
...
Tested-by: Michael Niedermayer <michaelni@gmx.at>
Signed-off-by: James Almer <jamrial@gmail.com>
2015-09-30 02:59:55 -03:00
Ronald S. Bultje
344d519040
vp9: add subpel MC SIMD for 10/12bpp.
2015-09-16 21:11:34 -04:00
Ronald S. Bultje
6354ff0383
vp9: add fullpel (put) MC SIMD for 10/12bpp.
2015-09-16 21:11:34 -04:00
Hendrik Leppkes
41194f065c
Merge commit 'cad40a3833ad81a352e7657ec6f7d637cea3b798'
...
* commit 'cad40a3833ad81a352e7657ec6f7d637cea3b798':
lavc: Drop deprecated deinterlace module
Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>
2015-09-05 17:06:14 +02:00
Vittorio Giovara
cad40a3833
lavc: Drop deprecated deinterlace module
...
Deprecated in 03/2013.
2015-08-28 16:04:19 +02:00
James Almer
9dcaae70f2
x86/aacpsdsp: add SSE and SSE3 optimized functions
...
Between 1.5 and 2.5 times faster
Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: James Almer <jamrial@gmail.com>
2015-07-30 19:01:15 -03:00
Michael Niedermayer
115a9b5091
Merge commit 'd42191c78befc1983f23b1899b2dda513b72f1ed'
...
* commit 'd42191c78befc1983f23b1899b2dda513b72f1ed':
configure: Factor out vp8dsp module
Conflicts:
configure
libavcodec/Makefile
libavcodec/x86/Makefile
Merged-by: Michael Niedermayer <michael@niedermayer.cc>
2015-07-17 22:45:34 +02:00
Michael Niedermayer
fd29dd432c
Merge commit '5cb4bdb2a03c3643f8f1e7d21d7094e61e0a4418'
...
* commit '5cb4bdb2a03c3643f8f1e7d21d7094e61e0a4418':
configure: Factor out rv34dsp module
Conflicts:
libavcodec/Makefile
libavcodec/x86/Makefile
Merged-by: Michael Niedermayer <michael@niedermayer.cc>
2015-07-17 22:21:36 +02:00
Vittorio Giovara
d42191c78b
configure: Factor out vp8dsp module
2015-07-17 18:46:24 +01:00
Vittorio Giovara
5cb4bdb2a0
configure: Factor out rv34dsp module
2015-07-17 18:46:24 +01:00
James Almer
7912a6830d
avcodec/jpeg200dsp: add ff_ict_float_{sse,avx}
...
Original intrinsics version by Nicolas Bertrand.
Reviewed-by: Michael Niedermayer <michaelni@gmx.at>
Signed-off-by: James Almer <jamrial@gmail.com>
2015-06-13 16:53:27 -03:00