James Almer
c16e99e3b3
x86: check for AV_CPU_FLAG_AVXSLOW where useful
...
Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2015-06-01 00:15:35 +02:00
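The check named in the subject above can be sketched roughly as follows. This is a minimal illustration, not the committed code: the flag bits and the helper `use_avx_path` are hypothetical stand-ins, and it assumes the slow-AVX flag marks CPUs that support AVX but gain nothing from the AVX code path.

```c
#include <stdbool.h>

/* Hypothetical flag bits for illustration only; the real values
 * are defined by libavutil and are not reproduced here. */
#define SKETCH_FLAG_AVX     (1 << 0)
#define SKETCH_FLAG_AVXSLOW (1 << 1)

/* Select the AVX code path only when AVX is present and not marked slow. */
static bool use_avx_path(int cpu_flags)
{
    return (cpu_flags & SKETCH_FLAG_AVX) && !(cpu_flags & SKETCH_FLAG_AVXSLOW);
}
```

With this shape, a CPU reporting both bits falls back to whatever non-AVX function pointer was installed earlier.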
Michael Niedermayer
b666e81c13
Merge commit 'e4610300de6869bd6b3b00e76cfeabb6d7653dcd'
...
* commit 'e4610300de6869bd6b3b00e76cfeabb6d7653dcd':
x86: cavs: Remove an unneeded scratch buffer
Conflicts:
libavcodec/x86/cavsdsp.c
See: d79f7bf0d6
Merged-by: Michael Niedermayer <michaelni@gmx.at>
2015-05-28 22:12:41 +02:00
Michael Niedermayer
e4610300de
x86: cavs: Remove an unneeded scratch buffer
...
Simplifies the code and makes it build on certain compilers
running out of registers on x86.
CC: libav-stable@libav.org
Reported-By: mudler
2015-05-28 18:40:40 +02:00
Timothy Gu
2b388e6dde
Revert "Move struc FFTContext below SECTION_RODATA"
...
This reverts commit 599888a480.
The commit does not silence the warning on ELF-based systems; the warning
will be fixed in the subsequent commit.
Conflicts:
libavcodec/x86/fft_mmx.asm
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2015-05-28 00:08:32 +02:00
Michael Niedermayer
d9b264bc73
Merge commit '848e86f74d3e6e87fa592ee8ba8c184cc5fd9a42'
...
* commit '848e86f74d3e6e87fa592ee8ba8c184cc5fd9a42':
mpegvideo: Drop flags and flags2
Conflicts:
libavcodec/mpeg12dec.c
libavcodec/mpeg12enc.c
libavcodec/mpegvideo.c
libavcodec/mpegvideo_enc.c
libavcodec/mpegvideo_motion.c
libavcodec/ratecontrol.c
libavcodec/vc1_block.c
libavcodec/vc1_loopfilter.c
libavcodec/vc1_mc.c
libavcodec/vc1dec.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
2015-05-22 20:24:41 +02:00
Vittorio Giovara
848e86f74d
mpegvideo: Drop flags and flags2
...
They are just duplicates of AVCodecContext members so use those instead.
2015-05-22 15:34:39 +01:00
Michael Niedermayer
451be676f3
Merge remote-tracking branch 'rbultje/vp9-bugfixes'
...
* rbultje/vp9-bugfixes:
vp9: match another find_ref_mvs() bug in libvpx.
vp9: fix scaled motion vector clipping for sub8x8 blocks.
vp9: improve signbias check.
vp9: don't allow compound references if error_resilience is enabled.
vp9: clamp segmented lflvl before applying ref/mode deltas.
vp9: reset loopfilter mode/ref deltas on keyframe.
vp9: fix crash when playing back 440/440 content with width%64<56.
vp9: extend loopfilter workaround for vp9 h/v mix-up to work for 422.
vp9: clip motion vectors in the same way as libvpx does.
vp9: set skip flag if the block had no coded coefficients.
vp9: apply mv scaling workaround only when subsampling is enabled.
vp9: read all 4x4 blocks in sub8x8 blocks individually with scalability.
vp9: fix segmentation map referencing upon framesize change.
vp9: disable more pmulhrsw optimizations in idct16/32.
vp9: disable all pmulhrsw in 8/16 iadst x86 optimizations.
Merged-by: Michael Niedermayer <michaelni@gmx.at>
2015-05-18 02:35:16 +02:00
Carl Eugen Hoyos
e609cfd697
lavc/flac: Fix encoding and decoding with high lpc.
...
Based on an analysis by trac user lvqcl.
Fixes ticket #4421, reported by Chase Walker.
2015-05-17 02:08:58 +02:00
Ronald S. Bultje
d32d0593f1
vp9: disable more pmulhrsw optimizations in idct16/32.
...
For idct16, only when called from an adst16x16 variant, so impact is
minor. For idct32, for all, so relatively major impact.
2015-05-14 14:15:27 -04:00
Ronald S. Bultje
96d30c3495
vp9: disable all pmulhrsw in 8/16 iadst x86 optimizations.
...
They all overflow in various samples that are considered valid input.
2015-05-14 13:39:37 -04:00
Michael Niedermayer
cc77bb09e4
avcodec/x86/vp9dsp_init: Fix mix of declaration and statement
...
Reviewed-by: "Ronald S. Bultje" <rsbultje@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2015-05-07 14:33:10 +02:00
Ronald S. Bultje
b224b165cb
vp9: add keyframe profile 2/3 support.
2015-05-06 15:10:41 -04:00
Michael Niedermayer
6ef3426d90
avcodec/x86/deinterlace: use INIT_MMX like other asm code does too
2015-05-05 02:41:15 +02:00
Michael Niedermayer
dfc0708e23
avcodec/x86/dct-test: Use uint8_t for idct_simple_mmx_perm
...
The table contains no element outside the unsigned 8bit range
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2015-05-02 13:43:15 +02:00
Michael Niedermayer
270e647adc
avcodec/x86/dct-test: Make static table const
...
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2015-05-02 13:42:46 +02:00
Ronald S. Bultje
3de13d5212
vp9: remove another optimization branch in iadst16 which causes overflows.
...
See sample vp90-2-14-resize-fp-tiles-16-8.webm from the vp9 test vector
set to reproduce the issue.
2015-04-24 16:54:31 +02:00
Ronald S. Bultje
d02d04a18f
vp9: remove one optimization branch in iadst16 which causes overflows.
...
See sample vp90-2-14-resize-fp-tiles-16-8-4-2-1.webm from the vp9 test
vector set which reproduces the issue. This probably costs a few cycles,
but I don't think there's an easy way to work around that.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2015-04-22 21:37:10 +02:00
Michael Niedermayer
0245abc7c1
avcodec/x86/hpeldsp_init: Put CONFIG_* first in if()
...
This is more consistent and may fix a build failure
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2015-03-26 15:41:27 +01:00
James Almer
6b940b8c99
x86/xvididct: add some yasm guards
...
Should fix compilation on compilers with less-than-ideal dead code elimination
Signed-off-by: James Almer <jamrial@gmail.com>
2015-03-20 02:38:20 -03:00
James Almer
b0fea4ad7e
x86/xvididct: remove obsolete function prototypes
...
Signed-off-by: James Almer <jamrial@gmail.com>
2015-03-20 02:38:14 -03:00
Michael Niedermayer
1eb28479da
Merge commit '48aef27f5232794e70ecef0d347b9f65e27a9bad'
...
* commit '48aef27f5232794e70ecef0d347b9f65e27a9bad':
x86: Put COPY3_IF_LT under HAVE_6REGS
Conflicts:
libavcodec/x86/mathops.h
See: b38910c979
Merged-by: Michael Niedermayer <michaelni@gmx.at>
2015-03-17 20:25:47 +01:00
Luca Barbato
48aef27f52
x86: Put COPY3_IF_LT under HAVE_6REGS
...
It uses 6 registers. Unbreaks building on hardened x86 systems.
Bug-Id: gentoo/541930
CC: libav-stable@libav.org
2015-03-17 12:31:04 +01:00
Michael Niedermayer
d79f7bf0d6
avcodec/x86/cavsdsp: remove incorrect LOCAL_ALIGN tmp
...
This is faster and simpler as well
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2015-03-16 14:51:51 +01:00
James Almer
e8374d7202
x86/proresdsp: remove ff_prores_idct_put_10_sse4
...
It's exactly the same as the sse2 version.
Reviewed-by: Michael Niedermayer <michaelni@gmx.at>
Signed-off-by: James Almer <jamrial@gmail.com>
2015-03-16 01:52:44 -03:00
James Almer
bdd179c8cb
x86/proresdsp: remove unused macro
...
Reviewed-by: Michael Niedermayer <michaelni@gmx.at>
Signed-off-by: James Almer <jamrial@gmail.com>
2015-03-16 01:49:34 -03:00
Christophe Gisquet
238db7cc56
x86: lavc: use LOCAL_ALIGNED instead of DECLARE_ALIGNED
...
The latter may yield incorrect code for on-stack variables.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2015-03-14 20:06:47 +01:00
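One way to get an aligned on-stack buffer without relying on the compiler honouring an alignment attribute is to over-allocate and round the pointer up by hand. The sketch below illustrates that general technique only; `align_up`, `aligned_scratch_demo` and `SKETCH_ALIGN` are hypothetical names, and this is not the definition of LOCAL_ALIGNED.

```c
#include <stdint.h>

#define SKETCH_ALIGN 16  /* required alignment in bytes; must be a power of two */

/* Round a pointer up to the next multiple of SKETCH_ALIGN. */
static uint8_t *align_up(uint8_t *p)
{
    uintptr_t addr = (uintptr_t)p;
    return p + ((SKETCH_ALIGN - (addr & (SKETCH_ALIGN - 1))) & (SKETCH_ALIGN - 1));
}

/* Fill and sum 16 bytes through an aligned pointer carved out of an
 * oversized stack array; returns -1 if the pointer is not aligned. */
static int aligned_scratch_demo(void)
{
    uint8_t raw[16 + SKETCH_ALIGN];   /* padding leaves room for the round-up */
    uint8_t *buf = align_up(raw);
    int i, sum = 0;

    for (i = 0; i < 16; i++)
        buf[i] = (uint8_t)i;
    for (i = 0; i < 16; i++)
        sum += buf[i];

    return ((uintptr_t)buf % SKETCH_ALIGN == 0) ? sum : -1;
}
```

The cost is up to `SKETCH_ALIGN` wasted bytes per buffer, in exchange for alignment that does not depend on how the compiler lays out the stack frame.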
Christophe Gisquet
15ce160183
x86: xvid_idct: SSE2 merged add version
...
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2015-03-14 13:36:47 +01:00
Christophe Gisquet
decd5193e1
x86: xvid_idct: merged idct_put SSE2 versions
...
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2015-03-14 13:36:29 +01:00
Christophe Gisquet
8200575d84
x86: dct-test: evaluate prores idct avx version
...
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2015-03-14 13:23:27 +01:00
Christophe Gisquet
4eb4451be1
x86: dct-test: fix compilation for prores
...
When the decoder is deactivated, the x86-optimized versions are
not compiled, resulting in a link error.
The C version is unaffected, as it is part of the idctdsp
subsystem.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2015-03-14 13:23:06 +01:00
Christophe Gisquet
c3bf52713a
x86: xvid_idct: port MMX iDCT to yasm
...
Also reduce the table duplication with SSE2 code, remove duplicated
macro parameters.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2015-03-14 11:45:11 +01:00
Christophe Gisquet
2999bd7da2
x86: xvid_idct: port SSE2 iDCT to yasm
...
The main difference consists in properly renaming labels, and
letting yasm select the gprs for skipping 1D transforms.
Previous-version-reviewed-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2015-03-13 01:04:52 +01:00
James Almer
5c8f747085
x86/hevc_sao: use unaligned movs for sao_{band,filter} with width 8
...
Suggested-by: Christophe Gisquet <christophe.gisquet@gmail.com>
Reviewed-by: Michael Niedermayer <michaelni@gmx.at>
Signed-off-by: James Almer <jamrial@gmail.com>
2015-03-01 20:02:43 -03:00
Michael Niedermayer
7fce8c752d
Merge commit '71f1ad37d858b810b71a4af1c25771beaa50b27b'
...
* commit '71f1ad37d858b810b71a4af1c25771beaa50b27b':
lavc: do not compile fmtconvert unconditionally
Conflicts:
configure
libavcodec/ppc/Makefile
libavcodec/x86/Makefile
Merged-by: Michael Niedermayer <michaelni@gmx.at>
2015-03-01 00:06:42 +01:00
Michael Niedermayer
5c17377e28
Merge commit 'd74a8cb7e42f703be5796eeb485f06af710ae8ca'
...
* commit 'd74a8cb7e42f703be5796eeb485f06af710ae8ca':
fmtconvert: drop unused functions
Conflicts:
libavcodec/arm/fmtconvert_vfp_armv6.S
libavcodec/x86/fmtconvert.asm
libavcodec/x86/fmtconvert_init.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
2015-02-28 23:58:29 +01:00
Anton Khirnov
71f1ad37d8
lavc: do not compile fmtconvert unconditionally
...
Only ac3dec and dcadec use it.
2015-02-28 21:51:24 +01:00
Anton Khirnov
d74a8cb7e4
fmtconvert: drop unused functions
2015-02-28 21:51:24 +01:00
Michael Niedermayer
23a90768a8
avcodec/v210dec: Add ff prefix to v210_x86_init()
...
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2015-02-27 19:08:09 +01:00
Michael Niedermayer
0e699676f9
avcodec/snow: mark dwt init as av_cold
...
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2015-02-27 16:53:37 +01:00
Carl Eugen Hoyos
36a6fb989b
hevc_deblock: Fix compilation with nasm
...
CC: libav-stable@libav.org
Bug-Id: 795
Signed-off-by: Vittorio Giovara <vittorio.giovara@gmail.com>
2015-02-22 22:34:20 +00:00
Michael Niedermayer
03f39fbb2a
avcodec/x86/mlpdsp_init: Simplify mlp_filter_channel_x86()
...
Based on patch by Francisco Blas Izquierdo Riera
Commit message partly taken from carl
Fixes a compilation error in mlpdsp_init.c with -fstack-check and some gcc
compilers (I reproduced the issue with gcc 4.7.3) by simplifying the code.
See also https://bugs.gentoo.org/show_bug.cgi?id=471756
$ make libavcodec/x86/mlpdsp_init.o
libavcodec/x86/mlpdsp_init.c: In function ‘mlp_filter_channel_x86’:
libavcodec/x86/mlpdsp_init.c:142:5: error: can’t find a register in
class ‘GENERAL_REGS’ while reloading ‘asm’
libavcodec/x86/mlpdsp_init.c:142:5: error: ‘asm’ operand has impossible
constraints
4551 -> 4509 decicycles
Reviewed-by: Ramiro Polla <ramiro.polla@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2015-02-21 16:05:41 +01:00
Christophe Gisquet
398f531915
x86: hevc_mc: fewer xmm regs used in epel h/v
...
11 xmm regs seem only required for avx2.
Reviewed-by: Mickaël Raulet <mraulet@insa-rennes.fr>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2015-02-17 15:19:19 +01:00
Christophe Gisquet
89cb4995fa
x86: hevc_mc: save 1 gpr in epel filter loading
...
The 3*stride value stored in r3src can be loaded much later,
so use r3src instead of a dedicated gpr when possible.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2015-02-16 21:53:51 +01:00
James Almer
03adafb318
x86/g722dsp: add ff_g722_apply_qmf_sse2
...
Reviewed-by: Michael Niedermayer <michaelni@gmx.at>
Signed-off-by: James Almer <jamrial@gmail.com>
2015-02-16 00:41:21 -03:00
Christophe Gisquet
b533949813
x86: hevc: remove a parameter to WP internals
...
The second stride is always the internal buffer one, MAX_PB_SIZE (times 2 to
get the value in bytes).
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2015-02-14 17:22:50 +01:00
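The "times 2" in the message above is an element-count-to-bytes conversion. As a small worked sketch, assuming 16-bit intermediate samples and an illustrative block size of 64 (both the constant value and the helper name are assumptions, not taken from the commit):

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MAX_PB_SIZE 64  /* assumed value, for illustration only */

/* Stride of the internal buffer in bytes: elements per row times
 * the size of one 16-bit sample. */
static size_t internal_stride_bytes(void)
{
    return SKETCH_MAX_PB_SIZE * sizeof(int16_t);
}
```

Because this value is a compile-time constant, it can be hard-coded inside the routine instead of being passed as a parameter.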
James Almer
1679d68dbf
x86/hevc_mc: optimize AVX2 mc functions
...
Before
40766 decicycles in ff_hevc_put_hevc_qpel_h64_8_avx2, 8192 runs, 0 skips
After
37975 decicycles in ff_hevc_put_hevc_qpel_h64_8_avx2, 8192 runs, 0 skips
Reviewed-by: Christophe Gisquet <christophe.gisquet@gmail.com>
Signed-off-by: James Almer <jamrial@gmail.com>
2015-02-12 13:21:58 -03:00
James Almer
14b44c1614
x86/hevc_sao: make sao_edge_filter_{10,12} work on x86_32
...
Reviewed-by: Michael Niedermayer <michaelni@gmx.at>
Reviewed-by: Christophe Gisquet <christophe.gisquet@gmail.com>
Signed-off-by: James Almer <jamrial@gmail.com>
2015-02-12 13:21:30 -03:00
James Almer
06fe6dfe12
x86/hevc_sao: make sao_band_filter work on x86_32
...
Reviewed-by: Christophe Gisquet <christophe.gisquet@gmail.com>
Signed-off-by: James Almer <jamrial@gmail.com>
2015-02-09 20:41:21 -03:00
Christophe Gisquet
b61b9e4919
x86: hevc_mc: remove lea in EPEL_LOAD
...
The second parameter to the macro is always an immediate address,
so no lea is needed.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2015-02-08 22:19:35 +01:00
Christophe Gisquet
4919b38421
x86: hevc_mc: fewer gpr autoloads for _v filters
...
In that case, it's just to load my, but mx/r3src is not used.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2015-02-08 22:19:34 +01:00