With the current code it fails due to running out
of registers.
So code the store offsets manually into the assembler
instead.
Passes "make fate-dts".
Signed-off-by: Reimar Döffinger <Reimar.Doeffinger@gmx.de>
The non-intra-pcm branch in hl_decode_mb (simple, 8bpp) goes from 700
to 672 cycles, and the complete loop of decode_mb_cabac and hl_decode_mb
(in the decode_slice loop) goes from 1759 to 1733 cycles on the clip
tested (cathedral), i.e. almost 30 cycles per mb faster.
Signed-off-by: Martin Storsjö <martin@martin.st>
This way, the special IDCT permutations are no longer needed. Bfin code
is disabled until someone updates it. This is similar to how H264 does
it, and removes the dsputil dependency imposed by the scantable code.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* commit '76b19a3984359b3be44d4f7e4e69b7b86729a622':
Fix a number of incorrect intmath.h #includes.
avconv: remove an unused variable
Merged-by: Michael Niedermayer <michaelni@gmx.at>
The non-intra-pcm branch in hl_decode_mb (simple, 8bpp) goes from 700
to 672 cycles, and the complete loop of decode_mb_cabac and hl_decode_mb
(in the decode_slice loop) goes from 1759 to 1733 cycles on the clip
tested (cathedral), i.e. almost 30 cycles per mb faster.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* commit 'a846dccb29d2bb0798af1d47d06100eda9ca87cc':
h264chroma: x86: Fix building with yasm disabled
rv34: Drop now unnecessary dsputil dependencies
Conflicts:
libavcodec/x86/Makefile
Merged-by: Michael Niedermayer <michaelni@gmx.at>
* commit '620289a20e022b9c16c10d546ef86cc0bb77cc84':
sh4: Fix silly type vs. variable name search and replace typo
configure: Group all hwaccels together in a separate variable
Add av_cold attributes to arch-specific init functions
Conflicts:
configure
libavcodec/arm/mpegvideo_armv5te.c
libavcodec/x86/mlpdsp.c
libavcodec/x86/motion_est.c
libavcodec/x86/mpegvideoenc.c
libavcodec/x86/videodsp_init.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
* commit '25841dfe806a13de526ae09c11149ab1f83555a8':
Use ptrdiff_t instead of int for {avg, put}_pixels line_size parameter.
Conflicts:
libavcodec/alpha/dsputil_alpha.c
libavcodec/dsputil_template.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
* qatar/master:
Use proper "" quotes for local header #includes
ppc: fmtconvert: Drop two unused variables.
bink demuxer: set framerate.
Conflicts:
libavcodec/kbdwin.c
libavcodec/ppc/fmtconvert_altivec.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
This makes the plain-armv6 version use the same registers as the
armv6t2 version above.
This fixes fate-vp8 on plain-armv6 devices.
Signed-off-by: Martin Storsjö <martin@martin.st>
* commit '6bdb841b46d170d58488deaed720729b79223b1d':
arm: h264qpel: use neon h264 qpel functions only if supported
* bug was fixed previously (in merge of buggy code):
h264: copy h264qpel dsp context to slice thread copies
Merged-by: Michael Niedermayer <michaelni@gmx.at>
The sh4 optimizations are removed, because the code is
100% identical to the C code, so it is unlikely to
provide any real practical benefit.
Signed-off-by: Diego Biurrun <diego@biurrun.de>
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
Now, nellymoserenc and aacenc no longer depends on dsputil. Independent
of this patch, wmaprodec also does not depend on dsputil, so I removed
it from there also.
* commit 'ce378f0dd0c4e5350b3280e6b3e8d6b46fe4b0a3':
fate: Use wmv2 IDCT for wmv2 tests
vorbisdsp: change block_size type from int to intptr_t.
Conflicts:
tests/fate-run.sh
tests/fate/vcodec.mak
Merged-by: Michael Niedermayer <michaelni@gmx.at>
libavutil/arm/asm.S sets '.arch' depending on HAVE_ARMV5TE so that
assembling armv5te code will always succeed even if the default -march
flag does not support it. HAVE_ARMV5TE_EXTERNAL tests assembling code
with the default arch.
Fixes the missing symbol ff_prefetch_arm with --cpu= not including
armv5te.
CC: libav-stable@libav.org
* commit 'aeaf268e52fc11c1f64914a319e0edddf1346d6a':
vp3: integrate clear_blocks with idct of previous block.
mpegvideo: fix loop condition in draw_line()
dvdsubdec: parse the size from the extradata
Conflicts:
libavcodec/dvdsubdec.c
libavcodec/mpegvideo.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
This is identical to what e.g. vp8 does, and prevents the function call
overhead (plus dependency on dsputil for this particular function).
Arm asm updated by Janne Grunau <janne-libav@jannau.net>.
Signed-off-by: Janne Grunau <janne-libav@jannau.net>
* qatar/master:
lavc: Move vector_fmul_window to AVFloatDSPContext
rtpdec_mpeg4: Check the remaining amount of data before reading
Conflicts:
libavcodec/dsputil.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
Move some functions from dsputil. The idea is that videodsp contains
functions that are useful for a large and varied set of video decoders.
Currently, it contains emulated_edge_mc() and prefetch().
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
* commit '9ebd45c2d58ad9241ad09718679f0cf7fb57da52':
configure: do not bypass cpuflags section if --cpu not given
dct-test: arm: indicate required cpu features for optimised funcs
snow: fix build after 594d4d5df3
arm: fix use of uninitialised value in ff_fft_fixed_init_arm()
avpicture: Don't assume a valid pix fmt in avpicture_get_size
Conflicts:
libavcodec/avpicture.c
libavcodec/snow.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
This is consistent with usual ARM nomenclature as well as with the
VFPV3 and NEON symbols which both lack the ARM prefix.
Signed-off-by: Mans Rullgard <mans@mansr.com>
When initialising an FFTContext for a plain FFT, mdct_bits is not set
and can contain a garbage value. Since nbits is always valid and for
MDCT operation is mdct_bits - 2 checking this instead avoids using an
uninitialised value while having the same effect.
Signed-off-by: Mans Rullgard <mans@mansr.com>
* commit '284ea790d89441fa1e6b2d72d3c1ed6d61972f0b':
dsputil: move vector_fmul_scalar() to AVFloatDSPContext in libavutil
aacenc: use the correct output buffer
aacdec: fix signed overflows in lcg_random()
base64: fix signed overflow in shift
Conflicts:
libavcodec/dsputil.c
libavutil/base64.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
* qatar/master:
pixfmt: support more yuva formats
swscale: support gray to 9bit and 10bit formats
configure: rewrite print_config() function using awk
FATE: fix (AD)PCM test dependencies broken in e519990
Use ptrdiff_t instead of int for intra pred "stride" function parameter.
x86: use PRED4x4/8x8/8x8L/16x16 macros to declare intrapred prototypes.
Conflicts:
libavcodec/h264pred.c
libavcodec/h264pred_template.c
libavutil/pixfmt.h
libswscale/swscale_unscaled.c
tests/ref/lavfi/pixdesc
tests/ref/lavfi/pixfmts_copy
tests/ref/lavfi/pixfmts_null
tests/ref/lavfi/pixfmts_scale
tests/ref/lavfi/pixfmts_vflip
Merged-by: Michael Niedermayer <michaelni@gmx.at>
* commit 'c9ef43215c7d68c2cdcdbe02287aa114f27a32ed':
fate-vc1: add dependencies
ARM: fix overreads in neon h264 chroma mc
rtsp: Make sure the ret variable is initialized in ff_rtsp_fetch_packet
gitignore: ignore files created by msvc
fate: Add proper dependencies for the tests in video.mak
configure: Disable Snow decoder and encoder by default
lzo: Drop obsolete fast_memcpy reference
build: Drop OBJS declaration for non-existing PCM_DVD encoder
mpeg4videodec: Disable frame multithreading for GMC, its not implemented at all
Conflicts:
libavcodec/mpegvideo.c
libavformat/rtsp.c
tests/fate/microsoft.mak
tests/fate/video.mak
Merged-by: Michael Niedermayer <michaelni@gmx.at>
The loops were reading ahead one line, which could end up outside the
buffer for reference blocks at the edge of the picture. Removing
this readahead has no measurable performance impact.
Signed-off-by: Mans Rullgard <mans@mansr.com>
* commit '9734b8ba56d05e970c353dfd5baafa43fdb08024':
Move avutil tables only used in libavcodec to libavcodec.
Conflicts:
libavcodec/mathtables.c
libavutil/intmath.h
Merged-by: Michael Niedermayer <michaelni@gmx.at>
* commit 'b522000e9b2ca36fe5b2751096b9a5f5ed8f87e6':
avio: introduce avio_closep
mpegtsenc: set muxing type notification to verbose
vc1dec: Use correct spelling of "opposite"
a64multienc: change mc_frame_counter to unsigned
arm: call arm-specific rv34dsp init functions under if (ARCH_ARM)
svq1: Drop a bunch of useless parentheses
parseutils-test: do not print numerical error codes
svq1: K&R formatting cosmetics
Conflicts:
doc/APIchanges
libavcodec/svq1dec.c
libavcodec/svq1enc.c
libavformat/version.h
libavutil/parseutils.c
tests/ref/fate/parseutils
Merged-by: Michael Niedermayer <michaelni@gmx.at>
* commit 'cbcd497f384f0f8ef3f76f85b29b644b900d6b9f':
adxdec: use planar sample format
adpcmdec: use planar sample format for adpcm_thp
adpcmdec: use planar sample format for adpcm_ea_xas
adpcmdec: use planar sample format for adpcm_ea_r1/r2/r3
adpcmdec: use planar sample format for adpcm_xa
adpcmdec: use planar sample format for adpcm_ima_ws for vqa version 3
adpcmdec: use planar sample format for adpcm_4xm
adpcmdec: use planar sample format for adpcm_ima_wav
adpcmdec: use planar sample format for adpcm_ima_qt
pcmdec: use planar sample format for pcm_lxf
mace: use planar sample format
atrac1: use planar sample format
build: non-x86: Only compile mpegvideo optimizations when necessary
rtpdec_mpeg4: au_headers is a single array, simple av_free is enough
avcodec: free extended_data instead address of it
fate: Add tests of the ff_make_absolute_url function
url: Handle relative urls starting with two slashes
url: Handle relative urls being just a new query string
url: Don't treat slashes in query parameters as directory separators
Conflicts:
libavcodec/adxdec.c
libavcodec/mips/Makefile
libavcodec/pcm.c
libavcodec/utils.c
libavformat/Makefile
libavformat/utils.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
* qatar/master:
ARM: use numeric ID for Tag_ABI_align_preserved
segment: Pass the interrupt callback on to the chained AVFormatContext, too
ARM: bswap: drop armcc version of av_bswap16()
ARM: set Tag_ABI_align_preserved in all asm files
Merged-by: Michael Niedermayer <michaelni@gmx.at>
All our ARM asm preserves alignment so setting this attribute
in a common location is simpler. This removes numerous warnings
when linking with armcc.
Signed-off-by: Mans Rullgard <mans@mansr.com>
This reverts commit d25f87f517.
This breaks decoding of some h264 files
I have tested the original patch with fate but by mistake have
forgotten to specify the fate samples so testing was limited to
the internal regression tests.
* qatar/master:
libx264: add forgotten ;
matroskadec: fix a sanity check.
matroskadec: only return corrupt packets that actually contain data
lavf: zero data/size of the packet passed to read_packet().
ARM: use 2-operand syntax for ADD Rd, PC in Apple PIC code
ARM: align PIC offset pools to 4 bytes
ARM: swap source operands in some add instructions
configure: update tms470 detection for latest version
lavf probe: prevent codec probe with no data at all seen
motion_est: fix use of inline on extern functions
Conflicts:
libavcodec/motion_est_template.c
libavformat/matroskadec.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
* qatar/master:
mpegvideo: drop unnecessary arguments to hpel_motion()
mpegvideo: drop 'inline' from some functions
nellymoserdec: drop support for s16 output.
bmpdec: only initialize palette for pal8.
build: Properly remove object files while cleaning
flacdsp: arm optimised lpc filter
compat/vsnprintf: return number of bytes required on truncation.
Merged-by: Michael Niedermayer <michaelni@gmx.at>
* qatar/master:
lavr: fix handling of custom mix matrices
fate: force pix_fmt in lagarith-rgb32 test
fate: add tests for lagarith lossless video codec.
ARMv6: vp8: fix stack allocation with Apple's assembler
ARM: vp56: allow inline asm to build with clang
fft: 3dnow: fix register name typo in DECL_IMDCT macro
x86: dct32: port to cpuflags
x86: build: replace mmx2 by mmxext
Revert "wmapro: prevent division by zero when sample rate is unspecified"
wmapro: prevent division by zero when sample rate is unspecified
lagarith: fix color plane inversion for YUY2 output.
lagarith: pad RGB buffer by 1 byte.
dsputil: make add_hfyu_left_prediction_sse4() support unaligned src.
Conflicts:
doc/APIchanges
libavcodec/lagarith.c
libavfilter/x86/gradfun.c
libavutil/cpu.h
libavutil/version.h
libswscale/utils.c
libswscale/version.h
libswscale/x86/yuv2rgb.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
In the GNU assembler, a relational expression, bizarrely, has the
value -1 if true, whereas in Apple's it is +1. This patch makes
sure the correct expression is used in both cases.
Signed-off-by: Mans Rullgard <mans@mansr.com>
The clang integrated assembler does not support pre-UAL syntax,
while gcc requires pre-UAL syntax for ARM code. A patch[1] for
clang to support the old syntax as well has been ignored since
January.
This patch chooses the syntax appropriate for each compiler,
allowing both to build the code. Notably, this change allows
building for iphone with the latest Apple Xcode update.
[1] http://llvm.org/bugs/show_bug.cgi?id=11855
Signed-off-by: Mans Rullgard <mans@mansr.com>