This makes it similar to put_epel16_v6, and gives a 10-25%
speedup of this function.
Before: Cortex A7 A8 A9 A53 A72
vp8_put_epel16_h6v6_neon: 3058.0 2218.5 2459.8 2183.0 1572.2
After:
vp8_put_epel16_h6v6_neon: 2670.8 1934.2 2244.4 1729.4 1503.9
Signed-off-by: Martin Storsjö <martin@martin.st>
* qatar/master:
vp8: Use 2 registers for dst_stride and src_stride in neon bilin filter
Conflicts:
libavcodec/arm/vp8dsp_neon.S
Merged-by: Michael Niedermayer <michaelni@gmx.at>
* qatar/master:
float_dsp: ppc: add a separate header for Altivec function prototypes
ARM: fix float_dsp breakage from d5a7229
Add a float DSP framework to libavutil
PPC: Move types_altivec.h and util_altivec.h from libavcodec to libavutil
ARM: Move asm.S from libavcodec to libavutil
vc1dsp: mark put/avg_vc1_mspel_mc() always_inline
Merged-by: Michael Niedermayer <michaelni@gmx.at>
* qatar/master:
Revert "v210enc: use FFALIGN()"
doxygen: Do not include license boilerplates in Doxygen comment blocks.
avplay: reset decoder flush state when seeking
ape: skip packets with invalid size
ape: calculate final packet size instead of guessing
ape: stop reading after the last frame has been read
ape: return AVERROR_EOF instead of AVERROR(EIO) when demuxing is finished
ape: return error if seeking to the current packet fails in ape_read_packet()
avcodec: Clarify AVFrame member documentation.
v210dec: check for coded_frame allocation failure
v210enc: use stride as it is already calculated
v210enc: use FFALIGN()
v210enc: return proper AVERROR codes instead of -1
v210enc: do not set coded_frame->key_frame
v210enc: check for coded_frame allocation failure
drawtext: add 'fix_bounds' option on coords fixing
drawtext: fix text_{w, h} expression vars
drawtext: add missing braces around an if() block.
Conflicts:
libavcodec/arm/vp8.h
libavcodec/arm/vp8dsp_init_arm.c
libavcodec/v210dec.c
libavfilter/vf_drawtext.c
libavformat/ape.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
* qatar/master:
aac_latm: reconfigure decoder on audio specific config changes
latmdec: fix audio specific config parsing
Add avcodec_decode_audio4().
avcodec: change number of plane pointers from 4 to 8 at next major bump.
Update developers documentation with coding conventions.
svq1dec: avoid undefined get_bits(0) call
ARM: h264dsp_neon cosmetics
ARM: make some NEON macros reusable
Do not memcpy raw video frames when using null muxer
fate: update asf seektest
vp8: flush buffers on size changes.
doc: improve general documentation for MacOSX
asf: use packet dts as approximation of pts
asf: do not call av_read_frame
rtsp: Initialize the media_type_mask in the rtp guessing demuxer
Cleaned up alacenc.c
Conflicts:
doc/APIchanges
doc/developer.texi
libavcodec/8svx.c
libavcodec/aacdec.c
libavcodec/ac3dec.c
libavcodec/avcodec.h
libavcodec/nellymoserdec.c
libavcodec/tta.c
libavcodec/utils.c
libavcodec/version.h
libavcodec/wmadec.c
libavformat/asfdec.c
tests/ref/seek/lavf_asf
Merged-by: Michael Niedermayer <michaelni@gmx.at>
From 52.503s (~40fps) to 27.973sec (~80fps) decoding of 480p sintel
trailer, i.e. a ~2x speedup overall, on a Nexus S.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* qatar/master:
doxygen: Consistently use '@' instead of '\' for Doxygen markup.
Use av_printf_format to check the usage of printf style functions
Add av_printf_format, for marking printf style format strings and their parameters
ARM: enable thumb for Cortex-M* CPUs
nsvdec: Propagate error values instead of returning 0 in nsv_read_header().
build: remove SRC_PATH_BARE variable
build: move basic rules and variables to main Makefile
build: move special targets to end of main Makefile
lavdev: improve feedback in case of invalid frame rate/size
vfwcap: prefer "framerate_q" over "fps" in vfw_read_header()
v4l2: prefer "framerate_q" over "fps" in v4l2_set_parameters()
fbdev: prefer "framerate_q" over "fps" in device context
bktr: prefer "framerate" over "fps" for grab_read_header()
ALSA: implement channel layout for playback.
alsa: support unsigned variants of already supported signed formats.
alsa: add support for more formats.
ARM: allow building in Thumb2 mode
Conflicts:
common.mak
doc/APIchanges
libavcodec/vdpau.h
libavdevice/alsa-audio-common.c
libavdevice/fbdev.c
libavdevice/libdc1394.c
libavutil/avutil.h
Merged-by: Michael Niedermayer <michaelni@gmx.at>
The assembler emits literal pools too far from the load instructions,
so we must do it explicitly at a suitable location.
Signed-off-by: Mans Rullgard <mans@mansr.com>
(cherry picked from commit 8b454c352f)
The assembler emits literal pools too far from the load instructions,
so we must do it explicitly at a suitable location.
Signed-off-by: Mans Rullgard <mans@mansr.com>
This adds NEON optimised versions of all functions in VP8DSPContext.
Based on initial work by Rob Clark.
Signed-off-by: Mans Rullgard <mans@mansr.com>
(cherry picked from commit a1c1d3c003)