Instead, only extend edges on-demand when the motion vector actually
crosses the visible decoded area using ff_emulated_edge_mc(). This
changes decoding time for cathedral from 8.722sec to 8.706sec, i.e.
0.2% faster overall. More generally (VP8 uses this also), low-motion
content gets significant speed improvements, whereas high-motion content
tends to decode in approximately the same time.
Signed-off-by: Martin Storsjö <martin@martin.st>
The non-intra-pcm branch in hl_decode_mb (simple, 8bpp) goes from 700
to 672 cycles, and the complete loop of decode_mb_cabac and hl_decode_mb
(in the decode_slice loop) goes from 1759 to 1733 cycles on the clip
tested (cathedral), i.e. almost 30 cycles per mb faster.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
The residual block data of 16x16 blocks was ignored for b-frames, which
leads to easy-to-identify artifacts. After this patch, the artifacts are
gone. Sample video: svq3_watermark.mov. (Fate results unaffected.)
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
Instead, only extend edges on-demand when the motion vector actually
crosses the visible decoded area using ff_emulated_edge_mc(). This
changes decoding time for cathedral from 8.722sec to 8.706sec, i.e.
0.2% faster overall. More generally (VP8 uses this also), low-motion
content gets significant speed improvements, whereas high-motion content
tends to decode in approximately the same time.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
Most of the changes are just trivial are just trivial replacements of
fields from MpegEncContext with equivalent fields in H264Context.
Everything in h264* other than h264.c are those trivial changes.
The nontrivial parts are:
1) extracting a simplified version of the frame management code from
mpegvideo.c. We don't need last/next_picture anymore, since h264 uses
its own more complex system already and those were set only to appease
the mpegvideo parts.
2) some tables that need to be allocated/freed in appropriate places.
3) hwaccels -- mostly trivial replacements.
for dxva, the draw_horiz_band() call is moved from
ff_dxva2_common_end_frame() to per-codec end_frame() callbacks,
because it's now different for h264 and MpegEncContext-based
decoders.
4) svq3 -- it does not use h264 complex reference system, so I just
added some very simplistic frame management instead and dropped the
use of ff_h264_frame_start(). Because of this I also had to move some
initialization code to svq3.
Additional fixes for chroma format and bit depth changes by
Janne Grunau <janne-libav@jannau.net>
Signed-off-by: Anton Khirnov <anton@khirnov.net>
* commit '35685a3c2a1ec09f3c62dcfc4368fe9e92bcddf6':
dsputil: Move ff_shrink* function declarations to separate header
dsputil: Move ff_svq3 function declarations to a separate header
Merged-by: Michael Niedermayer <michaelni@gmx.at>
Move some functions from dsputil. The idea is that videodsp contains
functions that are useful for a large and varied set of video decoders.
Currently, it contains emulated_edge_mc() and prefetch().
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
* commit 'b8f3ab8e6a7ce3627764da53b809628c828d4047':
ac3dec: output planar float only
svq3: make slice type value unsigned to match svq3_get_ue_golomb return type
configure: Have protocols select network code instead of depending on it
Conflicts:
libavcodec/svq3.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
* qatar/master:
golomb: use unsigned arithmetics in svq3_get_ue_golomb()
x86: float_dsp: fix loading of the len parameter on x86-32
takdec: fix initialisation of LOCAL_ALIGNED array
takdec: fix initialisation of LOCAL_ALIGNED array
Conflicts:
libavcodec/rv30.c
libavcodec/svq3.c
libavcodec/takdec.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
This prevents undefined behaviour of signed left shift if the coded
value is larger than 2^31. Large values are most likely invalid and
caused errors or by feeding random.
Validate every use of svq3_get_ue_golomb() and changed the place there
the return value was compared with negative numbers. dirac.c was clean,
fixed rv30 and svq3.
* commit 'e002e3291e6dc7953f843abf56fc14f08f238b21':
Use the new aes/md5/sha/tree allocation functions
avutil: Add functions for allocating opaque contexts for algorithms
svq3: fix pointer type warning
svq3: replace unsafe pointer casting with intreadwrite macros
parseutils-test: various cleanups
Conflicts:
doc/APIchanges
libavcodec/svq3.c
libavutil/parseutils.c
libavutil/version.h
Merged-by: Michael Niedermayer <michaelni@gmx.at>
Fixes:
libavcodec/svq3.c:661:9: warning: passing argument 2 of 'svq3_decode_block' from incompatible pointer type
libavcodec/svq3.c:208:19: note: expected 'DCTELEM *' but argument is of type 'DCTELEM (*)[32]'
Signed-off-by: Mans Rullgard <mans@mansr.com>
* qatar/master:
rtpdec_asf: Set the no_resync_search option for the chained asf demuxer
asfdec: Add an option for not searching for the packet markers
cosmetics: Clean up the tiffenc pix_fmts declaration to match the style of others
cosmetics: Align codec declarations
cosmetics: Convert mimic.c to utf-8
avconv: remove an unused function parameter.
avconv: remove now pointless variables.
avconv: drop support for building without libavfilter.
nellymoserenc: fix crash due to memsetting the wrong area.
libavformat: Only require first packet to be known for audio/video streams
avplay: Don't try to scale timestamps if the tb isn't set
Conflicts:
Changelog
configure
ffmpeg.c
libavcodec/aacenc.c
libavcodec/bmpenc.c
libavcodec/dnxhddec.c
libavcodec/dnxhdenc.c
libavcodec/ffv1.c
libavcodec/flacenc.c
libavcodec/fraps.c
libavcodec/huffyuv.c
libavcodec/libopenjpegdec.c
libavcodec/mpeg12enc.c
libavcodec/mpeg4videodec.c
libavcodec/pamenc.c
libavcodec/pgssubdec.c
libavcodec/pngenc.c
libavcodec/qtrleenc.c
libavcodec/rawdec.c
libavcodec/sgienc.c
libavcodec/tiffenc.c
libavcodec/v210dec.c
libavcodec/wmv2dec.c
libavformat/utils.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
Also break some long lines, remove codec function placeholder comments
and add spaces in sample/pixel format lists.
Signed-off-by: Martin Storsjö <martin@martin.st>
Results of IDCT can by far outreach the range of ff_cropTbl[], leading
to overreads and potentially crashes.
Found-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind
CC: libav-stable@libav.org
* qatar/master: (29 commits)
amrwb: remove duplicate arguments from extrapolate_isf().
amrwb: error out early if mode is invalid.
h264: change underread for 10bit QPEL to overread.
matroska: check buffer size for RM-style byte reordering.
vp8: disable mmx functions with sse/sse2 counterparts on x86-64.
vp8: change int stride to ptrdiff_t stride.
wma: fix invalid buffer size assumptions causing random overreads.
Windows Media Audio Lossless decoder
rv10/20: Fix slice overflow with checked bitstream reader.
h263dec: Disallow width/height changing with frame threads.
rv10/20: Fix a buffer overread caused by losing track of the remaining buffer size.
rmdec: Honor .RMF tag size rather than assuming 18.
g722: Fix the QMF scaling
r3d: don't set codec timebase.
electronicarts: set timebase for tgv video.
electronicarts: parse the framerate for cmv video.
ogg: don't set codec timebase
electronicarts: don't set codec timebase
avs: don't set codec timebase
wavpack: Fix an integer overflow
...
Conflicts:
libavcodec/arm/vp8dsp_init_arm.c
libavcodec/fraps.c
libavcodec/h264.c
libavcodec/mpeg4videodec.c
libavcodec/mpegvideo.c
libavcodec/msmpeg4.c
libavcodec/pnmdec.c
libavcodec/qpeg.c
libavcodec/rawenc.c
libavcodec/ulti.c
libavcodec/vcr1.c
libavcodec/version.h
libavcodec/wmalosslessdec.c
libavformat/electronicarts.c
libswscale/ppc/yuv2rgb_altivec.c
tests/ref/acodec/g722
tests/ref/fate/ea-cmv
Merged-by: Michael Niedermayer <michaelni@gmx.at>