* commit '8b00f4df20f4a8ab0656fdaf7d00233a6515a052':
h264: move some neighbour information into the per-slice context
Conflicts:
libavcodec/h264_cabac.c
libavcodec/h264_cavlc.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
* commit '4bd5ac200d15b4f458a50f66006549825f9fc865':
h264: move {chroma,intra16x16}_pred_mode into the per-slice context
Merged-by: Michael Niedermayer <michaelni@gmx.at>
* commit '5355ed6b20e941430c4f8fb82644e87a65366d61':
h264: move {prev,next}_mb_skipped into the per-slice context
Merged-by: Michael Niedermayer <michaelni@gmx.at>
* commit 'd231e84b06a9964c840cff4e228509f706165fb6':
h264: move the quantizers into the per-slice context
Conflicts:
libavcodec/dxva2_h264.c
libavcodec/h264_cavlc.c
libavcodec/h264_loopfilter.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
This fixes out of array reads and/or infinite loops.
30 is the maximum number of bits that can be read into
coeff_abs below.
CC: libav-stable@libav.org
Found-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind
Signed-off-by: Martin Storsjö <martin@martin.st>
* commit '277ff7f5dc134f1c2dfc4ea0ef3540340482e3d2':
lavu: move internal define to the only places where it is used
Conflicts:
libavcodec/h264_cabac.c
libavutil/internal.h
Merged-by: Michael Niedermayer <michaelni@gmx.at>
* commit '23e85be58fc64b2e804e68b0034a08a6d257e523':
h264: add a parameter to the CHROMA444 macro.
h264: add a parameter to the CHROMA422 macro.
Conflicts:
libavcodec/h264.c
libavcodec/h264.h
libavcodec/h264_cavlc.c
libavcodec/h264_loopfilter.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
* commit '6d2b6f21eb45ffbda1103c772060303648714832':
h264: add a parameter to the CABAC macro.
h264: add a parameter to the FIELD_OR_MBAFF_PICTURE macro.
Conflicts:
libavcodec/h264.c
libavcodec/h264_cabac.c
libavcodec/h264_cavlc.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
* commit '7fa00653a550c0d24b3951c0f9fed6350ecf5ce4':
h264: add a parameter to the FIELD_PICTURE macro.
h264: add a parameter to the FRAME_MBAFF macro.
Conflicts:
libavcodec/h264.c
libavcodec/h264_loopfilter.c
libavcodec/h264_refs.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
* commit 'da6be8fcec16a94d8084bda8bb8a0a411a96bcf7':
h264: add a parameter to the MB_FIELD macro.
h264: add a parameter to the MB_MBAFF macro.
Conflicts:
libavcodec/h264.c
libavcodec/h264_cabac.c
libavcodec/h264_cavlc.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
Instead, keep them in the bitstream buffer until we read them verbatim,
this saves a memcpy() and a subsequent clearing of the target buffer.
decode_cabac+decode_mb for a sample file (CAPM3_Sony_D.jsv) goes from
6121.4 to 6095.5 cycles, i.e. 26 cycles faster.
Signed-off-by: Martin Storsjö <martin@martin.st>
Instead, keep them in the bitstream buffer until we read them verbatim,
this saves a memcpy() and a subsequent clearing of the target buffer.
decode_cabac+decode_mb for a sample file (CAPM3_Sony_D.jsv) goes from
6121.4 to 6095.5 cycles, i.e. 26 cycles faster.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
Most of the changes are just trivial are just trivial replacements of
fields from MpegEncContext with equivalent fields in H264Context.
Everything in h264* other than h264.c are those trivial changes.
The nontrivial parts are:
1) extracting a simplified version of the frame management code from
mpegvideo.c. We don't need last/next_picture anymore, since h264 uses
its own more complex system already and those were set only to appease
the mpegvideo parts.
2) some tables that need to be allocated/freed in appropriate places.
3) hwaccels -- mostly trivial replacements.
for dxva, the draw_horiz_band() call is moved from
ff_dxva2_common_end_frame() to per-codec end_frame() callbacks,
because it's now different for h264 and MpegEncContext-based
decoders.
4) svq3 -- it does not use h264 complex reference system, so I just
added some very simplistic frame management instead and dropped the
use of ff_h264_frame_start(). Because of this I also had to move some
initialization code to svq3.
Additional fixes for chroma format and bit depth changes by
Janne Grunau <janne-libav@jannau.net>
Signed-off-by: Anton Khirnov <anton@khirnov.net>
The variable is copied to subsequent threads at the same time, so this
may cause wrong ref_count[] values to be copied to subsequent threads.
This bug was found using TSAN and Helgrind.
Original patch by Ronald, adapted with a local_ref_count by Clément,
following the suggestion of Michael Niedermayer.
Signed-off-by: Clément Bœsch <clement.boesch@smartjog.com>
* qatar/master:
log: Only include unistd.h if configure found it
ape: create audio stream before reading tags.
mov: make a length variable larger.
image2: Add "start_number" private option to the demuxer
image2: Add "start_number" private option to the muxer
avconv: remove a forgotten debugging printf.
avconv: use more descriptive names for hardcoded filters.
avconv: remove redundant handling of async.
doc/filters: fix typo.
h264: use asm cabac reader under a generic condition
Conflicts:
ffmpeg.c
libavformat/img2dec.c
libavformat/img2enc.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
This removes a dependency on implementation details from generic
code and allows easy addition of the equivalent optimisation for
other architectures than x86.
Signed-off-by: Mans Rullgard <mans@mansr.com>
This adds a hand-optimized assembly version for get_cabac much like the
existing one, but it works if the table offsets are RIP-relative.
Compared to the non-RIP-relative version this adds 2 lea instructions
and it needs one extra register.
There is a surprisingly large performance improvement over the c version (more
so than the generated assembly seems to suggest) just in get_cabac, I measured
roughly 40% faster for get_cabac on a K8. However, overall the difference is
not that big, I measured roughly 5% on a test clip on a K8 and a Core2.
Hopefully it still compiles on x86 32bit...
Now that only one table is used, there's some chance even darwin as compiles
this (apparently the label arithmetic used previously doesn't work if it
involves symbols defined in a different file, thanks to Ronald S. Bultje for
helping me with this).
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
The reason is this is easier for PIC code (in particular on darwin...).
Keep the old names as pointers (static in cabac_functions.h so gcc
knows these are just immediate offsets) so the c code can nicely stay the same
(alternatively could use offsets directly in the functions needing the
tables). This should produce the same code as before with non-pic and better
code (confirmed) with pic.
The assembly uses the new table but still won't work for PIC case.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
This adds a hand-optimized assembly version for get_cabac much like the
existing one, but it works if the table offsets are RIP-relative.
Compared to the non-RIP-relative version this adds 2 lea instructions
and it needs one extra register. get_cabac() gets about 40% faster, for
an overall speedup of about 5%.
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
The reason is this is easier for PIC code (in particular on darwin...).
Keep the old names as pointers (static in cabac_functions.h so gcc
knows these are just immediate offsets) so the c code can nicely stay the same
(alternatively could use offsets directly in the functions needing the
tables). This should produce the same code as before with non-pic and better
code (confirmed) with pic.
The assembly uses the new table but still won't work for PIC case.
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
This adds a hand-optimized assembly version for get_cabac much like the
existing one, but it works if the table offsets are RIP-relative.
Compared to the non-RIP-relative version this adds 2 lea instructions
and it needs one extra register.
There is a surprisingly large performance improvement over the c version (more
so than the generated assembly seems to suggest) just in get_cabac, I measured
roughly 40% faster for get_cabac on a K8. However, overall the difference is
not that big, I measured roughly 5% on a test clip on a K8 and a Core2.
Hopefully it still compiles on x86 32bit...
v2: incorporated feedback from Loren Merritt to avoid rip-relative movs
for every table, and got rid of unnecessary @GOTPCREL.
v3: apply similar fixes to the the decode_significance functions, and use
same macro arguments for non-pic case.
v4: prettify inline asm arguments, add a non-fast-cmov version (as I expect
the c code to be faster otherwise since both cmov and sbb suck hard on a
Prescott, even can't construct the mask with a 64bit shift as that's just as
terrible - it's quite difficult to find usable instructions on that chip...).
This is tested to work but not on a P4, in theory it _should_ be fast there.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* qatar/master:
h264: Factorize declaration of mb_sizes array.
vsrc_buffer: when no frame is available, return an error instead of segfaulting.
configure: add dl to frei0r extralibs.
dsputil x86: use SSE float instruction instead of SSE2 integer equivalent
dsputil x86: remove deprecated parameter from scalarproduct_int16 prototype
vp8dsp x86: perform rounding shift with a single instruction
fate: add BMP tests.
swscale: handle complete dimensions for monoblack/white.
aacenc: Mark deinterleave_input_samples argument as const.
vf_unsharp: Mark readonly variable as const.
h264: fix 4:2:2 PCM-macroblocks decoding
Conflicts:
configure
libavcodec/h264.h
libavcodec/x86/dsputil_mmx.c
libavfilter/vf_unsharp.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
* qatar/master: (58 commits)
amrnbdec: check frame size before decoding.
cscd: use negative error values to indicate decode_init() failures.
h264: prevent overreads in intra PCM decoding.
FATE: do not decode audio in the nuv test.
dxa: set audio stream time base using the sample rate
psx-str: do not allow seeking by bytes
asfdec: Do not set AVCodecContext.frame_size
vqf: set packet parameters after av_new_packet()
mpegaudiodec: use DSPUtil.butterflies_float().
FATE: add mp3 test for sample that exhibited false overreads
fate: add cdxl test for bit line plane arrangement
vmnc: return error on decode_init() failure.
libvorbis: add/update error messages
libvorbis: use AVFifoBuffer for output packet buffer
libvorbis: remove unneeded e_o_s check
libvorbis: check return values for functions that can return errors
libvorbis: use float input instead of s16
libvorbis: do not flush libvorbis analysis if dsp state was not initialized
libvorbis: use VBR by default, with default quality of 3
libvorbis: fix use of minrate/maxrate AVOptions
...
Conflicts:
Changelog
doc/APIchanges
libavcodec/avcodec.h
libavcodec/dpxenc.c
libavcodec/libvorbis.c
libavcodec/vmnc.c
libavformat/asfdec.c
libavformat/id3v2enc.c
libavformat/internal.h
libavformat/mp3enc.c
libavformat/utils.c
libavformat/version.h
libswscale/utils.c
tests/fate/video.mak
tests/ref/fate/nuv
tests/ref/fate/prores-alpha
tests/ref/lavf/ffm
tests/ref/vsynth1/prores
tests/ref/vsynth2/prores
Merged-by: Michael Niedermayer <michaelni@gmx.at>
* qatar/master: (38 commits)
v210enc: remove redundant check for pix_fmt
wavpack: allow user to disable CRC checking
v210enc: Use Bytestream2 functions
cafdec: Check return value of avio_seek and avoid modifying state if it fails
yop: Check return value of avio_seek and avoid modifying state if it fails
tta: Check return value of avio_seek and avoid modifying state if it fails
tmv: Check return value of avio_seek and avoid modifying state if it fails
r3d: Check return value of avio_seek and avoid modifying state if it fails
nsvdec: Check return value of avio_seek and avoid modifying state if it fails
mpc8: Check return value of avio_seek and avoid modifying state if it fails
jvdec: Check return value of avio_seek and avoid modifying state if it fails
filmstripdec: Check return value of avio_seek and avoid modifying state if it fails
ffmdec: Check return value of avio_seek and avoid modifying state if it fails
dv: Check return value of avio_seek and avoid modifying state if it fails
bink: Check return value of avio_seek and avoid modifying state if it fails
Check AVCodec.pix_fmts in avcodec_open2()
svq3: Prevent illegal reads while parsing extradata.
remove ParseContext1
vc1: use ff_parse_close
mpegvideo parser: move specific fields into private context
...
Conflicts:
libavcodec/4xm.c
libavcodec/aacdec.c
libavcodec/h264.c
libavcodec/h264.h
libavcodec/h264_cabac.c
libavcodec/h264_cavlc.c
libavcodec/mpeg4video_parser.c
libavcodec/svq3.c
libavcodec/v210enc.c
libavformat/cafdec.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
Conversion of the luma intra prediction mode to one of the constrained
("alzheimer") ones can happen by crafting special bitstreams, causing
a crash because we'll call a NULL function pointer for 16x16 block intra
prediction, since constrained intra prediction functions are only
implemented for chroma (8x8 blocks).
Found-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind
CC: libav-stable@libav.org
* qatar/master:
FATE: add tests for targa
ARM: fix Thumb-mode simple_idct_arm
ARM: 4-byte align start of all asm functions
rgb2rgb: rgb12to15()
swscale-test: fix stack overread.
swscale: fix invalid conversions and memory problems.
cabac: split cabac.h into declarations and function definitions
cabac: Mark ff_h264_mps_state array as static, it is only used within cabac.c.
cabac: Remove ff_h264_lps_state array.
Conflicts:
libswscale/rgb2rgb.h
libswscale/swscale_unscaled.c
tests/fate/image.mak
Merged-by: Michael Niedermayer <michaelni@gmx.at>
This fixes standalone compilation of some decoders with --disable-optimizations.
cabac.h defines some inline functions that use symbols from cabac.c. Without
optimizations these inline functions are not eliminated and linking fails with
references to non-existing symbols.
Splitting the inline functions off into their own header and only #including
it in the places where the inline functions are used allows #including cabac.h
from anywhere without ill effects.
* qatar/master:
mpegvideo_enc: K&R cosmetics
doxygen: remove unreplaced variables from custom header and footer
threads: test for sys/param.h and include it for sysctl on OpenBSD
v4l2: remove unneded linux specific asm/types.h include
x86: Fix constraints for decode_significance*_x86
Conflicts:
libavcodec/mpegvideo_enc.c
libavdevice/v4l2.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
Originally, prior to 8742a4ff8, the caller code was compiled
within this condition:
ARCH_X86 && HAVE_7REGS && HAVE_EBX_AVAILABLE && !defined(BROKEN_RELOCATIONS)
Since HAVE_7REGS is defined as
(ARCH_X86_64 || (HAVE_EBX_AVAILABLE && HAVE_EBP_AVAILABLE))
the subcondition HAVE_7REGS && HAVE_EBX_AVAILABLE is equal
to HAVE_7REGS (for 32 bit at least). The correct simplification
of the original condition thus is HAVE_7REGS, not
HAVE_EBX_AVAILABLE.
This fixes compilation in some cases where HAVE_EBP_AVAILABLE = 0
and HAVE_EBX_AVAILABLE = 1.
Signed-off-by: Martin Storsjö <martin@martin.st>