* commit 'e588615d938f8581f0d6f3771662d08cadfc00de':
hevc: fix decoding of one PU wide files
Conflicts:
libavcodec/hevc.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
* commit 'ca96e337169093979d7c763064ad9dae12b3108c':
vp9: drop support for real (non-emulated) edges
Conflicts:
libavcodec/vp9block.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
* commit 'ef8c93e2f18c624d0c266687e43ab99af7921dd3':
vp8: drop support for real (non-emulated) edges
Conflicts:
tests/fate/vpx.mak
Merged-by: Michael Niedermayer <michaelni@gmx.at>
* commit 'ebfe622bb1ca57cecb932e42926745cba7161913':
mpegvideo: drop support for real (non-emulated) edges
Conflicts:
libavcodec/mpegvideo.c
libavcodec/mpegvideo_motion.c
libavcodec/wmv2.c
If this is slower on a major platform then it should be investigated
and potentially reverted.
See: 8fc52a5ef9
See: 3969b4b861
Merged-by: Michael Niedermayer <michaelni@gmx.at>
Fix PTS set on the frame when encoding, which must be specified in the
encoder timebase or this will confuse the encoder.
When muxing the packet, the PTS/DTS generated by the encoder is then
rescaled to the stream timebase.
They are not measurably faster on x86, they might be somewhat faster on
other platforms due to missing emu edge SIMD, but the gain is not large
enough to justify the added complexity.
They are not measurably faster on x86, they might be somewhat faster on
other platforms due to missing emu edge SIMD, but the gain is not large
enough to justify the added complexity.
Several decoders disable those anyway and they are not measurably faster
on x86. They might be somewhat faster on other platforms due to missing
emu edge SIMD, but the gain is not large enough (and those decoders
relevant enough) to justify the added complexity.
untested (noone tested within about a month) and the change is
quite trivial so should be ok. While the code before this change
is broken.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
If the audio changes from 9eac7c4 were merged as they were, this
would cause scripts with both video+audio to fail with a lot of
audio decoding errors (the video would be fine). Scripts with
only one of either video or audio were unaffected. Additionally,
the av_packet changes in general caused seeking to break.
Using av_packet_from_data allows video+audio scripts to work as
expected, without audio decoding errors. It also fixes seeking.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
These are the remaining av_packet-related bits from 9eac7c4
that didn't get merged at that time.
Changes authored by Anton Khirnov <anton@khirnov.net>, split out
from 9eac7c4 by Stephen Hutchinson <qyot27@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* commit '5dae4872357613a0b51120b54a4c5221e0ec3f69':
arm: Allow overriding the alignment set in the function macro
Merged-by: Michael Niedermayer <michaelni@gmx.at>
* commit 'b7b932f5e3602bd34c3cc634b71c8bbbc0fb8dc0':
arm: Remove a leftover define for the pld instruction
Merged-by: Michael Niedermayer <michaelni@gmx.at>
The header parser uses forward and backward parsing, making the
bulletproof prevention of loops difficult, thus this simple
detection code.
If someone improves the forward/backward parsing so it cannot loop
then this commit should be reverted
Fixes Ticket3278
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
Runtime of the full 32x32 idct goes from 2446 to 2441 cycles (intra) or
from 1425 to 1306 cycles (inter). Overall runtime is not significantly
affected.
Runtime of all IDCTs together goes from 3327 to 2473 cycles (intra, i.e.
~35% faster) or from 2312 to 1448 cycles (inter, i.e. ~60% faster). Total
decode time of ped1080p.webm goes from 8.086sec to 7.974sec (1.4% faster).
Sub-IDCTs will follow later. ped1080.webm goes from 9.295s to 8.191s
(13.5% faster). The IDCT itself goes from 4372 (intra) or 4337 (inter)
to 403 (intra) or 329 (inter) cycles for the DC-only form, 23755 (intra)
or 23723 (inter) to 3497 (intra) or 3607 (inter) cycles for the no-DC
form, which averages from 23393 (intra) or 16612 (inter) to 3449 (intra)
or 2392 (inter) for all 32x32s together, i.e. about ~7x faster (all
tests done on ped1080p.webm).