1
0
mirror of https://github.com/FFmpeg/FFmpeg.git synced 2024-11-26 19:01:44 +02:00
Commit Graph

21529 Commits

Author SHA1 Message Date
Mark Thompson
ff35aa8ca4 vaapi_encode: Pass framerate parameters to driver
Only do this when building for a recent VAAPI version - initial
driver implementations were confused about the interpretation of the
framerate field, but hopefully this will be consistent everywhere
once 0.40.0 is released.
2017-01-30 22:52:54 +00:00
Mark Thompson
eddfb57210 vaapi_h264: Enable VBR mode
Default to using VBR when a target bitrate is set, unless the max rate
is also set and matches the target.  Changes to the Intel driver mean
that min_qp is also respected in this case, so set a codec default to
unset the value rather than using the current default inherited from
the MPEG-4 part 2 encoder.
2017-01-30 22:52:54 +00:00
Mark Thompson
f033ba470f vaapi_encode: Support VBR mode
This includes a backward-compatibility hack to choose CBR anyway on
old drivers which have no CBR support, so that existing programs will
continue to work their options now map to VBR.
2017-01-30 22:52:54 +00:00
Mark Thompson
ca6ae3b77a vaapi_encode: Add MPEG-2 support 2017-01-29 13:28:31 +00:00
Alexandra Hájková
381a4e31a6 tak: Convert to the new bitstream reader 2017-01-25 11:06:58 +01:00
Diego Biurrun
2e0e150144 magicyuv: Convert to the new bitstream reader 2017-01-25 10:38:43 +01:00
Diego Biurrun
b061f298f7 truemotion2rt: Convert to the new bitstream reader 2017-01-25 09:55:36 +01:00
Alexandra Hájková
e7f24c9ffc wavpack: Convert to the new bitstream reader 2017-01-25 09:55:35 +01:00
Alexandra Hájková
6668bc80b5 mpc: Convert to the new bitstream reader 2017-01-25 09:55:33 +01:00
Alexandra Hájková
fd8de7f2d8 dxtory: Convert to the new bitstream reader 2017-01-20 10:18:32 +01:00
Alexandra Hájková
4d49a4c550 apedec: Convert to the new bitstream reader 2017-01-20 10:18:32 +01:00
Anton Khirnov
b4a911c189 mpegvideoenc: make a table const 2017-01-19 09:52:21 +01:00
Anton Khirnov
296eff4d9d zmbvenc: get rid of a global table 2017-01-19 09:52:10 +01:00
Derek Buitenhuis
00b775dda2 hevc: Mark as having threadsafe init
Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
Signed-off-by: Anton Khirnov <anton@khirnov.net>
2017-01-19 09:51:15 +01:00
Alexandra Hájková
54dcd22885 als: Convert to the new bitstream reader 2017-01-17 09:52:11 +01:00
Luca Barbato
fb59f87ce7 nvenc: Explicitly push the cuda context on encoding
Make sure that NVENC does not misbehave if other cuda usages happen
in the application.
2017-01-17 07:37:12 +01:00
Alexandra Hájková
4795e4f61f alac: Convert to the new bitstream reader 2017-01-13 10:27:03 +01:00
Luca Barbato
f8f7ad758d qsv: Set the correct range for la_depth
Setting an invalid range for it makes the encoder behave inconsistently.
2017-01-13 08:42:10 +01:00
Anton Khirnov
1202b71269 theora: export cropping information instead of handling it internally 2017-01-12 16:29:17 +01:00
Anton Khirnov
c3e84820d6 h264dec: export cropping information instead of handling it internally 2017-01-12 16:29:12 +01:00
Anton Khirnov
4fded0480f h264dec: be more explicit in handling container cropping
The current condition can trigger in cases where it shouldn't, with
unexpected results.
Make sure that:
- container cropping is really based on the original dimensions from the
  caller
- those dimenions are discarded on size change

The code is still quite hacky and eventually should be deprecated and
removed, with the decision about which cropping is used delegated to the
caller.
2017-01-12 16:28:05 +01:00
Anton Khirnov
a02ae1c683 hevcdec: export cropping information instead of handling it internally 2017-01-12 16:27:56 +01:00
Anton Khirnov
019ab88a95 lavc: add an option for exporting cropping information to the caller
Also, add generic code for handling cropping, so the decoders can export
just the cropping size and not bother with the rest.
2017-01-12 16:24:15 +01:00
Anton Khirnov
b68e353136 qsvdec: do not sync PIX_FMT_QSV surfaces
Introducing enforced sync points in arbitrary places is bad for
performance. Since the vast majority of receiving code (QSV VPP or
encoders, retrieving frames through hwcontext) will do the syncing, this
change should not be visible to most callers. But bumping micro just in
case.

This is also consistent with what VAAPI hwaccel does.
2017-01-12 16:21:39 +01:00
Steve Lhomme
ac3c3ee678 dxva2: allow an empty array of ID3D11VideoDecoderOutputView
We can pick the correct slice index directly from the ID3D11VideoDecoderOutputView
casted from data[3].

Signed-off-by: Anton Khirnov <anton@khirnov.net>
2017-01-12 16:19:13 +01:00
Steve Lhomme
f67235a28c dxva2: get the slice number directly from the surface in D3D11VA
No need to loop through the known surfaces, we'll use the requested surface
anyway.

The loop is only done for DXVA2.

Signed-off-by: Anton Khirnov <anton@khirnov.net>
2017-01-12 16:09:41 +01:00
Mark Thompson
89725a8512 vaapi_h264: Scale log2_max_pic_order_cnt_lsb with max_b_frames
Before this change, it was possible to overflow pic_order_cnt_lsb and
generate a stream with invalid POC numbering.  This makes sure that
the field is large enough that a single IDR B* P sequence uses fewer
than half the available POC lsb values.
2017-01-11 23:03:58 +00:00
Mark Thompson
a3c3a5eac2 vaapi_encode: Support forcing IDR frames via AVFrame.pict_type 2017-01-11 23:03:58 +00:00
Mark Thompson
37fab0661a vaapi_encode: Fix GOP sizing
This change makes the configured GOP size be respected exactly -
previously the value could be exceeded slightly due to flaws in the
frame type selection logic.
2017-01-11 23:03:58 +00:00
Alexandra Hájková
bd6496fa07 interplayvideo: Convert to the new bitstream reader 2017-01-09 15:21:47 +01:00
Alexandra Hájková
4e25051031 adx: Convert to the new bitstream reader 2017-01-09 15:21:47 +01:00
Alexandra Hájková
9aec009f65 dvbsubdec: Convert to the new bitstream reader 2017-01-09 15:21:47 +01:00
Alexandra Hájková
d7fe11634c motionpixels: Convert to the new bitstream reader 2017-01-09 15:18:16 +01:00
Anton Khirnov
f1af37b510 h264dec: make ff_h264_decode_init() static
It is not called from outside h264dec.c anymore.
2017-01-09 13:21:13 +01:00
Anton Khirnov
e7de05f98f h264dec: drop a redundant check
Cropping parameters are already checked for validity during SPS parsing,
no need to check them again.
2017-01-09 13:21:13 +01:00
Steve Lhomme
2835e9a9fd hevcdec: add P010 support for D3D11VA
Given it's the same API than DVXA2 I don't know why the same output was not
enabled for both.

Signed-off-by: Anton Khirnov <anton@khirnov.net>
2017-01-09 10:48:54 +01:00
Steve Lhomme
0ac2d86c47 dxva2: Factorize DXVA context validity test into a single macro
Signed-off-by: Diego Biurrun <diego@biurrun.de>
2017-01-08 16:41:24 +01:00
Steve Lhomme
f8a42d4f26 dxva2: Make ff_dxva2_get_surface() static and drop its name prefix
Signed-off-by: Diego Biurrun <diego@biurrun.de>
2017-01-08 16:41:07 +01:00
Jun Zhao
9b1db2d338 vaapi_h264: Fix POC on IDR frames
In H.264 section 8.2.1, we have that "The bitstream shall not contain
data that result in Min(TopFieldOrderCnt, BottomFieldOrderCnt) not
equal to 0 for a coded IDR frame".  This fixes the encoder to always
conform to this - previously the POC values formed an unbroken
sequence, not resetting to zero on IDR frames.

Signed-off-by: Mark Thompson <sw@jkqxz.net>
2017-01-04 21:52:06 +00:00
Mark Thompson
d08e02d929 vaapi_h265: Fix build failure with old libva without 10-bit surfaces
10-bit surface support was added in libva 1.6.2, earlier versions
support H.265 encoding in 8-bit only.
2017-01-04 21:49:41 +00:00
Martin Storsjö
85ad5ea72c aarch64: vp9mc: Fix a comment to refer to a register with the right name
Signed-off-by: Martin Storsjö <martin@martin.st>
2017-01-03 14:16:10 +02:00
Martin Storsjö
65074791e8 aarch64: vp9dsp: Fix vertical alignment in the init file
Signed-off-by: Martin Storsjö <martin@martin.st>
2017-01-03 14:15:58 +02:00
Martin Storsjö
c536e5e869 arm: vp9mc: Fix vertical alignment of operands
Signed-off-by: Martin Storsjö <martin@martin.st>
2017-01-03 14:15:45 +02:00
Diego Biurrun
53618054b6 parser: Add missing #include for printing ISO C99 conversion specifiers 2016-12-25 13:22:50 +01:00
Diego Biurrun
0b77a59336 Use correct printf conversion specifiers for POSIX integer types 2016-12-23 19:30:00 +01:00
Diego Biurrun
92db508307 build: Generate pkg-config files from Make and not from configure
This moves work from the configure to the Make stage where it can
be parallelized and ensures that pkgconfig files are updated when
library versions change.

Bug-Id: 449
2016-12-22 12:30:54 +01:00
Diego Biurrun
f9edc734e0 ratecontrol: Drop xvid-rc-related struct members unused after a6901b9c6 2016-12-21 11:13:20 +01:00
Ruta Gadkari
5b26d3b789 nvenc: Update check for lookahead
By default it is -1.

Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
2016-12-21 06:16:52 +01:00
Martin Storsjö
a0c443a398 aarch64: vp9itxfm: Use the offset parameter to movrel
This fixes build failures for iOS, broken since cad42fadcd.

Signed-off-by: Martin Storsjö <martin@martin.st>
2016-12-19 22:49:51 +02:00
Alexandra Hájková
fc322d6a70 tta: Convert to the new bitstream reader 2016-12-19 13:52:36 +01:00
Alexandra Hájková
00c72a1e01 mlp: Convert to the new bitstream reader 2016-12-19 13:22:29 +01:00
Alexandra Hájková
fa64aea12e unary: Convert to the new bitstream reader 2016-12-19 12:35:05 +01:00
Anton Khirnov
45286a625c h264dec: make sure to only end a field if it has been started
Calling ff_h264_field_end() when the per-field state is not properly
initialized leads to all kinds of undefined behaviour.

CC: libav-stable@libav.org
Bug-Id: 977 978 992
2016-12-19 08:15:58 +01:00
Anton Khirnov
c2fa6bb0e8 mpeg12dec: move setting first_field to mpeg_field_start()
For field picture, the first_field is set based on its previous value.
Before this commit, first_field is set when reading the picture
coding extension. However, in corrupted files there may be multiple
picture coding extension headers, so the final value of first_field that
is actually used during decoding can be wrong. That can lead to various
undefined behaviour, like predicting from a non-existing field.

Fix this problem, by setting first_field in mpeg_field_start(), which
should be called exactly once per field.

CC: libav-stable@libav.org
Bug-ID: 999
2016-12-19 08:15:49 +01:00
Anton Khirnov
e807491fc6 mpeg12dec: avoid signed overflow in bitrate calculation
CC: libav-stable@libav.org
Bug-Id: 981
Found-By: Agostino Sarubbo
2016-12-19 08:15:42 +01:00
Anton Khirnov
58405de095 mpegvideo_parser: avoid signed overflow in bitrate calculation
CC: libav-stable@libav.org
Bug-Id: 981
Found-By: Agostino Sarubbo
2016-12-19 08:15:07 +01:00
Anton Khirnov
cfa4eb4fba vaapi_decode: use the correct logging context 2016-12-19 08:13:28 +01:00
Anton Khirnov
ea8b730d8e hevcdec: add a VAAPI hwaccel
Partially based on a patch by Timo Rothenpieler <timo@rothenpieler.org>.
Additional scaling list handling fix by Jun Zhao <mypopydev@gmail.com>.
2016-12-19 08:13:08 +01:00
Anton Khirnov
d4a91e6534 pthread_frame: do not run hwaccel decoding asynchronously unless it's safe
Certain hardware decoding APIs are not guaranteed to be thread-safe, so
having the user access decoded hardware surfaces while the decoder is
running in another thread can cause failures (this is mainly known to
happen with DXVA2).

For such hwaccels, only allow the decoding thread to run while the user
is inside a lavc decode call (avcodec_send_packet/receive_frame).
2016-12-19 08:10:22 +01:00
Anton Khirnov
8dfba25ce8 pthread_frame: ensure the threads don't run simultaneously with hwaccel 2016-12-19 08:09:19 +01:00
Anton Khirnov
373fd76b4d hevcdec: do not set decoder-global SPS prematurely
It should only be set after the decoder state has been fully initialized
for using that SPS.
Fixes possible invalid reads on get_format() failure.

CC: libav-stable@libav.org
2016-12-19 08:07:15 +01:00
Janne Grunau
2425d7329f arm64: replace 'bic' with immediate with 'and' with inverted immediate
The former is not an official pseudo instruction although gas and llvm's
internal assembler support it. Fixes a build error with xcode 6.2
reported by Memphiz on github.
2016-12-14 21:53:05 +01:00
Diego Biurrun
ea7ee4b4e3 ppc: Centralize compiler-specific altivec.h #include handling in one place
Also move #includes into canonical order where appropriate.
2016-12-14 14:08:43 +01:00
Diego Biurrun
39929e55eb ppc: hevcdsp: Use shorthands for vector types
This is more consistent and fixes compilation with clang.
2016-12-14 14:08:43 +01:00
Diego Biurrun
554e55bbf0 decode.h: Add missing headers to fix standalone compilation 2016-12-14 14:08:43 +01:00
Wan-Teh Chang
343e283399 pthread_frame: use better memory orders for frame progress
This improves commit 59c7022740.

In ff_thread_report_progress(), the fast code path can load
progress[field] with the relaxed memory order, and the slow code path
can store progress[field] with the release memory order. These changes
are mainly intended to avoid confusion when one inspects the source code.
They are unlikely to have measurable performance improvement.

ff_thread_report_progress() and ff_thread_await_progress() form a pair.
ff_thread_await_progress() reads progress[field] with the acquire memory
order (in the fast code path). Therefore, one expects to see
ff_thread_report_progress() write progress[field] with the matching
release memory order.

In the fast code path in ff_thread_report_progress(), the atomic load of
progress[field] doesn't need the acquire memory order because the
calling thread is trying to make the data it just decoded visible to the
other threads, rather than trying to read the data decoded by other
threads.

In ff_thread_get_buffer(), initialize progress[0] and progress[1] using
atomic_init().

Signed-off-by: Wan-Teh Chang <wtc@google.com>
Signed-off-by: Anton Khirnov <anton@khirnov.net>
2016-12-14 11:16:51 +01:00
Derek Buitenhuis
5c7f2cf81d h264_slice: Wait for refs to be available before we use them in error concealment
This could happen when there was a frame number gap and frame threading was used.

Debugging-by: Ronald S. Bultje <rsbultje@gmail.com>
Debugging-by: Justin Ruggles <justin.ruggles@gmail.com>
Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>

CC:libav-stable@libav.org
Signed-off-by: Anton Khirnov <anton@khirnov.net>
2016-12-14 10:38:15 +01:00
Anton Khirnov
86157e6db2 hevc: decouple calling get_format() from exporting the SPS parameters
This makes sure ff_get_format() does not get called unnecessarily from
update_thread_context().
2016-12-14 09:06:45 +01:00
Anton Khirnov
730c023260 binkaudio: switch to the new send/receive API
It is more natural for this codec and allows to avoid awkward constructs
like "consuming 0 bytes from input". Also, keep a reference to the input
packet to avoid unnecessary copying.
2016-12-14 09:06:45 +01:00
Anton Khirnov
fa1749dd34 vp9: split superframes in the filtering stage before actual decoding
Significantly increases the efficiency of frame threading, since
individual frames in a superframe can now be decoded in parallel.
2016-12-14 09:06:45 +01:00
Anton Khirnov
03a80925ef lavc: add a bitstream filter for splitting VP9 superframes
Partially based on code by Ronald S. Bultje <rsbultje@gmail.com>.
2016-12-14 09:06:45 +01:00
Anton Khirnov
8fb4210ad8 qsvdec_h2645: switch to the new generic filtering mechanism
Drop the internal manual conversion from the MP4 format to Annex B.
2016-12-14 09:06:45 +01:00
Anton Khirnov
972c71e9cb lavc: add support for filtering packets before decoding 2016-12-14 09:06:45 +01:00
Anton Khirnov
061a0c14bb decode: restructure the core decoding code
Currently, the new decoding API is pretty much just a wrapper around the
old deprecated one. This is problematic, since it interferes with making
full use of the flexibility added by the new API. The old API should
also be removed at some future point.

Reorganize the code so that the new send_packet/receive_frame functions
call the actual decoding directly and change the old deprecated
avcodec_decode_* functions into wrappers around the new API.

The new internal API for decoders is now changing as well. Before this
commit, it mirrors the public API, so the decoders need to implement
send_packet() and receive_frame() callbacks. This turns out to require
awkward constructs in both the decoders and the generic code. After this
commit, the decoders only implement the receive_frame() callback and
call a new internal function, ff_decode_get_packet() to obtain input
data, in the same manner to how the bitstream filters now work.

avcodec will now always make a reference to the input packet, which means
that non-refcounted input packets will be copied. Keeping the previous
behaviour, where this copy could sometimes be avoided, would make the
code significantly more complex and fragile for only dubious gains,
since packets are typically small and everyone who cares about
performance should use refcounted packets anyway.
2016-12-14 09:06:44 +01:00
Anton Khirnov
549d0bdca5 decode: be more explicit about storing the last packet properties
The current code stores a pointer to the packet passed to the decoder,
which is then used during get_buffer() for timestamps and side data
passthrough. However, since this is a pointer to user data which we do
not own, storing it is potentially dangerous. It is also ill defined for
the new decoding API with split input/output.

Fix this problem by making an explicit internally owned copy of the
packet properties.
2016-12-14 09:06:44 +01:00
Anton Khirnov
47e547b321 lavc: add a null bitstream filter
It is useful for testing/debugging and will also be used as the default
filter in the following commit adding pre-decode filtering to avoid
having a separate non-filtered codepath.
2016-12-14 09:06:44 +01:00
Anton Khirnov
0309ddcfb2 lavc: handle MP3 in get_audio_frame_duration() 2016-12-14 09:06:44 +01:00
Diego Biurrun
6aa4ba7131 dxva2: Keep code shared between dxva2 and d3d11va under the correct #if
This partially reverts commit ac648bb835.
2016-12-12 13:44:25 +01:00
Alexandra Hajkova
b0e6b3f477 hevc: ppc: Add HEVC 4x4 IDCT for PowerPC
Signed-off-by: Diego Biurrun <diego@biurrun.de>
2016-12-12 09:25:16 +01:00
Diego Biurrun
a6901b9c6b Drop libxvid rate control support for mpegvideo encoding
The feature has outlived is usefulness and complicates the code.
2016-12-11 09:27:40 +01:00
Diego Biurrun
ac648bb835 dxva2: Simplify some ifdefs 2016-12-11 09:27:40 +01:00
Mark Thompson
7d81698b89 vaapi_h265: Fix CFR mode with framerate set in AVCodecContext
Same issue as 17a0f9481c.
2016-12-10 16:55:44 +00:00
Diego Biurrun
932cc6496e vdpau: Do not #include vdpau_x11.h from the main vdpau header
That header should only be included in the special bits that use X11 code.
2016-12-09 08:41:53 +01:00
Diego Biurrun
92e6b31c3b dxva2: Adjust multiple inclusion guard names to follow convention 2016-12-09 08:41:52 +01:00
Andreas Cadhalpun
fc85646ad4 libopusdec: fix out-of-bounds read
Signed-off-by: Andreas Cadhalpun <Andreas.Cadhalpun@googlemail.com>
2016-12-08 15:53:58 -05:00
Andreas Cadhalpun
dc2ad09493 libschroedingerdec: fix leaking of framewithpts
Also preserve the return value from ff_get_buffer().

Signed-off-by: Andreas Cadhalpun <Andreas.Cadhalpun@googlemail.com>
Signed-off-by: Vittorio Giovara <vittorio.giovara@gmail.com>
2016-12-08 15:53:58 -05:00
Andreas Cadhalpun
8c3a643808 libschroedingerdec: don't produce empty frames
They are not valid and can cause problems/crashes for API users.

Signed-off-by: Andreas Cadhalpun <Andreas.Cadhalpun@googlemail.com>
2016-12-08 15:53:58 -05:00
Timothy Gu
d3da8a0035 omx: Fix allocation check
Also use av_mallocz_array().

Bug-Id: CID 1396839
Signed-off-by: Vittorio Giovara <vittorio.giovara@gmail.com>
2016-12-08 15:53:58 -05:00
Timothy Gu
d32bdadda8 qsvdec: Fix memory leak on error
Bug-Id: CID 1396851
Signed-off-by: Vittorio Giovara <vittorio.giovara@gmail.com>
2016-12-08 15:53:58 -05:00
Diego Biurrun
d5759701a8 libkvazaar: Add missing header #includes
This fixes compilation after the next version bump.
2016-12-08 21:34:30 +01:00
Diego Biurrun
fbec58daa2 build: Add an internal component for hevc_ps code
This allows expressing dependencies in a more correct way.
2016-12-08 20:12:23 +01:00
Vittorio Giovara
2fb6acd9c2 lavc: Add spherical packet side data API
Signed-off-by: Vittorio Giovara <vittorio.giovara@gmail.com>
2016-12-07 14:34:34 -05:00
Diego Biurrun
624aa8ab22 build: Add missing Makefile entries and ifdefs for QSV hwaccels 2016-12-07 15:46:57 +01:00
Diego Biurrun
e1dc5358af build: Create a component for MPEG audio header decoding
Fixes standalone compilation of the libmp3lame encoder.
2016-12-05 16:13:05 +01:00
Diego Biurrun
0fdc9f81a0 build: Add missing hevc_ps dependency for QSV HEVC encoder 2016-12-05 16:13:04 +01:00
Alexandra Hájková
6c916192f3 mimic: Convert to the new bitstream reader 2016-12-03 14:36:03 +01:00
Alexandra Hájková
cdc6727c3e metasound: Convert to the new bitstream reader 2016-12-03 14:36:03 +01:00
Alexandra Hájková
6fad5abcad lagarith: Convert to the new bitstream reader 2016-12-03 14:36:03 +01:00
Alexandra Hájková
c3defda0d8 indeo: Convert to the new bitstream reader 2016-12-03 14:36:03 +01:00
Alexandra Hájková
f5b7bd2a7c imc: Convert to the new bitstream reader 2016-12-03 14:36:03 +01:00
Alexandra Hájková
39ecf0588f webp: Convert to the new bitstream reader 2016-12-03 14:36:03 +01:00
James Almer
33a2b73b98 mpeg4audio: correctly propagate meaningful error values
Signed-off-by: James Almer <jamrial@gmail.com>
2016-12-02 12:16:30 -05:00
Wan-Teh Chang
d82d5379ca mmaldec: initialize refcount using atomic_init()
This is how we initialize refcount in libavutil/buffer.c.

Signed-off-by: Wan-Teh Chang <wtc@google.com>
Signed-off-by: Vittorio Giovara <vittorio.giovara@gmail.com>
2016-12-02 12:16:26 -05:00
Vittorio Giovara
5168026a05 options_table: Do not rely on enum size as option bound
Signed-off-by: Vittorio Giovara <vittorio.giovara@gmail.com>
2016-12-02 11:36:46 -05:00
Vittorio Giovara
ff9db5cfd1 lavc: Use a stricter check for the color properties values
Signed-off-by: Vittorio Giovara <vittorio.giovara@gmail.com>
2016-12-02 11:36:42 -05:00
Diego Biurrun
0a35f128f3 cabac: x86: Give optimizations header a more meaningful name 2016-12-01 08:23:54 +01:00
Martin Storsjö
cad42fadcd aarch64: vp9itxfm: Skip empty slices in the first pass of idct_idct 16x16 and 32x32
This work is sponsored by, and copyright, Google.

Previously all subpartitions except the eob=1 (DC) case ran with
the same runtime:

vp9_inv_dct_dct_16x16_sub16_add_neon:   1373.2
vp9_inv_dct_dct_32x32_sub32_add_neon:   8089.0

By skipping individual 8x16 or 8x32 pixel slices in the first pass,
we reduce the runtime of these functions like this:

vp9_inv_dct_dct_16x16_sub1_add_neon:     235.3
vp9_inv_dct_dct_16x16_sub2_add_neon:    1036.7
vp9_inv_dct_dct_16x16_sub4_add_neon:    1036.7
vp9_inv_dct_dct_16x16_sub8_add_neon:    1036.7
vp9_inv_dct_dct_16x16_sub12_add_neon:   1372.1
vp9_inv_dct_dct_16x16_sub16_add_neon:   1372.1
vp9_inv_dct_dct_32x32_sub1_add_neon:     555.1
vp9_inv_dct_dct_32x32_sub2_add_neon:    5190.2
vp9_inv_dct_dct_32x32_sub4_add_neon:    5180.0
vp9_inv_dct_dct_32x32_sub8_add_neon:    5183.1
vp9_inv_dct_dct_32x32_sub12_add_neon:   6161.5
vp9_inv_dct_dct_32x32_sub16_add_neon:   6155.5
vp9_inv_dct_dct_32x32_sub20_add_neon:   7136.3
vp9_inv_dct_dct_32x32_sub24_add_neon:   7128.4
vp9_inv_dct_dct_32x32_sub28_add_neon:   8098.9
vp9_inv_dct_dct_32x32_sub32_add_neon:   8098.8

I.e. in general a very minor overhead for the full subpartition case due
to the additional cmps, but a significant speedup for the cases when we
only need to process a small part of the actual input data.

Signed-off-by: Martin Storsjö <martin@martin.st>
2016-11-30 23:57:05 +02:00
Martin Storsjö
9c8bc74c2b arm: vp9itxfm: Skip empty slices in the first pass of idct_idct 16x16 and 32x32
This work is sponsored by, and copyright, Google.

Previously all subpartitions except the eob=1 (DC) case ran with
the same runtime:

                                     Cortex A7       A8       A9      A53
vp9_inv_dct_dct_16x16_sub16_add_neon:   3188.1   2435.4   2499.0   1969.0
vp9_inv_dct_dct_32x32_sub32_add_neon:  18531.7  16582.3  14207.6  12000.3

By skipping individual 4x16 or 4x32 pixel slices in the first pass,
we reduce the runtime of these functions like this:

vp9_inv_dct_dct_16x16_sub1_add_neon:     274.6    189.5    211.7    235.8
vp9_inv_dct_dct_16x16_sub2_add_neon:    2064.0   1534.8   1719.4   1248.7
vp9_inv_dct_dct_16x16_sub4_add_neon:    2135.0   1477.2   1736.3   1249.5
vp9_inv_dct_dct_16x16_sub8_add_neon:    2446.7   1828.7   1993.6   1494.7
vp9_inv_dct_dct_16x16_sub12_add_neon:   2832.4   2118.3   2266.5   1735.1
vp9_inv_dct_dct_16x16_sub16_add_neon:   3211.7   2475.3   2523.5   1983.1
vp9_inv_dct_dct_32x32_sub1_add_neon:     756.2    456.7    862.0    553.9
vp9_inv_dct_dct_32x32_sub2_add_neon:   10682.2   8190.4   8539.2   6762.5
vp9_inv_dct_dct_32x32_sub4_add_neon:   10813.5   8014.9   8518.3   6762.8
vp9_inv_dct_dct_32x32_sub8_add_neon:   11859.6   9313.0   9347.4   7514.5
vp9_inv_dct_dct_32x32_sub12_add_neon:  12946.6  10752.4  10192.2   8280.2
vp9_inv_dct_dct_32x32_sub16_add_neon:  14074.6  11946.5  11001.4   9008.6
vp9_inv_dct_dct_32x32_sub20_add_neon:  15269.9  13662.7  11816.1   9762.6
vp9_inv_dct_dct_32x32_sub24_add_neon:  16327.9  14940.1  12626.7  10516.0
vp9_inv_dct_dct_32x32_sub28_add_neon:  17462.7  15776.1  13446.2  11264.7
vp9_inv_dct_dct_32x32_sub32_add_neon:  18575.5  17157.0  14249.3  12015.1

I.e. in general a very minor overhead for the full subpartition case due
to the additional loads and cmps, but a significant speedup for the cases
when we only need to process a small part of the actual input data.

In common VP9 content in a few inspected clips, 70-90% of the non-dc-only
16x16 and 32x32 IDCTs only have nonzero coefficients in the upper left
8x8 or 16x16 subpartitions respectively.

Signed-off-by: Martin Storsjö <martin@martin.st>
2016-11-30 23:54:07 +02:00
Martin Storsjö
3c87039a40 arm: vp9itxfm: Only reload the idct coeffs for the iadst_idct combination
This avoids reloading them if they haven't been clobbered, if the
first pass also was idct.

This is similar to what was done in the aarch64 version.

Signed-off-by: Martin Storsjö <martin@martin.st>
2016-11-30 23:53:52 +02:00
Clément Bœsch
c4c5f5386c vp9dsp: add DC only versions for idct/idct.
before:

time ./avconv -v 0 -nostats -threads 1 -i sintel_vp9_500kbps.webm -f null -
real    0m11.125s
user    0m11.059s
sys     0m0.050s

time ./avconv -v 0 -nostats -threads 1 -i sintel_vp9_500kbps.webm -f null -
real    0m10.944s
user    0m10.819s
sys     0m0.064s

after:

time ./avconv -v 0 -nostats -threads 1 -i sintel_vp9_500kbps.webm -f null -
real    0m8.153s
user    0m8.034s
sys     0m0.050s

time ./avconv -v 0 -nostats -threads 1 -i sintel_vp9_500kbps.webm -f null -
real    0m8.038s
user    0m7.980s
sys     0m0.039s

Signed-off-by: Martin Storsjö <martin@martin.st>
2016-11-30 23:48:28 +02:00
Diego Biurrun
e4382a4ab4 hevc: Eliminate pointless variable indirection 2016-11-30 14:11:44 +01:00
Diego Biurrun
5c89022542 hevc: Drop pointless av_unused attribute 2016-11-30 14:11:43 +01:00
Diego Biurrun
0983f9117f metasound: Drop unused tables 2016-11-30 13:44:05 +01:00
Diego Biurrun
212c6a1d70 mjpegdec: Check return values of functions that may fail 2016-11-29 13:13:35 +01:00
Diego Biurrun
3ee5f25d37 dxva2: Adjust printf length modifiers where appropriate 2016-11-29 13:13:35 +01:00
Anton Khirnov
3fe2a01df7 lavc: move decoding-related code from utils.c to a new file 2016-11-29 10:39:20 +01:00
Anton Khirnov
328cd2b599 lavc: move encoding-related code from utils.c to a new file 2016-11-29 10:39:20 +01:00
James Almer
45d199d5b0 aac_adtstoasc_bsf: validate and forward extradata if the stream is already ASC
Fixes AAC AudioSpecificConfig passthrough.

Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Anton Khirnov <anton@khirnov.net>
2016-11-29 10:39:20 +01:00
Andreas Cadhalpun
1762a39e09 mss2: only use error correction for matching block counts
This fixes a heap-buffer-overflow in ff_er_frame_end when decoding mss2
with coded_width/coded_height larger than width/height.

Signed-off-by: Andreas Cadhalpun <Andreas.Cadhalpun@googlemail.com>
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
2016-11-29 10:38:01 +01:00
Diego Biurrun
eb135516e6 ac3enc: Avoid unnecessary macro indirections 2016-11-28 17:19:30 +01:00
Diego Biurrun
f0d3e43bd7 ac3enc: Reshuffle functions to avoid forward declarations 2016-11-28 17:19:30 +01:00
Diego Biurrun
e22c63ac74 ac3enc: Reshuffle some float/fixed-mode ifdefs to avoid a dummy function 2016-11-28 17:19:30 +01:00
Anton Khirnov
4adbb44ad1 tta: avoid undefined shifts
Signed-off-by: Diego Biurrun <diego@biurrun.de>
2016-11-25 21:42:33 +01:00
Anton Khirnov
dc4b625028 tta: use get_unary() instead of a custom implementation
Signed-off-by: Diego Biurrun <diego@biurrun.de>
2016-11-25 21:42:33 +01:00
Martin Storsjö
2f99117f6f aarch64: vp9itxfm: Don't repeatedly set x9 when nothing overwrites it
Signed-off-by: Martin Storsjö <martin@martin.st>
2016-11-24 13:39:21 +02:00
Alexandra Hájková
178b4ea5f9 xsubdec: Convert to the new bitstream reader 2016-11-24 11:22:13 +01:00
Alexandra Hájková
be35ef92a4 xan: Convert to the new bitstream reader 2016-11-24 11:22:13 +01:00
Alexandra Hájková
f9c59f26c8 wnv1: Convert to the new bitstream reader 2016-11-24 11:22:13 +01:00
Alexandra Hájková
0536e7d782 vima: Convert to the new bitstream reader 2016-11-24 11:22:12 +01:00
Alexandra Hájková
e5bdfc6790 vble: Convert to the new bitstream reader 2016-11-24 11:22:12 +01:00
Alexandra Hájková
104a4289f9 utvideodec: Convert to the new bitstream reader 2016-11-24 11:22:12 +01:00
Alexandra Hájková
85f760fedd twinvq: Convert to the new bitstream reader 2016-11-24 11:22:12 +01:00
Alexandra Hájková
0bea79afa6 tscc2: Convert to the new bitstream reader 2016-11-24 11:22:12 +01:00
Alexandra Hájková
8e4cadea5d truespeech: Convert to the new bitstream reader 2016-11-24 11:22:12 +01:00
Alexandra Hájková
0ac07d0b8d tiertex: Convert to the new bitstream reader 2016-11-24 11:22:12 +01:00
Alexandra Hájková
9ab1a3e283 truemotion2: Convert to the new bitstream reader 2016-11-24 11:22:12 +01:00
Alexandra Hájková
9f78e3a46d svq1dec: Convert to the new bitstream reader 2016-11-24 11:22:12 +01:00
Alexandra Hájková
6efbc88a5c smacker: Convert to the new bitstream reader 2016-11-24 11:22:11 +01:00
Alexandra Hájková
087bc8d704 sipr: Convert to the new bitstream reader 2016-11-24 11:22:11 +01:00
Alexandra Hájková
f26cbb555b rtjpeg: Convert to the new bitstream reader 2016-11-24 11:22:11 +01:00
Alexandra Hájková
c60cda7cb4 ra288: Convert to the new bitstream reader 2016-11-24 11:22:11 +01:00
Alexandra Hájková
7d8075cf47 ra144: Convert to the new bitstream reader 2016-11-24 11:22:11 +01:00
Martin Storsjö
79566ec8c7 arm: vp9itxfm: Rename a macro parameter to fit better
Since the same parameter is used for both input and output,
the name inout is more fitting.

This matches the naming used below in the dmbutterfly macro.

Signed-off-by: Martin Storsjö <martin@martin.st>
2016-11-23 23:56:56 +02:00
Martin Storsjö
721bc37522 arm/aarch64: vp9itxfm: Fix indentation of macro arguments
Signed-off-by: Martin Storsjö <martin@martin.st>
2016-11-23 23:56:16 +02:00
James Almer
aa498c3183 avpacket: fix leak on realloc in av_packet_add_side_data()
If realloc fails, the pointer is overwritten and the previously allocated buffer
is leaked, which goes against the expected functionality of keeping the packet
unchanged in case of error.

Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Anton Khirnov <anton@khirnov.net>
2016-11-23 13:17:52 +01:00
Andreas Cadhalpun
f92d7bdfdd libopusdec: default to stereo for invalid number of channels
This fixes an out-of-bounds read if avc->channels is 0.

Signed-off-by: Andreas Cadhalpun <Andreas.Cadhalpun@googlemail.com>
Signed-off-by: Anton Khirnov <anton@khirnov.net>
2016-11-23 13:03:15 +01:00
Diego Biurrun
b34c6cd57a dvbsub: cosmetics: Group all debug code together 2016-11-23 07:40:46 +01:00
Diego Biurrun
b8cd7a3c8d dvbsub: Check for errors from system()
libavcodec/dvbsubdec.c:145:5: warning: ignoring return value of ‘system’, declared with attribute warn_unused_result [-Wunused-result]
libavcodec/dvbsubdec.c:148:5: warning: ignoring return value of ‘system’, declared with attribute warn_unused_result [-Wunused-result]
2016-11-23 07:36:32 +01:00
Diego Biurrun
6427379f23 als: Restructure DEBUG ifdefs to avoid unused function parameter warnings 2016-11-22 17:28:17 +01:00
Diego Biurrun
367f95af55 ac3enc: Restructure DEBUG ifdefs to avoid unused function parameter warnings 2016-11-22 17:28:17 +01:00
Diego Biurrun
81a3c42abe Drop some bogus Doxygen documentation. 2016-11-21 14:29:11 +01:00
Diego Biurrun
a1d9de304f Fix some mismatches between function parameter and doxygen parameter names. 2016-11-21 14:29:10 +01:00
Martin Storsjö
4d960a1185 aarch64: vp9itxfm: Use w3 instead of x3 for the int eob parameter
The clobbering tests in checkasm are only invoked when testing
correctness, so this bug didn't show up when benchmarking the
dc-only version.

Signed-off-by: Martin Storsjö <martin@martin.st>
2016-11-18 23:17:33 +02:00
Janne Grunau
e5b0fc170f arm: vp9itxfm: Simplify the stack alignment code
This is one instruction less for thumb, and only have got
1/2 arm/thumb specific instructions.

Signed-off-by: Martin Storsjö <martin@martin.st>
2016-11-18 23:17:26 +02:00
Alexandra Hájková
0b5a26e8bc qdm2: Convert to the new bitstream reader
Signed-off-by: Anton Khirnov <anton@khirnov.net>
2016-11-18 10:36:22 +01:00
Alexandra Hájková
0dabd329e8 qcelp: Convert to the new bitstream reader
Signed-off-by: Anton Khirnov <anton@khirnov.net>
2016-11-18 10:36:18 +01:00
Alexandra Hájková
770406d1e8 pcx: Convert to the new bitstream reader
Signed-off-by: Anton Khirnov <anton@khirnov.net>
2016-11-18 10:36:14 +01:00
Alexandra Hájková
b3441350fa opus: Convert to the new bitstream reader
Signed-off-by: Anton Khirnov <anton@khirnov.net>
2016-11-18 10:36:11 +01:00
Alexandra Hájková
6f94a64bd6 nellymoser: Convert to the new bitstream reader
Signed-off-by: Anton Khirnov <anton@khirnov.net>
2016-11-18 10:36:08 +01:00
Alexandra Hájková
15d4dbfd4a jvdec: Convert to the new bitstream reader
Signed-off-by: Anton Khirnov <anton@khirnov.net>
2016-11-18 10:36:04 +01:00
Alexandra Hájková
1df549bfa2 hqx: Convert to the new bitstream header
Signed-off-by: Anton Khirnov <anton@khirnov.net>
2016-11-18 10:35:43 +01:00
Alexandra Hájková
c5e01d9170 hq_hqa: Convert to the new bitstream reader
Signed-off-by: Anton Khirnov <anton@khirnov.net>
2016-11-18 10:35:39 +01:00
Alexandra Hájková
b2c56301f9 gsm: Convert to the new bitstream reader
Signed-off-by: Anton Khirnov <anton@khirnov.net>
2016-11-18 10:35:36 +01:00
Alexandra Hájková
2188d53906 g72x: Convert to the new bitstream reader
Signed-off-by: Anton Khirnov <anton@khirnov.net>
2016-11-18 10:35:33 +01:00
Alexandra Hájková
799703c3ea g2meet: Convert to the new bitstream reader
Signed-off-by: Anton Khirnov <anton@khirnov.net>
2016-11-18 10:35:26 +01:00
Alexandra Hájková
b37b681f77 fraps: Convert to the new bitstream reader
Signed-off-by: Anton Khirnov <anton@khirnov.net>
2016-11-18 10:35:14 +01:00
Alexandra Hájková
692ba4fe64 flashsv: Convert to the new bitstream reader
Signed-off-by: Anton Khirnov <anton@khirnov.net>
2016-11-18 10:35:10 +01:00
Alexandra Hájková
418ccdd703 faxcompr: Convert to the new bitstream reader
Signed-off-by: Anton Khirnov <anton@khirnov.net>
2016-11-18 10:35:07 +01:00
Alexandra Hájková
8df1ac6b78 exr: Convert to the new bitstream reader
Signed-off-by: Anton Khirnov <anton@khirnov.net>
2016-11-18 10:35:04 +01:00
Alexandra Hájková
2906d8dcb3 escape130: Convert to the new bitstream reader
Signed-off-by: Anton Khirnov <anton@khirnov.net>
2016-11-18 10:35:01 +01:00
Alexandra Hájková
c43eb73172 escape124: Convert to the new bitstream reader
Signed-off-by: Anton Khirnov <anton@khirnov.net>
2016-11-18 10:34:57 +01:00
Alexandra Hájková
d8618570be dvdsubdec: Convert to the new bitstream reader
Signed-off-by: Anton Khirnov <anton@khirnov.net>
2016-11-18 10:34:53 +01:00
Alexandra Hájková
928f8c7ce3 dss_sp: Convert to the new bitstream reader
Signed-off-by: Anton Khirnov <anton@khirnov.net>
2016-11-18 10:34:38 +01:00
Alexandra Hájková
942e84d2a3 cook: Convert to the new bitstream reader
Signed-off-by: Anton Khirnov <anton@khirnov.net>
2016-11-18 10:34:32 +01:00
Alexandra Hájková
e561146611 cljrdec: Convert to the new bitstream reader
Signed-off-by: Anton Khirnov <anton@khirnov.net>
2016-11-18 10:34:29 +01:00
Alexandra Hájková
b4c0daa83c cdxl: Convert to the new bitstream reader
Signed-off-by: Anton Khirnov <anton@khirnov.net>
2016-11-18 10:34:24 +01:00
Alexandra Hájková
0977a7c2f6 binkaudio: Convert to the new bitstream reader
Signed-off-by: Anton Khirnov <anton@khirnov.net>
2016-11-18 10:34:15 +01:00
Alexandra Hájková
9a23b59943 bink: Convert to the new bitstream reader
Signed-off-by: Anton Khirnov <anton@khirnov.net>
2016-11-18 10:34:10 +01:00
Alexandra Hájková
dae9b0b9c6 avs: Convert to the new bitstream reader
Signed-off-by: Anton Khirnov <anton@khirnov.net>
2016-11-18 10:34:04 +01:00
Alexandra Hájková
edd4c19a78 atrac3plus: Convert to the new bitstream reader
Signed-off-by: Anton Khirnov <anton@khirnov.net>
2016-11-18 10:33:59 +01:00
Alexandra Hájková
0272119202 atrac: Convert to the new bitstream reader
Signed-off-by: Anton Khirnov <anton@khirnov.net>
2016-11-18 10:33:50 +01:00
Alexandra Hájková
41679be1a2 asvdec: Convert to the new bitstream reader
Signed-off-by: Anton Khirnov <anton@khirnov.net>
2016-11-18 10:33:45 +01:00
Alexandra Hájková
012c451153 adpcm: Convert to the new bitstream header
Signed-off-by: Anton Khirnov <anton@khirnov.net>
2016-11-18 10:33:01 +01:00
Alexandra Hájková
ed006ae4e2 4xm: Convert to the new bitstream reader
Signed-off-by: Anton Khirnov <anton@khirnov.net>
2016-11-18 10:32:57 +01:00
Alexandra Hájková
b25180801b on2avc: Convert to the new bitstream reader
Signed-off-by: Anton Khirnov <anton@khirnov.net>
2016-11-18 10:32:54 +01:00
Alexandra Hájková
7d957b3f47 ea: Convert to the new bitstream reader
Signed-off-by: Anton Khirnov <anton@khirnov.net>
2016-11-18 10:32:45 +01:00
Alexandra Hájková
adb1ebb36c eamad: Convert to the new bitstream reader
Signed-off-by: Anton Khirnov <anton@khirnov.net>
2016-11-18 10:32:40 +01:00
Alexandra Hájková
d182d8a6d3 cllc: Convert to the new bitstream reader
Signed-off-by: Anton Khirnov <anton@khirnov.net>
2016-11-18 10:31:59 +01:00
Alexandra Hájková
dd3d7ddf2a lavc: add a new bitstream reader to replace get_bits
The new bit reader features a simpler API and an implementation without
stacks of nested macros.

Signed-off-by: Anton Khirnov <anton@khirnov.net>
2016-11-18 10:31:56 +01:00
Luca Barbato
adb0e941c3 avpacket: Mark src pointer as constant 2016-11-17 19:41:12 +01:00
Diego Biurrun
76167140a9 qsvdec: Drop stray extra braces around initializer
libavcodec/qsvdec.c:93:5: warning: braces around scalar initializer
2016-11-17 16:53:48 +01:00
Diego Biurrun
715b824346 qsv: Drop some unused variables 2016-11-17 16:53:48 +01:00
Janne Grunau
e7ae8f7a71 aarch64: vp9: loop filter: replace 'orr; cbn?z' with 'adds; b.{eq,ne};
The latter is 1 cycle faster on a cortex-53 and since the operands are
bytewise (or larger) bitmask (impossible to overflow to zero) both are
equivalent.
2016-11-16 09:05:18 +01:00
Janne Grunau
d7595de0b2 aarch64: vp9: use alternative returns in the core loop filter function
Since aarch64 has enough free general purpose registers use them to
branch to the appropiate storage code. 1-2 cycles faster for the
functions using loop_filter 8/16, ... on a cortex-a53. Mixed results
(up to 2 cycles faster/slower) on a cortex-a57.
2016-11-16 09:05:18 +01:00
Gianluigi Tiesi
e17567a831 libilbc: support for latest git of libilbc
In the latest git commits of libilbc developers removed WebRtc_xxx typedefs.
This commit uses int types instead. It's safe to apply also for previous
versions since WebRtc_Word16 was always a typedef of int16_t and
WebRtc_UWord16 a typedef of uint16_t.

Reviewed-by: Timothy Gu <timothygu99@gmail.com>
Signed-off-by: Diego Biurrun <diego@biurrun.de>
2016-11-16 08:21:05 +01:00
Diego Biurrun
f7407f56cb golomb: Replace __PRETTY_FUNCTION__ with __func__ for tracing
The former is a GNU extension while the latter is C99.
2016-11-15 09:41:08 +01:00
Mark Thompson
e0b164576f qsv: Add VP8 decoder 2016-11-14 19:38:20 +00:00
Mark Thompson
182cf170a5 vp8: Return stream format information from parser 2016-11-14 19:38:19 +00:00
Mark Thompson
b6582b2927 qsv: Add VC-1 decoder
It uses the same code as the MPEG-2 decoder, so the file is renamed
to contain all "other" (that is, not H.26[45]) codecs.
2016-11-14 19:38:19 +00:00
Mark Thompson
fea4dc05b4 vc1: Return stream format information from parser 2016-11-14 19:38:19 +00:00