FFmpeg

mirror of https://github.com/FFmpeg/FFmpeg.git synced 2025-01-29 22:00:58 +02:00

Author	SHA1	Message	Date
Anton Khirnov	8dfba25ce8	pthread_frame: ensure the threads don't run simultaneously with hwaccel	2016-12-19 08:09:19 +01:00
Anton Khirnov	373fd76b4d	hevcdec: do not set decoder-global SPS prematurely It should only be set after the decoder state has been fully initialized for using that SPS. Fixes possible invalid reads on get_format() failure. CC: libav-stable@libav.org	2016-12-19 08:07:15 +01:00
Janne Grunau	2425d7329f	arm64: replace 'bic' with immediate with 'and' with inverted immediate The former is not an official pseudo instruction although gas and llvm's internal assembler support it. Fixes a build error with xcode 6.2 reported by Memphiz on github.	2016-12-14 21:53:05 +01:00
Diego Biurrun	ea7ee4b4e3	ppc: Centralize compiler-specific altivec.h #include handling in one place Also move #includes into canonical order where appropriate.	2016-12-14 14:08:43 +01:00
Diego Biurrun	39929e55eb	ppc: hevcdsp: Use shorthands for vector types This is more consistent and fixes compilation with clang.	2016-12-14 14:08:43 +01:00
Diego Biurrun	554e55bbf0	decode.h: Add missing headers to fix standalone compilation	2016-12-14 14:08:43 +01:00
Wan-Teh Chang	343e283399	pthread_frame: use better memory orders for frame progress This improves commit 59c70227405c214b29971e6272f3a3ff6fcce3d0. In ff_thread_report_progress(), the fast code path can load progress[field] with the relaxed memory order, and the slow code path can store progress[field] with the release memory order. These changes are mainly intended to avoid confusion when one inspects the source code. They are unlikely to have measurable performance improvement. ff_thread_report_progress() and ff_thread_await_progress() form a pair. ff_thread_await_progress() reads progress[field] with the acquire memory order (in the fast code path). Therefore, one expects to see ff_thread_report_progress() write progress[field] with the matching release memory order. In the fast code path in ff_thread_report_progress(), the atomic load of progress[field] doesn't need the acquire memory order because the calling thread is trying to make the data it just decoded visible to the other threads, rather than trying to read the data decoded by other threads. In ff_thread_get_buffer(), initialize progress[0] and progress[1] using atomic_init(). Signed-off-by: Wan-Teh Chang <wtc@google.com> Signed-off-by: Anton Khirnov <anton@khirnov.net>	2016-12-14 11:16:51 +01:00
Derek Buitenhuis	5c7f2cf81d	h264_slice: Wait for refs to be available before we use them in error concealment This could happen when there was a frame number gap and frame threading was used. Debugging-by: Ronald S. Bultje <rsbultje@gmail.com> Debugging-by: Justin Ruggles <justin.ruggles@gmail.com> Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com> CC:libav-stable@libav.org Signed-off-by: Anton Khirnov <anton@khirnov.net>	2016-12-14 10:38:15 +01:00
Anton Khirnov	86157e6db2	hevc: decouple calling get_format() from exporting the SPS parameters This makes sure ff_get_format() does not get called unnecessarily from update_thread_context().	2016-12-14 09:06:45 +01:00
Anton Khirnov	730c023260	binkaudio: switch to the new send/receive API It is more natural for this codec and allows to avoid awkward constructs like "consuming 0 bytes from input". Also, keep a reference to the input packet to avoid unnecessary copying.	2016-12-14 09:06:45 +01:00
Anton Khirnov	fa1749dd34	vp9: split superframes in the filtering stage before actual decoding Significantly increases the efficiency of frame threading, since individual frames in a superframe can now be decoded in parallel.	2016-12-14 09:06:45 +01:00
Anton Khirnov	03a80925ef	lavc: add a bitstream filter for splitting VP9 superframes Partially based on code by Ronald S. Bultje <rsbultje@gmail.com>.	2016-12-14 09:06:45 +01:00
Anton Khirnov	8fb4210ad8	qsvdec_h2645: switch to the new generic filtering mechanism Drop the internal manual conversion from the MP4 format to Annex B.	2016-12-14 09:06:45 +01:00
Anton Khirnov	972c71e9cb	lavc: add support for filtering packets before decoding	2016-12-14 09:06:45 +01:00
Anton Khirnov	061a0c14bb	decode: restructure the core decoding code Currently, the new decoding API is pretty much just a wrapper around the old deprecated one. This is problematic, since it interferes with making full use of the flexibility added by the new API. The old API should also be removed at some future point. Reorganize the code so that the new send_packet/receive_frame functions call the actual decoding directly and change the old deprecated avcodec_decode_* functions into wrappers around the new API. The new internal API for decoders is now changing as well. Before this commit, it mirrors the public API, so the decoders need to implement send_packet() and receive_frame() callbacks. This turns out to require awkward constructs in both the decoders and the generic code. After this commit, the decoders only implement the receive_frame() callback and call a new internal function, ff_decode_get_packet() to obtain input data, in the same manner to how the bitstream filters now work. avcodec will now always make a reference to the input packet, which means that non-refcounted input packets will be copied. Keeping the previous behaviour, where this copy could sometimes be avoided, would make the code significantly more complex and fragile for only dubious gains, since packets are typically small and everyone who cares about performance should use refcounted packets anyway.	2016-12-14 09:06:44 +01:00
Anton Khirnov	549d0bdca5	decode: be more explicit about storing the last packet properties The current code stores a pointer to the packet passed to the decoder, which is then used during get_buffer() for timestamps and side data passthrough. However, since this is a pointer to user data which we do not own, storing it is potentially dangerous. It is also ill defined for the new decoding API with split input/output. Fix this problem by making an explicit internally owned copy of the packet properties.	2016-12-14 09:06:44 +01:00
Anton Khirnov	47e547b321	lavc: add a null bitstream filter It is useful for testing/debugging and will also be used as the default filter in the following commit adding pre-decode filtering to avoid having a separate non-filtered codepath.	2016-12-14 09:06:44 +01:00
Anton Khirnov	0309ddcfb2	lavc: handle MP3 in get_audio_frame_duration()	2016-12-14 09:06:44 +01:00
Diego Biurrun	6aa4ba7131	dxva2: Keep code shared between dxva2 and d3d11va under the correct #if This partially reverts commit ac648bb835edd3f67bda2267d0e72e5e582eb5a1.	2016-12-12 13:44:25 +01:00
Alexandra Hajkova	b0e6b3f477	hevc: ppc: Add HEVC 4x4 IDCT for PowerPC Signed-off-by: Diego Biurrun <diego@biurrun.de>	2016-12-12 09:25:16 +01:00
Diego Biurrun	a6901b9c6b	Drop libxvid rate control support for mpegvideo encoding The feature has outlived its usefulness and complicates the code.	2016-12-11 09:27:40 +01:00
Diego Biurrun	ac648bb835	dxva2: Simplify some ifdefs	2016-12-11 09:27:40 +01:00
Mark Thompson	7d81698b89	vaapi_h265: Fix CFR mode with framerate set in AVCodecContext Same issue as 17a0f9481cf07af0feb3838ca315b970117e8000.	2016-12-10 16:55:44 +00:00
Diego Biurrun	932cc6496e	vdpau: Do not #include vdpau_x11.h from the main vdpau header That header should only be included in the special bits that use X11 code.	2016-12-09 08:41:53 +01:00
Diego Biurrun	92e6b31c3b	dxva2: Adjust multiple inclusion guard names to follow convention	2016-12-09 08:41:52 +01:00
Andreas Cadhalpun	fc85646ad4	libopusdec: fix out-of-bounds read Signed-off-by: Andreas Cadhalpun <Andreas.Cadhalpun@googlemail.com>	2016-12-08 15:53:58 -05:00
Andreas Cadhalpun	dc2ad09493	libschroedingerdec: fix leaking of framewithpts Also preserve the return value from ff_get_buffer(). Signed-off-by: Andreas Cadhalpun <Andreas.Cadhalpun@googlemail.com> Signed-off-by: Vittorio Giovara <vittorio.giovara@gmail.com>	2016-12-08 15:53:58 -05:00
Andreas Cadhalpun	8c3a643808	libschroedingerdec: don't produce empty frames They are not valid and can cause problems/crashes for API users. Signed-off-by: Andreas Cadhalpun <Andreas.Cadhalpun@googlemail.com>	2016-12-08 15:53:58 -05:00
Timothy Gu	d3da8a0035	omx: Fix allocation check Also use av_mallocz_array(). Bug-Id: CID 1396839 Signed-off-by: Vittorio Giovara <vittorio.giovara@gmail.com>	2016-12-08 15:53:58 -05:00
Timothy Gu	d32bdadda8	qsvdec: Fix memory leak on error Bug-Id: CID 1396851 Signed-off-by: Vittorio Giovara <vittorio.giovara@gmail.com>	2016-12-08 15:53:58 -05:00
Diego Biurrun	d5759701a8	libkvazaar: Add missing header #includes This fixes compilation after the next version bump.	2016-12-08 21:34:30 +01:00
Diego Biurrun	fbec58daa2	build: Add an internal component for hevc_ps code This allows expressing dependencies in a more correct way.	2016-12-08 20:12:23 +01:00
Vittorio Giovara	2fb6acd9c2	lavc: Add spherical packet side data API Signed-off-by: Vittorio Giovara <vittorio.giovara@gmail.com>	2016-12-07 14:34:34 -05:00
Diego Biurrun	624aa8ab22	build: Add missing Makefile entries and ifdefs for QSV hwaccels	2016-12-07 15:46:57 +01:00
Diego Biurrun	e1dc5358af	build: Create a component for MPEG audio header decoding Fixes standalone compilation of the libmp3lame encoder.	2016-12-05 16:13:05 +01:00
Diego Biurrun	0fdc9f81a0	build: Add missing hevc_ps dependency for QSV HEVC encoder	2016-12-05 16:13:04 +01:00
Alexandra Hájková	6c916192f3	mimic: Convert to the new bitstream reader	2016-12-03 14:36:03 +01:00
Alexandra Hájková	cdc6727c3e	metasound: Convert to the new bitstream reader	2016-12-03 14:36:03 +01:00
Alexandra Hájková	6fad5abcad	lagarith: Convert to the new bitstream reader	2016-12-03 14:36:03 +01:00
Alexandra Hájková	c3defda0d8	indeo: Convert to the new bitstream reader	2016-12-03 14:36:03 +01:00
Alexandra Hájková	f5b7bd2a7c	imc: Convert to the new bitstream reader	2016-12-03 14:36:03 +01:00
Alexandra Hájková	39ecf0588f	webp: Convert to the new bitstream reader	2016-12-03 14:36:03 +01:00
James Almer	33a2b73b98	mpeg4audio: correctly propagate meaningful error values Signed-off-by: James Almer <jamrial@gmail.com>	2016-12-02 12:16:30 -05:00
Wan-Teh Chang	d82d5379ca	mmaldec: initialize refcount using atomic_init() This is how we initialize refcount in libavutil/buffer.c. Signed-off-by: Wan-Teh Chang <wtc@google.com> Signed-off-by: Vittorio Giovara <vittorio.giovara@gmail.com>	2016-12-02 12:16:26 -05:00
Vittorio Giovara	5168026a05	options_table: Do not rely on enum size as option bound Signed-off-by: Vittorio Giovara <vittorio.giovara@gmail.com>	2016-12-02 11:36:46 -05:00
Vittorio Giovara	ff9db5cfd1	lavc: Use a stricter check for the color properties values Signed-off-by: Vittorio Giovara <vittorio.giovara@gmail.com>	2016-12-02 11:36:42 -05:00
Diego Biurrun	0a35f128f3	cabac: x86: Give optimizations header a more meaningful name	2016-12-01 08:23:54 +01:00
Martin Storsjö	cad42fadcd	aarch64: vp9itxfm: Skip empty slices in the first pass of idct_idct 16x16 and 32x32 This work is sponsored by, and copyright, Google. Previously all subpartitions except the eob=1 (DC) case ran with the same runtime: vp9_inv_dct_dct_16x16_sub16_add_neon: 1373.2 vp9_inv_dct_dct_32x32_sub32_add_neon: 8089.0 By skipping individual 8x16 or 8x32 pixel slices in the first pass, we reduce the runtime of these functions like this: vp9_inv_dct_dct_16x16_sub1_add_neon: 235.3 vp9_inv_dct_dct_16x16_sub2_add_neon: 1036.7 vp9_inv_dct_dct_16x16_sub4_add_neon: 1036.7 vp9_inv_dct_dct_16x16_sub8_add_neon: 1036.7 vp9_inv_dct_dct_16x16_sub12_add_neon: 1372.1 vp9_inv_dct_dct_16x16_sub16_add_neon: 1372.1 vp9_inv_dct_dct_32x32_sub1_add_neon: 555.1 vp9_inv_dct_dct_32x32_sub2_add_neon: 5190.2 vp9_inv_dct_dct_32x32_sub4_add_neon: 5180.0 vp9_inv_dct_dct_32x32_sub8_add_neon: 5183.1 vp9_inv_dct_dct_32x32_sub12_add_neon: 6161.5 vp9_inv_dct_dct_32x32_sub16_add_neon: 6155.5 vp9_inv_dct_dct_32x32_sub20_add_neon: 7136.3 vp9_inv_dct_dct_32x32_sub24_add_neon: 7128.4 vp9_inv_dct_dct_32x32_sub28_add_neon: 8098.9 vp9_inv_dct_dct_32x32_sub32_add_neon: 8098.8 I.e. in general a very minor overhead for the full subpartition case due to the additional cmps, but a significant speedup for the cases when we only need to process a small part of the actual input data. Signed-off-by: Martin Storsjö <martin@martin.st>	2016-11-30 23:57:05 +02:00
Martin Storsjö	9c8bc74c2b	arm: vp9itxfm: Skip empty slices in the first pass of idct_idct 16x16 and 32x32 This work is sponsored by, and copyright, Google. Previously all subpartitions except the eob=1 (DC) case ran with the same runtime: Cortex A7 A8 A9 A53 vp9_inv_dct_dct_16x16_sub16_add_neon: 3188.1 2435.4 2499.0 1969.0 vp9_inv_dct_dct_32x32_sub32_add_neon: 18531.7 16582.3 14207.6 12000.3 By skipping individual 4x16 or 4x32 pixel slices in the first pass, we reduce the runtime of these functions like this: vp9_inv_dct_dct_16x16_sub1_add_neon: 274.6 189.5 211.7 235.8 vp9_inv_dct_dct_16x16_sub2_add_neon: 2064.0 1534.8 1719.4 1248.7 vp9_inv_dct_dct_16x16_sub4_add_neon: 2135.0 1477.2 1736.3 1249.5 vp9_inv_dct_dct_16x16_sub8_add_neon: 2446.7 1828.7 1993.6 1494.7 vp9_inv_dct_dct_16x16_sub12_add_neon: 2832.4 2118.3 2266.5 1735.1 vp9_inv_dct_dct_16x16_sub16_add_neon: 3211.7 2475.3 2523.5 1983.1 vp9_inv_dct_dct_32x32_sub1_add_neon: 756.2 456.7 862.0 553.9 vp9_inv_dct_dct_32x32_sub2_add_neon: 10682.2 8190.4 8539.2 6762.5 vp9_inv_dct_dct_32x32_sub4_add_neon: 10813.5 8014.9 8518.3 6762.8 vp9_inv_dct_dct_32x32_sub8_add_neon: 11859.6 9313.0 9347.4 7514.5 vp9_inv_dct_dct_32x32_sub12_add_neon: 12946.6 10752.4 10192.2 8280.2 vp9_inv_dct_dct_32x32_sub16_add_neon: 14074.6 11946.5 11001.4 9008.6 vp9_inv_dct_dct_32x32_sub20_add_neon: 15269.9 13662.7 11816.1 9762.6 vp9_inv_dct_dct_32x32_sub24_add_neon: 16327.9 14940.1 12626.7 10516.0 vp9_inv_dct_dct_32x32_sub28_add_neon: 17462.7 15776.1 13446.2 11264.7 vp9_inv_dct_dct_32x32_sub32_add_neon: 18575.5 17157.0 14249.3 12015.1 I.e. in general a very minor overhead for the full subpartition case due to the additional loads and cmps, but a significant speedup for the cases when we only need to process a small part of the actual input data. In common VP9 content in a few inspected clips, 70-90% of the non-dc-only 16x16 and 32x32 IDCTs only have nonzero coefficients in the upper left 8x8 or 16x16 subpartitions respectively. Signed-off-by: Martin Storsjö <martin@martin.st>	2016-11-30 23:54:07 +02:00
Martin Storsjö	3c87039a40	arm: vp9itxfm: Only reload the idct coeffs for the iadst_idct combination This avoids reloading them if they haven't been clobbered, if the first pass also was idct. This is similar to what was done in the aarch64 version. Signed-off-by: Martin Storsjö <martin@martin.st>	2016-11-30 23:53:52 +02:00
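
Note on commit 343e283399 (pthread_frame memory orders): the release/acquire pairing described in that message can be illustrated with a minimal C11 sketch. This is not FFmpeg's code. The names progress, rows, report_progress(), await_progress() and consume_all() are hypothetical stand-ins, and the waiting side spins for brevity where a real implementation would presumably block on a mutex and condition variable in its slow path.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

/* Hypothetical stand-ins for illustration only, not FFmpeg structures. */
static atomic_int progress;  /* highest row index that is ready, -1 = none */
static int rows[4];          /* plain, non-atomic "decoded" data */

static void report_progress(int n)
{
    /* Fast path: a relaxed load is enough, because this thread is
     * publishing its own data, not reading data from other threads. */
    if (atomic_load_explicit(&progress, memory_order_relaxed) >= n)
        return;
    /* Release store: every write to rows[] sequenced before this store
     * becomes visible to a thread that acquire-loads this value. */
    atomic_store_explicit(&progress, n, memory_order_release);
}

static void await_progress(int n)
{
    /* Acquire load pairs with the release store above.
     * Busy-waiting is an assumption made to keep the sketch short. */
    while (atomic_load_explicit(&progress, memory_order_acquire) < n)
        ;
}

static void *producer(void *arg)
{
    (void)arg;
    for (int i = 0; i < 4; i++) {
        rows[i] = i * 10;    /* "decode" row i */
        report_progress(i);  /* then publish it */
    }
    return NULL;
}

/* Starts a producer thread, consumes each row once it is reported,
 * and returns the sum of the rows, or -1 if the thread cannot start. */
int consume_all(void)
{
    pthread_t t;
    int sum = 0;

    /* Initialize the atomic before any other thread can touch it. */
    atomic_init(&progress, -1);
    if (pthread_create(&t, NULL, producer, NULL) != 0)
        return -1;
    for (int i = 0; i < 4; i++) {
        await_progress(i);
        sum += rows[i];      /* safe: ordered after the acquire load */
    }
    pthread_join(t, NULL);
    assert(atomic_load(&progress) == 3);
    return sum;
}
```

Calling consume_all() from a main() and linking with pthread support (for example -pthread on GCC or Clang) exercises both sides. The consumer reads rows[i] only after an acquire load has observed a value of at least i, which is the ordering guarantee the commit message relies on.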

21320 Commits