FFmpeg

mirror of https://github.com/FFmpeg/FFmpeg.git synced 2024-12-02 03:06:28 +02:00

Author	SHA1	Message	Date
Martin Storsjö	383d96aa22	aarch64: vp9: Add NEON optimizations of VP9 MC functions

This work is sponsored by, and copyright, Google.

These are ported from the ARM version; it is essentially a 1:1 port with no extra added features, but with some hand tuning (especially for the plain copy/avg functions). The ARM version isn't very register starved to begin with, so there's not much to be gained from having more spare registers here - we only avoid having to clobber callee-saved registers.

Examples of runtimes vs the 32 bit version, on a Cortex A53:

                                     ARM   AArch64
vp9_avg4_neon:                      27.2      23.7
vp9_avg8_neon:                      56.5      54.7
vp9_avg16_neon:                    169.9     167.4
vp9_avg32_neon:                    585.8     585.2
vp9_avg64_neon:                   2460.3    2294.7
vp9_avg_8tap_smooth_4h_neon:       132.7     125.2
vp9_avg_8tap_smooth_4hv_neon:      478.8     442.0
vp9_avg_8tap_smooth_4v_neon:       126.0      93.7
vp9_avg_8tap_smooth_8h_neon:       241.7     234.2
vp9_avg_8tap_smooth_8hv_neon:      690.9     646.5
vp9_avg_8tap_smooth_8v_neon:       245.0     205.5
vp9_avg_8tap_smooth_64h_neon:    11273.2   11280.1
vp9_avg_8tap_smooth_64hv_neon:   22980.6   22184.1
vp9_avg_8tap_smooth_64v_neon:    11549.7   10781.1
vp9_put4_neon:                      18.0      17.2
vp9_put8_neon:                      40.2      37.7
vp9_put16_neon:                     97.4      99.5
vp9_put32_neon/armv8:              346.0     307.4
vp9_put64_neon/armv8:             1319.0    1107.5
vp9_put_8tap_smooth_4h_neon:       126.7     118.2
vp9_put_8tap_smooth_4hv_neon:      465.7     434.0
vp9_put_8tap_smooth_4v_neon:       113.0      86.5
vp9_put_8tap_smooth_8h_neon:       229.7     221.6
vp9_put_8tap_smooth_8hv_neon:      658.9     621.3
vp9_put_8tap_smooth_8v_neon:       215.0     187.5
vp9_put_8tap_smooth_64h_neon:    10636.7   10627.8
vp9_put_8tap_smooth_64hv_neon:   21076.8   21026.9
vp9_put_8tap_smooth_64v_neon:     9635.0    9632.4

These are generally about as fast as the corresponding ARM routines on the same CPU (at least on the A53), in most cases marginally faster.

The speedup vs C code is pretty much the same as for the 32 bit case; on the A53 it's around 6-13x for the larger 8tap filters. The exact speedup varies a little, since the C versions generally don't end up exactly as slow/fast as on 32 bit.

Signed-off-by: Martin Storsjö <martin@martin.st>	2016-11-10 11:15:56 +02:00
Martin Storsjö	c44a8a3eab	aarch64: Add an offset parameter to the movrel macro With apple tools, the linker fails with errors like these, if the offset is negative: ld: in section __TEXT,__text reloc 8: symbol index out of range for architecture arm64 Signed-off-by: Martin Storsjö <martin@martin.st>	2016-11-10 11:06:08 +02:00
Martin Storsjö	a4cfcddcb0	vp9: Make the subpel filters non-static Make them aligned, to allow efficient access to them from simd. Signed-off-by: Martin Storsjö <martin@martin.st>	2016-11-10 11:05:57 +02:00
James Almer	98cae966c7	matroskaenc: write updated STREAMINFO metadata for FLAC streams if available FLAC streams originating from the FLAC encoder send updated and more complete STREAMINFO metadata as part of the last packet, so write that to CodecPrivate instead of the incomplete one available in extradata during init. Signed-off-by: James Almer <jamrial@gmail.com> Signed-off-by: Anton Khirnov <anton@khirnov.net>	2016-11-10 09:15:24 +01:00
James Almer	f4bf236338	matroskaenc: fix muxing AAC streams when using aac_adtstoasc bsf aac_adtstoasc makes the aac extradata available only after the first packet is filtered, and as packet side data. Assume extradata will be available as part of the first packet if avpriv_mpeg4audio_get_config() fails the first time due to missing extradata and reserve space for the OutputSampleRate element in the Tracks master. Signed-off-by: James Almer <jamrial@gmail.com> Signed-off-by: Anton Khirnov <anton@khirnov.net>	2016-11-10 09:01:18 +01:00
Anton Khirnov	84f225684c	pthread_frame: properly propagate the hw frame context across frame threads	2016-11-10 09:00:11 +01:00
Diego Biurrun	72a19f4013	mpegaudiodsp: aarch64: Adjust function prototype after `2caa93b813`	2016-11-10 00:13:48 +01:00
Diego Biurrun	2dd464868c	configure: Move license checks directly after command line parsing This allows erroring out immediately if incompatible options are passed on the command line, instead of first running time-consuming tests.	2016-11-09 20:51:56 +01:00
Diego Biurrun	c78495d1cd	configure: Log name and parameters of all helper functions where it makes sense	2016-11-09 20:51:56 +01:00
Diego Biurrun	8a6e7a67cb	configure: Use check_cpp in CPP flags tests	2016-11-09 20:51:56 +01:00
Diego Biurrun	831005b230	configure: Log correct test name and use correct filter when testing objective C flags	2016-11-09 20:51:56 +01:00
Diego Biurrun	fe7bc1f16a	configure: Do not unconditionally check for (and enable) xlib This avoids unnecessarily linking against xlib.	2016-11-09 20:51:55 +01:00
Diego Biurrun	d1a91ebe49	configure: Print list of enabled programs Also drop a related and now redundant informative output line.	2016-11-09 20:51:55 +01:00
Diego Biurrun	576c9003ae	configure: Improve output wording Also drop a redundant output line.	2016-11-09 20:51:55 +01:00
Diego Biurrun	a3483f7993	avconv: Drop stray leftover debug output	2016-11-09 20:51:55 +01:00
Diego Biurrun	67deba8a41	Use avpriv_report_missing_feature() where appropriate	2016-11-08 17:54:34 +01:00
Diego Biurrun	59d2b00d20	configure: Add --quiet command line parameter to suppress informative output	2016-11-08 17:32:57 +01:00
Diego Biurrun	4537647c04	fate: checkasm: Split monolithic test into individual components	2016-11-08 17:32:25 +01:00
Diego Biurrun	9498237049	checkasm: Add --test parameter to check only specific components Inspired by a patch from Martin Storsjö <martin@martin.st>.	2016-11-08 17:32:25 +01:00
Vittorio Giovara	de6e2ff3dd	mov: Read multiple stsd from DV Signed-off-by: Vittorio Giovara <vittorio.giovara@gmail.com>	2016-11-08 11:22:29 -05:00
Vittorio Giovara	47a795727f	hevc: Support extradata changes from multiple stsd Signed-off-by: Vittorio Giovara <vittorio.giovara@gmail.com>	2016-11-08 11:22:29 -05:00
Vittorio Giovara	2fe30b4743	hevc: Allow parsing external extradata buffers	2016-11-08 11:22:29 -05:00
Vittorio Giovara	5be2153111	hevc: Move hevc_decode_extradata before frame decoding Avoids a forward-declaration in the following commit. Signed-off-by: Vittorio Giovara <vittorio.giovara@gmail.com>	2016-11-08 11:22:29 -05:00
Vittorio Giovara	bed2c4b265	lavc: Add hevc main10 profile to avconv cli	2016-11-08 11:22:29 -05:00
Vittorio Giovara	17dac56b8f	lavu: Rename ycgco color space appropriately Planes are ordered as the name suggests now. Signed-off-by: Vittorio Giovara <vittorio.giovara@gmail.com>	2016-11-08 11:22:29 -05:00
Diego Biurrun	0361e4dcb4	h264_qpel: x86: Move function with only one instance out of template macro libavcodec/x86/h264_qpel.c:392:785: warning: unused function 'ff_avg_h264_qpel8or16_hv1_lowpass_mmxext' [-Wunused-function]	2016-11-08 17:21:02 +01:00
Diego Biurrun	88f0cf8cd3	avplay: Correct function pointer assignments in options array avplay.c:2928:5: warning: ISO C forbids initialization between function pointer and ‘void *’ [-Wpedantic]	2016-11-08 17:20:30 +01:00
Diego Biurrun	943533d64c	avconv: Correct function pointer assignments in options array Fixes several warnings of the type avconv_opt.c:2356:5: warning: ISO C forbids initialization between function pointer and ‘void *’ [-Wpedantic]	2016-11-08 16:48:41 +01:00
Andreas Cadhalpun	43de8b328b	lzf: update pointer p after realloc This fixes heap-use-after-free detected by AddressSanitizer. Signed-off-by: Andreas Cadhalpun <Andreas.Cadhalpun@googlemail.com> Signed-off-by: Luca Barbato <lu_zero@gentoo.org>	2016-11-07 22:42:00 +01:00
Luca Barbato	ab839054e6	swscale: Add GRAY12	2016-11-07 22:42:00 +01:00
Luca Barbato	7471352f19	pixfmt: Add GRAY12	2016-11-07 22:42:00 +01:00
Anton Khirnov	4ab61cd983	qsv{enc,dec}: extend the internal frame allocator Handle the internal frame requests, which is required by the HEVC encoding plugin. Signed-off-by: Maxym Dmytrychenko <maxym.dmytrychenko@intel.com>	2016-11-07 12:48:00 +01:00
Anton Khirnov	00aeedd841	qsv{dec,enc}: use a struct as a memory id with internal memory allocator This will allow implementing the allocator more fully, which is needed by the HEVC encoder plugin with video memory input. Signed-off-by: Maxym Dmytrychenko <maxym.dmytrychenko@intel.com>	2016-11-07 12:47:54 +01:00
Anton Khirnov	404e51478e	qsv{dec,enc}: always use an internal mfxFrameSurface1 For encoding, this avoids modifying the input surface, which we are not allowed to do. This will also be useful in the following commits. Signed-off-by: Maxym Dmytrychenko <maxym.dmytrychenko@intel.com>	2016-11-07 12:47:46 +01:00
Anton Khirnov	e8bbacbf52	hwcontext_qsv: support frame mapping Signed-off-by: Maxym Dmytrychenko <maxym.dmytrychenko@intel.com>	2016-11-07 12:47:40 +01:00
Anton Khirnov	8ea15afbf2	hwcontext_qsv: transfer data through the child context when VPP fails Uploading/downloading data through VPP may not work for some formats, in that case we can still try to call av_hwframe_transfer_data() on the child context. Signed-off-by: Maxym Dmytrychenko <maxym.dmytrychenko@intel.com>	2016-11-07 12:47:33 +01:00
Anton Khirnov	b91ce48600	hwcontext_qsv: do not fail when download/upload VPP session creation fails Certain pixel formats (e.g. P8) might not be supported for download/upload through VPP operations, but can still be used otherwise. Signed-off-by: Maxym Dmytrychenko <maxym.dmytrychenko@intel.com>	2016-11-07 12:47:26 +01:00
Anton Khirnov	b115a35ea6	hwcontext_qsv: add support for the P8 format When using GPU surfaces with QSV, one needs to supply a frame allocator, which will be invoked to pass surface pools to libmfx. For encoding, this allocator gets invoked not only for the pool of input frames, but also for a separate pool of (apparently) reconstructed frames and another pool of MFX_FOURCC_P8, which on Windows needs to return D3DFMT_P8 D3D surfaces. Those are probably used to store the encoded bitstream on the GPU. Signed-off-by: Maxym Dmytrychenko <maxym.dmytrychenko@intel.com>	2016-11-07 12:47:20 +01:00
Anton Khirnov	10065d9324	hwcontext_dxva2: add support for the P8 format This format is used internally by the QSV encoder to store the encoded bitstream. Signed-off-by: Maxym Dmytrychenko <maxym.dmytrychenko@intel.com>	2016-11-07 12:47:14 +01:00
Anton Khirnov	9109737654	hwcontext_dxva2: frame mapping support Signed-off-by: Maxym Dmytrychenko <maxym.dmytrychenko@intel.com>	2016-11-07 12:46:59 +01:00
Hendrik Leppkes	fabfbfe571	dxva2: fix surface selection when compiled with both d3d11va and dxva2 Fixes a regression introduced in `be630b1e08` Signed-off-by: Anton Khirnov <anton@khirnov.net>	2016-11-07 10:05:12 +01:00
Derek Buitenhuis	db0b3dccb3	libx265: Add option to force IDR frames This is in the same vein as `380146924e`. Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com> Signed-off-by: Martin Storsjö <martin@martin.st>	2016-11-07 10:16:10 +02:00
Diego Biurrun	3cba09e522	x86: Drop stray semicolons after function definitions libavcodec/x86/rv40dsp_init.c:97:2: warning: ISO C does not allow extra ‘;’ outside of a function [-Wpedantic] libavcodec/x86/vp9dsp_init.c:94:40: warning: ISO C does not allow extra ‘;’ outside of a function [-Wpedantic]	2016-11-05 12:41:45 +01:00
Martin Storsjö	d1ef1b9eaa	configure: Silence lld-link when getting the version number In recent lld-link versions, this command prints the version to stdout, but also prints an error to stderr: $ lld-link -flavor gnu --version LLD 4.0.0 (trunk 285641) lld-link: error: no input files lld-link: error: target emulation unknown: -m or at least one .o file required Signed-off-by: Martin Storsjö <martin@martin.st>	2016-11-04 21:37:57 +02:00
Martin Storsjö	392caa65df	arm: vp9mc: Insert a literal pool at the middle of the file This fixes errors like this when building non-pic binaries with armv6 as baseline: Error: invalid literal constant: pool needs to be closer Signed-off-by: Martin Storsjö <martin@martin.st>	2016-11-04 21:37:53 +02:00
Mark Thompson	8ad9f9d675	hwcontext_vaapi: Frame mapping support Can map to any supported software format (using a GPU copy if it doesn't actually match the surface format underneath).	2016-11-03 23:49:05 +00:00
Mark Thompson	124e26971e	lavfi: Hardware map filter Takes a frame associated with a hardware context as input and maps it to something else (another hardware frame or normal memory) for other processing. If the frame to map was originally in the target format (but mapped to something else), the original frame is output. Also supports mapping backwards, where only the output has a hardware context. The link immediately before will be supplied with mapped hardware frames which it can write directly into, and this filter then unmaps them back to the actual hardware frames.	2016-11-03 23:49:05 +00:00
Mark Thompson	d06aa24ba5	hwcontext: Hardware frame mapping Adds the new av_hwframe_map() function, which allows mapping between hardware frames and normal memory, along with internal support for implementing it. Also adds av_hwframe_ctx_create_derived(), for creating a hardware frames context associated with one device using frames mapped from another by some hardware-specific means.	2016-11-03 23:49:01 +00:00
Diego Biurrun	67351924fa	Drop unreachable break and return statements	2016-11-03 20:17:12 +01:00
Diego Biurrun	99434f4df8	float_dsp: Have implementation match function pointer prototype libavutil/x86/float_dsp_init.c(144) : warning C4028: formal parameter 1 different from declaration libavutil/x86/float_dsp_init.c(144) : warning C4028: formal parameter 2 different from declaration	2016-11-03 17:43:55 +01:00

43959 Commits