FFmpeg

mirror of https://github.com/FFmpeg/FFmpeg.git synced 2024-12-18 03:19:31 +02:00

Author	SHA1	Message	Date
Andreas Rheinhardt	396efc73e3	avformat/matroskaenc: Speed up reformatting WavPack WavPack's blocks use a length field, so that parsing them is fast. Therefore it makes sense to parse the block twice, once to get the length of the output packet and once to write the actual data instead of writing the data into a temporary buffer in a single pass. This speeds up muxing from 1597092 to 761850 Decicycles per write_packet call for a 2000kb/s stereo WavPack file muxed to /dev/null with writing CRC-32 disabled. Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2022-01-19 11:55:34 +01:00
Andreas Rheinhardt	c1b6acde36	avformat/matroskaenc: Allow to use custom reformatting functions Matroska uses variable-length elements and in order not to waste bytes on length fields, the length of the data to write needs to be known before writing the length field. Annex B H.264/5 and WavPack need to be reformatted to know this length and this currently involves writing the data into temporary buffers; AV1 sometimes suffers from this as well. This commit aims to solve this by adding a callback that is called twice per packet: Once to get the size and once to actually write the data. In case of WavPack and AV1 (where parsing is cheap due to length fields) both calls will just parse the data with only the second function writing anything. For H.264/5, the position of the NALUs will need to be stored to be written later on. Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2022-01-19 11:54:23 +01:00
Andreas Rheinhardt	6221491f90	avformat/matroskaenc: Factor writing Info out Avoids the surprise of using pb for the main AVIOContext at the beginning and end of mkv_write_header() and for the dynamic buffer opened for the Info element in the middle of mkv_write_header(). Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2022-01-19 11:53:30 +01:00
Andreas Rheinhardt	a04c917399	avformat/matroskaenc: Don't waste bytes on ChapterAtoms length fields Also check the (user-provided) metadata tags for being too long. Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2022-01-19 11:52:15 +01:00
Andreas Rheinhardt	e8065c7def	avformat/matroskaenc: Don't waste bytes on Video element length fields Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2022-01-19 11:50:27 +01:00
Andreas Rheinhardt	0e548fab42	avformat/matroskaenc: Factor writing TrackVideo out It is already quite big. Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2022-01-19 11:42:22 +01:00
Andreas Rheinhardt	6b1968e939	avformat/matroskaenc: Avoid seeks when writing EBML header Using start/end_ebml_master() to write an EBML Master element uses seeks under the hood. This does not work if the output is unseekable with the AVIOContext's buffer being very small (the size of the currently written Matroska EBML header is 40) or with the AVIOContext being in direct mode, because then this seek can't be performed in the AVIOContext's buffer. So using an approach that does not rely on seeking at all is preferable; this is achieved by switching to EbmlWriter. Also factor writing the EBML header out into a function of its own. Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2022-01-19 11:40:56 +01:00
Andreas Rheinhardt	dc555de823	avformat/matroskaenc: Don't waste bytes on AttachedFiles' length fields Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2022-01-19 11:37:39 +01:00
Andreas Rheinhardt	0148e85c3c	avformat/matroskaenc: Don't waste bytes on SimpleTags length fields Also check the (user-provided) tags for being overlong; the earlier code had an implicit unchecked size_t->int conversion. Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2022-01-19 11:34:36 +01:00
Andreas Rheinhardt	b845fff57d	avformat/matroskaenc: Add API to write Masters with minimal length field This muxer currently uses two ways to ensure that no bytes are wasted by writing unnecessarily long EBML length fields for Master elements and the (Simple)Block element (all the other elements are fine as one either already has the right length or getting the actual length is easy and necessary anyway): Either use an upper bound that is good enough in case one is available or write the data into a dynamic buffer first to get the length; the former approach is impossible in lots of cases, whereas the latter incurs allocations and memcpying. It is therefore unfeasible to use the latter for e.g. the attachments or the BlockGroups. This patch adds a third alternative to complement the other two: It consists of an EbmlWriter that one can add EBML elements to that can be written later by calling ebml_writer_write(); the latter function first traverses the written elements recursively and calculates the length of each element; then a second pass is performed in which all the elements are written directly (without any seeks). This new API also performs checks for overlong elements; this is in contrast to put_ebml_string() which simply performs a size_t->int conversion even for strings originating from the user. The new API is designed to have very low overhead: It uses stack arrays and performs no allocations; this also comes at a price: Right now, it can only be used in contexts in which there is a compile-time upper bound for the number of elements. It is also incompatible with storing the offset of an element in order to update this field later. Furthermore, it puts the onus of memory management (i.e. ensuring that pointers stay valid) on the user. These restrictions might be overcome in the future. Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2022-01-19 11:34:17 +01:00
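The two-pass idea described in the commit above (first compute every element's length, then write everything sequentially without seeks) can be illustrated with a small standalone sketch. This is not FFmpeg's EbmlWriter: the `Elem` type and the `id_bytes()`, `length_bytes()`, `measure()` and `emit()` helpers are hypothetical names chosen for this example, and the caller is assumed to supply a large enough output buffer.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical element type for illustration; not FFmpeg's own structure. */
typedef struct Elem {
    uint32_t id;            /* EBML ID, including its length-marker bits    */
    const uint8_t *data;    /* payload of a leaf element, NULL for a master */
    uint64_t size;          /* payload size; pass 1 fills it in for masters */
    struct Elem *children;  /* caller-owned array of children (masters)     */
    int nb_children;
} Elem;

/* Number of bytes occupied by an EBML ID (the marker bits are part of it). */
static int id_bytes(uint32_t id)
{
    int n = 1;
    while (id >>= 8)
        n++;
    return n;
}

/* Minimal number of bytes for an EBML length field. Each byte carries
 * 7 payload bits and the all-ones pattern is reserved, hence len + 1. */
static int length_bytes(uint64_t len)
{
    int n = 0;
    len++;
    do {
        n++;
    } while (len >>= 7);
    return n;
}

/* Pass 1: compute payload sizes bottom-up; returns the full encoded size. */
static uint64_t measure(Elem *e)
{
    if (!e->data) {
        e->size = 0;
        for (int i = 0; i < e->nb_children; i++)
            e->size += measure(&e->children[i]);
    }
    return id_bytes(e->id) + length_bytes(e->size) + e->size;
}

/* Pass 2: write ID, minimal length field and payload strictly in order. */
static uint8_t *emit(const Elem *e, uint8_t *out)
{
    int n = id_bytes(e->id);
    for (int i = n - 1; i >= 0; i--)
        *out++ = (uint8_t)(e->id >> (8 * i));

    n = length_bytes(e->size);
    assert(n <= 8);                      /* EBML lengths use at most 8 bytes */
    uint64_t v = e->size | (1ULL << (7 * n));
    for (int i = n - 1; i >= 0; i--)
        *out++ = (uint8_t)(v >> (8 * i));

    if (e->data) {
        memcpy(out, e->data, e->size);
        return out + e->size;
    }
    for (int i = 0; i < e->nb_children; i++)
        out = emit(&e->children[i], out);
    return out;
}
```

Calling `measure()` on the root and then `emit()` into a buffer of the returned size writes every master with the shortest possible length field. As in the commit's description, the sketch allocates nothing and leaves pointer lifetime to the caller.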
Andreas Rheinhardt	5e186f9693	avformat/matroskaenc: Don't open BlockGroup twice This would happen in case non-WebVTT-subtitles had BlockAdditional or DiscardPadding side-data. Given that these are not accounted for in the length of the outer BlockGroup (which is a quite sharp upper bound) it is possible for the outer BlockGroup to use an insufficient number of bytes which leads to an assert in end_ebml_master(). Fix this by not opening a second BlockGroup inside an already opened BlockGroup. Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2022-01-19 11:22:34 +01:00
Andreas Rheinhardt	ca16863549	avformat/matroskaenc: Fix potential overflow Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2022-01-19 11:21:53 +01:00
erankor	625ea2d2a9	http: honor response headers in redirect caching add a dictionary that maps "src_url" -> "expiry;dst_url", the dictionary is checked before issuing an http request, and updated after getting a 3xx redirect response. the cache expiry is determined according to the following (in desc priority) - 1. Expires header 2. Cache-Control containing no-cache/no-store (disables caching) 3. Cache-Control s-maxage/max-age 4. Http codes 301/308 are cached indefinitely, other codes are not cached	2022-01-18 17:35:26 -05:00
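The expiry priority listed in the commit above can be sketched as a single decision function. This is only an illustration under stated assumptions, not the code from the commit: `redirect_expiry()`, its parameters and its return convention (0 for "do not cache", `INT64_MAX` for "cache indefinitely") are hypothetical, the Expires header is assumed to be already parsed into a Unix timestamp (or -1 if absent), and Cache-Control directives are matched with a plain substring search.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns the absolute expiry time of a cached
 * redirect, 0 if it must not be cached, INT64_MAX if it never expires. */
static int64_t redirect_expiry(int status, int64_t now, int64_t expires,
                               const char *cache_control)
{
    assert(now >= 0);

    /* 1. An Expires header takes precedence over everything else. */
    if (expires >= 0)
        return expires;

    if (cache_control) {
        /* 2. no-cache / no-store disable caching. */
        if (strstr(cache_control, "no-cache") ||
            strstr(cache_control, "no-store"))
            return 0;

        /* 3. s-maxage is preferred over max-age. */
        const char *p = strstr(cache_control, "s-maxage=");
        size_t skip = strlen("s-maxage=");
        if (!p) {
            p = strstr(cache_control, "max-age=");
            skip = strlen("max-age=");
        }
        if (p)
            return now + strtoll(p + skip, NULL, 10); /* no overflow check */
    }

    /* 4. Permanent redirects are cached indefinitely, others not at all. */
    if (status == 301 || status == 308)
        return INT64_MAX;
    return 0;
}
```

A caller would store the returned value next to the destination URL and compare it with the current time before reusing a cached redirect.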
Haihao Xiang	641c4346b3	lavc/qsvenc_hevc: add -pic_timing_sei option The SDK may insert picture timing SEI for hevc and the code to set mfx parameter has been added in qsvenc, however the corresponding option is missing in the hevc option array Reviewed-by: Limin Wang <lance.lmwang@gmail.com> Signed-off-by: Haihao Xiang <haihao.xiang@intel.com>	2022-01-18 16:25:33 +08:00
Haihao Xiang	c4ae6908f2	lavc/qsvenc: add encode support for screen content coding extension Enables HEVC Screen Content Coding extension support on ICL+ platform Signed-off-by: Haihao Xiang <haihao.xiang@intel.com>	2022-01-18 16:24:57 +08:00
Fei Wang	a17c990265	avfilter/tonemap_vaapi: set va parameters filters and numbers This can fill VAProcPipelineParameterBuffer correctly and make the pipeline work. Reviewed-by: Soft Works <softworkz@hotmail.com> Signed-off-by: Fei Wang <fei.w.wang@intel.com> Signed-off-by: Haihao Xiang <haihao.xiang@intel.com>	2022-01-17 16:32:17 +08:00
Xinpeng Sun	516496069d	avfilter: add overlay vaapi filter Overlay one video on the top of another. It takes two inputs and has one output. The first input is the "main" video on which the second input is overlaid. This filter requires same memory layout for all the inputs. An example command to use this filter to overlay overlay.mp4 at the top-left corner of the main.mp4: ffmpeg -init_hw_device vaapi=foo:/dev/dri/renderD128 \ -hwaccel vaapi -hwaccel_device foo -hwaccel_output_format vaapi -c:v h264 -i main.mp4 \ -hwaccel vaapi -hwaccel_device foo -hwaccel_output_format vaapi -c:v h264 -i overlay.mp4 \ -filter_complex "[0:v][1:v]overlay_vaapi=0:0:100:100:0.5[t1]" \ -map "[t1]" -an -c:v h264_vaapi -y out_vaapi.mp4 Signed-off-by: U. Artie Eoff <ullysses.a.eoff@intel.com> Signed-off-by: Xinpeng Sun <xinpeng.sun@intel.com> Signed-off-by: Zachary Zhou <zachary.zhou@intel.com> Signed-off-by: Fei Wang <fei.w.wang@intel.com> Signed-off-by: Haihao Xiang <haihao.xiang@intel.com>	2022-01-17 16:32:17 +08:00
Rudolf Polzer	dcc9454ab9	vf_paletteuse: fix color cache lookup for Bayer dithering mode. To trigger this bug, use `paletteuse=dither=bayer:bayer_scale=0`; you will see that adjacent pixel lines will use the same dither pattern, instead of being shifted from each other by 32 units (0x20). One way to demonstrate the bug is: $ convert -size 64x256 gradient:black-white -rotate 270 grad.png $ echo 'P2 2 1 255 0 255' > bw.pnm $ ffmpeg -i grad.png -filter_complex 'movie=bw.pnm,scale=256x1[bw]; [0:v][bw]paletteuse=dither=bayer:bayer_scale=0' gradbw.png Previously: https://www.rm.cloudns.org/img/uploaded/0bd152c11b9cd99e5945115534b1bdde.png Now: https://www.rm.cloudns.org/img/uploaded/89caaa5e36c38bc2c01755b30811f969.png This was caused by passing inconsistent color vs (a,r,g,b) parameters to color_get(), and NBITS being 5 meaning actually hitting the same cache node does happen in this case, but ONLY if bayer_scale is zero. The fix is passing the correct color value to color_get(). Also added a previously failing FATE test; image comparison of the first frame: Previously: https://www.rm.cloudns.org/img/uploaded/d0ff9db8d8a7d8a3b8b88bbe92bf5fed.png Now: https://www.rm.cloudns.org/img/uploaded/a72389707e719b5cd1c58916a9e79ca8.png (on this less synthetic test image, the bug basically causes noise from cache hits vs misses) Tested: FATE passes, which exercises this filter but at the default bayer_scale. Reviewed-by: Paul B Mahol <onemda@gmail.com>	2022-01-17 01:31:06 +05:30
Gyan Doshi	bca30570d2	avformat/mpegts: add option max_packet_size Makes maximum size of emitted packet user-tunable. Default is existing 204800 bytes.	2022-01-16 10:46:38 +05:30
James Almer	b1ef5882e3	fate/ffmpeg: add missing samples dependency to fate-shortest Signed-off-by: James Almer <jamrial@gmail.com>	2022-01-16 00:32:52 -03:00
James Almer	45e45a6060	avcodec/libmp3lame: return proper error codes Signed-off-by: James Almer <jamrial@gmail.com>	2022-01-14 22:09:20 -03:00
Vittorio Giovara	7d377558a6	vf_tonemap: Fix order of planes This resulted in a dimmed tonemapping due to bad resulting luma calculation. Found by: Derek Buitenhuis Signed-off-by: Vittorio Giovara <vittorio.giovara@gmail.com>	2022-01-14 15:48:36 +01:00
Limin Wang	edd305ed54	avcodec/libopenh264enc: set iEntropyCodingModeFlag by coder option For high/main profile, the user can choose to use cavlc by specifying "-coder cavlc"; by default, it will use cabac. If it's baseline, we'll use cavlc by specs anyway. ffmpeg -y -f lavfi -i testsrc -c:v libopenh264 -profile:v main -coder cavlc -frames:v 1 -bsf trace_headers -f null - before the patch: entropy_coding_mode_flag 0 = 1 after the patch: entropy_coding_mode_flag 0 = 0 Reviewed-by: Martin Storsjö <martin@martin.st> Signed-off-by: Limin Wang <lance.lmwang@gmail.com>	2022-01-14 22:00:19 +08:00
Limin Wang	f74e90c2a0	avcodec/libopenh264enc: make the profile configurable correctly due to the limitations set in `d3a7bdd4ac`, you weren't able to use main profile with OpenH264 1.8, or high profile with older versions Reviewed-by: Martin Storsjö <martin@martin.st> Signed-off-by: Limin Wang <lance.lmwang@gmail.com>	2022-01-14 22:00:19 +08:00
Limin Wang	008cc90d1a	avcodec/libopenh264enc: support for colorspace and range information Reviewed-by: Martin Storsjö <martin@martin.st> Signed-off-by: Limin Wang <lance.lmwang@gmail.com>	2022-01-14 22:00:19 +08:00
Andreas Rheinhardt	b57656e28b	fate/matroska: Add test for QT-mode Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2022-01-13 21:00:26 +01:00
Andreas Rheinhardt	99a4d16658	avformat/matroskaenc: Add option to shift data to write cues at front This is similar to the faststart option of the mov muxer, yet in contrast to it it works together with reserve_index_space (the equivalent to reserved_moov_size): If the reserved space does not suffice, the data is shifted; if not, the Cues are written at the front without shifting the data. Several tests that cover (not only) this have been added. Implements #7017. Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2022-01-13 21:00:26 +01:00
Andreas Rheinhardt	46309f262c	avcodec/vp3: Don't output bogus warning It is perfectly fine to have from one to seven bits left at the end of parsing. Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2022-01-13 21:00:26 +01:00
Michael Niedermayer	c36a5dfc8f	avformat/rawvideodec: check packet size Fixes: division by zero Fixes: integer overflow Fixes: 43347/clusterfuzz-testcase-minimized-ffmpeg_dem_V210X_fuzzer-5846911637127168 Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg Reviewed-by: lance.lmwang@gmail.com Reviewed-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com> Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	2022-01-13 19:43:03 +01:00
Andreas Rheinhardt	c936c319bd	avcodec/mpegpicture: Decrease size of encoding_error array The current size is AV_NUM_DATA_POINTERS (i.e. eight). This number is chosen in order to minimize the amount of allocations for AVFrame.extended_(data\|buf) for audio; it is meaningless for video for which four is sufficient. So decrease this array in order to minimize what is copied in ff_mpeg_ref_picture() and at the places that copy a whole MpegEncContext. Also do the same for snowenc. Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2022-01-13 08:31:02 +01:00
Andreas Rheinhardt	fbeb8eab44	avcodec/mpeg4videodec: Avoid multiple consecutive av_log() These messages belong together, yet they can be torn apart if some other call to av_log() happens between them. Reviewed-by: Michael Niedermayer <michael@niedermayer.cc> Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2022-01-13 08:30:21 +01:00
Andreas Rheinhardt	b263415ab7	avcodec/mpegvideo: Don't set unrestricted_mv for decoders It is write-only for them. Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2022-01-13 08:29:44 +01:00
Andreas Rheinhardt	3988016fa3	avcodec/h264pred: Reindentation Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2022-01-13 08:28:56 +01:00
Andreas Rheinhardt	c32f6b7f8a	avcodec/h264pred: Remove dead > 8 pixels checks for 8bit codecs RV40, SVQ3 and VP7/VP8 are eight-bit only, so it makes no sense to check for them in the codepath initializing > eight bit contexts. Move the codec-specific code to a switch located after the eight-bit init code where this is easily possible; and add checks to the macro to enable the compiler to remove the remaining checks when initializing bitdepths > 8 at compile-time. Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2022-01-13 08:27:52 +01:00
Andreas Rheinhardt	0a6e000d75	avcodec/h264pred: Don't compile > 8 bit versions of VP7/8 functions VP7 and VP8 are eight bit only. Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2022-01-13 08:26:39 +01:00
Andreas Rheinhardt	d0bf242d02	avcodec/h264_slice: Inline H264 codec id This code is only reached by the H.264 decoder. Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2022-01-13 08:26:26 +01:00
Andreas Rheinhardt	67cccd442f	avcodec/svq3: Remove dead topright_samples_available variable, code Topright samples are always available. Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2022-01-13 08:26:13 +01:00
Andreas Rheinhardt	42d30c9019	avcodec/mpegvideo, svq3: Remove unused next_p_frame_damaged Always zero since `4d2858deac`. Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2022-01-13 08:25:10 +01:00
Andreas Rheinhardt	75a3268bee	avcodec/h264_slice, mpeg4videodec: Don't use %s to write single char Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2022-01-13 08:24:47 +01:00
Andreas Rheinhardt	c21433c953	avcodec/mpeg4video: Split off data in a header of its own Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2022-01-13 08:15:28 +01:00
Limin Wang	8b9ef5a516	avutil/parseutils: use quadhd for Quad HD qHD is 960x540 (q stands for quarter) and QHD is 2560x1440 (Q is quad). use quadhd for QHD for abbreviation. Fix ticket#9591 Signed-off-by: Limin Wang <lance.lmwang@gmail.com>	2022-01-12 13:42:26 +08:00
Ming Qian	35a9307beb	avcodec/v4l2_context: remove reinit variable Cleanup after commit `3fc72c9fc1`. Fixes coverity ticket #1497095. Reviewed-by: Andriy Gelman <andriy.gelman@gmail.com> Signed-off-by: Ming Qian <ming.qian@nxp.com>	2022-01-11 23:02:37 -05:00
Linjie Fu	9c58fd2226	lavf/vf_deinterlace_vaapi: flush queued frame for field in DeinterlacingBob For DeinterlacingBob mode with rate=field, the frame number of output should equal 2x input total since only intra deinterlace is used. Currently for "backward_ref = 0, rate = field", extra_delay is introduced. Due to the async without flush, frame number of output is [expected_number - 2]. Specifically, if the input only has 1 frame, the output will be empty. Add deint_vaapi_request_frame for deinterlace_vaapi, send NULL frame to flush the queued frame. For 1 frame input in Bob mode with rate=field, before patch: 0 frame; after patch: 2 frames; ffmpeg -hwaccel vaapi -hwaccel_device /dev/dri/renderD128 -hwaccel_output_format vaapi -i input.h264 -an -vf deinterlace_vaapi=mode=bob:rate=field -f null - Tested-by: Mark Thompson <sw@jkqxz.net> Reviewed-by: Mark Thompson <sw@jkqxz.net> Signed-off-by: Linjie Fu <linjie.fu@intel.com> Signed-off-by: Haihao Xiang <haihao.xiang@intel.com>	2022-01-12 10:02:24 +08:00
Chen,Wenbin	e6b990e25d	libavcodec/qsvdec.c: using queue count to unref frame MSDK vc1 and av1 sometimes output frames into the same surface, but ffmpeg-qsv assumes the surface will be used only once, so it will unref the frame when it receives the output surface. Now change it to unref the frame according to the queue count. Signed-off-by: Wenbin Chen <wenbin.chen@intel.com> Signed-off-by: Haihao Xiang <haihao.xiang@intel.com>	2022-01-12 10:02:24 +08:00
Limin Wang	b697326a68	avformat/rtpenc_rfc4175: support for interlace format Below are steps how to test on your local host: wget --no-check-certificate https://samples.ffmpeg.org/MPEG2/interlaced/burosch1.mpg 1. interlace format: ffmpeg -re -i ./burosch1.mpg -c:v bitpacked -pix_fmt yuv422p10 -f rtp rtp://239.255.0.1:6000 copy and create sdp file test.sdp ffplay -buffer_size 671088640 -protocol_whitelist "file,rtp,udp" test.sdp 2. progressive format: ffmpeg -re -i ./burosch1.mpg -vf yadif -c:v bitpacked -pix_fmt yuv422p10 -f rtp rtp://239.255.0.1:6000 copy and create sdp file test.sdp ffplay -buffer_size 671088640 -protocol_whitelist "file,rtp,udp" test.sdp Signed-off-by: Limin Wang <lance.lmwang@gmail.com>	2022-01-12 09:21:07 +08:00
Limin Wang	3ea93bbd6d	avformat/rtpdec_rfc4175: reindent after last commit Signed-off-by: Limin Wang <lance.lmwang@gmail.com>	2022-01-12 09:21:07 +08:00
Limin Wang	824fdd0f89	avformat/rtpdec_rfc4175: support for interlace format Signed-off-by: Limin Wang <lance.lmwang@gmail.com>	2022-01-12 09:21:06 +08:00
Mark Reid	52f7026164	swscale/x86/input.asm: add x86-optimized planar rgb2yuv functions sse2 only operates on 2 lanes per loop for to_y and to_uv functions, due to the lack of pmulld instruction. Emulating pmulld with 2 pmuludq and shuffles proved too costly and made to_uv functions slower than the C implementation. For to_y on sse2 only float functions are generated; I was not able to outperform the C implementation on the integer pixel formats. For to_a on sse4 only the float functions are generated. sse2 and sse4 generated nearly identically performing code on integer pixel formats, so only sse2/avx2 versions are generated. planar_gbrp_to_y_512_c: 1197.5 planar_gbrp_to_y_512_sse4: 444.5 planar_gbrp_to_y_512_avx2: 287.5 planar_gbrap_to_y_512_c: 1204.5 planar_gbrap_to_y_512_sse4: 447.5 planar_gbrap_to_y_512_avx2: 289.5 planar_gbrp9be_to_y_512_c: 1380.0 planar_gbrp9be_to_y_512_sse4: 543.5 planar_gbrp9be_to_y_512_avx2: 340.0 planar_gbrp9le_to_y_512_c: 1200.5 planar_gbrp9le_to_y_512_sse4: 442.0 planar_gbrp9le_to_y_512_avx2: 282.0 planar_gbrp10be_to_y_512_c: 1378.5 planar_gbrp10be_to_y_512_sse4: 544.0 planar_gbrp10be_to_y_512_avx2: 337.5 planar_gbrp10le_to_y_512_c: 1200.0 planar_gbrp10le_to_y_512_sse4: 448.0 planar_gbrp10le_to_y_512_avx2: 285.5 planar_gbrap10be_to_y_512_c: 1380.0 planar_gbrap10be_to_y_512_sse4: 542.0 planar_gbrap10be_to_y_512_avx2: 340.5 planar_gbrap10le_to_y_512_c: 1199.0 planar_gbrap10le_to_y_512_sse4: 446.0 planar_gbrap10le_to_y_512_avx2: 289.5 planar_gbrp12be_to_y_512_c: 10563.0 planar_gbrp12be_to_y_512_sse4: 542.5 planar_gbrp12be_to_y_512_avx2: 339.0 planar_gbrp12le_to_y_512_c: 1201.0 planar_gbrp12le_to_y_512_sse4: 440.5 planar_gbrp12le_to_y_512_avx2: 286.0 planar_gbrap12be_to_y_512_c: 1701.5 planar_gbrap12be_to_y_512_sse4: 917.0 planar_gbrap12be_to_y_512_avx2: 338.5 planar_gbrap12le_to_y_512_c: 1201.0 planar_gbrap12le_to_y_512_sse4: 444.5 planar_gbrap12le_to_y_512_avx2: 288.0 planar_gbrp14be_to_y_512_c: 1370.5 planar_gbrp14be_to_y_512_sse4: 545.0 
planar_gbrp14be_to_y_512_avx2: 338.5 planar_gbrp14le_to_y_512_c: 1199.0 planar_gbrp14le_to_y_512_sse4: 444.0 planar_gbrp14le_to_y_512_avx2: 279.5 planar_gbrp16be_to_y_512_c: 1364.0 planar_gbrp16be_to_y_512_sse4: 544.5 planar_gbrp16be_to_y_512_avx2: 339.5 planar_gbrp16le_to_y_512_c: 1201.0 planar_gbrp16le_to_y_512_sse4: 445.5 planar_gbrp16le_to_y_512_avx2: 280.5 planar_gbrap16be_to_y_512_c: 1377.0 planar_gbrap16be_to_y_512_sse4: 545.0 planar_gbrap16be_to_y_512_avx2: 338.5 planar_gbrap16le_to_y_512_c: 1201.0 planar_gbrap16le_to_y_512_sse4: 442.0 planar_gbrap16le_to_y_512_avx2: 279.0 planar_gbrpf32be_to_y_512_c: 4113.0 planar_gbrpf32be_to_y_512_sse2: 2438.0 planar_gbrpf32be_to_y_512_sse4: 1068.0 planar_gbrpf32be_to_y_512_avx2: 904.5 planar_gbrpf32le_to_y_512_c: 3818.5 planar_gbrpf32le_to_y_512_sse2: 2024.5 planar_gbrpf32le_to_y_512_sse4: 1241.5 planar_gbrpf32le_to_y_512_avx2: 657.0 planar_gbrapf32be_to_y_512_c: 3707.0 planar_gbrapf32be_to_y_512_sse2: 2444.0 planar_gbrapf32be_to_y_512_sse4: 1077.0 planar_gbrapf32be_to_y_512_avx2: 909.0 planar_gbrapf32le_to_y_512_c: 3822.0 planar_gbrapf32le_to_y_512_sse2: 2024.5 planar_gbrapf32le_to_y_512_sse4: 1176.0 planar_gbrapf32le_to_y_512_avx2: 658.5 planar_gbrp_to_uv_512_c: 2325.8 planar_gbrp_to_uv_512_sse2: 1726.8 planar_gbrp_to_uv_512_sse4: 771.8 planar_gbrp_to_uv_512_avx2: 506.8 planar_gbrap_to_uv_512_c: 2281.8 planar_gbrap_to_uv_512_sse2: 1726.3 planar_gbrap_to_uv_512_sse4: 768.3 planar_gbrap_to_uv_512_avx2: 496.3 planar_gbrp9be_to_uv_512_c: 2336.8 planar_gbrp9be_to_uv_512_sse2: 1924.8 planar_gbrp9be_to_uv_512_sse4: 852.3 planar_gbrp9be_to_uv_512_avx2: 552.8 planar_gbrp9le_to_uv_512_c: 2270.3 planar_gbrp9le_to_uv_512_sse2: 1512.3 planar_gbrp9le_to_uv_512_sse4: 764.3 planar_gbrp9le_to_uv_512_avx2: 491.3 planar_gbrp10be_to_uv_512_c: 2281.8 planar_gbrp10be_to_uv_512_sse2: 1917.8 planar_gbrp10be_to_uv_512_sse4: 855.3 planar_gbrp10be_to_uv_512_avx2: 541.3 planar_gbrp10le_to_uv_512_c: 2269.8 planar_gbrp10le_to_uv_512_sse2: 1515.3 
planar_gbrp10le_to_uv_512_sse4: 759.8 planar_gbrp10le_to_uv_512_avx2: 487.8 planar_gbrap10be_to_uv_512_c: 2382.3 planar_gbrap10be_to_uv_512_sse2: 1924.8 planar_gbrap10be_to_uv_512_sse4: 855.3 planar_gbrap10be_to_uv_512_avx2: 540.8 planar_gbrap10le_to_uv_512_c: 2382.3 planar_gbrap10le_to_uv_512_sse2: 1512.3 planar_gbrap10le_to_uv_512_sse4: 759.3 planar_gbrap10le_to_uv_512_avx2: 484.8 planar_gbrp12be_to_uv_512_c: 2283.8 planar_gbrp12be_to_uv_512_sse2: 1936.8 planar_gbrp12be_to_uv_512_sse4: 858.3 planar_gbrp12be_to_uv_512_avx2: 541.3 planar_gbrp12le_to_uv_512_c: 2278.8 planar_gbrp12le_to_uv_512_sse2: 1507.3 planar_gbrp12le_to_uv_512_sse4: 760.3 planar_gbrp12le_to_uv_512_avx2: 485.8 planar_gbrap12be_to_uv_512_c: 2385.3 planar_gbrap12be_to_uv_512_sse2: 1927.8 planar_gbrap12be_to_uv_512_sse4: 855.3 planar_gbrap12be_to_uv_512_avx2: 539.8 planar_gbrap12le_to_uv_512_c: 2377.3 planar_gbrap12le_to_uv_512_sse2: 1516.3 planar_gbrap12le_to_uv_512_sse4: 759.3 planar_gbrap12le_to_uv_512_avx2: 484.8 planar_gbrp14be_to_uv_512_c: 2283.8 planar_gbrp14be_to_uv_512_sse2: 1935.3 planar_gbrp14be_to_uv_512_sse4: 852.3 planar_gbrp14be_to_uv_512_avx2: 540.3 planar_gbrp14le_to_uv_512_c: 2276.8 planar_gbrp14le_to_uv_512_sse2: 1514.8 planar_gbrp14le_to_uv_512_sse4: 762.3 planar_gbrp14le_to_uv_512_avx2: 484.8 planar_gbrp16be_to_uv_512_c: 2383.3 planar_gbrp16be_to_uv_512_sse2: 1881.8 planar_gbrp16be_to_uv_512_sse4: 852.3 planar_gbrp16be_to_uv_512_avx2: 541.8 planar_gbrp16le_to_uv_512_c: 2378.3 planar_gbrp16le_to_uv_512_sse2: 1476.8 planar_gbrp16le_to_uv_512_sse4: 765.3 planar_gbrp16le_to_uv_512_avx2: 485.8 planar_gbrap16be_to_uv_512_c: 2382.3 planar_gbrap16be_to_uv_512_sse2: 1886.3 planar_gbrap16be_to_uv_512_sse4: 853.8 planar_gbrap16be_to_uv_512_avx2: 550.8 planar_gbrap16le_to_uv_512_c: 2381.8 planar_gbrap16le_to_uv_512_sse2: 1488.3 planar_gbrap16le_to_uv_512_sse4: 765.3 planar_gbrap16le_to_uv_512_avx2: 491.8 planar_gbrpf32be_to_uv_512_c: 4863.0 planar_gbrpf32be_to_uv_512_sse2: 3347.5 
planar_gbrpf32be_to_uv_512_sse4: 1800.0 planar_gbrpf32be_to_uv_512_avx2: 1199.0 planar_gbrpf32le_to_uv_512_c: 4725.0 planar_gbrpf32le_to_uv_512_sse2: 2753.0 planar_gbrpf32le_to_uv_512_sse4: 1474.5 planar_gbrpf32le_to_uv_512_avx2: 927.5 planar_gbrapf32be_to_uv_512_c: 4859.0 planar_gbrapf32be_to_uv_512_sse2: 3269.0 planar_gbrapf32be_to_uv_512_sse4: 1802.0 planar_gbrapf32be_to_uv_512_avx2: 1201.5 planar_gbrapf32le_to_uv_512_c: 6338.0 planar_gbrapf32le_to_uv_512_sse2: 2756.5 planar_gbrapf32le_to_uv_512_sse4: 1476.0 planar_gbrapf32le_to_uv_512_avx2: 908.5 planar_gbrap_to_a_512_c: 383.3 planar_gbrap_to_a_512_sse2: 66.8 planar_gbrap_to_a_512_avx2: 43.8 planar_gbrap10be_to_a_512_c: 601.8 planar_gbrap10be_to_a_512_sse2: 86.3 planar_gbrap10be_to_a_512_avx2: 34.8 planar_gbrap10le_to_a_512_c: 602.3 planar_gbrap10le_to_a_512_sse2: 48.8 planar_gbrap10le_to_a_512_avx2: 31.3 planar_gbrap12be_to_a_512_c: 601.8 planar_gbrap12be_to_a_512_sse2: 111.8 planar_gbrap12be_to_a_512_avx2: 41.3 planar_gbrap12le_to_a_512_c: 385.8 planar_gbrap12le_to_a_512_sse2: 75.3 planar_gbrap12le_to_a_512_avx2: 39.8 planar_gbrap16be_to_a_512_c: 386.8 planar_gbrap16be_to_a_512_sse2: 79.8 planar_gbrap16be_to_a_512_avx2: 31.3 planar_gbrap16le_to_a_512_c: 600.3 planar_gbrap16le_to_a_512_sse2: 40.3 planar_gbrap16le_to_a_512_avx2: 30.3 planar_gbrapf32be_to_a_512_c: 1148.8 planar_gbrapf32be_to_a_512_sse2: 611.3 planar_gbrapf32be_to_a_512_sse4: 234.8 planar_gbrapf32be_to_a_512_avx2: 183.3 planar_gbrapf32le_to_a_512_c: 851.3 planar_gbrapf32le_to_a_512_sse2: 263.3 planar_gbrapf32le_to_a_512_sse4: 199.3 planar_gbrapf32le_to_a_512_avx2: 156.8 Reviewed-by: Paul B Mahol <onemda@gmail.com> Signed-off-by: James Almer <jamrial@gmail.com>	2022-01-11 16:34:33 -03:00
Mark Reid	9e445a5be2	swscale/x86/output.asm: add x86-optimized planar gbr yuv2anyX functions changes since v2: * fixed label changes since v1: * remove vex instruction on sse4 path * some load/pack macros use fewer instructions * fixed some typos yuv2gbrp_full_X_4_512_c: 12757.6 yuv2gbrp_full_X_4_512_sse2: 8946.6 yuv2gbrp_full_X_4_512_sse4: 5138.6 yuv2gbrp_full_X_4_512_avx2: 3889.6 yuv2gbrap_full_X_4_512_c: 15368.6 yuv2gbrap_full_X_4_512_sse2: 11916.1 yuv2gbrap_full_X_4_512_sse4: 6294.6 yuv2gbrap_full_X_4_512_avx2: 3477.1 yuv2gbrp9be_full_X_4_512_c: 14381.6 yuv2gbrp9be_full_X_4_512_sse2: 9139.1 yuv2gbrp9be_full_X_4_512_sse4: 5150.1 yuv2gbrp9be_full_X_4_512_avx2: 2834.6 yuv2gbrp9le_full_X_4_512_c: 12990.1 yuv2gbrp9le_full_X_4_512_sse2: 9118.1 yuv2gbrp9le_full_X_4_512_sse4: 5132.1 yuv2gbrp9le_full_X_4_512_avx2: 2833.1 yuv2gbrp10be_full_X_4_512_c: 14401.6 yuv2gbrp10be_full_X_4_512_sse2: 9133.1 yuv2gbrp10be_full_X_4_512_sse4: 5126.1 yuv2gbrp10be_full_X_4_512_avx2: 2837.6 yuv2gbrp10le_full_X_4_512_c: 12718.1 yuv2gbrp10le_full_X_4_512_sse2: 9106.1 yuv2gbrp10le_full_X_4_512_sse4: 5120.1 yuv2gbrp10le_full_X_4_512_avx2: 2826.1 yuv2gbrap10be_full_X_4_512_c: 18535.6 yuv2gbrap10be_full_X_4_512_sse2: 33617.6 yuv2gbrap10be_full_X_4_512_sse4: 6264.1 yuv2gbrap10be_full_X_4_512_avx2: 3422.1 yuv2gbrap10le_full_X_4_512_c: 16724.1 yuv2gbrap10le_full_X_4_512_sse2: 11787.1 yuv2gbrap10le_full_X_4_512_sse4: 6282.1 yuv2gbrap10le_full_X_4_512_avx2: 3441.6 yuv2gbrp12be_full_X_4_512_c: 13723.6 yuv2gbrp12be_full_X_4_512_sse2: 9128.1 yuv2gbrp12be_full_X_4_512_sse4: 7997.6 yuv2gbrp12be_full_X_4_512_avx2: 2844.1 yuv2gbrp12le_full_X_4_512_c: 12257.1 yuv2gbrp12le_full_X_4_512_sse2: 9107.6 yuv2gbrp12le_full_X_4_512_sse4: 5142.6 yuv2gbrp12le_full_X_4_512_avx2: 2837.6 yuv2gbrap12be_full_X_4_512_c: 18511.1 yuv2gbrap12be_full_X_4_512_sse2: 12156.6 yuv2gbrap12be_full_X_4_512_sse4: 6251.1 yuv2gbrap12be_full_X_4_512_avx2: 3444.6 yuv2gbrap12le_full_X_4_512_c: 16687.1 yuv2gbrap12le_full_X_4_512_sse2: 11785.1 
yuv2gbrap12le_full_X_4_512_sse4: 6243.6 yuv2gbrap12le_full_X_4_512_avx2: 3446.1 yuv2gbrp14be_full_X_4_512_c: 13690.6 yuv2gbrp14be_full_X_4_512_sse2: 9120.6 yuv2gbrp14be_full_X_4_512_sse4: 5138.1 yuv2gbrp14be_full_X_4_512_avx2: 2843.1 yuv2gbrp14le_full_X_4_512_c: 14995.6 yuv2gbrp14le_full_X_4_512_sse2: 9119.1 yuv2gbrp14le_full_X_4_512_sse4: 5126.1 yuv2gbrp14le_full_X_4_512_avx2: 2843.1 yuv2gbrp16be_full_X_4_512_c: 12367.1 yuv2gbrp16be_full_X_4_512_sse2: 8233.6 yuv2gbrp16be_full_X_4_512_sse4: 4820.1 yuv2gbrp16be_full_X_4_512_avx2: 2666.6 yuv2gbrp16le_full_X_4_512_c: 10904.1 yuv2gbrp16le_full_X_4_512_sse2: 8214.1 yuv2gbrp16le_full_X_4_512_sse4: 4824.1 yuv2gbrp16le_full_X_4_512_avx2: 2629.1 yuv2gbrap16be_full_X_4_512_c: 26569.6 yuv2gbrap16be_full_X_4_512_sse2: 10884.1 yuv2gbrap16be_full_X_4_512_sse4: 5488.1 yuv2gbrap16be_full_X_4_512_avx2: 3272.1 yuv2gbrap16le_full_X_4_512_c: 14010.1 yuv2gbrap16le_full_X_4_512_sse2: 10562.1 yuv2gbrap16le_full_X_4_512_sse4: 5463.6 yuv2gbrap16le_full_X_4_512_avx2: 3255.1 yuv2gbrpf32be_full_X_4_512_c: 14524.1 yuv2gbrpf32be_full_X_4_512_sse2: 8552.6 yuv2gbrpf32be_full_X_4_512_sse4: 4636.1 yuv2gbrpf32be_full_X_4_512_avx2: 2474.6 yuv2gbrpf32le_full_X_4_512_c: 13060.6 yuv2gbrpf32le_full_X_4_512_sse2: 9682.6 yuv2gbrpf32le_full_X_4_512_sse4: 4298.1 yuv2gbrpf32le_full_X_4_512_avx2: 2453.1 yuv2gbrapf32be_full_X_4_512_c: 18629.6 yuv2gbrapf32be_full_X_4_512_sse2: 11363.1 yuv2gbrapf32be_full_X_4_512_sse4: 15201.6 yuv2gbrapf32be_full_X_4_512_avx2: 3727.1 yuv2gbrapf32le_full_X_4_512_c: 16677.6 yuv2gbrapf32le_full_X_4_512_sse2: 10221.6 yuv2gbrapf32le_full_X_4_512_sse4: 5693.6 yuv2gbrapf32le_full_X_4_512_avx2: 3656.6 Reviewed-by: Paul B Mahol <onemda@gmail.com> Signed-off-by: James Almer <jamrial@gmail.com>	2022-01-11 16:33:17 -03:00
James Almer	4b053b8db1	avcodec/av1dec: honor the requested skip_frame level This supports dropping non-intra, non-key, or all frames. Tested-by: nevcairiel Signed-off-by: James Almer <jamrial@gmail.com>	2022-01-11 09:51:58 -03:00

105809 Commits