FFmpeg

mirror of https://github.com/FFmpeg/FFmpeg.git synced 2024-11-26 19:01:44 +02:00

Author	SHA1	Message	Date
Andreas Rheinhardt	6106fb2b4c	avcodec/hevcdsp: Offset ff_hevc_.pel_filters to simplify addressing Besides simplifying address computations (it saves 432B of .text in hevcdsp.o alone here) it also fixes undefined behaviour that occurs if mx or my are 0 (happens when the filters are unused) because they lead to an array index of -1 in the old code. This happens in the checkasm-hevc_pel FATE-test. Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com> Reviewed-by: Nuo Mi <nuomi2021@gmail.com> Signed-off-by: James Almer <jamrial@gmail.com>	2024-02-13 20:25:49 -03:00
Andreas Rheinhardt	b3bbbb14d0	avcodec/hevcdsp: Constify src pointers Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2022-08-05 02:54:04 +02:00
Lu Wang	20194d573d	avcodec: [loongarch] Optimize Hevcdsp with LSX. ffmpeg -i 5_h265_1080p_60fps_3Mbps.mkv -f rawvideo -y /dev/null -an before: 94fps after : 110fps Signed-off-by: Hao Chen <chenhao@loongson.cn> Reviewed-by: 殷时友 <yinshiyou-hf@loongson.cn> Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	2022-03-01 23:53:40 +01:00
Reimar Döffinger	30f80d855b	lavc/aarch64: port HEVC SIMD idct NEON Makes SIMD-optimized 8x8 and 16x16 idcts for 8 and 10 bit depth available on aarch64. For a UHD HDR (10 bit) sample video these were consuming the most time and this optimization reduced overall decode time from 19.4s to 16.4s, approximately 15% speedup. Test sample was the first 300 frames of "LG 4K HDR Demo - New York.ts", running on Apple M1. Signed-off-by: Josh Dekker <josh@itanimul.li>	2021-02-18 14:11:53 +01:00
Anton Khirnov	e15371061d	lavu/mem: move the DECLARE_ALIGNED macro family to mem_internal on next+1 bump They are not properly namespaced and not intended for public use.	2021-01-01 14:14:57 +01:00
James Almer	c0683dce89	Merge commit '0b9a237b2386ff84a6f99716bd58fa27a1b767e7' * commit '0b9a237b2386ff84a6f99716bd58fa27a1b767e7': hevc: Add NEON 4x4 and 8x8 IDCT [15:12:59] <@ubitux> hevc_idct_4x4_8_c: 389.1 [15:13:00] <@ubitux> hevc_idct_4x4_8_neon: 126.6 [15:13:02] <@ubitux> our ^ [15:13:06] <@ubitux> hevc_idct_4x4_8_c: 389.3 [15:13:08] <@ubitux> hevc_idct_4x4_8_neon: 107.8 [15:13:10] <@ubitux> hevc_idct_4x4_10_c: 418.6 [15:13:12] <@ubitux> hevc_idct_4x4_10_neon: 108.1 [15:13:14] <@ubitux> libav ^ [15:13:30] <@ubitux> so yeah, we can probably trash our versions here Merged-by: James Almer <jamrial@gmail.com>	2017-10-24 19:10:22 -03:00
Clément Bœsch	f6e8d54fcc	Merge commit 'b0e6b3f4777910d61083976aa9fc78a1e0731aae' * commit 'b0e6b3f4777910d61083976aa9fc78a1e0731aae': hevc: ppc: Add HEVC 4x4 IDCT for PowerPC Merged-by: Clément Bœsch <u@pkh.me>	2017-04-17 13:54:51 +02:00
Alexandra Hájková	0b9a237b23	hevc: Add NEON 4x4 and 8x8 IDCT Optimized by Martin Storsjö <martin@martin.st>. The speedup vs C code is around 3.2-4.4x. Signed-off-by: Martin Storsjö <martin@martin.st>	2017-03-27 22:56:23 +03:00
Clément Bœsch	3d65359832	Merge commit '6d5636ad9ab6bd9bedf902051d88b7044385f88b' * commit '6d5636ad9ab6bd9bedf902051d88b7044385f88b': hevc: x86: Add add_residual() SIMD optimizations See `a6af4bf64d` This merge is only cosmetics (renames, space shuffling, etc). The functionnal changes in the ASM are not merged: - unrolling with %rep is kept - ADD_RES_MMX_4_8 is left untouched: this needs investigation Merged-by: Clément Bœsch <u@pkh.me>	2017-03-24 12:33:25 +01:00
Clément Bœsch	d0e132bab6	Merge commit '1bd890ad173d79e7906c5e1d06bf0a06cca4519d' * commit '1bd890ad173d79e7906c5e1d06bf0a06cca4519d': hevc: Separate adding residual to prediction from IDCT This commit should be a noop but isn't because of the following renames: - transform_add → add_residual - transform_skip → dequant - idct_4x4_luma → transform_4x4_luma Merged-by: Clément Bœsch <cboesch@gopro.com>	2017-01-31 15:31:34 +01:00
Alexandra Hajkova	b0e6b3f477	hevc: ppc: Add HEVC 4x4 IDCT for PowerPC Signed-off-by: Diego Biurrun <diego@biurrun.de>	2016-12-12 09:25:16 +01:00
Pierre Edouard Lepere	6d5636ad9a	hevc: x86: Add add_residual() SIMD optimizations Initially written by Pierre Edouard Lepere <Pierre-Edouard.Lepere@insa-rennes.fr>, extended by James Almer <jamrial@gmail.com>. Signed-off-by: Alexandra Hájková <alexandra@khirnov.net>	2016-10-22 17:33:35 +02:00
Alexandra Hájková	1bd890ad17	hevc: Separate adding residual to prediction from IDCT Based on patch `250430bf28` by Mickaël Raulet <mraulet@insa-rennes.fr>, integrated to Libav by Josh de Kock <josh@itanimul.li>. Signed-off-by: Alexandra Hájková <alexandra@khirnov.net>	2016-07-18 15:27:13 +02:00
Mickaël Raulet	cc16da75c2	hevc: Add coefficient limiting to speed up IDCT Integrated to libav by Josh de Kock <josh@itanimul.li>. Signed-off-by: Alexandra Hájková <alexandra@khirnov.net>	2016-07-18 15:27:13 +02:00
Mickaël Raulet	a92fd8a062	hevc: Add DC IDCT Integrated to Libav by Josh de Kock <josh@itanimul.li>. Signed-off-by: Alexandra Hájková <alexandra@khirnov.net>	2016-07-18 15:27:13 +02:00
Anton Khirnov	e7078e842d	hevcdsp: add x86 SIMD for MC	2015-12-05 21:11:52 +01:00
Anton Khirnov	688417399c	hevcdsp: split the pred functions by width This should allow for more efficient SIMD.	2015-12-05 21:10:41 +01:00
Anton Khirnov	818bfe7f0a	hevcdsp: split the epel functions by width This should allow for more efficient SIMD.	2015-12-05 21:09:57 +01:00
Anton Khirnov	1f821750f0	hevcdsp: split the qpel functions by width instead of by the subpixel fraction This should allow for more efficient SIMD. Keep the C versions as they are now, to allow the compiler to inline the interpolation coefficients.	2015-12-05 21:08:04 +01:00
Anton Khirnov	d8ebb6157d	hevcdsp: fix a function name put_weighted_pred_avg should be put_unweighted_pred_avg, there is no weighting there.	2015-08-21 08:46:05 +02:00
Michael Niedermayer	29d147c94d	Merge commit '059a934806d61f7af9ab3fd9f74994b838ea5eba' * commit '059a934806d61f7af9ab3fd9f74994b838ea5eba': lavc: Consistently prefix input buffer defines Conflicts: doc/examples/decoding_encoding.c libavcodec/4xm.c libavcodec/aac_adtstoasc_bsf.c libavcodec/aacdec.c libavcodec/aacenc.c libavcodec/ac3dec.h libavcodec/asvenc.c libavcodec/avcodec.h libavcodec/avpacket.c libavcodec/dvdec.c libavcodec/ffv1enc.c libavcodec/g2meet.c libavcodec/gif.c libavcodec/h264.c libavcodec/h264_mp4toannexb_bsf.c libavcodec/huffyuvdec.c libavcodec/huffyuvenc.c libavcodec/jpeglsenc.c libavcodec/libxvid.c libavcodec/mdec.c libavcodec/motionpixels.c libavcodec/mpeg4videodec.c libavcodec/mpegvideo.c libavcodec/noise_bsf.c libavcodec/nuv.c libavcodec/nvenc.c libavcodec/options.c libavcodec/parser.c libavcodec/pngenc.c libavcodec/proresenc_kostya.c libavcodec/qsvdec.c libavcodec/svq1enc.c libavcodec/tiffenc.c libavcodec/truemotion2.c libavcodec/utils.c libavcodec/utvideoenc.c libavcodec/vc1dec.c libavcodec/wmalosslessdec.c libavformat/adxdec.c libavformat/aiffdec.c libavformat/apc.c libavformat/apetag.c libavformat/avidec.c libavformat/bink.c libavformat/cafdec.c libavformat/flvdec.c libavformat/id3v2.c libavformat/isom.c libavformat/matroskadec.c libavformat/mov.c libavformat/mpc.c libavformat/mpc8.c libavformat/mpegts.c libavformat/mvi.c libavformat/mxfdec.c libavformat/mxg.c libavformat/nutdec.c libavformat/oggdec.c libavformat/oggparsecelt.c libavformat/oggparseflac.c libavformat/oggparseopus.c libavformat/oggparsespeex.c libavformat/omadec.c libavformat/rawdec.c libavformat/riffdec.c libavformat/rl2.c libavformat/rmdec.c libavformat/rtpdec_latm.c libavformat/rtpdec_mpeg4.c libavformat/rtpdec_qdm2.c libavformat/rtpdec_svq3.c libavformat/sierravmd.c libavformat/smacker.c libavformat/smush.c libavformat/spdifenc.c libavformat/takdec.c libavformat/tta.c libavformat/utils.c libavformat/vqf.c libavformat/westwood_vqa.c libavformat/xmv.c libavformat/xwma.c libavformat/yop.c Merged-by: Michael Niedermayer <michael@niedermayer.cc>	2015-07-27 23:15:19 +02:00
Shivraj Patil	4efc0e6451	avcodec/mips: MSA (MIPS-SIMD-Arch) optimizations for HEVC horizontal and vertical mc functions Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com> Reviewed-by: Nedeljko Babic <Nedeljko.Babic@imgtec.com> Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2015-04-17 17:39:32 +02:00
Seppo Tomperi	0c494114cc	hevcdsp: ARM NEON optimized deblocking filter cherry picked from commit 1b9ee47d2f43b0a029a9468233626102eb1473b8 Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2015-02-05 22:01:52 +01:00
James Almer	042c1159fc	x86/hevcdsp: add ff_hevc_sao_edge_filter_8_{ssse3,avx2} Original x86 intrinsics code and initial yasm port by Pierre-Edouard Lepere. Refactoring and optimizations by James Almer. Benchmarks of BQTerrace_1920x1080_60_qp22.bin with an Intel Core i5-4200U Width 32 158583 decicycles in edge, sao_edge_filter_8 runs, 0 skips 5205 decicycles in ff_hevc_sao_edge_filter_32_8_ssse3, 32767 runs, 1 skips 2942 decicycles in ff_hevc_sao_edge_filter_32_8_avx2, 32767 runs, 1 skips Width 64 705639 decicycles in sao_edge_filter_8, 262144 runs, 0 skips 19224 decicycles in ff_hevc_sao_edge_filter_64_8_ssse3, 262111 runs, 33 skips 10433 decicycles in ff_hevc_sao_edge_filter_64_8_avx2, 262115 runs, 29 skips Signed-off-by: James Almer <jamrial@gmail.com>	2015-02-05 15:02:27 -03:00
James Almer	1f1c7c8a57	hevcdsp: remove compilation-time-fixed parameter from sao_edge_filter The stride_src parameter is always 2 * MAX_PB_SIZE + FF_INPUT_BUFFER_PADDING_SIZE. Signed-off-by: James Almer <jamrial@gmail.com>	2015-02-05 15:02:22 -03:00
James Almer	7457afc64d	hevcdsp: replace the SAOParams struct parameter from sao_edge_filter As with sao_band_filter, pass instead the two variables from the struct needed in the function. This simplifies writing asm optimized versions. Reviewed-by: Mickaël Raulet <mraulet@insa-rennes.fr> Signed-off-by: James Almer <jamrial@gmail.com>	2015-02-04 17:53:04 -03:00
Seppo Tomperi	4386e1fd94	hevcdsp: simplified sao_edge_filter Reviewed-by: Christophe Gisquet <christophe.gisquet@gmail.com> Reviewed-by: Mickaël Raulet <mraulet@insa-rennes.fr>	2015-02-04 17:52:54 -03:00
Seppo Tomperi	74d7faf400	hevcdsp: separated sao edge filter and pixel restore funcs Reviewed-by: Michael Niedermayer <michaelni@gmx.at> Reviewed-by: Christophe Gisquet <christophe.gisquet@gmail.com> Reviewed-by: Mickaël Raulet <mraulet@insa-rennes.fr>	2015-02-04 17:52:49 -03:00
James Almer	fa3eccb4f9	x86/hevc: add ff_hevc_sao_band_filter_{8,10,12}_{sse2,avx,avx2} Original x86 intrinsics code and initial 8bit yasm port by Pierre-Edouard Lepere. 10/12bit yasm ports, refactoring and optimizations by James Almer Benchmarks of BQTerrace_1920x1080_60_qp22.bin with an Intel Core i5-4200U width 32 40338 decicycles in sao_band_filter_0_8, 2048 runs, 0 skips 8056 decicycles in ff_hevc_sao_band_filter_8_32_sse2, 2048 runs, 0 skips 7458 decicycles in ff_hevc_sao_band_filter_8_32_avx, 2048 runs, 0 skips 4504 decicycles in ff_hevc_sao_band_filter_8_32_avx2, 2048 runs, 0 skips width 64 136046 decicycles in sao_band_filter_0_8, 16384 runs, 0 skips 28576 decicycles in ff_hevc_sao_band_filter_8_32_sse2, 16384 runs, 0 skips 26707 decicycles in ff_hevc_sao_band_filter_8_32_avx, 16384 runs, 0 skips 14387 decicycles in ff_hevc_sao_band_filter_8_32_avx2, 16384 runs, 0 skips Reviewed-by: Christophe Gisquet <christophe.gisquet@gmail.com> Signed-off-by: James Almer <jamrial@gmail.com>	2015-02-01 20:22:35 -03:00
James Almer	2929e56006	hevcdsp: replace the SAOParams struct parameter from sao_band_filter Pass instead the two variables from the struct needed in the function. This simplifies writing asm optimized versions of the function Signed-off-by: James Almer <jamrial@gmail.com>	2015-02-01 15:45:20 -03:00
James Almer	65e6ab0c5a	hevcdsp: remove unused parameter from sao_band_filter Signed-off-by: James Almer <jamrial@gmail.com>	2015-02-01 15:45:14 -03:00
Christophe Gisquet	dad7f15567	hevcdsp: remove more instances of compile-time-fixed parameters Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2014-08-22 15:22:42 +02:00
Christophe Gisquet	d4f44b66d3	hevcdsp: remove compilation-time-fixed parameter The dststride parameter is always MAX_PB_SIZE. Reviewed-by: Mickaël Raulet <mraulet@gmail.com> Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2014-08-22 14:57:37 +02:00
Christophe Gisquet	b9f3912a65	hevc: move MAX_PB_SIZE declaration Reviewed-by: Mickaël Raulet <mraulet@gmail.com> Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2014-08-22 14:21:46 +02:00
Christophe Gisquet	6786848585	hevc_deblock: change tc type The x86 asm expects int32_t so use that type. Reviewed-by: Mickaël Raulet <mraulet@insa-rennes.fr> Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2014-08-06 12:38:26 +02:00
Michael Niedermayer	706f81a2c2	Merge commit '1a880b2fb8456ce68eefe5902bac95fea1e6a72d' * commit '1a880b2fb8456ce68eefe5902bac95fea1e6a72d': hevc: SSE2 and SSSE3 loop filters Conflicts: libavcodec/hevcdsp.c libavcodec/hevcdsp.h libavcodec/x86/Makefile libavcodec/x86/hevc_deblock.asm libavcodec/x86/hevcdsp_init.c See: `de7b89fd43` and several others Merged-by: Michael Niedermayer <michaelni@gmx.at>	2014-07-27 00:20:48 +02:00
James Almer	1ace9573dc	x86/hevc_idct: replace old and unused idct functions Only 8-bit and 10-bit idct_dc() functions are included (adding others should be trivial). Benchmarks on an Intel Core i5-4200U: idct8x8_dc SSE2 MMXEXT C cycles 22 26 57 idct16x16_dc AVX2 SSE2 C cycles 27 32 249 idct32x32_dc AVX2 SSE2 C cycles 62 126 1375 Signed-off-by: James Almer <jamrial@gmail.com> Reviewed-by: Mickaël Raulet <mraulet@gmail.com> Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2014-07-26 18:00:11 +02:00
Pierre Edouard Lepere	1a880b2fb8	hevc: SSE2 and SSSE3 loop filters Additional contributions by James Almer <jamrial@gmail.com>, Carl Eugen Hoyos <cehoyos@ag.or.at>, Fiona Glaser <fiona@x264.com> and Anton Khirnov <anton@khirnov.net> Signed-off-by: Anton Khirnov <anton@khirnov.net>	2014-07-26 15:01:01 +00:00
Anton Khirnov	73bb8f61d4	hevcdsp: remove an unneeded variable in the loop filter beta0 and beta1 will always be the same	2014-07-26 15:00:11 +00:00
Christophe Gisquet	ca081217cd	hevcdsp: change types of SAO parameters From openhevc Reviewed-by: Mickaël Raulet <mraulet@insa-rennes.fr> Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2014-07-23 20:54:03 +02:00
Anton Khirnov	d7e162d46b	hevcdsp: remove an unneeded variable in the loop filter beta0 and beta1 will always be the same within a CU Signed-off-by: Mickaël Raulet <mraulet@insa-rennes.fr> cherry picked from commit 4a23d824741a289c7d2d2f2871d1e2621b63fa1b Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2014-07-22 16:27:26 +02:00
Mickaël Raulet	d249e6828e	hevc/sao: optimze sao implementation - adding one extra pixel all around the frame - do not copy when SAO is not applied 5% improvement cherry picked from commit 10fc29fc19a12c4d8168fbe1a954b76386db12d0 Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2014-07-18 22:46:50 +02:00
Mickaël Raulet	453f8eaee2	hevc/rext: add support for Range extension tools SPS features/flags: - transform_skip_rotation_enabled_flag - transform_skip_context_enabled_flag - implicit_rdpcm_enabled_flag - explicit_rdpcm_enabled_flag - intra_smoothing_disabled_flag - persistent_rice_adaptation_enabled_flag PPS features/flags: - log2_max_transform_skip_block_size - cross_component_prediction_enabled_flag - chroma_qp_offset_list_enabled_flag - diff_cu_chroma_qp_offset_depth - chroma_qp_offset_list_len_minus1 - cb_qp_offset_list - cr_qp_offset_list - log2_sao_offset_scale_luma - log2_sao_offset_scale_chroma (cherry picked from commit 005294c5b939a23099871c6130c8a7cc331f73ee) Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2014-07-15 14:08:20 +02:00
Mickaël Raulet	5a41999d81	hevc/rext: basic infrastructure for supporting range extension - support for 4:2:2 and 4:4:4 up to 12 bits - add a new profile for range extension (cherry picked from commit d3c067fa65bbc871758d28aa07f54123430ca346) Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2014-07-15 13:47:35 +02:00
Mickaël Raulet	250430bf28	hevc: separate residu and prediction (needed for Range Extension) (cherry picked from commit 6b3856ef57d66f2e59ee61fd2eb5f83b6d0d7d4a) Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2014-07-15 13:37:27 +02:00
Mickaël Raulet	1241eb8870	hevc: simplify SAO computation, delay from one row its computation (cherry picked from commit f2c5f647cec786df26f442a85e6d685a131a50c9) Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2014-07-15 13:11:33 +02:00
plepere	92cccb7bcd	avcodec/hevc: new idct + asm Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2014-06-17 13:23:36 +02:00
plepere	7a2491c436	HEVC : added assembly MC functions pretty print x86 Reviewed-by: "Ronald S. Bultje" <rsbultje@gmail.com> Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2014-05-06 18:23:36 +02:00
Mickaël Raulet	83976e40e8	hevc: C code update for new motion compensation pretty print C Reviewed-by: "Ronald S. Bultje" <rsbultje@gmail.com> Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2014-05-06 18:22:34 +02:00
Michael Niedermayer	5410a5dc66	Merge remote-tracking branch 'qatar/master' * qatar/master: hevc: move DSP declarations from hevc.h into hevcdsp.h Conflicts: libavcodec/hevc.h libavcodec/hevcdsp.c libavcodec/hevcdsp.h See: `c8dd048ab8` Merged-by: Michael Niedermayer <michaelni@gmx.at>	2013-12-22 12:46:19 +01:00

52 Commits