Andreas Rheinhardt
b3bbbb14d0
avcodec/hevcdsp: Constify src pointers
...
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2022-08-05 02:54:04 +02:00
Lu Wang
20194d573d
avcodec: [loongarch] Optimize Hevcdsp with LSX.
...
ffmpeg -i 5_h265_1080p_60fps_3Mbps.mkv -f rawvideo -y /dev/null -an
before: 94fps
after : 110fps
Signed-off-by: Hao Chen <chenhao@loongson.cn>
Reviewed-by: 殷时友 <yinshiyou-hf@loongson.cn>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2022-03-01 23:53:40 +01:00
Reimar Döffinger
30f80d855b
lavc/aarch64: port HEVC SIMD idct NEON
...
Makes SIMD-optimized 8x8 and 16x16 idcts for 8 and 10 bit depth
available on aarch64.
For a UHD HDR (10 bit) sample video these were consuming the most time
and this optimization reduced overall decode time from 19.4s to 16.4s,
approximately 15% speedup.
Test sample was the first 300 frames of "LG 4K HDR Demo - New York.ts",
running on Apple M1.
Signed-off-by: Josh Dekker <josh@itanimul.li>
2021-02-18 14:11:53 +01:00
Anton Khirnov
e15371061d
lavu/mem: move the DECLARE_ALIGNED macro family to mem_internal on next+1 bump
...
They are not properly namespaced and not intended for public use.
2021-01-01 14:14:57 +01:00
James Almer
c0683dce89
Merge commit '0b9a237b2386ff84a6f99716bd58fa27a1b767e7'
...
* commit '0b9a237b2386ff84a6f99716bd58fa27a1b767e7':
hevc: Add NEON 4x4 and 8x8 IDCT
[15:12:59] <@ubitux> hevc_idct_4x4_8_c: 389.1
[15:13:00] <@ubitux> hevc_idct_4x4_8_neon: 126.6
[15:13:02] <@ubitux> our ^
[15:13:06] <@ubitux> hevc_idct_4x4_8_c: 389.3
[15:13:08] <@ubitux> hevc_idct_4x4_8_neon: 107.8
[15:13:10] <@ubitux> hevc_idct_4x4_10_c: 418.6
[15:13:12] <@ubitux> hevc_idct_4x4_10_neon: 108.1
[15:13:14] <@ubitux> libav ^
[15:13:30] <@ubitux> so yeah, we can probably trash our versions here
Merged-by: James Almer <jamrial@gmail.com>
2017-10-24 19:10:22 -03:00
Clément Bœsch
f6e8d54fcc
Merge commit 'b0e6b3f4777910d61083976aa9fc78a1e0731aae'
...
* commit 'b0e6b3f4777910d61083976aa9fc78a1e0731aae':
hevc: ppc: Add HEVC 4x4 IDCT for PowerPC
Merged-by: Clément Bœsch <u@pkh.me>
2017-04-17 13:54:51 +02:00
Alexandra Hájková
0b9a237b23
hevc: Add NEON 4x4 and 8x8 IDCT
...
Optimized by Martin Storsjö <martin@martin.st>.
The speedup vs C code is around 3.2-4.4x.
Signed-off-by: Martin Storsjö <martin@martin.st>
2017-03-27 22:56:23 +03:00
Clément Bœsch
3d65359832
Merge commit '6d5636ad9ab6bd9bedf902051d88b7044385f88b'
...
* commit '6d5636ad9ab6bd9bedf902051d88b7044385f88b':
hevc: x86: Add add_residual() SIMD optimizations
See a6af4bf64d
This merge is only cosmetics (renames, space shuffling, etc).
The functional changes in the ASM are *not* merged:
- unrolling with %rep is kept
- ADD_RES_MMX_4_8 is left untouched: this needs investigation
Merged-by: Clément Bœsch <u@pkh.me>
2017-03-24 12:33:25 +01:00
Clément Bœsch
d0e132bab6
Merge commit '1bd890ad173d79e7906c5e1d06bf0a06cca4519d'
...
* commit '1bd890ad173d79e7906c5e1d06bf0a06cca4519d':
hevc: Separate adding residual to prediction from IDCT
This commit should be a noop but isn't because of the following renames:
- transform_add → add_residual
- transform_skip → dequant
- idct_4x4_luma → transform_4x4_luma
Merged-by: Clément Bœsch <cboesch@gopro.com>
2017-01-31 15:31:34 +01:00
Alexandra Hajkova
b0e6b3f477
hevc: ppc: Add HEVC 4x4 IDCT for PowerPC
...
Signed-off-by: Diego Biurrun <diego@biurrun.de>
2016-12-12 09:25:16 +01:00
Pierre Edouard Lepere
6d5636ad9a
hevc: x86: Add add_residual() SIMD optimizations
...
Initially written by Pierre Edouard Lepere <Pierre-Edouard.Lepere@insa-rennes.fr>,
extended by James Almer <jamrial@gmail.com>.
Signed-off-by: Alexandra Hájková <alexandra@khirnov.net>
2016-10-22 17:33:35 +02:00
Alexandra Hájková
1bd890ad17
hevc: Separate adding residual to prediction from IDCT
...
Based on patch 250430bf28
by Mickaël Raulet <mraulet@insa-rennes.fr>, integrated
to Libav by Josh de Kock <josh@itanimul.li>.
Signed-off-by: Alexandra Hájková <alexandra@khirnov.net>
2016-07-18 15:27:13 +02:00
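For readers unfamiliar with the split described in the commit above: once the inverse transform only writes residuals, a separate step adds them to the predicted pixels and clamps the result. The following is a minimal 8-bit sketch of that step, not the code from the commit. The names add_residual_8 and clip_pixel8, the row-major residual layout and the parameter list are assumptions made for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Clamp an int to the 8-bit pixel range [0, 255]. */
static uint8_t clip_pixel8(int v)
{
    return v < 0 ? 0 : v > 255 ? 255 : (uint8_t)v;
}

/*
 * Hypothetical 8-bit add_residual (illustrative, not the FFmpeg function):
 * dst holds a size x size block of predicted pixels with the given stride,
 * res holds size * size residuals in row-major order (assumed layout).
 */
static void add_residual_8(uint8_t *dst, const int16_t *res,
                           ptrdiff_t stride, int size)
{
    for (int y = 0; y < size; y++) {
        for (int x = 0; x < size; x++)
            dst[x] = clip_pixel8(dst[x] + res[x]);
        dst += stride;  /* next row of pixels */
        res += size;    /* next row of residuals */
    }
}
```

Keeping this step separate means the transform kernels and the add-and-clamp kernel can each be given their own SIMD versions.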
Mickaël Raulet
cc16da75c2
hevc: Add coefficient limiting to speed up IDCT
...
Integrated to libav by Josh de Kock <josh@itanimul.li>.
Signed-off-by: Alexandra Hájková <alexandra@khirnov.net>
2016-07-18 15:27:13 +02:00
Mickaël Raulet
a92fd8a062
hevc: Add DC IDCT
...
Integrated to Libav by Josh de Kock <josh@itanimul.li>.
Signed-off-by: Alexandra Hájková <alexandra@khirnov.net>
2016-07-18 15:27:13 +02:00
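As background for the DC IDCT commit above: when only the DC coefficient of a block is non-zero, the inverse transform reduces to filling the block with one rounded value. The sketch below shows that shortcut under stated assumptions. The function name idct_dc_sketch, the runtime size and bit_depth parameters, and the limit of bit_depth <= 12 are illustrative and not taken from the commit.

```c
#include <stdint.h>

/*
 * Hypothetical DC-only inverse transform (illustrative sketch).
 * coeffs: size * size coefficients, only coeffs[0] is assumed non-zero.
 * bit_depth: assumed to be at most 12 so that shift - 1 stays non-negative.
 */
static void idct_dc_sketch(int16_t *coeffs, int size, int bit_depth)
{
    const int shift = 14 - bit_depth;
    const int add   = 1 << (shift - 1);  /* rounding term for the final shift */
    /* First stage rounds by one bit, second stage rounds by `shift` bits. */
    const int dc    = (((coeffs[0] + 1) >> 1) + add) >> shift;

    for (int i = 0; i < size * size; i++)
        coeffs[i] = (int16_t)dc;
}
```

With bit_depth 8 and a DC coefficient of 640, every output value becomes 5.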
Anton Khirnov
e7078e842d
hevcdsp: add x86 SIMD for MC
2015-12-05 21:11:52 +01:00
Anton Khirnov
688417399c
hevcdsp: split the pred functions by width
...
This should allow for more efficient SIMD.
2015-12-05 21:10:41 +01:00
Anton Khirnov
818bfe7f0a
hevcdsp: split the epel functions by width
...
This should allow for more efficient SIMD.
2015-12-05 21:09:57 +01:00
Anton Khirnov
1f821750f0
hevcdsp: split the qpel functions by width instead of by the subpixel fraction
...
This should allow for more efficient SIMD.
Keep the C versions as they are now, to allow the compiler to inline the
interpolation coefficients.
2015-12-05 21:08:04 +01:00
Anton Khirnov
d8ebb6157d
hevcdsp: fix a function name
...
put_weighted_pred_avg should be put_unweighted_pred_avg, there is no
weighting there.
2015-08-21 08:46:05 +02:00
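To illustrate why the rename above fits: a bi-prediction average only sums two intermediate predictions, rounds and shifts back to pixel range, with no weights or offsets from the bitstream. The sketch below assumes 8-bit output and 14-bit intermediate precision. The name put_unweighted_pred_avg_8, the helper clip_avg8 and the parameter list are assumptions for illustration, not the signature in the source tree.

```c
#include <stddef.h>
#include <stdint.h>

/* Clamp an int to the 8-bit pixel range [0, 255]. */
static uint8_t clip_avg8(int v)
{
    return v < 0 ? 0 : v > 255 ? 255 : (uint8_t)v;
}

/*
 * Hypothetical unweighted bi-prediction average (illustrative sketch).
 * src1/src2: intermediate predictions at an assumed 14-bit precision.
 * No weights are applied: the two sources are summed, rounded and shifted.
 */
static void put_unweighted_pred_avg_8(uint8_t *dst, ptrdiff_t dst_stride,
                                      const int16_t *src1, const int16_t *src2,
                                      ptrdiff_t src_stride,
                                      int width, int height)
{
    const int shift  = 14 + 1 - 8;        /* extra bit accounts for the sum */
    const int offset = 1 << (shift - 1);  /* round to nearest */

    for (int y = 0; y < height; y++) {
        for (int x = 0; x < width; x++)
            dst[x] = clip_avg8((src1[x] + src2[x] + offset) >> shift);
        dst  += dst_stride;
        src1 += src_stride;
        src2 += src_stride;
    }
}
```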
Michael Niedermayer
29d147c94d
Merge commit '059a934806d61f7af9ab3fd9f74994b838ea5eba'
...
* commit '059a934806d61f7af9ab3fd9f74994b838ea5eba':
lavc: Consistently prefix input buffer defines
Conflicts:
doc/examples/decoding_encoding.c
libavcodec/4xm.c
libavcodec/aac_adtstoasc_bsf.c
libavcodec/aacdec.c
libavcodec/aacenc.c
libavcodec/ac3dec.h
libavcodec/asvenc.c
libavcodec/avcodec.h
libavcodec/avpacket.c
libavcodec/dvdec.c
libavcodec/ffv1enc.c
libavcodec/g2meet.c
libavcodec/gif.c
libavcodec/h264.c
libavcodec/h264_mp4toannexb_bsf.c
libavcodec/huffyuvdec.c
libavcodec/huffyuvenc.c
libavcodec/jpeglsenc.c
libavcodec/libxvid.c
libavcodec/mdec.c
libavcodec/motionpixels.c
libavcodec/mpeg4videodec.c
libavcodec/mpegvideo.c
libavcodec/noise_bsf.c
libavcodec/nuv.c
libavcodec/nvenc.c
libavcodec/options.c
libavcodec/parser.c
libavcodec/pngenc.c
libavcodec/proresenc_kostya.c
libavcodec/qsvdec.c
libavcodec/svq1enc.c
libavcodec/tiffenc.c
libavcodec/truemotion2.c
libavcodec/utils.c
libavcodec/utvideoenc.c
libavcodec/vc1dec.c
libavcodec/wmalosslessdec.c
libavformat/adxdec.c
libavformat/aiffdec.c
libavformat/apc.c
libavformat/apetag.c
libavformat/avidec.c
libavformat/bink.c
libavformat/cafdec.c
libavformat/flvdec.c
libavformat/id3v2.c
libavformat/isom.c
libavformat/matroskadec.c
libavformat/mov.c
libavformat/mpc.c
libavformat/mpc8.c
libavformat/mpegts.c
libavformat/mvi.c
libavformat/mxfdec.c
libavformat/mxg.c
libavformat/nutdec.c
libavformat/oggdec.c
libavformat/oggparsecelt.c
libavformat/oggparseflac.c
libavformat/oggparseopus.c
libavformat/oggparsespeex.c
libavformat/omadec.c
libavformat/rawdec.c
libavformat/riffdec.c
libavformat/rl2.c
libavformat/rmdec.c
libavformat/rtpdec_latm.c
libavformat/rtpdec_mpeg4.c
libavformat/rtpdec_qdm2.c
libavformat/rtpdec_svq3.c
libavformat/sierravmd.c
libavformat/smacker.c
libavformat/smush.c
libavformat/spdifenc.c
libavformat/takdec.c
libavformat/tta.c
libavformat/utils.c
libavformat/vqf.c
libavformat/westwood_vqa.c
libavformat/xmv.c
libavformat/xwma.c
libavformat/yop.c
Merged-by: Michael Niedermayer <michael@niedermayer.cc>
2015-07-27 23:15:19 +02:00
Shivraj Patil
4efc0e6451
avcodec/mips: MSA (MIPS-SIMD-Arch) optimizations for HEVC horizontal and vertical mc functions
...
Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
Reviewed-by: Nedeljko Babic <Nedeljko.Babic@imgtec.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2015-04-17 17:39:32 +02:00
Seppo Tomperi
0c494114cc
hevcdsp: ARM NEON optimized deblocking filter
...
cherry picked from commit 1b9ee47d2f43b0a029a9468233626102eb1473b8
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2015-02-05 22:01:52 +01:00
James Almer
042c1159fc
x86/hevcdsp: add ff_hevc_sao_edge_filter_8_{ssse3,avx2}
...
Original x86 intrinsics code and initial yasm port by Pierre-Edouard Lepere.
Refactoring and optimizations by James Almer.
Benchmarks of BQTerrace_1920x1080_60_qp22.bin with an Intel Core i5-4200U
Width 32
158583 decicycles in edge, sao_edge_filter_8 runs, 0 skips
5205 decicycles in ff_hevc_sao_edge_filter_32_8_ssse3, 32767 runs, 1 skips
2942 decicycles in ff_hevc_sao_edge_filter_32_8_avx2, 32767 runs, 1 skips
Width 64
705639 decicycles in sao_edge_filter_8, 262144 runs, 0 skips
19224 decicycles in ff_hevc_sao_edge_filter_64_8_ssse3, 262111 runs, 33 skips
10433 decicycles in ff_hevc_sao_edge_filter_64_8_avx2, 262115 runs, 29 skips
Signed-off-by: James Almer <jamrial@gmail.com>
2015-02-05 15:02:27 -03:00
James Almer
1f1c7c8a57
hevcdsp: remove compilation-time-fixed parameter from sao_edge_filter
...
The stride_src parameter is always 2 * MAX_PB_SIZE + FF_INPUT_BUFFER_PADDING_SIZE.
Signed-off-by: James Almer <jamrial@gmail.com>
2015-02-05 15:02:22 -03:00
James Almer
7457afc64d
hevcdsp: replace the SAOParams struct parameter from sao_edge_filter
...
As with sao_band_filter, pass instead the two variables from the struct needed in the function.
This simplifies writing asm optimized versions.
Reviewed-by: Mickaël Raulet <mraulet@insa-rennes.fr>
Signed-off-by: James Almer <jamrial@gmail.com>
2015-02-04 17:53:04 -03:00
Seppo Tomperi
4386e1fd94
hevcdsp: simplified sao_edge_filter
...
Reviewed-by: Christophe Gisquet <christophe.gisquet@gmail.com>
Reviewed-by: Mickaël Raulet <mraulet@insa-rennes.fr>
2015-02-04 17:52:54 -03:00
Seppo Tomperi
74d7faf400
hevcdsp: separated sao edge filter and pixel restore funcs
...
Reviewed-by: Michael Niedermayer <michaelni@gmx.at>
Reviewed-by: Christophe Gisquet <christophe.gisquet@gmail.com>
Reviewed-by: Mickaël Raulet <mraulet@insa-rennes.fr>
2015-02-04 17:52:49 -03:00
James Almer
fa3eccb4f9
x86/hevc: add ff_hevc_sao_band_filter_{8,10,12}_{sse2,avx,avx2}
...
Original x86 intrinsics code and initial 8bit yasm port by Pierre-Edouard Lepere.
10/12bit yasm ports, refactoring and optimizations by James Almer
Benchmarks of BQTerrace_1920x1080_60_qp22.bin with an Intel Core i5-4200U
width 32
40338 decicycles in sao_band_filter_0_8, 2048 runs, 0 skips
8056 decicycles in ff_hevc_sao_band_filter_8_32_sse2, 2048 runs, 0 skips
7458 decicycles in ff_hevc_sao_band_filter_8_32_avx, 2048 runs, 0 skips
4504 decicycles in ff_hevc_sao_band_filter_8_32_avx2, 2048 runs, 0 skips
width 64
136046 decicycles in sao_band_filter_0_8, 16384 runs, 0 skips
28576 decicycles in ff_hevc_sao_band_filter_8_32_sse2, 16384 runs, 0 skips
26707 decicycles in ff_hevc_sao_band_filter_8_32_avx, 16384 runs, 0 skips
14387 decicycles in ff_hevc_sao_band_filter_8_32_avx2, 16384 runs, 0 skips
Reviewed-by: Christophe Gisquet <christophe.gisquet@gmail.com>
Signed-off-by: James Almer <jamrial@gmail.com>
2015-02-01 20:22:35 -03:00
James Almer
2929e56006
hevcdsp: replace the SAOParams struct parameter from sao_band_filter
...
Pass instead the two variables from the struct needed in the function.
This simplifies writing asm optimized versions of the function
Signed-off-by: James Almer <jamrial@gmail.com>
2015-02-01 15:45:20 -03:00
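A sketch of what passing "the two variables" instead of the struct can look like for an 8-bit SAO band filter: the function receives only the offset values and the left band position. This is an illustration under assumptions, not the signature from the commit. The name sao_band_filter_8_sketch, the parameter order, and the convention that sao_offset_val has five entries with index 0 unused are all assumed here.

```c
#include <stddef.h>
#include <stdint.h>

/* Clamp an int to the 8-bit pixel range [0, 255]. */
static uint8_t clip_sao8(int v)
{
    return v < 0 ? 0 : v > 255 ? 255 : (uint8_t)v;
}

/*
 * Hypothetical 8-bit SAO band filter (illustrative sketch).
 * sao_offset_val: assumed to hold 5 entries, entries 1..4 are the offsets.
 * sao_left_class: index of the first of four consecutive bands to adjust.
 */
static void sao_band_filter_8_sketch(uint8_t *dst, const uint8_t *src,
                                     ptrdiff_t dst_stride, ptrdiff_t src_stride,
                                     const int16_t *sao_offset_val,
                                     int sao_left_class,
                                     int width, int height)
{
    int offset_table[32] = {0};  /* one entry per band, 32 bands in total */
    const int shift = 8 - 5;     /* pixel value -> band index for 8-bit */

    /* Only four consecutive bands (wrapping at 32) receive an offset. */
    for (int k = 0; k < 4; k++)
        offset_table[(k + sao_left_class) & 31] = sao_offset_val[k + 1];

    for (int y = 0; y < height; y++) {
        for (int x = 0; x < width; x++)
            dst[x] = clip_sao8(src[x] + offset_table[src[x] >> shift]);
        dst += dst_stride;
        src += src_stride;
    }
}
```

Plain scalar and pointer arguments like these are easier to consume from hand-written assembly than a struct whose layout the assembly would have to mirror.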
James Almer
65e6ab0c5a
hevcdsp: remove unused parameter from sao_band_filter
...
Signed-off-by: James Almer <jamrial@gmail.com>
2015-02-01 15:45:14 -03:00
Christophe Gisquet
dad7f15567
hevcdsp: remove more instances of compile-time-fixed parameters
...
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-08-22 15:22:42 +02:00
Christophe Gisquet
d4f44b66d3
hevcdsp: remove compilation-time-fixed parameter
...
The dststride parameter is always MAX_PB_SIZE.
Reviewed-by: Mickaël Raulet <mraulet@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-08-22 14:57:37 +02:00
Christophe Gisquet
b9f3912a65
hevc: move MAX_PB_SIZE declaration
...
Reviewed-by: Mickaël Raulet <mraulet@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-08-22 14:21:46 +02:00
Christophe Gisquet
6786848585
hevc_deblock: change tc type
...
The x86 asm expects int32_t so use that type.
Reviewed-by: Mickaël Raulet <mraulet@insa-rennes.fr>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-08-06 12:38:26 +02:00
Michael Niedermayer
706f81a2c2
Merge commit '1a880b2fb8456ce68eefe5902bac95fea1e6a72d'
...
* commit '1a880b2fb8456ce68eefe5902bac95fea1e6a72d':
hevc: SSE2 and SSSE3 loop filters
Conflicts:
libavcodec/hevcdsp.c
libavcodec/hevcdsp.h
libavcodec/x86/Makefile
libavcodec/x86/hevc_deblock.asm
libavcodec/x86/hevcdsp_init.c
See: de7b89fd43
and several others
Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-07-27 00:20:48 +02:00
James Almer
1ace9573dc
x86/hevc_idct: replace old and unused idct functions
...
Only 8-bit and 10-bit idct_dc() functions are included (adding others should be trivial).
Benchmarks on an Intel Core i5-4200U:
idct8x8_dc
          SSE2   MMXEXT   C
cycles    22     26       57
idct16x16_dc
          AVX2   SSE2     C
cycles    27     32       249
idct32x32_dc
          AVX2   SSE2     C
cycles    62     126      1375
Signed-off-by: James Almer <jamrial@gmail.com>
Reviewed-by: Mickaël Raulet <mraulet@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-07-26 18:00:11 +02:00
Pierre Edouard Lepere
1a880b2fb8
hevc: SSE2 and SSSE3 loop filters
...
Additional contributions by James Almer <jamrial@gmail.com>,
Carl Eugen Hoyos <cehoyos@ag.or.at>, Fiona Glaser <fiona@x264.com> and
Anton Khirnov <anton@khirnov.net>
Signed-off-by: Anton Khirnov <anton@khirnov.net>
2014-07-26 15:01:01 +00:00
Anton Khirnov
73bb8f61d4
hevcdsp: remove an unneeded variable in the loop filter
...
beta0 and beta1 will always be the same
2014-07-26 15:00:11 +00:00
Christophe Gisquet
ca081217cd
hevcdsp: change types of SAO parameters
...
From openhevc
Reviewed-by: Mickaël Raulet <mraulet@insa-rennes.fr>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-07-23 20:54:03 +02:00
Anton Khirnov
d7e162d46b
hevcdsp: remove an unneeded variable in the loop filter
...
beta0 and beta1 will always be the same within a CU
Signed-off-by: Mickaël Raulet <mraulet@insa-rennes.fr>
cherry picked from commit 4a23d824741a289c7d2d2f2871d1e2621b63fa1b
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-07-22 16:27:26 +02:00
Mickaël Raulet
d249e6828e
hevc/sao: optimize sao implementation
...
- adding one extra pixel all around the frame
- do not copy when SAO is not applied
5% improvement
cherry picked from commit 10fc29fc19a12c4d8168fbe1a954b76386db12d0
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-07-18 22:46:50 +02:00
Mickaël Raulet
453f8eaee2
hevc/rext: add support for Range extension tools
...
SPS features/flags:
- transform_skip_rotation_enabled_flag
- transform_skip_context_enabled_flag
- implicit_rdpcm_enabled_flag
- explicit_rdpcm_enabled_flag
- intra_smoothing_disabled_flag
- persistent_rice_adaptation_enabled_flag
PPS features/flags:
- log2_max_transform_skip_block_size
- cross_component_prediction_enabled_flag
- chroma_qp_offset_list_enabled_flag
- diff_cu_chroma_qp_offset_depth
- chroma_qp_offset_list_len_minus1
- cb_qp_offset_list
- cr_qp_offset_list
- log2_sao_offset_scale_luma
- log2_sao_offset_scale_chroma
(cherry picked from commit 005294c5b939a23099871c6130c8a7cc331f73ee)
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-07-15 14:08:20 +02:00
Mickaël Raulet
5a41999d81
hevc/rext: basic infrastructure for supporting range extension
...
- support for 4:2:2 and 4:4:4 up to 12 bits
- add a new profile for range extension
(cherry picked from commit d3c067fa65bbc871758d28aa07f54123430ca346)
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-07-15 13:47:35 +02:00
Mickaël Raulet
250430bf28
hevc: separate residual and prediction (needed for Range Extension)
...
(cherry picked from commit 6b3856ef57d66f2e59ee61fd2eb5f83b6d0d7d4a)
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-07-15 13:37:27 +02:00
Mickaël Raulet
1241eb8870
hevc: simplify SAO computation and delay it by one row
...
(cherry picked from commit f2c5f647cec786df26f442a85e6d685a131a50c9)
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-07-15 13:11:33 +02:00
plepere
92cccb7bcd
avcodec/hevc: new idct + asm
...
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-06-17 13:23:36 +02:00
plepere
7a2491c436
HEVC : added assembly MC functions
...
pretty print x86
Reviewed-by: "Ronald S. Bultje" <rsbultje@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-05-06 18:23:36 +02:00
Mickaël Raulet
83976e40e8
hevc: C code update for new motion compensation
...
pretty print C
Reviewed-by: "Ronald S. Bultje" <rsbultje@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-05-06 18:22:34 +02:00
Michael Niedermayer
5410a5dc66
Merge remote-tracking branch 'qatar/master'
...
* qatar/master:
hevc: move DSP declarations from hevc.h into hevcdsp.h
Conflicts:
libavcodec/hevc.h
libavcodec/hevcdsp.c
libavcodec/hevcdsp.h
See: c8dd048ab8
Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-12-22 12:46:19 +01:00
Guillaume Martres
7398e0516f
hevc: move DSP declarations from hevc.h into hevcdsp.h
...
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
2013-12-22 03:49:11 +01:00