James Almer
|
92219ef4ac
|
Merge commit '186bd30aa3b6c2b29b4dbf18278700b572068b1e'
* commit '186bd30aa3b6c2b29b4dbf18278700b572068b1e':
h264/arm64: implement missing 4:2:2 chroma loop filter neon functions
Merged-by: James Almer <jamrial@gmail.com>
|
2019-03-14 16:29:41 -03:00 |
|
Janne Grunau
|
186bd30aa3
|
h264/arm64: implement missing 4:2:2 chroma loop filter neon functions
|
2019-02-27 21:57:05 +01:00 |
|
James Almer
|
e4e04dce1f
|
Merge commit '28a8b5413b64b831dfb8650208bccd8b78360484'
* commit '28a8b5413b64b831dfb8650208bccd8b78360484':
h264/aarch64: add intra loop filter neon asm
Merged-by: James Almer <jamrial@gmail.com>
|
2019-02-20 15:42:01 -03:00 |
|
James Almer
|
4dc1f06f0c
|
Merge commit '846c3d6aca5484904e60946c4fe8b8833bc07f92'
* commit '846c3d6aca5484904e60946c4fe8b8833bc07f92':
h264/aarch64: optimize neon loop filter
Merged-by: James Almer <jamrial@gmail.com>
|
2019-02-20 15:41:03 -03:00 |
|
James Almer
|
5ca7eb36b7
|
Merge commit 'bb515e3a735f526ccb1068031e289eb5aeb69e22'
* commit 'bb515e3a735f526ccb1068031e289eb5aeb69e22':
h264/aarch64: sign extend int stride in loop filter asm
Merged-by: James Almer <jamrial@gmail.com>
|
2019-02-20 14:50:37 -03:00 |
|
Janne Grunau
|
28a8b5413b
|
h264/aarch64: add intra loop filter neon asm
Add my neon asm from x264 relicensed under the LGPL 2.1 or later. Ported
(x264 uses nv12 chroma) and optimized.
Cycle count for checkasm --bench on a Snapdragon 820e:
h264_h_loop_filter_luma_intra_8bpp_c: 60.0
h264_h_loop_filter_luma_intra_8bpp_neon: 54.2
h264_v_loop_filter_luma_intra_8bpp_c: 148.3
h264_v_loop_filter_luma_intra_8bpp_neon: 73.8
h264_h_loop_filter_chroma_intra_8bpp_c: 27.8
h264_h_loop_filter_chroma_intra_8bpp_neon: 21.4
h264_h_loop_filter_chroma_mbaff_intra_8bpp_c: 15.8
h264_h_loop_filter_chroma_mbaff_intra_8bpp_neon: 15.7
h264_v_loop_filter_chroma_intra_8bpp_c: 45.8
h264_v_loop_filter_chroma_intra_8bpp_neon: 17.3
|
2019-01-26 12:05:10 +01:00 |
|
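Reading aid for commit 28a8b5413b: the cycle counts in its message can be turned into C-to-NEON speedup ratios. The sketch below only copies the numbers listed in the message; the dictionary layout and the `speedups` helper are illustrative names, not part of FFmpeg or checkasm.

```python
# Cycle counts copied from the commit message (checkasm --bench, Snapdragon 820e).
# Each entry is (C cycles, NEON cycles).
CYCLES = {
    "h264_h_loop_filter_luma_intra_8bpp": (60.0, 54.2),
    "h264_v_loop_filter_luma_intra_8bpp": (148.3, 73.8),
    "h264_h_loop_filter_chroma_intra_8bpp": (27.8, 21.4),
    "h264_h_loop_filter_chroma_mbaff_intra_8bpp": (15.8, 15.7),
    "h264_v_loop_filter_chroma_intra_8bpp": (45.8, 17.3),
}


def speedups(cycles):
    """Return {function: c_cycles / neon_cycles}; values above 1.0 mean NEON is faster."""
    return {name: c / neon for name, (c, neon) in cycles.items()}


if __name__ == "__main__":
    # Print the functions from largest to smallest speedup.
    for name, ratio in sorted(speedups(CYCLES).items(), key=lambda kv: -kv[1]):
        print(f"{name}: {ratio:.2f}x")
```

By these numbers the vertical filters gain the most (roughly 2.0x for luma and 2.6x for chroma), while the mbaff chroma variant is close to break-even.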
Janne Grunau
|
846c3d6aca
|
h264/aarch64: optimize neon loop filter
Exit as soon as possible if no filtering will be done.
Improves the checkasm --bench cycle count on a Snapdragon 820e:
h264_h_loop_filter_luma_8bpp_c: 72.4 -> 72.5
h264_h_loop_filter_luma_8bpp_neon: 97.1 -> 56.3
h264_v_loop_filter_luma_8bpp_c: 174.0 -> 173.5
h264_v_loop_filter_luma_8bpp_neon: 62.9 -> 60.9
h264_h_loop_filter_chroma_8bpp_c: 30.2 -> 30.3
h264_h_loop_filter_chroma_8bpp_neon: 51.6 -> 25.7
h264_v_loop_filter_chroma_8bpp_c: 57.3 -> 57.3
h264_v_loop_filter_chroma_8bpp_neon: 28.0 -> 24.0
|
2019-01-26 12:05:10 +01:00 |
|
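Reading aid for commit 846c3d6aca: a scalar sketch of how an early exit for "no filtering will be done" can be decided. It assumes the standard H.264 edge conditions (alpha and beta thresholds, negative tc0 disabling a segment) and is not a transcription of the commit's NEON assembly; the function names are hypothetical.

```python
def edge_needs_filtering(p1, p0, q0, q1, alpha, beta):
    """Standard H.264 sample-level test for one position across an edge."""
    return abs(p0 - q0) < alpha and abs(p1 - p0) < beta and abs(q1 - q0) < beta


def can_skip_edge(samples, alpha, beta, tc0):
    """Return True when no sample along the edge would be modified.

    samples: list of (p1, p0, q0, q1) tuples, one per position along the edge.
    tc0: per-segment clipping values; a negative value disables that segment.
    """
    # With a zero threshold the comparisons above can never succeed.
    if alpha == 0 or beta == 0:
        return True
    # Every segment is disabled.
    if all(t < 0 for t in tc0):
        return True
    # No position passes the threshold test, so the filter would change nothing.
    return not any(edge_needs_filtering(*s, alpha, beta) for s in samples)


if __name__ == "__main__":
    flat_edge = [(10, 10, 14, 14)] * 4    # small step: filtering applies
    sharp_edge = [(10, 10, 200, 200)] * 4  # real image edge: left untouched
    print(can_skip_edge(flat_edge, 20, 5, [0, 0, 0, 0]))
    print(can_skip_edge(sharp_edge, 20, 5, [0, 0, 0, 0]))
```

The check is conservative: it returns True only when skipping is safe, so a caller that skips on True never changes the output.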
Janne Grunau
|
bb515e3a73
|
h264/aarch64: sign extend int stride in loop filter asm
|
2019-01-26 12:05:10 +01:00 |
|
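Reading aid for commit bb515e3a73: why a 32-bit int stride needs sign extension before 64-bit pointer arithmetic. On AArch64 the upper 32 bits of a register holding a 32-bit argument are unspecified under the procedure call standard, so a negative stride used as a 64-bit value without widening can point far away from the intended row. The sketch simulates this with plain integers; the base address and stride are made-up values.

```python
MASK64 = (1 << 64) - 1


def zero_extend32(value):
    """Low 32 bits with a zero upper half (one possible state of an unextended register)."""
    return value & 0xFFFFFFFF


def sign_extend32(value):
    """Interpret the low 32 bits as a signed int and widen it."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value & 0x80000000 else value


def next_row(base, stride_64):
    """64-bit wrapping pointer arithmetic: base + stride."""
    return (base + stride_64) & MASK64


if __name__ == "__main__":
    base = 0x10000            # hypothetical row address
    raw = -64 & 0xFFFFFFFF    # 32-bit two's complement encoding of a stride of -64
    print(hex(next_row(base, sign_extend32(raw))))  # previous row, as intended
    print(hex(next_row(base, zero_extend32(raw))))  # about 4 GiB past the buffer
```

For non-negative strides both extensions agree, which is why the problem only shows up with negative strides or with stale upper bits.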
Michael Niedermayer
|
92d07ea4b5
|
Merge commit 'f896bca03fc63b93851c1c14c9321c20b3cd44a6'
* commit 'f896bca03fc63b93851c1c14c9321c20b3cd44a6':
aarch64: h264 (bi)weight NEON optimizations
Merged-by: Michael Niedermayer <michaelni@gmx.at>
|
2014-01-15 15:36:37 +01:00 |
|
Michael Niedermayer
|
bf0470a5be
|
Merge commit '36e3b1f2fd262028834a9d7b1eb533c1218ee6c2'
* commit '36e3b1f2fd262028834a9d7b1eb533c1218ee6c2':
aarch64: h264 loop filter NEON optimizations
Merged-by: Michael Niedermayer <michaelni@gmx.at>
|
2014-01-15 15:27:26 +01:00 |
|
Janne Grunau
|
f896bca03f
|
aarch64: h264 (bi)weight NEON optimizations
Ported from ARMv7 NEON.
|
2014-01-15 12:31:07 +01:00 |
|
Janne Grunau
|
36e3b1f2fd
|
aarch64: h264 loop filter NEON optimizations
Ported from ARMv7 NEON.
|
2014-01-15 12:31:04 +01:00 |
|