Martin Storsjö
24b93022fe
aarch64: Disable ff_hevc_sao_band_filter_8x8_8_neon out of precaution
...
While this function on its own passes all of fate-hevc, there's
indications that the function might need to handle widths that
aren't a multiple of 8 (noted in commit
f63f9be37c
, which later was
reverted).
Signed-off-by: Martin Storsjö <martin@martin.st>
2022-01-07 22:33:27 +02:00
Martin Storsjö
16fba44b4d
Revert "lavc/aarch64: add hevc sao edge 16x16"
...
This reverts commit a9214a2ca3
, as
it breaks fate-hevc.
Signed-off-by: Martin Storsjö <martin@martin.st>
2022-01-07 22:33:23 +02:00
Martin Storsjö
df48b1d06f
Revert "lavc/aarch64: add hevc sao edge 8x8"
...
This reverts commit c97ffc1a77
, as
it breaks fate-hevc.
Signed-off-by: Martin Storsjö <martin@martin.st>
2022-01-07 22:33:19 +02:00
Martin Storsjö
cafed377eb
Revert "lavc/aarch64: add hevc sao band 8x8 tiling"
...
This reverts commit f63f9be37c
, as
it breaks fate-hevc.
Signed-off-by: Martin Storsjö <martin@martin.st>
2022-01-07 22:33:14 +02:00
J. Dekker
f63f9be37c
lavc/aarch64: add hevc sao band 8x8 tiling
...
bench on AWS Graviton:
hevc_sao_band_8x8_8_c: 317.5
hevc_sao_band_8x8_8_neon: 97.5
hevc_sao_band_16x16_8_c: 1115.0
hevc_sao_band_16x16_8_neon: 322.7
hevc_sao_band_32x32_8_c: 4599.2
hevc_sao_band_32x32_8_neon: 1246.2
hevc_sao_band_48x48_8_c: 10021.7
hevc_sao_band_48x48_8_neon: 2740.5
hevc_sao_band_64x64_8_c: 17635.0
hevc_sao_band_64x64_8_neon: 4875.7
Signed-off-by: J. Dekker <jdek@itanimul.li>
2022-01-04 14:32:26 +01:00
J. Dekker
c97ffc1a77
lavc/aarch64: add hevc sao edge 8x8
...
bench on AWS Graviton:
hevc_sao_edge_8x8_8_c: 516.0
hevc_sao_edge_8x8_8_neon: 81.0
Signed-off-by: J. Dekker <jdek@itanimul.li>
2022-01-04 14:31:56 +01:00
J. Dekker
a9214a2ca3
lavc/aarch64: add hevc sao edge 16x16
...
bench on AWS Graviton:
hevc_sao_edge_16x16_8_c: 1857.0
hevc_sao_edge_16x16_8_neon: 211.0
hevc_sao_edge_32x32_8_c: 7802.2
hevc_sao_edge_32x32_8_neon: 808.2
hevc_sao_edge_48x48_8_c: 16764.2
hevc_sao_edge_48x48_8_neon: 1796.5
hevc_sao_edge_64x64_8_c: 32647.5
hevc_sao_edge_64x64_8_neon: 3118.5
Signed-off-by: J. Dekker <jdek@itanimul.li>
2022-01-04 14:31:56 +01:00
Josh Dekker
7ac41e0db2
lavc/aarch64: add HEVC sao_band NEON
...
Only works for 8x8.
Signed-off-by: Josh Dekker <josh@itanimul.li>
2021-02-18 14:12:01 +01:00
Josh Dekker
75c2ddfa61
lavc/aarch64: add HEVC idct_dc NEON
...
Signed-off-by: Josh Dekker <josh@itanimul.li>
2021-02-18 14:12:01 +01:00
Reimar Döffinger
00c916ef61
lavc/aarch64: port HEVC add_residual NEON
...
Speedup is fairly small, around 1.5%, but these are fairly simple.
Signed-off-by: Josh Dekker <josh@itanimul.li>
2021-02-18 14:11:57 +01:00
Reimar Döffinger
30f80d855b
lavc/aarch64: port HEVC SIMD idct NEON
...
Makes SIMD-optimized 8x8 and 16x16 idcts for 8 and 10 bit depth
available on aarch64.
For a UHD HDR (10 bit) sample video these were consuming the most time
and this optimization reduced overall decode time from 19.4s to 16.4s,
approximately 15% speedup.
Test sample was the first 300 frames of "LG 4K HDR Demo - New York.ts",
running on Apple M1.
Signed-off-by: Josh Dekker <josh@itanimul.li>
2021-02-18 14:11:53 +01:00