You've already forked FFmpeg
mirror of
https://github.com/FFmpeg/FFmpeg.git
synced 2025-11-29 05:57:37 +02:00
Since psadbw only exists for 8-bits, we have to emulate it for 16-bit inputs. The simplest sequence is to use a normal subtraction, which is safe as long as the inputs do not exceed 32767 - so limit this implementation to 15-bit inputs and below. For 16-bit inputs, we could in theory instead use a pminw / pmaxw to ensure the resulting difference does not overflow, but this is slower, and also breaks the subsequent use of pmaddwd, so I opted to skip 16-bit SIMD for now. scene_sad10_c: 114175.6 ( 1.00x) scene_sad10_avx2: 9617.7 (11.87x) scene_sad10_avx512: 5208.8 (21.92x) scene_sad12_c: 114537.8 ( 1.00x) scene_sad12_avx2: 9614.0 (11.91x) scene_sad12_avx512: 5186.3 (22.08x) scene_sad14_c: 114113.9 ( 1.00x) scene_sad14_avx2: 9612.9 (11.87x) scene_sad14_avx512: 5186.0 (22.00x) scene_sad15_c: 114108.9 ( 1.00x) scene_sad15_avx2: 9612.3 (11.87x) scene_sad15_avx512: 5186.4 (22.00x) scene_sad16_c: 114136.0 ( 1.00x)