The original ARM version didn't do saturation here. This probably
doesn't make any difference for performance, but it reduces the
differences from the ARM version.

Signed-off-by: Martin Storsjö <martin@martin.st>