Treat the 32 bit stride registers as signed.
Alternatively, we could make the stride arguments ptrdiff_t instead
of int, and changing all of the assembly to operate on these
registers with their full 64 bit width, but that's a much larger
and more intrusive change (and risks missing some operation, which
would clamp the intermediates to 32 bit still).
Fixes: https://trac.ffmpeg.org/ticket/9985
Signed-off-by: Martin Storsjö <martin@martin.st>
We return 0 for this particular architecture but should instead be
returning the number of lines.
Fixes users who check the return value matches what they expect.
y_offset and y_coeff being successive 32-bit integers, they are packed
into 8 bytes instead of 2x8 bytes.
See https://developer.apple.com/library/ios/documentation/Xcode/Conceptual/iPhoneOSABIReference/Articles/ARM64FunctionCallingConventions.html
> iOS diverges from Procedure Call Standard for the ARM 64-bit
> Architecture in several ways
[...]
> In the generic procedure call standard, all function arguments passed
> on the stack consume slots in multiples of 8 bytes. In iOS, this
> requirement is dropped, and values consume only the space required.
[...]
> Padding is still inserted on the stack to satisfy arguments’ alignment
> requirements.