Modelled after the aarch64 code.
On Cortex-A8, the s16 and s32 code is about 2x faster and the
float code is about 7x faster.
Signed-off-by: Peter Meerwald <pmeerw@pmeerw.net>
Signed-off-by: Martin Storsjö <martin@martin.st>
The correct "next" input sample is not the first sample of the
resampling buffer, but the center sample of the filter_length-sized
block at the beginning.
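A minimal sketch of the indexing described above, not the actual patch.
center_sample_index() is a hypothetical helper that does not exist in
libavresample, and picking the lower of the two middle samples for an even
filter_length is an assumption made only for illustration:

```c
#include <assert.h>

/* Hypothetical helper (not a libavresample function): index of the center
 * sample of the filter_length-sized block that starts at index 0 of the
 * resampling buffer.
 * Assumption: for an even filter_length, the lower middle sample is used. */
static int center_sample_index(int filter_length)
{
    assert(filter_length > 0);
    return (filter_length - 1) / 2;
}
```

With a filter_length of 16, for example, the "next" input sample under this
assumption is at index 7 of the buffer rather than index 0.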
CC:libav-stable@libav.org
It adds unnecessary complication for an insignificant usability improvement.
The user really should know if they'll need resampling compensation before
opening the context.
Note that only the documentation has changed. The current functionality will
still work until the next major bump.
Based partially on implementation by Michael Niedermayer <michaelni@gmx.at> in
libswresample in FFmpeg. See commits:
7f1ae79d38c4edba9dbd31d7bf797e525298ac55
24ab1abfb6d55bf330022df4b10d7aec80b3f116