FFmpeg

mirror of https://github.com/FFmpeg/FFmpeg.git synced 2024-12-23 12:43:46 +02:00

Author	SHA1	Message	Date
Derek Buitenhuis	87b8e95008	Merge commit 'cdb1665f70def544ddab3e3ed3763ef99c8b3873' * commit 'cdb1665f70def544ddab3e3ed3763ef99c8b3873': aarch64: Make transpose_4x4H do a regular transpose Merged-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>	2016-04-24 12:51:42 +01:00
Martin Storsjö	cdb1665f70	aarch64: Make transpose_4x4H do a regular transpose Previously, ff_h264_idct_add_neon (originally in the arm version) used a non-regular transpose in order to be able to use more instructions that deal with registers as 128 bit register pairs. The aarch64 translation doesn't do it to the same extent, but brought along the same structure since it was a straight translation. This reshuffles ff_h264_idct_add_neon, bringing it closer to the C implementation, making the transpose_4x4H macro do a regular transpose, usable for other algorithms as well. Previously, the third and fourth output from transpose_4x4H were swapped, and prior to `cc29d96d5a`, the same inputs as well. In addition to just swapping the outputs, also renumber the intermediate registers for better readability (making the register order match transpose_4x8B). This runs with the same number of cycles as before. Signed-off-by: Martin Storsjö <martin@martin.st>	2016-03-26 21:25:56 +02:00
Janne Grunau	cc29d96d5a	arm64: fix inverted register order in transpose_4x4H Fix related register order issue in ff_h264_idct_add_neon. Found-by: zjh8890 <243186085@qq.com>	2015-12-21 13:44:20 +01:00
Janne Grunau	2dba0407fd	avcodec/arm64: fix inverted register order in transpose_4x4H Fix related register order issue in ff_h264_idct_add_neon. Found-by: zjh8890 <243186085@qq.com> Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	2015-12-19 03:58:46 +01:00
Michael Niedermayer	95b59bfb9d	Revert "avcodec/aarch64/neon.S: Update neon.s for transpose_4x4H" The change was not correct and broke H264 This reverts commit `cd83f899c9`.	2015-12-17 21:26:37 +01:00
zjh8890	c18176bd55	avcodec/aarch64/neon.S: Update neon.s for transpose_4x4H The transpose_4x4H macro is wrong: the order of r2 and r3 is inverted. Finding this bug cost me a lot of time while writing aarch64 ARM assembly that used it.	2015-12-12 14:20:01 +01:00
Michael Niedermayer	bf0470a5be	Merge commit '36e3b1f2fd262028834a9d7b1eb533c1218ee6c2' * commit '36e3b1f2fd262028834a9d7b1eb533c1218ee6c2': aarch64: h264 loop filter NEON optimizations Merged-by: Michael Niedermayer <michaelni@gmx.at>	2014-01-15 15:27:26 +01:00
Michael Niedermayer	19fc3c0122	Merge commit 'd5dd8c7bf0f0d77c581db3236e0d938f06fd5591' * commit 'd5dd8c7bf0f0d77c581db3236e0d938f06fd5591': aarch64: h264 qpel NEON optimizations Merged-by: Michael Niedermayer <michaelni@gmx.at>	2014-01-15 15:13:41 +01:00
Michael Niedermayer	fb1c786a9d	Merge commit '8438b3f09f6b225d0886cc385117c38eb44ca0c1' * commit '8438b3f09f6b225d0886cc385117c38eb44ca0c1': aarch64: h264 idct NEON assembler optimizations Merged-by: Michael Niedermayer <michaelni@gmx.at>	2014-01-15 15:06:47 +01:00
Janne Grunau	36e3b1f2fd	aarch64: h264 loop filter NEON optimizations Ported from ARMv7 NEON.	2014-01-15 12:31:04 +01:00
Janne Grunau	d5dd8c7bf0	aarch64: h264 qpel NEON optimizations Ported from ARMv7 NEON.	2014-01-15 12:17:49 +01:00
Janne Grunau	8438b3f09f	aarch64: h264 idct NEON assembler optimizations Ported from ARMv7 NEON.	2014-01-15 12:13:41 +01:00

12 Commits
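
The message of cdb1665f70 distinguishes a "regular" transpose from the earlier behaviour, in which the third and fourth outputs of transpose_4x4H were swapped. The following is a minimal C sketch of that difference, not FFmpeg's code: the real macro is aarch64 NEON assembly operating on vector registers, and the function names, the row-array model, and the `row_map` table are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Regular 4x4 transpose of 16-bit elements: out[c][r] = in[r][c].
 * Hypothetical scalar model; each row stands in for one vector register. */
static void transpose_4x4_regular(int16_t out[4][4], const int16_t in[4][4])
{
    for (int r = 0; r < 4; r++)
        for (int c = 0; c < 4; c++)
            out[c][r] = in[r][c];
}

/* Assumed model of the older behaviour as described in the commit message:
 * the same transpose, but the third and fourth output rows are exchanged. */
static void transpose_4x4_swapped_outputs(int16_t out[4][4],
                                          const int16_t in[4][4])
{
    static const int row_map[4] = { 0, 1, 3, 2 }; /* outputs 2 and 3 swapped */

    for (int r = 0; r < 4; r++)
        for (int c = 0; c < 4; c++)
            out[row_map[c]][r] = in[r][c];
}

/* Self-check: the two variants agree on outputs 0 and 1 and differ only by
 * an exchange of outputs 2 and 3. */
static void transpose_self_check(void)
{
    const int16_t in[4][4] = {
        {  0,  1,  2,  3 },
        {  4,  5,  6,  7 },
        {  8,  9, 10, 11 },
        { 12, 13, 14, 15 },
    };
    int16_t reg[4][4], swp[4][4];

    transpose_4x4_regular(reg, in);
    transpose_4x4_swapped_outputs(swp, in);

    for (int i = 0; i < 4; i++) {
        assert(reg[0][i] == swp[0][i]);
        assert(reg[1][i] == swp[1][i]);
        assert(reg[2][i] == swp[3][i]);
        assert(reg[3][i] == swp[2][i]);
    }
}
```

Calling `transpose_self_check()` from a test harness exercises both variants. A caller written against the swapped variant has to compensate for the exchanged rows, which is why, per the commit message, a regular transpose is usable for other algorithms as well.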