This patch modifies HEVC mc MIPS-SIMD optimized code according to improved version of generic macros.
Overall, this patch is just upgrading the code with styling changes and will bring it in sync with MIPS-SIMD optimized latest codebase at our end.
Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
This patch moves HEVC code of uni mc cases to new file hevc_mc_uni_msa.c.
(There are total 5 sub-modules of HEVC mc functions, if we add all these modules in one single file, its size would be huge (~750k) & difficult to maintain, so splitting it in multiple files)
This patch also adds new HEVC header file libavcodec/mips/hevc_macros_msa.h
Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
This patch modifies H264 loopfilter, weighted & bi-weighted prediction MIPS-SIMD optimized code according to improved version of generic macros.
Also there are minor code alignment changes.
Overall, this patch is just upgrading the code with styling changes and will bring it in sync with MIPS-SIMD optimized latest codebase at our end.
Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
HAVE_LOONGSON is replaced by HAVE_LOONGSON3. Even Loongson-2E and 2F support
Loongson SIMD instructs but have low performance for decoding. We plan to focus
on optimizing Loongson-3A1000, 3B1500 and 3A1500, and modify the configure file
to support Loongson-2 series later by adding HAVE_LOONGSON2.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
This mainly consists of replacing all the pointer arithmatic 'addiu'
instructions with PTR_ADDIU which will handle the differences in pointer
sizes when compiled on 64 bit mips systems.
The header asmdefs.h contains the PTR_ macros which expend to the correct mips
instructions to manipulate registers containing pointers.
Signed-off-by: James Cowgill <james410@cowgill.org.uk>
Reviewed-by: Nedeljko Babic <Nedeljko.Babic@imgtec.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
There are no independant uses of mips32r2 instructions except for the
FPU parts. Due to the heavy use of mips32r2 specifc fpu extensions, I
am guessing the original author intended MIPSFPU to imply MIPS32R2 anyway.
Since these fpu instructions are available on mips64 (non-r2), enable them
there as well.
Also remove the last occurence of HAVE_MIPS32R2 (which is coupled to
HAVE_MIPSFPU anyway).
mips32r2 is left in the list of options form compatability so that using
--disable-mips32r2 doesn't break anything.
Signed-off-by: James Cowgill <james410@cowgill.org.uk>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
Removing these removes the dependency of this code on mips32r2 which would
allow it to be used on processors which have FPU instructions, but not r2
instructions (like the mips64el debian port for instance).
Signed-off-by: James Cowgill <james410@cowgill.org.uk>
Reviewed-by: Nedeljko Babic <Nedeljko.Babic@imgtec.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
On mips64, the registers t[4-7] do not exist. Instead of using a lot of #ifdef
or defines to handle differing register names, use variables and let GCC
allocate the registers automatically (like in the other mips assembly files).
In get_band_cost_ESC_mips, t4 and t5 were renamed to t6 and t7 to avoid a
variable name conflict.
Signed-off-by: James Cowgill <james410@cowgill.org.uk>
Reviewed-by: Nedeljko Babic <Nedeljko.Babic@imgtec.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
This is obviously needed for 64-bit support.
Signed-off-by: James Cowgill <james410@cowgill.org.uk>
Reviewed-by: Nedeljko Babic <Nedeljko.Babic@imgtec.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
Change register constraint on the v variable from = to +. This was causing GCC
to think that the v variable was never read and therefore not initialize it.
This fixes about 20 fate failures on mips64el.
Signed-off-by: James Cowgill <james410@cowgill.org.uk>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
The float_copy and fmul_and_reverse functions are refactored out from the
multiple copies in this file.
Signed-off-by: James Cowgill <james410@cowgill.org.uk>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
The optimized C version of this code actually runs faster than this
version, so remove it.
Signed-off-by: James Cowgill <james410@cowgill.org.uk>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
Remove some assembly that the compiler can easily handle optimally on its own.
GCC produces almost identical assembly.
Signed-off-by: James Cowgill <james410@cowgill.org.uk>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
Q_fract should have be declared as 'const float*'.
Also fix the constness of some local variables affected by this.
Signed-off-by: James Cowgill <james410@cowgill.org.uk>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
GCC is perfectly happy generating optimized multiplication code on its own for
64-bit arches. GCC refuses to optimize the loongson code when in 32-bit mode,
so I've left that.
Signed-off-by: James Cowgill <james410@cowgill.org.uk>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
Iterative implementation of 32 bit fixed point split-radix FFT.
Max FFT that can be calculated currently is 2^12.
Signed-off-by: Nedeljko Babic <nbabic@mips.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
Floating point FFT (nips optimized) breaks when hard coded tables are
not enabled because MIPS optimization of floating point FFT uses only
ff_init_ff_cos_tabs(16) which is not enabled by default in that case.
This patch is fixing it.
Signed-off-by: Nedeljko Babic <nbabic@mips.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
List of clobbered registers fixed and added where it is lacking.
Signed-off-by: Nedeljko Babic <nbabic@mips.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* qatar/master:
lavc: Move vector_fmul_window to AVFloatDSPContext
rtpdec_mpeg4: Check the remaining amount of data before reading
Conflicts:
libavcodec/dsputil.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
Add dependencies on HAVE_INLINE_ASM for files and parts of code
where it is necessary.
Signed-off-by: Nedeljko Babic <nbabic@mips.com>
Reviewed-by: Vitor Sessak <vitor1001@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* qatar/master:
mips64: mark hi/lo registers clobbered in MAC64/MLS64 macros
fate: list lavfi tests in a makefile
Conflicts:
configure
tests/Makefile
Merged-by: Michael Niedermayer <michaelni@gmx.at>
* commit 'cbcd497f384f0f8ef3f76f85b29b644b900d6b9f':
adxdec: use planar sample format
adpcmdec: use planar sample format for adpcm_thp
adpcmdec: use planar sample format for adpcm_ea_xas
adpcmdec: use planar sample format for adpcm_ea_r1/r2/r3
adpcmdec: use planar sample format for adpcm_xa
adpcmdec: use planar sample format for adpcm_ima_ws for vqa version 3
adpcmdec: use planar sample format for adpcm_4xm
adpcmdec: use planar sample format for adpcm_ima_wav
adpcmdec: use planar sample format for adpcm_ima_qt
pcmdec: use planar sample format for pcm_lxf
mace: use planar sample format
atrac1: use planar sample format
build: non-x86: Only compile mpegvideo optimizations when necessary
rtpdec_mpeg4: au_headers is a single array, simple av_free is enough
avcodec: free extended_data instead address of it
fate: Add tests of the ff_make_absolute_url function
url: Handle relative urls starting with two slashes
url: Handle relative urls being just a new query string
url: Don't treat slashes in query parameters as directory separators
Conflicts:
libavcodec/adxdec.c
libavcodec/mips/Makefile
libavcodec/pcm.c
libavcodec/utils.c
libavformat/Makefile
libavformat/utils.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>