FFmpeg

mirror of https://github.com/FFmpeg/FFmpeg.git synced 2024-12-02 03:06:28 +02:00

Author	SHA1	Message	Date
gxw	464d28c070	avcodec/mips: Refine ff_h264_h_lpf_luma_inter_msa Using mask to avoid judgment, H264 4K decoding speed improved about 0.1fps tested on 3A4000 Signed-off-by: Shiyou Yin <yinshiyou-hf@loongson.cn> Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	2021-05-07 17:53:23 +02:00
Shiyou Yin	12614a589f	avcodec/mips: fix type mismatch in h264dsp_msa.c gcc warning: assignment from incompatible pointer type. Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	2020-07-19 10:59:43 +02:00
gxw	92fc0bfa54	avutil/mips: refactor msa SLDI_Bn_0 and SLDI_Bn macros. Changing details as following: 1. The previous order of parameters are irregular and difficult to understand. Adjust the order of the parameters according to the rule: (RTYPE, input registers, input mask/input index/..., output registers). Most of the existing msa macros follow the rule. 2. Remove the redundant macro SLDI_Bn_0 and use SLDI_Bn instead. Reviewed-by: Shiyou Yin <yinshiyou-hf@loongson.cn> Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	2019-09-16 00:04:18 +02:00
gxw	a3e572d96f	avutil/mips: refine msa macros CLIP_*. Changing details as following: 1. Remove the local variable 'out_m' in 'CLIP_SH' and store the result in source vector. 2. Refine the implementation of macro 'CLIP_SH_0_255' and 'CLIP_SW_0_255'. Performance of VP8 decoding has speed up about 1.1%(from 7.03x to 7.11x). Performance of H264 decoding has speed up about 0.5%(from 4.35x to 4.37x). Performance of Theora decoding has speed up about 0.7%(from 5.79x to 5.83x). 3. Remove redundant macro 'CLIP_SH/Wn_0_255_MAX_SATU' and use 'CLIP_SH/Wn_0_255' instead, because there are no difference in the effect of this two macros. Reviewed-by: Shiyou Yin <yinshiyou-hf@loongson.cn> Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	2019-08-13 16:48:38 +02:00
Shiyou Yin	153c607525	avutil/mips: refactor msa load and store macros. Replace STnxm_UB and LDnxm_SH with new macros ST_{H/W/D}{1/2/4/8}. The old macros are difficult to use because they don't follow the same parameter passing rules. Changing details as following: 1. remove LD4x4_SH. 2. replace ST2x4_UB with ST_H4. 3. replace ST4x2_UB with ST_W2. 4. replace ST4x4_UB with ST_W4. 5. replace ST4x8_UB with ST_W8. 6. replace ST6x4_UB with ST_W2 and ST_H2. 7. replace ST8x1_UB with ST_D1. 8. replace ST8x2_UB with ST_D2. 9. replace ST8x4_UB with ST_D4. 10. replace ST8x8_UB with ST_D8. 11. replace ST12x4_UB with ST_D4 and ST_W4. Examples of new macro: ST_H4(in, idx0, idx1, idx2, idx3, pdst, stride) ST_H4 store four half-word elements in vector 'in' to pdst with stride. About the macro name: 1) 'ST' means store operation. 2) 'H/W/D' means type of vector element is 'half-word/word/double-word'. 3) Number '1/2/4/8' means how many elements will be stored. About the macro parameter: 1) 'in0, in1...' 128-bits vector. 2) 'idx0, idx1...' elements index. 3) 'pdst' destination pointer to store to 4) 'stride' stride of each store operation. Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	2019-07-19 01:23:23 +02:00
Kaustubh Raste	af9433b1d6	avcodec/mips: Improve avc bi-weighted mc msa functions Replace generic with block size specific function. Signed-off-by: Kaustubh Raste <kaustubh.raste@imgtec.com> Reviewed-by: Manojkumar Bhosale <Manojkumar.Bhosale@imgtec.com> Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	2017-10-10 23:58:41 +02:00
Kaustubh Raste	10ab5534e0	avcodec/mips: Improve avc weighted mc msa functions Replace generic with block size specific function. Signed-off-by: Kaustubh Raste <kaustubh.raste@imgtec.com> Reviewed-by: Manojkumar Bhosale <Manojkumar.Bhosale@imgtec.com> Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	2017-09-27 21:15:57 +02:00
Kaustubh Raste	bba9c1c6bb	avcodec/mips: Reduced conditional cases in avc inter lpf msa functions Signed-off-by: Kaustubh Raste <kaustubh.raste@imgtec.com> Reviewed-by: Manojkumar Bhosale <Manojkumar.Bhosale@imgtec.com> Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	2017-09-21 03:02:24 +02:00
Kaustubh Raste	e5a650e141	avcodec/mips: Improve avc lpf msa functions Optimize luma intra case by reducing conditional cases. Signed-off-by: Kaustubh Raste <kaustubh.raste@imgtec.com> Reviewed-by: Manojkumar Bhosale <Manojkumar.Bhosale@imgtec.com> Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	2017-09-15 22:19:07 +02:00
Vicente Olivert Riera	04b0792e4a	libavcodec/mips/h264dsp_msa.c: fix type in some function parameters This fixes a build problem for MIPS architecture that looks like this: libavcodec/mips/h264dsp_msa.c:2498:6: error: conflicting types for ‘ff_weight_h264_pixels16_8_msa’ void ff_weight_h264_pixels16_8_msa(uint8_t *src, int stride, This bug was introduced by commit `bc26fe8927`: avcodec/h264: Use ptrdiff_t for (bi)weight functions That commit changed the data type of some function parameters in some function definitions. However, the implementation of those functions in libavcodec/mips/h264dsp_msa.c wasn't changed accordingly. Signed-off-by: Vicente Olivert Riera <Vincent.Riera@imgtec.com> Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	2016-10-13 19:15:48 +02:00
Shivraj Patil	bcd7bf7eeb	avcodec/mips: Restructure as per avutil/mips/generic_macros_msa.h This patch modifies H264 loopfilter, weighted & bi-weighted prediction MIPS-SIMD optimized code according to improved version of generic macros. Also there are minor code alignment changes. Overall, this patch is just upgrading the code with styling changes and will bring it in sync with MIPS-SIMD optimized latest codebase at our end. Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com> Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2015-05-28 11:57:11 +02:00
Shivraj Patil	02001ada5c	avcodec/mips: MSA (MIPS-SIMD-Arch) optimizations for H264 lpf and weight/biweight functions Reviewed-by: Nedeljko Babic <Nedeljko.Babic@imgtec.com> Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2015-05-01 04:19:18 +02:00

12 Commits
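
Commit 153c607525 describes ST_H4(in, idx0, idx1, idx2, idx3, pdst, stride) as storing four half-word elements of the vector 'in' to pdst with stride. The sketch below is a portable C illustration of that behaviour as read from the commit message, not the actual macro from avutil/mips/generic_macros_msa.h. The type vec8h and the functions st_h1 and st_h4 are hypothetical names invented for this example, and it assumes that pdst is a byte pointer and that stride is counted in bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a 128-bit MSA register viewed as
 * eight 16-bit (half-word) elements. Not an FFmpeg type. */
typedef struct {
    uint16_t h[8];
} vec8h;

/* Store one half-word element of 'in', selected by 'idx', at pdst.
 * memcpy is used so that pdst does not need to be 2-byte aligned. */
static void st_h1(vec8h in, int idx, uint8_t *pdst)
{
    assert(idx >= 0 && idx < 8);
    memcpy(pdst, &in.h[idx], sizeof(uint16_t));
}

/* Store four half-word elements of 'in' (idx0..idx3) to four
 * destinations that are 'stride' bytes apart, starting at pdst.
 * Assumption: stride is in bytes, as is usual for image rows. */
static void st_h4(vec8h in, int idx0, int idx1, int idx2, int idx3,
                  uint8_t *pdst, ptrdiff_t stride)
{
    st_h1(in, idx0, pdst);
    st_h1(in, idx1, pdst + stride);
    st_h1(in, idx2, pdst + 2 * stride);
    st_h1(in, idx3, pdst + 3 * stride);
}
```

Under these assumptions, calling st_h4(v, 0, 2, 4, 6, buf, 4) writes elements 0, 2, 4 and 6 of v to byte offsets 0, 4, 8 and 12 of buf and leaves the bytes in between untouched. Keeping the parameter order (input vector, element indices, destination, stride) matches the ordering rule stated in commit 92fc0bfa54.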