1
0
mirror of https://github.com/FFmpeg/FFmpeg.git synced 2024-11-26 19:01:44 +02:00
Commit Graph

252 Commits

Author SHA1 Message Date
Shiyou Yin
ba175578d1 avcodec: [loongson] optimize get_cabac_inline.
This optimization improved h264 decoding performance about 4%(from 74fps to 77fps, tested on loongson 3A3000).

Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2018-09-19 18:33:59 +02:00
Shiyou Yin
2b646dac78 avcodec/mips: [loongson] refine ff_vc1_inv_trans_8x8_mmi.
Combined 1st and 2nd loop into one inline asm in function ff_vc1_inv_trans_8x8_mmi to
reduce memory operation, and made some small optimization in ff_vc1_inv_trans_4x8_mmi.

Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2018-09-19 00:44:44 +02:00
Shiyou Yin
a55adf24b9 avcodec/mips: [loongson] fix bug of svq3-watermark failed in fate test.
Failed case: svq3-watermark
When minimum loop count of following functions are greater than parameter h passed to them, svq3-watermark failed.
1. ff_put_pixels4_8_mmi
2. ff_avg_pixels4_8_mmi
3. ff_put_pixels4_l2_8_mmi
4. ff_avg_pixels4_l2_8_mmi

Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2018-09-14 01:52:29 +02:00
Shiyou Yin
5161f7bcfd avutil/mips: [loongson] simplify macro TRANSPOSE_4H and TRANSPOSE_8B
Simplify macro TRANSPOSE_4H in mmiutils.h and add TRANSPOSE_8B as a common macro.

Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2018-09-09 12:01:07 +02:00
gxw
090647da84 avcodec/mips: [loongson] optimize vp8 decoding in vp8dsp.
Optimize vp8 loop filter with mmi, four functions optimized:
1. ff_vp8_h_loop_filter8uv_mmi.
2. ff_vp8_v_loop_filter8uv_mmi.
3. ff_vp8_h_loop_filter16_mmi.
4. ff_vp8_v_loop_filter16_mmi.

Vp8 decoding speed improved about 50%(from 73fps to 110fps, Tested on loongson 3A3000).

Signed-off-by: Shiyou Yin <yinshiyou-hf@loongson.cn>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2018-09-09 12:01:07 +02:00
Shiyou Yin
9f60c58586 avcodec/mips: [loongson] fix improper use of register constraints.
Constraint "g" means compiler can store variable in memory or register.
When we use constraint "g" for a variable and this variable was operated by
instruction which only support register operands may lead "invalid operands" error.

Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2018-09-07 20:35:54 +02:00
Shiyou Yin
776909e42e avcodec/mips: [loongson] reoptimize put and add pixels clamped functions.
Simplify the usage of intermediate variable addr and remove unused variable all64
in following functions:
1. ff_put_pixels_clamped_mmi
2. ff_put_signed_pixels_clamped_mmi
3. ff_add_pixels_clamped_mmi

This optimization speed up mpeg4 decode about 2% on loongson platform(tested with 3A3000).

Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2018-09-05 21:45:52 +02:00
Shiyou Yin
17c635e605 avcodec/mips: [loongson] simplify the usage of intermediate variable addr.
Simplify the usage of intermediate variable addr in following functions:
1. ff_put_pixels4_8_mmi
2. ff_put_pixels8_8_mmi
3. ff_put_pixels16_8_mmi
4. ff_avg_pixels16_8_mmi.

Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2018-09-04 21:47:56 +02:00
Shiyou Yin
61eeb40a62 avcodec: [loongson] fix bug of mss2-wmv failed in fate test.
Failed case: mss2-wmv
In following functions, pmullh was used to multiply two 16-bit data, this will cause data overflow.
1. ff_vc1_inv_trans_8x8_dc_mmi
2. ff_vc1_inv_trans_8x8_mmi
3. ff_vc1_inv_trans_8x4_mmi
4. ff_vc1_inv_trans_4x8_mmi
5. ff_vc1_inv_trans_4x4_mmi

Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2018-09-04 02:30:51 +02:00
Shiyou Yin
93b35a0555 avcodec/mips: [loongson] optimize memset in h264dsp.
Optimized memset with mmi in following functions:
1. ff_h264_add_pixels4_8_mmi.
2. ff_h264_idct_add_8_mmi.
3. ff_h264_idct8_add_8_mmi.

This optimization improved h264 decoding performance about 1.3%(tested on loongson 3A3000).

Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2018-09-02 03:37:32 +02:00
Shiyou Yin
f91237baf6 avcodec/mips: [loongson] reoptimize h264_chroma_mc8_mmi v2.
Reoptimize function ff_put_h264_chroma_mc8_mmi and ff_avg_h264_chroma_mc8_mmi.
Performance of h264 decoding improved about 5%(from 69fps to 73fps, tested on loongson 3A3000).

Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2018-09-02 03:37:32 +02:00
Shiyou Yin
df13b75aa1 avcodec/mips: [loongson] reoptimize simple idct with mmi.
Performance of mpeg4 decoding improved about 23%(from 128fps to 158fps, tested on loongson 3A3000).
Reoptimized following functions with mmi.
1. ff_simple_idct_put_8_mmi
2. ff_simple_idct_add_8_mmi
3. ff_simple_idct_8_mmi

Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2018-09-02 03:37:32 +02:00
Shiyou Yin
c0b42987a2 avcodec/mips: fix conflicting types error of ff_vc1_h_s_overlap_mmi.
In commit 975a1a8,function ff_vc1_h_s_overlap_mmi was refactored,
but the declaration in libavcodec/mips/vc1dsp_mips.h was unchanged.

Change-Id: I90beae683511622a0cc1130ab1660ac8669ec3ef
Signed-off-by: Shiyou Yin <yinshiyou-hf@loongson.cn>
Reviewed-by: Jerome Borsboom <jerome.borsboom@carpalis.nl>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2018-07-14 18:02:26 +02:00
Jerome Borsboom
975a1a81b2 avcodec/vc1: fix overlap filter for frame interlaced pictures
The overlap filter is not correct for vertical edges in frame interlaced
I and P pictures. When filtering macroblocks with different FIELDTX values,
we have to match the lines at both sides of the vertical border. In addition,
we have to use the correct rounding values, depending on the line we are
filtering.

Signed-off-by: Jerome Borsboom <jerome.borsboom@carpalis.nl>
2018-06-29 01:18:44 +02:00
Kaustubh Raste
143fc5f6e2 avcodec/mips: Improve hevc non-uni hz and vt mc msa functions
Use mask buffer.

Signed-off-by: Kaustubh Raste <kaustubh.raste@imgtec.com>
Reviewed-by: Manojkumar Bhosale <Manojkumar.Bhosale@imgtec.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2017-11-14 20:48:36 +01:00
Kaustubh Raste
e5f66a9ea4 avcodec/mips: cleanup unused macros
Signed-off-by: Kaustubh Raste <kaustubh.raste@imgtec.com>
Reviewed-by: Manojkumar Bhosale <Manojkumar.Bhosale@imgtec.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2017-11-14 20:48:36 +01:00
Kaustubh Raste
372a4dda33 avcodec/mips: Improve hevc non-uni hv mc msa functions
Use mask buffer.

Signed-off-by: Kaustubh Raste <kaustubh.raste@imgtec.com>
Reviewed-by: Manojkumar Bhosale <Manojkumar.Bhosale@imgtec.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2017-11-08 20:58:56 +01:00
Kaustubh Raste
4fba8728e8 avcodec/mips: Improve hevc uni weighted 4 tap vt mc msa functions
Use global mask buffer for appropriate mask load.
Use immediate unsigned saturation for clip to max saving one vector register.
Remove unused macro.

Signed-off-by: Kaustubh Raste <kaustubh.raste@imgtec.com>
Reviewed-by: Manojkumar Bhosale <Manojkumar.Bhosale@imgtec.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2017-11-08 20:58:56 +01:00
Kaustubh Raste
350721e9fd avcodec/mips: Improve hevc uni 4 tap hv mc msa functions
Use global mask buffer for appropriate mask load.
Remove unused macro and table.

Signed-off-by: Kaustubh Raste <kaustubh.raste@imgtec.com>
Reviewed-by: Manojkumar Bhosale <Manojkumar.Bhosale@imgtec.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2017-11-08 20:58:56 +01:00
Kaustubh Raste
4878854eef avcodec/mips: Improve hevc bi wgt 4 tap hv mc msa functions
Use global mask buffer for appropriate mask load.
Use immediate unsigned saturation for clip to max saving one vector register.

Signed-off-by: Kaustubh Raste <kaustubh.raste@imgtec.com>
Reviewed-by: Manojkumar Bhosale <Manojkumar.Bhosale@imgtec.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2017-11-08 20:58:56 +01:00
Kaustubh Raste
991eca0f87 avcodec/mips: Improve hevc bi 4 tap hv mc msa functions
Use global mask buffer for appropriate mask load.
Use immediate unsigned saturation for clip to max saving one vector register.

Signed-off-by: Kaustubh Raste <kaustubh.raste@imgtec.com>
Reviewed-by: Manojkumar Bhosale <Manojkumar.Bhosale@imgtec.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2017-11-07 04:24:06 +01:00
Kaustubh Raste
ecd69cade8 avcodec/mips: Improve avc avg mc 10, 30, 01 and 03 msa functions
Align the mask buffer to 64 bytes.
Load the specific destination bytes instead of MSA load and pack.
Remove unused macros and functions.

Signed-off-by: Kaustubh Raste <kaustubh.raste@imgtec.com>
Reviewed-by: Manojkumar Bhosale <Manojkumar.Bhosale@imgtec.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2017-11-07 04:24:06 +01:00
Kaustubh Raste
b9cd26f556 avcodec/mips: Improve hevc uni weighted 4 tap hz mc msa functions
Use global mask buffer for appropriate mask load.
Use immediate unsigned saturation for clip to max saving one vector register.
Remove unused macro.

Signed-off-by: Kaustubh Raste <kaustubh.raste@imgtec.com>
Reviewed-by: Manojkumar Bhosale <Manojkumar.Bhosale@imgtec.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2017-11-05 03:12:35 +01:00
Kaustubh Raste
1e7e9fbb03 avcodec/mips: Improve hevc uni 4 tap hz and vt mc msa functions
Use global mask buffer for appropriate mask load.

Signed-off-by: Kaustubh Raste <kaustubh.raste@imgtec.com>
Reviewed-by: Manojkumar Bhosale <Manojkumar.Bhosale@imgtec.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2017-11-05 03:12:35 +01:00
Kaustubh Raste
8f6c398d44 avcodec/mips: Improve hevc bi wgt 4 tap hz and vt mc msa functions
Use global mask buffer for appropriate mask load.

Signed-off-by: Kaustubh Raste <kaustubh.raste@imgtec.com>
Reviewed-by: Manojkumar Bhosale <Manojkumar.Bhosale@imgtec.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2017-11-04 05:05:03 +01:00
Kaustubh Raste
e1555eb76c avcodec/mips: Improve hevc bi 4 tap hz and vt mc msa functions
Use global mask buffer for appropriate mask load.

Signed-off-by: Kaustubh Raste <kaustubh.raste@imgtec.com>
Reviewed-by: Manojkumar Bhosale <Manojkumar.Bhosale@imgtec.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2017-11-04 05:05:03 +01:00
Kaustubh Raste
5cb261301c avcodec/mips: Improve avc avg mc 20, 21 and 23 msa functions
Load the specific destination bytes instead of MSA load and pack.
Remove unused macros and functions.

Signed-off-by: Kaustubh Raste <kaustubh.raste@imgtec.com>
Reviewed-by: Manojkumar Bhosale <Manojkumar.Bhosale@imgtec.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2017-11-04 05:05:03 +01:00
Kaustubh Raste
48d77d5cd4 avcodec/mips: Improve hevc uni weighted hv mc msa functions
Use immediate unsigned saturation for clip to max saving one vector register.

Signed-off-by: Kaustubh Raste <kaustubh.raste@imgtec.com>
Reviewed-by: Manojkumar Bhosale <Manojkumar.Bhosale@imgtec.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2017-11-03 00:56:30 +01:00
Kaustubh Raste
72dbc610be avcodec/mips: Improve avc avg mc 02, 12 and 32 msa functions
Remove loops and unroll as block sizes are known.
Load the specific destination bytes instead of MSA load and pack.
Remove unused macro and functions.

Signed-off-by: Kaustubh Raste <kaustubh.raste@imgtec.com>
Reviewed-by: Manojkumar Bhosale <Manojkumar.Bhosale@imgtec.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2017-11-03 00:56:30 +01:00
Kaustubh Raste
a211626978 avcodec/mips: Improve hevc uni vt and hv mc msa functions
Remove unused macro.

Signed-off-by: Kaustubh Raste <kaustubh.raste@imgtec.com>
Reviewed-by: Manojkumar Bhosale <Manojkumar.Bhosale@imgtec.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2017-11-01 02:10:09 +01:00
Kaustubh Raste
e39c0f29a1 avcodec/mips: Improve hevc bi hz and hv mc msa functions
Align the mask buffer.

Signed-off-by: Kaustubh Raste <kaustubh.raste@imgtec.com>
Reviewed-by: Manojkumar Bhosale <Manojkumar.Bhosale@imgtec.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2017-11-01 02:10:09 +01:00
Kaustubh Raste
6854dd7039 avcodec/mips: Improve hevc bi weighted copy, hz and vt mc msa functions
Pack the data to half word before clipping.
Use immediate unsigned saturation for clip to max saving one vector register.

Signed-off-by: Kaustubh Raste <kaustubh.raste@imgtec.com>
Reviewed-by: Manojkumar Bhosale <Manojkumar.Bhosale@imgtec.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2017-11-01 02:10:09 +01:00
Kaustubh Raste
93218c2234 avcodec/mips: Improve avc chroma avg hv mc msa functions
Replace generic with block size specific function.
Load the specific destination bytes instead of MSA load and pack.

Signed-off-by: Kaustubh Raste <kaustubh.raste@imgtec.com>
Reviewed-by: Manojkumar Bhosale <Manojkumar.Bhosale@imgtec.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2017-10-30 21:32:26 +01:00
Kaustubh Raste
1181d93231 avcodec/mips: Improve avc avg mc 22, 11, 31, 13 and 33 msa functions
Remove loops and unroll as block sizes are known.
Load the specific destination bytes instead of MSA load and pack.
Remove unused macro and functions.

Signed-off-by: Kaustubh Raste <kaustubh.raste@imgtec.com>
Reviewed-by: Manojkumar Bhosale <Manojkumar.Bhosale@imgtec.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2017-10-30 21:32:26 +01:00
Kaustubh Raste
736a48901f avcodec/mips: Improve hevc bi weighted hv mc msa functions
Use immediate unsigned saturation for clip to max saving one vector register.

Signed-off-by: Kaustubh Raste <kaustubh.raste@imgtec.com>
Reviewed-by: Manojkumar Bhosale <Manojkumar.Bhosale@imgtec.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2017-10-25 21:50:37 +02:00
Kaustubh Raste
ce0a52e9e9 avcodec/mips: Improve avc chroma copy and avg vert mc msa functions
Replace generic with block size specific function.
Load the specific destination bytes instead of MSA load and pack.

Signed-off-by: Kaustubh Raste <kaustubh.raste@imgtec.com>
Reviewed-by: Manojkumar Bhosale <Manojkumar.Bhosale@imgtec.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2017-10-25 21:50:37 +02:00
Kaustubh Raste
2aab7c2dfa avcodec/mips: Improve avc put mc 11, 31, 13 and 33 msa functions
Remove loops and unroll as block sizes are known.

Signed-off-by: Kaustubh Raste <kaustubh.raste@imgtec.com>
Reviewed-by: Manojkumar Bhosale <Manojkumar.Bhosale@imgtec.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2017-10-25 21:50:37 +02:00
Kaustubh Raste
e5439e272e avcodec/mips: Improve hevc uni weighted vert mc msa functions
Pack the data to half word before clipping.
Use immediate unsigned saturation for clip to max saving one vector register.

Signed-off-by: Kaustubh Raste <kaustubh.raste@imgtec.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2017-10-13 02:46:31 +02:00
Kaustubh Raste
6ca821a3e7 avcodec/mips: Improve hevc uni horiz mc msa functions
Update macros to remove adds.

Signed-off-by: Kaustubh Raste <kaustubh.raste@imgtec.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2017-10-13 02:46:31 +02:00
Kaustubh Raste
e63758468c avcodec/mips: Improve hevc bi copy mc msa functions
Load the specific destination bytes instead of MSA load and pack.
Use immediate unsigned saturation for clip to max saving one vector register.

Signed-off-by: Kaustubh Raste <kaustubh.raste@imgtec.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2017-10-13 02:46:31 +02:00
Kaustubh Raste
e549933a27 avcodec/mips: Improve avc put mc 12, 32 and 22 msa functions
Remove loops and unroll as block sizes are known.
Removed unused functions.

Signed-off-by: Kaustubh Raste <kaustubh.raste@imgtec.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2017-10-13 02:46:31 +02:00
Kaustubh Raste
27a0a83880 avcodec/mips: Improve avc chroma avg horiz mc msa functions
Replace generic with block size specific function.
Load the specific destination bytes instead of MSA load and pack.

Signed-off-by: Kaustubh Raste <kaustubh.raste@imgtec.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2017-10-13 02:46:31 +02:00
Kaustubh Raste
ff53f4dc2d avcodec/mips: Improve avc uni copy mc msa functions
Load the specific bytes instead of MSA load.

Signed-off-by: Kaustubh Raste <kaustubh.raste@imgtec.com>
Reviewed-by: Manojkumar Bhosale <Manojkumar.Bhosale@imgtec.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2017-10-10 23:58:41 +02:00
Kaustubh Raste
eadb911643 avcodec/mips: Improve hevc uni-w horiz mc msa functions
Load the specific destination bytes instead of MSA load and pack.
Pack the data to half word before clipping.
Use immediate unsigned saturation for clip to max saving one vector register.

Signed-off-by: Kaustubh Raste <kaustubh.raste@imgtec.com>
Reviewed-by: Manojkumar Bhosale <Manojkumar.Bhosale@imgtec.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2017-10-10 23:58:41 +02:00
Kaustubh Raste
662234a9a2 avcodec/mips: Improve avc put mc 21, 23 and 02 msa functions
Remove loops and unroll as block sizes are known.

Signed-off-by: Kaustubh Raste <kaustubh.raste@imgtec.com>
Reviewed-by: Manojkumar Bhosale <Manojkumar.Bhosale@imgtec.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2017-10-10 23:58:41 +02:00
Kaustubh Raste
b59323cb72 avcodec/mips: Improve avc chroma hv mc msa functions
Replace generic with block size specific function.

Signed-off-by: Kaustubh Raste <kaustubh.raste@imgtec.com>
Reviewed-by: Manojkumar Bhosale <Manojkumar.Bhosale@imgtec.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2017-10-10 23:58:41 +02:00
Kaustubh Raste
af9433b1d6 avcodec/mips: Improve avc bi-weighted mc msa functions
Replace generic with block size specific function.

Signed-off-by: Kaustubh Raste <kaustubh.raste@imgtec.com>
Reviewed-by: Manojkumar Bhosale <Manojkumar.Bhosale@imgtec.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2017-10-10 23:58:41 +02:00
Kaustubh Raste
56822b074b avcodec/mips: preload data in hevc sao edge 135 degree filter msa functions
Signed-off-by: Kaustubh Raste <kaustubh.raste@imgtec.com>
Reviewed-by: Manojkumar Bhosale <Manojkumar.Bhosale@imgtec.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2017-10-10 23:58:41 +02:00
Kaustubh Raste
51ebce7d7d avcodec/mips: Cleanup unused functions
Signed-off-by: Kaustubh Raste <kaustubh.raste@imgtec.com>
Reviewed-by: Manojkumar Bhosale <Manojkumar.Bhosale@imgtec.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2017-10-06 02:42:49 +02:00
Kaustubh Raste
6796a1dd8c avcodec/mips: Improve avc put mc 20, 01 and 03 msa functions
Remove loops and unroll as block sizes are known.

Signed-off-by: Kaustubh Raste <kaustubh.raste@imgtec.com>
Reviewed-by: Manojkumar Bhosale <Manojkumar.Bhosale@imgtec.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2017-09-27 21:15:57 +02:00