FFmpeg

mirror of https://github.com/FFmpeg/FFmpeg.git synced 2024-11-26 19:01:44 +02:00

Author	SHA1	Message	Date
Christophe GISQUET	7fb8b491e5	rv34dsp: factorize a multiplication in the noround inverse transform Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>	2012-04-28 11:16:07 -07:00
Ronald S. Bultje	c23acbaed4	Don't use ff_cropTbl[] for IDCT. Results of IDCT can far exceed the range of ff_cropTbl[], leading to overreads and potential crashes. Found-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind CC: libav-stable@libav.org	2012-03-06 10:47:42 -08:00
Ronald S. Bultje	3ab9a2a557	rv34: change most "int stride" into "ptrdiff_t stride". This prevents having to sign-extend on 64-bit systems with 32-bit ints, such as x86-64. Also fixes crashes on systems where we don't do it and arguments are not in registers, such as Win64 for all weight functions.	2012-02-20 14:58:25 -08:00
Christophe GISQUET	9ba9c34024	rv34: 1-pass inter MB reconstruction Implement 1-pass inverse transform and reconstruction for inter blocks.	2012-01-16 19:26:41 +01:00
Christophe GISQUET	d78062386e	rv34: Intra 16x16 handling Extract processing of intra 16x16 blocks from intra macroblock processing. Also implement a function performing inverse transform and block reconstruction for DC-only blocks in 1 pass instead of 2.	2012-01-16 00:41:51 +01:00
Christophe GISQUET	3faa303a47	rv34: DC-only inverse transform When decoding coefficients, detect whether the block is DC-only, and take advantage of this knowledge to perform DC-only inverse transform. This is achieved by: - first, changing the 108x4 element modulo_three_table into a 108 element table (kind of base4), and accessing each value using mask and shifts. - then, checking low bits for 0 (as they represent the presence of higher frequency coefficients) Also provide x86 SIMD code for the DC-only inverse transform. Signed-off-by: Kostya Shishkov <kostya.shishkov@gmail.com>	2012-01-12 09:52:33 +01:00
Christophe GISQUET	98f24ecd6c	rv34: joint coefficient decoding and dequantization Perform dequantization while decoding coefficients instead of performing it on the entire coefficients buffer. Since quantized coefficients are very sparse, this usually causes a small speedup. Speedup of around 1% on Panda board compared to the NEON code removed here. Global speedup is probably around 3%. Signed-off-by: Kostya Shishkov <kostya.shishkov@gmail.com>	2012-01-04 10:30:01 +01:00
Mans Rullgard	40901fc14e	rv34: move 4x4 dequant to RV34DSPContext Signed-off-by: Mans Rullgard <mans@mansr.com>	2011-12-13 12:05:34 +00:00
Janne Grunau	42d32cf53c	rv34: NEON optimised inverse transform functions Signed-off-by: Mans Rullgard <mans@mansr.com>	2011-12-06 13:48:24 +00:00
Janne Grunau	1bca8f4bc5	rv34: move inverse transform functions to DSP context	2011-10-12 15:52:22 +02:00
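
Several of the commits above (c23acbaed4, 3ab9a2a557, d78062386e) concern how a reconstructed value is added to a pixel block: clamping without a crop table, using `ptrdiff_t` for the stride, and handling a DC-only block in one pass. The following is a minimal sketch of those three ideas together, not the code from the commits. The names `clip_uint8` and `add_dc_4x4` are hypothetical, and `dc` is assumed to be already scaled by the inverse transform.

```c
#include <stddef.h>
#include <stdint.h>

/* Clamp an int to the 0..255 pixel range with plain comparisons,
 * so no lookup table can be indexed out of bounds. */
static inline uint8_t clip_uint8(int v)
{
    return (uint8_t)(v < 0 ? 0 : (v > 255 ? 255 : v));
}

/* Hypothetical helper: add an already-scaled DC value to a 4x4 block
 * in place. `stride` is the distance in bytes between rows; declaring
 * it as ptrdiff_t matches pointer width, so advancing dst by stride
 * needs no sign extension on 64-bit targets. */
static void add_dc_4x4(uint8_t *dst, ptrdiff_t stride, int dc)
{
    for (int y = 0; y < 4; y++) {
        for (int x = 0; x < 4; x++)
            dst[x] = clip_uint8(dst[x] + dc);
        dst += stride;
    }
}
```

Because the clamp is arithmetic, an unexpectedly large transform result only saturates the pixel rather than reading outside a table.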

10 Commits
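
Commit 3faa303a47 describes replacing a four-column table with a single packed value per entry, read with masks and shifts, and then testing the low bits to detect a DC-only block. A rough illustration of that packing idea follows. It assumes four base-3 digits stored two bits each, with the first digit in the top two bits. The function names are hypothetical, and the actual contents and 108-entry size of `modulo_three_table` are not reproduced here.

```c
#include <stdint.h>

/* Pack the four base-3 digits of `code` (expected range 0..80) into
 * one byte, two bits per digit, first digit in bits 7..6. */
static uint8_t pack_digits(unsigned code)
{
    unsigned d0 = code / 27;
    unsigned d1 = (code / 9) % 3;
    unsigned d2 = (code / 3) % 3;
    unsigned d3 = code % 3;
    return (uint8_t)(d0 << 6 | d1 << 4 | d2 << 2 | d3);
}

/* Extract digit i (0..3) from a packed byte with a shift and a mask. */
static unsigned get_digit(uint8_t packed, int i)
{
    return (packed >> (6 - 2 * i)) & 3u;
}

/* Assumption: digit 0 belongs to the DC coefficient. If the low six
 * bits are all zero, no higher-frequency coefficient is present. */
static int is_dc_only(uint8_t packed)
{
    return (packed & 0x3F) == 0;
}
```

One byte per entry keeps the table small, and the DC-only test is a single mask instead of three separate comparisons.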