FFmpeg

mirror of https://github.com/FFmpeg/FFmpeg.git synced 2024-12-28 20:53:54 +02:00

Author	SHA1	Message	Date
Ben Avison	2698bfdc93	checkasm: Add vc1dsp inverse transform tests This test deliberately doesn't exercise the full range of inputs described in the committee draft VC-1 standard. It says: input coefficients in frequency domain, D, satisfy -2048 <= D < 2047 intermediate coefficients, E, satisfy -4096 <= E < 4095 fully inverse-transformed coefficients, R, satisfy -512 <= R < 511 For one thing, the inequalities look odd. Did they mean them to go the other way round? That would make more sense because the equations generally both add and subtract coefficients multiplied by constants, including powers of 2. Requiring the most-negative values to be valid extends the number of bits to represent the intermediate values just for the sake of that one case! For another thing, the extreme values don't look to occur in real streams - both in my experience and supported by the following comment in the AArch32 decoder: tNhalf is half of the value of tN (as described in vc1_inv_trans_8x8_c). This is done because sometimes files have input that causes tN + tM to overflow. To avoid this overflow, we compute tNhalf, then compute tNhalf + tM (which doesn't overflow), and then we use vhadd to compute (tNhalf + (tNhalf + tM)) >> 1 which does not overflow because it is one instruction. My AArch64 decoder goes further than this. It calculates tNhalf and tM then does an SRA (essentially a fused halve and add) to compute (tN + tM) >> 1 without ever having to hold (tNhalf + tM) in a 16-bit element without overflowing. It only encounters difficulties if either tNhalf or tM overflow in isolation. I haven't had sight of the final standard, so it's possible that these issues were dealt with during finalisation, which could explain the lack of usage of extreme inputs in real streams. Or a preponderance of decoders that only support 16-bit intermediate values in their inverse transforms might have caused encoders to steer clear of such cases. 
I have effectively followed this approach in the test, and limited the scale of the coefficients sufficiently that the existing AArch32 decoder and my new AArch64 decoder both pass. Signed-off-by: Ben Avison <bavison@riscosopen.org> Signed-off-by: Martin Storsjö <martin@martin.st>	2022-04-01 10:03:33 +03:00
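The overflow-avoidance trick quoted in the commit above can be illustrated with a minimal scalar C sketch. It only models 16-bit SIMD lanes; the helper names `wrap16`, `hadd16`, `naive_half_sum` and `safe_half_sum` are made up for this illustration and are not FFmpeg functions, and `hadd16` is only an approximation of what a halving-add instruction such as `vhadd` does.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: wrap a value to 16 bits, as a plain 16-bit
 * lane addition would. */
static int32_t wrap16(int32_t v)
{
    uint32_t u = (uint32_t)v & 0xFFFFu;
    return u >= 0x8000u ? (int32_t)u - 0x10000 : (int32_t)u;
}

/* Hypothetical model of a halving add: the sum is formed at full
 * precision and halved before it is narrowed, so it cannot wrap.
 * Assumes >> on a negative int is an arithmetic shift. */
static int32_t hadd16(int32_t a, int32_t b)
{
    return (a + b) >> 1;
}

/* (tN + tM) >> 1 computed naively inside a 16-bit lane. */
static int32_t naive_half_sum(int32_t tN, int32_t tM)
{
    return wrap16(tN + tM) >> 1;
}

/* The same quantity computed as (tNhalf + (tNhalf + tM)) >> 1.
 * Exact when tN is even. */
static int32_t safe_half_sum(int32_t tN, int32_t tM)
{
    int32_t tNhalf  = tN >> 1;
    int32_t partial = wrap16(tNhalf + tM);
    /* Precondition of the trick: tNhalf + tM itself must fit. */
    assert(partial == tNhalf + tM);
    return hadd16(tNhalf, partial);
}
```

With tN = 30000 and tM = 10000, `naive_half_sum` wraps to a negative value, while `safe_half_sum` returns 20000.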
Ben Avison	20cb43ea8b	checkasm: Add vc1dsp in-loop deblocking filter tests Note that the benchmarking results for these functions are highly dependent upon the input data. Therefore, each function is benchmarked twice, corresponding to the best and worst case complexity of the reference C implementation. The performance of a real stream decode will fall somewhere between these two extremes. Signed-off-by: Ben Avison <bavison@riscosopen.org> Signed-off-by: Martin Storsjö <martin@martin.st>	2022-04-01 10:03:33 +03:00
Martin Storsjö	a78f136f3f	configure: Use a separate config_components.h header for $ALL_COMPONENTS This avoids unnecessary rebuilds of most source files if only the list of enabled components has changed, but not the other properties of the build, set in config.h. Signed-off-by: Martin Storsjö <martin@martin.st>	2022-03-16 14:12:49 +02:00
Wu Jianhua	f629ea2e18	avutil/cpu: add AVX512 Icelake flag Signed-off-by: Wu Jianhua <jianhua.wu@intel.com> Reviewed-by: Henrik Gramner <henrik@gramner.com> Signed-off-by: James Almer <jamrial@gmail.com>	2022-03-10 16:45:48 -03:00
Anton Khirnov	d552f2535b	lavc/h264: move some shared code from h264dec to h264_parse	2022-01-26 15:23:30 +01:00
Mark Reid	52f7026164	swscale/x86/input.asm: add x86-optimized planar rgb2yuv functions sse2 only operates on 2 lanes per loop for to_y and to_uv functions, due to the lack of pmulld instruction. Emulating pmulld with 2 pmuludq and shuffles proved too costly and made to_uv functions slower than the c implementation. For to_y on sse2 only float functions are generated, I was not able to outperform the c implementation on the integer pixel formats. For to_a on sse4 only the float functions are generated. sse2 and sse4 generated nearly identical performing code on integer pixel formats, so only sse2/avx2 versions are generated. planar_gbrp_to_y_512_c: 1197.5 planar_gbrp_to_y_512_sse4: 444.5 planar_gbrp_to_y_512_avx2: 287.5 planar_gbrap_to_y_512_c: 1204.5 planar_gbrap_to_y_512_sse4: 447.5 planar_gbrap_to_y_512_avx2: 289.5 planar_gbrp9be_to_y_512_c: 1380.0 planar_gbrp9be_to_y_512_sse4: 543.5 planar_gbrp9be_to_y_512_avx2: 340.0 planar_gbrp9le_to_y_512_c: 1200.5 planar_gbrp9le_to_y_512_sse4: 442.0 planar_gbrp9le_to_y_512_avx2: 282.0 planar_gbrp10be_to_y_512_c: 1378.5 planar_gbrp10be_to_y_512_sse4: 544.0 planar_gbrp10be_to_y_512_avx2: 337.5 planar_gbrp10le_to_y_512_c: 1200.0 planar_gbrp10le_to_y_512_sse4: 448.0 planar_gbrp10le_to_y_512_avx2: 285.5 planar_gbrap10be_to_y_512_c: 1380.0 planar_gbrap10be_to_y_512_sse4: 542.0 planar_gbrap10be_to_y_512_avx2: 340.5 planar_gbrap10le_to_y_512_c: 1199.0 planar_gbrap10le_to_y_512_sse4: 446.0 planar_gbrap10le_to_y_512_avx2: 289.5 planar_gbrp12be_to_y_512_c: 10563.0 planar_gbrp12be_to_y_512_sse4: 542.5 planar_gbrp12be_to_y_512_avx2: 339.0 planar_gbrp12le_to_y_512_c: 1201.0 planar_gbrp12le_to_y_512_sse4: 440.5 planar_gbrp12le_to_y_512_avx2: 286.0 planar_gbrap12be_to_y_512_c: 1701.5 planar_gbrap12be_to_y_512_sse4: 917.0 planar_gbrap12be_to_y_512_avx2: 338.5 planar_gbrap12le_to_y_512_c: 1201.0 planar_gbrap12le_to_y_512_sse4: 444.5 planar_gbrap12le_to_y_512_avx2: 288.0 planar_gbrp14be_to_y_512_c: 1370.5 planar_gbrp14be_to_y_512_sse4: 545.0
planar_gbrp14be_to_y_512_avx2: 338.5 planar_gbrp14le_to_y_512_c: 1199.0 planar_gbrp14le_to_y_512_sse4: 444.0 planar_gbrp14le_to_y_512_avx2: 279.5 planar_gbrp16be_to_y_512_c: 1364.0 planar_gbrp16be_to_y_512_sse4: 544.5 planar_gbrp16be_to_y_512_avx2: 339.5 planar_gbrp16le_to_y_512_c: 1201.0 planar_gbrp16le_to_y_512_sse4: 445.5 planar_gbrp16le_to_y_512_avx2: 280.5 planar_gbrap16be_to_y_512_c: 1377.0 planar_gbrap16be_to_y_512_sse4: 545.0 planar_gbrap16be_to_y_512_avx2: 338.5 planar_gbrap16le_to_y_512_c: 1201.0 planar_gbrap16le_to_y_512_sse4: 442.0 planar_gbrap16le_to_y_512_avx2: 279.0 planar_gbrpf32be_to_y_512_c: 4113.0 planar_gbrpf32be_to_y_512_sse2: 2438.0 planar_gbrpf32be_to_y_512_sse4: 1068.0 planar_gbrpf32be_to_y_512_avx2: 904.5 planar_gbrpf32le_to_y_512_c: 3818.5 planar_gbrpf32le_to_y_512_sse2: 2024.5 planar_gbrpf32le_to_y_512_sse4: 1241.5 planar_gbrpf32le_to_y_512_avx2: 657.0 planar_gbrapf32be_to_y_512_c: 3707.0 planar_gbrapf32be_to_y_512_sse2: 2444.0 planar_gbrapf32be_to_y_512_sse4: 1077.0 planar_gbrapf32be_to_y_512_avx2: 909.0 planar_gbrapf32le_to_y_512_c: 3822.0 planar_gbrapf32le_to_y_512_sse2: 2024.5 planar_gbrapf32le_to_y_512_sse4: 1176.0 planar_gbrapf32le_to_y_512_avx2: 658.5 planar_gbrp_to_uv_512_c: 2325.8 planar_gbrp_to_uv_512_sse2: 1726.8 planar_gbrp_to_uv_512_sse4: 771.8 planar_gbrp_to_uv_512_avx2: 506.8 planar_gbrap_to_uv_512_c: 2281.8 planar_gbrap_to_uv_512_sse2: 1726.3 planar_gbrap_to_uv_512_sse4: 768.3 planar_gbrap_to_uv_512_avx2: 496.3 planar_gbrp9be_to_uv_512_c: 2336.8 planar_gbrp9be_to_uv_512_sse2: 1924.8 planar_gbrp9be_to_uv_512_sse4: 852.3 planar_gbrp9be_to_uv_512_avx2: 552.8 planar_gbrp9le_to_uv_512_c: 2270.3 planar_gbrp9le_to_uv_512_sse2: 1512.3 planar_gbrp9le_to_uv_512_sse4: 764.3 planar_gbrp9le_to_uv_512_avx2: 491.3 planar_gbrp10be_to_uv_512_c: 2281.8 planar_gbrp10be_to_uv_512_sse2: 1917.8 planar_gbrp10be_to_uv_512_sse4: 855.3 planar_gbrp10be_to_uv_512_avx2: 541.3 planar_gbrp10le_to_uv_512_c: 2269.8 planar_gbrp10le_to_uv_512_sse2: 1515.3 
planar_gbrp10le_to_uv_512_sse4: 759.8 planar_gbrp10le_to_uv_512_avx2: 487.8 planar_gbrap10be_to_uv_512_c: 2382.3 planar_gbrap10be_to_uv_512_sse2: 1924.8 planar_gbrap10be_to_uv_512_sse4: 855.3 planar_gbrap10be_to_uv_512_avx2: 540.8 planar_gbrap10le_to_uv_512_c: 2382.3 planar_gbrap10le_to_uv_512_sse2: 1512.3 planar_gbrap10le_to_uv_512_sse4: 759.3 planar_gbrap10le_to_uv_512_avx2: 484.8 planar_gbrp12be_to_uv_512_c: 2283.8 planar_gbrp12be_to_uv_512_sse2: 1936.8 planar_gbrp12be_to_uv_512_sse4: 858.3 planar_gbrp12be_to_uv_512_avx2: 541.3 planar_gbrp12le_to_uv_512_c: 2278.8 planar_gbrp12le_to_uv_512_sse2: 1507.3 planar_gbrp12le_to_uv_512_sse4: 760.3 planar_gbrp12le_to_uv_512_avx2: 485.8 planar_gbrap12be_to_uv_512_c: 2385.3 planar_gbrap12be_to_uv_512_sse2: 1927.8 planar_gbrap12be_to_uv_512_sse4: 855.3 planar_gbrap12be_to_uv_512_avx2: 539.8 planar_gbrap12le_to_uv_512_c: 2377.3 planar_gbrap12le_to_uv_512_sse2: 1516.3 planar_gbrap12le_to_uv_512_sse4: 759.3 planar_gbrap12le_to_uv_512_avx2: 484.8 planar_gbrp14be_to_uv_512_c: 2283.8 planar_gbrp14be_to_uv_512_sse2: 1935.3 planar_gbrp14be_to_uv_512_sse4: 852.3 planar_gbrp14be_to_uv_512_avx2: 540.3 planar_gbrp14le_to_uv_512_c: 2276.8 planar_gbrp14le_to_uv_512_sse2: 1514.8 planar_gbrp14le_to_uv_512_sse4: 762.3 planar_gbrp14le_to_uv_512_avx2: 484.8 planar_gbrp16be_to_uv_512_c: 2383.3 planar_gbrp16be_to_uv_512_sse2: 1881.8 planar_gbrp16be_to_uv_512_sse4: 852.3 planar_gbrp16be_to_uv_512_avx2: 541.8 planar_gbrp16le_to_uv_512_c: 2378.3 planar_gbrp16le_to_uv_512_sse2: 1476.8 planar_gbrp16le_to_uv_512_sse4: 765.3 planar_gbrp16le_to_uv_512_avx2: 485.8 planar_gbrap16be_to_uv_512_c: 2382.3 planar_gbrap16be_to_uv_512_sse2: 1886.3 planar_gbrap16be_to_uv_512_sse4: 853.8 planar_gbrap16be_to_uv_512_avx2: 550.8 planar_gbrap16le_to_uv_512_c: 2381.8 planar_gbrap16le_to_uv_512_sse2: 1488.3 planar_gbrap16le_to_uv_512_sse4: 765.3 planar_gbrap16le_to_uv_512_avx2: 491.8 planar_gbrpf32be_to_uv_512_c: 4863.0 planar_gbrpf32be_to_uv_512_sse2: 3347.5 
planar_gbrpf32be_to_uv_512_sse4: 1800.0 planar_gbrpf32be_to_uv_512_avx2: 1199.0 planar_gbrpf32le_to_uv_512_c: 4725.0 planar_gbrpf32le_to_uv_512_sse2: 2753.0 planar_gbrpf32le_to_uv_512_sse4: 1474.5 planar_gbrpf32le_to_uv_512_avx2: 927.5 planar_gbrapf32be_to_uv_512_c: 4859.0 planar_gbrapf32be_to_uv_512_sse2: 3269.0 planar_gbrapf32be_to_uv_512_sse4: 1802.0 planar_gbrapf32be_to_uv_512_avx2: 1201.5 planar_gbrapf32le_to_uv_512_c: 6338.0 planar_gbrapf32le_to_uv_512_sse2: 2756.5 planar_gbrapf32le_to_uv_512_sse4: 1476.0 planar_gbrapf32le_to_uv_512_avx2: 908.5 planar_gbrap_to_a_512_c: 383.3 planar_gbrap_to_a_512_sse2: 66.8 planar_gbrap_to_a_512_avx2: 43.8 planar_gbrap10be_to_a_512_c: 601.8 planar_gbrap10be_to_a_512_sse2: 86.3 planar_gbrap10be_to_a_512_avx2: 34.8 planar_gbrap10le_to_a_512_c: 602.3 planar_gbrap10le_to_a_512_sse2: 48.8 planar_gbrap10le_to_a_512_avx2: 31.3 planar_gbrap12be_to_a_512_c: 601.8 planar_gbrap12be_to_a_512_sse2: 111.8 planar_gbrap12be_to_a_512_avx2: 41.3 planar_gbrap12le_to_a_512_c: 385.8 planar_gbrap12le_to_a_512_sse2: 75.3 planar_gbrap12le_to_a_512_avx2: 39.8 planar_gbrap16be_to_a_512_c: 386.8 planar_gbrap16be_to_a_512_sse2: 79.8 planar_gbrap16be_to_a_512_avx2: 31.3 planar_gbrap16le_to_a_512_c: 600.3 planar_gbrap16le_to_a_512_sse2: 40.3 planar_gbrap16le_to_a_512_avx2: 30.3 planar_gbrapf32be_to_a_512_c: 1148.8 planar_gbrapf32be_to_a_512_sse2: 611.3 planar_gbrapf32be_to_a_512_sse4: 234.8 planar_gbrapf32be_to_a_512_avx2: 183.3 planar_gbrapf32le_to_a_512_c: 851.3 planar_gbrapf32le_to_a_512_sse2: 263.3 planar_gbrapf32le_to_a_512_sse4: 199.3 planar_gbrapf32le_to_a_512_avx2: 156.8 Reviewed-by: Paul B Mahol <onemda@gmail.com> Signed-off-by: James Almer <jamrial@gmail.com>	2022-01-11 16:34:33 -03:00
Mark Reid	9e445a5be2	swscale/x86/output.asm: add x86-optimized planar gbr yuv2anyX functions changes since v2: * fixed label changes since v1: * remove vex instruction on sse4 path * some load/pack macros use fewer instructions * fixed some typos yuv2gbrp_full_X_4_512_c: 12757.6 yuv2gbrp_full_X_4_512_sse2: 8946.6 yuv2gbrp_full_X_4_512_sse4: 5138.6 yuv2gbrp_full_X_4_512_avx2: 3889.6 yuv2gbrap_full_X_4_512_c: 15368.6 yuv2gbrap_full_X_4_512_sse2: 11916.1 yuv2gbrap_full_X_4_512_sse4: 6294.6 yuv2gbrap_full_X_4_512_avx2: 3477.1 yuv2gbrp9be_full_X_4_512_c: 14381.6 yuv2gbrp9be_full_X_4_512_sse2: 9139.1 yuv2gbrp9be_full_X_4_512_sse4: 5150.1 yuv2gbrp9be_full_X_4_512_avx2: 2834.6 yuv2gbrp9le_full_X_4_512_c: 12990.1 yuv2gbrp9le_full_X_4_512_sse2: 9118.1 yuv2gbrp9le_full_X_4_512_sse4: 5132.1 yuv2gbrp9le_full_X_4_512_avx2: 2833.1 yuv2gbrp10be_full_X_4_512_c: 14401.6 yuv2gbrp10be_full_X_4_512_sse2: 9133.1 yuv2gbrp10be_full_X_4_512_sse4: 5126.1 yuv2gbrp10be_full_X_4_512_avx2: 2837.6 yuv2gbrp10le_full_X_4_512_c: 12718.1 yuv2gbrp10le_full_X_4_512_sse2: 9106.1 yuv2gbrp10le_full_X_4_512_sse4: 5120.1 yuv2gbrp10le_full_X_4_512_avx2: 2826.1 yuv2gbrap10be_full_X_4_512_c: 18535.6 yuv2gbrap10be_full_X_4_512_sse2: 33617.6 yuv2gbrap10be_full_X_4_512_sse4: 6264.1 yuv2gbrap10be_full_X_4_512_avx2: 3422.1 yuv2gbrap10le_full_X_4_512_c: 16724.1 yuv2gbrap10le_full_X_4_512_sse2: 11787.1 yuv2gbrap10le_full_X_4_512_sse4: 6282.1 yuv2gbrap10le_full_X_4_512_avx2: 3441.6 yuv2gbrp12be_full_X_4_512_c: 13723.6 yuv2gbrp12be_full_X_4_512_sse2: 9128.1 yuv2gbrp12be_full_X_4_512_sse4: 7997.6 yuv2gbrp12be_full_X_4_512_avx2: 2844.1 yuv2gbrp12le_full_X_4_512_c: 12257.1 yuv2gbrp12le_full_X_4_512_sse2: 9107.6 yuv2gbrp12le_full_X_4_512_sse4: 5142.6 yuv2gbrp12le_full_X_4_512_avx2: 2837.6 yuv2gbrap12be_full_X_4_512_c: 18511.1 yuv2gbrap12be_full_X_4_512_sse2: 12156.6 yuv2gbrap12be_full_X_4_512_sse4: 6251.1 yuv2gbrap12be_full_X_4_512_avx2: 3444.6 yuv2gbrap12le_full_X_4_512_c: 16687.1 yuv2gbrap12le_full_X_4_512_sse2: 11785.1
yuv2gbrap12le_full_X_4_512_sse4: 6243.6 yuv2gbrap12le_full_X_4_512_avx2: 3446.1 yuv2gbrp14be_full_X_4_512_c: 13690.6 yuv2gbrp14be_full_X_4_512_sse2: 9120.6 yuv2gbrp14be_full_X_4_512_sse4: 5138.1 yuv2gbrp14be_full_X_4_512_avx2: 2843.1 yuv2gbrp14le_full_X_4_512_c: 14995.6 yuv2gbrp14le_full_X_4_512_sse2: 9119.1 yuv2gbrp14le_full_X_4_512_sse4: 5126.1 yuv2gbrp14le_full_X_4_512_avx2: 2843.1 yuv2gbrp16be_full_X_4_512_c: 12367.1 yuv2gbrp16be_full_X_4_512_sse2: 8233.6 yuv2gbrp16be_full_X_4_512_sse4: 4820.1 yuv2gbrp16be_full_X_4_512_avx2: 2666.6 yuv2gbrp16le_full_X_4_512_c: 10904.1 yuv2gbrp16le_full_X_4_512_sse2: 8214.1 yuv2gbrp16le_full_X_4_512_sse4: 4824.1 yuv2gbrp16le_full_X_4_512_avx2: 2629.1 yuv2gbrap16be_full_X_4_512_c: 26569.6 yuv2gbrap16be_full_X_4_512_sse2: 10884.1 yuv2gbrap16be_full_X_4_512_sse4: 5488.1 yuv2gbrap16be_full_X_4_512_avx2: 3272.1 yuv2gbrap16le_full_X_4_512_c: 14010.1 yuv2gbrap16le_full_X_4_512_sse2: 10562.1 yuv2gbrap16le_full_X_4_512_sse4: 5463.6 yuv2gbrap16le_full_X_4_512_avx2: 3255.1 yuv2gbrpf32be_full_X_4_512_c: 14524.1 yuv2gbrpf32be_full_X_4_512_sse2: 8552.6 yuv2gbrpf32be_full_X_4_512_sse4: 4636.1 yuv2gbrpf32be_full_X_4_512_avx2: 2474.6 yuv2gbrpf32le_full_X_4_512_c: 13060.6 yuv2gbrpf32le_full_X_4_512_sse2: 9682.6 yuv2gbrpf32le_full_X_4_512_sse4: 4298.1 yuv2gbrpf32le_full_X_4_512_avx2: 2453.1 yuv2gbrapf32be_full_X_4_512_c: 18629.6 yuv2gbrapf32be_full_X_4_512_sse2: 11363.1 yuv2gbrapf32be_full_X_4_512_sse4: 15201.6 yuv2gbrapf32be_full_X_4_512_avx2: 3727.1 yuv2gbrapf32le_full_X_4_512_c: 16677.6 yuv2gbrapf32le_full_X_4_512_sse2: 10221.6 yuv2gbrapf32le_full_X_4_512_sse4: 5693.6 yuv2gbrapf32le_full_X_4_512_avx2: 3656.6 Reviewed-by: Paul B Mahol <onemda@gmail.com> Signed-off-by: James Almer <jamrial@gmail.com>	2022-01-11 16:33:17 -03:00
Alan Kelly	eebe406c80	libswscale: Test AV_CPU_FLAG_SLOW_GATHER for hscale functions. This is instead of EXTERNAL_AVX2_FAST so that the avx2 hscale functions are only used where they are faster.	2021-12-21 17:44:53 -03:00
Henrik Gramner	15cfb4eee3	checkasm: Use the correct AVTXContext in av_tx tests Keep a reference to the correct associated context of the reference function and use that context when calling the reference function.	2021-12-20 23:58:05 +01:00
Alan Kelly	86663963e6	x86/swscale: fix minor coding style issues Signed-off-by: James Almer <jamrial@gmail.com>	2021-12-16 13:16:04 -03:00
Alan Kelly	f900a19fa9	libswscale: Adds ff_hscale8to15_4_avx2 and ff_hscale8to15_X4_avx2 for all filter sizes. Fixes so that fate under 64 bit Windows passes. These functions replace all ff_hscale8to15_*_ssse3 when avx2 is available. Signed-off-by: James Almer <jamrial@gmail.com>	2021-12-15 20:04:59 -03:00
Shiyou Yin	9a840ffa17	avutil: [loongarch] Add support for loongarch SIMD. LSX and LASX are the LoongArch SIMD extensions. They are enabled by default if the compiler supports them, and can be disabled with '--disable-lsx' '--disable-lasx'. Change-Id: Ie2608ea61dbd9b7fffadbf0ec2348bad6c124476 Reviewed-by: Shiyou Yin <yinshiyou-hf@loongson.cn> Reviewed-by: guxiwei <guxiwei-hf@loongson.cn> Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	2021-12-15 18:37:40 +01:00
Andreas Rheinhardt	09408539f4	checkasm/hevc_pel: Fix stack buffer overreads This patch increases several stack buffers in order to fix stack-buffer-overflows (e.g. in put_hevc_qpel_uni_hv_9 in line 814 of hevcdsp_template.c) detected with ASAN in the hevc_pel checkasm test. The buffers are increased by the minimal amount necessary in order not to mask potential future bugs. Reviewed-by: Martin Storsjö <martin@martin.st> Reviewed-by: "zhilizhao(赵志立)" <quinkblack@foxmail.com> Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2021-09-29 04:35:31 +02:00
Andreas Rheinhardt	1ea3650823	Replace all occurrences of av_mallocz_array() by av_calloc() They do the same. Reviewed-by: Paul B Mahol <onemda@gmail.com> Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2021-09-20 01:03:52 +02:00
Wu Jianhua	133b2767cf	tests/checkasm/vf_gblur.c: update check_horiz_slice for the new ff_horiz_slice_avx2/512 Co-authored-by: Cheng Yanfei <yanfei.cheng@intel.com> Co-authored-by: Jin Jun <jun.i.jin@intel.com> Signed-off-by: Wu Jianhua <jianhua.wu@intel.com>	2021-08-29 19:58:33 +02:00
Wu Jianhua	0c54ab20c2	tests/checkasm/vf_gblur.c: add check_verti_slice() for unit test Co-authored-by: Cheng Yanfei <yanfei.cheng@intel.com> Co-authored-by: Jin Jun <jun.i.jin@intel.com> Signed-off-by: Wu Jianhua <jianhua.wu@intel.com>	2021-08-29 19:58:33 +02:00
J. Dekker	b492cacffd	checkasm: collapse hevc pel tests Also add to `make fate-checkasm' target. Signed-off-by: J. Dekker <jdek@itanimul.li>	2021-08-24 22:12:06 +02:00
Andreas Rheinhardt	4608f7cc6a	Remove unnecessary mem.h inclusions Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2021-07-22 14:47:57 +02:00
J. Dekker	c866a099b2	lavu/kperf: use ff_thread_once() Signed-off-by: J. Dekker <jdek@itanimul.li>	2021-07-21 16:35:27 +02:00
J. Dekker	9a727235fd	lavu/checkasm: add (private) kperf timing for macOS Signed-off-by: J. Dekker <jdek@itanimul.li>	2021-07-20 19:40:03 +02:00
Anton Khirnov	fe490ec165	sws: separate the calls to scaled vs unscaled conversion Call the scaler function directly rather than through a function pointer. Drop the now-unused return value from ff_getSwsFunc() and rename the function to reflect its new role. This will be useful in the following commits, where it will become important that the amount of output is different for scaled vs unscaled case.	2021-07-03 15:57:13 +02:00
Matthieu Patou	b27ae2c0b7	checkasm/vp9dsp: rename the iszero function to is_zero Suggested-by: ffmpeg@fb.com Reviewed-by: Ronald S. Bultje <rsbultje@gmail.com> Signed-off-by: James Almer <jamrial@gmail.com>	2021-06-08 13:11:22 -03:00
Lynne	1978b143eb	checkasm: add av_tx FFT SIMD testing code This sadly required making changes to the code itself, due to the same context needing to be reused for both versions. The lookup table had to be duplicated for both versions.	2021-04-24 17:19:17 +02:00
Alan Kelly	e1484bc455	tests/checkasm/sw_scale: adds additional tests sizes for yux2yuvX Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	2021-04-01 20:47:52 +02:00
James Almer	d52ceed9fd	tests/checkasm/sw_scale: use memset() to fill dither Signed-off-by: James Almer <jamrial@gmail.com>	2021-02-19 16:19:11 -03:00
Alan Kelly	ee18edb13a	checkasm/sw_scale: properly initialize src_pixer and filter_coeff buffers Fixes valgrind uninitialised value warnings. Signed-off-by: James Almer <jamrial@gmail.com>	2021-02-19 11:20:32 -03:00
James Almer	1371647fc3	checkasm/sw_scale: use av_free() instead of free() Fixes crashes on Win64 Signed-off-by: James Almer <jamrial@gmail.com>	2021-02-17 20:57:33 -03:00
Alan Kelly	554c2bc708	swscale: move yuv2yuvX_sse3 to yasm, unrolls main loop And other small optimizations for ~20% speedup.	2021-02-17 21:21:03 +01:00
James Almer	bea7c51307	checkasm/vf_gblur: add a test for postscale_slice Signed-off-by: James Almer <jamrial@gmail.com>	2021-02-17 13:39:31 -03:00
James Almer	2df3c2ed9b	checkasm/vf_gblur: split off the horiz_slice test into its own function Will come in handy for the following commit. Signed-off-by: James Almer <jamrial@gmail.com>	2021-02-17 13:39:11 -03:00
Josh Dekker	9c513edb79	checkasm: add hevc_pel tests Co-authored-by: Niklas Haas <git@haasn.xyz> Signed-off-by: Josh Dekker <josh@itanimul.li>	2021-01-25 09:24:11 +01:00
Anton Khirnov	c8c2dfbc37	lavu: move LOCAL_ALIGNED from internal.h to mem_internal.h That is a more appropriate place for it.	2021-01-01 14:11:01 +01:00
Limin Wang	c748bd77dc	tests: fix warning ISO C90 forbids mixed declarations and code Signed-off-by: Limin Wang <lance.lmwang@gmail.com>	2020-09-10 20:34:51 +08:00
Carl Eugen Hoyos	b61376bdee	lavfi/hflip: Support Bayer pixel formats. Fixes part of ticket #8819.	2020-08-25 01:29:24 +02:00
Jiaxun Yang	e387fcd01c	libavutil: Detect MMI and MSA flags for MIPS Add MMI & MSA runtime detection for MIPS. Basically there are two code paths. For systems that natively support the CPUCFG instruction, or whose kernel emulates that instruction, we'll sense this feature from HWCAP and report the flags according to the values read from CPUCFG. For systems that have no CPUCFG (or do not export it in HWCAP), we'll parse /proc/cpuinfo instead. Signed-off-by: Jiaxun Yang <jiaxun.yang@flygoat.com> Reviewed-by: Shiyou Yin <yinshiyou-hf@loongson.cn> Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	2020-07-23 17:21:58 +02:00
James Almer	55e1bc39cb	checkasm/vf_blend: use the correct depth parameters to initialize the blend modes This effectively enables the tests that until now were just running the C version alone. Signed-off-by: James Almer <jamrial@gmail.com>	2020-07-12 11:30:23 -03:00
Jun Zhao	7f76f20fa0	checkasm: sw_rgb: Fix mixed declaration and code Fix mixed declaration and code. Reviewed-by: Michael Niedermayer <michael@niedermayer.cc> Signed-off-by: Jun Zhao <barryjzhao@tencent.com>	2020-06-01 23:28:07 +08:00
Andreas Rheinhardt	57e570b508	checkasm/sw_scale: Fix stack-buffer-overflow A buffer whose size is not a multiple of four has been initialized using consecutive writes of 32bits. This results in a stack-buffer-overflow reported by ASAN in the checkasm-sw_scale FATE-test. Reviewed-by: Michael Niedermayer <michael@niedermayer.cc> Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>	2020-05-20 23:18:50 +02:00
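The bug class described in the commit above (32-bit writes into a buffer whose size is not a multiple of four) can be sketched in C as follows. This is an illustration only, not the checkasm code: `fill_random` and the `next_word` generator are hypothetical names. Whole 32-bit words are written only while at least four bytes remain, and the tail is filled with a shorter copy.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a random source (a simple LCG). */
static uint32_t next_word(uint32_t *state)
{
    *state = *state * 1664525u + 1013904223u;
    return *state;
}

/* Fill exactly `size` bytes, even when size is not a multiple of 4. */
static void fill_random(uint8_t *buf, size_t size, uint32_t seed)
{
    size_t i = 0;

    assert(buf != NULL || size == 0);

    for (; i + 4 <= size; i += 4) {
        uint32_t w = next_word(&seed);
        memcpy(buf + i, &w, 4);        /* full 32-bit writes */
    }
    if (i < size) {
        uint32_t w = next_word(&seed);
        memcpy(buf + i, &w, size - i); /* remaining 1..3 bytes only */
    }
}
```

Filling 7 bytes of a 9-byte array this way leaves the last two bytes untouched, which is the property the original loop violated.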
Martin Storsjö	9c326af1d0	checkasm: swscale: Fix running the hscale test on 32 bit x86 This function doesn't call emms. Signed-off-by: Martin Storsjö <martin@martin.st>	2020-05-16 08:16:12 +03:00
Martin Storsjö	eba1ebd9bf	checkasm: sw_rgb: Add a test for interleaveBytes Signed-off-by: Martin Storsjö <martin@martin.st>	2020-05-15 23:38:01 +03:00
Martin Storsjö	5bdffced0a	checkasm: pixblockdsp: Add tests for get_pixels_unaligned and diff_pixels_unaligned Signed-off-by: Martin Storsjö <martin@martin.st>	2020-05-15 23:37:27 +03:00
Martin Storsjö	ed7d73355e	checkasm: aarch64: Check for stack overflows Also fill x8-x17 with garbage before calling the function. Figure out the number of stack parameters and make sure that the value on the stack after those is untouched. Signed-off-by: Martin Storsjö <martin@martin.st>	2020-05-15 21:22:36 +03:00
Martin Storsjö	6cb2d4d94b	checkasm: arm: Check for stack overflows Figure out the number of stack parameters and make sure that the value on the stack after those is untouched. Signed-off-by: Martin Storsjö <martin@martin.st>	2020-05-15 21:22:34 +03:00
Martin Storsjö	3f266cf49e	checkasm: arm: Don't use blx to call checkasm_fail_func We should just use a normal bl here, and the linker will add the 'x' bit if necessary. This fixes calling the checkasm_fail_func on windows, where the code is built in thumb mode (and the linker doesn't clear the 'x' bit in the blx instruction). Signed-off-by: Martin Storsjö <martin@martin.st>	2020-05-15 21:22:32 +03:00
Martin Storsjö	89cf9e1fb6	checkasm: arm: Make the indentation consistent with other files This makes it easier to share code with e.g. the dav1d implementation of checkasm. Signed-off-by: Martin Storsjö <martin@martin.st>	2020-05-15 21:22:27 +03:00
Josh de Kock	5913cd4e6c	checkasm: add hscale test This tests the hscale 8bpp to 14/18bpp functions with different filter sizes. Signed-off-by: Josh de Kock <josh@itanimul.li>	2020-05-15 10:29:30 +01:00
Martin Storsjö	3ce1b2bf8d	checkasm: add function to check and diff memory This was ported from dav1d (c950e7101bdf5f7117bfca816984a21e550509f0). Signed-off-by: Josh de Kock <josh@itanimul.li>	2020-05-15 10:29:30 +01:00
Linjie Fu	ddf6ca3a0e	tests/checkasm: add overflow test for hevc_add_res Add overflow test for hevc_add_res when int16_t coeff = -32768. The result of C is good, while ASM is not. To verify: make fate-checkasm-hevc_add_res ffmpeg/tests/checkasm/checkasm --test=hevc_add_res ./checkasm --test=hevc_add_res checkasm: using random seed 679391863 MMXEXT: hevc_add_res_4x4_8_mmxext (hevc_add_res.c:69) - hevc_add_res.add_residual [FAILED] SSE2: hevc_add_res_8x8_8_sse2 (hevc_add_res.c:69) hevc_add_res_16x16_8_sse2 (hevc_add_res.c:69) hevc_add_res_32x32_8_sse2 (hevc_add_res.c:69) - hevc_add_res.add_residual [FAILED] AVX: hevc_add_res_8x8_8_avx (hevc_add_res.c:69) hevc_add_res_16x16_8_avx (hevc_add_res.c:69) hevc_add_res_32x32_8_avx (hevc_add_res.c:69) - hevc_add_res.add_residual [FAILED] AVX2: hevc_add_res_32x32_8_avx2 (hevc_add_res.c:69) - hevc_add_res.add_residual [FAILED] checkasm: 8 of 14 tests have failed Signed-off-by: Xu Guangxin <guangxin.xu@intel.com> Signed-off-by: Linjie Fu <linjie.fu@intel.com> Signed-off-by: Anton Khirnov <anton@khirnov.net>	2020-03-27 10:57:40 +01:00
Linjie Fu	69b9548dd6	checkasm/hevc_add_res: prepare test data only if the function is not tested check_func will return NULL for functions that have already been tested. If the func is tested and skipped (which happens several times), there is no need to prepare data (randomize_buffers and memcpy). Move the relevant code into compare_add_res(), and prepare data and do the check only if the function is not tested. Signed-off-by: Linjie Fu <linjie.fu@intel.com> Signed-off-by: Anton Khirnov <anton@khirnov.net>	2020-03-27 10:57:40 +01:00
Martin Storsjö	5181f491ee	checkasm: sbrdsp: Fix a spurious test failure by calculating a better epsilon for sum_square Signed-off-by: Martin Storsjö <martin@martin.st>	2020-02-08 23:00:21 +02:00
Martin Storsjö	cbb254cb4c	checkasm: Check HAVE_GETSTDHANDLE here as well This was missed in `63418e374f`. Signed-off-by: Martin Storsjö <martin@martin.st>	2020-01-24 22:17:18 +02:00
Martin Storsjö	aad0e26f93	checkasm: aacpsdsp: Tolerate extra intermediate precision in stereo_interpolate The stereo_interpolate functions add h_step to the values h BUF_SIZE times. Within the stereo_interpolate C functions, the values h (h0-h3, h00-h13) are declared as local float variables, but the compiler is free to keep them in a register with extra precision. If the accumulation is rounded to 32 bit float precision after each step, the less significant bits of h_step end up ignored and the sum can deviate, affecting the end result more than the currently set EPS. By clearing the log2(BUF_SIZE) lower bits of h_step, we make sure that the accumulation shouldn't differ significantly, regardless of any extra precision in the accumulating register/variable. This fixes the aacpsdsp checkasm test when built with clang for mingw/x86_32. Signed-off-by: Martin Storsjö <martin@martin.st>	2019-12-18 15:15:29 +02:00
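Clearing the lower bits of a float, as the commit above describes for h_step, might look roughly like the following C sketch. The helper name `clear_low_bits` is hypothetical and the code is not taken from the checkasm test; it also assumes that `float` is a 32-bit IEEE 754 value.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: zero the `bits` least significant bits of a
 * 32-bit float's representation (the low end of the mantissa). */
static float clear_low_bits(float x, int bits)
{
    uint32_t u;

    assert(bits >= 0 && bits < 32);

    memcpy(&u, &x, sizeof u);   /* reinterpret without aliasing issues */
    u &= 0xFFFFFFFFu << bits;
    memcpy(&x, &u, sizeof x);
    return x;
}
```

A value such as 1.5f has no low mantissa bits set and passes through unchanged, whereas a value that differs from 1.0f only in the last mantissa bit collapses to 1.0f when four bits are cleared.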
Martin Storsjö	f58bda642d	checkasm: af_afir: Use a dynamic tolerance depending on values As the values generated by av_bmg_get can be arbitrarily large (only the stddev is specified), we can't use a fixed tolerance. Calculate a dynamic tolerance (like in float_dsp from `38f966b222`), based on the individual steps of the calculation. This fixes running this test with certain seeds, when built with clang for mingw/x86_32. Signed-off-by: Martin Storsjö <martin@martin.st>	2019-12-12 23:57:08 +02:00
Martin Storsjö	8f70e261fa	checkasm: float_dsp: Scale FLT/DBL_EPSILON sufficiently when comparing As the values generated by av_bmg_get can be arbitrarily large (only the stddev is specified), we can't use a fixed tolerance. This matches what was done for test_vector_dmul_scalar in `38f966b222`. This fixes the float_dsp checkasm test for some seeds, when built with clang for mingw/x86_32. Signed-off-by: Martin Storsjö <martin@martin.st>	2019-12-11 22:19:45 +02:00
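The idea behind the two tolerance commits above, scaling the epsilon with the magnitude of the values instead of using a fixed one, can be sketched in C as below. The function names and the exact scaling formula are assumptions made for this illustration; they are not copied from the FFmpeg tests.

```c
#include <assert.h>
#include <float.h>

static double abs_d(double x)
{
    return x < 0 ? -x : x;
}

/* Plain absolute-difference comparison against a given epsilon. */
static int near_abs(double ref, double val, double eps)
{
    assert(eps >= 0);
    return abs_d(ref - val) <= eps;
}

/* Hypothetical dynamic tolerance for a product a * b: FLT_EPSILON
 * is scaled by the magnitudes that take part in the calculation. */
static double mul_tolerance(double a, double b)
{
    double scale = abs_d(a) + abs_d(b) + abs_d(a * b) + 1.0;
    return scale * 2 * FLT_EPSILON;
}
```

For inputs around 1000, a difference of 0.0625 in a product near 1e6 is rejected by a fixed tolerance of `2 * FLT_EPSILON` but accepted by `mul_tolerance(1000.0, 1000.0)`.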
Ting Fu	9691e2a426	checkasm/vf_eq: add test for vf_eq Signed-off-by: Ting Fu <ting.fu@intel.com> Signed-off-by: Ruiling Song <ruiling.song@intel.com>	2019-09-26 08:10:31 +08:00
James Almer	1d86e4b3eb	checkasm/opusdsp: declare opus_deemphasis as a function returning a float Fixes ticket #8175 Signed-off-by: James Almer <jamrial@gmail.com>	2019-09-18 11:42:06 -03:00
Lynne	4ce1e13b54	checkasm: add opusdsp tests	2019-09-11 03:28:22 +01:00
Ruiling Song	8f4963ad25	checkasm/vf_gblur: add test for horiz_slice simd Signed-off-by: Ruiling Song <ruiling.song@intel.com>	2019-06-12 08:54:05 +08:00
James Darnley	76c370af64	checkasm: add test for v210dec	2019-05-02 19:21:37 +02:00
James Almer	0dda0f3bdb	Merge commit 'f8abf7d4dfa0504f7f65e4f1fd9d22e01cb371cc' * commit 'f8abf7d4dfa0504f7f65e4f1fd9d22e01cb371cc': checkasm/h264: test 4:2:2 chroma loop filter functions Merged-by: James Almer <jamrial@gmail.com>	2019-03-14 16:31:41 -03:00
James Almer	06476249cd	Merge commit '7e5bde93a1e7641e1622814dafac0be3f413d79b' * commit '7e5bde93a1e7641e1622814dafac0be3f413d79b': build: Rename OBJDIRS variable to OUTDIRS Merged-by: James Almer <jamrial@gmail.com>	2019-03-10 19:31:13 -03:00
Janne Grunau	f8abf7d4df	checkasm/h264: test 4:2:2 chroma loop filter functions	2019-02-27 21:57:05 +01:00
James Almer	f32d293955	Merge commit 'd7f4f5c4a18a0c9e62635cfa6fe8a9302b413c01' * commit 'd7f4f5c4a18a0c9e62635cfa6fe8a9302b413c01': checkasm/h264: add loop filter tests Merged-by: James Almer <jamrial@gmail.com>	2019-02-20 15:28:25 -03:00
Diego Biurrun	7e5bde93a1	build: Rename OBJDIRS variable to OUTDIRS These directories are not just for object files.	2019-02-16 13:09:35 +01:00
Carl Eugen Hoyos	608572ce84	tests/checkasm/checkasm: Do not define an unused function. Fixes the following warning: tests/checkasm/checkasm.c:615:12: warning: 'bench_init_ffmpeg' defined but not used	2019-01-31 20:16:17 +01:00
Janne Grunau	d7f4f5c4a1	checkasm/h264: add loop filter tests	2019-01-26 12:05:10 +01:00
James Almer	f477ee3e89	checkasm/af_afir: relax the max allowed absolute difference Should fix failures on x86_32. Signed-off-by: James Almer <jamrial@gmail.com>	2019-01-13 15:00:20 -03:00
James Almer	ba89dc27b5	checkasm: add an af_afir test Reviewed-by: Paul B Mahol <onemda@gmail.com> Signed-off-by: James Almer <jamrial@gmail.com>	2019-01-03 10:12:18 -03:00
James Almer	93bf1dcaec	checkasm/float_dsp: add test for vector_dmul Signed-off-by: James Almer <jamrial@gmail.com>	2018-09-14 12:51:55 -03:00
Andrey Semashev	d7eb8d8475	lavfi/tests: Fix 16-bit vf_blend test to avoid memory not aligned to 2 bytes Generic C implementation of vf_blend performs reads and writes of 16-bit elements, which requires the buffers to be aligned to at least a 2-byte boundary. Also, the change fixes source buffer overrun caused by src_offset being added to test handling of misaligned buffers. Fixes: #7226 Reviewed-by: Paul B Mahol <onemda@gmail.com> Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	2018-05-30 02:42:10 +02:00
Clément Bœsch	2940af9389	tests/checkasm/nlmeans: fix invalid read/write on ii buffer	2018-05-18 19:12:11 +02:00
Jun Zhao	b30575bc98	checkasm/sw_rgb: fix the function declaration warning fix the warning: "function declaration isn’t a prototype", in C int foo() and int foo(void) are different functions. int foo() accepts an arbitrary number of arguments, while int foo(void) accepts 0 arguments. Signed-off-by: Jun Zhao <mypopydev@gmail.com>	2018-05-10 19:28:51 +08:00
Clément Bœsch	f679711c1b	checkasm: add vf_nlmeans test for ssd_integral_image	2018-05-08 10:28:06 +02:00
Martin Vignali	07a566e7d6	swscale/swscale_unscaled : add X86_64 (SSE2 and AVX) for uyvyto422 and checkasm test	2018-04-22 19:15:32 +02:00
Michael Niedermayer	18d6ff2b42	tests/checkasm/checkasm: Provide verbose failure information on float_near_abs_eps() failures This will make understanding failures and adjusting EPS easier Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	2018-04-14 15:54:06 +02:00
Martin Vignali	595505083a	checkasm/vf_blend : add test for 16 bit version of grainextract grainmerge average extremity negation	2018-04-05 21:46:19 +02:00
Josh de Kock	cda43940da	checkasm/Makefile: add EXTRALIBS-libavformat Signed-off-by: Josh de Kock <josh@itanimul.li>	2018-03-31 23:20:16 +01:00
Martin Vignali	a9a7ed4f27	checkasm/swscale : add test for rgb shuffle_bytes func	2018-03-24 20:22:12 +01:00
Yingming Fan	e5b4cd4c4a	checkasm/hevc_idct : update test bit depth from 8 9 and 10 to 8 10 and 12 Signed-off-by: James Almer <jamrial@gmail.com>	2018-03-19 00:56:01 -03:00
Yingming Fan	80798e3857	checkasm/hevc_sao : add hevc_sao for checkasm Signed-off-by: James Almer <jamrial@gmail.com>	2018-03-07 23:53:32 -03:00
Martin Vignali	c0919c4985	checkasm/vf_blend : add test for blend_simple_16, phoenix_16 and difference_16	2018-02-24 21:44:23 +01:00
Martin Vignali	e3fc36a84c	checkasm/vf_blend : add depth param in order to add test for 16 bit version	2018-02-24 21:44:09 +01:00
Muhammad Faiz	81d6501be7	checkasm/Makefile: add EXTRALIBS-swresample Should fix https://ffmpeg.org/pipermail/ffmpeg-devel/2018-February/225058.html Signed-off-by: Muhammad Faiz <mfcc64@gmail.com>	2018-02-09 17:50:44 +07:00
Martin Vignali	78b982d3b9	checkasm : add test for losslessvideoencdsp for diff bytes and sub_left_pred	2018-01-28 20:23:16 +01:00
James Darnley	40d4b13228	checkasm: support for AVX-512 functions	2017-12-24 22:02:41 +01:00
James Almer	da03242778	Revert "checkasm/vf_interlace : add test for lowpass_line 8 and 16" This reverts commit `adff97be5e`. It currently fails on Windows targets. Signed-off-by: James Almer <jamrial@gmail.com>	2017-12-19 19:07:24 -03:00
Martin Vignali	adff97be5e	checkasm/vf_interlace : add test for lowpass_line 8 and 16	2017-12-19 20:59:51 +01:00
Martin Vignali	cefb7e0060	checkasm/vf_hflip : add test for vf_hflip byte and short simd	2017-12-13 11:34:29 +01:00
Martin Storsjö	18a0f42026	checkasm: Use LOCAL_ALIGNED for aligned variables on the stack This fixes fate-checkasm-hevc_mc on ARMCC 5.0 after adding NEON HEVC MC assembly. Signed-off-by: Martin Storsjö <martin@martin.st>	2017-12-12 11:36:38 +02:00
James Almer	1215889bc1	checkasm/llviddsp: fix mixed code and declarations Signed-off-by: James Almer <jamrial@gmail.com>	2017-12-10 00:51:35 -03:00
Martin Vignali	e1121f9723	checkasm/llviddsp : add test for add_gradient_pred	2017-12-09 15:19:07 +01:00
Martin Vignali	5bda11e70e	checkasm/llviddsp : test return of add_left_pred(16)	2017-12-09 15:15:56 +01:00
Martin Vignali	179a2f04eb	checkasm/vf_threshold : add test for threshold16	2017-12-09 14:47:13 +01:00
James Almer	1b324700e3	checkasm/vf_threshold: fix mixed code and declarations Signed-off-by: James Almer <jamrial@gmail.com>	2017-12-04 15:46:09 -03:00
Martin Vignali	cfce442750	checkasm/vf_threshold : add checkasm test for threshold8	2017-12-03 19:17:15 +01:00
Martin Vignali	9bed17cd0f	checkasm/utvideo : be more explicit to the WIDTH_PADDED define	2017-12-01 21:08:07 +01:00
Michael Niedermayer	38f966b222	tests/checkasm/float_dsp: Increase allowed difference for float_dsp.vector_dmul Tested for 10000 iterations on x86-32 Fixes: Ticket6848 Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	2017-11-27 03:31:53 +01:00
James Almer	bea8eeaa2c	checkasm/utvideodsp: zero initialize the entire buffer Signed-off-by: James Almer <jamrial@gmail.com>	2017-11-21 11:24:38 -03:00
James Almer	9a05c873cf	checkasm/utvideodsp: fix mixed declarations and code Signed-off-by: James Almer <jamrial@gmail.com>	2017-11-21 11:13:24 -03:00
Martin Vignali	4a6aa6d1b2	checkasm : add test for huffyuvdsp add_int16	2017-11-21 09:41:42 +01:00
Martin Vignali	6a7eb65e1b	checkasm : add utvideodsp test	2017-11-21 09:00:27 +01:00
James Almer	501435e5e6	checkasm/jpeg2000dsp: add test for ict_float Reviewed-by: Michael Niedermayer <michael@niedermayer.cc> Signed-off-by: James Almer <jamrial@gmail.com>	2017-11-20 18:33:57 -03:00
James Almer	20a93ea8d4	checkasm/jpeg2000dsp: refactor rct_int test Signed-off-by: James Almer <jamrial@gmail.com>	2017-11-20 18:33:57 -03:00
James Almer	0cef66c906	Merge commit '516c479172755c63063180b0c0953b68b670cdbd' * commit '516c479172755c63063180b0c0953b68b670cdbd': checkasm: Test more h264 idct variants See `2d263188ba` Merged-by: James Almer <jamrial@gmail.com>	2017-11-11 15:21:22 -03:00
James Almer	2d263188ba	Merge commit '547db1eaecd597031165a2bf637acaaacde52788' * commit '547db1eaecd597031165a2bf637acaaacde52788': checkasm: Test more h264 idct variants Merged-by: James Almer <jamrial@gmail.com>	2017-11-11 13:18:55 -03:00
James Almer	4cfb46f94f	checkasm/llviddsp: fix warnings about mixed declaration and code Signed-off-by: James Almer <jamrial@gmail.com>	2017-11-08 14:54:15 -03:00
Martin Vignali	fbe9148779	checkasm/llviddsp : add test for other dsp func add_median_pred add_left_pred : add two func one with acc 0, and one with random acc add_left_pred16	2017-11-07 00:54:17 +01:00
James Almer	122a749dfc	Merge commit 'd05c9cde0e87c23ca42957646bea483dfc09d6bf' * commit 'd05c9cde0e87c23ca42957646bea483dfc09d6bf': checkasm: aarch64: Specify alignment for the register_init const array Merged-by: James Almer <jamrial@gmail.com>	2017-10-30 21:03:50 -03:00
James Almer	f568d9d0ba	Merge commit 'e00db9f78bb475ed5103364f61892f4e75ef89ba' * commit 'e00db9f78bb475ed5103364f61892f4e75ef89ba': checkasm: hevc: Add a hevc_ prefix to the add_residual functions Merged-by: James Almer <jamrial@gmail.com>	2017-10-28 18:18:41 -03:00
James Almer	6dfcbd80ad	Merge commit '7cb1d9e2dbbe5bf4652be5d78cdd68e956fa3d63' * commit '7cb1d9e2dbbe5bf4652be5d78cdd68e956fa3d63': build: Fine-grained link-time dependency settings Also included are bug fix commits `5ff3b5cafc`, `d9da7151ee` and `5e27ef800b`. Merged-by: James Almer <jamrial@gmail.com>	2017-10-11 17:55:25 -03:00
Martin Vignali	cbbec68847	libavcodec/blockdsp : add AVX version Also modify the required alignment, to 32 instead of 16 for several codecs Signed-off-by: James Almer <jamrial@gmail.com>	2017-10-03 19:47:37 -03:00
Martin Vignali	ac5908b13f	libavcodec/exr : add x86 SIMD for predictor Signed-off-by: James Almer <jamrial@gmail.com>	2017-10-01 17:35:30 -03:00
Martin Storsjö	516c479172	checkasm: Test more h264 idct variants Signed-off-by: Martin Storsjö <martin@martin.st>	2017-09-27 13:58:39 +03:00
James Almer	7323c896b2	checkasm: add an exrdsp test Signed-off-by: James Almer <jamrial@gmail.com>	2017-09-17 19:01:40 -03:00
Clément Bœsch	e0d56f097f	checkasm: use perf API on Linux ARM* On ARM platforms, accessing the PMU registers requires special user access permissions. Since there is no other way to get accurate timers, the current implementation of timers in FFmpeg relies on these registers. Unfortunately, enabling user access to these registers on Linux is not trivial, and generally involves compiling a random and unreliable github kernel module, or somehow patching your kernel. Such a module is very unlikely to reach upstream anytime soon. Quoting Robin Murphin from ARM: > Say you do give userspace direct access to the PMU; now run two or more > programs at once that believe they can use the counters for their own > "minimal-overhead" profiling. Have fun interpreting those results... > > And that's not even getting into the implications of scheduling across > different CPUs, CPUidle, etc. where the PMU state is completely beyond > userspace's control. In general, the plan to provide userspace with > something which might happen to just about work in a few corner cases, > but is meaningless, misleading or downright broken in all others, is to > never do so. As a result, the alternative is to use the Performance Monitoring Linux API which makes use of these registers internally (assuming the PMU of your ARM board is supported in the kernel, which is definitely not a given...). While the Linux API is obviously cross platform, it does have a significant overhead which needs to be taken into account. As a result, that mode is only weakly enabled on ARM platforms exclusively. Note on the inflexibility of the implementation: the timers (native FFmpeg vs Linux API) are selected at compilation time to avoid the need for function calls, which would have a negative impact on the cycle counters.	2017-09-08 18:51:05 +02:00
Martin Storsjö	e12f1cd616	Revert "checkasm: Test more h264 idct variants" This reverts commit `547db1eaec`. This commit wasn't supposed to be pushed (yet) since it hasn't been reviewed. Signed-off-by: Martin Storsjö <martin@martin.st>	2017-09-02 22:23:30 +03:00
Martin Storsjö	547db1eaec	checkasm: Test more h264 idct variants	2017-08-31 14:55:34 +03:00
James Almer	e51073fe00	checkasm/vf_blend: rename addition128 and difference128 to grainmerge and grainextract This was missing from `f8d0689d3f`. Fixes checkasm.	2017-08-24 23:39:09 -03:00
James Almer	6f205a42d7	checkasm: add hybrid_analysis_ileave and hybrid_synthesis_deint tests to aacpsdsp Signed-off-by: James Almer <jamrial@gmail.com>	2017-07-13 17:03:28 -03:00
James Almer	823cc7e25f	checkasm: add a g722dsp test Signed-off-by: James Almer <jamrial@gmail.com>	2017-07-13 17:00:19 -03:00
James Almer	3d3243577c	checkasm: use declare_func_float() in sbrdsp sum_square test The function returns a float. This fixes the test in x86_32 targets. Signed-off-by: James Almer <jamrial@gmail.com>	2017-07-04 23:02:57 -03:00
Matthieu Bouron	7864e07f4a	checkasm: add sbrdsp tests	2017-07-03 14:28:17 +02:00
James Almer	0eb783eb06	checkasm: randomize the full input buffer in test_hybrid_analysis Missed in the last commit.	2017-06-30 22:49:54 -03:00
James Almer	fb7b477a91	checkasm: fix size of input buffer in test_hybrid_analysis	2017-06-30 20:37:06 -03:00
Clément Bœsch	b12a36170b	lavc/aacpsdsp: use ptrdiff_t for stride in hybrid_analysis	2017-06-28 12:22:39 +02:00
Clément Bœsch	edd041e64c	checkasm: add AAC PS tests This includes various fixes and improvements from James Almer. Signed-off-by: James Almer <jamrial@gmail.com>	2017-06-28 12:22:39 +02:00
James Almer	fa50d9360b	x86/vf_blend: add sse and ssse3 extremity functions Reviewed-by: Paul B Mahol <onemda@gmail.com> Signed-off-by: James Almer <jamrial@gmail.com>	2017-06-27 13:17:23 -03:00
James Almer	a579dbb4f7	checkasm: add missing checks to float_dsp's butterflies_float test	2017-06-23 23:38:07 -03:00
Matthieu Bouron	067e42b851	checkasm/aarch64: fix tests returning a float Avoids overriding the v0 register (which contains the result of the tested function) in checkasm_call_checked.	2017-06-22 09:18:10 +02:00
Diego Biurrun	fd502f4f5f	build: Generalize yasm/nasm-related variable names None of them are specific to the YASM assembler. (Cherry-picked from libav commit `39e208f4d4`) Signed-off-by: James Almer <jamrial@gmail.com>	2017-06-21 17:00:29 -03:00
James Almer	5b10f484e2	checkasm: add float_dsp tests Ported from libavutil/tests/float_dsp.c Signed-off-by: James Almer <jamrial@gmail.com>	2017-06-14 19:20:10 -03:00
James Almer	37388b119c	checkasm: add a checkasm_checked_call function that doesn't issue emms Meant for DSP functions returning a float or double, as they'd fail if emms is called after every run on x86_32. Signed-off-by: James Almer <jamrial@gmail.com>	2017-06-14 19:18:56 -03:00
James Almer	93dc1c1221	checkasm: add _fixed suffix to fixed_dsp tests Should prevent future conflicts with the similarly named floatdsp tests	2017-06-01 13:12:20 -03:00
Martin Storsjö	d05c9cde0e	checkasm: aarch64: Specify alignment for the register_init const array Loads from this don't strictly require alignment, but specify it just for consistency with the arm version. Signed-off-by: Martin Storsjö <martin@martin.st>	2017-05-15 10:19:46 +03:00
Martin Storsjö	e00db9f78b	checkasm: hevc: Add a hevc_ prefix to the add_residual functions This makes it easier to group them with the rest when running e.g. --bench=hevc. Signed-off-by: Martin Storsjö <martin@martin.st>	2017-04-21 13:32:44 +03:00
James Almer	7b3cb953f7	checkasm: add fixed_dsp tests Tested-by: Michael Niedermayer <michael@niedermayer.cc> Signed-off-by: James Almer <jamrial@gmail.com>	2017-04-11 18:05:13 -03:00
Clément Bœsch	210678d3c5	Merge commit '3794062ab1a13442b06f6d76c54dce51ffa54697' * commit '3794062ab1a13442b06f6d76c54dce51ffa54697': Remove Plan 9 support Merged-by: Clément Bœsch <u@pkh.me>	2017-04-09 14:52:00 +02:00
James Almer	6747fc436e	Merge commit 'effc1430b2fe5997d9d55bf28dc507c27125eb27' * commit 'effc1430b2fe5997d9d55bf28dc507c27125eb27': Revert "checkasm: vp9dsp: Benchmark the dc-only version of idct_idct separately" Merged-by: James Almer <jamrial@gmail.com>	2017-04-04 15:26:18 -03:00
Clément Bœsch	edfa7ac8ec	Merge commit '81d7f0bbca837afda1f7e60d3ae52ab1360ab44b' * commit '81d7f0bbca837afda1f7e60d3ae52ab1360ab44b': checkasm: vp9dsp: Benchmark the dc-only version of idct_idct separately Merged-by: Clément Bœsch <u@pkh.me>	2017-04-01 11:54:29 +02:00
Clément Bœsch	b589e83f43	Merge commit '9498237049d15812cecb79df47b196c73013908b' * commit '9498237049d15812cecb79df47b196c73013908b': checkasm: Add --test parameter to check only specific components Merged-by: Clément Bœsch <cboesch@gopro.com>	2017-03-31 10:06:13 +02:00
Clément Bœsch	1c9f4b5078	lavc/vp9: split into vp9{block,data,mvs} This is following Libav layout to ease merges.	2017-03-27 21:38:21 +02:00
James Almer	09ce5519f3	fate/checkasm: fix use of uninitialized memory on hevc_add_res tests	2017-03-24 22:11:34 -03:00
James Almer	36eae45510	fate/checkasm: use LOCAL_ALIGNED_32 on hevc_add_res tests	2017-03-24 22:11:22 -03:00
Clément Bœsch	3d4039f964	Merge commit 'ed48a9d8143d2575a4458589cebde69ec326afd8' * commit 'ed48a9d8143d2575a4458589cebde69ec326afd8': checkasm: Add a test for HEVC add_residual Merged-by: Clément Bœsch <u@pkh.me>	2017-03-24 12:37:09 +01:00
James Almer	0d34473d8e	Merge commit 'dd5d4a0e1e3a30a254d1a57ecbdcedf230c6014b' * commit 'dd5d4a0e1e3a30a254d1a57ecbdcedf230c6014b': checkasm: aarch64: Don't clobber x29 in checkasm_stack_clobber Merged-by: James Almer <jamrial@gmail.com>	2017-03-23 18:31:36 -03:00
James Almer	f23078904f	Merge commit '2816f8a8bb33bd67fec5e94f5d357918caf4e055' * commit '2816f8a8bb33bd67fec5e94f5d357918caf4e055': build: Drop arch-specific checkasm Makefiles Merged-by: James Almer <jamrial@gmail.com>	2017-03-23 18:01:47 -03:00
James Almer	3ddae9eee9	Merge commit '93d5b022a9fd3a1a1f9c521a1eac7f0410e05b81' * commit '93d5b022a9fd3a1a1f9c521a1eac7f0410e05b81': build: Drop duplicate asm recipe Merged-by: James Almer <jamrial@gmail.com>	2017-03-23 17:57:35 -03:00
James Almer	67b639b496	Merge commit 'c91d6a33f872574c95c8784277cf60ffcf6bff4f' * commit 'c91d6a33f872574c95c8784277cf60ffcf6bff4f': checkasm: aarch64: Add filler args to make sure all parameters are passed on the stack Merged-by: James Almer <jamrial@gmail.com>	2017-03-23 17:38:20 -03:00
James Almer	a2d34cc51b	Merge commit 'f1b3e131385176c3c9d9783b25047856a0dcebf6' * commit 'f1b3e131385176c3c9d9783b25047856a0dcebf6': checkasm: aarch64: Clobber the stack before calling functions Merged-by: James Almer <jamrial@gmail.com>	2017-03-23 17:36:53 -03:00
James Almer	cab4c7fa19	Merge commit 'a05cc56124b4f1237f6355784de821e3290ddb44' * commit 'a05cc56124b4f1237f6355784de821e3290ddb44': checkasm: arm/aarch64: Fix the amount of space reserved for stack parameters Merged-by: James Almer <jamrial@gmail.com>	2017-03-23 17:35:38 -03:00

433 Commits