Andreas Rheinhardt
636631d9db
Remove unnecessary libavutil/(avutil|common|internal).h inclusions
...
Some of these were made possible by moving several common macros to
libavutil/macros.h.
While just at it, also improve the other headers a bit.
Reviewed-by: Martin Storsjö <martin@martin.st>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2022-02-24 12:56:49 +01:00
Andreas Rheinhardt
155cd6baa4
Remove obsolete version.h inclusions
...
Forgotten in e7bd47e657
.
Reviewed-by: Martin Storsjö <martin@martin.st>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2022-02-24 12:56:49 +01:00
Alan Kelly
e534d98af3
libswscale: Re-factor ff_shuffle_filter_coefficients.
...
Make the code more readable and follow the style guide.
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2022-02-17 17:17:22 +01:00
Alan Kelly
f1a5414c97
libswscale: Check and propagate memory allocation errors from ff_shuffle_filter_coefficients.
...
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2022-02-17 17:17:07 +01:00
Andreas Rheinhardt
71e2825150
swscale/x86/swscale: Remove superfluous and invalid ';'
...
Inside a function an unnecessary ';' is just a null statement;
yet outside of it it is actually illegal (but compilers happen
to accept it without warning except when using -pedantic).
So modify the macros to always expect the user to add a ';'.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2022-01-22 17:00:45 +01:00
Mark Reid
52f7026164
swscale/x86/input.asm: add x86-optimized planer rgb2yuv functions
...
sse2 only operates on 2 lanes per loop for to_y and to_uv functions, due
to the lack of pmulld instruction. Emulating pmulld with 2 pmuludq and shuffles
proved too costly and made to_uv functions slower then the c implementation.
For to_y on sse2 only float functions are generated,
I was are not able outperform the c implementation on the integer pixel formats.
For to_a on see4 only the float functions are generated.
sse2 and sse4 generated nearly identical performing code on integer pixel formats,
so only sse2/avx2 versions are generated.
planar_gbrp_to_y_512_c: 1197.5
planar_gbrp_to_y_512_sse4: 444.5
planar_gbrp_to_y_512_avx2: 287.5
planar_gbrap_to_y_512_c: 1204.5
planar_gbrap_to_y_512_sse4: 447.5
planar_gbrap_to_y_512_avx2: 289.5
planar_gbrp9be_to_y_512_c: 1380.0
planar_gbrp9be_to_y_512_sse4: 543.5
planar_gbrp9be_to_y_512_avx2: 340.0
planar_gbrp9le_to_y_512_c: 1200.5
planar_gbrp9le_to_y_512_sse4: 442.0
planar_gbrp9le_to_y_512_avx2: 282.0
planar_gbrp10be_to_y_512_c: 1378.5
planar_gbrp10be_to_y_512_sse4: 544.0
planar_gbrp10be_to_y_512_avx2: 337.5
planar_gbrp10le_to_y_512_c: 1200.0
planar_gbrp10le_to_y_512_sse4: 448.0
planar_gbrp10le_to_y_512_avx2: 285.5
planar_gbrap10be_to_y_512_c: 1380.0
planar_gbrap10be_to_y_512_sse4: 542.0
planar_gbrap10be_to_y_512_avx2: 340.5
planar_gbrap10le_to_y_512_c: 1199.0
planar_gbrap10le_to_y_512_sse4: 446.0
planar_gbrap10le_to_y_512_avx2: 289.5
planar_gbrp12be_to_y_512_c: 10563.0
planar_gbrp12be_to_y_512_sse4: 542.5
planar_gbrp12be_to_y_512_avx2: 339.0
planar_gbrp12le_to_y_512_c: 1201.0
planar_gbrp12le_to_y_512_sse4: 440.5
planar_gbrp12le_to_y_512_avx2: 286.0
planar_gbrap12be_to_y_512_c: 1701.5
planar_gbrap12be_to_y_512_sse4: 917.0
planar_gbrap12be_to_y_512_avx2: 338.5
planar_gbrap12le_to_y_512_c: 1201.0
planar_gbrap12le_to_y_512_sse4: 444.5
planar_gbrap12le_to_y_512_avx2: 288.0
planar_gbrp14be_to_y_512_c: 1370.5
planar_gbrp14be_to_y_512_sse4: 545.0
planar_gbrp14be_to_y_512_avx2: 338.5
planar_gbrp14le_to_y_512_c: 1199.0
planar_gbrp14le_to_y_512_sse4: 444.0
planar_gbrp14le_to_y_512_avx2: 279.5
planar_gbrp16be_to_y_512_c: 1364.0
planar_gbrp16be_to_y_512_sse4: 544.5
planar_gbrp16be_to_y_512_avx2: 339.5
planar_gbrp16le_to_y_512_c: 1201.0
planar_gbrp16le_to_y_512_sse4: 445.5
planar_gbrp16le_to_y_512_avx2: 280.5
planar_gbrap16be_to_y_512_c: 1377.0
planar_gbrap16be_to_y_512_sse4: 545.0
planar_gbrap16be_to_y_512_avx2: 338.5
planar_gbrap16le_to_y_512_c: 1201.0
planar_gbrap16le_to_y_512_sse4: 442.0
planar_gbrap16le_to_y_512_avx2: 279.0
planar_gbrpf32be_to_y_512_c: 4113.0
planar_gbrpf32be_to_y_512_sse2: 2438.0
planar_gbrpf32be_to_y_512_sse4: 1068.0
planar_gbrpf32be_to_y_512_avx2: 904.5
planar_gbrpf32le_to_y_512_c: 3818.5
planar_gbrpf32le_to_y_512_sse2: 2024.5
planar_gbrpf32le_to_y_512_sse4: 1241.5
planar_gbrpf32le_to_y_512_avx2: 657.0
planar_gbrapf32be_to_y_512_c: 3707.0
planar_gbrapf32be_to_y_512_sse2: 2444.0
planar_gbrapf32be_to_y_512_sse4: 1077.0
planar_gbrapf32be_to_y_512_avx2: 909.0
planar_gbrapf32le_to_y_512_c: 3822.0
planar_gbrapf32le_to_y_512_sse2: 2024.5
planar_gbrapf32le_to_y_512_sse4: 1176.0
planar_gbrapf32le_to_y_512_avx2: 658.5
planar_gbrp_to_uv_512_c: 2325.8
planar_gbrp_to_uv_512_sse2: 1726.8
planar_gbrp_to_uv_512_sse4: 771.8
planar_gbrp_to_uv_512_avx2: 506.8
planar_gbrap_to_uv_512_c: 2281.8
planar_gbrap_to_uv_512_sse2: 1726.3
planar_gbrap_to_uv_512_sse4: 768.3
planar_gbrap_to_uv_512_avx2: 496.3
planar_gbrp9be_to_uv_512_c: 2336.8
planar_gbrp9be_to_uv_512_sse2: 1924.8
planar_gbrp9be_to_uv_512_sse4: 852.3
planar_gbrp9be_to_uv_512_avx2: 552.8
planar_gbrp9le_to_uv_512_c: 2270.3
planar_gbrp9le_to_uv_512_sse2: 1512.3
planar_gbrp9le_to_uv_512_sse4: 764.3
planar_gbrp9le_to_uv_512_avx2: 491.3
planar_gbrp10be_to_uv_512_c: 2281.8
planar_gbrp10be_to_uv_512_sse2: 1917.8
planar_gbrp10be_to_uv_512_sse4: 855.3
planar_gbrp10be_to_uv_512_avx2: 541.3
planar_gbrp10le_to_uv_512_c: 2269.8
planar_gbrp10le_to_uv_512_sse2: 1515.3
planar_gbrp10le_to_uv_512_sse4: 759.8
planar_gbrp10le_to_uv_512_avx2: 487.8
planar_gbrap10be_to_uv_512_c: 2382.3
planar_gbrap10be_to_uv_512_sse2: 1924.8
planar_gbrap10be_to_uv_512_sse4: 855.3
planar_gbrap10be_to_uv_512_avx2: 540.8
planar_gbrap10le_to_uv_512_c: 2382.3
planar_gbrap10le_to_uv_512_sse2: 1512.3
planar_gbrap10le_to_uv_512_sse4: 759.3
planar_gbrap10le_to_uv_512_avx2: 484.8
planar_gbrp12be_to_uv_512_c: 2283.8
planar_gbrp12be_to_uv_512_sse2: 1936.8
planar_gbrp12be_to_uv_512_sse4: 858.3
planar_gbrp12be_to_uv_512_avx2: 541.3
planar_gbrp12le_to_uv_512_c: 2278.8
planar_gbrp12le_to_uv_512_sse2: 1507.3
planar_gbrp12le_to_uv_512_sse4: 760.3
planar_gbrp12le_to_uv_512_avx2: 485.8
planar_gbrap12be_to_uv_512_c: 2385.3
planar_gbrap12be_to_uv_512_sse2: 1927.8
planar_gbrap12be_to_uv_512_sse4: 855.3
planar_gbrap12be_to_uv_512_avx2: 539.8
planar_gbrap12le_to_uv_512_c: 2377.3
planar_gbrap12le_to_uv_512_sse2: 1516.3
planar_gbrap12le_to_uv_512_sse4: 759.3
planar_gbrap12le_to_uv_512_avx2: 484.8
planar_gbrp14be_to_uv_512_c: 2283.8
planar_gbrp14be_to_uv_512_sse2: 1935.3
planar_gbrp14be_to_uv_512_sse4: 852.3
planar_gbrp14be_to_uv_512_avx2: 540.3
planar_gbrp14le_to_uv_512_c: 2276.8
planar_gbrp14le_to_uv_512_sse2: 1514.8
planar_gbrp14le_to_uv_512_sse4: 762.3
planar_gbrp14le_to_uv_512_avx2: 484.8
planar_gbrp16be_to_uv_512_c: 2383.3
planar_gbrp16be_to_uv_512_sse2: 1881.8
planar_gbrp16be_to_uv_512_sse4: 852.3
planar_gbrp16be_to_uv_512_avx2: 541.8
planar_gbrp16le_to_uv_512_c: 2378.3
planar_gbrp16le_to_uv_512_sse2: 1476.8
planar_gbrp16le_to_uv_512_sse4: 765.3
planar_gbrp16le_to_uv_512_avx2: 485.8
planar_gbrap16be_to_uv_512_c: 2382.3
planar_gbrap16be_to_uv_512_sse2: 1886.3
planar_gbrap16be_to_uv_512_sse4: 853.8
planar_gbrap16be_to_uv_512_avx2: 550.8
planar_gbrap16le_to_uv_512_c: 2381.8
planar_gbrap16le_to_uv_512_sse2: 1488.3
planar_gbrap16le_to_uv_512_sse4: 765.3
planar_gbrap16le_to_uv_512_avx2: 491.8
planar_gbrpf32be_to_uv_512_c: 4863.0
planar_gbrpf32be_to_uv_512_sse2: 3347.5
planar_gbrpf32be_to_uv_512_sse4: 1800.0
planar_gbrpf32be_to_uv_512_avx2: 1199.0
planar_gbrpf32le_to_uv_512_c: 4725.0
planar_gbrpf32le_to_uv_512_sse2: 2753.0
planar_gbrpf32le_to_uv_512_sse4: 1474.5
planar_gbrpf32le_to_uv_512_avx2: 927.5
planar_gbrapf32be_to_uv_512_c: 4859.0
planar_gbrapf32be_to_uv_512_sse2: 3269.0
planar_gbrapf32be_to_uv_512_sse4: 1802.0
planar_gbrapf32be_to_uv_512_avx2: 1201.5
planar_gbrapf32le_to_uv_512_c: 6338.0
planar_gbrapf32le_to_uv_512_sse2: 2756.5
planar_gbrapf32le_to_uv_512_sse4: 1476.0
planar_gbrapf32le_to_uv_512_avx2: 908.5
planar_gbrap_to_a_512_c: 383.3
planar_gbrap_to_a_512_sse2: 66.8
planar_gbrap_to_a_512_avx2: 43.8
planar_gbrap10be_to_a_512_c: 601.8
planar_gbrap10be_to_a_512_sse2: 86.3
planar_gbrap10be_to_a_512_avx2: 34.8
planar_gbrap10le_to_a_512_c: 602.3
planar_gbrap10le_to_a_512_sse2: 48.8
planar_gbrap10le_to_a_512_avx2: 31.3
planar_gbrap12be_to_a_512_c: 601.8
planar_gbrap12be_to_a_512_sse2: 111.8
planar_gbrap12be_to_a_512_avx2: 41.3
planar_gbrap12le_to_a_512_c: 385.8
planar_gbrap12le_to_a_512_sse2: 75.3
planar_gbrap12le_to_a_512_avx2: 39.8
planar_gbrap16be_to_a_512_c: 386.8
planar_gbrap16be_to_a_512_sse2: 79.8
planar_gbrap16be_to_a_512_avx2: 31.3
planar_gbrap16le_to_a_512_c: 600.3
planar_gbrap16le_to_a_512_sse2: 40.3
planar_gbrap16le_to_a_512_avx2: 30.3
planar_gbrapf32be_to_a_512_c: 1148.8
planar_gbrapf32be_to_a_512_sse2: 611.3
planar_gbrapf32be_to_a_512_sse4: 234.8
planar_gbrapf32be_to_a_512_avx2: 183.3
planar_gbrapf32le_to_a_512_c: 851.3
planar_gbrapf32le_to_a_512_sse2: 263.3
planar_gbrapf32le_to_a_512_sse4: 199.3
planar_gbrapf32le_to_a_512_avx2: 156.8
Reviewed-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: James Almer <jamrial@gmail.com>
2022-01-11 16:34:33 -03:00
Mark Reid
9e445a5be2
swscale/x86/output.asm: add x86-optimized planer gbr yuv2anyX functions
...
changes since v2:
* fixed label
changes since v1:
* remove vex intruction on sse4 path
* some load/pack marcos use less intructions
* fixed some typos
yuv2gbrp_full_X_4_512_c: 12757.6
yuv2gbrp_full_X_4_512_sse2: 8946.6
yuv2gbrp_full_X_4_512_sse4: 5138.6
yuv2gbrp_full_X_4_512_avx2: 3889.6
yuv2gbrap_full_X_4_512_c: 15368.6
yuv2gbrap_full_X_4_512_sse2: 11916.1
yuv2gbrap_full_X_4_512_sse4: 6294.6
yuv2gbrap_full_X_4_512_avx2: 3477.1
yuv2gbrp9be_full_X_4_512_c: 14381.6
yuv2gbrp9be_full_X_4_512_sse2: 9139.1
yuv2gbrp9be_full_X_4_512_sse4: 5150.1
yuv2gbrp9be_full_X_4_512_avx2: 2834.6
yuv2gbrp9le_full_X_4_512_c: 12990.1
yuv2gbrp9le_full_X_4_512_sse2: 9118.1
yuv2gbrp9le_full_X_4_512_sse4: 5132.1
yuv2gbrp9le_full_X_4_512_avx2: 2833.1
yuv2gbrp10be_full_X_4_512_c: 14401.6
yuv2gbrp10be_full_X_4_512_sse2: 9133.1
yuv2gbrp10be_full_X_4_512_sse4: 5126.1
yuv2gbrp10be_full_X_4_512_avx2: 2837.6
yuv2gbrp10le_full_X_4_512_c: 12718.1
yuv2gbrp10le_full_X_4_512_sse2: 9106.1
yuv2gbrp10le_full_X_4_512_sse4: 5120.1
yuv2gbrp10le_full_X_4_512_avx2: 2826.1
yuv2gbrap10be_full_X_4_512_c: 18535.6
yuv2gbrap10be_full_X_4_512_sse2: 33617.6
yuv2gbrap10be_full_X_4_512_sse4: 6264.1
yuv2gbrap10be_full_X_4_512_avx2: 3422.1
yuv2gbrap10le_full_X_4_512_c: 16724.1
yuv2gbrap10le_full_X_4_512_sse2: 11787.1
yuv2gbrap10le_full_X_4_512_sse4: 6282.1
yuv2gbrap10le_full_X_4_512_avx2: 3441.6
yuv2gbrp12be_full_X_4_512_c: 13723.6
yuv2gbrp12be_full_X_4_512_sse2: 9128.1
yuv2gbrp12be_full_X_4_512_sse4: 7997.6
yuv2gbrp12be_full_X_4_512_avx2: 2844.1
yuv2gbrp12le_full_X_4_512_c: 12257.1
yuv2gbrp12le_full_X_4_512_sse2: 9107.6
yuv2gbrp12le_full_X_4_512_sse4: 5142.6
yuv2gbrp12le_full_X_4_512_avx2: 2837.6
yuv2gbrap12be_full_X_4_512_c: 18511.1
yuv2gbrap12be_full_X_4_512_sse2: 12156.6
yuv2gbrap12be_full_X_4_512_sse4: 6251.1
yuv2gbrap12be_full_X_4_512_avx2: 3444.6
yuv2gbrap12le_full_X_4_512_c: 16687.1
yuv2gbrap12le_full_X_4_512_sse2: 11785.1
yuv2gbrap12le_full_X_4_512_sse4: 6243.6
yuv2gbrap12le_full_X_4_512_avx2: 3446.1
yuv2gbrp14be_full_X_4_512_c: 13690.6
yuv2gbrp14be_full_X_4_512_sse2: 9120.6
yuv2gbrp14be_full_X_4_512_sse4: 5138.1
yuv2gbrp14be_full_X_4_512_avx2: 2843.1
yuv2gbrp14le_full_X_4_512_c: 14995.6
yuv2gbrp14le_full_X_4_512_sse2: 9119.1
yuv2gbrp14le_full_X_4_512_sse4: 5126.1
yuv2gbrp14le_full_X_4_512_avx2: 2843.1
yuv2gbrp16be_full_X_4_512_c: 12367.1
yuv2gbrp16be_full_X_4_512_sse2: 8233.6
yuv2gbrp16be_full_X_4_512_sse4: 4820.1
yuv2gbrp16be_full_X_4_512_avx2: 2666.6
yuv2gbrp16le_full_X_4_512_c: 10904.1
yuv2gbrp16le_full_X_4_512_sse2: 8214.1
yuv2gbrp16le_full_X_4_512_sse4: 4824.1
yuv2gbrp16le_full_X_4_512_avx2: 2629.1
yuv2gbrap16be_full_X_4_512_c: 26569.6
yuv2gbrap16be_full_X_4_512_sse2: 10884.1
yuv2gbrap16be_full_X_4_512_sse4: 5488.1
yuv2gbrap16be_full_X_4_512_avx2: 3272.1
yuv2gbrap16le_full_X_4_512_c: 14010.1
yuv2gbrap16le_full_X_4_512_sse2: 10562.1
yuv2gbrap16le_full_X_4_512_sse4: 5463.6
yuv2gbrap16le_full_X_4_512_avx2: 3255.1
yuv2gbrpf32be_full_X_4_512_c: 14524.1
yuv2gbrpf32be_full_X_4_512_sse2: 8552.6
yuv2gbrpf32be_full_X_4_512_sse4: 4636.1
yuv2gbrpf32be_full_X_4_512_avx2: 2474.6
yuv2gbrpf32le_full_X_4_512_c: 13060.6
yuv2gbrpf32le_full_X_4_512_sse2: 9682.6
yuv2gbrpf32le_full_X_4_512_sse4: 4298.1
yuv2gbrpf32le_full_X_4_512_avx2: 2453.1
yuv2gbrapf32be_full_X_4_512_c: 18629.6
yuv2gbrapf32be_full_X_4_512_sse2: 11363.1
yuv2gbrapf32be_full_X_4_512_sse4: 15201.6
yuv2gbrapf32be_full_X_4_512_avx2: 3727.1
yuv2gbrapf32le_full_X_4_512_c: 16677.6
yuv2gbrapf32le_full_X_4_512_sse2: 10221.6
yuv2gbrapf32le_full_X_4_512_sse4: 5693.6
yuv2gbrapf32le_full_X_4_512_avx2: 3656.6
Reviewed-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: James Almer <jamrial@gmail.com>
2022-01-11 16:33:17 -03:00
rcombs
df9180d8a0
swscale/output: use isSwappedChroma
2022-01-04 19:39:22 -06:00
rcombs
cb3a6cc082
swscale/output: use isSemiPlanarYUV for NV12/21/24/42 case
2022-01-04 19:39:22 -06:00
rcombs
f8e284be69
swscale: introduce isSwappedChroma
2022-01-04 19:39:22 -06:00
rcombs
bb4f19f2a2
swscale/output: use isDataInHighBits for 10-bit case
...
This code will need fleshing-out (probably templating) if we ever add
e.g. a P012 format.
2022-01-04 19:39:22 -06:00
rcombs
cf9e8cb52f
swscale/output: use isSemiPlanarYUV for 16-bit case
2022-01-04 19:39:22 -06:00
rcombs
e5d83463c8
swscale: introduce isDataInHighBits
2022-01-04 19:39:22 -06:00
rcombs
cb87a3b137
swscale/output: template-ize yuv2nv12cX 10-bit and 16-bit cases
...
Fixes incorrect big-endian output introduced in 88d804b7ff
Avoids making the filter-time BE check more expensive
2022-01-04 19:39:22 -06:00
Andreas Rheinhardt
b189550137
lib*/version.h: Bump Versions after release/5.0 branch
...
This is done a second time for 5.0 because master was
merged into 5.0 so that it contains the recent DOVI additions.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2022-01-04 14:29:06 +01:00
Andreas Rheinhardt
c512be9a90
lib*/version.h: Bump Versions before release/5.0 branch
...
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2022-01-04 13:40:03 +01:00
Andreas Rheinhardt
20b0d24c2f
Makefile: Redo duplicating object files in shared builds
...
In case of shared builds, some object files containing tables
are currently duplicated into other libraries: log2_tab.c,
golomb.c, reverse.c. The check for whether this is duplicated
is simply whether CONFIG_SHARED is true. Yet this is crude:
E.g. libavdevice includes reverse.c for shared builds, but only
needs it for the decklink input device, which given that decklink
is not enabled by default will be unused in most libavdevice.so.
This commit changes this by making it more explicit about what
to duplicate from other libraries. To do this, two new Makefile
variables were added: SHLIBOBJS and STLIBOBJS. SHLIBOBJS contains
the objects that are duplicated from other libraries in case of
shared builds; STLIBOBJS contains stuff that a library has to
provide for other libraries in case of static builds. These new
variables provide a way to enable/disable with a finer granularity
than just whether shared builds are enabled or not. E.g. lavd's
Makefile now contains: SHLIBOBJS-$(CONFIG_DECKLINK_INDEV) += reverse.o
Another example is provided by the golomb tables. These are provided
by lavc for static builds, even if one uses a build configuration
that makes only lavf use them. Therefore lavc's Makefile contains
STLIBOBJS-$(CONFIG_MXF_MUXER) += golomb.o, whereas lavf's Makefile
has a corresponding SHLIBOBJS-$(CONFIG_MXF_MUXER) += golomb_tab.o.
E.g. in case the MXF muxer is the only component needing these tables
only libavformat.so will contain them for shared builds; currently
libavcodec.so does so, too.
(There is currently a CONFIG_EXTRA group for golomb. But actually
one would need two groups (golomb_avcodec and golomb_avformat) in
order to know when and where to include these tables. Therefore
this commit uses a Makefile-based approach for this and stops
using these groups for the users in libavformat.)
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2022-01-04 05:01:04 +01:00
Michael Niedermayer
4be85c9331
lib*/version.h: Bump Versions after release/5.0 branch
...
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2022-01-03 22:10:46 +01:00
Michael Niedermayer
f3964a59e1
lib*/version.h: Bump Versions before release/5.0 branch
...
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2022-01-03 22:08:31 +01:00
rcombs
3e00b9e395
swscale/x86/init: use isSemiPlanarYUV
...
Fixes P210/P410 cases introduced (and broken) in 88d804b7ff
2021-12-23 01:41:03 -06:00
rcombs
88d804b7ff
swscale: add P210/P410/P216/P416 output
2021-12-22 18:38:40 -06:00
Alan Kelly
eebe406c80
libswscale: Test AV_CPU_FLAG_SLOW_GATHER for hscale functions.
...
This is instead of EXTERNAL_AVX2_FAST so that the avx2 hscale functions
are only used where they are faster.
2021-12-21 17:44:53 -03:00
James Almer
eab91c3e2e
x86/scale_avx2: don't use $ for hex literals
...
Fixes compilation with AVX2 enabled yasm.
Signed-off-by: James Almer <jamrial@gmail.com>
2021-12-16 17:29:21 -03:00
Alan Kelly
9092e58c44
x86/scale_avx2: Change asm indent from 2 to 4 spaces.
...
Signed-off-by: James Almer <jamrial@gmail.com>
2021-12-16 13:42:04 -03:00
Alan Kelly
86663963e6
x86/swscale: fix minor coding style issues
...
Signed-off-by: James Almer <jamrial@gmail.com>
2021-12-16 13:16:04 -03:00
James Almer
76a3f961f8
x86/scale_avx2: add missing check for AVX2 assembler support
...
Should fix compilation with old yasm.
Signed-off-by: James Almer <jamrial@gmail.com>
2021-12-16 09:41:56 -03:00
Alan Kelly
f900a19fa9
libswscale: Adds ff_hscale8to15_4_avx2 and ff_hscale8to15_X4_avx2 for all filter sizes.
...
Fixes so that fate under 64 bit Windows passes.
These functions replace all ff_hscale8to15_*_ssse3 when avx2 is available.
Signed-off-by: James Almer <jamrial@gmail.com>
2021-12-15 20:04:59 -03:00
Andreas Rheinhardt
3be6fe9a56
swscale/yuv2rgb: Silence a set-but-unused-variable warning
...
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2021-12-03 16:10:51 +01:00
rcombs
f0204de47d
swscale: add P210/P410/P216/P416 input
2021-11-28 16:40:43 -06:00
Mark Reid
3f4ce004b8
swscale/input: clip rgbf32 values before lrintf
...
if the float pixel * 65535.0f > 2147483647.0f
lrintf may overfow and return negative values, depending on implementation.
nan and +/-inf values may also be implementation defined
clip the value first so lrintf always works.
values < 0.0f, -inf, nan = 0.0f
values > 65535.0f, +inf = 65535.0f
old timings
195960 decicycles in planar_rgbf32le_to_uv, 1 runs, 0 skips
186120 decicycles in planar_rgbf32le_to_uv, 2 runs, 0 skips
188645 decicycles in planar_rgbf32le_to_uv, 4 runs, 0 skips
183625 decicycles in planar_rgbf32le_to_uv, 8 runs, 0 skips
181157 decicycles in planar_rgbf32le_to_uv, 16 runs, 0 skips
177533 decicycles in planar_rgbf32le_to_uv, 32 runs, 0 skips
175689 decicycles in planar_rgbf32le_to_uv, 64 runs, 0 skips
232960 decicycles in planar_rgbf32be_to_uv, 1 runs, 0 skips
221380 decicycles in planar_rgbf32be_to_uv, 2 runs, 0 skips
216640 decicycles in planar_rgbf32be_to_uv, 4 runs, 0 skips
213505 decicycles in planar_rgbf32be_to_uv, 8 runs, 0 skips
211558 decicycles in planar_rgbf32be_to_uv, 16 runs, 0 skips
210596 decicycles in planar_rgbf32be_to_uv, 32 runs, 0 skips
210202 decicycles in planar_rgbf32be_to_uv, 64 runs, 0 skips
161680 decicycles in planar_rgbf32le_to_y, 1 runs, 0 skips
153540 decicycles in planar_rgbf32le_to_y, 2 runs, 0 skips
148255 decicycles in planar_rgbf32le_to_y, 4 runs, 0 skips
140600 decicycles in planar_rgbf32le_to_y, 8 runs, 0 skips
132935 decicycles in planar_rgbf32le_to_y, 16 runs, 0 skips
128531 decicycles in planar_rgbf32le_to_y, 32 runs, 0 skips
140933 decicycles in planar_rgbf32le_to_y, 64 runs, 0 skips
190980 decicycles in planar_rgbf32be_to_y, 1 runs, 0 skips
176080 decicycles in planar_rgbf32be_to_y, 2 runs, 0 skips
167980 decicycles in planar_rgbf32be_to_y, 4 runs, 0 skips
164685 decicycles in planar_rgbf32be_to_y, 8 runs, 0 skips
162751 decicycles in planar_rgbf32be_to_y, 16 runs, 0 skips
162404 decicycles in planar_rgbf32be_to_y, 32 runs, 0 skips
167849 decicycles in planar_rgbf32be_to_y, 64 runs, 0 skips
new timings
183320 decicycles in planar_rgbf32le_to_uv, 1 runs, 0 skips
175700 decicycles in planar_rgbf32le_to_uv, 2 runs, 0 skips
179570 decicycles in planar_rgbf32le_to_uv, 4 runs, 0 skips
172932 decicycles in planar_rgbf32le_to_uv, 8 runs, 0 skips
168707 decicycles in planar_rgbf32le_to_uv, 16 runs, 0 skips
165224 decicycles in planar_rgbf32le_to_uv, 32 runs, 0 skips
163423 decicycles in planar_rgbf32le_to_uv, 64 runs, 0 skips
184940 decicycles in planar_rgbf32be_to_uv, 1 runs, 0 skips
185150 decicycles in planar_rgbf32be_to_uv, 2 runs, 0 skips
185790 decicycles in planar_rgbf32be_to_uv, 4 runs, 0 skips
185472 decicycles in planar_rgbf32be_to_uv, 8 runs, 0 skips
185277 decicycles in planar_rgbf32be_to_uv, 16 runs, 0 skips
185813 decicycles in planar_rgbf32be_to_uv, 32 runs, 0 skips
185332 decicycles in planar_rgbf32be_to_uv, 64 runs, 0 skips
145400 decicycles in planar_rgbf32le_to_y, 1 runs, 0 skips
145100 decicycles in planar_rgbf32le_to_y, 2 runs, 0 skips
143490 decicycles in planar_rgbf32le_to_y, 4 runs, 0 skips
136687 decicycles in planar_rgbf32le_to_y, 8 runs, 0 skips
131271 decicycles in planar_rgbf32le_to_y, 16 runs, 0 skips
128698 decicycles in planar_rgbf32le_to_y, 32 runs, 0 skips
127170 decicycles in planar_rgbf32le_to_y, 64 runs, 0 skips
156020 decicycles in planar_rgbf32be_to_y, 1 runs, 0 skips
146990 decicycles in planar_rgbf32be_to_y, 2 runs, 0 skips
142020 decicycles in planar_rgbf32be_to_y, 4 runs, 0 skips
141052 decicycles in planar_rgbf32be_to_y, 8 runs, 0 skips
138973 decicycles in planar_rgbf32be_to_y, 16 runs, 0 skips
138027 decicycles in planar_rgbf32be_to_y, 32 runs, 0 skips
143939 decicycles in planar_rgbf32be_to_y, 64 runs, 0 skips
Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: James Almer <jamrial@gmail.com>
2021-11-15 16:50:10 -03:00
Mark Reid
74e49cc583
swscale/input: unify grayf32 funcs with rgbf32 funcs
...
This is ment to be a cosmetic change
old timings:
42780 UNITS in grayf32le, 1 runs, 0 skips
56720 UNITS in grayf32le, 2 runs, 0 skips
67265 UNITS in grayf32le, 4 runs, 0 skips
58082 UNITS in grayf32le, 8 runs, 0 skips
63512 UNITS in grayf32le, 16 runs, 0 skips
52720 UNITS in grayf32le, 32 runs, 0 skips
46491 UNITS in grayf32le, 64 runs, 0 skips
68500 UNITS in grayf32be, 1 runs, 0 skips
66930 UNITS in grayf32be, 2 runs, 0 skips
62305 UNITS in grayf32be, 4 runs, 0 skips
55510 UNITS in grayf32be, 8 runs, 0 skips
50216 UNITS in grayf32be, 16 runs, 0 skips
44480 UNITS in grayf32be, 32 runs, 0 skips
42394 UNITS in grayf32be, 64 runs, 0 skips
new timings:
46660 UNITS in grayf32le, 1 runs, 0 skips
51830 UNITS in grayf32le, 2 runs, 0 skips
53390 UNITS in grayf32le, 4 runs, 0 skips
50910 UNITS in grayf32le, 8 runs, 0 skips
44968 UNITS in grayf32le, 16 runs, 0 skips
40349 UNITS in grayf32le, 32 runs, 0 skips
38330 UNITS in grayf32le, 64 runs, 0 skips
39980 UNITS in grayf32be, 1 runs, 0 skips
49630 UNITS in grayf32be, 2 runs, 0 skips
53540 UNITS in grayf32be, 4 runs, 0 skips
59767 UNITS in grayf32be, 8 runs, 0 skips
51206 UNITS in grayf32be, 16 runs, 0 skips
44743 UNITS in grayf32be, 32 runs, 0 skips
41468 UNITS in grayf32be, 64 runs, 0 skips
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2021-11-14 17:12:13 +01:00
Soft Works
58dce6f010
swscale/swscale: check SWS_PRINT_INFO flag for printing alignment warnings
...
This makes output consistent with a similar warning just few
lines above where this flag is checked in the same way.
Signed-off-by: softworkz <softworkz@hotmail.com>
Signed-off-by: Marton Balint <cus@passwd.hu>
2021-11-13 19:55:32 +01:00
Mark Reid
d2379bd6a0
swscale/input: fix planar_rgb16_to_a for gbrap10be and gbrap12be formats
...
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2021-11-04 11:52:33 +01:00
Michael Niedermayer
8316b2a15f
swscale/swscale: Improve *ColorspaceDetails() doxy
...
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2021-10-24 16:54:36 +02:00
Michael Niedermayer
5f3a160b42
swscale/utils: Improve return codes of sws_setColorspaceDetails()
...
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2021-10-24 16:54:36 +02:00
Michael Niedermayer
c7699f95bb
swscale/utils: Set all threads to the same colorspace even on failure
...
Fixes: ./ffplay dav.y4m -vf "scale=hd1080:threads=4"
Found-by: Paul
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2021-10-24 16:54:36 +02:00
Wu Jianhua
2c734a8496
libswscale/x86/rgb2rgb: add shuffle_bytes avx2
...
Performance data(Less is better):
shuffle_bytes_ssse3 3.64654
shuffle_bytes_avx2 0.94288
Signed-off-by: Wu Jianhua <jianhua.wu@intel.com>
2021-10-15 10:59:20 +02:00
Michael Niedermayer
f801207568
swscale/swscale: Pass slice location into unscaled code also for dst scaling
...
Fixes: alphablend=checkerboard
Reviewed-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2021-10-03 20:38:29 +02:00
Michael Niedermayer
06d6726588
swscale/alphablend: Fix slice handling
...
Reviewed-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2021-10-03 20:38:29 +02:00
Michael Niedermayer
9f40b5badb
swscale/swscale_internal: Avoid unsigned for slice parameters
...
Mixing unsigned and signed often leads to unexpected arithmetic results.
Fixes: out of array write
Found-by: Paul B Mahol <onemda@gmail.com>
Reviewed-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2021-09-30 19:47:15 +02:00
Manuel Stoeckl
32329397e2
swscale: add input/output support for X2BGR10LE
...
Signed-off-by: Manuel Stoeckl <code@mstoeckl.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2021-09-26 16:26:10 +02:00
Manuel Stoeckl
ca594df622
swscale/yuv2rgb: fix conversion to X2RGB10
...
This resolves a problem where conversions from YUV to X2RGB10LE
would produce color values a factor 4 too small, because an 8-bit
value was placed in a 10-bit channel.
Signed-off-by: Manuel Stoeckl <code@mstoeckl.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2021-09-26 16:26:10 +02:00
Andreas Rheinhardt
1ea3650823
Replace all occurences of av_mallocz_array() by av_calloc()
...
They do the same.
Reviewed-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2021-09-20 01:03:52 +02:00
Andreas Rheinhardt
044a7c08dc
swscale/swscale: Disable x86-specific code for other arches
...
SSE2 is x86 specific, yet due to the call to av_get_cpu_flags()
compilers were unable to optimize the checks (and the call) away
on other arches.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2021-09-19 23:52:37 +02:00
Andreas Rheinhardt
f440c422b7
swscale/swscale: Fix races when using unaligned strides/data
...
In this case the current code tries to warn once; to do so, it uses
ordinary static ints to store whether the warning has already been
emitted. This is both a data race (and therefore undefined behaviour)
as well as a race condition, because it is really possible for multiple
threads to be the one thread to emit the warning. This is actually
common since the introduction of the new multithreaded scaling API.
This commit fixes this by using atomic integers for the state;
furthermore, these are not static anymore, but rather contained
in the user-facing SwsContext (i.e. the parent SwsContext in case
of slice-threading).
Given that these atomic variables are not intended for synchronization
at all (but only for atomicity, i.e. only to output the warning once),
the atomic operations use memory_order_relaxed.
This affected the nv12, nv21, yuv420, yuv420p10, yuv422, yuv422p10 and
yuv444 filter-overlay FATE-tests.
Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2021-09-19 23:52:37 +02:00
Andreas Rheinhardt
a1255a350d
libswscale/options: Add parent_log_context_offset to AVClass
...
This allows to associate log messages from slice contexts to
the user-visible SwsContext.
Reviewed-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2021-09-19 23:52:37 +02:00
James Almer
5fe648d04a
libswscale/swscale: initialize all dst plane pointers in sws_receive_slice()
...
Fixes valgrind warnings about use of uninitialised values.
Signed-off-by: James Almer <jamrial@gmail.com>
2021-09-07 09:44:58 -03:00
Anton Khirnov
d6fdc78e91
sws: implement slice threading
2021-09-06 09:17:53 +02:00
Anton Khirnov
42cd64c182
sws: add a new scaling API
2021-09-06 09:16:52 +02:00
Andreas Rheinhardt
2c05ee092b
avutil/internal, swresample/audioconvert: Remove cpu.h inclusions
...
These inclusions are not necessary, as cpu.h is already included
wherever it is needed (via direct inclusion or via the arch-specific
headers).
Also remove other unnecessary cpu.h inclusions from ordinary
non-headers.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2021-07-22 14:33:45 +02:00
Michael Niedermayer
7874d40f10
swscale/slice: Fix wrong return on error
...
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2021-07-09 15:21:37 +02:00
Michael Niedermayer
fa1e158ef6
swscale/utils: Use full chroma interpolation for rgb4/8 and dither none
...
Dither none is only implemented in full chroma interpolation for these rgb formats
Its also a obscure choice (producing less nice images) that implementing it in the
other code-paths makes no sense
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2021-07-09 12:29:03 +02:00
Michael Niedermayer
7528532550
swscale/output: Implement dither none for yuv2rgb_write_full()
...
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2021-07-09 12:29:03 +02:00
Michael Niedermayer
997f9cfc12
swscale/slice: Check slice for allocation failure
...
Fixes: null pointer dereference
Fixes: alloc_slice.mp4
Found-by: Rafael Dutra <rafael.dutra@cispa.de>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2021-07-09 12:29:03 +02:00
Anton Khirnov
37c0fe49b7
sws: move updating the palette higher up
...
It does not interact in any way with the code setting up the image
pointers/strides, so it should not be intermixed with it.
2021-07-03 16:13:40 +02:00
Anton Khirnov
d6649d9a3b
sws: move initializing dither_error higher up
...
It does not interact in any way with the code setting up the image
pointers/strides, so it should not be intermixed with it.
2021-07-03 16:13:10 +02:00
Anton Khirnov
e188985598
sws: move the early return for zero-sized slices higher up
...
Place it right after the input parameter validation. There is no point
in performing any setup if the sws_scale() call won't do anything.
2021-07-03 16:09:43 +02:00
Anton Khirnov
a91e6c927e
sws: simplify setting sliceDir
2021-07-03 16:09:21 +02:00
Anton Khirnov
ff753f41dd
sws: merge handling frame start into a single block
...
Also, return an error code on failure rather than 0.
2021-07-03 16:09:07 +02:00
Anton Khirnov
1b11a324fe
sws: make checking for the start of a new frame more explicit
2021-07-03 16:07:22 +02:00
Anton Khirnov
0fb014b7bb
sws: reset sliceDir at the end of sws_scale()
...
Makes it more clear that resetting it does not interact with the scaling
code that it is currently intermixed with.
2021-07-03 16:05:39 +02:00
Anton Khirnov
1f80789bf7
sws: rename SwsContext.swscale to convert_unscaled
...
That function pointer is now used only for unscaled conversion.
2021-07-03 15:57:53 +02:00
Anton Khirnov
fe490ec165
sws: separate the calls to scaled vs unscaled conversion
...
Call the scaler function directly rather than through a function
pointer. Drop the now-unused return value from ff_getSwsFunc() and
rename the function to reflect its new role.
This will be useful in the following commits, where it will become
important that the amount of output is different for scaled vs unscaled
case.
2021-07-03 15:57:13 +02:00
Anton Khirnov
0f8e0957d2
sws: do not reallocate scratch buffers for each slice
2021-07-03 15:56:16 +02:00
Anton Khirnov
2730639259
sws: group the parameters validity checks together
...
Also, fail with an error code rather than 0.
2021-07-03 15:31:18 +02:00
Anton Khirnov
c05cab34a9
sws: initialize {src,dst}Stride2 consistently with {src,dst}2
2021-07-03 15:31:08 +02:00
Anton Khirnov
d3d8e09640
sws: cosmetics
...
Reindent after previous commit, rewrap long lines.
2021-07-03 15:30:56 +02:00
Anton Khirnov
f136493d03
sws: factor out cascaded scaling
2021-07-03 15:30:34 +02:00
Anton Khirnov
a2254aedc9
sws: cosmetics
...
Reindent after previous commit, split long lines.
2021-07-03 15:30:20 +02:00
Anton Khirnov
44f12718bf
sws: factor out gamma-correct scaling
2021-07-03 15:29:50 +02:00
Anton Khirnov
e355af9be9
sws: return an error code on invalid parameters to sws_scale()
2021-07-03 15:29:35 +02:00
Anton Khirnov
21a4e48f88
sws: reindent after previous commit
2021-07-03 15:29:22 +02:00
Anton Khirnov
27acca1af0
sws: factor out updating the palette
2021-07-03 15:28:46 +02:00
Anton Khirnov
f8c21ccbfc
sws: remove unnecessary braces
...
There used to be more code inside them, but it was removed in
6de58b4903
.
2021-07-03 15:28:36 +02:00
Peter Lundblad
da0abbbb01
libswscale: Make sws_init_context thread safe.
...
Call ff_sws_rgb2rgb_init via ff_thread_once instead of checking one of the
variables it updates.
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2021-07-01 23:49:41 +02:00
Limin Wang
43295ae6a9
swscale/swscale_unscaled: don't use the optimized bgr24toYV12 unscaled conversion when width%2
...
Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: Limin Wang <lance.lmwang@gmail.com>
2021-06-06 12:34:05 +08:00
Anton Khirnov
85ba17f36d
Bump major versions of all libraries.
...
Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2021-04-27 11:48:05 -03:00
Andreas Rheinhardt
ea2d9b7a2e
libswscale: Remove unused deprecated functions, make used ones static
...
Deprecated in 3b905b9fe6
.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Signed-off-by: James Almer <jamrial@gmail.com>
2021-04-27 10:43:11 -03:00
Andreas Rheinhardt
f3c197b129
Include attributes.h directly
...
Some files currently rely on libavutil/cpu.h to include it for them;
yet said file won't use include it any more after the currently
deprecated functions are removed, so include attributes.h directly.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2021-04-19 14:34:10 +02:00
Alan Kelly
3ce8d09244
libswscale/x86/yuv2yuvX: Removes unrolling for mmx and mmxext
...
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2021-04-01 20:47:52 +02:00
Alan Kelly
dc57762cb4
libswscale/x86/swscale: Only call ff_yuv2yuvX functions if the input size is > 0
...
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2021-04-01 20:47:52 +02:00
Michael Niedermayer
c361fa9e21
Bump minor versions after release branch
...
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2021-03-20 01:02:11 +01:00
Michael Niedermayer
c67d2a2875
Bump Versions before release/4.4 branch
...
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2021-03-20 01:01:12 +01:00
Andreas Rheinhardt
c23a5523b5
swscale/x86/swscale: Remove unused ASM constants
...
The last user of g15Mask, r15Mask, g16Mask and r16Mask was disabled
in 77a416e8aa
and finally removed in
36e8de07ed62609df45d064b56501e3084d25723; b15Mask and b16Mask were
apparently always unused (except for in_asm_used_var_warning_killer,
a function that only existed to make the compiler not optimize ASM
constants away).
w10 is unused since d604bab901
, w02
since ef423a6618
.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
2021-02-24 09:47:54 +01:00
Andreas Rheinhardt
aad597a93c
swscale/x86/rgb2rgb: Remove unused ASM constants
...
mask24hh etc. are unused since f099fbf5f3
,
mask32b and mask32r since 296609f859
,
mask32g since b38d487466
and mask32 since
f8a138be52
.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
2021-02-24 09:45:17 +01:00
Andreas Rheinhardt
49db6e4b4e
swscale/x86/yuv2rgb: Remove unused ASM constants
...
mmx_grnmask is unused since 531f97b0c3
,
the other constants since e934194b6a
.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
2021-02-24 09:43:14 +01:00
Chip Kerchner
e7f53d6ac9
lsws/ppc/yuv2rgb_altivec: Fix build in non-VSX environments
...
Add inline function for vec_xl if VSX is not supported. vec_xl intrinsic
is only available on POWER 7 or higher.
Fixes ticket #8750 .
Signed-off-by: Andriy Gelman <andriy.gelman@gmail.com>
2021-02-22 23:19:21 -05:00
James Almer
1a555d3c60
swscale/x86/yuv2yuvX: use the movsxdifnidn helper macro
...
Simplifies code
Signed-off-by: James Almer <jamrial@gmail.com>
2021-02-18 18:47:43 -03:00
James Almer
ebb48d85a0
swscale/x86/yuv2yuvX: use movq to load 8 bytes in all non-AVX2 functions
...
mova expands to movq on non-XMM functions
Signed-off-by: James Almer <jamrial@gmail.com>
2021-02-18 18:47:43 -03:00
James Almer
d512ebbaed
swscale/x86/yuv2yuvX: use the SPLATW helper macro
...
Simplifies code
Signed-off-by: James Almer <jamrial@gmail.com>
2021-02-18 18:47:43 -03:00
James Almer
c00567647e
swscale/x86/swscale: fix mix of inline and external function definitions
...
This includes removing pointless static function forward declarations.
Signed-off-by: James Almer <jamrial@gmail.com>
2021-02-18 18:47:42 -03:00
James Almer
c2bf1dcace
swscale/x86/swscale: fix compilation with old yasm
...
Where AVX2 may not be supported.
Signed-off-by: James Almer <jamrial@gmail.com>
2021-02-17 21:09:36 -03:00
Alan Kelly
554c2bc708
swscale: move yuv2yuvX_sse3 to yasm, unrolls main loop
...
And other small optimizations for ~20% speedup.
2021-02-17 21:21:03 +01:00
Carl Eugen Hoyos
2687070d9b
lsws/ppc/yuv2rgb: Fix transparency converting from yuv->rgb32.
...
Based on 68363b69
by Reimar Döffinger.
Fixes ticket #9077 .
2021-01-24 17:17:29 +01:00
Anton Khirnov
e15371061d
lavu/mem: move the DECLARE_ALIGNED macro family to mem_internal on next+1 bump
...
They are not properly namespaced and not intended for public use.
2021-01-01 14:14:57 +01:00
Anton Khirnov
c8c2dfbc37
lavu: move LOCAL_ALIGNED from internal.h to mem_internal.h
...
That is a more appropriate place for it.
2021-01-01 14:11:01 +01:00
Jeremy Leconte
29cef1bcd6
libswscale: avoid UB nullptr-with-offset.
...
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2020-12-24 15:27:56 +01:00
Andriy Gelman
1200264fc4
swscale/rgb2rgb_template: use shuffle macro on big-endian arches
...
Fixes fate-qtrle-32bit on big-endian.
The macro does a simple byte swap on uint8 array without any casts, so
it's valid on big-endian arches.
The mentioned test was failing because the byteswap function
shuffle_bytes_3210_c() is used in the pixel format conversion
(argb->bgra).
Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: Andriy Gelman <andriy.gelman@gmail.com>
2020-12-12 23:07:22 -05:00
Carl Eugen Hoyos
46e362b765
lsws/x86/yuv2rgb: Fix compilation with mmxext or ssse3 disabled.
...
Fixes ticket #8986 .
2020-11-14 15:37:57 +01:00
Marton Balint
993429cfb4
swscale/x86/yuv2rgb: fix crashes when loading alpha from unaligned buffers
...
Regression since fc6a5883d6
on SSSE3 enabled
CPUs.
Fixes ticket #8955 .
Signed-off-by: Marton Balint <cus@passwd.hu>
2020-11-02 00:31:34 +01:00