1
0
mirror of https://github.com/FFmpeg/FFmpeg.git synced 2024-11-26 19:01:44 +02:00
Commit Graph

3 Commits

Author SHA1 Message Date
Paul B Mahol
452d611fc7 avfilter/vf_lut3d: allow to control when to upload CLUT for haldclut 2022-04-26 20:07:04 +02:00
Martin Storsjö
a78f136f3f configure: Use a separate config_components.h header for $ALL_COMPONENTS
This avoids unnecessary rebuilds of most source files if only the
list of enabled components has changed, but not the other properties
of the build, set in config.h.

Signed-off-by: Martin Storsjö <martin@martin.st>
2022-03-16 14:12:49 +02:00
Mark Reid
716b396740 avfilter/vf_lut3d: add x86-optimized tetrahedral interpolation
I spotted an interesting pattern that I didn't see before that leads to the implementation being faster.
The bit shifting table I was using before is no longer needed, and was able to remove quite a few lines. 
I also add use of FMA on the AVX2 version.

f32 1920x1080 1 thread with prelut
c impl
1434012700 UNITS in lut3d->interp,       1 runs,      0 skips
1434035335 UNITS in lut3d->interp,       2 runs,      0 skips
1423615347 UNITS in lut3d->interp,       4 runs,      0 skips
1426268863 UNITS in lut3d->interp,       8 runs,      0 skips

sse2
905484420 UNITS in lut3d->interp,       1 runs,      0 skips
905659010 UNITS in lut3d->interp,       2 runs,      0 skips
915167140 UNITS in lut3d->interp,       4 runs,      0 skips
915834222 UNITS in lut3d->interp,       8 runs,      0 skips

avx
574794860 UNITS in lut3d->interp,       1 runs,      0 skips
581035090 UNITS in lut3d->interp,       2 runs,      0 skips
584116720 UNITS in lut3d->interp,       4 runs,      0 skips
581460290 UNITS in lut3d->interp,       8 runs,      0 skips

avx2
301698880 UNITS in lut3d->interp,       1 runs,      0 skips
301982880 UNITS in lut3d->interp,       2 runs,      0 skips
306962430 UNITS in lut3d->interp,       4 runs,      0 skips
305472025 UNITS in lut3d->interp,       8 runs,      0 skips

gbrap16 1920x1080 1 thread with prelut
c impl
1480894840 UNITS in lut3d->interp,       1 runs,      0 skips
1502922990 UNITS in lut3d->interp,       2 runs,      0 skips
1496114307 UNITS in lut3d->interp,       4 runs,      0 skips
1492554551 UNITS in lut3d->interp,       8 runs,      0 skips

sse2
980777180 UNITS in lut3d->interp,       1 runs,      0 skips
986121520 UNITS in lut3d->interp,       2 runs,      0 skips
986489840 UNITS in lut3d->interp,       4 runs,      0 skips
998832248 UNITS in lut3d->interp,       8 runs,      0 skips

avx
622212360 UNITS in lut3d->interp,       1 runs,      0 skips
622981160 UNITS in lut3d->interp,       2 runs,      0 skips
645396315 UNITS in lut3d->interp,       4 runs,      0 skips
641057075 UNITS in lut3d->interp,       8 runs,      0 skips

avx2
321336400 UNITS in lut3d->interp,       1 runs,      0 skips
321268920 UNITS in lut3d->interp,       2 runs,      0 skips
323459895 UNITS in lut3d->interp,       4 runs,      0 skips
324949967 UNITS in lut3d->interp,       8 runs,      0 skips
2021-10-10 22:23:48 +02:00