The functions that actually drive the filter graph by pushing
frames through it need to ensure an aligned stack for SIMD functions.
This fixes a crash in the YADIF filter when using a MinGW build in an MSVC application.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
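A minimal sketch of the stack realignment idea described above, not the code from this commit. It assumes GCC or Clang on x86, where `__attribute__((force_align_arg_pointer))` makes a function realign the stack pointer on entry. The names `ALIGN_ARG_POINTER` and `push_frame` are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: GCC/Clang on x86. Elsewhere the macro expands to nothing. */
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
#define ALIGN_ARG_POINTER __attribute__((force_align_arg_pointer))
#else
#define ALIGN_ARG_POINTER
#endif

/* Hypothetical public entry point that an application built with another
 * compiler might call with a stack that is not 16-byte aligned. */
ALIGN_ARG_POINTER int push_frame(void)
{
    /* Stand-in for a buffer that SIMD code accesses with aligned loads. */
    _Alignas(16) unsigned char scratch[16];
    int aligned = ((uintptr_t)scratch % 16) == 0;

    assert(aligned);
    return aligned;
}
```

Only the outermost entry points need the attribute in this sketch: once the stack is realigned there, every function called further down inherits an aligned stack.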
These smaller samples do not need to be unpacked to double words,
allowing the code to process more pixels per iteration (still 2 in MMX
but 6 in SSE2). It also avoids emulating the missing double word
instructions on older instruction sets.
As with the previous code for 16-bit samples, this has been tested on
an Athlon64 and a Core2Quad.
Athlon64:
1809275 decicycles in C, 32718 runs, 50 skips
911675 decicycles in mmx, 32727 runs, 41 skips, 2.0x faster
495284 decicycles in sse2, 32747 runs, 21 skips, 3.7x faster
Core2Quad:
921363 decicycles in C, 32756 runs, 12 skips
486537 decicycles in mmx, 32764 runs, 4 skips, 1.9x faster
293296 decicycles in sse2, 32759 runs, 9 skips, 3.1x faster
284910 decicycles in ssse3, 32759 runs, 9 skips, 3.2x faster
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
This is a fairly dumb copy of the assembly for 8-bit samples but it
works and produces identical output to the C version. The options have
been tested on an Athlon64 and a Core2Quad.
Athlon64:
1810385 decicycles in C, 32726 runs, 42 skips
1080744 decicycles in mmx, 32744 runs, 24 skips, 1.7x faster
818315 decicycles in sse2, 32735 runs, 33 skips, 2.2x faster
Core2Quad:
924025 decicycles in C, 32750 runs, 18 skips
623995 decicycles in mmx, 32767 runs, 1 skips, 1.5x faster
406223 decicycles in sse2, 32764 runs, 4 skips, 2.3x faster
387842 decicycles in ssse3, 32767 runs, 1 skips, 2.4x faster
307726 decicycles in sse4, 32763 runs, 5 skips, 3.0x faster
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
Since lavfi works natively with AVFrame now, these functions are no longer
necessary and can be removed in a future bump.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* qatar/master:
af_join: do not leak input frames.
asrc_anullsrc: return EOF, not -1
Conflicts:
libavfilter/asrc_anullsrc.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
* commit 'e4a7b2177d14678ae240edcabaacfe2b14619b7b':
vf_showinfo: remove its useless init function
AVOptions: fix using named constants with child contexts.
Conflicts:
libavutil/opt.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
* commit '9676b9a2cdc4a90611188fc48d8d388e427997c5':
AVOption: remove an unused function parameter.
filters.texi: restore mistakenly removed section name for noformat
avfiltergraph: use sizeof(var) instead of sizeof(type)
Merged-by: Michael Niedermayer <michaelni@gmx.at>
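The `sizeof(var)` change in the merge above follows a common C idiom, which might be sketched as below. The `Link` type and `alloc_links` helper are hypothetical and are not taken from avfiltergraph.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical element type, for illustration only. */
typedef struct Link {
    int    id;
    double gain;
} Link;

/* Allocate a zeroed array of n pointers to Link. */
Link **alloc_links(size_t n)
{
    Link **links;

    assert(n > 0);
    /* sizeof(*links) tracks the declared type of 'links'. Writing
     * sizeof(Link *) would compile too, but would silently go stale
     * if the declaration above were later changed. */
    links = calloc(n, sizeof(*links));
    return links;
}
```

The benefit is purely in maintenance: the allocation size cannot drift away from the variable's type.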
This reverts commit 9efcfbed9d.
All the shame on me; this commit is actually causing more problems
(broken outputs but also crashes) than it was solving.
Before this change, the audio input and output formats were set
independently, so the lavfi negotiation could pick different settings
for the input and output. This is particularly true for the channel
layout settings, where multiple choices were available.
Fixes Ticket2342.
Always use the special filter for the first and last 3 columns (only).
Changes made in 64ed397 slowed the filter to just under 3/4 of what it
was. This commit restores the speed while maintaining identical output.
For reference, on my Athlon64:
1733222 decicycles in old
2358563 decicycles in new
1727558 decicycles in this
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
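The edge handling described above (a special filter for the first and last 3 columns only) can be sketched roughly as follows. This is not the YADIF kernel: the averaging filter, the `EDGE` constant and all function names are assumptions chosen to show the split between a bounds-checked edge path and an unchecked fast path.

```c
#include <assert.h>
#include <stdint.h>

#define EDGE 3 /* columns at each side that need the careful path */

static int clamp_index(int x, int w)
{
    return x < 0 ? 0 : (x >= w ? w - 1 : x);
}

/* Slow path: clamps every neighbour index, safe at the row borders. */
static void filter_edges(uint8_t *dst, const uint8_t *src, int w,
                         int start, int end)
{
    for (int x = start; x < end; x++) {
        int l = src[clamp_index(x - EDGE, w)];
        int r = src[clamp_index(x + EDGE, w)];
        dst[x] = (uint8_t)((l + r + 1) >> 1);
    }
}

/* Fast path: no bounds checks. Caller guarantees EDGE <= x < w - EDGE. */
static void filter_middle(uint8_t *dst, const uint8_t *src,
                          int start, int end)
{
    for (int x = start; x < end; x++)
        dst[x] = (uint8_t)((src[x - EDGE] + src[x + EDGE] + 1) >> 1);
}

void filter_row(uint8_t *dst, const uint8_t *src, int w)
{
    assert(w >= 0);
    if (w <= 2 * EDGE) {            /* row too narrow for a fast middle */
        filter_edges(dst, src, w, 0, w);
        return;
    }
    filter_edges(dst, src, w, 0, EDGE);          /* first 3 columns */
    filter_middle(dst, src, EDGE, w - EDGE);     /* everything else */
    filter_edges(dst, src, w, w - EDGE, w);      /* last 3 columns  */
}
```

Restricting the careful path to a fixed handful of columns keeps the per-pixel cost of the bulk of the row unchanged, which is the kind of trade-off the timings above reflect.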