Speed: from 3.9x to 9.6x speed improvement over C, and some small
(up to 15%) speed improvements over existing MMX code (particularly
for bigger filters).
This allows using more specific implementations for chroma/luma, e.g.
we can make assumptions on filterSize being constant, thus avoiding
that test at runtime.
It just does that part in scalar form; I doubt using a vector store
over 2 arrays would speed it up particularly.
The function should be written to not use a scratch buffer.
The logged information is possibly false, and it tends to be outdated
after each change since the logging code needs to be manually updated.
Simplify and prevent confusing wrong debug messages.
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
Also remove the unnecessary isSupportedIn/Out macros.
Make the code more compact/readable, and simplify the access to
lsws-specific pixel format information.
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
When converting RGB format to RGB format with the same bits per sample,
unscaled path performs conversion on the whole buffer at once. For
non-multiple-of-16 BGR24 to RGB24 conversion it means that padding at the
end of line will be converted too. Since it may be of arbitrary length
(e.g. 8 bytes), operating on the whole buffer produces obviously wrong
results.
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
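The line-by-line approach implied by the message above can be sketched as
follows. This is a minimal illustration, not the actual swscale code: the
names swap_rb_line() and bgr24_to_rgb24() and the explicit stride parameters
are assumptions made for the example. Each line is converted only up to
width * 3 bytes, so the padding between the end of the pixels and the stride
is never read as pixel data.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: swap the B and R bytes of `width` packed 24-bit pixels. */
static void swap_rb_line(const uint8_t *src, uint8_t *dst, int width)
{
    for (int x = 0; x < width; x++) {
        dst[3 * x + 0] = src[3 * x + 2];
        dst[3 * x + 1] = src[3 * x + 1];
        dst[3 * x + 2] = src[3 * x + 0];
    }
}

/* Hypothetical wrapper: convert one line at a time so that the padding
 * bytes between width * 3 and the stride are left untouched. */
static void bgr24_to_rgb24(const uint8_t *src, ptrdiff_t src_stride,
                           uint8_t *dst, ptrdiff_t dst_stride,
                           int width, int height)
{
    assert(src_stride >= 3 * (ptrdiff_t)width);
    assert(dst_stride >= 3 * (ptrdiff_t)width);

    for (int y = 0; y < height; y++)
        swap_rb_line(src + y * src_stride, dst + y * dst_stride, width);
}
```

With a stride that is not a multiple of 3 (8 bytes for a 2-pixel line, say),
treating the whole buffer as one run of pixels would group the bytes of the
second line into the wrong triples, which is one way the wrong output
described above can arise.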
ptrdiff_t can be 4 bytes, which leads to the next element being 4-byte
aligned and thus at a different offset than intended. Forcing 8-byte
alignment forces equal offset of dither16/32 on x86-32 and x86-64.
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
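The offset problem described above can be illustrated with a small struct.
This is only a sketch: DemoContext and its field names are hypothetical and
do not reproduce the real context layout, and C11 alignas is used here as a
stand-in for whatever alignment macro the code base provides. With every
member forced to 8-byte alignment, the offsets no longer depend on whether
ptrdiff_t is 4 or 8 bytes wide.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout, not the real context struct. */
typedef struct DemoContext {
    alignas(8) ptrdiff_t offset;      /* 4 bytes on x86-32, 8 on x86-64 */
    alignas(8) uint16_t dither16[8];  /* always starts at offset 8 */
    alignas(8) uint32_t dither32[8];  /* always starts at offset 24 */
} DemoContext;

/* These hold on both 32-bit and 64-bit targets because of the forced
 * alignment; without it, dither16 could land at offset 4 on x86-32. */
static_assert(offsetof(DemoContext, dither16) == 8,
              "dither16 must sit at offset 8");
static_assert(offsetof(DemoContext, dither32) == 24,
              "dither32 must sit at offset 24");
```

Fixed offsets matter when the fields are addressed from assembly using
constants that are shared between the 32-bit and 64-bit builds.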
We operated on 31-bits, but with e.g. lanczos scaling, values can
add up to beyond 0x80000000, thus leading to output of zeroes. Dropping
one bit of precision fixes this.
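The overflow can be reproduced with a toy filter. Everything in this sketch
is an assumption made for illustration: the function names, the 4-tap
coefficient set with negative lobes (summing to 1.0 in Q14), and the sample
magnitudes are not taken from swscale. The sum is accumulated in 64 bits so
that its true size can be checked without invoking signed overflow.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: filter `n` samples, optionally dropping low bits
 * of each sample first. Accumulates in 64 bits on purpose. */
static int64_t filter_sum(const int32_t *samples, const int16_t *coeffs,
                          int n, int drop_bits)
{
    int64_t acc = 0;

    assert(drop_bits >= 0 && drop_bits < 31);
    for (int i = 0; i < n; i++)
        acc += (int64_t)(samples[i] >> drop_bits) * coeffs[i];
    return acc;
}

/* Would the accumulated value survive in a signed 32-bit accumulator? */
static int fits_in_int32(int64_t v)
{
    return v >= INT32_MIN && v <= INT32_MAX;
}
```

With coefficients {-2048, 10240, 10240, -2048} and samples
{0, 131071, 131071, 0}, the full-precision sum is 131071 * 20480, which
exceeds 0x7FFFFFFF; after dropping one bit per sample it is 65535 * 20480,
which fits.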
Remove unused variables "flags" and "dstFormat" in yuv2packed1,
merge source rows per plane for yuv2packed[12], and make every
source argument int16_t (some were invalidly set to uint16_t).
This prevents stack pollution and is part of the Great Evil Plan
to simplify swscale.
This will likely lead to a considerable performance boost,
since it removes a branch from the inner loop. Part of the
Great Evil Plan to simplify swscale.
On architectures such as x86 (both 32 bit and 64bit), the stack element
size is fixed, which maintains alignment. Here, this change does not
break anything. However, we also support other architectures where
this property is not maintained and therefore, applications will crash
horribly.
This change effectively forces all applications to be recompiled against
libswscale.
This is part of the Great Evil Plan to simplify swscale. Note that
you'll see some code duplication between the output functions for
different RGB variants, and even between packed-YUV and RGB
variants. This is intentional because it improves readability.
Inline functions are easier to read, maintain, modify and test,
which justifies the slightly increased source size. This patch
also adds support for non-native endianness RGB15/16 and fixes
isSupportedOutput() to no longer claim that we support writing
non-native RGB565/555/444.
Remove inline keyword from functions that are never inlined.
Use av_always_inline for functions that should be force-inlined
for performance reasons. Use av_cold for init functions.
Remove inline keyword for functions that are only called through
their function pointers (and thus cannot be inlined); add av_cold
keyword to init function, and use av_always_inline instead of
inline for functions that must be inlined for performance reasons.
This prevents the following compiler warnings: "warning:
initialization from incompatible pointer type". Since the
variables are only ever used in inline assembly, their type
is actually irrelevant (so the part where it was wrong did
not invoke any buggy behaviour).
They are hacks added to reuse the same scaling function for
different formats and they may cause problems when SIMD
implementations of the same functions are used along with pure
C functions.
Remove duplicate "inC" and "_c" functions that do the same thing;
give each function that handles data and acts as a function pointer
a "_c" suffix; remove "_c" suffix from functions that are inherently
not optimizable. Remove inline keyword from functions that are only
used through function pointers.
Many functions have such a prefix, but do not actually use any
instructions or features from that set, thus giving the false
impression that swscale is highly optimized for a particular
system, whereas in reality it is not.
Interleave macros and code so that it's easier to find the
actual code that belongs to a function. Also reindent where
appropriate and remove dead code.
Instead, only set the function pointers if bitexact flag is
not set during initialization. Since a change in flags triggers
a re-init anyway, this doesn't break situations where flag values
change during runtime.
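The init-time selection described above might look roughly like this. It is
a sketch under stated assumptions: DEMO_BITEXACT, DemoScaler, demo_init()
and both scale functions are hypothetical names, and demo_scale_fast() is
only a stand-in for an optimized routine whose output may differ slightly
from the C reference.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_BITEXACT 0x1  /* hypothetical flag value */

typedef void (*demo_scale_fn)(int16_t *dst, const uint8_t *src, int n);

/* Reference C version. */
static void demo_scale_c(int16_t *dst, const uint8_t *src, int n)
{
    for (int i = 0; i < n; i++)
        dst[i] = (int16_t)(src[i] << 7);
}

/* Stand-in for an optimized, possibly non-bitexact version. */
static void demo_scale_fast(int16_t *dst, const uint8_t *src, int n)
{
    for (int i = 0; i < n; i++)
        dst[i] = (int16_t)(src[i] * 128);
}

typedef struct DemoScaler {
    int flags;
    demo_scale_fn scale;
} DemoScaler;

/* Pick the function pointer once, at init time, instead of testing the
 * flag inside the function on every call. */
static void demo_init(DemoScaler *c, int flags)
{
    assert(c != NULL);
    c->flags = flags;
    c->scale = demo_scale_c;
    if (!(flags & DEMO_BITEXACT))
        c->scale = demo_scale_fast;
}
```

Because the choice is made in the init function, changing the flags later
only takes effect after the context is initialized again.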
The functions are identical to their MMX counterparts. Thus,
pretending that swscale is highly optimized for AMD3DNOW
extensions is a poorly executed practical joke at best.
Also remove code that overwrites the C versions of functions in
sws_init_swScale_altivec(), so that it uses the C functions of files
if no altivec-optimized version exists.
Adding _POSIX_C_SOURCE to CPPFLAGS globally produces all sorts of problems
since it causes certain system functions to be hidden on some (BSD) systems.
The solution is to only add the flag on systems that really require it, i.e.
glibc-based ones.
This change makes BSD systems compile out-of-the-box without the need for
adding specific flags manually. It also allows dropping a number of flags
that were set manually on a file-per-file basis, but were only present to work around
breakage introduced by the presence of _POSIX_C_SOURCE.
Also add _XOPEN_SOURCE to CPPFLAGS for glibc systems. We use XSI extensions
in several places already, so it is preferable to define it globally instead
of littering source files with individual #defines only needed for glibc.
Fix handling of input if not in native endianness, and add support for
9/10-bit output. This allows us to force endianness of YUV420P 9/10bit
in the H264/10bit fate tests, which should fix them on big-endian
systems.
PPC and x86 code is split off from swscale_template.c. Lots of code is
still duplicated and should be removed later.
Again uniformize the init system to be more similar to the dsputil one.
Unset h*scale_fast in the x86 init in order to make the output
consistent with the previous status. Thanks to Josh for spotting it.
Keep only the plain C code in the main rgb2rgb.c and move the x86
specific optimizations to x86/rgb2rgb.c
Change the initialization pattern a little so some of it can be
factorized to behave more like dsputil.
When HAVE_7REGS was not defined these functions had an empty body
causing the following warnings during compilation.
In file included from libswscale/x86/yuv2rgb_mmx.c:58:
libswscale/x86/yuv2rgb_template.c: In function ‘yuva420_rgb32_MMX’:
libswscale/x86/yuv2rgb_template.c:412: warning: no return statement in function returning non-void
libswscale/x86/yuv2rgb_template.c: In function ‘yuva420_bgr32_MMX’:
libswscale/x86/yuv2rgb_template.c:457: warning: no return statement in function returning non-void
Signed-off-by: Diego Biurrun <diego@biurrun.de>
It is pretty hopeless to expect that other sizable projects will adopt
libavutil on its own. Projects that need a small footprint
are better off with more specialized libraries such as gnulib, or with
just copying the parts that they need. With this in mind, nobody
is helped by having libavutil and libavcore split. In order to ease
maintenance inside and around FFmpeg and to reduce confusion about where to
put common code, avcore's functionality is merged (back) into avutil.
Signed-off-by: Reinhard Tartler <siretart@tauware.de>
When built with gcc 4.6, the MMX rgb24 to yuv conversion gives
wrong output. The compiler produces this warning:
libswscale/swscale_template.c:1885:5: warning: use of memory input without lvalue in asm operand 4 is deprecated
Changing the memory operand to a register makes it work.
Signed-off-by: Mans Rullgard <mans@mansr.com>
rgb32tobgr32() has been removed in favour of shuffle_bytes_2103() in r32190
Originally committed as revision 32676 to svn://svn.mplayerhq.hu/mplayer/trunk/libswscale
handle_jpeg may update the src/dstFormat variables; this makes sure the
updated version is stored in the context.
This fixes roundup issue 2302.
Patch by Troot, all_crap_goes_here at hotmail
Originally committed as revision 32562 to svn://svn.mplayerhq.hu/mplayer/trunk/libswscale
but worse it did not set up destination dimensions, thus every user
of it would necessarily fail.
Originally committed as revision 32424 to svn://svn.mplayerhq.hu/mplayer/trunk/libswscale
scale context. Prevent pointless warnings when using
av_opt_set_defaults() for setting the default values, as in a pending
patch.
Originally committed as revision 32413 to svn://svn.mplayerhq.hu/mplayer/trunk/libswscale
instead of requiring being passed through function parameters. This also
makes sws work with AVOptions.
Originally committed as revision 32368 to svn://svn.mplayerhq.hu/mplayer/trunk/libswscale
If the CRC from the src->dst conversion matches a reference, it is not
necessary to perform a dst->yuva420p conversion and check the SSD.
Originally committed as revision 32213 to svn://svn.mplayerhq.hu/mplayer/trunk/libswscale
The source format parameters are kept in static variables and conversion from
ref to source is only made when any parameter changes.
Originally committed as revision 32211 to svn://svn.mplayerhq.hu/mplayer/trunk/libswscale
yvu9ToYv12Wrapper() used to support yv12 with the chroma planes either in the
uv order or the vu order. FFmpeg no longer has a pixel format in vu order.
Originally committed as revision 32156 to svn://svn.mplayerhq.hu/mplayer/trunk/libswscale
mmap() with MAP_ANONYMOUS requires the file descriptor to be -1 in NetBSD.
Linux just ignores this parameter.
Patch by Grant Carver <grantc at cat dot co dot za>
Originally committed as revision 31984 to svn://svn.mplayerhq.hu/mplayer/trunk/libswscale
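A portable anonymous mapping, as discussed above, passes -1 as the file
descriptor. The sketch below uses hypothetical wrapper names (alloc_anon(),
free_anon()) and maps plain read/write memory; the protection flags actually
needed by a caller are an assumption left out of the example.

```c
/* glibc hides MAP_ANONYMOUS in strict ISO C mode unless a feature macro
 * such as this one is defined before the first include. */
#define _DEFAULT_SOURCE

#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/* Some systems only provide the older MAP_ANON spelling. */
#ifndef MAP_ANONYMOUS
#define MAP_ANONYMOUS MAP_ANON
#endif

/* Hypothetical wrapper: returns NULL on failure instead of MAP_FAILED. */
static void *alloc_anon(size_t size)
{
    void *p;

    assert(size > 0);
    /* fd must be -1 for anonymous mappings on some systems (e.g. NetBSD);
     * others simply ignore it. */
    p = mmap(NULL, size, PROT_READ | PROT_WRITE,
             MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}

static void free_anon(void *p, size_t size)
{
    if (p)
        munmap(p, size);
}
```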
r31772 | stefano | 2010-07-23 01:01:31 +0200 (Fri, 23 Jul 2010) | 2 lines
Prefer impersonal form over third person, for consistency with the
rest of FFmpeg.
The change was not approved by the maintainer.
Originally committed as revision 31847 to svn://svn.mplayerhq.hu/mplayer/trunk/libswscale
sequential geometries instead of running all algorithms sequentially for each
geometry.
Originally committed as revision 31775 to svn://svn.mplayerhq.hu/mplayer/trunk/libswscale
Some converters (e.g. unscaled rgb24 -> argb) may write some bytes out of
bounds. Ideally the converters should be fixed, but in the meantime we allocate
more memory to prevent heap corruption.
Originally committed as revision 31768 to svn://svn.mplayerhq.hu/mplayer/trunk/libswscale
darwin requires _DARWIN_C_SOURCE to be defined for MAP_ANON, which is used by
swscale to determine whether to use malloc() or mmap(). 64-bit darwin does not
have an executable heap, so mmap() must be used instead of malloc(), and
therefore _DARWIN_C_SOURCE must be defined.
Originally committed as revision 31760 to svn://svn.mplayerhq.hu/mplayer/trunk/libswscale
Don't change parameter passing, but instead use casts.
Shouldn't affect asm output on anything other than win64.
libswscale should work on win64 now.
The rest of ffmpeg still isn't win64 compatible due to the issue of xmm
clobbers, but swscale doesn't use any SSE.
Patch by Anton Mitrofanov <BugMaster AT narod DOT ru>.
Originally committed as revision 31751 to svn://svn.mplayerhq.hu/mplayer/trunk/libswscale
If the destination planes are offset within the destination buffer,
writing the extra bytes at the end may write outside of the destination
buffer.
Originally committed as revision 31746 to svn://svn.mplayerhq.hu/mplayer/trunk/libswscale
This fixes warnings about wrong type being used, e.g.:
libswscale/yuv2rgb.c: In function ‘ff_yuv2rgb_c_init_tables’:
libswscale/yuv2rgb.c:778: warning: passing argument 4 of ‘fill_table’ from incompatible pointer type
libswscale/yuv2rgb.c:598: note: expected ‘uint8_t *’ but argument is of type ‘uint16_t *’
Originally committed as revision 31722 to svn://svn.mplayerhq.hu/mplayer/trunk/libswscale
Additionally, deprecate palette8torgb16 and its bgr variant without
replacement. These functions are not meant to be used by applications.
Discussed at: http://comments.gmane.org/gmane.comp.video.ffmpeg.devel/109340
Originally committed as revision 31301 to svn://svn.mplayerhq.hu/mplayer/trunk/libswscale
sws_setColorspaceDetails() to ff_yuv2rgb_c_init_tables().
This allows factorizing duplicated code.
Originally committed as revision 31300 to svn://svn.mplayerhq.hu/mplayer/trunk/libswscale
They contain exactly the same code as their 16bit variants, so this is
effectively code de-duplication.
Originally committed as revision 31298 to svn://svn.mplayerhq.hu/mplayer/trunk/libswscale
We now have an LGPL replacement that is at least equally fast.
Originally committed as revision 31278 to svn://svn.mplayerhq.hu/mplayer/trunk/libswscale
M_PI is defined by the included file libavutil/mathematics.h.
Originally committed as revision 31185 to svn://svn.mplayerhq.hu/mplayer/trunk/libswscale
x86_64 / Mac OS X gcc 4.0.1
x86_64 / Linux icc (all)
x86_64 / Linux gcc 4.0.4
x86_64 / OpenBSD gcc 3.3.5
x86_64 / Linux suncc 5.10
and there are some reports of crashes.
Originally committed as revision 31170 to svn://svn.mplayerhq.hu/mplayer/trunk/libswscale
long was being incorrectly used as an x86-sized register, both for 32 and 64
bits, but this is not the case in win64.
Originally committed as revision 31153 to svn://svn.mplayerhq.hu/mplayer/trunk/libswscale
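The type-size issue above comes from win64 using the LLP64 model, where long
stays 32 bits wide while pointers and general-purpose registers are 64 bits.
A register-sized integer type can be sketched with intptr_t; the name
demo_reg and the helper byte_offset() are hypothetical and only illustrate
the idea.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical register-sized integer type; `long` would be too narrow
 * on win64, where it is 32 bits while pointers are 64 bits. */
typedef intptr_t demo_reg;

static_assert(sizeof(demo_reg) == sizeof(void *),
              "demo_reg must be as wide as a pointer");

/* Example use: a byte offset that is safe to hand to pointer-sized
 * arithmetic or an inline-asm register operand. */
static demo_reg byte_offset(const uint8_t *base, const uint8_t *p)
{
    return (demo_reg)(p - base);
}
```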
The old code is correct only when stride = 2*width.
Patch by Ronaldo Moura <ronaldo d moura monity com br>
Originally committed as revision 31142 to svn://svn.mplayerhq.hu/mplayer/trunk/libswscale
Passing an explicit filename to this command is only necessary if the
documentation in the @file block refers to a file different from the
one the block resides in.
Originally committed as revision 31050 to svn://svn.mplayerhq.hu/mplayer/trunk/libswscale
being able to compile it and deduplicate the code at the same time.
Originally committed as revision 30978 to svn://svn.mplayerhq.hu/mplayer/trunk/libswscale
This is of course done with permissions from the authors. The only GPL
component left are MMX optimizations for YUV to RGB conversion.
Originally committed as revision 30965 to svn://svn.mplayerhq.hu/mplayer/trunk/libswscale
runtime cpudetection mode.
Fixes compilation with '--enable-runtime-cpudetect --disable-altivec'.
Originally committed as revision 30952 to svn://svn.mplayerhq.hu/mplayer/trunk/libswscale
output format.
Patch by Janusz Krzysztofik, jkrzyszt A tis D icnet D pl
Originally committed as revision 30934 to svn://svn.mplayerhq.hu/mplayer/trunk/libswscale
in case altivec is disabled, even compilation of code using altivec
keywords or asm must be avoided.
Originally committed as revision 30869 to svn://svn.mplayerhq.hu/mplayer/trunk/libswscale
converter with support for rgb444 output format.
Patch by Janusz Krzysztofik jkrzyszt chez tis icnet pl
Originally committed as revision 30841 to svn://svn.mplayerhq.hu/mplayer/trunk/libswscale
libswscale.
Patch by Alexis Ballier, alexis D ballier A gmail
Originally committed as revision 30840 to svn://svn.mplayerhq.hu/mplayer/trunk/libswscale
127.32L to me, beware when using git svn dcommit for committing stuff
to svn...
Originally committed as revision 30827 to svn://svn.mplayerhq.hu/mplayer/trunk/libswscale
case where the source format is PIX_FMT_GRAY8.
This is required as PIX_FMT_GRAY8 has been declared as a paletted
format in FFmpeg r22191. This fixes GRAY8 -> RGB conversion.
Originally committed as revision 30826 to svn://svn.mplayerhq.hu/mplayer/trunk/libswscale
format.
Make swscale-test only perform the test from the input to the output
format rather than perform all.
Also implement swscale-test-all.sh, for performing all the tests.
Improve flexibility of the swscale-test tool, this way it is simpler to
perform only a subset of tests.
Originally committed as revision 30825 to svn://svn.mplayerhq.hu/mplayer/trunk/libswscale
bytes when converting between RGB32 variants.
In particular fix the argb -> rgba and abgr -> bgra conversions.
See the thread:
Subject: [FFmpeg-devel] [RFC] RGB32 / BGR32 ethernal bug
Date: Tue, 26 Jan 2010 01:06:18 +0100
Originally committed as revision 30501 to svn://svn.mplayerhq.hu/mplayer/trunk/libswscale
PIX_FMT_YUVJ420P
PIX_FMT_YUVJ422P
PIX_FMT_YUVJ440P
PIX_FMT_YUVJ444P
in the isSupported{In,Out} macros.
These pixel formats are not true pixel formats but hacks specific to
JPEG in libavcodec. They are deprecated and should be removed (that is
from libavcodec first and libswscale second)... but they must be
tested by swscale-test.
See thread:
Subject: [FFmpeg-devel] [PATCH] Extend show_pix_fmts() to make it print the input/output support
Date: 2010-01-30 15:54:08 GMT
Originally committed as revision 30474 to svn://svn.mplayerhq.hu/mplayer/trunk/libswscale
functions. Improve readability.
Originally committed as revision 30466 to svn://svn.mplayerhq.hu/mplayer/trunk/libswscale
See the thread:
Subject: [FFmpeg-devel] [RFC] Make swscale-test perform only one convertion
Date: Fri, 29 Jan 2010 01:52:23 +0100
Originally committed as revision 30457 to svn://svn.mplayerhq.hu/mplayer/trunk/libswscale
in r30419, which was causing a swscale-example regression.
Also increase my liter count by 20.0 units.
Originally committed as revision 30431 to svn://svn.mplayerhq.hu/mplayer/trunk/libswscale
source and destination format, cache those values in the newly added
SwsContext:srcFormatBpp and SwsContext:dstFormatBpp fields, and remove
the fmt_depth() function.
Originally committed as revision 30419 to svn://svn.mplayerhq.hu/mplayer/trunk/libswscale
supported both as input and as output, as the conversion performed is:
yuva420p -> src -> dst -> yuva420p.
Originally committed as revision 30379 to svn://svn.mplayerhq.hu/mplayer/trunk/libswscale