macros for reading and writing 64-bit aligned little-endian values.
these macros are used by the DST decoder and give a performance boost
on platforms that where the compiler must guard against unaligned
memory access.
This attribute is supported for this architecture in MSVC as well
(but produces errors if used for 32 bit x86).
Signed-off-by: Martin Storsjö <martin@martin.st>
* commit 'f79d847400d218cfd0b95f10358fe6e65ec3c9c4':
intreadwrite: Use the __unaligned keyword on MSVC for ARM and x86_64
Merged-by: James Almer <jamrial@gmail.com>
* commit '230b1c070baa3b6d4bd590426a365b843d60ff50':
intreadwrite: Add intermediate variables in the byteswise AV_W*() macros
Mostly a noop. Merged for cosmetic purposes.
See d83ff76ca0
Merged-by: James Almer <jamrial@gmail.com>
This reverts commit 014773b66b.
Since 230b1c070, the bytewise AV_W*() macros only expand their
argument once, i.e. doing exactly the same change as was done
in the AV_COPY*U macros, so this change is no longer necessary.
Signed-off-by: Martin Storsjö <martin@martin.st>
AV_WN64 is meant for unaligned data, but the existing av_alias* unions
(without a definition for the av_alias attribute - we don't have one
for MSVC) indicate to the compiler that they would have sufficient
alignment for normal access, i.e. the compiler is free to assume
8 byte alignment.
On ARM, this makes sure that AV_WN64 (or two consecutive AV_WN32) is
done with two str instructions instead of one strd.
Signed-off-by: Martin Storsjö <martin@martin.st>
This avoids issues with expanding the argument multiple times,
and makes sure that it is of the right type for the following shifts.
Even if the caller of a macro could be expected not to pass parameters
that have side effects if expanded multiple times, these fallback
codepaths are rarely, if ever, tested, so it is expected that such
issues can arise.
Thefore, for safety, make sure the fallback codepaths only expand
the arguments once.
Signed-off-by: Martin Storsjö <martin@martin.st>
If AV_RN and AV_WN are macros with multiple individual reads and
writes, the previous version of the AV_COPYU macro would fail if
the reads and writes overlap.
This should not be any less efficient in any case, given a
sensibly optimizing compiler.
Signed-off-by: Martin Storsjö <martin@martin.st>
Evaluating it multiple times, can have side effects and is possibly slow.
So its definitly a bad idea.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* qatar/master:
lavc: add opt_find to AVCodecContext class.
h264: Complexify frame num gap shortening code
intreadwrite.h: fix AV_RL32/AV_RB32 signedness.
Fix decoding of mpegts streams with h264 video that does *NOT* have b frames
Add minor bumps and APIChanges entries for lavf private options.
ffmpeg: deprecate -vc and -tvstd
ffmpeg: use new avformat_open_* API.
ffserver: use new avformat_open_* API.
ffprobe: use new avformat_open_* API.
ffplay: use new avformat_open_* API.
cmdutils: add opt_default2().
dict: add AV_DICT_APPEND flag.
lavf: add avformat_write_header() as a replacement for av_write_header().
Deprecate av_open_input_* and remove their uses.
lavf: add avformat_open_input() as a replacement for av_open_input_*
AVOptions: add av_opt_find() as a replacement for av_find_opt.
AVOptions: add av_opt_set_dict() mapping a dictionary struct to a context.
ffmpeg: don't abuse a global for passing frame size from input to output
ffmpeg: don't abuse a global for passing pixel format from input to output
ffmpeg: initialise encoders earlier.
Conflicts:
cmdutils.c
doc/APIchanges
ffmpeg.c
ffplay.c
ffprobe.c
libavcodec/h264.c
libavformat/avformat.h
libavformat/utils.c
libavformat/version.h
libavutil/avutil.h
Merged-by: Michael Niedermayer <michaelni@gmx.at>
The output type of the AV_RL32/AV_RB32 macros was signed int. The
resulting overflow broke at least some ASF streams with large
timestamps. Fix by adding a cast to uint32_t.
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
Signed-off-by: Anton Khirnov <anton@khirnov.net>
The per-arch headers can define any combination of B/L/N variants.
This ensures that whatever is defined in an arch header gets used
for all equivalents not defined there. E.g. on a little-endian
machine, AV_RN and AV_RL should give the same code.
Originally committed as revision 19658 to svn://svn.ffmpeg.org/ffmpeg/trunk
PPC is normally big endian but has special little endian load/store
instructions. Using these avoids a separate byteswap. This makes the
vorbis decoder about 5% faster. Not much else uses little-endian
read/write extensively.
GCC generates horrible PPC code for the default AV_[RW]B64 (which uses
a packed struct), so we override it with a plain pointer cast.
Originally committed as revision 18602 to svn://svn.ffmpeg.org/ffmpeg/trunk
ARMv6 and later support unaligned loads and stores for single
word/halfword but not double/multiple. GCC is ignorant of this and
will always use bytewise accesses for unaligned data. Casting to an
int32_t pointer is dangerous since a load/store double or multiple
instruction might be used (this happens with some code in FFmpeg).
Implementing the AV_[RW]* macros with inline asm using only supported
instructions gives fast and safe unaligned accesses. ARM RVCT does
the right thing with generic code.
This gives an overall speedup of up to 10%.
Originally committed as revision 18601 to svn://svn.ffmpeg.org/ffmpeg/trunk
This changes intreadwrite.h to support per-arch implementations of the
various macros allowing us to take advantage of special instructions
or other properties the compiler does not know about.
Originally committed as revision 18600 to svn://svn.ffmpeg.org/ffmpeg/trunk
Consistently apply this rule: the guard name is obtained from the
filename by stripping the leading "lib", converting '/' and '.' to
'_' and uppercasing the resulting name. Guard names in the root
directory have to be prefixed by "FFMPEG_".
Originally committed as revision 15120 to svn://svn.ffmpeg.org/ffmpeg/trunk