inv_zigzag_direct16 must be 16-byte aligned, so mark it appropriately.
Fixes encoder crashes e.g. with MPlayer's -vf lavc.
Originally committed as revision 21389 to svn://svn.ffmpeg.org/ffmpeg/trunk
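For illustration, a minimal sketch of marking a table as 16-byte aligned,
assuming a C11 compiler. FFmpeg has its own alignment macro for this; the
names zigzag_table16 and is_aligned16 below are hypothetical and are not
taken from the commit.

```c
#include <stdalign.h>
#include <stdint.h>

/* Hypothetical table standing in for inv_zigzag_direct16. alignas(16)
 * asks the compiler to place it on a 16-byte boundary, which aligned
 * SIMD loads require. */
static alignas(16) uint16_t zigzag_table16[64];

/* Returns 1 if the pointer is 16-byte aligned, 0 otherwise. */
static int is_aligned16(const void *p)
{
    return ((uintptr_t)p & 15) == 0;
}

/* Accessor for the hypothetical table. */
static const uint16_t *get_zigzag_table16(void)
{
    return zigzag_table16;
}
```

Without the alignment request the table may happen to land on a 16-byte
boundary, which is why a missing annotation can go unnoticed until some
build or caller shifts the layout.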
Change the order of operands, as gcc seems to use a hardcoded register per
operand even for static functions. This reduces unneeded moves (functions now
try to pass the same argument in the same spot).
Change signed int to unsigned int for array indexes, as signed requires sign
extension while unsigned is free.
Move the +52 up and merge it where it will end up as a lea instruction; gcc
otherwise always splits the 52 out there, turning the free +52 into an
expensive one.
The changed code becomes a little faster.
Originally committed as revision 21375 to svn://svn.ffmpeg.org/ffmpeg/trunk
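A minimal sketch of the signed versus unsigned index point, assuming a
64-bit x86 target where pointers are wider than int. The function names are
hypothetical. With a signed index the compiler generally has to sign-extend
the 32-bit value before using it in the address, while an unsigned 32-bit
value is already zero-extended by the preceding 32-bit operation.

```c
#include <stdint.h>

/* Signed index: typically needs a sign extension to 64 bits before
 * it can be used in the address calculation. */
static uint8_t load_signed(const uint8_t *table, int idx)
{
    return table[idx];
}

/* Unsigned index: 32-bit operations on x86-64 clear the upper half of
 * the register, so no extra extension instruction is needed. */
static uint8_t load_unsigned(const uint8_t *table, unsigned idx)
{
    return table[idx];
}
```

Both functions return the same value for any valid non-negative index; the
difference is only in the generated code, and it depends on the compiler
and target.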
This should be faster (couldn't measure a difference), and it's less picky
about slightly out-of-spec dquant.
Originally committed as revision 21373 to svn://svn.ffmpeg.org/ffmpeg/trunk
This makes ffmpeg stop printing millions of
Multiple frames in a packet from stream 0
when decoding adpcm.
Originally committed as revision 21362 to svn://svn.ffmpeg.org/ffmpeg/trunk
The various avcodec_thread_init() functions are updated to return
immediately after setting avctx->thread_count. This allows -threads 0
to pass through to codecs. It also simplifies the usage for apps
using libavcodec.
Originally committed as revision 21358 to svn://svn.ffmpeg.org/ffmpeg/trunk
It allows VLD H264 decoding using DXVA2 (GPU assisted decoding API under
Windows Vista and Windows 7).
It is implemented using the AVHWAccel API. It has been tested successfully
for some time in VLC using an NVIDIA card on Windows 7.
To compile it, you need to have the system header dxva2api.h (either from
Microsoft or using http://downloads.videolan.org/pub/videolan/testing/contrib/dxva2api.h).
The generated libavcodec.dll does not depend directly on any new lib as
the necessary objects are given by the application using FFmpeg.
Originally committed as revision 21353 to svn://svn.ffmpeg.org/ffmpeg/trunk
This fixes gcc failing to fit 6 memory locations into 7 registers on x86-32
Originally committed as revision 21337 to svn://svn.ffmpeg.org/ffmpeg/trunk
All inlined, it's small, and the horizontal & vertical versions are built out of
them. No change, as gcc already did this.
Originally committed as revision 21333 to svn://svn.ffmpeg.org/ffmpeg/trunk
With b_keyframe instead of IDR for detecting keyframes, ffmpeg should now
support encoding with periodic intra refresh (although there is no
interface option for it yet).
Set the new timebase values for full VFR input support.
Bump configure to check for API version 83.
Originally committed as revision 21317 to svn://svn.ffmpeg.org/ffmpeg/trunk
That's not possible except maybe in FMO, which no one uses anyway.
I am also not sure if this wasn't missing a part_width.
Originally committed as revision 21312 to svn://svn.ffmpeg.org/ffmpeg/trunk
loop filter. This removes one obstacle to getting ff_h264_filter_mb_fast()
bitexact. The code is maybe 0.1% faster.
Originally committed as revision 21280 to svn://svn.ffmpeg.org/ffmpeg/trunk
Run loop filter per row instead of per MB. This should also make it
much easier to switch to per-frame filtering, and to doing so in a
separate thread in the future, if some volunteer wants to try.
Overall decoding speedup of 1.7% (single thread on pentium dual / cathedral sample)
This change also allows some optimizations to be tried that would not have
been possible before.
Originally committed as revision 21270 to svn://svn.ffmpeg.org/ffmpeg/trunk
Fixes build with --disable-encoders --enable-encoder=snow.
This fixes MPlayer build with --disable-mencoder.
Originally committed as revision 21259 to svn://svn.ffmpeg.org/ffmpeg/trunk
~200 bytes smaller ff_h264_filter_mb()
Please, everyone, NEVER add code with the assumption that gcc will remove it
without checking that gcc actually does. Chances are it does not.
Originally committed as revision 21251 to svn://svn.ffmpeg.org/ffmpeg/trunk
and 5% faster.
ff_h264_filter_mb_fast() stays the same size, as gcc decided not to inline these
functions there in the first place.
Originally committed as revision 21250 to svn://svn.ffmpeg.org/ffmpeg/trunk
No benchmark because it's just replacing variables with literal constants
(so no risk of slowdown outside gcc silliness) and I need sleep.
Originally committed as revision 21237 to svn://svn.ffmpeg.org/ffmpeg/trunk
Using the low-level macros directly avoids redundant open/update/close
cycles.
2-3% faster on ARM, PPC, and Core i7.
Originally committed as revision 21224 to svn://svn.ffmpeg.org/ffmpeg/trunk
This could have caused a linking failure due to pred_pskip_motion() missing if
a compiler included never-used static functions.
Originally committed as revision 21221 to svn://svn.ffmpeg.org/ffmpeg/trunk
About 1% faster ff_ac3_bit_alloc_calc_psd on Intel Atom, overall speedup
not measurable though.
Should have a bigger effect on systems without cmov or with very slow cmov.
Originally committed as revision 21214 to svn://svn.ffmpeg.org/ffmpeg/trunk
Since BGR24 is decoded as BGR32, fill its alpha channel with 255
using the appropriate predictors.
Originally committed as revision 21211 to svn://svn.ffmpeg.org/ffmpeg/trunk
Simplify cur_band_type, group_len, and coef/offset calculations. This
makes the code easier to read and slightly faster.
Originally committed as revision 21189 to svn://svn.ffmpeg.org/ffmpeg/trunk
The codebooks each consist of a small number of values repeated in
groups of 2 or 4. Storing the codebooks as a packed list of 2- or
4-bit indexes into a table reduces their size substantially (from 7.5k
to 1.5k), resulting in less cache pressure.
For the band types with sign bits in the bitstream, storing the number
and position of non-zero codebook values using a few bits avoids
multiple get_bits() calls and floating-point comparisons which gcc
handles miserably.
Some float/int type punning also avoids gcc brain damage.
Overall speedup 20-35% on Cortex-A8, 20% on Core i7.
Originally committed as revision 21188 to svn://svn.ffmpeg.org/ffmpeg/trunk
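The packed-index idea described above can be sketched as follows. This is
only an illustration, not the committed code: the value table cb_values,
its contents, and the helpers pack4 and unpack4 are hypothetical, and only
the 2-bit, group-of-4 case is shown.

```c
#include <stdint.h>

/* Hypothetical table of the few distinct values one codebook uses. */
static const float cb_values[4] = { 0.0f, 1.0f, -1.0f, 2.0f };

/* Pack four 2-bit indexes (each 0..3) into one byte,
 * with the first index in the lowest bits. */
static uint8_t pack4(unsigned i0, unsigned i1, unsigned i2, unsigned i3)
{
    return (uint8_t)((i0 & 3) | ((i1 & 3) << 2) |
                     ((i2 & 3) << 4) | ((i3 & 3) << 6));
}

/* Expand one packed byte into four floats via the value table. */
static void unpack4(uint8_t packed, float out[4])
{
    for (int i = 0; i < 4; i++) {
        out[i] = cb_values[packed & 3];
        packed >>= 2;
    }
}
```

In this sketch a group of four float values is stored in a single byte plus
a shared table, which is where the size reduction comes from.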
Two of these are in fact constant size, so use the constant instead of
a variable in the declarations. The remaining one is small enough
that always using the maximum size is acceptable.
Originally committed as revision 21183 to svn://svn.ffmpeg.org/ffmpeg/trunk
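As a rough sketch of the last case, where the maximum size is always used
in place of a variable-length array: the bound MAX_BANDS and the function
sum_doubled are hypothetical and only show the pattern.

```c
#include <stddef.h>

#define MAX_BANDS 64  /* hypothetical upper bound on the array size */

/* Doubles n input values into a scratch buffer and returns their sum.
 * The scratch buffer always has the maximum size; with a
 * variable-length array it would have been declared as `int tmp[n];`.
 * Returns -1 if n exceeds the fixed bound. */
static int sum_doubled(const int *in, size_t n)
{
    int tmp[MAX_BANDS];
    int sum = 0;

    if (n > MAX_BANDS)
        return -1;

    for (size_t i = 0; i < n; i++)
        tmp[i] = 2 * in[i];
    for (size_t i = 0; i < n; i++)
        sum += tmp[i];

    return sum;
}
```

The fixed-size version spends a little more stack when n is small, in
exchange for a stack frame whose size is known at compile time.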