Runtime of all IDCTs together goes from 3327 to 2473 cycles (intra, i.e.
~35% faster) or from 2312 to 1448 cycles (inter, i.e. ~60% faster). Total
decode time of ped1080p.webm goes from 8.086sec to 7.974sec (1.4% faster).
Sub-IDCTs will follow later. ped1080.webm goes from 9.295s to 8.191s
(13.5% faster). The IDCT itself goes from 4372 (intra) or 4337 (inter)
to 403 (intra) or 329 (inter) cycles for the DC-only form, 23755 (intra)
or 23723 (inter) to 3497 (intra) or 3607 (inter) cycles for the no-DC
form, which averages from 23393 (intra) or 16612 (inter) to 3449 (intra)
or 2392 (inter) for all 32x32s together, i.e. about ~7x faster (all
tests done on ped1080p.webm).
Fixes use of uninitialized memory
Fixes: msan_uninit-mem_7fcecee73d71_6470_luckynight-partial.tak
Found-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
The function macro always sets .align 2 before declaring the
function label (since 5c5e1ea3) and always sets the section to
.text (since 278caa6a).
The .align 5 before certain functions, added in fc252eba, were added
before .text and .align were added to the function macro and thus
became useless/unused when the function macro got them.
This restores the original intention, to align the loop entry
points.
Signed-off-by: Martin Storsjö <martin@martin.st>
This file no longer uses the pld instruction at all, all such uses
have been split into hpeldsp_arm.S.
Signed-off-by: Martin Storsjö <martin@martin.st>
Fixes use of uninitialized memory
Partly fixes; msan_uninit-mem_7fb7d24780d0_2744_R03T.CAK
Found-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
Fixes use of uninitialized memory
Partly fixes; msan_uninit-mem_7fb7d24780d0_2744_R03T.CAK
Found-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
Fixes use of uninitialized memory
Fixes: msan_uninit-mem_7f6e13c220d0_8489_short.flac
Found-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
Fixes use of uninitialized memory
Fixes: msan_uninit-mem_7f49667d83db_3396_WebVTT_capability_tester.vtt
Found-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
Fixes use of uninitialized memory
Fixes: msan_uninit-mem_7fa8c49400d0_3923_audiosig.rm
Found-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
Fixes use of uninitialized memory
Fixes: msan_uninit-mem_7f2785ab8669_6838_mewmew_vorbis_ssa.nut
Found-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* commit '9409c9bdbfd829353473ee6cc3e91c726481c069':
configure: Disable networking if winsock2.h is available but winsock functions aren't
Merged-by: Michael Niedermayer <michaelni@gmx.at>
Previously, if neither of the checks for the closesocket function
succeeded, we still kept winsock2.h and networking in general
enabled.
When targeting the WinRT API subset, the winsock2.h header is
available (making the check for it succeed, giving the impression
that winsock is available), but tests that actually try to use
such a function will fail. In this case, disable the winsock2.h
feature and networking in general, as if the winsock2.h header
test would have failed in the first place.
Signed-off-by: Martin Storsjö <martin@martin.st>
The new code is faster and reuses the previous state in case of
multiple calls.
The previous code could easily end up in near-infinite loops,
if the difference between two clock() calls never was larger than
1.
This makes fate-parseutils finish in finite time when run in wine,
if CryptGenRandom isn't available (which e.g. isn't available if
targeting Windows RT/metro).
Patch originally by Michael Niedermayer but with some modifications
by Martin Storsjö.
Signed-off-by: Martin Storsjö <martin@martin.st>
* qatar/master:
configure: Update freetype check to follow upstream
Conflicts:
configure
Not merged, as its broken (Reported by ubitux)
See: cea5812fa7
Merged-by: Michael Niedermayer <michaelni@gmx.at>
* commit 'a03a642d5ceb5f2f7c6ebbf56ff365dfbcdb65eb':
h264: do not use 422 functions for monochrome
See: 07abf13da4
Merged-by: Michael Niedermayer <michaelni@gmx.at>
* commit '9eef9eb3014b2ed9c3ff4aac510a9f04edb555cf':
h264: check that execute_decode_slices() is not called too many times
Conflicts:
libavcodec/h264.c
The check is replaced by an assert() as the mb index should not ever go out
of bounds.
Merged-by: Michael Niedermayer <michaelni@gmx.at>
* commit '9a026c72982faf20e1c8dfbe48f0b312cdea69c8':
h264: rebuild the default ref list if the reference count changes
Conflicts:
libavcodec/h264.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
* commit '50079a6aa93291e6dc9d9fb8d33da83f79e9311d':
lavc: do not leak the internal frame if opening the codec fails
Conflicts:
libavcodec/utils.c
See: 8b285f03f7
Merged-by: Michael Niedermayer <michaelni@gmx.at>
* commit '8058284ce09030b47512746d726fb2ad3ae8a20f':
lavc: add 422/444 YUV with alpha to align_dimensions()
Conflicts:
libavcodec/utils.c
Only cosmetical changes happen as these formats already where in the list before
Merged-by: Michael Niedermayer <michaelni@gmx.at>
* commit '2f97094608cfd2665660f7a26a3291559b186752':
lagarith: do not call simd functions on unaligned lines
Conflicts:
libavcodec/lagarith.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>