Reviewed-by: Reimar Döffinger <Reimar.Doeffinger@gmx.de>
Reviewed-by: Carl Eugen Hoyos <cehoyos@ag.or.at>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
pb_eo must be handled as a rip relative address for MSVC64, so an
intermediate register is needed. Should fix link failures.
Suggested by Hendrik Leppkes and Christophe Gisquet.
Tested-By: Hendrik Leppkes <h.leppkes@gmail.com>
Reviewed-by: Christophe Gisquet <christophe.gisquet@gmail.com>
Signed-off-by: James Almer <jamrial@gmail.com>
The epel_hv functions were still relying on only epel_hv 8-wide
being the maximum width instanciated.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
Fixes out of array read
Fixes: asan_static-oob_30328b6_719_cov_3325483287_H264_artifacts_motion.h264
Found-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
This reverts commit 3b4ffba3af.
Unbreaks the SSSE3 code on mingw32
Conflicts:
libavcodec/x86/lossless_audiodsp.asm
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
This is needed as the mmx code is used as fallback from the ssse3 code
Suggested-by: jamrial
Tested-by: wm4
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
The buffer pointers would be otherwise overwritten, causing a
leak on e.g. PERSIST_RPARAM_A_RExt_Sony_1.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
All the webm/vp9 files I have seen so far can have packets that contain
1 invisible and 1 visible frame. The vp9 parser separates them. Since
the invisible frame is always (?) the first sub-packet, the new packet
is assigned the PTS of the original packet, while the packet containing
the visible frame has no PTS.
This patch essentially reassigns the PTS from the invisible to the
visible frame.
Reviewed-by: "Ronald S. Bultje" <rsbultje@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
Original x86 intrinsics code by Pierre-Edouard Lepere.
Yasm port, refactoring and optimizations by James Almer.
Benchmarks of BQTerrace_1920x1080_60_qp22.bin with an Intel Core i5-4200U
Width 32
342694 decicycles in sao_edge_filter_10, 16384 runs, 0 skips
29476 decicycles in ff_hevc_sao_edge_filter_32_10_ssse3, 16384 runs, 0 skips
13996 decicycles in ff_hevc_sao_edge_filter_32_10_avx2, 16381 runs, 3 skips
Width 64
581163 decicycles in sao_edge_filter_10, 8192 runs, 0 skips
59774 decicycles in ff_hevc_sao_edge_filter_64_10_ssse3, 8192 runs, 0 skips
28383 decicycles in ff_hevc_sao_edge_filter_64_10_avx2, 8191 runs, 1 skips
Signed-off-by: James Almer <jamrial@gmail.com>
Original x86 intrinsics code and initial yasm port by Pierre-Edouard Lepere.
Refactoring and optimizations by James Almer.
Benchmarks of BQTerrace_1920x1080_60_qp22.bin with an Intel Core i5-4200U
Width 32
158583 decicycles in edge, sao_edge_filter_8 runs, 0 skips
5205 decicycles in ff_hevc_sao_edge_filter_32_8_ssse3, 32767 runs, 1 skips
2942 decicycles in ff_hevc_sao_edge_filter_32_8_avx2, 32767 runs, 1 skips
Width 64
705639 decicycles in sao_edge_filter_8, 262144 runs, 0 skips
19224 decicycles in ff_hevc_sao_edge_filter_64_8_ssse3, 262111 runs, 33 skips
10433 decicycles in ff_hevc_sao_edge_filter_64_8_avx2, 262115 runs, 29 skips
Signed-off-by: James Almer <jamrial@gmail.com>
This ensures we do not loose the frame in case or multiple clears
Fixes out of array read
Fixes: asan_heap-oob_2fa47ea_2100_cov_1278768963_ff_add_pixels_clamped_mmx.m2ts
Found-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
ff_avc_write_annexb_extradata() allocates extradata, but don't add
FF_INPUT_BUFFER_PADDING_SIZE value
Signed-off-by: Lukasz Marek <lukasz.m.luki2@gmail.com>
For reasons we are not privy to, nvidia decided that the nvenc encoder
should apply aspect ratio compensation to 'DVD like' content, assuming that
the content is not bt.601 compliant, but needs to be bt.601 compliant. In
this context, that means that they make the following, questionable,
assumptions:
1) If the input dimensions are 720x480 or 720x576, assume the content has
an active area of 704x480 or 704x576.
2) Assume that whatever the input sample aspect ratio is, it does not account
for the difference between 'physical' and 'active' dimensions.
From, these assumptions, they then conclude that they can 'help', by adjusting
the sample aspect ratio by a factor of 45/44. And indeed, if you wanted to
display only the 704 wide active area with the same aspect ratio as the full
720 wide image - this would be the correct adjustment factor, but what if you
don't? And more importantly, what if you're used to ffmpeg not making this kind
of adjustment at encode time - because none of the other encoders do this!
And, what if you had already accounted for bt.601 and your input had the
correct attributes? Well, it's going to apply the compensation anyway!
So, if you take some content, and feed it through nvenc repeatedly, it
will keep scaling the aspect ratio every time, stretching your video out
more and more and more.
So, clearly, regardless of whether you want to apply bt.601 aspect ratio
adjustments or not, this is not the way to do it. With any other ffmpeg
encoder, you would do it as part of defining your input paramters or
do the adjustment at playback time, and there's no reason by nvenc
should be any different.
This change adds some logic to undo the compensation that nvenc would
otherwise do.
nvidia engineers have told us that they will work to make this
compensation mechanism optional in a future release of the nvenc
SDK. At that point, we can adapt accordingly.
Signed-off-by: Philip Langdale <philipl@overt.org>
Reviewed-by: Timo Rothenpieler <timo@rothenpieler.org>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
Fixes out of array read
Fixes: asan_heap-oob_1fb2f9b_3780_cov_3984375136_usf.mkv
Found-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
Fixes integer overflow and out of array read
Fixes: asan_heap-oob_1fb2f9b_3780_cov_3984375136_usf.mkv
Found-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
As with sao_band_filter, pass instead the two variables from the struct needed in the function.
This simplifies writing asm optimized versions.
Reviewed-by: Mickaël Raulet <mraulet@insa-rennes.fr>
Signed-off-by: James Almer <jamrial@gmail.com>