* commit '8caadfc53ddc55a269722ada65294f0ea8b609ac':
fate: Be silent when switching to Git branch
Merged-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
This allows to copy information related to the stream ID from the demuxer
to the muxer, thus allowing for example to retain information related to
synchronous and asynchronous KLV data packets. This information is used
in the muxer when remuxing to distinguish the two kind of packets (if the
information is lacking, data packets are considered synchronous).
The fate reference changes are due to the use of
av_packet_merge_side_data(), which increases the size of the output
packet size, since side data is merged into the packet data.
Currently, AVStream contains an embedded AVCodecContext instance, which
is used by demuxers to export stream parameters to the caller and by
muxers to receive stream parameters from the caller. It is also used
internally as the codec context that is passed to parsers.
In addition, it is also widely used by the callers as the decoding (when
demuxer) or encoding (when muxing) context, though this has been
officially discouraged since Libav 11.
There are multiple important problems with this approach:
- the fields in AVCodecContext are in general one of
* stream parameters
* codec options
* codec state
However, it's not clear which ones are which. It is consequently
unclear which fields are a demuxer allowed to set or a muxer allowed to
read. This leads to erratic behaviour depending on whether decoding or
encoding is being performed or not (and whether it uses the AVStream
embedded codec context).
- various synchronization issues arising from the fact that the same
context is used by several different APIs (muxers/demuxers,
parsers, bitstream filters and encoders/decoders) simultaneously, with
there being no clear rules for who can modify what and the different
processes being typically delayed with respect to each other.
- avformat_find_stream_info() making it necessary to support opening
and closing a single codec context multiple times, thus
complicating the semantics of freeing various allocated objects in the
codec context.
Those problems are resolved by replacing the AVStream embedded codec
context with a newly added AVCodecParameters instance, which stores only
the stream parameters exported by the demuxers or read by the muxers.
Also bench a smaller buffer. This drastically reduces --bench runtime
and reports smaller, more readable numbers.
Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: James Almer <jamrial@gmail.com>
The timebase change in the zmbv-8bit test is due to the fact that
previously the timebase string was evaluated as floating point, then
converted to a rational. After this commit, the timebase is passed
directly as is.
If an input file is bigger than 2GB (assume sizeof(int) == 4)),
size0/size1 will overflow, making stddev and PSNR invalid.
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
Otherwise the 'lcov -q --remove' run fails with the following error:
lcov: ERROR: cannot write to coverage.info!
Signed-off-by: Andreas Cadhalpun <Andreas.Cadhalpun@googlemail.com>
When out-of-tree builds now use a relative path, the '-b' option of lcov
is not needed, so just pass the current directory to it in this case.
Signed-off-by: Andreas Cadhalpun <Andreas.Cadhalpun@googlemail.com>
The parser only parses the core, and thus has a duration relative
to the core sample rate only, not the actual stream sample rate.
FATE references changed due to now correct timestamps.
This reverts commit 8ed82d8174.
SMPTE S377-1-2009c defines in F.4.1 that the Video Line Map should
always be an array with two 32 bit integers as elements. This is
repeated in G.2.12 with actual examples for progressive content,
where the second value would always be 0.
Additionally, the IRT MXF analyser also lists this as the only
error in the MXF output from ffmpeg: https://mxf-analyser-cloud.irt.de
Reviewed-by: Tomas Härdin <tomas.hardin@codemill.se>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Since timecode_frame)start is a private option now, it stays at the default,
and is no longer written to the file.
Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
From
https://msdn.microsoft.com/en-us/library/windows/desktop/dd318229%28v=vs.85%29.aspx:
"If biCompression equals BI_RGB and the bitmap uses 8 bpp or less, the
bitmap has a color table immediatelly following the BITMAPINFOHEADER
structure. The color table consists of an array of RGBQUAD values. The
size of the array is given by the biClrUsed member. If biClrUsed is
zero, the array contains the maximum number of colors for the given
bitdepth; that is, 2^biBitCount colors."
Nothing about "monochrome" here. Unfortunately, pal8 to monow conversion
seems a bit flaky, but that's another story.
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Some debuggers/profilers use this metadata to determine which function a
given instruction is in; without it they get can confused by local labels
(if you haven't stripped those). On the other hand, some tools are still
confused even with this metadata. e.g. this fixes `gdb`, but not `perf`.
Currently only implemented for ELF.
Signed-off-by: Anton Khirnov <anton@khirnov.net>
Some debuggers/profilers use this metadata to determine which function a
given instruction is in; without it they get can confused by local labels
(if you haven't stripped those). On the other hand, some tools are still
confused even with this metadata. e.g. this fixes `gdb`, but not `perf`.
Currently only implemented for ELF.
This options is only used by huffyuv, ffvhuv, jpegls, mjpeg,
mpegvideoenc, png, utvideo.
It is a very codec-specific options, so deprecate the global variant.
Set proper limits to the maximum allowed values, and update utvideoenc
tests to use the new option name.
Signed-off-by: Vittorio Giovara <vittorio.giovara@gmail.com>
This option is only used by mpegvideoenc, x264, xavs, and vpx.
It is a very codec-specific option, so deprecate the global variant.
Signed-off-by: Vittorio Giovara <vittorio.giovara@gmail.com>
I/S energy, especially when it comes to phase cancellations,
needs to use signed coefficients as input, yet it was using
abs'd coefficients. That was a slight bug.
The relative error between two encoding strategies is the simple
difference of rate-distortion values, and not the absolute
difference. An absolute measure would allow worsening of the
quantization error as well as improving.
1. Fix sf_idx and band_type addressing to address only the first
subwindow in the group (others could hold garbage values)
2. Don't step on ms_mask when is_mask is set. I/S selection
already sets the ms_mask properly and shouldn't be overridden.
3. Use mid/sid cb/sf when computing coding error, as should be
since those are the cb/sfs that will eventually be set.
4. Fix distortion computation on multi-subwindow groups (was
subtracting the bits terms multiple times)
5. Clear ms_mask when one side uses PNS and the other doesn't.
When using PNS, ms_mask signals correlated noise, which can be
detected just like regular M/S detection, so we don't skip
noise bands, but when only one side uses PNS setting the flag
can confuse some encoders, so avoid that.
Use two separate functions, depending on whether VFP/NEON is available.
This is set to require armv5te - it uses blx, which is only available
since armv5t, but we don't have a separate configure item for that.
(It also uses ldrd, which requires armv5te, but this could be avoided
if necessary.)
Signed-off-by: Martin Storsjö <martin@martin.st>
This commit fixes the lack of palettized display of 1-bit video
in the qtrle decoder. It is related to my commit of
lavf/qtpalette, which added 1-bit video to the "palettized video"
category. As far as I can see, everything works fine, but comments are
of course welcome.
Below are links to sample files, which should now be displayed properly
with bluish colors, but which were previously displayed in black &
white.
Matroska:
https://drive.google.com/open?id=0B3_pEBoLs0faNjI0cHBMWDhYY2c
Earth Spin 1-bit qtrle.mkv
QuickTime (mov):
https://drive.google.com/open?id=0B3_pEBoLs0faUlItWm9KaGJSTEE
Earth Spin 1-bit qtrle.mov
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
* commit '2008f76054906e9ff6bf744800af0e5a5bfe61be':
dca: remove unused decode_hf function and quant_d tables
Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>
* commit 'aebf07075f4244caf591a3af71e5872fe314e87b':
dca: change the core to work with integer coefficients.
Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>
* commit '711781d7a1714ea4eb0217eb1ba04811978c43d1':
x86: checkasm: check for or handle missing cleanup after MMX instructions
Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>
Fixes Ticket #5032
The samples in Ticket #5032 is using \r\r\n as line breaks. Since we
already are handling \r, or \n, or \r\n as line breaks, \r\n\n will be
considered as a double line breaks. This is an issue because
ff_subtitles_read_text_chunk() will as a result stop extracting a chunk
after just one line.
So instead of parsing the SRT by "chunks" (which means splitting every
double LB), this new parser is detecting timing lines, and split the
events on this basis. While this sounds safe and simple, it needs to
take into account the event number preceding the timing line while
handling situations such as:
- event number starting at 0 or actually any number instead of 1
- event numbers not being ordered at all
- event number being followed by text garbage (this really happened,
see Ticket #4898)
- event payload containing one or multiple number (a protagonist saying
a count-down, a date or whatever) which could be confused with a
chapter number
- event number being empty (see Ticket #2167)
- all kind of weird line breaks can appear randomly like wild pokémons
- untrustable line breaks (Ticket #5032)
The sample madness.srt tries to sum up most of this into one sample,
ticket5032-rrn.srt is the file containing \r\r\n line breaks. and
empty-events-2167.srt contains empty events.
Check the full FPU tag word instead of only the lower half and simplify
the comparison.
Use upper-case function base name as macro name to instantiate both
checked_call variants.
The DCA core decoder converts integer coefficients read from the
bitstream to floats just after reading them (along with dequantization).
All the other steps of the audio reconstruction are done with floats
which makes the output for the DTS lossless extension (XLL)
actually lossy.
This patch changes the DCA core to work with integer coefficients
until QMF. At this point the integer coefficients are converted to floats.
The coefficients for the LFE channel (lfe_data) are not touched.
This is the first step for the really lossless XLL decoding.
Not every asm routine is expected clear the MMX state after returning.
It is however a requisite for testing floating point code in checkasm.
Annotate functions requiring cleanup with declare_func_emms() and issue
emms after the call. The remaining functions are checked for having a
cleared MMX state after return.
The vector mode was deprecated in ARMv7-A/VFPv3 and various cpu
implementations do not support it in hardware. Vector mode code will
depending the OS either be emulated in software or result in an illegal
instruction on cpus which does not support it. This was not really
problem in practice since NEON implementations of the same functions are
preferred. It will however become a problem for checkasm which tests
every cpu flag separately.
Since this is a cpu feature newer cpu do not support anymore the
behaviour of this flag differs from the other flags. It can be only
activated by runtime cpu feature selection.
This should fix this test failing on kfreebsd, a regression since
6e5dbe7, which decreased the CMP_TARGET by 1.
Reviewed-by: Rostislav Pehlivanov <atomnuker@gmail.com>
Signed-off-by: Andreas Cadhalpun <Andreas.Cadhalpun@googlemail.com>
When the interpolated value is divided by the sum of weights, no
rounding is done, which means the value is truncated. This results in
a slight bias towards dark green in the interpolated area. Rounding
properly removes the bias.
I measured this change to reduce the interpolation error by 1 to 2 %
on average on a number of sample input and logo area combinations.
Signed-off-by: Jean Delvare <jdelvare@suse.de>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
With only 7 coefficients per short window at most the extra precision
makes a difference and seems to reduce crackling and stddev even
further.
Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
This patch does 4 things, all of which interact and thus it
woudln't be possible to commit them separately without causing
either quality regressions or assertion failures.
Fate comparison targets don't all reflect improvements in
quality, yet listening tests show substantially improved quality
and stability.
1. Increase SF range utilization.
The spec requires SF delta values to be constrained within the
range -60..60. The previous code was applying that range to
the whole SF array and not only the deltas of consecutive values,
because doing so requires smarter code: zeroing or otherwise
skipping a band may invalidate lots of SF choices.
This patch implements that logic to allow the coders to utilize
the full dynamic range of scalefactors, increasing quality quite
considerably, and fixing delta-SF-related assertion failures,
since now the limitation is enforced rather than asserted.
2. PNS tweaks
The previous modification makes big improvements in twoloop's
efficiency, and every time that happens PNS logic needs to be
tweaked accordingly to avoid it from stepping all over twoloop's
decisions. This patch includes modifications of the sort.
3. Account for lowpass cutoff during PSY analysis
The closer PSY's allocation is to final allocation the better
the quality is, and given these modifications, twoloop is now
very efficient at avoiding holes. Thus, to compute accurate
thresholds, PSY needs to account for the lowpass applied
implicitly during twoloop (by zeroing high bands).
This patch makes twoloop set the cutoff in psymodel's context
the first time it runs, and makes PSY account for it during
threshold computation, making PE and threshold computations
closer to the final allocation and thus achieving better
subjective quality.
4. Tweaks to RC lambda tracking loop in relation to PNS
Without this tweak some corner cases cause quality regressions.
Basically, lambda needs to react faster to overall bitrate
efficiency changes since now PNS can be quite successful in
enforcing maximum bitrates, when PSY allocates too many bits
to the lower bands, suppressing the signals RC logic uses to
lower lambda in those cases and causing aggressive PNS.
This tweak makes PNS much less aggressive, though it can still
use some further tweaks.
Also update MIPS specializations and adjust fuzz
Also in lavc/mips/aacpsy_mips.h: remove trailing whitespace
"Fast seek" uses linear interpolation to find the position of the
requested seek time. For CBR this is more direct than using the
mp3 TOC and bypassing the TOC avoids problems with TOC precision.
(see https://crbug.com/545914#c13)
For VBR, fast seek is not precise, so continue to prefer the TOC
when available (the lesser of two evils).
Also, some re-ordering of the logic in mp3_seek to simplify and
give usetoc=1 precedence over fastseek flag.
Signed-off-by: wm4 <nfxjfg@googlemail.com>
As noted in a comment, pe.min in the reference encoder
is centered around current pe. The bit reservoir algo
needs pe.min to be a local minimum, because it can only
account for local PE variations. If it's set to a global
minimum as was being done, bit reservoir logic doesn't
work as efficiently.
This patch tries to forget old minimums and converge to
a local minimum without losing the stability of the
previous solution. Listening tests until now suggest this
solves numerous RC issues.
* commit '823fa7004571cb8404ca5785f9fa6e85f0f9f3d3':
fate: Rework sgi tests into a suite and add the missing ones
Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>
This is never mentioned in the specifications, and decoders work
just as fine without it. Update the fate references since the compressed
file is smaller.
Signed-off-by: Vittorio Giovara <vittorio.giovara@gmail.com>
* commit '1d62ee38894afb696674db78cee8f8d89204a8fe':
movenc: Add a unit test for signaling of the track start times
Merged-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
This way, it never starts with 0xFFF0, and never trips the
ADTS "Detection" code in movenc.c.
Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
* commit '3eeb7edfc2a1157b7b0e0ce21ac2cd44d55d405b':
movenc: Add a unit test for frag_discont with edit lists
Merged-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
All diferences in unit tests have been acounted for.
* commit '59e8ec0aa8ab174701d01a3bfe96fedd0b7fcead':
movenc: Add an API unit test for fragmenting options/calls
Merged-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
a set ost->frame_rate does not imply CFR in ffmpeg
The changed fate tests had all wrong packet durations
(like 1/1000 or 1/90000)
There might be more cases in which is_cfr could be set
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Also support disabling them as they seem to cause problems to some
Users. They are also not allowed in IRT D-10 thus the default for
mxf_d10 is not to write them
This also decreases the filesize when no user comment are stored
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Contrary to the normal fate tests that run via avconv, this tests
nontrivial call sequences that are only doable via the API
(mainly for different corner cases when using the muxer for
segmenting).
The test muxes fake packet data (with extradata that looks
enough like proper data to make the file be viewable with e.g.
boxdumper) and checks the hash of the produced files. The test also
verifies that fragments produced via different call sequences remain
identical (to avoid e.g. updating the output hashes and suddenly
having fragments that used to be identical suddenly diverging), for
fragments written with frag_discont and/or delay_moov.
Signed-off-by: Martin Storsjö <martin@martin.st>
* commit '3efd71b4d0b4a73ccbbbdc092e6bbd54d92633f4':
avconv: set packet duration for CFR video streams
Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>
The CMP variable seems to have been inherited from fate-api-seek which set it to null
the mxf reference needed a change due to c7e14a279f
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Trim unneeded leading components and trailing zeros.
Move the formating code in a separate function.
Use the function also to format the default value, it was currently
printed as plain integer, inconsistent to the way it is parsed.
Similar to testsrc, but using drawutils and therefore
supporting a lot of pixel formats instead of just rgb24.
This allows using it as input for other tests without
requiring a format conversion.
It is also slightly faster than testsrc for some reason.
Signed-off-by: Steven Robertson <steven@strobe.cc>
Reviewed-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
For the 10-show-existing-frame, the source file indeed has a timestamp
of 3 (or 100/33) for the second visible frame, so the fix appears to
work correctly. For the other, only the timebase is fixed, but again
appears to be correct now.
This fixes a fate failure after bumping the minor version
Its unknown why this is not needed for the other aac tests,
more investigation needed but for now i dont want to leave
it broken while its investigated
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
There were some errors in the calculation as well as an entire
unnecessary loop to find the gain coefficient. Merge the
two loops.
Thanks to @ubitux for the suggestions and testing.
The fate test command line is supposed to serve as an example. It's
nicer to explicitly state the profile rather than setting options
to force it for you.
GCC 3.4 miscompiles it on sunos. Date of release? The second of
August two thousand and five, anno Domini. That's ten years two
months and fourteen days ago. Three thousand seven hundred and
twenty seven days ago. One sixth of the average life expectancy
of a person living in a country with a human development index
of zero point eight hundred and eight, equality adjusted.
GCC 4.3 also miscompiles it, though not as bad.
The LTP encoding and the test is a bit slow currently, taking twice
the amount of time the other tests do, so in the future the
total time to encode might be cut down on that test.
These aren't quite as helpful as the ones in 8bpp, since over there,
we can use pmulhrsw, but here the coefficients have too many bits to
be able to take advantage of pmulhrsw. However, we can still skip
cols for which all coefs are 0, and instead just zero the input data
for the row itx. This helps a few % on overall decoding speed.
It was useful to (accidentally?) spot an overflow in the column pass
of the x86 simple_idct10 implementation.
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
omse goes from 0.03060703 (which fails for dct-test) to 0.01663750.
This also actually improve the error of decoding the sample generated
by fate-vsynth3-dnxhd1080i-10bit using simple_idct10 to FAANI, which
goes (when resampled to yuv422p) from:
stddev: 0.06 PSNR: 72.28 MAXDIFF: 1
to identical.
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Includes escapes that should now be supported and a few features not yet
fully supported, like comments, regions, classes, ruby, and lang.
All were tested with https://quuz.org/webvtt/ for validation, except
regions because the validator doesn't support them yet, and I couldn't
find any other way to validate WebVTT.
Signed-off-by: Ricardo Constantino <wiiaboo@gmail.com>
It was merged with the iff_ilbm decoder in commit
929a24efff.
Define AV_CODEC_ID_IFF_BYTERUN1 as AV_CODEC_ID_IFF_ILBM for API
compatibility.
Reviewed-by: Ronald S. Bultje <rsbultje@gmail.com>
Signed-off-by: Andreas Cadhalpun <Andreas.Cadhalpun@googlemail.com>
This finalizes merging of the work in the patches in ticket #2686.
Improvements to twoloop and RC logic are extensive.
The non-exhaustive list of twoloop improvments includes:
- Tweaks to distortion limits on the RD optimization phase of twoloop
- Deeper search in twoloop
- PNS information marking to let twoloop decide when to use it
(turned out having the decision made separately wasn't working)
- Tonal band detection and priorization
- Better band energy conservation rules
- Strict hole avoidance
For rate control:
- Use psymodel's bit allocation to allow proper use of the bit
reservoir. Don't work against the bit reservoir by moving lambda
in the opposite direction when psymodel decides to allocate more/less
bits to a frame.
- Retry the encode if the effective rate lies outside a reasonable
margin of psymodel's allocation or the selected ABR.
- Log average lambda at the end. Useful info for everyone, but especially
for tuning of the various encoder constants that relate to lambda
feedback.
Psy:
- Do not apply lowpass with a FIR filter, instead just let the coder
zero bands above the cutoff. The FIR filter induces group delay,
and while zeroing bands causes ripple, it's lost in the quantization
noise.
- Experimental VBR bit allocation code
- Tweak automatic lowpass filter threshold to maximize audio bandwidth
at all bitrates while still providing acceptable, stable quality.
I/S:
- Phase decision fixes. Unrelated to #2686, but the bugs only surfaced
when the merge was finalized. Measure I/S band energy accounting for
phase, and prevent I/S and M/S from being applied both.
PNS:
- Avoid marking short bands with PNS when they're part of a window
group in which there's a large variation of energy from one window
to the next. PNS can't preserve those and the effect is extremely
noticeable.
M/S:
- Implement BMLD protection similar to the specified in
ISO-IEC/13818:7-2003, Appendix C Section 6.1. Since M/S decision
doesn't conform to section 6.1, a different method had to be
implemented, but should provide equivalent protection.
- Move the decision logic closer to the method specified in
ISO-IEC/13818:7-2003, Appendix C Section 6.1. Specifically,
make sure M/S needs less bits than dual stereo.
- Don't apply M/S in bands that are using I/S
Now, this of course needed adjustments in the compare targets and
fuzz factors of the AAC encoder's fate tests, but if wondering why
the targets go up (more distortion), consider the previous coder
was using too many bits on LF content (far more than required by
psy), and thus those signals will now be more distorted, not less.
The extra distortion isn't audible though, I carried extensive
ABX testing to make sure.
A very similar patch was also extensively tested by Kamendo2 in
the context of #2686.
Currently only 2 profiles are evaluated because they are the only 2
with distributed test sequences.
- CID 1260: YUV 4:2:2 10 bits with block-adaptive interlace coding,
from ticket 4876;
- CID 1270: YUV 4:4:4 10 bits (HR), 1920x839, from ticket 4581.
They were generated from the ticket sequences by running the
following kind of command-line;
ffmpeg -i $INPUT -an -sn -vcodec copy -vframes 1 -y $OUTPUT.mov
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
The current one, while correct, does not yield the best possible
results. The specificiations suggest another formula, which results
in quality gains in the decoded output from fate tests. This
justifies changing said formula.
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Convert them to zigzag order, as the rest of them are.
When I was adding support for 10-bit DNxHD, I just copy-pasted the
missing quant matrices from the spec. Now it turns out the existing
matrices in dnxhddata.c were in zigzag order. This resulted in wrong
quantization for 10-bit DNxHD. The attached patch fixes the problem by
converting 10-bit quant matrices to zigzag order.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
The System V ABI on x86-64 specifies that the al register contains an upper
bound of the number of arguments passed in vector registers when calling
variadic functions, so we aren't allowed to clobber it.
checkasm_fail_func() is a variadic function so also zero al before calling it.
Signed-off-by: Anton Khirnov <anton@khirnov.net>
Tested functions are internally kept in a binary search tree for efficient
lookups. The downside of the current implementation is that the tree quickly
becomes unbalanced which causes an unneccessary amount of comparisons between
nodes. Improve this by changing the tree into a self-balancing left-leaning
red-black tree with a worst case lookup/insertion time complexity of O(log n).
Significantly reduces the recursion depth and makes the tests run around 10%
faster overall. The relative performance improvement compared to the existing
non-balanced tree will also most likely increase as more tests are added.
Signed-off-by: Anton Khirnov <anton@khirnov.net>
The System V ABI on x86-64 specifies that the al register contains an upper
bound of the number of arguments passed in vector registers when calling
variadic functions, so we aren't allowed to clobber it.
checkasm_fail_func() is a variadic function so also zero al before calling it.
Tested functions are internally kept in a binary search tree for efficient
lookups. The downside of the current implementation is that the tree quickly
becomes unbalanced which causes an unneccessary amount of comparisons between
nodes. Improve this by changing the tree into a self-balancing left-leaning
red-black tree with a worst case lookup/insertion time complexity of O(log n).
Significantly reduces the recursion depth and makes the tests run around 10%
faster overall. The relative performance improvement compared to the existing
non-balanced tree will also most likely increase as more tests are added.
This patch tweaks search_for_pns to be both more
aggressive and more careful when applying PNS. On
the one side, it will again try to use PNS on zero
(or effectively zero) bands. For this, both zeroes
and band_type have to be checked (some ZERO bands
aren't marked in zeroes). On the other side, a more
accurate rate-distortion measure avoids using PNS
where it would cause audible distortion.
Also fixed a small bug in the computation of freq
that caused PNS usage on low-frequency bands during
8-short windows. This allows re-enabling PNS during
8-short.
The sample position is made weird and non-nominal to force catching
such issues as default values or specialized operations hiding
issues in corner cases.
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
This patch modifies the encode frame function to
retry encoding the frame when the resulting bit count
is too far off target, but only adjusting lambda
in small, incremental step. It also makes the logic
more conservative - otherwise it will contend with
bit reservoir-related variations in bit allocation,
and result in artifacts when frame have to be truncated
(usually at high bit rates transitioning from low
complexity to high complexity).
The randomize_buffer() implementation assures that "most of the time",
we'll do a good mix of wide16/wide8/hev/regular/no filters for complete
code coverage. However, this is not mathematically assured because that
would make the code either much more complex, or much less random.
This patch refactors the AAC coders to reuse code
between the MIPS port and the regular, portable C code.
There were two main functions that had to use
hand-optimized versions of quantization code:
- search_for_quantizers_twoloop
- codebook_trellis_rate
Those two were split into their own template header
files so they can be inlined inside both the MIPS port
and the generic code. In each context, they'll link
to their specialized implementations, and thus be
optimized by the compiler.
This approach I believe is better than maintaining
several copies of each function. As past experience has
proven, having to keep those in sync was error prone.
In this way, they will remain in sync by default.
Also, an implementation of the dequantized output
argument for the optimized quantize_and_encode
functions is included in the patch. While the current
implementation of search_for_pred still isn't using
it, future iterations of main prediction probably will.
It should not imply any measurable performance hit while
not being used.
This introduces a slight timebase computation difference in zmbv-8bit
fate test. This is expected since the new options are double instead
of ints, and the additional precision skews the results in a non
meaningful way.
The recent commits change the value slightly. Even though it's
within the threshold it's better to risk as little as possible
especially when different systems, processors, FPUs and compilers
are involved.
Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
This commit changes a few things about the noise substitution
logic:
- Brings back the quantization factor (reduced to 3) during
scalefactor index calculations.
- Rejects any zeroed bands. They should be inaudiable and it's
a waste transmitting the scalefactor indices for these.
- Uses swb_offsets instead of incrementing a 'start' with every
window group size.
- Rejects all PNS during short windows.
Overall improves quality. There was a plan to use the lfg system
to create the random numbers instead of using whatever the decoder
uses but for now this works fine. Entropy is far from important here.
Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
This commit once again improves the PNS implementation by scaling the
thresholds with frequency. The thresholds get looser as the frequency
increases since higher frequencies are basically noise to human ears.
Also, this introduces quantization error correction for PNS. Should
the error be too much, no PNS will be used. The energy_ratio is used
to regulate the actual encoded PNS energy: if the generated PNS
energy is higher than the energy from the psy system, energy_ratio
is used to correct it so that hopefully once requantized and
transmitted the value in the decoder will be closer to what the
encoder has.
Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
This was an oversight when the IS system was being first implemented.
The ener01 part was largely a result of trial and error and the fact
that the sum of coef0 and coef1 could result in a zero was
overlooked. Once ener01 turns to zero it's used to divide the left
channel energy which doesn't turn out so well as it fills IS[]
with -nan's and inf's which in turn confused the quantize_band_cost.
Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
FATE refs changed to accomodate for the new default behavior of the function.
Numbers are now interpreted as a channel layout, instead of a number of channels.