Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: Thomas Mundt <tmundt75@gmail.com>
Signed-off-by: James Almer <jamrial@gmail.com>
The complex vertical low-pass filter slightly over-sharpens the picture. This becomes visible when several transcodings are cascaded and the error potentises, e.g. some generations of HD->SD SD->HD.
To prevent this behaviour the destination pixel must not exceed the source pixel when the average of the pixels above and below is less than the source pixel. And the other way around.
Tested and approved in a visual transcoding cascade test by video professionals.
SSIM/PSNR test with the first generation of an HD->SD file as a reference against the 6th generation(3 x SD->HD HD->SD):
Results without the patch:
SSIM Y:0.956508 (13.615881) U:0.991601 (20.757750) V:0.993004 (21.551382) All:0.974405 (15.918463)
PSNR y:31.838009 u:48.424280 v:48.962711 average:34.759466 min:31.699297 max:40.857847
Results with the patch:
SSIM Y:0.970051 (15.236232) U:0.991883 (20.905857) V:0.993174 (21.658049) All:0.981290 (17.279202)
PSNR y:34.412108 u:48.504454 v:48.969496 average:37.264644 min:34.310637 max:42.373392
Signed-off-by: Thomas Mundt <tmundt75@gmail.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
On ARM platforms, accessing the PMU registers requires special user
access permissions. Since there is no other way to get accurate timers,
the current implementation of timers in FFmpeg rely on these registers.
Unfortunately, enabling user access to these registers on Linux is not
trivial, and generally involve compiling a random and unreliable github
kernel module, or patching somehow your kernel.
Such module is very unlikely to reach the upstream anytime soon. Quoting
Robin Murphin from ARM:
> Say you do give userspace direct access to the PMU; now run two or more
> programs at once that believe they can use the counters for their own
> "minimal-overhead" profiling. Have fun interpreting those results...
>
> And that's not even getting into the implications of scheduling across
> different CPUs, CPUidle, etc. where the PMU state is completely beyond
> userspace's control. In general, the plan to provide userspace with
> something which might happen to just about work in a few corner cases,
> but is meaningless, misleading or downright broken in all others, is to
> never do so.
As a result, the alternative is to use the Performance Monitoring Linux
API which makes use of these registers internally (assuming the PMU of
your ARM board is supported in the kernel, which is definitely not a
given...).
While the Linux API is obviously cross platform, it does have a
significant overhead which needs to be taken into account. As a result,
that mode is only weakly enabled on ARM platforms exclusively.
Note on the non flexibility of the implementation: the timers (native
FFmpeg vs Linux API) are selected at compilation time to prevent the
need of function calls, which would result in a negative impact on the
cycle counters.
Adds another test for asetnsamples filter where padding of the last
frame is switched off. Renames the existing test to make the difference
obvious.
Tested-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: Tobias Rapp <t.rapp@noa-archive.com>
Makes the handling of unspecified/unknown color_range values on stream
level consistent to the value used on frame level.
Signed-off-by: Tobias Rapp <t.rapp@noa-archive.com>
This reverts commit 547db1eaec.
This commit wasn't supposed to be pushed (yet) since it hasn't
been reviewed.
Signed-off-by: Martin Storsjö <martin@martin.st>
When we use dllexport properly for shared libraries on windows,
there's no longer any issue with linking the object files for
e.g. libavcodec statically into checkasm. (It's still not possible
to link the built object files for e.g. libavformat statically to
libavcodec though, since libavformat exepcts to load av_export_*
symbols from a DLL.)
This reverts commit 4e62b57ee0.
Signed-off-by: Martin Storsjö <martin@martin.st>
Adds FATE tests for the previously untested allrgb, allyuv, rgbtestsrc,
smptebars, smptehdbars and yuvtestsrc filters.
Also adds a test for testsrc2 filter with rgb+alpha.
Tested-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: Tobias Rapp <t.rapp@noa-archive.com>
The -map option allows for a trailing ? so that an error is not thrown if
the input stream does not exist.
This capability is extended to the map_channel option.
This allows a ffmpeg command not to break if an input channel does not
exist, which can be of use (for instance, scripts processing audio
channels with sources having unset number of audio channels).
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
When sidx box support is enabled, the code will skip reading all
trun boxes (each containing ctts entries for samples inthat box).
If seeks are attempted before all ctts values are known, the old
code would dump ctts entries into the wrong location. These are
then used to compute pts values which leads to out of order and
incorrectly timestamped packets.
This patch fixes ctts processing by always using the index returned
by av_add_index_entry() as the ctts_data index. When the index gains
new entries old values are reshuffled as appropriate.
This approach makes sense since the mov demuxer is already relying
on the mapping of AVIndex entries to samples for correct demuxing.
As a result of this all ctts entries are now 1-count. A followup
change will be submitted to remove support for > 1 count entries
which will simplify seeking.
Notes for future improvement:
Probably there are other boxes (stts, stsc, etc) that are impacted
by this issue... this patch only attempts to fix ctts since it
completely breaks packet timestamping.
This patch continues using an array for the ctts data, which is not
the most ideal given the rearrangement that needs to happen (via
memmove as new entries are read in). Ideally AVIndex and the ctts
data would be set-type structures so addition is always worst case
O(lg(n)) instead of the O(n^2) that exists now; this slowdown is
noticeable during seeks.
Signed-off-by: Dale Curtis <dalecurtis@chromium.org>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Since there is no information about the source format, "unspecified"
is the correct value to write here.
All tests using the MPEG-2 encoder are updated, as this changes the
header on all outputs.
Fixes filter-pixfmts-scale test failing on big-endian systems due to
alpSrc not being cast to (const int32_t**).
Also fixes distortions in the output alpha channel values by copying the
alpha channel code from the rgba64 case found elsewhere in output.c.
Fixes ticket 6555.
Signed-off-by: James Cowgill <James.Cowgill@imgtec.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
This commit switches off forced correct nesting of tags and only keeps
it for font tags. See long explanations in the code for the rationale.
This results in various FATE changes which I'll explain here:
- various swapping in font attributes, this is mostly noise due to the
old reverse stack way of printing them. The new one is more correct as
the last attribute takes over the previous ones.
- unrecognized tags disappears
- invalid tags that were previously displayed aren't anymore (instead,
we have a warning). This is better for the end user
The main benefit of this commit is to be more tolerant to error, leading
to a better handling of badly nested tags or random wrong formatting for
the end user.
This reverts commit 04aa09c4bc
and reintroduces 0ff5567a30 that
was temporarily reverted due to minor regressions.
It also reverts e5bce8b4ce that fixed FATE refs.
The fate-ffm change is caused by field_order now being set
on the output format because the first frame arrives earlier.
The fate-mxf change is assumed to be the same.
The scale2ref filter will now maintain the DAR of the main input and
not the DAR of the reference input. This previous behavior was deemed
counterintuitive for most (all?) use-cases.
Before:
scale2ref=iw/4:ow/mdar
in w:320 h:240 fmt:rgb24 sar:1/1
ref w:640 h:360 fmt:rgb24 sar:1/1
out w:160 h:120 fmt:rgb24 sar:4/3 flags:0x2
SAR: ((120 * 640) / (160 * 360)) * (1 / 1) = 4 / 3
DAR: (160 / 120) * (4 / 3) = 16 / 9
(main out now same DAR as ref)
Now:
scale2ref=iw/4:ow/mdar
in w:320 h:240 fmt:rgb24 sar:1/1
ref w:640 h:360 fmt:rgb24 sar:1/1
out w:160 h:120 fmt:rgb24 sar:1/1 flags:0x2
SAR: ((120 * 320) / (160 * 240)) * (1 / 1) = 1 / 1
DAR: (160 / 120) * (1 / 1) = 4 / 3
(main out same DAR as main in)
The scale2ref FATE test has also been updated.
Signed-off-by: Kevin Mark <kmark937@gmail.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
This is actually internal utvideo format.
Allows to make use of SIMD for median prediction for rgb(a) formats,
thus speeding up decoding.
Simplifies code, eases further developement and maintenance.
Update FATE because of pixel format switch.
Signed-off-by: Paul B Mahol <onemda@gmail.com>
<@jamrial> durandal_1707: 04aa09c4bc broke fate-lavf-ffm and fate-lavf-mxf
<@durandal_1707> how so?
<@jamrial> one byte changes
<@durandal_1707> jamrial: just update checksums
<@jamrial> durandal_1707: but why did they change at all? the commit you reverted didn't affect them
<@jamrial> why does reverting it affect these tests?
<@jamrial> i don't think updating the checksum without knowing what changed is a good idea
<@durandal_1707> jamrial: the lavfi core is in weird state after removal of recursive code
<@durandal_1707> jamrial: the change is that older ones would get progressive flag set and new one doesnt
<@jamrial> alright
The md5 protocol has no seek support, but some tests use seeks. This changes
the fate tests to actually create the output files and calculate the md5 on the
written files, which also makes the tests independent of the size of the output
buffers and output buffering in general.
A new md5pipe fate test method is also introduced to keep the old functionality
for tests where using a non-seekable output was intentional, and matroska md5
tests are changed to use that.
Signed-off-by: Marton Balint <cus@passwd.hu>
Meant for DSP functions returning a float or double, as they'd fail if emms
is called after every run on x86_32.
Signed-off-by: James Almer <jamrial@gmail.com>
If the videos starts with B frame, then the minimum composition time
as computed by stts + ctts will be non-zero. Hence we need to shift
the DTS, so that the first pts is zero. This was the intention of that
code-block. However it was subtracting by the wrong amount.
For example, for one of the videos in the bug nonFormatted.mp4 we have
stts:
sample_count duration
960 1001
ctts:
sample_count duration
1 3003
2 0
1 3003
....
The resulting composition times are : 3003, 1001, 2002, 6006, ...
The minimum composition time or PTS is 1001, which should be used to
offset DTS. However the code block was wrongly using ctts[0] which is
3003. Hence the PTS was negative. This change computes the minimum pts
encountered while fixing the index, and then subtracts it from all the
timestamps after the edit list fixes are applied.
Samples files available from:
https://bugs.chromium.org/p/chromium/issues/detail?id=721451https://bugs.chromium.org/p/chromium/issues/detail?id=723537
fate-suite/h264/twofields_packet.mp4 is a similar file starting with 2
B frames. Before this change the PTS of first two B-frames was -6006
and -3003, and I am guessing one of them got dropped when being decoded
and remuxed to the framecrc before, and now it is not being dropped.
Signed-off-by: Sasi Inguva <isasi@google.com>
This test the demuxer discarding non ADTS frames at the beginning and
end of the input.
As a side effect, this commit also enables fate-adts-demux, which was
accidentally disabled in 324f0fbff1.
Tested-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: James Almer <jamrial@gmail.com>
This new FATE test for the scale2ref filter makes use of the recently
added scale2ref-specific variables to maintain the aspect ratio of a
test input.
Filtergraph explanation:
[main] has an AR of 4:3. [ref] has an AR of 16:9.
640 / 4 = 160. So the new width for [main] is 160.
160 / ((320 / 240) * (1 / 1)) = 160 / (4 / 3) = 120. So the new
height for [main] is 120.
160 / 120 = 4 / 3 so [main]'s aspect ratio has been maintained while
using [ref]'s width as a reference point.
[ref] is nullsink'd since it is left unchanged by scale2ref (and so
shouldn't need to be tested).
If we were to use "iw/4:-1" in place of "iw/4:ow/mdar":
640 / 4 = 160. So the new width for [main] would be 160.
360 / 4 = 90. So the new height for [main] would be 90.
160 / 90 = 16 / 9 so [main] now has the same aspect ratio as [ref]
which is probably what you do not want.
This is currently the only test for scale2ref.
Signed-off-by: Kevin Mark <kmark937@gmail.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
This removes the current API violating behavior of overwritting the stream's
extradata during packet filtering, something that should not happen after the
av_bsf_init() call.
The bitstream filter generated extradata is no longer available during
write_header(), and as such not usable with non seekable output. The FATE
tests are updated to reflect this.
Signed-off-by: James Almer <jamrial@gmail.com>
Loads from this strictly doesn't require alignment, but specify it
just for consistency with the arm version.
Signed-off-by: Martin Storsjö <martin@martin.st>
* commit '019ab88a95cb31b698506d90e8ce56695a7f1cc5':
lavc: add an option for exporting cropping information to the caller
Merged-by: James Almer <jamrial@gmail.com>
* commit '4e62b57ee03928c12a3119dcaf78ffa1f4d6985f':
fate: Skip the checkasm test if CONFIG_STATIC is disabled
Merged-by: Clément Bœsch <cboesch@gopro.com>
* commit '5c83b4d550ea42653fece092987bab56ccc32ead':
fate: Unset the sig variable if ignoring a test failure
Merged-by: Clément Bœsch <cboesch@gopro.com>
* commit '11a9320de54759340531177c9f2b1e31e6112cc2':
build: Move build-system-related helper files to a separate subdirectory
"ffbuild" directory name is used instead of "avbuild".
Merged-by: Clément Bœsch <u@pkh.me>
This complex (-1 2 6 2 -1) filter slightly less reduces interlace 'twitter' but better retain detail and subjective sharpness impression compared to the linear (1 2 1) filter.
Signed-off-by: Thomas Mundt <tmundt75@gmail.com>
Signed-off-by: James Almer <jamrial@gmail.com>
the tested sample contain negative value in the red channel
need to be clip to zero, and not set to MAX_RED
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Add an option to webm_dash_manifest demuxer to specify a value for
"bandwidth" field in the DASH manifest. The value is then used by
the muxer. Fixes an existing FIXME in the code.
Signed-off-by: Vignesh Venkatasubramanian <vigneshv@google.com>
Signed-off-by: James Zern <jzern@google.com>
As it gives excellent encoding gains at an insignificant speed increase
and passes fate without problems, it should now be safe to enable by
default.
Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
This merges commits 8e2ea69135 and
096a8effa3 by Anton Khirnov, with the
following change:
- extract_extradata_check() is added to know if the codec is supported
by the bsf before trying to initialize it. This behaviour is similar to
the old AVCodecParser.split checks.
The FATE reference changes are due to the filtered out NAL units that
the old AVCodecParser.split implementation left alone.
Decoding is unchanged as the functions that parse extradata simply
ignored said unnecessary NAL units.
Signed-off-by: James Almer <jamrial@gmail.com>
* commit 'effc1430b2fe5997d9d55bf28dc507c27125eb27':
Revert "checkasm: vp9dsp: Benchmark the dc-only version of idct_idct separately"
Merged-by: James Almer <jamrial@gmail.com>
* commit '286ab878bd39b56008035638227b3ecb8ec5bbb7':
fate.sh: Allow setting other make flags for running tests
Merged-by: James Almer <jamrial@gmail.com>
* commit '481ff3cf018811ba3235f1c236e970f32a6300b9':
fate: Add h264 and hevc extradata reload tests
Only the HEVC part is merged, see 00c8079816
Merged-by: Clément Bœsch <u@pkh.me>
* commit 'b90c8a3d08e3f9ad4de1253376d2d1d93abb8b8c':
fate: Add tests for mov display matrix
Adapted to use ffprobe -show_entries
Merged-by: James Almer <jamrial@gmail.com>
* commit '043b0b9fb1481053b712d06d2c5b772f1845b72b':
Replace leftover uses of -aframes|-dframes|-vframes with -frames:a|d|v
The merge also includes all our own occurences.
Merged-by: Clément Bœsch <u@pkh.me>
* commit 'c91d6a33f872574c95c8784277cf60ffcf6bff4f':
checkasm: aarch64: Add filler args to make sure all parameters are passed on the stack
Merged-by: James Almer <jamrial@gmail.com>
* commit 'f1b3e131385176c3c9d9783b25047856a0dcebf6':
checkasm: aarch64: Clobber the stack before calling functions
Merged-by: James Almer <jamrial@gmail.com>
* commit 'a05cc56124b4f1237f6355784de821e3290ddb44':
checkasm: arm/aarch64: Fix the amount of space reserved for stack parameters
Merged-by: James Almer <jamrial@gmail.com>
* commit '22c3ab18646924ce24dc6017a9e882ff69689e40':
checkasm: Add test for huffyuvdsp add_bytes
huffyuvdsp is renamed to llviddsp to be consistent with our codebase.
Note: af607b7e07 wasn't actually required for this test since this
commit is not actually testing huffyuvdsp.
Merged-by: Clément Bœsch <u@pkh.me>
* commit '12004a9a7f20e44f4da2ee6c372d5e1794c8d6c5':
audiodsp/x86: yasmify vector_clipf_sse
audiodsp: reorder arguments for vector_clipf
Merged the version from Libav after a discussion with James Almer on
IRC:
19:22 <ubitux> jamrial: opinion on 12004a9a7f20e44f4da2ee6c372d5e1794c8d6c5?
19:23 <ubitux> it was apparently yasmified differently
19:23 <ubitux> (it depends on the previous commit arg shuffle)
19:24 <ubitux> i don't see the magic movsxdifnidn in your port btw
19:24 <ubitux> it's a port from 1d36defe94
19:25 <jamrial> seems better thanks to said arg shuffle
19:25 <jamrial> the loop is the same, but init is simpler
19:25 <jamrial> probably worth merging
19:25 <ubitux> OK
19:25 <ubitux> thanks
19:26 <jamrial> curious they didn't make len ptrdiff_t after the previous bunch of commits, heh
19:26 <ubitux> yeah indeed
Both commits are merged at the same time to prevent a conflict with our
existing yasmified ff_vector_clipf_sse.
Merged-by: Clément Bœsch <u@pkh.me>
* commit '3aa9d37d03da3c9b482d19b3988659287815280e':
build: Fix directory dependencies of tests/pixfmts.mak target
This might not be necessary given our mkdirs in the configure, but it
probably doesn't hurt.
Merged-by: Clément Bœsch <u@pkh.me>
This field is of little value, and interferes with testing side data,
since sizes can be different on multiple architectures.
Signed-off-by: Vittorio Giovara <vittorio.giovara@gmail.com>
Allows to get a more realistic total bitrate (and estimated file size)
in avi_write_header. Previously a static default value of 200k was
assumed.
Adds an internal helper function for bitrate guessing.
Signed-off-by: Tobias Rapp <t.rapp@noa-archive.com>
Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
Preparation for potentially disabling merged side data by default in the
libs. Do this in particular because it affects fate tests.
The changed tests either reflect added packet side data, or the changed
packet size due to merged side data removal reducing the packet size.
so tsf option in aresample will have effect
previously tsf/internal_sample_format had no effect
fate is updated
s32p previously used fltp internally
dblp previously used fltp/dblp internally
Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: Muhammad Faiz <mfcc64@gmail.com>
except filter_length == 1
odd filter_length gives worse frequency response,
even when compared with shorter filter_length
also makes build_filter simpler
Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: Muhammad Faiz <mfcc64@gmail.com>
The Chen-Shapiro(CS) test was used to test normality for
Lagged Fibonacci PRNG.
Normality Hypothesis Test:
The null hypothesis formally tests if the population
the sample represents is normally-distributed. For
CS, when the normality hypothesis is True, the
distribution of QH will have a mean close to 1.
Information on CS can be found here:
http://www.stata-journal.com/sjpdf.html?articlenum=st0264http://www.originlab.com/doc/Origin-Help/NormalityTest-Algorithm
Signed-off-by: Thomas Turner <thomastdt@googlemail.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
The constants used in the decoder used floating point precision,
and this caused different values to be generated on different
architectures. Additionally on big endian machines, the fate test
would output bytes in native order, which is different from the one
hardcoded in the test.
So, eradicate floating point numbers and use fixed point (32.32)
arithmetics everywhere, replacing constants with precomputed integer
values, and force the pixel format output to be the same in the fate
test.
Signed-off-by: Vittorio Giovara <vittorio.giovara@gmail.com>
This makes sure the actual stream parameters are used, which is
important mainly for hardware decoding+filtering cases, which would
previously require various weird workarounds to handle the fact that a
fake software graph has to be constructed, but never used.
This should also improve behaviour in rare cases where
avformat_find_stream_info() does not provide accurate information.
This merges Libav commit a3a0230. It was previously skipped.
The code in flush_encoders() which sets up a "fake" format wasn't in
Libav. I'm not sure if it's a good idea, but it tends to give
behavior closer to the old one in certain corner cases.
The vp8-size-change gives different result, because now the size of
the first frame is used. libavformat reported the size of the largest
frame for some reason.
The exr tests now use the sample aspect ratio of the first frame. For
some reason libavformat determines 0/1 as aspect ratio, while the
decoder returns the correct one.
The ffm and mxf tests change the field_order values. I'm assuming
another libavformat/decoding mismatch.
Signed-off-by: wm4 <nfxjfg@googlemail.com>
This will be useful in the following commit, after which the muxer
timebase is not always available when encoding.
This merges Libav commit 3e265ca. It was previously skipped.
There are some changes with how/when the mux_timebase field is set,
because the Libav approach often causes a too imprecise time base
to be set. This is hard, because the muxer's write_header function
can readjust the timebase, at which point we might already have
encoded packets buffered. (It might be better to buffer them after
the encoder, instead of after all the timestamp handling logic
before muxing.)
The two FATE tests change because the output time base is raised
for subtitles. (Needed to avoid certain rounding issues in other
cases.)
Includes a minor merge fix by Mark Thompson, and
avconv: Move rescale to stream timebase before monotonisation
also by Mark Thompson <sw@jkqxz.net>.
Signed-off-by: wm4 <nfxjfg@googlemail.com>
Previously, all link-time dependencies were added for all libraries,
resulting in bogus link-time dependencies since not all dependencies
are shared across libraries. Also, in some cases like libavutil, not
all dependencies were taken into account, resulting in some cases of
underlinking.
To address all this mess a machinery is added for tracking which
dependency belongs to which library component and then leveraged
to determine correct dependencies for all individual libraries.
This should fix the fate failure due to a truncated last frame.
Alternatively the frame could be dropped.
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Fixes: 664/clusterfuzz-testcase-4917047475568640
The change to fate is due to a truncated last frames which is now detected as damaged.
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/targets/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
According to the spec[1], a value of 0 means the footer is present and a value
of 1 means it's absent, the exact opposite of header presence flag where 1
means present and 0 absent.
The reason for this is compatibility with APEv1 tags, where there's no header,
footer presence was mandatory for all files, and the flags field was a zeroed
reserved field.
[1] http://wiki.hydrogenaud.io/index.php?title=Ape_Tags_Flags
Reviewed-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: James Almer <jamrial@gmail.com>
Current code returned the number of channels as channel layout in that case,
and if nret is not set then unknown layouts are typically not supported.
Also use the common parsing code. Use a temporary workaround to parse an
unknown channel layout such as '13c', after a 1 year grace period only '13C'
will work.
Signed-off-by: Marton Balint <cus@passwd.hu>
* commit '71a0472114574993df7035f4de9aa007e03817b8':
checkasm: arm: report the first clobbered register in checkasm_checked_call
Also includes 446353ea18, 59aeed93e4, and 37961044c6 to avoid breaking
too much stuff.
Merged-by: Clément Bœsch <u@pkh.me>
* commit '38efff92f1ef81f3de20ff0460ec7b70c253d714':
FATE: add a test for H.264 with two fields per packet
h264: fix decoding multiple fields per packet with slice threads
This merge includes two commits because the FATE test was useful in
order to make proper testing.
The merge gets rid of the now unused:
- SLICE_SINGLETHREAD and SLICE_SKIPED macros
- max_contexts
- "again" label in decode_nal_units()
This commit also includes the fix from d3e4d406b.
Thanks to wm4 and Michael Niedermayer for their testing.
Merged-by: Clément Bœsch <u@pkh.me>
Merged-by: Matthieu Bouron <matthieu.bouron@gmail.com>
This treats the case of no slices like no frames which it basically is.
The field is added to the context as other nal related fields are also there
and passing the has_slices field per *arguments is ugly and not consistent
Found-by: ubitux
Approved-by: ubitux
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
This work is sponsored by, and copyright, Google.
Previously all subpartitions except the eob=1 (DC) case ran with
the same runtime:
Cortex A7 A8 A9 A53
vp9_inv_dct_dct_16x16_sub16_add_neon: 3188.1 2435.4 2499.0 1969.0
vp9_inv_dct_dct_32x32_sub32_add_neon: 18531.7 16582.3 14207.6 12000.3
By skipping individual 4x16 or 4x32 pixel slices in the first pass,
we reduce the runtime of these functions like this:
vp9_inv_dct_dct_16x16_sub1_add_neon: 274.6 189.5 211.7 235.8
vp9_inv_dct_dct_16x16_sub2_add_neon: 2064.0 1534.8 1719.4 1248.7
vp9_inv_dct_dct_16x16_sub4_add_neon: 2135.0 1477.2 1736.3 1249.5
vp9_inv_dct_dct_16x16_sub8_add_neon: 2446.7 1828.7 1993.6 1494.7
vp9_inv_dct_dct_16x16_sub12_add_neon: 2832.4 2118.3 2266.5 1735.1
vp9_inv_dct_dct_16x16_sub16_add_neon: 3211.7 2475.3 2523.5 1983.1
vp9_inv_dct_dct_32x32_sub1_add_neon: 756.2 456.7 862.0 553.9
vp9_inv_dct_dct_32x32_sub2_add_neon: 10682.2 8190.4 8539.2 6762.5
vp9_inv_dct_dct_32x32_sub4_add_neon: 10813.5 8014.9 8518.3 6762.8
vp9_inv_dct_dct_32x32_sub8_add_neon: 11859.6 9313.0 9347.4 7514.5
vp9_inv_dct_dct_32x32_sub12_add_neon: 12946.6 10752.4 10192.2 8280.2
vp9_inv_dct_dct_32x32_sub16_add_neon: 14074.6 11946.5 11001.4 9008.6
vp9_inv_dct_dct_32x32_sub20_add_neon: 15269.9 13662.7 11816.1 9762.6
vp9_inv_dct_dct_32x32_sub24_add_neon: 16327.9 14940.1 12626.7 10516.0
vp9_inv_dct_dct_32x32_sub28_add_neon: 17462.7 15776.1 13446.2 11264.7
vp9_inv_dct_dct_32x32_sub32_add_neon: 18575.5 17157.0 14249.3 12015.1
I.e. in general a very minor overhead for the full subpartition case due
to the additional loads and cmps, but a significant speedup for the cases
when we only need to process a small part of the actual input data.
In common VP9 content in a few inspected clips, 70-90% of the non-dc-only
16x16 and 32x32 IDCTs only have nonzero coefficients in the upper left
8x8 or 16x16 subpartitions respectively.
This is cherrypicked from libav commit
9c8bc74c2b.
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
When building DLLs with MSVC, CONFIG_STATIC is disabled (see
d66c52c2b3 for a more verbose explanation) since the built
object files can't be linked statically (which checkasm does).
This worked up until recently, only by luck.
Signed-off-by: Martin Storsjö <martin@martin.st>
Use a tab instead of two spaces, skip the fate prefix for the test name.
This makes IGNORE line fit in even better with the other make printouts.
Signed-off-by: Martin Storsjö <martin@martin.st>
Otherwise the .rep file would still contain a signal instead of a
zero, even if the process returned success.
Signed-off-by: Martin Storsjö <martin@martin.st>
This can be useful to filter out noise in known-broken scenarios like
miscompilation by legacy compilers and similar.
Originally based on a patch by Diego Biurrun.
Signed-off-by: Diego Biurrun <diego@biurrun.de>
Additional/Modified FATE tests improve code coverage from 63.7% to 98.1%.
Changed fate-suite sample files:
* filter/hdcd-mix.flac (958K) added. It is a much better test than
filter/hdcd.flac (910K), which is now unused, but can't be removed.
* filter/hdcd-fake20bit.flac (168K) added. It is the first second of
filter/hdcd.flac, with the 16-bit LSB copied into bit 20 of a 24-bit
stream. There isn't an actual non-16-bit HDCD sample available to test.
Signed-off-by: Burt P <pburt0@gmail.com>
Make the one-time initialization in av_get_cpu_flags() thread-safe. The
static variables |flags|, |cpuflags_mask|, and |checked| in
libavutil/cpu.c are read and written using normal load and store
operations. These are considered as data races. The fix is to use atomic
load and store operations.
Remove the |checked| variable because the invalid value of -1 for
|flags| can be used to indicate the same condition. Rename |flags| to
|cpu_flags| and move it to file scope.
The fix can be verified by running the libavutil/tests/cpu_init.c test
program under ThreadSanitizer:
./configure --toolchain=clang-tsan
make libavutil/tests/cpu_init
libavutil/tests/cpu_init
There should be no warnings from ThreadSanitizer.
Co-author: Dmitry Vyukov of Google, who suggested the data race fix.
Signed-off-by: Wan-Teh Chang <wtc@google.com>
Supporting the system was a nice joke for the 9 release, but it has
run its course. Nowadays Plan 9 receives no testing and has no
practical usefulness.
This would be simpler if codecpar supported AVOptions
modern ffserver should be unaffected by this, older ffserver which required the
muxer to directly access the encoder could have issues with this, but this
direct access is just wrong and unsafe
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>