The loop iterates over the length of the vector, not the order. This is
to avoid reloading the same data for each lag value. However this means
the loop only works if the maximum order is no larger than VLENB.
The loop is roughly equivalent to:
for (size_t j = 0; j < lag; j++)
autoc[j] = 1.;
while (len > lag) {
for (ptrdiff_t j = 0; j < lag; j++)
autoc[j] += data[j] * *data;
data++;
len--;
}
while (len > 0) {
for (ptrdiff_t j = 0; j < len; j++)
autoc[j] += data[j] * *data;
data++;
len--;
}
Since register pressure is only at 50%, it should be possible to implement
the same loop for order up to 2xVLENB. But this is left for future work.
Performance numbers are all over the place from ~1.25x to ~4x speedups,
but at least they are always noticeably better than nothing.
This test verifies the parser's handling of multiframe JXL files that
have an entropy-encoded permuted table of contents for each frame. The
testcase is actually six JXL codestreams concatenated together, and the
parser needs to be able to find the boundaries.
Signed-off-by: Leo Izen <leo.izen@gmail.com>
<OS>_VERSION_MAX_ALLOWED indicates what version is available in
the SDK, while <OS>_VERSION_MIN_REQUIRED is the version we can
assume is available, i.e. similar to what is set with e.g.
-miphoneos-version-min on the command line.
This fixes build errors like these:
src/libavdevice/avfoundation.m:788:37: error: 'AVCaptureDeviceTypeContinuityCamera' is only available on macOS 14.0 or newer [-Werror,-Wunguarded-availability-new]
[deviceTypes addObject: AVCaptureDeviceTypeContinuityCamera];
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk/System/Library/Frameworks/AVFoundation.framework/Headers/AVCaptureDevice.h:551:38: note: 'AVCaptureDeviceTypeContinuityCamera' has been marked as being introduced in macOS 14.0 here, but the deployment target is macOS 13.0.0
AVF_EXPORT AVCaptureDeviceType const AVCaptureDeviceTypeContinuityCamera API_AVAILABLE(macos(14.0), ios(17.0), macCatalyst(17.0), tvos(17.0)) API_UNAVAILABLE(visionos) API_UNAVAILABLE(watchos);
^
Alternatively, we could use these more modern APIs, if enclosed
in suitable @available() checks.
Fixes: out of array access
Fixes: 62603/clusterfuzz-testcase-minimized-ffmpeg_DEMUXER_fuzzer-5837632490569728
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Intended to replace https://patchwork.ffmpeg.org/project/ffmpeg/patch/20230802000135.26482-3-michael@niedermayer.cc/
with a more accurate block decoding magnitude bound.
Fixes: 62433/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_JPEG2000_fuzzer-5828618092937216
Fixes: 58299/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_JPEG2000_fuzzer-5828618092937216
Previous-version-reviewed-by: Tomas Härdin <git@haerdin.se>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
- Fixes YA formats, because previous code always assumed alpha as the 4th
component.
- Fixes PAL format (as long as 0 is black, as in a systematic palette), because
previous code assumed it as limited Y.
- Fixes XYZ format because it does not need nonzero chroma components
- Fixes xv30be as the bitstream mode got merged to the non-bitstream mode.
Signed-off-by: Marton Balint <cus@passwd.hu>
Should fix valgrind warnings about Conditional jump or move depends on uninitialised value.
Reviewed-by: Anton Khirnov <anton@khirnov.net>
Signed-off-by: James Almer <jamrial@gmail.com>
Change the main loop and every component (demuxers, decoders, filters,
encoders, muxers) to use the previously added transcode scheduler. Every
instance of every such component was already running in a separate
thread, but now they can actually run in parallel.
Changes the results of ffmpeg-fix_sub_duration_heartbeat - tested by
JEEB to be more correct and deterministic.
See the comment block at the top of fftools/ffmpeg_sched.h for more
details on what this scheduler is for.
This commit adds the scheduling code itself, along with minimal
integration with the rest of the program:
* allocating and freeing the scheduler
* passing it throughout the call stack in order to register the
individual components (demuxers/decoders/filtergraphs/encoders/muxers)
with the scheduler
The scheduler is not actually used as of this commit, so it should not
result in any change in behavior. That will change in future commits.
As for the analogous decoding change, this is only a preparatory step to
a fully threaded architecture and does not yet make encoding truly
parallel. The main thread will currently submit a frame and wait until
it has been fully processed by the encoder before moving on. That will
change in future commits after filters are moved to threads and a
thread-aware scheduler is added.
This code suffers from a known issue - if an encoder with a sync queue
receives EOF it will terminate after processing everything it currently
has, even though the sync queue might still be triggered by other
threads. That will be fixed in following commits.
* the code is made shorter and simpler
* avoids constantly allocating and freeing AVPackets, thanks to
ThreadQueue integration with ObjPool
* is consistent with decoding/filtering/muxing
* reduces the diff in the future switch to thread-aware scheduling
This makes ifile_get_packet() always block. Any potential issues caused
by this will be resolved by the switch to thread-aware scheduling in
future commits.
Otherwise they'd be silently ignored if received by the filtering thread
before the filtergraph can be initialized, which would make the output
dependent on the order in which frames from different inputs arrive.
As previously for decoding, this is merely "scaffolding" for moving to a
fully threaded architecture and does not yet make filtering truly
parallel - the main thread will currently wait for the filtering thread
to finish its work before continuing. That will change in future commits
after encoders are also moved to threads and a thread-aware scheduler is
added.
Avoid making decisions based on current graph input state, which makes
the output dependent on the order in which the frames from different
inputs are interleaved.
Makes the output of fate-filter-overlay-dvdsub-2397 more correct - the
subtitle appears two frames later, which is closer to its PTS as stored
in the file.