In dd770883e9, support for expressions was added. Among the constants
added were labels of qnstc, qpal, sntsc & spal.
These were added in ba2a8cb40b to represent parameter permutations where
only the resolution is different. They don't have any usage currency and
don't represent any industry standards or convention in terms of framerate.
These bits are reserved in earlier versions of the H.264 spec, and
some poor hardware decoders require they are zero. Thus, it is useful
to be able to zero these on streams that may have them set. The result
is still a valid H.264 bitstream.
Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
Support single input for guided filter by adding guidance mode.
If the guidance mode is off, single input is required. And
edge-preserving smoothing is conducted. If the mode is on, two
inputs are needed. The second input serves as the guidance. For
this mode, more tasks are supported, such as detail enhancement,
dehazing and so on.
Signed-off-by: Xuewei Meng <xwmeng96@gmail.com>
Reviewed-by: Steven Liu <lq@chinaffmpeg.org>
HDR10+ metadata is stored in the bit stream for HEVC. The story is
different for VP9 and cannot store the metadata in the bit stream.
HDR10+ should be passed to packet side data an stored in the container
(mkv) for VP9.
This CL is taking HDR10+ from AVFrame side data in libvpxenc and is
passing it to the AVPacket side data.
Reviewed-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Signed-off-by: James Zern <jzern@google.com>
Broken in 753930bc73, as the path to
Doxyfile passed to doxy-wrapper.sh is relative to the build dir, while
the recipe cd's to the source dir before invoking the wrapper.
It can be useful to library users, and is currently being used by ffmpeg.c
Suggested-by: Hendrik Leppkes <h.leppkes@gmail.com>
Signed-off-by: James Almer <jamrial@gmail.com>
The user should not rely on all options always being recognized
(in particular not on error).
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
This feature can be used with dnn detection by setting vf_drawtext's option
text_source=side_data_detection_bboxes, for example:
./ffmpeg -i face.jpeg -vf dnn_detect=dnn_backend=openvino:model=face-detection-adas-0001.xml:\
input=data:output=detection_out:labels=face-detection-adas-0001.label,drawbox=box_source=
side_data_detection_bboxes,drawtext=text_source=side_data_detection_bboxes:fontcolor=green:\
fontsize=40, -y face_detect.jpeg
Please note, the default fontsize of vf_drawtext is 12, which may be too
small to be seen clearly.
Signed-off-by: Ting Fu <ting.fu@intel.com>
This feature can be used with dnn detection by setting vf_drawbox's
option box_source=side_data_detection_bboxes, for example:
./ffmpeg -i face.jpeg -vf dnn_detect=dnn_backend=openvino:model=face-detection-adas-0001.xml:\
input=data:output=detection_out:labels=face-detection-adas-0001.label,\
drawbox=box_source=side_data_detection_bboxes -y face_detect.jpeg
Signed-off-by: Ting Fu <ting.fu@intel.com>
With some minor changes by Marton Balint:
- removed trailing whitespace
- fixed network_descriptors_length
- fixed reserved_future_use flag in the start of the section
- removed unused program variable
- emit first NIT after PAT
- some other cosmetics
Signed-off-by: Ubaldo Porcheddu <ubaldo@eja.it>
Signed-off-by: Marton Balint <cus@passwd.hu>
Two modes are supported in guided filter, basic mode and fast mode.
Basic mode is the initial pushed guided filter without optimization.
Fast mode is implemented based on the basic one by sub-sampling method.
The sub-sampling ratio which can be defined by users controls the
algorithm complexity. The larger the sub-sampling ratio, the lower
the algorithm complexity.
Signed-off-by: Xuewei Meng <xwmeng96@gmail.com>
Reviewed-by: Steven Liu <liuqi05@kuaishou.com>
commit 95b854dd06 "rename sum option to normalize" missed command
part docs
Signed-off-by: Jun Zhao <barryjzhao@tencent.com>
Reviewed-by: Gyan Doshi <ffmpeg@gyani.pro>
Add examples on how to use this filter, and improve the code style.
Implement the slice-level parallelism for guided filter.
Add the basic version of guided filter.
Signed-off-by: Xuewei Meng <xwmeng96@gmail.com>
Reviewed-by: Steven Liu <liuqi05@kuaishou.com>
classification is done on every detection bounding box in frame's side data,
which are the results of object detection (filter dnn_detect).
Please refer to commit log of dnn_detect for the material for detection,
and see below for classification.
- download material for classifcation:
wget https://github.com/guoyejun/ffmpeg_dnn/raw/main/models/openvino/2021.1/emotions-recognition-retail-0003.bin
wget https://github.com/guoyejun/ffmpeg_dnn/raw/main/models/openvino/2021.1/emotions-recognition-retail-0003.xml
wget https://github.com/guoyejun/ffmpeg_dnn/raw/main/models/openvino/2021.1/emotions-recognition-retail-0003.label
- run command as:
./ffmpeg -i cici.jpg -vf dnn_detect=dnn_backend=openvino:model=face-detection-adas-0001.xml:input=data:output=detection_out:confidence=0.6:labels=face-detection-adas-0001.label,dnn_classify=dnn_backend=openvino:model=emotions-recognition-retail-0003.xml:input=data:output=prob_emotion:confidence=0.3:labels=emotions-recognition-retail-0003.label:target=face,showinfo -f null -
We'll see the detect&classify result as below:
[Parsed_showinfo_2 @ 0x55b7d25e77c0] side data - detection bounding boxes:
[Parsed_showinfo_2 @ 0x55b7d25e77c0] source: face-detection-adas-0001.xml, emotions-recognition-retail-0003.xml
[Parsed_showinfo_2 @ 0x55b7d25e77c0] index: 0, region: (1005, 813) -> (1086, 905), label: face, confidence: 10000/10000.
[Parsed_showinfo_2 @ 0x55b7d25e77c0] classify: label: happy, confidence: 6757/10000.
[Parsed_showinfo_2 @ 0x55b7d25e77c0] index: 1, region: (888, 839) -> (967, 926), label: face, confidence: 6917/10000.
[Parsed_showinfo_2 @ 0x55b7d25e77c0] classify: label: anger, confidence: 4320/10000.
Signed-off-by: Guo, Yejun <yejun.guo@intel.com>
This also allows to exclusively use pointers to const AVCodec in
fftools/ffmpeg_opt.c.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
Signed-off-by: James Almer <jamrial@gmail.com>
Deprecated in c29038f304.
The resample filter based upon this library has been removed as well.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Signed-off-by: James Almer <jamrial@gmail.com>
Deprecated in d682ae70b4,
ineffective since ca4df37f06.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
Signed-off-by: James Almer <jamrial@gmail.com>
Several options that were too codec-specific were deprecated between
0e6c853221 and
0e9c4fe254.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
Signed-off-by: James Almer <jamrial@gmail.com>
The build log:
** Unknown command `@code' (left as is) (in src/doc/muxers.texi l. 2020)
*** '{' without macro. Before: -map} option with the ffmpeg CLI tool. (in src/doc/muxers.texi l. 2020)
*** '}' without opening '{' before: option with the ffmpeg CLI tool. (in src/doc/muxers.texi l. 2020)
Below are the example steps to do object detection:
1. download and install l_openvino_toolkit_p_2021.1.110.tgz from
https://software.intel.com/content/www/us/en/develop/tools/openvino-toolkit/download.html
or, we can get source code (tag 2021.1), build and install.
2. export LD_LIBRARY_PATH with openvino settings, for example:
.../deployment_tools/inference_engine/lib/intel64/:.../deployment_tools/inference_engine/external/tbb/lib/
3. rebuild ffmpeg from source code with configure option:
--enable-libopenvino
--extra-cflags='-I.../deployment_tools/inference_engine/include/'
--extra-ldflags='-L.../deployment_tools/inference_engine/lib/intel64'
4. download model files and test image
wget https://github.com/guoyejun/ffmpeg_dnn/raw/main/models/openvino/2021.1/face-detection-adas-0001.bin
wget https://github.com/guoyejun/ffmpeg_dnn/raw/main/models/openvino/2021.1/face-detection-adas-0001.xml
wget
https://github.com/guoyejun/ffmpeg_dnn/raw/main/models/openvino/2021.1/face-detection-adas-0001.label
wget https://github.com/guoyejun/ffmpeg_dnn/raw/main/images/cici.jpg
5. run ffmpeg with:
./ffmpeg -i cici.jpg -vf dnn_detect=dnn_backend=openvino:model=face-detection-adas-0001.xml:input=data:output=detection_out:confidence=0.6:labels=face-detection-adas-0001.label,showinfo -f null -
We'll see the detect result as below:
[Parsed_showinfo_1 @ 0x560c21ecbe40] side data - detection bounding boxes:
[Parsed_showinfo_1 @ 0x560c21ecbe40] source: face-detection-adas-0001.xml
[Parsed_showinfo_1 @ 0x560c21ecbe40] index: 0, region: (1005, 813) -> (1086, 905), label: face, confidence: 10000/10000.
[Parsed_showinfo_1 @ 0x560c21ecbe40] index: 1, region: (888, 839) -> (967, 926), label: face, confidence: 6917/10000.
There are two faces detected with confidence 100% and 69.17%.
Signed-off-by: Guo, Yejun <yejun.guo@intel.com>
Remove the unneeded wrapping sequence element. Also the
minOccurs/maxOccurs occurrence indicators on the inner element
definitions can be removed.
Signed-off-by: Tobias Rapp <t.rapp@noa-archive.com>
This avoids crafted files from consuming excessive resources recomputing the clut after each pixel change
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
And forward it to the underlying UDP protocol.
Fixes ticket #7517.
Signed-off-by: Jiangjie Gao <gaojiangjie@live.com>
Signed-off-by: Marton Balint <cus@passwd.hu>
Add the "http_proxy" option and its handling to the "tls" protocol,
pass the option from the "https" protocol.
The "https" protocol already defines the "http_proxy" command line
option, like the "http" protocol does. The "http" protocol properly
honors that command line option in addition to the environment
variable. The "https" protocol doesn't, because the proxy is
evaluated in the underlying "tls" protocol, which doesn't have this
option, and thus only handles the environment variable, which it
has access to.
Fixes#7223.
Signed-off-by: Moritz Barsnick <barsnick@gmx.net>
Signed-off-by: Marton Balint <cus@passwd.hu>
av_adler32_update() is used by av_hash_update() which will be switched
to size_t at the next bump. So it also has to be made to use size_t.
This is also necessary for framecrcenc.c, because the size of side data
will become a size_t, too.
Reviewed-by: James Almer <jamrial@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
64 bits are needed in order to retain the uid values of Matroska
chapters; the type is kept signed because the semantics of NUT chapters
depend upon whether the id is > 0 or < 0.
Reviewed-by: Anton Khirnov <anton@khirnov.net>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
This cap is currently used to mark multithreading-capable codecs that
wrap external libraries with their own multithreading code. The name is
highly confusing for our API users, since libavcodec ALWAYS handles
thread_count=0 (see commit message in previous commit). Therefore rename
the cap and update its documentation to make its meaning clear.
The old name is kept deprecated until next+1 major bump.
In the meanwhile libx264 allows to be configured for including both 8/10 bit
support within a single library. The new libx264 interface was enabled in
2f96190732.
Signed-off-by: Tobias Rapp <t.rapp@noa-archive.com>
The option allows to select a specific window instead of the whole
screen.
Reviewed-by: Andriy Gelman <andriy.gelman@gmail.com>
Signed-off-by: Andriy Gelman <andriy.gelman@gmail.com>
Also remove AV_LOG_SIMULATE from the list as it is not used directly, and do
not use panic level on unknown loglevel, but make them warn. Also fix mapping of
NOTICE/INFO/VERBOSE and add documentation about when the option should actually
be used.
Signed-off-by: Marton Balint <cus@passwd.hu>
Maximum packet size is 10000 (RIST_MAX_PACKET_SIZE, which is unfortunately
private) minus the RIST protocol overhead which is 28 bytes for the unencrypted
case, 36 for the encrypted case.
Signed-off-by: Marton Balint <cus@passwd.hu>
This callback is functionally the same as get_buffer2() is for decoders, and
implements for the new encode API the functionality of the old encode API had
where the user could provide their own buffers.
Reviewed-by: Lynne <dev@lynne.ee>
Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
Reviewed-by: Mark Thompson <sw@jkqxz.net>
Signed-off-by: James Almer <jamrial@gmail.com>
This commit adds a "gophers" handler to the gopher protocol. gophers
is a community-adopted protocol that acts the same way like normal
gopher with the added TLS encapsulation.
The gophers protocol is supported by gopher servers like geomydae(8),
and clients like curl(1), clic(1), and hurl(1).
This commit also adds compilation guards to both gopher and gophers,
since now there are two protocols in the file it makes sense to
have this addition.
Signed-off-by: parazyd <parazyd@dyne.org>
Signed-off-by: Marton Balint <cus@passwd.hu>
av_stream_add_side_data() already defines size as a size_t, so this makes it
consistent across all side data functions.
Signed-off-by: James Almer <jamrial@gmail.com>
av_packet_add_side_data() already defines size as a size_t, so this makes it
consistent across all side data functions
Signed-off-by: James Almer <jamrial@gmail.com>
Enables writing TTML documents or encoded TTML paragraphs as such
documents.
Additionally, a test for the combined TTML encoder and muxer has
been added to validate that the components still work.
Signed-off-by: Jan Ekström <jan.ekstrom@24i.com>
Enables the usage of such values as AV_EF_EXPLODE in encoders, which
can be useful in cases such as subtitle encoders where they have the
responsibility to validate the correctness of an incoming ASS dialog line.
Signed-off-by: Jan Ekström <jan.ekstrom@24i.com>
This flag was added in 492026209b
in conjunction with av_demuxer_open() to allow to pass private
options to demuxers. It worked as follows: av_open_input_stream()
(the predecessor of avformat_open_input()) would not call the
read_header function if this flag is set. Instead the user could set
private options of the demuxer via the format's private class after
avformat_open_input() and then call av_demuxer_open() which called
the format's read_header function.
This approach was abandoned in e37f161e66
and av_demuxer_open() deprecated; instead the AVDictionary based way of
passing private options to the demuxer was choosen. Yet
AVFMT_FLAG_PRIV_OPT has never been deprecated and av_demuxer_open()
never removed. This commit implements the deprecation of the flag and
schedules av_demuxer_open for removal on the next major bump.
Given that av_demuxer_open() has been deprecated in 2012 and that this
flag is useless without it, the flag will be ignored after the next
major version bump.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
AVFrame hasn't been a struct defined in libavcodec for a decade now, when
it was moved to libavutil.
Found-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
Signed-off-by: James Almer <jamrial@gmail.com>
The FF_API macros are private and must not be used by external callers.
As the fields in question are to be removed without replacement, just
drop them.
The fields are:
AVPacket.convergence_duration
AVCodecContext.time_base
AVCodecContext.timecode_frame_start
AV_PIX_FMT_FLAG_PSEUDOPAL pixel descriptor flag
This commit adds support for in-place FFT transforms. Since our
internal transforms were all in-place anyway, this only changes
the permutation on the input.
Unfortunately, research papers were of no help here. All focused
on dry hardware implementations, where permutes are free, or on
software implementations where binary bloat is of no concern so
storing dozen times the transforms for each permutation and version
is not considered bad practice.
Still, for a pure C implementation, it's only around 28% slower
than the multi-megabyte FFTW3 in unaligned mode.
Unlike a closed permutation like with PFA, split-radix FFT bit-reversals
contain multiple NOPs, multiple simple swaps, and a few chained swaps,
so regular single-loop single-state permute loops were not possible.
Instead, we filter out parts of the input indices which are redundant.
This allows for a single branch, and with some clever AVX512 asm,
could possibly be SIMD'd without refactoring.
The inplace_idx array is guaranteed to never be larger than the
revtab array, and in practice only requires around log2(len) entries.
The power-of-two MDCTs can be done in-place as well. And it's
possible to eliminate a copy in the compound MDCTs too, however
it'll be slower than doing them out of place, and we'd need to dirty
the input array.
This is Visual Information Fidelity (VIF) filter and one of the component
filters of VMAF. It outputs the average VIF score over all frames.
Signed-off-by: Ashish Singh <ashk43712@gmail.com>
It has been added in 6db42a2b6b,
yet since then none of the necessary create/free_device_capabilities
functions has been implemented, making this API completely useless.
Because of this one can already simplify
avdevice_capabilities_free/create and can already remove the function
pointers at the next major bump; given that the documentation explicitly
states that av_device_capabilities is not to be used by a user, it's
options can already be removed (save for the sentinel).
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
A new key & value API lets us gain access to newly added parameters
without adding explicit support for them in our wrapper. Add an
option utilizing this functionality in a similar manner to other
encoder libraries' wrappers.
Signed-off-by: Bohan Li <bohanli@google.com>
In order to fine-control referencing schemes in VP9 encoding, there
is a need to use VP9E_SET_SVC_REF_FRAME_CONFIG method. This commit
provides a way to use the API through frame metadata.
The AC3 encoder used to be a separate library called "Aften", which
got merged into libavcodec (literally, SVN commits and all).
The merge preserved as much features from the library as possible.
The code had two versions - a fixed point version and a floating
point version. FFmpeg had floating point DSP code used by other
codecs, the AC3 decoder including, so the floating-point DSP was
simply replaced with FFmpeg's own functions.
However, FFmpeg had no fixed-point audio code at that point. So
the encoder brought along its own fixed-point DSP functions,
including a fixed-point MDCT.
The fixed-point MDCT itself is trivially just a float MDCT with a
different type and each multiply being a fixed-point multiply.
So over time, it got refactored, and the FFT used for all other codecs
was templated.
Due to design decisions at the time, the fixed-point version of the
encoder operates at 16-bits of precision. Although convenient, this,
even at the time, was inadequate and inefficient. The encoder is noisy,
does not produce output comparable to the float encoder, and even
rings at higher frequencies due to the badly approximated winow function.
Enter MIPS (owned by Imagination Technologies at the time). They wanted
quick fixed-point decoding on their FPUless cores. So they contributed
patches to template the AC3 decoder so it had both a fixed-point
and a floating-point version. They also did the same for the AAC decoder.
They however, used 32-bit samples. Not 16-bits. And we did not have
32-bit fixed-point DSP functions, including an MDCT. But instead of
templating our MDCT to output 3 versions (float, 32-bit fixed and 16-bit fixed),
they simply copy-pasted their own MDCT into ours, and completely
ifdeffed our own MDCT code out if a 32-bit fixed point MDCT was selected.
This is also the status quo nowadays - 2 separate MDCTs, one which
produces floating point and 16-bit fixed point versions, and one
sort-of integrated which produces 32-bit MDCT.
MIPS weren't all that interested in encoding, so they left the encoder
as-is, and they didn't care much about the ifdeffery, mess or quality - it's
not their problem.
So the MDCT/FFT code has always been a thorn in anyone looking to clean up
code's eye.
Backstory over. Internally AC3 operates on 25-bit fixed-point coefficients.
So for the floating point version, the encoder simply runs the float MDCT,
and converts the resulting coefficients to 25-bit fixed-point, as AC3 is inherently
a fixed-point codec. For the fixed-point version, the input is 16-bit samples,
so to maximize precision the frame samples are analyzed and the highest set
bit is detected via ac3_max_msb_abs_int16(), and the coefficients are then
scaled up via ac3_lshift_int16(), so the input for the FFT is always at least 14 bits,
computed in normalize_samples(). After FFT, the coefficients are scaled up to 25 bits.
This patch simply changes the encoder to accept 32-bit samples, reusing
the already well-optimized 32-bit MDCT code, allowing us to clean up and drop
a large part of a very messy code of ours, as well as prepare for the future lavu/tx
conversion. The coefficients are simply scaled down to 25 bits during windowing,
skipping 2 separate scalings, as the hacks to extend precision are simply no longer
necessary. There's no point in running the MDCT always at 32 bits when you're
going to drop 6 bits off anyway, the headroom is plenty, and the MDCT rounds
properly.
This also makes the encoder even slightly more accurate over the float version,
as there's no coefficient conversion step necessary.
SIZE SAVINGS:
ARM32:
HARDCODED TABLES:
BASE - 10709590
DROP DSP - 10702872 - diff: -6.56KiB
DROP MDCT - 10667932 - diff: -34.12KiB - both: -40.68KiB
DROP FFT - 10336652 - diff: -323.52KiB - all: -364.20KiB
SOFTCODED TABLES:
BASE - 9685096
DROP DSP - 9678378 - diff: -6.56KiB
DROP MDCT - 9643466 - diff: -34.09KiB - both: -40.65KiB
DROP FFT - 9573918 - diff: -67.92KiB - all: -108.57KiB
ARM64:
HARDCODED TABLES:
BASE - 14641112
DROP DSP - 14633806 - diff: -7.13KiB
DROP MDCT - 14604812 - diff: -28.31KiB - both: -35.45KiB
DROP FFT - 14286826 - diff: -310.53KiB - all: -345.98KiB
SOFTCODED TABLES:
BASE - 13636238
DROP DSP - 13628932 - diff: -7.13KiB
DROP MDCT - 13599866 - diff: -28.38KiB - both: -35.52KiB
DROP FFT - 13542080 - diff: -56.43KiB - all: -91.95KiB
x86:
HARDCODED TABLES:
BASE - 12367336
DROP DSP - 12354698 - diff: -12.34KiB
DROP MDCT - 12331024 - diff: -23.12KiB - both: -35.46KiB
DROP FFT - 12029788 - diff: -294.18KiB - all: -329.64KiB
SOFTCODED TABLES:
BASE - 11358094
DROP DSP - 11345456 - diff: -12.34KiB
DROP MDCT - 11321742 - diff: -23.16KiB - both: -35.50KiB
DROP FFT - 11276946 - diff: -43.75KiB - all: -79.25KiB
PERFORMANCE (10min random s32le):
ARM32 - before - 39.9x - 0m15.046s
ARM32 - after - 28.2x - 0m21.525s
Speed: -30%
ARM64 - before - 36.1x - 0m16.637s
ARM64 - after - 36.0x - 0m16.727s
Speed: -0.5%
x86 - before - 184x - 0m3.277s
x86 - after - 190x - 0m3.187s
Speed: +3%
Add 2 new options:
- reconnect_on_http_error - a list of http status codes that should be
retried. the list can contain explicit status codes / the strings
4xx/5xx.
- reconnect_on_network_error - reconnects on arbitrary errors during
connect, e.g. ECONNRESET/ETIMEDOUT
the retry employs the same exponential backoff logic as the existing
reconnect/reconnect_at_eof flags.
related tickets:
https://trac.ffmpeg.org/ticket/6066https://trac.ffmpeg.org/ticket/7768
Signed-off-by: Marton Balint <cus@passwd.hu>
Do it only when requested with the AV_CODEC_EXPORT_DATA_VIDEO_ENC_PARAMS
flag.
Drop previous code using the long-deprecated AV_FRAME_DATA_QP_TABLE*
API. Temporarily disable fate-filter-pp, fate-filter-pp7,
fate-filter-spp. They will be reenabled once these filters are converted
in following commits.
because hls_enc_key and hls_enc_iv get 16byte char
for example:
-hls_enc_key 0123456789abcdef -hls_enc_iv abcdefghijklmnop
Reviewed-by: Gyan Doshi <ffmpeg@gyani.pro>
Signed-off-by: Steven Liu <liuqi05@kuaishou.com>
At present, progress stats are updated at a hardcoded interval of
half a second. For long processes, this can lead to bloated
logs and progress reports.
Users can now set a custom period using option -stats_period
Default is kept at 0.5 seconds.