MSA2 optimizations are attached to MSA macros in generic_macros_msa.h.
It's difficult to do runtime check for them. Remove this part of code
can make it more robust. H264 1080p decoding: 5.13x==>5.12x.
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Using mask to avoid judgment, H264 4K decoding speed
improved about 0.1fps tested on 3A4000
Signed-off-by: Shiyou Yin <yinshiyou-hf@loongson.cn>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
1. Refined function get_cabac_inline_mips.
2. Optimize function get_cabac_bypass and get_cabac_bypass_sign.
Speed of decoding h264: 4.89x ==> 5.05x(tested on 3A4000).
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
The MSA optimization has been refined in commit 93218c2 and ce0a52e.
It is better than MMI version now.
Speed of decoding H264: 4.83x ==> 4.89x (tested on 3A4000).
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Initializing zlib in the way we do here is threadsafe, see
https://www.zlib.net/zlib_faq.html#faq21
Reviewed-by: Tomas Härdin <tjoppen@acc.umu.se>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Initializing zlib in the way we do here is threadsafe, see
https://www.zlib.net/zlib_faq.html#faq21
Reviewed-by: Tomas Härdin <tjoppen@acc.umu.se>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
It is not documented to be safe to call inflateEnd() on a z_stream
that has not been successfully initialized via inflateInit(); so
record whether it has been successfully initialized.
Reviewed-by: Tomas Härdin <tjoppen@acc.umu.se>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
This will give us more room to improve the implementation later.
Suggested-by: Anton Khirnov <anton@khirnov.net>
Signed-off-by: James Almer <jamrial@gmail.com>
Here the packet size is known before allocating the packet because
the encoder provides said information (and works with internal buffers
itself), so one use this information to avoid the implicit use of another
intermediate buffer for the packet data; and by switching to
ff_get_encode_buffer() one can also allow user-supplied buffers.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Signed-off-by: Rick Kern <kernrj@gmail.com>
Said RL VLC is only used by the decoder, ergo don't initialize it for
the encoder and move the whole code and the RL VLC table itself to
dvdec.c.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
It can and therefore we switch from a heap allocated VLC table to
a VLC initialized via the mechanism for static VLCs, but without
an actual static VLC.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
It is init-threadsafe since b9c1ab8907
and except on MIPS even before that due to its use of ff_thread_once()
for static initialization.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
From the comment it's not available on old version. It works now
by testing on macOS 11.2.1. There is no document about since when.
So trying to set the configuration and ignore the error for hevc.
Signed-off-by: Rick Kern <kernrj@gmail.com>
classification is done on every detection bounding box in frame's side data,
which are the results of object detection (filter dnn_detect).
Please refer to commit log of dnn_detect for the material for detection,
and see below for classification.
- download material for classifcation:
wget https://github.com/guoyejun/ffmpeg_dnn/raw/main/models/openvino/2021.1/emotions-recognition-retail-0003.bin
wget https://github.com/guoyejun/ffmpeg_dnn/raw/main/models/openvino/2021.1/emotions-recognition-retail-0003.xml
wget https://github.com/guoyejun/ffmpeg_dnn/raw/main/models/openvino/2021.1/emotions-recognition-retail-0003.label
- run command as:
./ffmpeg -i cici.jpg -vf dnn_detect=dnn_backend=openvino:model=face-detection-adas-0001.xml:input=data:output=detection_out:confidence=0.6:labels=face-detection-adas-0001.label,dnn_classify=dnn_backend=openvino:model=emotions-recognition-retail-0003.xml:input=data:output=prob_emotion:confidence=0.3:labels=emotions-recognition-retail-0003.label:target=face,showinfo -f null -
We'll see the detect&classify result as below:
[Parsed_showinfo_2 @ 0x55b7d25e77c0] side data - detection bounding boxes:
[Parsed_showinfo_2 @ 0x55b7d25e77c0] source: face-detection-adas-0001.xml, emotions-recognition-retail-0003.xml
[Parsed_showinfo_2 @ 0x55b7d25e77c0] index: 0, region: (1005, 813) -> (1086, 905), label: face, confidence: 10000/10000.
[Parsed_showinfo_2 @ 0x55b7d25e77c0] classify: label: happy, confidence: 6757/10000.
[Parsed_showinfo_2 @ 0x55b7d25e77c0] index: 1, region: (888, 839) -> (967, 926), label: face, confidence: 6917/10000.
[Parsed_showinfo_2 @ 0x55b7d25e77c0] classify: label: anger, confidence: 4320/10000.
Signed-off-by: Guo, Yejun <yejun.guo@intel.com>
Different function type of model requires different parameters, for
example, object detection detects lots of objects (cat/dog/...) in
the frame, and classifcation needs to know which object (cat or dog)
it is going to classify.
The current interface needs to add a new function with more parameters
to support new requirement, with this change, we can just add a new
struct (for example DNNExecClassifyParams) based on DNNExecBaseParams,
and so we can continue to use the current interface execute_model just
with params changed.
There's one task item for one function call from dnn interface,
there's one request item for one call to openvino. For classify,
one task might need multiple inference for classification on every
bounding box, so add InferenceItem.
avpriv_set_systematic_pal2() is meant to fill fixed vales for formats that
until recently were tagged as "pseudo pal". This is no longer the case, so
this call is a no-op when used on real PAL formats.
Signed-off-by: James Almer <jamrial@gmail.com>
In particular, document that they initialize different parts of an
RLTable and therefore need not be synchronized.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
The SpeedHQ encoder currently reverses the entries of two small tables
and stores them in other tables. These other tables have a size of 48
bytes, yet the code for their initialization takes 135 bytes (GCC 9.3,
x64, O3 albeit in an av_cold function). So remove the runtime
initialization and hardcode the tables.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
The SpeedHQ decoder uses and initializes a RLTable's VLC, yet it also
initializes other parts of the RLTable that it does not use. This has
downsides besides being wasteful: Because the SpeedHQ encoder also
initializes these additional fields, there is a potential for data races
(and therefore undefined behaviour). In fact, removing the superfluous
initializations from the decoder automatically makes both the decoder
and the encoder init-threadsafe. This commit does so.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
Here the packet size is known before allocating the packet because
the encoder itself works with an internal buffer, so one can use
this information to avoid the implicit use of another intermediate
buffer for the packet data; one can also switch to
ff_get_encode_buffer() and directly use user-supplied buffers.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
When the packet size is known in advance like here, one can avoid
an intermediate buffer for the packet data by using
ff_get_encode_buffer() and also set AV_CODEC_CAP_DR1 at the same time.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
When the packet size is known in advance like here, one can avoid
an intermediate buffer for the packet data by using
ff_get_encode_buffer() and also set AV_CODEC_CAP_DR1 at the same time.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
When the packet size is known in advance like here, one can avoid
an intermediate buffer for the packet data by using
ff_get_encode_buffer() and also set AV_CODEC_CAP_DR1 at the same time.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>