* commit 'ca44fa5d7fda7e954f3ebfeb5b0d6d1be55fcaa3':
avcodec/libdav1d: properly free all output picture references
This commit is a noop, see 10931a0661
Merged-by: James Almer <jamrial@gmail.com>
* commit '70ab2778be9c83dab84340af7e3ba83fa0f98576':
libdav1d: update API usage to the first stable release
libdav1d: fix build after a recent API break
qsvenc: Add VDENC support for H264 and HEVC
avcodec: libdav1d AV1 decoder wrapper.
swscale: Add GRAY10
pixfmt: Add GRAY10
libx264: Pass the reordered_opaque field through the encoder
libavutil: Undeprecate the AVFrame reordered_opaque field
libaom: remove references to yuva444p pixfmt
Revert "decode: copy the output parameters from the last bsf in the chain back to the AVCodecContext"
This commit is a noop, see
87588caf8c4e9cff2824882ae091d43f1b5ca22eb5177c7051beaa350e24e92ce340e6
Merged-by: James Almer <jamrial@gmail.com>
* commit '1ff6cb2ca6652e7d2a929afd33d8ed6268c45568':
lavc/qsvenc_jpeg: set a default quality
lavc/qsvenc_jpeg: add async_depth support
This commit is a noop, see
0e3d7d845d92c25963e8
Merged-by: James Almer <jamrial@gmail.com>
* commit '04e8b8b0530e2aa33010faba3d0b6b6c9c5b704e':
avcodec/libaomenc: export the Sequence Header OBU as extradata
This commit is a noop. aom_codec_get_global_headers() is buggy at the moment.
See https://bugs.chromium.org/p/aomedia/issues/detail?id=2208
Merged-by: James Almer <jamrial@gmail.com>
* commit '97c9a5084479eeb66f4beb100cc7589a2c8bfe81':
avcodec/libaomenc: remove AVOption related to frame partitions
avcodec/extract_extradata: don't uninitialize the H2645Packet on every processed packet
avcodec/extract_extradata: Move the reference in the bsf internal buffer
avcodec/extract_extradata: Do not allocate more space than needed when removing NALUs in h264/hevc
avcodec/extract_extradata: Zero-initialize the padding bytes in all allocated buffers
avcodec/extract_extradata_bsf: Fix leak discovered via fuzzing
avcodec/bsf: Add ff_bsf_get_packet_ref() function
This commit is a noop, see
7ae52f8a6b5a412a5c3cd168e78eff2536bd86329c6dd9d624016d40011ab69ea742ab
Merged-by: James Almer <jamrial@gmail.com>
Even if NEON would be disabled, the init functions should be built
as they are called as long as ARCH_AARCH64 is set.
These functions are part of a generic DSP subsytem, not tied directly
to one decoder. (They should be built if the vp7 decoder is enabled,
even if the vp8 decoder is disabled.)
Signed-off-by: Martin Storsjö <martin@martin.st>
(cherry picked from commit b4b27dce95)
This also partially fixes assembling with MS armasm64 (via
gas-preprocessor).
The movrel macro invocations need to pass the offset via a separate
parameter. Mach-o and COFF relocations don't allow a negative
offset to a symbol, which is handled properly if the offset is passed
via the parameter. If no offset parameter is given, the macro
evaluates to something like "adrp x17, subpel_filters-16+(0)", which
older clang versions also fail to parse (the older clang versions
only support one single offset term, although it can be a parenthesis.
Signed-off-by: Martin Storsjö <martin@martin.st>
(cherry picked from commit 26d7af4c38)
If we fill with black then the generated palette will have one color more
than what the user requested. This also resulted in unwanted black specks in
the output of paletteuse, especially when generating small palettes.
- Clamp ME range to -64..63 (prevents corruption when me_range is too high)
- Allow MV's up to *and including* the positive range limit
- Allow out-of-edge ME by padding the prev buffer with a border of 0's
- Try previous MV before checking the rest (improves speed in some cases)
- More robust logic in code - ensure *mx,*my,*xored are updated together
- Improve block choices by counting 0-bytes in the entropy score
- Make histogram use uint16_t type, to allow byte counts from 16*16
(current block size) up to 255*255 (maximum allowed 8bpp block size)
- Make sure score table is big enough for a full block's worth of bytes
- Calculate *xored without using code in inner loop
The previous version was a pretty exact translation of the arm
version. This version does do some unnecessary arithemetic (it does
more operations on vectors that are only half filled; it does 4
uaddw and 4 sqxtun instead of 2 of each), but it reduces the overhead
of packing data together (which could be done for free in the arm
version).
This gives a decent speedup on Cortex A53, a minor speedup on
A72 and a very minor slowdown on Cortex A73.
Before: Cortex A53 A72 A73
vp8_idct_add_neon: 79.7 67.5 65.0
After:
vp8_idct_add_neon: 67.7 64.8 66.7
Signed-off-by: Martin Storsjö <martin@martin.st>
The original arm version didn't do saturation here. This probably
doesn't make any difference for performance, but reduces the
differences.
Signed-off-by: Martin Storsjö <martin@martin.st>
This makes it similar to put_epel16_v6, and gives a large speedup
on Cortex A53, a minor speedup on A72 and a very minor slowdown on
A73.
Before: Cortex A53 A72 A73
vp8_put_epel16_h6v6_neon: 2211.4 1586.5 1431.7
After:
vp8_put_epel16_h6v6_neon: 1736.9 1522.0 1448.1
Signed-off-by: Martin Storsjö <martin@martin.st>
This makes it similar to put_epel16_v6, and gives a 10-25%
speedup of this function.
Before: Cortex A7 A8 A9 A53 A72
vp8_put_epel16_h6v6_neon: 3058.0 2218.5 2459.8 2183.0 1572.2
After:
vp8_put_epel16_h6v6_neon: 2670.8 1934.2 2244.4 1729.4 1503.9
Signed-off-by: Martin Storsjö <martin@martin.st>
Even if NEON would be disabled, the init functions should be built
as they are called as long as ARCH_AARCH64 is set.
These functions are part of a generic DSP subsytem, not tied directly
to one decoder. (They should be built if the vp7 decoder is enabled,
even if the vp8 decoder is disabled.)
Signed-off-by: Martin Storsjö <martin@martin.st>
The previous form also does seem to assemble on current tools,
but I think it might fail on some older aarch64 tools.
Signed-off-by: Martin Storsjö <martin@martin.st>
This also partially fixes assembling with MS armasm64 (via
gas-preprocessor).
The movrel macro invocations need to pass the offset via a separate
parameter. Mach-o and COFF relocations don't allow a negative
offset to a symbol, which is handled properly if the offset is passed
via the parameter. If no offset parameter is given, the macro
evaluates to something like "adrp x17, subpel_filters-16+(0)", which
older clang versions also fail to parse (the older clang versions
only support one single offset term, although it can be a parenthesis.
Signed-off-by: Martin Storsjö <martin@martin.st>
This was found through the Hacker One program on VLC but is not a security issue in libavformat
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>