This work is sponsored by, and copyright, Google.
This reduces the code size of libavcodec/aarch64/vp9itxfm_neon.o from
19496 to 14740 bytes.
This gives a small slowdown of a couple of tens of cycles, but makes
it more feasible to add more optimized versions of these transforms.
Before:
vp9_inv_dct_dct_16x16_sub4_add_neon: 1036.7
vp9_inv_dct_dct_16x16_sub16_add_neon: 1372.2
vp9_inv_dct_dct_32x32_sub4_add_neon: 5180.0
vp9_inv_dct_dct_32x32_sub32_add_neon: 8095.7
After:
vp9_inv_dct_dct_16x16_sub4_add_neon: 1051.0
vp9_inv_dct_dct_16x16_sub16_add_neon: 1390.1
vp9_inv_dct_dct_32x32_sub4_add_neon: 5199.9
vp9_inv_dct_dct_32x32_sub32_add_neon: 8125.8
Signed-off-by: Martin Storsjö <martin@martin.st>
This work is sponsored by, and copyright, Google.
This reduces the code size of libavcodec/arm/vp9itxfm_neon.o from
15324 to 12388 bytes.
This gives a small slowdown of a couple tens of cycles, up to around
150 cycles for the full case of the largest transform, but makes
it more feasible to add more optimized versions of these transforms.
Before: Cortex A7 A8 A9 A53
vp9_inv_dct_dct_16x16_sub4_add_neon: 2063.4 1516.0 1719.5 1245.1
vp9_inv_dct_dct_16x16_sub16_add_neon: 3279.3 2454.5 2525.2 1982.3
vp9_inv_dct_dct_32x32_sub4_add_neon: 10750.0 7955.4 8525.6 6754.2
vp9_inv_dct_dct_32x32_sub32_add_neon: 18574.0 17108.4 14216.7 12010.2
After:
vp9_inv_dct_dct_16x16_sub4_add_neon: 2060.8 1608.5 1735.7 1262.0
vp9_inv_dct_dct_16x16_sub16_add_neon: 3211.2 2443.5 2546.1 1999.5
vp9_inv_dct_dct_32x32_sub4_add_neon: 10682.0 8043.8 8581.3 6810.1
vp9_inv_dct_dct_32x32_sub32_add_neon: 18522.4 17277.4 14286.7 12087.9
Signed-off-by: Martin Storsjö <martin@martin.st>
This avoids having to count the number of frames sent to the codec
and the number of output packets received; instead just wait until
the encoder returns a buffer with the EOS flag set.
Signed-off-by: Martin Storsjö <martin@martin.st>
This avoids concatenation, which can't be used if the whole macro
is wrapped within another macro.
This is also arguably more readable.
Signed-off-by: Martin Storsjö <martin@martin.st>
No deprecation guards, because the old decode API (for which this field
is needed) doesn't have any either.
This field should be removed together with the old decode calls.
Signed-off-by: Anton Khirnov <anton@khirnov.net>
The code relies on their validity and otherwise can try to access a NULL
object->rle pointer, causing segmentation faults.
Signed-off-by: Andreas Cadhalpun <Andreas.Cadhalpun@googlemail.com>
Signed-off-by: Diego Biurrun <diego@biurrun.de>
Only do this when building for a recent VAAPI version - initial
driver implementations were confused about the interpretation of the
framerate field, but hopefully this will be consistent everywhere
once 0.40.0 is released.
Default to using VBR when a target bitrate is set, unless the max rate
is also set and matches the target. Changes to the Intel driver mean
that min_qp is also respected in this case, so set a codec default to
unset the value rather than using the current default inherited from
the MPEG-4 part 2 encoder.
This includes a backward-compatibility hack to choose CBR anyway on
old drivers which have no CBR support, so that existing programs will
continue to work their options now map to VBR.
The current condition can trigger in cases where it shouldn't, with
unexpected results.
Make sure that:
- container cropping is really based on the original dimensions from the
caller
- those dimenions are discarded on size change
The code is still quite hacky and eventually should be deprecated and
removed, with the decision about which cropping is used delegated to the
caller.
Introducing enforced sync points in arbitrary places is bad for
performance. Since the vast majority of receiving code (QSV VPP or
encoders, retrieving frames through hwcontext) will do the syncing, this
change should not be visible to most callers. But bumping micro just in
case.
This is also consistent with what VAAPI hwaccel does.
We can pick the correct slice index directly from the ID3D11VideoDecoderOutputView
casted from data[3].
Signed-off-by: Anton Khirnov <anton@khirnov.net>