384fe39623 introduced a regression in the
range conversion offset calculation, resulting in a slight green tint
in full-range RGB to YUV conversions of grayscale values.
The offset being calculated was not taking into consideration a bias
needed for correctly rounding the result from the multiplication stage,
leading to a truncated value.
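
As a minimal sketch of the rounding issue (this is not the actual swscale
code; SHIFT, the coefficient and the function names are hypothetical), a
fixed-point multiply followed by a right shift truncates unless half of the
divisor is folded into the offset first:

```python
SHIFT = 14  # hypothetical number of fractional bits in the coefficient


def scale_truncating(x, coeff, offset):
    # Offset without a rounding bias: the shift simply drops the fraction.
    return (x * coeff + offset) >> SHIFT


def scale_rounding(x, coeff, offset):
    # Adding half of the divisor (1 << (SHIFT - 1)) before shifting
    # rounds to nearest instead of truncating.
    bias = 1 << (SHIFT - 1)
    return (x * coeff + offset + bias) >> SHIFT


# A coefficient just below 1.0 applied to the neutral value 128:
print(scale_truncating(128, 16383, 0))  # 127
print(scale_rounding(128, 16383, 0))    # 128
```

An off-by-one of this kind on a neutral chroma value is the sort of error
that would show up as a slight tint on grayscale input.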
Fixes issue #11646.
This one takes about 2.93s on my machine, but ensures that every pixel
format conversion roundtrips correctly. Note that due to existing bugs in
libswscale, this one only passes when using the new format conversion code.
Restrict the test to -v 16 (AV_LOG_ERROR) to avoid excess amounts of output.
We can't use ANSI color codes inside av_log(), so fall back to printf()
for these; but match the INFO verbosity level.
Also change the format slightly to drop SSIM numbers down to just below
VERBOSE level, since VERBOSE tends to generate a lot of swscale related
spam.
This is more informative than the current behavior, because when the first
MERGE() succeeds but the second fails, the original link already has
merged formats and thus the error message is confusing.
Doubling the register size makes it possible to avoid two pmaddubsw.
It is also ABI compliant (the old version lacked an emms)
and the average versions no longer rely on padding (the old versions
used pavgb with a memory operand reading eight bytes,
although only four are needed).
Old benchmarks (the latter four refer to RV40):
avg_h264_chroma_mc4_8_c: 145.7 ( 1.00x)
avg_h264_chroma_mc4_8_ssse3: 32.3 ( 4.51x)
put_h264_chroma_mc4_8_c: 136.1 ( 1.00x)
put_h264_chroma_mc4_8_ssse3: 29.0 ( 4.70x)
avg_chroma_mc4_c: 162.1 ( 1.00x)
avg_chroma_mc4_ssse3: 31.1 ( 5.22x)
put_chroma_mc4_c: 137.5 ( 1.00x)
put_chroma_mc4_ssse3: 28.6 ( 4.81x)
New benchmarks:
avg_h264_chroma_mc4_8_c: 146.7 ( 1.00x)
avg_h264_chroma_mc4_8_ssse3: 26.5 ( 5.53x)
put_h264_chroma_mc4_8_c: 136.8 ( 1.00x)
put_h264_chroma_mc4_8_ssse3: 22.5 ( 6.09x)
avg_chroma_mc4_c: 165.5 ( 1.00x)
avg_chroma_mc4_ssse3: 27.2 ( 6.08x)
put_chroma_mc4_c: 138.1 ( 1.00x)
put_chroma_mc4_ssse3: 23.2 ( 5.96x)
Reviewed-by: Lynne <dev@lynne.ee>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Vulkan headers define *FlagBits enums with individual bit values, and a
corresponding *Flags typedef to be used to store the bitmask of the
corresponding bits.
In practice those two types map to the same type, but for consistency
*Flags should be used.
Fixes MSVC warnings about type mismatch.
Signed-off-by: Kacper Michajłow <kasper93@gmail.com>
The wrong enum value was used to check unit_elems. While
AV_FRAME_DATA_MASTERING_DISPLAY_METADATA (11) would trigger when
UNIT_MASTERING_DISPLAY (2) was set, it also would match
UNIT_CONTENT_LIGHT_LEVEL (1) which is not expected.
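
The overlap can be checked with a short sketch that uses only the numeric
values quoted above; the has_unit() helper is hypothetical and merely stands
in for the bitmask test:

```python
# Bit flags, with the values quoted in the message above.
UNIT_CONTENT_LIGHT_LEVEL = 1
UNIT_MASTERING_DISPLAY = 2
# A side data type enum value, not a bit flag.
AV_FRAME_DATA_MASTERING_DISPLAY_METADATA = 11


def has_unit(unit_elems, flag):
    # Hypothetical helper: true if any bit of `flag` is set in `unit_elems`.
    return (unit_elems & flag) != 0


only_cll = UNIT_CONTENT_LIGHT_LEVEL
# 11 is 0b1011, so it overlaps with both bit 1 and bit 2.
print(has_unit(only_cll, AV_FRAME_DATA_MASTERING_DISPLAY_METADATA))  # True
print(has_unit(only_cll, UNIT_MASTERING_DISPLAY))                    # False
```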
Signed-off-by: Kacper Michajłow <kasper93@gmail.com>
Pico VR adds a '\0' after projection_type (a real C string rather than
a fourcc). It's not strictly correct, but doesn't affect parsing.
[prji: Projection Information Box]
position = 149574743
size = 17
version = 0
flags = 0x000000
projection_type = rect
Co-Authored-by: Keven Ma
Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
In MSVC builds, object files built for shared or static libraries are
technically not compatible with each other. That's why building both
shared and static libraries simultaneously is not allowed in configure.
This may change in the future once dllimport/dllexport attributes are no
longer used, which will be possible on the next major bump. The only
remaining use of dllexport was changed in c6c8063186.
However, for test programs, we still build internal static libraries
that allow the test programs to access internal symbols.
In commit 8eca3fa619, I assumed that when
CONFIG_STATIC=0, we would never build a static library. We actually do
build one for internal purposes, for the test executables. In that case,
we only link the tested library statically (using the same object files
as built for the shared library), the rest of the libraries are still
linked dynamically.
Such libraries are never installed and are used only for test programs.
This change adds a -static suffix to these internal libraries to avoid
name conflicts. In the MSVC world, static libraries and import libraries
are generally the same thing and share the same naming conventions.
Fixes: 8eca3fa619
Signed-off-by: Kacper Michajłow <kasper93@gmail.com>
This is a regression introduced by the addition of the rotation option,
which overrode any existing rotation attribute that may have been set on
the image.
To fix it, add the rotation instead of setting it; however, we have to do this
directly when mapping, so as not to add it multiple times.
Fixes: 4f623b4c59
They are overridden by SSE2 and no longer needed by the no longer
existing nsse MMX functions. Saves 240B here.
Reviewed-by: Lynne <dev@lynne.ee>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Even nsse8 has to operate on eight words and therefore gains
a lot from xmm registers (and pabsw).
Old benchmarks:
nsse_0_c: 359.2 ( 1.00x)
nsse_0_mmx: 151.8 ( 2.37x)
nsse_1_c: 151.2 ( 1.00x)
nsse_1_mmx: 77.5 ( 1.95x)
New benchmarks:
nsse_0_c: 358.8 ( 1.00x)
nsse_0_ssse3: 62.2 ( 5.77x)
nsse_1_c: 151.2 ( 1.00x)
nsse_1_ssse3: 33.6 ( 4.50x)
The MMX nsse functions have been removed.
Reviewed-by: Lynne <dev@lynne.ee>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
This will avoid using xmm registers that are volatile for Win64
in the next commit.
Reviewed-by: Lynne <dev@lynne.ee>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
This is a post-processing codec: given delta-x/y coordinates and a run length,
the r/g/b components of the 4 surrounding pixels are summed up, and the resulting
15bit value is used as index into a color quantization table to derive the
resulting pixel at the center.
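
A rough sketch of the lookup step, under assumptions that are not taken from
the decoder: each 10-bit channel sum is reduced to its top 5 bits, the three
values are packed as r:g:b, and the function names are hypothetical. The
delta-x/y and run length handling is left out:

```python
def blur_index(neighbors):
    """Build a 15-bit table index from four (r, g, b) 8-bit pixels."""
    r = sum(p[0] for p in neighbors)
    g = sum(p[1] for p in neighbors)
    b = sum(p[2] for p in neighbors)
    # Assumption: each sum is at most 4 * 255 = 1020 (10 bits), so
    # shifting right by 5 keeps the top 5 bits of every channel.
    return ((r >> 5) << 10) | ((g >> 5) << 5) | (b >> 5)


def blur_pixel(neighbors, quant_table):
    # quant_table has 1 << 15 entries mapping the index to an output pixel.
    return quant_table[blur_index(neighbors)]


# Placeholder table just to make the sketch runnable.
table = [i & 0xFF for i in range(1 << 15)]
print(blur_pixel([(255, 255, 255)] * 4, table))
```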
It is only used in 10-20 frames of the Rebel Assault 2 LxxRETRY.SAN files
to slightly blur the outline of the "opening aperture" effect.
Signed-off-by: Manuel Lauss <manuel.lauss@gmail.com>
This makes the final file truly hybrid: Externally the file
is a regular, non-fragmented file, but internally, the fragmented
form also exists un-overwritten.
To make any use of that, first, the fragments need to be muxed in
a position independent form, i.e. with empty_moov+default_base_moof
(or the dash or cmaf meta-flags).
Making use of the fragmented form when the file is finalized is
not entirely obvious though. One can dump the contents of the
single mdat box, and get the fragmented form. (This is a neat
trick, but not something that anybody really is expected to
want to do.)
The main expected use case is accessing fragments in the form of
byte range segments, for e.g. HLS.
Previously, the start of the file would look like this:
- ftyp
- free
- moov
- (moov contents)
After finalizing the file, it would look like this:
- ftyp
- free
- mdat (previously moov)
- (moov contents)
In this form, the size and type of the original moov box were
overwritten, and the original moov contents were just left over
as unused data in the mdat box.
To avoid this issue, the start of the file now looks like this:
- ftyp
- free
- free
- ftyp
- moov
- (moov contents)
The second, hidden ftyp box inside mdat would normally never be
seen.
After finalizing, the difference is that the mdat box now is
extended to cover the ftyp and the whole moov including its header
(and all the following fragments).
I.e., the start of the file looks like this:
- ftyp
- free
- mdat
- ftyp
- moov
- (moov contents)
This allows accessing the "ftyp+moov" pair sequentially as such,
with a byte range - this range is untouched when finalizing,
producing the same ftyp+moov pair both while writing, when the
file is fragmented, and after finalizing, when the file is
transformed to non-fragmented externally.
Note: the two sequential "free+free" boxes may look slightly
silly; it could be tempting to make the second one an mdat
from the get-go. However, some players of fragmented mp4 (in
particular, Apple's HLS player) bail out if the initialization
segment contains an mdat box - therefore, use a free box.
It could also be possible to use just one single free box with
8 bytes of padding at the start - but that would require more
changes to the finalization logic.
For a segmenting user of the muxer, the only unclear part is how
to determine the right byte range for the internal ftyp+moov
pair. Currently, this requires parsing the muxer output and skipping
past anything up to the start of the non-empty free box.
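
A minimal sketch of such parsing, assuming that while the file is still
fragmented the hidden ftyp and moov appear as top-level boxes after the free
boxes. The helper names are hypothetical and the input below is synthetic,
not real muxer output:

```python
import struct


def iter_boxes(data, start=0, end=None):
    """Yield (type, offset, size) for each box in data[start:end]."""
    end = len(data) if end is None else end
    pos = start
    while pos + 8 <= end:
        size, box_type = struct.unpack_from(">I4s", data, pos)
        if size == 1:
            # A 64-bit largesize follows the type field.
            if pos + 16 > end:
                break
            (size,) = struct.unpack_from(">Q", data, pos + 8)
        elif size == 0:
            # Box extends to the end of the data.
            size = end - pos
        if size < 8:
            break  # malformed box, stop parsing
        yield box_type.decode("latin-1"), pos, size
        pos += size


def find_init_range(data):
    """Return (offset, length) of the first ftyp+moov pair after a free box."""
    boxes = list(iter_boxes(data))
    seen_free = False
    for (t0, off0, size0), (t1, _, size1) in zip(boxes, boxes[1:]):
        if t0 == "free":
            seen_free = True
        elif seen_free and t0 == "ftyp" and t1 == "moov":
            return off0, size0 + size1
    return None


def make_box(box_type, payload=b""):
    # Helper for building the synthetic example only.
    return struct.pack(">I4s", 8 + len(payload), box_type) + payload


data = (make_box(b"ftyp", b"isom") + make_box(b"free") + make_box(b"free")
        + make_box(b"ftyp", b"isom") + make_box(b"moov", b"\0" * 4))
print(find_init_range(data))  # (28, 24)
```

After finalizing, the pair sits inside the mdat box, so a top-level walk like
this one would no longer find it; the byte range has to be recorded while
the file is still being written.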