1
0
mirror of https://github.com/FFmpeg/FFmpeg.git synced 2024-12-23 12:43:46 +02:00
Commit Graph

2515 Commits

Author SHA1 Message Date
Michael Niedermayer
64098d0cd8
swscale/swscale: Check srcSliceH for bayer
Fixes: Assertion srcSliceH > 1 failed at libswscale/swscale_unscaled.c:1359
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2024-02-21 18:24:16 +01:00
Michael Niedermayer
18f26f8a2f
swscale/utils: Allocate more dithererror
Fixes: out of array read
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2024-02-21 18:24:16 +01:00
Michael Niedermayer
ebb7dffa97
swscale/tests/swscale: Add help text
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2024-02-15 23:07:44 +01:00
Michael Niedermayer
6ebe4ebee3
swscale/tests/swscale: Highlight cases that worsened
also highlight cases that worsened alot in uppercase

Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2024-02-15 23:07:44 +01:00
Michael Niedermayer
f7770ec9a4
swscale/tests/swscale: Allow comparing a subset of cases to a reference file
Testing all cases exhaustively is slow

Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2024-02-15 23:07:44 +01:00
Michael Niedermayer
885a802f24
swscale/tests/swscale: Test a wider range of flag combinations
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2024-02-15 23:07:43 +01:00
Michael Niedermayer
35ab103c30
swscale/tests/swscale: Compute chroma and alpha between gray and opaque frames too
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2024-02-15 23:07:43 +01:00
Michael Niedermayer
247f485448
swscale/tests/swscale: Split sws_getContext()
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2024-02-15 23:07:43 +01:00
Michael Niedermayer
1055ece30b
swscale/tests/swscale: Implement isALPHA() using AVPixFmtDescriptor
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2024-02-15 23:07:42 +01:00
Anton Khirnov
1e7d2007c3 all: use designated initializers for AVOption.unit
Makes it robust against adding fields before it, which will be useful in
following commits.

Majority of the patch generated by the following Coccinelle script:

@@
typedef AVOption;
identifier arr_name;
initializer list il;
initializer list[8] il1;
expression tail;
@@
AVOption arr_name[] = { il, { il1,
- tail
+ .unit = tail
}, ...  };

with some manual changes, as the script:
* has trouble with options defined inside macros
* sometimes does not handle options under an #else branch
* sometimes swallows whitespace
2024-02-14 14:53:41 +01:00
Rémi Denis-Courmont
b3825bbe45 riscv: test for assembler support
This should fix the build on LLVM 16 and earlier, at the cost of turning
all non-RVV optimisations off.
2023-12-08 17:21:09 +02:00
Alfred Wingate
e5ce473040 swscale/x86/rgb_2_rgb: Add opaque pointer to missed definitions of ff_nv12ToUV
Opaque parameters were previously added to the original definition of
ff_nv12ToUV, leading to gcc noticing a type mismatch with -Wlto-type-mismatch.

f2de911818
https://bugs.gentoo.org/907484

Signed-off-by: Alfred Wingate <parona@protonmail.com>
Signed-off-by: Anton Khirnov <anton@khirnov.net>
2023-12-02 11:22:46 +01:00
xufuji456
cc86343b96 lavc/hevcdsp_qpel_neon: using movi.16b instead of movi.2d
Building iOS platform with arm64, the compiler has a warning: "instruction movi.2d with immediate #0 may not function correctly on this CPU, converting to movi.16b"

Signed-off-by: xufuji456 <839789740@qq.com>
Signed-off-by: Martin Storsjö <martin@martin.st>
2023-11-28 15:54:49 +02:00
Rémi Denis-Courmont
6d60cc7baf sws/rgb2rgb: fix unaligned accesses in R-V V YUYV to I422p
In my personal opinion, we should not need to support unaligned YUY2
pixel maps. They should always be aligned to at least 32 bits, and the
current code assumes just 16 bits. However checkasm does test for
unaligned input bitmaps. QEMU accepts it, but real hardware dose not.

In this particular case, we can at the same time improve performance and
handle unaligned inputs, so do just that.

uyvytoyuv422_c:      104379.0
uyvytoyuv422_c:      104060.0
uyvytoyuv422_rvv_i32: 25284.0 (before)
uyvytoyuv422_rvv_i32: 19303.2 (after)
2023-11-13 18:34:29 +02:00
Rémi Denis-Courmont
5b8b5ec9c5 sws/rgb2rgb: rework R-V V YUY2 to 4:2:2 planar
This saves three scratch registers and three instructions per line. The
performance gains are mostly negligible. The main point is to free up
registers for further rework.
2023-11-13 18:34:29 +02:00
Niklas Haas
736284e7b9 swscale/yuv2rgb: fix sws_getCoefficients for colorspace=0
The documentation states that invalid entries default to SWS_CS_DEFAULT.
A value of 0 is not a valid SWS_CS_*, yet the code incorrectly
hard-codes it to BT.709 coefficients instead of SWS_CS_DEFAULT.
2023-11-09 12:53:35 +01:00
Niklas Haas
d043e5c54c swscale: don't omit ff_sws_init_range_convert for high-bit
This was a complete hack seemingly designed to work around a different
bug, which was fixed in the previous commit. As such, there is no more
reason not to do this, as it simply breaks changing color range in
sws_setColorspaceDetails for no reason.
2023-11-09 12:53:35 +01:00
Niklas Haas
cedf589c09 swscale: fix sws_setColorspaceDetails after sws_init_context
More commonly, this fixes the case of sws_setColorspaceDetails after
sws_getContext, since the latter implies sws_init_context.

The problem here is that sws_init_context sets up the range conversion
and fast path tables based on the values of srcRange/dstRange at init
time. This may result in locking in a "wrong" path (either using
unscaled fast path when range conversion later required, or using
scaled slow path when range conversion becomes no longer required).

There are two way outs:

1. Always initialize range conversion and unscaled converters, even if
   they will be unused, and extend the runtime check.
2. Re-do initialization if the values change after
   sws_setColorspaceDetails.

I opted for approach 1 because it was simpler and easier to reason
about.

Reword the av_log message to make it clear that this special converter
is not necessarily used, depending on whether or not there is range
conversion or YUV matrix conversion going on.
2023-11-09 12:53:35 +01:00
Michael Niedermayer
47e784f881
Bump versions after 6.1
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2023-10-29 16:19:14 +01:00
Michael Niedermayer
9d3a7d30c4
Bump versions prior to 6.1
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2023-10-29 15:34:05 +01:00
Martin Storsjö
a76b409dd0 aarch64: Reindent all assembly to 8/24 column indentation
libavcodec/aarch64/vc1dsp_neon.S is skipped here, as it intentionally
uses a layered indentation style to visually show how different
unrolled/interleaved phases fit together.

Signed-off-by: Martin Storsjö <martin@martin.st>
2023-10-21 23:25:54 +03:00
Martin Storsjö
93cda5a9c2 aarch64: Lowercase UXTW/SXTW and similar flags
Signed-off-by: Martin Storsjö <martin@martin.st>
2023-10-21 23:25:23 +03:00
Martin Storsjö
184103b310 aarch64: Consistently use lowercase for vector element specifiers
Signed-off-by: Martin Storsjö <martin@martin.st>
2023-10-21 23:25:18 +03:00
Rémi Denis-Courmont
19baf4e009 swscale/rgb2rgb: R-V V deinterleaveBytes 2023-10-03 22:53:20 +03:00
Rémi Denis-Courmont
ede3215115 swscale/rgb2rgb: fix extra iteration in R-V V interleave
There was an additional iteration doing nothing for each line,
due to checking the selected vector length instead of the available
vector length.
2023-10-03 22:53:20 +03:00
Rémi Denis-Courmont
d14130aea3 swscale/rgb2rgb: unroll R-V V interleave_bytes 2023-10-03 20:48:47 +03:00
Rémi Denis-Courmont
6269c4a440 swscale/rgb2rgb: unroll RISC-V V uyvytoyuv422 2023-10-03 20:48:39 +03:00
Rémi Denis-Courmont
e50f8e861b swscale/rgb2rgb: avoid S-regs in RISC-V V uyvytoyuv422
We can make do with callee-clobbered registers only now.
As an added bonus, this makes the code XLEN-independent.
2023-10-03 20:48:39 +03:00
Rémi Denis-Courmont
be37a2e364 swscale/rgb2rgb: rework RISC-V V uyvytoyuv422
This avoids using relatively slow register strides.
2023-10-03 20:48:39 +03:00
Rémi Denis-Courmont
1a4bd76ea5 swscale/rgb2rgb: remove R-V V shuffle_bytes_3012
This is slower than the Zbb version on real hardware due to register
strides. Proper support for vector byte-swap requires the Zvbb
extension, but it's much too early for me to worry about it.
2023-10-02 22:28:38 +03:00
Rémi Denis-Courmont
c4a144c29d swscale/rgb2rgb: add R-V Zbb shuffle_bytes_3210 2023-10-02 22:28:25 +03:00
Paul B Mahol
29b673bdcf swscale: add GBRAP14 format support 2023-09-28 19:37:58 +02:00
Andreas Rheinhardt
f8503b4c33 avutil/internal: Don't auto-include emms.h
Instead include emms.h wherever it is needed.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2023-09-04 11:04:45 +02:00
L. E. Segovia
ddc1cd5cdd configure: Set WIN32_LEAN_AND_MEAN at configure time
Including winsock2.h or windows.h without WIN32_LEAN_AND_MEAN cause
bzlib.h to parse as nonsense, due to an instance of #define char small
in rpcndr.h.

See:

https://stackoverflow.com/a/27794577

Signed-off-by: L. E. Segovia <amy@amyspark.me>
Signed-off-by: Martin Storsjö <martin@martin.st>
2023-08-14 22:57:28 +03:00
Rémi Denis-Courmont
c2b38619c0 swscale/rgb2rgb2: rework RISC-V V shuffle_bytes_{1230,3012}
This avoids strided loads.

Before:
shuffle_bytes_1230_rvv_i32: 308.7
shuffle_bytes_3012_rvv_i32: 308.7

After:
shuffle_bytes_1230_rvv_i32: 46.7
shuffle_bytes_3012_rvv_i32: 46.7
2023-07-21 22:18:02 +03:00
Rémi Denis-Courmont
15982554e6 swscale/rgb2rgb2: rework RISC-V V shuffle_bytes_{0321,2103}
This avoids strided loads.

Before:
shuffle_bytes_0321_rvv_i32: 307.7
shuffle_bytes_2103_rvv_i32: 308.7

After:
shuffle_bytes_0321_rvv_i32: 59.7
shuffle_bytes_2103_rvv_i32: 61.5
2023-07-21 22:18:02 +03:00
Rémi Denis-Courmont
d3948e4db5 swscale: inline ff_shuffle_bytes_3210_rvv
No functional changes.
2023-07-21 22:18:02 +03:00
Rémi Denis-Courmont
b6585eb04c lavu: add/use flag for RISC-V Zba extension
The code was blindly assuming that Zbb or V implied Zba. While the
earlier is practically always true, the later broke some QEMU setups,
as V was introduced earlier than Zba.
2023-07-19 19:29:35 +03:00
Khem Raj
a7b3c0203f libswscale/riscv: fix syntax of vsetvli
Add missing operand which clang complains about but GCC assumes it to be
'm1' if not specified.

Works around build failure with Clang:
| src/libswscale/riscv/rgb2rgb_rvv.S:88:25: error: operand must be e[8|16|32|64|128|256|512|1024],m[1|2|4|8|f2|f4|f8],[ta|tu],[ma|mu]
|         vsetvli t4, t3, e8, ta, ma
|                         ^

Signed-off-by: Rémi Denis-Courmont <remi@remlab.net>
2023-07-13 22:01:24 +03:00
Lynne
b3fb73af6b
swscale: bump minor for implementing support for the new pixfmts 2023-05-29 00:42:02 +02:00
Lynne
934525eae0
lsws: add in/out support for the new 12-bit 2-plane 422 and 444 pixfmts 2023-05-29 00:41:35 +02:00
Jin Bo
cb4ae8baee
swscale/la: Add following builtin optimized functions
yuv420_rgb24_lsx
yuv420_bgr24_lsx
yuv420_rgba32_lsx
yuv420_argb32_lsx
yuv420_bgra32_lsx
yuv420_abgr32_lsx
./configure --disable-lasx
ffmpeg -i ~/media/1_h264_1080p_30fps_3Mbps.mp4 -f rawvideo
-pix_fmt rgb24 -y /dev/null -an
before: 184fps
after:  207fps

Reviewed-by: Shiyou Yin <yinshiyou-hf@loongson.cn>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2023-05-25 21:05:15 +02:00
Lu Wang
4501b1dfd7
swscale/la: Optimize the functions of the swscale series with lsx.
./configure --disable-lasx
ffmpeg -i ~/media/1_h264_1080p_30fps_3Mbps.mp4 -f rawvideo -s 640x480
-pix_fmt bgra -y /dev/null -an
before: 91fps
after:  160fps

Reviewed-by: Shiyou Yin <yinshiyou-hf@loongson.cn>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2023-05-25 21:05:08 +02:00
Lynne
a62a3930c2
swscale/ppc: remove hScale8To19_vsx
Fails checkasm on a Power9 system.
2023-05-20 20:07:18 +02:00
Michael Niedermayer
47ac3e6065
version.h: Bump minor post 6.0 branch
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2023-02-19 18:37:36 +01:00
Michael Niedermayer
62efa096af
version.h: Bump minor for 6.0 branch
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2023-02-19 18:32:07 +01:00
James Almer
5bad485603 Bump major versions of all libraries
Signed-off-by: James Almer <jamrial@gmail.com>
2023-02-09 15:35:14 +01:00
Tomas Härdin
a678b0c252 sws/utils.c: Do not uselessly call initFilter() when unscaling 2023-02-08 15:53:55 +01:00
Lynne
bbe95f7353
x86: replace explicit REP_RETs with RETs
From x86inc:
> On AMD cpus <=K10, an ordinary ret is slow if it immediately follows either
> a branch or a branch target. So switch to a 2-byte form of ret in that case.
> We can automatically detect "follows a branch", but not a branch target.
> (SSSE3 is a sufficient condition to know that your cpu doesn't have this problem.)

x86inc can automatically determine whether to use REP_RET rather than
REP in most of these cases, so impact is minimal. Additionally, a few
REP_RETs were used unnecessary, despite the return being nowhere near a
branch.

The only CPUs affected were AMD K10s, made between 2007 and 2011, 16
years ago and 12 years ago, respectively.

In the future, everyone involved with x86inc should consider dropping
REP_RETs altogether.
2023-02-01 04:23:55 +01:00
Andreas Rheinhardt
1ff9c07fa6 swscale/utils: Fix indentation
Forgotten after c1eb3e7fec.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2022-11-24 21:02:57 +01:00