Christophe Gisquet
e8c003edd2
x86: hevc_deblock: remove unnecessary masking
...
The unpacks/shuffles later on makes it unnecessary.
Before:
1508 decicycles in h, 2096759 runs, 393 skips
2512 decicycles in v, 2095422 runs, 1730 skips
After:
1477 decicycles in h, 2096745 runs, 407 skips
2484 decicycles in v, 2095297 runs, 1855 skips
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-08-04 17:46:04 +02:00
James Almer
b7863c972c
x86/hevc_mc: use fewer instructions in hevc_put_hevc_{uni, bi}_w[24]_{8, 10, 12}
...
Signed-off-by: James Almer <jamrial@gmail.com>
Reviewed-by: Christophe Gisquet <christophe.gisquet@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-08-04 14:47:15 +02:00
James Almer
b1a44e6bf5
x86/hevc_mc: remove an unnecessary pxor
...
Signed-off-by: James Almer <jamrial@gmail.com>
Reviewed-by: Mickaël Raulet <mraulet@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-08-04 14:35:08 +02:00
James Almer
d0f56ca071
x86/hevc_deblock: improve 8bit transpose store macros
...
Up to four instructions less depending on function and instruction set.
Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-08-03 04:24:15 +02:00
Michael Niedermayer
f54e01c24e
Merge commit 'a786c8259dafeca9744252230b5d78f67810770c'
...
* commit 'a786c8259dafeca9744252230b5d78f67810770c':
idct: Split off Xvid IDCT
Conflicts:
libavcodec/Makefile
libavcodec/mpeg4videodec.c
libavcodec/x86/Makefile
libavcodec/x86/idctdsp_init.c
This split is somewhat restructured leaving the xvid IDCT available
outside mpeg4 if manually selected.
The code also could not be merged unchanged as it conflicted with a
bugfix in FFmpeg
Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-08-01 16:21:52 +02:00
Diego Biurrun
a786c8259d
idct: Split off Xvid IDCT
...
The Xvid IDCT is only required to decode some Xvid-encoded MPEG-4 files,
so there is no point in having it as an unconditional part of idctdsp.
2014-08-01 01:25:18 -07:00
James Almer
62baf5b853
x86/hevc_deblock: use existing x86util transpose macro in chroma_{10, 12}
...
Cosmetic change. No measurable difference in speed.
Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-07-31 22:56:21 +02:00
Christophe Gisquet
a507623bad
x86: hevc_mc: fix register count usage
...
A macro was using a fixed register, causing too many GPRs to be
declared as used.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-07-29 22:50:50 +02:00
James Almer
73c4f63ba5
x86/hevc_deblock: add add ff_hevc_[hv]_loop_filter_luma_{8, 10, 12}_avx
...
~5% faster than SSSE3
Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-07-29 14:04:59 +02:00
James Almer
88ba821f23
x86/hevc_deblock: improve luma functions register allocation
...
Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-07-29 13:38:05 +02:00
James Almer
c74b08c5c6
x86/hevc_deblock: remove some unnecessary instructions
...
Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-07-29 13:27:44 +02:00
James Almer
4f91bb0ff0
x86/hevc_deblock: use psignw instead of pmullw where possible
...
It's slightly faster
Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-07-29 03:42:29 +02:00
Michael Niedermayer
a91c5ed008
Merge commit '4f8cf0dc4ef6110174056df7edd9dc2f2a988b6d'
...
* commit '4f8cf0dc4ef6110174056df7edd9dc2f2a988b6d':
x86: build: Restore ordering of OBJS lines
Conflicts:
libavcodec/x86/Makefile
Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-07-29 00:34:53 +02:00
Diego Biurrun
4f8cf0dc4e
x86: build: Restore ordering of OBJS lines
2014-07-28 13:19:04 -07:00
James Almer
664e9e4331
x86/hevc_deblock: load less data in hevc_h_loop_filter_luma_8
...
Reading 8 bytes is enough.
Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-07-28 21:55:22 +02:00
James Almer
f137876182
x86/hevc_idct: add a colon to labels
...
This fixes a warning spam when using NASM
Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-07-28 21:43:32 +02:00
Christophe Gisquet
81943a10b5
x86: hevc_mc: load less data in epel filters
...
Before:
5679 decicycles in epel_bi, 2059976 runs, 37176 skips
3468 decicycles in epel_uni, 1040886 runs, 7690 skips
After:
5323 decicycles in epel_bi, 2059493 runs, 37659 skips
3262 decicycles in epel_uni, 1040871 runs, 7705 skips
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-07-27 18:34:39 +02:00
Christophe Gisquet
36284ae981
x86: hevc_mc: replace one lea by add
...
Should have been in 036f11bdb5
.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-07-27 17:42:56 +02:00
James Almer
bfb3b2b7a6
x86/hevc_idct: add 12bit idct_dc
...
Signed-off-by: James Almer <jamrial@gmail.com>
Reviewed-by: Mickaël Raulet <mraulet@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-07-27 00:30:56 +02:00
Michael Niedermayer
d4a9e89b27
avcodec/x86/hevcdsp_init: make license header consistent
...
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-07-27 00:28:44 +02:00
Michael Niedermayer
706f81a2c2
Merge commit '1a880b2fb8456ce68eefe5902bac95fea1e6a72d'
...
* commit '1a880b2fb8456ce68eefe5902bac95fea1e6a72d':
hevc: SSE2 and SSSE3 loop filters
Conflicts:
libavcodec/hevcdsp.c
libavcodec/hevcdsp.h
libavcodec/x86/Makefile
libavcodec/x86/hevc_deblock.asm
libavcodec/x86/hevcdsp_init.c
See: de7b89fd43
and several others
Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-07-27 00:20:48 +02:00
James Almer
1ace9573dc
x86/hevc_idct: replace old and unused idct functions
...
Only 8-bit and 10-bit idct_dc() functions are included (adding others should be trivial).
Benchmarks on an Intel Core i5-4200U:
idct8x8_dc
SSE2 MMXEXT C
cycles 22 26 57
idct16x16_dc
AVX2 SSE2 C
cycles 27 32 249
idct32x32_dc
AVX2 SSE2 C
cycles 62 126 1375
Signed-off-by: James Almer <jamrial@gmail.com>
Reviewed-by: Mickaël Raulet <mraulet@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-07-26 18:00:11 +02:00
Pierre Edouard Lepere
1a880b2fb8
hevc: SSE2 and SSSE3 loop filters
...
Additional contributions by James Almer <jamrial@gmail.com>,
Carl Eugen Hoyos <cehoyos@ag.or.at>, Fiona Glaser <fiona@x264.com> and
Anton Khirnov <anton@khirnov.net>
Signed-off-by: Anton Khirnov <anton@khirnov.net>
2014-07-26 15:01:01 +00:00
Christophe Gisquet
036f11bdb5
x86: hevc_mc: replace simple leas by adds
...
lea is detrimental for those simple cases. No impact overall to
the change though.
Before:
15017 decicycles in q, 1016152 runs, 32424 skips
15382 decicycles in q_bi, 1013673 runs, 34903 skips
3713 decicycles in e, 2074534 runs, 22618 skips
3901 decicycles in e_bi, 2065509 runs, 31643 skips
7852 decicycles in q_uni, 520165 runs, 4123 skips
2398 decicycles in e_uni, 1043339 runs, 5237 skips
After:
14898 decicycles in q, 1016295 runs, 32281 skips
15119 decicycles in q_bi, 1015392 runs, 33184 skips
3682 decicycles in e, 2073224
runs, 23928 skips
3720 decicycles in e_bi, 2065043 runs, 32109 skips
7643 decicycles in q_uni, 520280 runs, 4008 skips
2363 decicycles in e_uni, 1043780 runs, 4796 skips
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-07-26 05:41:04 +02:00
Mickaël Raulet
bd0f2d316f
x86/hevc: add 12bits support for MC
...
cherry picked from commit 3fcb7a4595a6f40100a22110a5805e3b7510c0fd
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-07-26 01:55:20 +02:00
Mickaël Raulet
7df98d8c4d
x86/hevc: remove unused constant in deblocking filter
...
cherry picked from commit a3f7282eaa6f1ab0524fb966c6eade50c3025f99
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-07-26 01:20:40 +02:00
Mickaël Raulet
7bdcf5c934
x86/hevc: add 12bits support for deblocking filter
...
cherry picked from commit 97d46afe320c7d61d7b9525e5f5588355cde4bb0
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-07-26 01:19:42 +02:00
Michael Niedermayer
2904d052b7
Merge commit '7fb993d338d88f2f62e0a358b6c9f3eb9a3a08ac'
...
* commit '7fb993d338d88f2f62e0a358b6c9f3eb9a3a08ac':
qpeldsp: Mark source pointer in qpel_mc_func function pointer const
Conflicts:
libavcodec/h264qpel_template.c
libavcodec/x86/cavsdsp.c
libavcodec/x86/rv40dsp_init.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-07-25 13:05:08 +02:00
Diego Biurrun
7fb993d338
qpeldsp: Mark source pointer in qpel_mc_func function pointer const
2014-07-25 02:52:54 -07:00
Christophe Gisquet
670b7f203a
x86: hevcdsp: align
...
Reviewed-by: Mickaël Raulet <mraulet@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-07-23 22:18:08 +02:00
Carl Eugen Hoyos
c75fdee747
avcodec/x86/hevc_deblock: Fix compilation with nasm.
2014-07-23 10:32:27 +02:00
Michael Niedermayer
ca6b33b8bd
avcodec/x86/hevcdsp_init: Fix "warning: assignment from incompatible pointer type"
2014-07-22 16:36:12 +02:00
Anton Khirnov
d7e162d46b
hevcdsp: remove an unneeded variable in the loop filter
...
beta0 and beta1 will always be the same within a CU
Signed-off-by: Mickaël Raulet <mraulet@insa-rennes.fr>
cherry picked from commit 4a23d824741a289c7d2d2f2871d1e2621b63fa1b
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-07-22 16:27:26 +02:00
Anton Khirnov
ae2f048fd7
avcodec/x86/hevc_deblock: cosmetics
...
cherry picked from commit f7843356253459e6010320292dbbc1e888a5249b
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-07-22 16:18:05 +02:00
Anton Khirnov
b435043abb
hevc: cleanups in SSE2 and SSSE3 loop filters, use fewer instructions
...
cherry picked from commit f7843356253459e6010320292dbbc1e888a5249b
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-07-22 16:17:29 +02:00
Anton Khirnov
e8581b17a8
avcodec/x86/hevc_deblock: use test instead of cmp 0
...
cherry picked from commit f7843356253459e6010320292dbbc1e888a5249b
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-07-22 16:16:05 +02:00
Anton Khirnov
dc69247de4
avcodec/x86/hevc_deblock: use of paddw instead of psllw
...
cherry picked from commit f7843356253459e6010320292dbbc1e888a5249b
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-07-22 16:14:53 +02:00
Anton Khirnov
500a0394d5
avcodec/x86/hevc_deblock: add %ifs to avoid "do nothing instructions"
...
cherry picked from commit f7843356253459e6010320292dbbc1e888a5249b
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-07-22 16:13:28 +02:00
Anton Khirnov
7a4cf67117
hevc: cleaning up SSE2 and SSSE3 deblocking filters
...
Signed-off-by: Mickaël Raulet <mraulet@insa-rennes.fr>
cherry picked from commit b432041d7d1eca38831590f13b4e5baffff8186f
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-07-22 16:00:48 +02:00
Michael Niedermayer
d986c414de
Merge commit '81b9bf319226fe03436c80aaa8a2c91767cab7ce'
...
* commit '81b9bf319226fe03436c80aaa8a2c91767cab7ce':
dct-test: Move arch-specific bits into arch-specific subdirectories
Conflicts:
libavcodec/dct-test.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-07-21 13:33:51 +02:00
Diego Biurrun
81b9bf3192
dct-test: Move arch-specific bits into arch-specific subdirectories
2014-07-21 01:10:11 -07:00
Michael Niedermayer
776647360d
Merge commit '5dcc201505f71b1e73e9eef12ce89d4eed252ad0'
...
* commit '5dcc201505f71b1e73e9eef12ce89d4eed252ad0':
simple_idct: Move x86-specific declarations to a header in the x86 directory
Conflicts:
libavcodec/x86/simple_idct.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-07-19 13:56:29 +02:00
Michael Niedermayer
6da96a9fc9
Merge commit '85cabb8d002f2cd100ced5cc17d87bfc9460d314'
...
* commit '85cabb8d002f2cd100ced5cc17d87bfc9460d314':
fdct: Move x86-specific declarations to a header in the x86 directory
Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-07-19 13:45:59 +02:00
Diego Biurrun
5dcc201505
simple_idct: Move x86-specific declarations to a header in the x86 directory
2014-07-19 02:33:36 -07:00
Diego Biurrun
85cabb8d00
fdct: Move x86-specific declarations to a header in the x86 directory
2014-07-19 02:25:59 -07:00
Michael Niedermayer
097bf834ba
Merge commit '9e0b29911f1f167381a7dbdfca68bf417b8c767b'
...
* commit '9e0b29911f1f167381a7dbdfca68bf417b8c767b':
x86: dnxhdenc: Eliminate some unnecessary ifdefs
Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-07-18 22:33:24 +02:00
Michael Niedermayer
521f569734
Merge commit '8b0dd4942aac320d1ca3c40fa7ea1be342c71273'
...
* commit '8b0dd4942aac320d1ca3c40fa7ea1be342c71273':
idctdsp: prettyprinting cosmetics
Conflicts:
libavcodec/idctdsp.c
libavcodec/ppc/idctdsp.c
libavcodec/x86/idctdsp_init.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-07-18 22:16:04 +02:00
Michael Niedermayer
42d326353c
Merge commit 'b4987f72197e0c62cf2633bf835a9c32d2a445ae'
...
* commit 'b4987f72197e0c62cf2633bf835a9c32d2a445ae':
idct: Convert IDCT permutation #defines to an enum
Conflicts:
libavcodec/idctdsp.c
libavcodec/x86/cavsdsp.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-07-18 22:01:17 +02:00
Diego Biurrun
9e0b29911f
x86: dnxhdenc: Eliminate some unnecessary ifdefs
2014-07-18 09:58:17 -07:00
Diego Biurrun
8b0dd4942a
idctdsp: prettyprinting cosmetics
2014-07-18 07:51:03 -07:00