1
0
mirror of https://github.com/FFmpeg/FFmpeg.git synced 2024-12-23 12:43:46 +02:00
Commit Graph

2312 Commits

Author SHA1 Message Date
Clément Bœsch
1ea0df14c3 Merge commit '0361e4dcb4d394c88c33364415a3b8fe315b67d1'
* commit '0361e4dcb4d394c88c33364415a3b8fe315b67d1':
  h264_qpel: x86: Move function with only one instance out of template macro

Note: warning is present with clang.

Merged-by: Clément Bœsch <cboesch@gopro.com>
2017-03-31 09:44:04 +02:00
Ronald S. Bultje
f8c019944d vp9: re-split the decoder/format/dsp interface header files.
The advantage here is that the internal software decoder interface is
not exposed to the DSP functions or the hardware accelerations.
2017-03-28 18:04:26 -04:00
Clément Bœsch
1c9f4b5078 lavc/vp9: split into vp9{block,data,mvs}
This is following Libav layout to ease merges.
2017-03-27 21:38:21 +02:00
Michael Niedermayer
73fb40dc87 avcodec/x86/idctdsp: Remove duplicate include
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2017-03-26 19:17:30 +02:00
James Almer
ac42f08099 x86/hevc_add_res: merge missing changes from 3d65359832
Unrolling the loops triplicates the size of the assembled output
while not generating any gain in performance.
2017-03-24 11:24:18 -03:00
Clément Bœsch
3d65359832 Merge commit '6d5636ad9ab6bd9bedf902051d88b7044385f88b'
* commit '6d5636ad9ab6bd9bedf902051d88b7044385f88b':
  hevc: x86: Add add_residual() SIMD optimizations

See a6af4bf64d

This merge is only cosmetics (renames, space shuffling, etc).

The functionnal changes in the ASM are *not* merged:
- unrolling with %rep is kept
- ADD_RES_MMX_4_8 is left untouched: this needs investigation

Merged-by: Clément Bœsch <u@pkh.me>
2017-03-24 12:33:25 +01:00
Clément Bœsch
40ac226014 lavc/x86/hevc: rename hevc_res_add to hevc_add_res
This will simplify incoming merge.
2017-03-24 11:45:23 +01:00
James Almer
bac44a5020 Merge commit 'b89804da9bad2d94dd95bf20ac6187447e9c17e9'
* commit 'b89804da9bad2d94dd95bf20ac6187447e9c17e9':
  x86: videodsp: Add parentheses to expression to work around warning

Merged-by: James Almer <jamrial@gmail.com>
2017-03-23 18:35:49 -03:00
James Almer
29db87af52 Merge commit '6be7944ee2ec2f045e6eb9a93237e992c8b20ac4'
* commit '6be7944ee2ec2f045e6eb9a93237e992c8b20ac4':
  x86: Add missing colons after assembly labels

Merged-by: James Almer <jamrial@gmail.com>
2017-03-23 18:05:27 -03:00
Clément Bœsch
947230837c Merge commit '112cee0241f5799edff0e4682b9e8639b046dc78'
* commit '112cee0241f5799edff0e4682b9e8639b046dc78':
  hevc: Add SSE2 and AVX IDCT

Merged-by: Clément Bœsch <u@pkh.me>
2017-03-23 15:58:46 +01:00
Clément Bœsch
733b13ad66 Merge commit 'e4128c08d786eb5513578e8c6063671ba03226ab'
* commit 'e4128c08d786eb5513578e8c6063671ba03226ab':
  Revert "hevc: x86: Refactor IDCT macro declarations"

So apparently this was technically correct be reverted due to
authorship. Reverted as well in FFmpeg for now...

See http://lists.libav.org/pipermail/libav-devel/2016-October/079560.html

Merged-by: Clément Bœsch <u@pkh.me>
2017-03-23 12:03:25 +01:00
Clément Bœsch
4bb4fa28e3 Merge commit '5801f9ed245ca5ebb57b0b5183de7a24aaece133'
* commit '5801f9ed245ca5ebb57b0b5183de7a24aaece133':
  h264_intrapred: x86: Update comments left behind in 95c89da36e

Merged-by: Clément Bœsch <u@pkh.me>
2017-03-23 11:58:01 +01:00
Clément Bœsch
9954d5b44e Merge commit 'd9dccc03890a976dba59d66ed3b5aceeaa33d14c'
* commit 'd9dccc03890a976dba59d66ed3b5aceeaa33d14c':
  hevc: x86: Refactor IDCT macro declarations

Merged-by: Clément Bœsch <u@pkh.me>
2017-03-23 11:54:53 +01:00
James Almer
30cadfe071 avcodec/lossless_videodsp: use ptrdiff_t for length parameters
Signed-off-by: James Almer <jamrial@gmail.com>
2017-03-22 18:38:35 -03:00
Clément Bœsch
af607b7e07 lavc/huffyuvdsp: only transmit the pix_fmt instead of the whole avctx
Only the pixel format is required in that init function. This will also
simplify the incoming merge.
2017-03-22 16:22:20 +01:00
Clément Bœsch
c66bd8f3ff Merge commit 'b57e38f52cc3f31a27105c28887d57cd6812c3eb'
* commit 'b57e38f52cc3f31a27105c28887d57cd6812c3eb':
  ac3dsp: x86: Replace inline asm for in-decoder downmixing with standalone asm

Merged-by: Clément Bœsch <u@pkh.me>
2017-03-22 12:49:29 +01:00
Clément Bœsch
e39d4ff150 Merge commit '43717469f9daa402f6acb48997255827a56034e9'
* commit '43717469f9daa402f6acb48997255827a56034e9':
  ac3dsp: Reverse matrix in/out order in downmix()

Merged-by: Clément Bœsch <u@pkh.me>
2017-03-22 11:29:46 +01:00
James Almer
aee046a895 x86/audiodsp: remove an unnecessary movss 2017-03-22 00:14:56 -03:00
James Almer
9a0fbb9ca9 Merge commit '2caa93b813adc5dbb7771dfe615da826a2947d18'
* commit '2caa93b813adc5dbb7771dfe615da826a2947d18':
  mpegaudiodsp: Change type of array stride parameters to ptrdiff_t

Merged-by: James Almer <jamrial@gmail.com>
2017-03-21 16:04:22 -03:00
James Almer
a8474df944 Merge commit 'e4a94d8b36c48d95a7d412c40d7b558422ff659c'
* commit 'e4a94d8b36c48d95a7d412c40d7b558422ff659c':
  h264chroma: Change type of stride parameters to ptrdiff_t

Merged-by: James Almer <jamrial@gmail.com>
2017-03-21 15:20:45 -03:00
James Almer
5a49097b42 Merge commit '2ec9fa5ec60dcd10e1cb10d8b4e4437e634ea428'
* commit '2ec9fa5ec60dcd10e1cb10d8b4e4437e634ea428':
  idct: Change type of array stride parameters to ptrdiff_t

Merged-by: James Almer <jamrial@gmail.com>
2017-03-21 14:29:52 -03:00
Clément Bœsch
f54da138e9 Merge commit '009adfd4fbdd78a890a4a65d6f141c467bb027fa'
* commit '009adfd4fbdd78a890a4a65d6f141c467bb027fa':
  x86: fpel: Remove unnecessary sign extend

Merged-by: Clément Bœsch <u@pkh.me>
2017-03-21 15:02:31 +01:00
Clément Bœsch
ad98af27f7 Merge commit 'de2ae3c1fae5a2eb539b9abd7bc2a9ca8c286ff0'
* commit 'de2ae3c1fae5a2eb539b9abd7bc2a9ca8c286ff0':
  lavc: add clobber tests for the new encoding/decoding API

The merge only re-order what we already have.

Merged-by: Clément Bœsch <u@pkh.me>
2017-03-21 14:43:53 +01:00
Clément Bœsch
83cd80d10a Merge commit '12004a9a7f20e44f4da2ee6c372d5e1794c8d6c5'
* commit '12004a9a7f20e44f4da2ee6c372d5e1794c8d6c5':
  audiodsp/x86: yasmify vector_clipf_sse
  audiodsp: reorder arguments for vector_clipf

Merged the version from Libav after a discussion with James Almer on
IRC:

19:22 <ubitux> jamrial: opinion on 12004a9a7f20e44f4da2ee6c372d5e1794c8d6c5?
19:23 <ubitux> it was apparently yasmified differently
19:23 <ubitux> (it depends on the previous commit arg shuffle)
19:24 <ubitux> i don't see the magic movsxdifnidn in your port btw
19:24 <ubitux> it's a port from 1d36defe94
19:25 <jamrial> seems better thanks to said arg shuffle
19:25 <jamrial> the loop is the same, but init is simpler
19:25 <jamrial> probably worth merging
19:25 <ubitux> OK
19:25 <ubitux> thanks
19:26 <jamrial> curious they didn't make len ptrdiff_t after the previous bunch of commits, heh
19:26 <ubitux> yeah indeed

Both commits are merged at the same time to prevent a conflict with our
existing yasmified ff_vector_clipf_sse.

Merged-by: Clément Bœsch <u@pkh.me>
2017-03-20 22:35:07 +01:00
Clément Bœsch
43a4c729d4 Merge commit '75d98e30afab61542faab3c0f11880834653bd6b'
* commit '75d98e30afab61542faab3c0f11880834653bd6b':
  audiodsp/x86: clear the high bits of the order parameter on 64bit

Merged-by: Clément Bœsch <u@pkh.me>
2017-03-20 18:44:00 +01:00
Clément Bœsch
072fad7cf5 Merge commit '1d6c76e11febb58738c9647c47079d02b5e10094'
* commit '1d6c76e11febb58738c9647c47079d02b5e10094':
  audiodsp/x86: fix ff_vector_clip_int32_sse2

No functionnal changes, only cosmetics. This issue was fixed in
9a9e2f1c8a.

Merged-by: Clément Bœsch <u@pkh.me>
2017-03-20 18:42:37 +01:00
Clément Bœsch
e07fa3008b Merge commit 'de452e503734ebb0fdbce86e9d16693b3530fad3'
* commit 'de452e503734ebb0fdbce86e9d16693b3530fad3':
  pixblockdsp: Change type of stride parameters to ptrdiff_t

Merged-by: Clément Bœsch <u@pkh.me>
2017-03-20 15:58:32 +01:00
Ilia
2f3d10a01a avcodec/vp9: avx2 implementation of ipred_dl_16x16_16
vp9_diag_downleft_16x16_10bpp_c: 263.0
vp9_diag_downleft_16x16_10bpp_sse2: 44.7
vp9_diag_downleft_16x16_10bpp_ssse3: 32.5
vp9_diag_downleft_16x16_10bpp_avx: 31.9
vp9_diag_downleft_16x16_10bpp_avx2: 25.7
vp9_diag_downleft_16x16_12bpp_c: 264.7
vp9_diag_downleft_16x16_12bpp_sse2: 44.4
vp9_diag_downleft_16x16_12bpp_ssse3: 32.0
vp9_diag_downleft_16x16_12bpp_avx: 32.4
vp9_diag_downleft_16x16_12bpp_avx2: 25.5

Benchmarked with 10000 runs

Signed-off-by: Ilia <zakne0ne@gmail.com>
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
2017-03-20 09:47:43 -04:00
Mirage Abeysekara
5eb4f95bef h264pred: added AVX2 implementation for tm_vp8 16x16.
checkasm --bench results with 5000 runs

pred16x16_tm_vp8_c: 302.8
pred16x16_tm_vp8_mmx: 101.4
pred16x16_tm_vp8_mmxext: 95.5
pred16x16_tm_vp8_sse2: 95.1
pred16x16_tm_vp8_avx2: 38.2

Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
2017-03-20 09:45:42 -04:00
James Almer
6966a5e4d7 Merge commit '721d57e608dc4fd6c86f27c5ae76ef559d646220'
* commit '721d57e608dc4fd6c86f27c5ae76ef559d646220':
  vp56: Separate VP5 and VP6 dsp initialization

Merged-by: James Almer <jamrial@gmail.com>
2017-03-19 17:15:24 -03:00
James Almer
663640d745 Merge commit '3fd22538bc0e0de84b31335266b4b1577d3d609e'
* commit '3fd22538bc0e0de84b31335266b4b1577d3d609e':
  prores: Change type of stride parameters to ptrdiff_t

Merged-by: James Almer <jamrial@gmail.com>
2017-03-19 15:30:13 -03:00
James Almer
aec42ebc27 Merge commit 'f81be06cf614919d71ded29b8f595bef40123ad8'
* commit 'f81be06cf614919d71ded29b8f595bef40123ad8':
  cavs: Change type of stride parameters to ptrdiff_t

Merged-by: James Almer <jamrial@gmail.com>
2017-03-19 15:23:52 -03:00
James Almer
4e4dfcac58 Merge commit '802727b538b484e3f9d1345bfcc4ab24cfea8898'
* commit '802727b538b484e3f9d1345bfcc4ab24cfea8898':
  vp8: Update some assembly comments left unchanged in bd66f073fe

Merged-by: James Almer <jamrial@gmail.com>
2017-03-19 15:18:31 -03:00
James Almer
4004d33fcb Merge commit 'd9d26a3674f31f482f54e936fcb382160830877a'
* commit 'd9d26a3674f31f482f54e936fcb382160830877a':
  vp56: Change type of stride parameters to ptrdiff_t

Merged-by: James Almer <jamrial@gmail.com>
2017-03-19 14:54:25 -03:00
Clément Bœsch
6a42a54b9d Merge commit '6892df9294d93322d43255ada299507465bc93c8'
* commit '6892df9294d93322d43255ada299507465bc93c8':
  vp3: Change type of stride parameters to ptrdiff_t

Merged-by: Clément Bœsch <u@pkh.me>
2017-03-19 18:41:26 +01:00
Clément Bœsch
8695ce73ca Merge commit 'e2b9993558b6adee42dcc6eb385a14943aaca974'
* commit 'e2b9993558b6adee42dcc6eb385a14943aaca974':
  simple_idct: x86: Drop disabled IDCT implementation

Merged-by: Clément Bœsch <u@pkh.me>
2017-03-19 16:11:11 +01:00
Clément Bœsch
8286c359ad Merge commit 'e99ecda55082cb9dde8fd349361e169dc383943a'
* commit 'e99ecda55082cb9dde8fd349361e169dc383943a':
  checkasm: add vp9 MC tests.
  vp9mc/x86: sse2 MC assembly.
  vp9mc/x86: add AVX and AVX2 MC
  vp9mc/x86: rename ff_* to ff_vp9_*
  vp9mc/x86: rename ff_avg[48]_sse to ff_avg[48]_mmxext
  vp9mc/x86: simplify a few inits.
  vp9mc/x86: add 16px functions (64bit only).

Noop (aside from a formatting comment in vp9mc.asm). We already have all
of this. We should consider making a final diff between the two projects
when the dust comes down.

Merged-by: Clément Bœsch <u@pkh.me>
2017-03-16 20:25:39 +01:00
Clément Bœsch
a4f5e79f7c Merge commit '89466de4aeaf5e359489b81b8a9920a2bc7936d6'
* commit '89466de4aeaf5e359489b81b8a9920a2bc7936d6':
  vp9/x86: rename vp9dsp to vp9mc

File was already renamed, only the top description is updated.

Merged-by: Clément Bœsch <u@pkh.me>
2017-03-16 20:10:47 +01:00
James Almer
e632fe9bab Merge commit '3c504bc3599f00bfc5923adc114beef34bce11d0'
* commit '3c504bc3599f00bfc5923adc114beef34bce11d0':
  x86: deduplicate some constants

Merged-by: James Almer <jamrial@gmail.com>
2017-03-15 22:07:28 -03:00
Michael Niedermayer
835d9f299c avcodec/x86/cavsdsp: Put MMX code under mmx check
Without this the FPU state becomes trashed and causes mysterious
fate failures with cpuflags=0

Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2017-03-06 16:47:17 +01:00
James Darnley
33de0fee2c avcodec/h264: enable sse2 chroma deblock/loop filter functions
Between 1.00 and 1.16 times faster on Intel Yorkfield Core 2 Quad.
Between 1.11 and 1.39 times faster on Intel Kaby Lake Pentium.
2017-02-27 13:22:06 +01:00
James Darnley
cd893b9307 avcodec/h264: add avx 8-bit 4:2:2 chroma h intra deblock/loop filter
~1.37x faster (147 vs. 108 cycles) compared to mmxext function
2017-02-27 13:22:06 +01:00
James Darnley
0e16b3e2be avcodec/h264: add avx 8-bit 4:2:0 chroma h intra deblock/loop filter
~1.10x faster (69 vs. 63 cycles) compared to mmxext function
2017-02-27 13:22:06 +01:00
James Darnley
987ffe4b8d avcodec/h264: add avx 8-bit chroma v intra deblock/loop filter
~1.14x faster (90 vs 78 cycles) compared with mmxext
2017-02-27 13:22:06 +01:00
James Darnley
88307b3eec avcodec/h264: add avx 8-bit 4:2:2 chroma h deblock/loop filter
~1.21x faster (68 vs. 56 cycles) compared with mmxext function
2017-02-27 13:22:06 +01:00
James Darnley
ac096fc82d avcodec/h264: add avx 8-bit 4:2:0 chroma h deblock/loop filter
~1.14x faster (93 vs. 81 cycles) compared with mmxext function
2017-02-27 13:22:06 +01:00
James Darnley
5c56758843 avcodec/h264: add avx 8-bit chroma v deblock/loop filter
~1.24x faster (101 vs. 81 cycles) compared with mmxext function
2017-02-27 13:22:06 +01:00
James Darnley
5336887867 avcodec/h264: sse2, avx h luma mbaff deblock/loop filter
x86-64 only

Yorkfield:
- sse2: ~2.17x (434 vs. 200 cycles)

Nehalem:
- sse2: ~2.94x (409 vs. 139 cycles)

Skylake:
- sse2: ~3.10x (370 vs. 119 cycles)
- avx:  ~3.29x (370 vs. 112 cycles)
2017-02-18 20:26:52 +01:00
James Darnley
e18bc2114f avcodec/h264: add named parameters to x86 function 2017-02-18 20:26:50 +01:00
James Darnley
9d815b7424 avcodec/x86: deduplicate PASS8ROWS macro 2017-02-18 20:26:49 +01:00