Lynne
ace42cf581
x86/tx_float: add 15xN PFA FFT AVX SIMD
~4x faster than the C version.
The shuffles in the 15pt dim1 are seriously expensive. Not happy with it,
but I'm content.
Can be easily converted to pure AVX by removing all vpermpd/vpermps
instructions.
2022-09-23 12:35:27 +02:00