Lynne
ace42cf581
x86/tx_float: add 15xN PFA FFT AVX SIMD
...
~4x faster than the C version.
The shuffles in the 15pt dim1 are seriously expensive. Not happy with it,
but I'm content.
Can be easily converted to pure AVX by removing all vpermpd/vpermps
instructions.
2022-09-23 12:35:27 +02:00