Fast Fourier Transform in Rust

https://github.com/QuState/PhastFT

Build codecov unsafe forbidden

PhastFT

PhastFT is a high-performance, "quantum-inspired" Fast Fourier Transform (FFT) library written in pure Rust.

Features

  • Simple implementation using the Cooley-Tukey FFT algorithm
  • Performance on par with other Rust FFT implementations
  • Zero unsafe code
  • Takes advantage of latest CPU features up to and including AVX-512, but performs well even without them
  • Selects the fastest implementation at runtime. No need for -C target-cpu=native!
  • Optional parallelization of some steps to 2 threads (with even more planned)
  • 2x lower memory usage than RustFFT
  • Python bindings (via PyO3)

Limitations

  • Only supports input with a length of 2^n (i.e., a power of 2) -- input should be padded with zeros to the next power of 2
  • Requires nightly Rust compiler due to use of portable SIMD

Planned features

  • Bluestein's algorithm (to handle arbitrary sized FFTs)
  • More multi-threading
  • More work on cache-optimal FFT

How is it so fast?

PhastFT is designed around the capabilities and limitations of modern hardware (that is, anything made in the last 10 years or so).

The two major bottlenecks in FFT are the CPU cycles and memory accesses.

We picked an efficient, general-purpose FFT algorithm. Our implementation can make use of latest CPU features such as AVX-512, but performs well even without them.

Our key insight for speeding up memory accesses is that FFT is equivalent to applying gates to all qubits in [0, n). This creates the opportunity to leverage the same memory access patterns as a high-performance quantum state simulator.

We also use the Cache-Optimal Bit Reversal Algorithm (COBRA) on large datasets and optionally run it on 2 parallel threads, accelerating it even further.

All of this combined results in a fast and efficient FFT implementation competitive with the performance of existing Rust FFT crates, including RustFFT, while using significantly less memory.

Quickstart

Rust

use phastft::{
    planner::Direction,
    fft_64
};
let big_n = 1 << 10;
let mut reals: Vec<f64> = (1..=big_n).map(|i| i as f64).collect();
let mut imags: Vec<f64> = (1..=big_n).map(|i| i as f64).collect();
fft_64(&mut reals, &mut imags, Direction::Forward);

Python

Follow the instructions at https://rustup.rs/ to install Rust, then switch to the nightly channel with

Then you can install PhastFT itself:

pip install numpy
pip install git+https://github.com/QuState/PhastFT#subdirectory=pyphastft
import numpy as np
from pyphastft import fft
sig_re = np.asarray(sig_re, dtype=np.float64)
sig_im = np.asarray(sig_im, dtype=np.float64)
fft(a_re, a_im)

Normalization

phastft does not normalize outputs. Users can normalize outputs after running a forward FFT followed by an inverse FFT by scaling each element by 1/N, where N is the number of data points.

Output Order

phastft always finishes processing input data by running a bit-reversal permutation on the processed data.

Benchmarks

PhastFT is benchmarked against several other FFT libraries. Scripts and instructions to reproduce benchmark results and plots are available here.

PhastFT vs. RustFFT vs. FFTW3 PhastFT vs. RustFFT vs. FFTW3

PhastFT vs. NumPy FFT vs. pyFFTW PhastFT vs. NumPy FFT vs. pyFFTW

Contributing

Contributions to PhastFT are welcome! If you find any issues or have improvements to suggest, please open an issue or submit a pull request. Follow the contribution guidelines outlined in the CONTRIBUTING.md file.

License

PhastFT is licensed under MIT or Apache 2.0 license, at your option.

PhastFT vs. RustFFT

RustFFT is another excellent FFT implementation in pure Rust. RustFFT and PhastFT make different trade-offs.

RustFFT made the choice to work on stable Rust compiler at the cost of unsafe code, while PhastFT contains no unsafe blocks but requires a nightly build of Rust compiler to access the Portable SIMD API.

RustFFT implements multiple FFT algorithms and tries to pick the best one depending on the workload, while PhastFT has a single FFT implementation and still achieves competitive performance.

PhastFT uses 2x less memory than RustFFT, which is important for processing large datasets.

What's with the name?

The name, PhastFT, is derived from the implementation of the Quantum Fourier Transform (QFT). Namely, the quantum circuit implementation of QFT consists of the Phase gates and Hadamard gates. Hence, PhastFT.

{
"by": "logicalguess",
"descendants": 10,
"id": 40231919,
"kids": [
40235261,
40235821,
40244412,
40231920,
40235023,
40235769
],
"score": 55,
"time": 1714614567,
"title": "Fast Fourier Transform in Rust",
"type": "story",
"url": "https://github.com/QuState/PhastFT"
}
{
"author": "QuState",
"date": null,
"description": "A high-performance, “quantum-inspired” Fast Fourier Transform (FFT) library written in pure and safe Rust. - QuState/PhastFT",
"image": "https://opengraph.githubassets.com/a2e6297ceced68b2e6ddee16088176e7d0343707d8017c5c6523c26728e446d1/QuState/PhastFT",
"logo": "https://logo.clearbit.com/github.com",
"publisher": "GitHub",
"title": "GitHub - QuState/PhastFT: A high-performance, “quantum-inspired” Fast Fourier Transform (FFT) library written in pure and safe Rust.",
"url": "https://github.com/QuState/PhastFT"
}
{
"url": "https://github.com/QuState/PhastFT",
"title": "GitHub - QuState/PhastFT: A high-performance, \"quantum-inspired\" Fast Fourier Transform (FFT) library written in pure and safe Rust.",
"description": "PhastFT PhastFT is a high-performance, \"quantum-inspired\" Fast Fourier Transform (FFT) library written in pure Rust. Features Simple implementation using the Cooley-Tukey FFT algorithm Performance on par...",
"links": [
"https://github.com/QuState/PhastFT"
],
"image": "https://opengraph.githubassets.com/a2e6297ceced68b2e6ddee16088176e7d0343707d8017c5c6523c26728e446d1/QuState/PhastFT",
"content": "<div><article><p><a target=\"_blank\" href=\"https://github.com/QuState/PhastFT/actions/workflows/rust.yml\"><img src=\"https://github.com/QuState/PhastFT/actions/workflows/rust.yml/badge.svg\" alt=\"Build\" /></a>\n<a target=\"_blank\" href=\"https://codecov.io/gh/QuState/PhastFT\"><img src=\"https://camo.githubusercontent.com/5e25c0c038730548ea255c47fdd2a61c53deba2a88299b83674c22829d43dd2a/68747470733a2f2f636f6465636f762e696f2f67682f517553746174652f506861737446542f67726170682f62616467652e7376673f746f6b656e3d494d3836584d5552484e\" alt=\"codecov\" /></a>\n<a target=\"_blank\" href=\"https://github.com/rust-secure-code/safety-dance/\"><img src=\"https://camo.githubusercontent.com/ed5ba7b59dd75f96c13b6c491b35a18efe633a9599e8b9e28d7360de154297c6/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f756e736166652d666f7262696464656e2d737563636573732e737667\" alt=\"unsafe forbidden\" /></a>\n<a target=\"_blank\" href=\"https://crates.io/crates/phastft\"><img src=\"https://camo.githubusercontent.com/fa11d202ea333381c1182f15745f1852099edccb2a0bb92c488f69e457baa24e/68747470733a2f2f696d672e736869656c64732e696f2f6372617465732f762f70686173746674\" /></a>\n<a target=\"_blank\" href=\"https://docs.rs/phastft/\"><img src=\"https://camo.githubusercontent.com/97b57a332b885c110ee1fc73c4ee8ee8018e1478717fb9503562c534f37be9bc/68747470733a2f2f646f63732e72732f706861737466742f62616467652e737667\" /></a></p>\n<p></p><h2>PhastFT</h2><a target=\"_blank\" href=\"https://github.com/QuState/PhastFT#phastft\"></a><p></p>\n<p>PhastFT is a high-performance, \"quantum-inspired\" Fast Fourier\nTransform (FFT) library written in pure Rust.</p>\n<p></p><h2>Features</h2><a target=\"_blank\" href=\"https://github.com/QuState/PhastFT#features\"></a><p></p>\n<ul>\n<li>Simple implementation using the Cooley-Tukey FFT algorithm</li>\n<li>Performance on par with other Rust FFT implementations</li>\n<li>Zero <code>unsafe</code> code</li>\n<li>Takes advantage of latest CPU features up to and including <code>AVX-512</code>, but performs well even without them</li>\n<li>Selects the fastest implementation at runtime. No need for <code>-C target-cpu=native</code>!</li>\n<li>Optional parallelization of some steps to 2 threads (with even more planned)</li>\n<li>2x lower memory usage than <a target=\"_blank\" href=\"https://crates.io/crates/rustfft/\">RustFFT</a></li>\n<li>Python bindings (via <a target=\"_blank\" href=\"https://github.com/PyO3/pyo3\">PyO3</a>)</li>\n</ul>\n<p></p><h2>Limitations</h2><a target=\"_blank\" href=\"https://github.com/QuState/PhastFT#limitations\"></a><p></p>\n<ul>\n<li>Only supports input with a length of <code>2^n</code> (i.e., a power of 2) -- input should be padded with zeros to the next power\nof 2</li>\n<li>Requires nightly Rust compiler due to use of portable SIMD</li>\n</ul>\n<p></p><h2>Planned features</h2><a target=\"_blank\" href=\"https://github.com/QuState/PhastFT#planned-features\"></a><p></p>\n<ul>\n<li>Bluestein's algorithm (to handle arbitrary sized FFTs)</li>\n<li>More multi-threading</li>\n<li>More work on cache-optimal FFT</li>\n</ul>\n<p></p><h2>How is it so fast?</h2><a target=\"_blank\" href=\"https://github.com/QuState/PhastFT#how-is-it-so-fast\"></a><p></p>\n<p>PhastFT is designed around the capabilities and limitations of modern hardware (that is, anything made in the last 10\nyears or so).</p>\n<p>The two major bottlenecks in FFT are the <strong>CPU cycles</strong> and <strong>memory accesses</strong>.</p>\n<p>We picked an efficient, general-purpose FFT algorithm. Our implementation can make use of latest CPU features such as\n<code>AVX-512</code>, but performs well even without them.</p>\n<p>Our key insight for speeding up memory accesses is that FFT is equivalent to applying gates to all qubits in <code>[0, n)</code>.\nThis creates the opportunity to leverage the same memory access patterns as\na <a target=\"_blank\" href=\"https://github.com/QuState/spinoza\">high-performance quantum state simulator</a>.</p>\n<p>We also use the Cache-Optimal Bit Reversal\nAlgorithm (<a target=\"_blank\" href=\"https://csaws.cs.technion.ac.il/~itai/Courses/Cache/bit.pdf\">COBRA</a>)\non large datasets and optionally run it on 2 parallel threads, accelerating it even further.</p>\n<p>All of this combined results in a fast and efficient FFT implementation competitive with\nthe performance of existing Rust FFT crates,\nincluding <a target=\"_blank\" href=\"https://crates.io/crates/rustfft/\">RustFFT</a>, while using significantly less memory.</p>\n<p></p><h2>Quickstart</h2><a target=\"_blank\" href=\"https://github.com/QuState/PhastFT#quickstart\"></a><p></p>\n<p></p><h3>Rust</h3><a target=\"_blank\" href=\"https://github.com/QuState/PhastFT#rust\"></a><p></p>\n<div><pre><span>use</span> phastft<span>::</span><span>{</span>\n planner<span>::</span><span>Direction</span><span>,</span>\n fft_64\n<span>}</span><span>;</span>\n<span>let</span> big_n = <span>1</span> &lt;&lt; <span>10</span><span>;</span>\n<span>let</span> <span>mut</span> reals<span>:</span> <span>Vec</span><span>&lt;</span><span>f64</span><span>&gt;</span> = <span>(</span><span>1</span>..=big_n<span>)</span><span>.</span><span>map</span><span>(</span>|i| i <span>as</span> <span>f64</span><span>)</span><span>.</span><span>collect</span><span>(</span><span>)</span><span>;</span>\n<span>let</span> <span>mut</span> imags<span>:</span> <span>Vec</span><span>&lt;</span><span>f64</span><span>&gt;</span> = <span>(</span><span>1</span>..=big_n<span>)</span><span>.</span><span>map</span><span>(</span>|i| i <span>as</span> <span>f64</span><span>)</span><span>.</span><span>collect</span><span>(</span><span>)</span><span>;</span>\n<span>fft_64</span><span>(</span><span>&amp;</span><span>mut</span> reals<span>,</span> <span>&amp;</span><span>mut</span> imags<span>,</span> <span>Direction</span><span>::</span><span>Forward</span><span>)</span><span>;</span></pre></div>\n<p></p><h3>Python</h3><a target=\"_blank\" href=\"https://github.com/QuState/PhastFT#python\"></a><p></p>\n<p>Follow the instructions at <a target=\"_blank\" href=\"https://rustup.rs/\">https://rustup.rs/</a> to install Rust, then switch to the nightly channel with</p>\n<p>Then you can install PhastFT itself:</p>\n<div><pre>pip install numpy\npip install git+https://github.com/QuState/PhastFT#subdirectory=pyphastft</pre></div>\n<div><pre><span>import</span> <span>numpy</span> <span>as</span> <span>np</span>\n<span>from</span> <span>pyphastft</span> <span>import</span> <span>fft</span>\n<span>sig_re</span> <span>=</span> <span>np</span>.<span>asarray</span>(<span>sig_re</span>, <span>dtype</span><span>=</span><span>np</span>.<span>float64</span>)\n<span>sig_im</span> <span>=</span> <span>np</span>.<span>asarray</span>(<span>sig_im</span>, <span>dtype</span><span>=</span><span>np</span>.<span>float64</span>)\n<span>fft</span>(<span>a_re</span>, <span>a_im</span>)</pre></div>\n<p></p><h3>Normalization</h3><a target=\"_blank\" href=\"https://github.com/QuState/PhastFT#normalization\"></a><p></p>\n<p><code>phastft</code> does not normalize outputs. Users can normalize outputs after running a forward FFT followed by an inverse\nFFT by scaling each element by <code>1/N</code>, where <code>N</code> is the number of data points.</p>\n<p></p><h3>Output Order</h3><a target=\"_blank\" href=\"https://github.com/QuState/PhastFT#output-order\"></a><p></p>\n<p><code>phastft</code> always finishes processing input data by running\na <a target=\"_blank\" href=\"https://en.wikipedia.org/wiki/Bit-reversal_permutation\">bit-reversal permutation</a> on the processed data.</p>\n<p></p><h2>Benchmarks</h2><a target=\"_blank\" href=\"https://github.com/QuState/PhastFT#benchmarks\"></a><p></p>\n<p>PhastFT is benchmarked against several other FFT libraries. Scripts and instructions to reproduce benchmark results and\nplots are available <a target=\"_blank\" href=\"https://github.com/QuState/PhastFT/tree/main/benches#readme\">here</a>.</p>\n<p>\n <a target=\"_blank\" href=\"https://raw.githubusercontent.com/QuState/PhastFT/main/assets/benchmarks_bar_plot_4_12.png\"><img src=\"https://raw.githubusercontent.com/QuState/PhastFT/main/assets/benchmarks_bar_plot_4_12.png\" title=\"PhastFT vs. RustFFT vs. FFTW3\" alt=\"PhastFT vs. RustFFT vs. FFTW3\" /></a>\n <a target=\"_blank\" href=\"https://raw.githubusercontent.com/QuState/PhastFT/main/assets/benchmarks_bar_plot_13_29.png\"><img src=\"https://raw.githubusercontent.com/QuState/PhastFT/main/assets/benchmarks_bar_plot_13_29.png\" title=\"PhastFT vs. RustFFT vs. FFTW3\" alt=\"PhastFT vs. RustFFT vs. FFTW3\" /></a>\n</p>\n<p>\n <a target=\"_blank\" href=\"https://raw.githubusercontent.com/QuState/PhastFT/main/assets/py_benchmarks_bar_plot_0_8.png\"><img src=\"https://raw.githubusercontent.com/QuState/PhastFT/main/assets/py_benchmarks_bar_plot_0_8.png\" title=\"PhastFT vs. NumPy FFT vs. pyFFTW\" alt=\"PhastFT vs. NumPy FFT vs. pyFFTW\" /></a>\n <a target=\"_blank\" href=\"https://raw.githubusercontent.com/QuState/PhastFT/main/assets/py_benchmarks_bar_plot_9_28.png\"><img src=\"https://raw.githubusercontent.com/QuState/PhastFT/main/assets/py_benchmarks_bar_plot_9_28.png\" title=\"PhastFT vs. NumPy FFT vs. pyFFTW\" alt=\"PhastFT vs. NumPy FFT vs. pyFFTW\" /></a>\n</p>\n<p></p><h2>Contributing</h2><a target=\"_blank\" href=\"https://github.com/QuState/PhastFT#contributing\"></a><p></p>\n<p>Contributions to PhastFT are welcome! If you find any issues or have improvements to suggest, please open an issue or\nsubmit a pull request. Follow the contribution guidelines outlined in the CONTRIBUTING.md file.</p>\n<p></p><h2>License</h2><a target=\"_blank\" href=\"https://github.com/QuState/PhastFT#license\"></a><p></p>\n<p>PhastFT is licensed under MIT or Apache 2.0 license, at your option.</p>\n<p></p><h2>PhastFT vs. RustFFT</h2><a target=\"_blank\" href=\"https://github.com/QuState/PhastFT#phastft-vs-rustfft\"></a><p></p>\n<p><a target=\"_blank\" href=\"https://crates.io/crates/rustfft/\">RustFFT</a> is another excellent FFT implementation in pure Rust.\nRustFFT and PhastFT make different trade-offs.</p>\n<p>RustFFT made the choice to work on stable Rust compiler at the cost of <code>unsafe</code> code,\nwhile PhastFT contains no <code>unsafe</code> blocks but requires a nightly build of Rust compiler\nto access the Portable SIMD API.</p>\n<p>RustFFT implements multiple FFT algorithms and tries to pick the best one depending on the workload,\nwhile PhastFT has a single FFT implementation and still achieves competitive performance.</p>\n<p>PhastFT uses 2x less memory than RustFFT, which is important for processing large datasets.</p>\n<p></p><h2>What's with the name?</h2><a target=\"_blank\" href=\"https://github.com/QuState/PhastFT#whats-with-the-name\"></a><p></p>\n<p>The name, <strong>PhastFT</strong>, is derived from the implementation of the\n<a target=\"_blank\" href=\"https://en.wikipedia.org/wiki/Quantum_Fourier_transform\">Quantum Fourier Transform</a> (QFT). Namely, the\n<a target=\"_blank\" href=\"https://en.wikipedia.org/wiki/Quantum_Fourier_transform#Circuit_implementation\">quantum circuit implementation of QFT</a>\nconsists of the <strong>P</strong>hase gates and <strong>H</strong>adamard gates. Hence, <strong>Ph</strong>astFT.</p>\n</article></div>",
"author": "",
"favicon": "https://github.githubassets.com/favicons/favicon.svg",
"source": "github.com",
"published": "",
"ttr": 126,
"type": "object"
}