Cyclotomic fast Fourier transform

The cyclotomic fast Fourier transform is a type of fast Fourier transform algorithm over finite fields. This algorithm first decomposes a discrete Fourier transform into several circular convolutions. This then derives the discrete Fourier transform results from the circular convolution results. When applied to a discrete Fourier transform over G F ( 2 m ) {\displaystyle \mathrm {GF} (2^{m})}, this algorithm has a very low multiplicative complexity. In practice, since there usually exist efficient algorithms for circular convolutions with specific lengths, this algorithm is very efficient.

Background

The discrete Fourier transform over finite fields finds widespread application in the decoding of error-correcting codes such as BCH codes and Reed–Solomon codes. Generalized from the complex field, a discrete Fourier transform of a sequence { f i } 0 N − 1 {\displaystyle \{f_{i}\}_{0}^{N-1}} over a finite field G F ( p m ) {\displaystyle \mathrm {GF} (p^{m})} is defined as F j = ∑ i = 0 N − 1 f i α i j , 0 ≤ j ≤ N − 1 , {\displaystyle F_{j}=\sum _{i=0}^{N-1}f_{i}\alpha ^{ij},0\leq j\leq N-1,} where α {\displaystyle \alpha } is the N-th primitive root of 1 in G F ( p m ) {\displaystyle \mathrm {GF} (p^{m})}. If the polynomial representation of { f i } 0 N − 1 {\displaystyle \{f_{i}\}_{0}^{N-1}} can defined as f ( x ) = f 0 + f 1 x + f 2 x 2 + ⋯ + f N − 1 x N − 1 = ∑ 0 N − 1 f i x i , {\displaystyle f(x)=f_{0}+f_{1}x+f_{2}x^{2}+\cdots +f_{N-1}x^{N-1}=\sum _{0}^{N-1}f_{i}x^{i},} it is easy to see that F j {\displaystyle F_{j}} is simply f ( α j ) {\displaystyle f(\alpha ^{j})}. That is, the discrete Fourier transform of a sequence converts it to a polynomial evaluation problem.

Written in matrix format, F = [ F 0 F 1 ⋮ F N − 1 ] = [ α 0 α 0 ⋯ α 0 α 0 α 1 ⋯ α N − 1 ⋮ ⋮ ⋱ ⋮ α 0 α N − 1 ⋯ α ( N − 1 ) ( N − 1 ) ] [ f 0 f 1 ⋮ f N − 1 ] = F f . {\displaystyle \mathbf {F} =\left[{\begin{matrix}F_{0}\\F_{1}\\\vdots \\F_{N-1}\end{matrix}}\right]=\left[{\begin{matrix}\alpha ^{0}&\alpha ^{0}&\cdots &\alpha ^{0}\\\alpha ^{0}&\alpha ^{1}&\cdots &\alpha ^{N-1}\\\vdots &\vdots &\ddots &\vdots \\\alpha ^{0}&\alpha ^{N-1}&\cdots &\alpha ^{(N-1)(N-1)}\end{matrix}}\right]\left[{\begin{matrix}f_{0}\\f_{1}\\\vdots \\f_{N-1}\end{matrix}}\right]={\mathcal {F}}\mathbf {f} .}

Direct evaluation of the discrete Fourier transform has an O ( N 2 ) {\displaystyle O(N^{2})} complexity. Fast Fourier transforms are just efficient algorithms evaluating the above matrix-vector product.

Algorithm

First, define a linearized polynomial over G F ( p m ) {\displaystyle \mathrm {GF} (p^{m})} as L ( x ) = ∑ i l i x p i , l i ∈ G F ( p m ) . {\displaystyle L(x)=\sum _{i}l_{i}x^{p^{i}},l_{i}\in \mathrm {GF} (p^{m}).} Here L ( x ) {\displaystyle L(x)} is called linearized, because L ( x 1 + x 2 ) = L ( x 1 ) + L ( x 2 ) {\displaystyle L(x_{1}+x_{2})=L(x_{1})+L(x_{2})}, which comes from the fact that for elements x 1 , x 2 ∈ G F ( p m ) {\displaystyle x_{1},x_{2}\in \mathrm {GF} (p^{m})} and ( x 1 + x 2 ) p = x 1 p + x 2 p {\displaystyle (x_{1}+x_{2})^{p}=x_{1}^{p}+x_{2}^{p}}.

Notice that p {\displaystyle p} is invertible modulo N {\displaystyle N} because N {\displaystyle N} must divide the order p m − 1 {\displaystyle p^{m}-1} of the multiplicative group of the field G F ( p m ) {\displaystyle \mathrm {GF} (p^{m})}. So, the elements { 0 , 1 , 2 , … , N − 1 } {\displaystyle \{0,1,2,\ldots ,N-1\}} can be partitioned into l + 1 {\displaystyle l+1} cyclotomic cosets modulo N {\displaystyle N}: { 0 } , { k 1 , p k 1 , p 2 k 1 , … , p m 1 − 1 k 1 } , … , { k l , p k l , p 2 k l , … , p m l − 1 k l } , {\displaystyle {\begin{aligned}&\{0\},\\&\{k_{1},pk_{1},p^{2}k_{1},\ldots ,p^{m_{1}-1}k_{1}\},\\&\ldots ,\\&\{k_{l},pk_{l},p^{2}k_{l},\ldots ,p^{m_{l}-1}k_{l}\},\end{aligned}}} where k i = p m i k i ( mod N ) {\displaystyle k_{i}=p^{m_{i}}k_{i}{\pmod {N}}}. Therefore, the input to the Fourier transform can be rewritten as f ( x ) = ∑ i = 0 l L i ( x k i ) , L i ( y ) = ∑ t = 0 m i − 1 y p t f p t k i mod N . {\displaystyle f(x)=\sum _{i=0}^{l}L_{i}(x^{k_{i}}),\quad L_{i}(y)=\sum _{t=0}^{m_{i}-1}y^{p^{t}}f_{p^{t}k_{i}{\bmod {N}}}.}

In this way, the polynomial representation is decomposed into a sum of linear polynomials. Hence, F j {\displaystyle F_{j}} is given by F j = f ( α j ) = ∑ i = 0 l L i ( α j k i ) {\displaystyle F_{j}=f(\alpha ^{j})=\sum _{i=0}^{l}L_{i}(\alpha ^{jk_{i}})}. Expanding α j k i ∈ G F ( p m i ) {\displaystyle \alpha ^{jk_{i}}\in \mathrm {GF} (p^{m_{i}})} with the proper basis { β i , 0 , β i , 1 , … , β i , m i − 1 } {\displaystyle \{\beta _{i,0},\beta _{i,1},\ldots ,\beta _{i,m_{i}-1}\}} yields α j k i = ∑ s = 0 m i − 1 a i j s β i , s {\displaystyle \alpha ^{jk_{i}}=\sum _{s=0}^{m_{i}-1}a_{ijs}\beta _{i,s}}, where a i j s ∈ G F ( p ) {\displaystyle a_{ijs}\in \mathrm {GF} (p)}. By the property of the linearized polynomial L i ( x ) {\displaystyle L_{i}(x)}, this yields F j = ∑ i = 0 l ∑ s = 0 m i − 1 a i j s ( ∑ t = 0 m i − 1 β i , s p t f p t k i mod N ) . {\displaystyle F_{j}=\sum _{i=0}^{l}\sum _{s=0}^{m_{i}-1}a_{ijs}\left(\sum _{t=0}^{m_{i}-1}\beta _{i,s}^{p^{t}}f_{p^{t}k_{i}{\bmod {N}}}\right).}

This equation can be rewritten in matrix form as F = A L Π f {\displaystyle \mathbf {F} =\mathbf {AL\Pi f} }, where A {\displaystyle \mathbf {A} } is an N × N {\displaystyle N\times N} matrix over G F ( p ) {\displaystyle \mathrm {GF} (p)} that contains the elements a i j s {\displaystyle a_{ijs}}, L {\displaystyle \mathbf {L} } is a block diagonal matrix, and Π {\displaystyle \mathbf {\Pi } } is a permutation matrix regrouping the elements in f {\displaystyle \mathbf {f} } according to the cyclotomic coset index.

Note that if the normal basis { γ i p 0 , γ i p 1 , ⋯ , γ i p m i − 1 } {\displaystyle \{\gamma _{i}^{p^{0}},\gamma _{i}^{p^{1}},\cdots ,\gamma _{i}^{p^{m_{i}-1}}\}} is used to expand the field elements of G F ( p m i ) {\displaystyle \mathrm {GF} (p^{m_{i}})}, the i-th block of L {\displaystyle \mathbf {L} } is given by: L i = [ γ i p 0 γ i p 1 ⋯ γ i p m i − 1 γ i p 1 γ i p 2 ⋯ γ i p 0 ⋮ ⋮ ⋱ ⋮ γ i p m i − 1 γ i p 0 ⋯ γ i p m i − 2 ] , {\displaystyle \mathbf {L} _{i}={\begin{bmatrix}\gamma _{i}^{p^{0}}&\gamma _{i}^{p^{1}}&\cdots &\gamma _{i}^{p^{m_{i}-1}}\\\gamma _{i}^{p^{1}}&\gamma _{i}^{p^{2}}&\cdots &\gamma _{i}^{p^{0}}\\\vdots &\vdots &\ddots &\vdots \\\gamma _{i}^{p^{m_{i}-1}}&\gamma _{i}^{p^{0}}&\cdots &\gamma _{i}^{p^{m_{i}-2}}\\\end{bmatrix}},} which is a circulant matrix. It is well known that a circulant matrix-vector product can be efficiently computed by convolutions. Hence, it is successful to reduce the discrete Fourier transform into short convolutions.

Complexity

When applied to a characteristic-2 field G F ( 2 m ) {\displaystyle \mathrm {GF} (2^{m})}, the matrix A {\displaystyle \mathbf {A} } is just a binary matrix. Only addition is used when calculating the matrix-vector product of A {\displaystyle \mathrm {A} } and L Π f {\displaystyle \mathrm {L\Pi f} }. It has been shown that the multiplicative complexity of the cyclotomic algorithm is given by O ( n ( log 2 ⁡ n ) log 2 ⁡ 3 2 ) {\displaystyle O(n(\log _{2}n)^{\log _{2}{\frac {3}{2}}})}, and the additive complexity is given by O ( n 2 / ( log 2 ⁡ n ) log 2 ⁡ 8 3 ) {\displaystyle O(n^{2}/(\log _{2}n)^{\log _{2}{\frac {8}{3}}})}.