Try   HackMD

Circle FFT and its implementation

Abstract: This paper primarily examines the work of Ulrich Haböck et al. titled "Circle STARKs," interpreting Circle FFT and its implementation aspects. The analysis focuses on three main areas: 1) the foundation for accelerating basic operations with CFFT; 2) the implementation of CFFT calculations; and 3) the coding implementation of CFFT.

1. Introduction

In traditional STARKs, Fast Fourier Transform (FFT) is used to efficiently perform interpolation and write adjacent row constraints. To utilize FFT for fundamental polynomial operations, the finite field

Fp must possess a smooth-order root of unity. Besides the FFT itself, the choice of finite field
p
directly influences the complexity of arithmetic operations, thereby impacting the efficiency of STARKs. The most effective field for arithmetic operations is the Mersenne prime field, specifically
p=2e1
. Notably,
p=2311
allows for highly efficient arithmetic operations on 32-bit computers. For these reasons, traditional FFT is inadequate in meeting the efficiency demands of STARKs. In reference [1], the authors continue the ECFFT [2][3] approach, constructing Circle STARKs based on the Mersenne prime
p=2311
on the circle defined by the equation
x2+y2=1
. The core innovation lies in the introduction of Circle FFT (CFFT) within STARKs.

2. Circle FFT

2.1 Fundamentals of CFFT

The choice of field significantly affects the efficiency of FFT acceleration and the feasibility of using FFT for this purpose. A prime

p that is friendly to CFFT possesses the property
p3mod4
. The set of
p+1
points on the curve
C=C(Fp)
forms a group, defined by the group operation
(x0,y0)(x1,y1):=(x0x1y0y1,x0y1+x1y0)
, where the circular group
C(Fp)
is a cyclic group. Additionally, we define the rotation operation
TP(x,y):=P(x,y)
, the squaring map
π(x,y):=(x,y)(x,y)
, and the group inverse
J(x,y):=(x,y)
.

When using FFT acceleration, the length of the coefficients to be computed must be

2n(n1). For CFFT, we define a double coset of size
N=2n
as
D=QGn1Q1Gn1
. Here,
Gn1
is a subgroup of size
2n1
of
C(Fp)
, and
QC(Fp)
. When the prime
p3mod4
, we have
D=QGn=QGn1Q1Gn1
. When
D
is a coset of the subgroup
Gn
, it is the standard position coset of size
N
. Furthermore, under the squaring map, the set
π(D)
of size
N/2
shares the same properties as
D
, meaning it is also a double coset or standard coset. The standard position coset is illustrated in Figure 1, where the elements are evenly distributed along the circle, corresponding to the binary size of the set. For
n1
πn1(D)={(xD,yD)}
contains only two elements.
Image Not Showing Possible Reasons
  • The image was uploaded to a note which you don't have access to
  • The note which the image was originally uploaded to has been deleted
Learn More →


Figure 1: Diagram of the three minimal standard position cosets in the affine plane over

Fp

In Circle STARKs, prior to CFFT acceleration, we first define the space of bivariate polynomials whose coefficients belong to the extension field

F:

(1)LN(F)={p(x,y)F[x,y]/(x2+y21):degpN2}

where

N is an even number. The space
LN(F)
has rotational invariance and good separability. These key properties are necessary for efficient encoding. By repeatedly substituting
y2=1x2
into the polynomial
p(x,y)
derived from
LN(F)
, we obtain

(2)p(x,y)=p0(x)+yp1(x)
where
p0(x)F[x]N/2
and
p1(x)F[x]N/21
. Equation
(2)
is referred to as the canonical form of the polynomial derived from
LN(F)
. The canonical form indicates that the set of monomials

(3)1,x,,xN2,y,yx,,yxN21

spans the space

LN(F). Based on the dimension of
LN(F)
, the canonical form must be a basis, specifically a monomial basis. Additionally, for details on the cyclic code, refer to reference [1].

For the double coset

D=QGn1Q1Gn1, with
QC(Fp)Gn
, CFFT completes the interpolation of functions derived from
FD
by calculating the coefficients corresponding to the basis of Circle FFT
Bn
. The basis for Circle FFT consists of N-dimensional polynomial bases that depend solely on the size of the field. We define the
n
-th order FFT basis as

(4)bj(n)(x,y):=yj0v1(x)j1vn1jn1(x),0j2n1

where

0j2n1 and
j=j0+j12++jn12n1((j0,,jn1){0,1}n)
;
vk(x)
is the vanishing polynomial of the standard position coset of size
2k
[1]. The family formed by
bj(n)
constitutes the set
BnLN(F)
.

According to the definitions above, the double coset

D exhibits invariance under the wrapping operation
J
, and each
J
-orbit in
D
contains only two points. Consequently, the quotient mapping

(5)ϕJ:DD/J,P{P,J(P)}

is a two-to-one mapping. The double coset obtained through recursion is given by

Dj=π(Dj+1), for
j=n1,,1
, corresponding to the descending chain of subgroups
Gn1Gn2G0
. Since the operations
J
and
π
are commutative, we can obtain the commutative diagram:

Image Not Showing Possible Reasons
  • The image was uploaded to a note which you don't have access to
  • The note which the image was originally uploaded to has been deleted
Learn More →

Figure 2: Commutative diagram of the J and
𝜋 mappings: each mapping is two-to-one, and the coset size is halved.

Alternatively, the quotient

Sj=Dj/J can be considered as a subset of the x-axis, and
ϕJ=πx
serves as the projection onto the x-axis. Thus, the commutative diagram can be transformed into

Image Not Showing Possible Reasons
  • The image was uploaded to a note which you don't have access to
  • The note which the image was originally uploaded to has been deleted
Learn More →

Figure 3: Transformed commutative diagram.

In the diagram, the squaring homomorphism mapping

π:SjSj1 is a two-to-one mapping given by
x2x21
.

2.2 Mathematical Description of CFFT

Given a double coset of size

2n, FFT is a divide-and-conquer algorithm that iteratively simplifies the interpolation problem for
fFD
along the projection chain:
(6)D=DnϕJ=πxSnπSn1ππS1

From the projection chain, it is evident that the entire chain is divided into two stages.

In the first stage, we use

t0(x,y)=y as the "reference odd function" to decompose
fFD
into "odd" and "even" parts according to the wrapping function
J
, resulting in two unique functions
f0
and
f1FSn
defined on the univariate domain
Sn
, specifically:

(7)f0(x)=f(x,y)+f(x,y)2

(8)f1(x)=f(x,y)f(x,y)2y
and

(9)f(x,y)=f0(x)+yf1(x)

If we factor out the coefficient

12 from equations (7) and (8) and consider
y=y1
as the rotation factor, this decomposition is quite similar to the butterfly operation in traditional FFT.

In the second stage, we select

t1(x,y)=y as the "reference odd function." On the projection domain
Sj1=π(Sj)
, the iterative decomposition yields:
(10)f0(π(x))=f(x)+f(x)2

(11)f1(π(x))=f(x)f(x)2x
and
(12)f(x)=f0(π(x))+xf1(π(x))

where

π(x)=2x21. This decomposition process continues until the projection domain reduces to the single point
S1
.

The output of the CFFT algorithm is a constant

ck=fk0,,knjF, where
0k2n1
and
k0,,kn1
are the binary representation bits of
k
. Thus, for a given basis space
Bj(0)Bj
:

(13)Bj(0)={b2k(j):0k2j11},1jn

We have:

(14)f(x)=k=02j1ckb2k(j+1)(x)

When

j=n,
ck
represents the output of the CFFT.

3. CFFT Implementation

In Algorithm [1], the authors provide the algorithm's pseudocode.

Image Not Showing Possible Reasons
  • The image was uploaded to a note which you don't have access to
  • The note which the image was originally uploaded to has been deleted
Learn More →

From the pseudocode, it is evident that the entire algorithm is divided into two stages: 1) operations at the first level; and 2) operations in the subsequent layers. The primary difference lies in the calculation of the rotation factors.

3.1 Calculation of Rotation Factors

As seen in the pseudocode, although the implementation of CFFT differs between the first layer and the subsequent layers, the inconsistency mainly arises from the calculation of rotation factors, while the butterfly operation remains consistent. In the RUST implementation, the calculation of rotation factors is

Image Not Showing Possible Reasons
  • The image was uploaded to a note which you don't have access to
  • The note which the image was originally uploaded to has been deleted
Learn More →

From lines 132 to 142 of the code, it can be observed that the calculation of rotation factors distinguishes between the first layer and the remaining layers. It is necessary to differentiate between the first step and the second step, which are completed in init_domain and working_domain, respectively. This represents a significant difference from the calculation of rotation factors in traditional FFT. The generation of the initial domain is accomplished by calling the cfft_domain function.
Image Not Showing Possible Reasons
  • The image was uploaded to a note which you don't have access to
  • The note which the image was originally uploaded to has been deleted
Learn More →

3.2 Butterfly Operator

In the implementation code, the butterfly operators for the forward and inverse transformations are as follows:

Image Not Showing Possible Reasons
  • The image was uploaded to a note which you don't have access to
  • The note which the image was originally uploaded to has been deleted
Learn More →

and
Image Not Showing Possible Reasons
  • The image was uploaded to a note which you don't have access to
  • The note which the image was originally uploaded to has been deleted
Learn More →

It can be seen that in the implementation, the butterfly operator of CFFT is essentially consistent with that of traditional FFT.

3.3 Basic Implementation of CFFT

After the rotation factors have been precomputed, CFFT can be implemented directly based on the pseudocode.

Image Not Showing Possible Reasons
  • The image was uploaded to a note which you don't have access to
  • The note which the image was originally uploaded to has been deleted
Learn More →

In the implementation, the butterfly operation is realized using direct code. Alternatively, it can be called directly using fn butterfly_cfft().


References:
[1] Ulrich Haböck and David Levit and Shahar Papini. Circle STARKs. In Cryptology ePrint Archive, Paper 2024/278, 2024. https://eprint.iacr.org/2024/278
[2] Eli Ben-Sasson, Dan Carmon, Swastik Kopparty, and David Levit. Elliptic Curve Fast Fourier Transform (ECFFT) Part I: Fast polynomial algorithms over all finite fields. In Electronic Colloquium on Computational Complexity, volume TR21-103, 2021. https: //eccc.weizmann.ac.il/report/2021/103/.
[3] Eli Ben-Sasson, Dan Carmon, Swastik Kopparty, and David Levit. Scalable and transparent proofs over all large fields, via elliptic curves (ECFFT part II). In IACR preprint archive, 2022. https://eprint.iacr.org/2022/1542.
[4] GitHub - Plonky3/Plonky3 at Circle-Fast-Fourier-Transform. https://github.com/Plonky3/Plonky3