[Halo2-Dev] Optimized Poseidon Hash Circuit

## 1. Overview Following our previous discussion, in this article, we will delve into circuit optimization techniques in zero-knowledge proof protocols. The code referenced here is from [Scroll's poseidon hash circuit](https://github.com/scroll-tech/poseidon-circuit)(scroll-dev-0723). This GitHub repository contains both the optimized version and the original version of the poseidon hash circuit. Our aim in this article is to understand the circuit optimization techniques through a hands-on example. $\quad$ $\quad$ ## 2. Parameters & Basic Settings ### Recap ![](https://hackmd.io/_uploads/S1sFHvj-a.png) - $r$ stands for rate, representing the throughput equivalent to the length of data blocks, where data is processed in rate units. - $c$ stands for capacity, indicating higher security levels with larger values but potentially slower processing speed. - Number of rounds: In the sponge construction, permutation is applied iteratively over several rounds, and this parameter defines the number of iterations. - State: An intermediate calculation result updated during the permutation process. The length of the state is $t = r + c$, and its values change as rounds progress. $\quad$ ### Settings The following parameters can be configured in various combinations. However, when changing values, it's crucial to consider security. - S-box defined as $S(x) = x^5$ is used. - $R_f$ (Full Rounds) = $8$, $R_p$ (Partial Rounds) = $57$ - Capacity $c = 1$ - Rate $r = 2$ - Width ($t$) $= 3$ elements (Rate + Capacity) The number of rounds is configured as follows: $4$ half of full rounds ($R_f/2$) are followed by $57$ partial rounds $R_p$, and $4$ half of full rounds conclude the sequence. $\quad$ ### Notation In the following notations, the index $k$ is defined within the range of $0\leq k< t$ for width $t$, and the index $j$ is defined within the range of $0\leq j< 6$. - $\vec{s}[k],\vec{s'}[k],\vec{sp}[j] \vec{t}$: advice columns where each round's state values are stored during the permutation process. - $s_i[k]$: the $k$-th state in the $i$-th round. - $\vec{rc}[k], \vec{rc'},\vec{rp}[j]$: fixed columns where the round constants used during the permutation process are stored. - $rc_i[k]$ : round constants (*$rc_{4}[0]$ is not a fixed column but part of a gate). - $rc'_i[k]$ : redefined round constant according to the optimization. - $M, N$: the MDS matrix and its inverse, both defined as a $3\times 3$ matrix concerning width $t$. - $\vec{m_k} = (m_k[0], m_k[1],m_k[2])$: the $k$-th row vector of the $3\times 3$ MDS matrix. - ${\vec{n}_k} = ({n}^{-1}_k[0], {n}^{-1}_k[1],{n}^{-1}_k[2])$: the $k$-th row vector of the $3\times 3$ MDS inverse matrix. - $\vec{s}_f, \vec{s}_{p}, \vec{s}_{t}$: binary vectors comprised of $0$ and $1$. - $is_{last}$: selector column used during the permutation process. $\quad$ $\quad$ ## 3. Circuit Design ### Gates constituting the permutation operation Below, the gates defined in the code are represented in a table format. In each table, pink represents the advice column values, while light blue is used to differentiate the fixed column values. In the table below, ${{\vec{s}[0], \vec{s}[1], \vec{s}[2]}}$ represents the state vectors for width $= 3$, and ${{\vec{rc_a}[0], \vec{rc_a}[1], \vec{rc_a}[2]}}$ denote the round constant vectors. $\quad$ - `full round` The first round of the permutation starts as a full round. The table below represents the areas defined by relative positions when the selector $\vec{s}_f$ is $1$. | $\color{#DAA7AA}{{\vec{s}[0]}}$ | $\color{#DAA7AA}{{\vec{s}[1]}}$ | $\color{#DAA7AA}{{\vec{s}[2]}}$ | $\color{#87AAB3}{{\vec{rc_a}[0]}}$ | $\color{#87AAB3}{{\vec{rc_a}[1]}}$ | $\color{#87AAB3}{{\vec{rc_a}[2]}}$ | $\vec{s}_f$ | |:-:|:-:|:-:|:-:|:-:|:-:|:-:| | $\color{#DAA7AA}{s_{i}[0]}$ | $\color{#DAA7AA}{s_{i}[1]}$ | $\color{#DAA7AA}{s_{i}[2]}$ | $\color{#87AAB3}{a_i[0]}$ | $\color{#87AAB3}{a_i[1]}$ | $\color{#87AAB3}{a_i[2]}$ | $1$ | | $\color{#DAA7AA}{s_{i+1}[0]}$ | $\color{#DAA7AA}{s_{i+1}[1]}$ | $\color{#DAA7AA}{s_{i+1}[2]}$ | | | | | $i$-th round's state operation expression is defined as follows: $\quad$ $$ \begin{matrix} exp_{k} = \sum_{j=0..2}\{(s_i[j]+a_i[j])^5*m_k[j]\} \end{matrix} $$ $\quad$ From the table above, by defining the constraints for the areas where $\vec{s}_f$ is $1$, we can verify whether the result of performing a full round operation on $s_i[k]$ matches $s_{i+1}[k]$. $\quad$ $$ \begin{matrix} exp_{0}-s_{i+1}[0] = 0\\ \quad \\ exp_{1}-s_{i+1}[1] = 0\\ \quad \\ exp_{2}-s_{i+1}[2] = 0 \end{matrix} $$ $\quad$ - `transition round` The **transition round** defines the gate for the state when transitioning from a full round to a partial round and for the output state of the last round. | $\color{#DAA7AA}{\vec{t\_in}}$ | $\color{#DAA7AA}{\vec{t\_out}[0]}$ | $\color{#DAA7AA}{{\vec{t\_out}}[1]}$ | $\color{#DAA7AA}{{\vec{t\_out}}[2]}$ |$\vec{s}_{t}$ | |:-:|:-:|:-:|:-:|:-:| | $\color{red}{{{{t\_in}}_{i-3}}}$ | $\color{#DAA7AA}{{t\_out}_{i-3}[0]}$ | $\color{#DAA7AA}{{t\_out}_{i-3}[1]}$ | $\color{#DAA7AA}{{t\_out}_{i-3}[2]}$ | | | $\color{#DAA7AA}{{t\_in}_{i-2}}$ | | | $\color{#DAA7AA}{{t\_in}_{i-1}}$ | | | $\color{#DAA7AA}{{t\_in}_{i}}$ | | | |$1$ | For the row where the value of $\vec{s}_{t}$ is $1$, let's designate the index as $i$. We'll term the input vector of the transition round as $\vec{t\_in}_{i}$ and its output as ${t\_out}_{i-3}[k]$. The reason for defining this domain is to assign the result state of the full round to ${t\_in}_{i-2}, {t\_in}_{i-1}, {t\_in}_{i}$ and make the output state of the performed partial round as ${t\_out}_{i-3}[k]$, so that the result states of the subsequent partial rounds are positioned in the rows below. Hence, the constraints between the input and output defined in the `transition round` should satisfy the following: $\quad$ $$ \begin{matrix} expr_{k} = (t\_in_{i-2}[0]+rc_4[0])^5*m_k[0]+\color{red}{t\_in_{i-1}[0]*m_k[1]}+\color{red}{t\_in_{i}[0]*m_k[2]}\\ \quad \\ expr_{0} - t\_out_{i-3}[0]=0\\ \quad \\ expr_{1} - t\_out_{i-3}[1]=0\\ \quad \\ expr_{2} - t\_out_{i-3}[2]=0\end{matrix} $$ $\quad$ - `septuple round` The septuple round outlines the constraints of operations performed during the partial rounds. | $\color{#DAA7AA}{\vec{s'}[0]}$ | $\color{#DAA7AA}{\vec{s'}[1]}$ | $\color{#DAA7AA}{\vec{s'}[2]}$ | $\color{#DAA7AA}{\vec{sp}[0]}$ | $\color{#DAA7AA}{\vec{sp}[1]}$ | $\color{#DAA7AA}{\vec{sp}[2]}$ | $\color{#DAA7AA}{\vec{sp}[3]}$ | $\color{#DAA7AA}{\vec{sp}[4]}$ | $\color{#DAA7AA}{\vec{sp}[5]}$ | $\color{#87AAB3}{\vec{rc'}}$ | $\color{#87AAB3}{\vec{rp}[0]}$ | $\color{#87AAB3}{\vec{rp}[1]}$ | $\color{#87AAB3}{\vec{rp}[2]}$ | $\color{#87AAB3}{\vec{rp}[3]}$ | $\color{#87AAB3}{\vec{rp}[4]}$ | $\color{#87AAB3}{\vec{rp}[5]}$ | $\vec{s}_{p}$ | |:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:| | $\color{#DAA7AA}{s_i[0]}$ | $\color{#DAA7AA}{s_i[1]}$ | $\color{#DAA7AA}{s_i[2]}$ | $\color{#DAA7AA}{s_{i+1}[0]}$ | $\color{#DAA7AA}{s_{i+2}[0]}$ | $\color{#DAA7AA}{s_{i+3}[0]}$ | $\color{#DAA7AA}{s_{i+4}[0]}$ | $\color{#DAA7AA}{s_{i+5}[0]}$ | $\color{#DAA7AA}{s_{i+6}[0]}$ | $\color{#87AAB3}{rc'_i[0]}$ | $\color{#87AAB3}{rc'_{i+1}[0]}$ | $\color{#87AAB3}{rc'_{i+2}[0]}$ | $\color{#87AAB3}{rc'_{i+3}[0]}$ | $\color{#87AAB3}{rc'_{i+4}[0]}$ | $\color{#87AAB3}{rc'_{i+5}[0]}$ | $\color{#87AAB3}{rc'_{i+6}[0]}$ | $1$ | | $\color{#DAA7AA}{s_{i+7}[0]}$ | $\color{#DAA7AA}{s_{i+7}[1]}$ | $\color{#DAA7AA}{s_{i+7}[2]}$ | | | | | | | $\color{#87AAB3}{rc'_{i+7}[0]}$| | | | | | | | | The input state for the septuple round gate is represented by $s_i[k]$, while its output is denoted as $s_{i+7}[k]$. This means that a single gate encapsulates the constraints pertaining to 7 partial rounds. However, not all states are allocated. Only the initial states of each round, where nonlinear operations are performed, are defined as constraints from $\vec{sp}[0]$ to $\vec{sp}[5]$. The feasibility of this setup is largely attributed to two primary optimization outcomes. At its core, this follows the approach of [Filecoin's Poseidon Optimization](https://hackmd.io/@wooju/halo2_poseidon_hash_filecoin). For more in-depth details, you can refer to the provided link. 1. [Round Constant Optimization](https://github.com/scroll-tech/poseidon-circuit/pull/9) In partial rounds, aside from the first state where nonlinear operations are executed, the round constants are redefined to become $0$. This eliminates the need for an additional fixed column, offering a distinct advantage. 2. [MDS Matrix Optimization](https://github.com/lurk-lab/neptune/blob/main/spec/poseidon_spec.pdf) The MDS matrix is reformulated using sparse matrices. This transformation alters the matrix values and operation procedures, which also impacts the round constants. Ultimately, the properties of the sparse matrix have been leveraged to more efficiently represent the operations of the partial rounds. $\vec{rp'}[0]$ to $\vec{rp'}[5]$ represent the round constants added to the initial states. If the gate input state index defined in a row where $\vec{s}_p$ is 1 is denoted as $i$, then the expression for the partial round operations is as follows: $\quad$ $$ \begin{matrix}expr_{j,k} = (s_j[0]+rc'_j[0])^5*m'_k[0]+\color{red}{s_j[1]*m'_k[1]}+{s_j[2]*m'_k[2]} \\ \quad \\ where \ i\leq j<i+7 \end{matrix} $$ $\quad$ One can indeed verify that $expr_{j, k}$ corresponds to the expression for $s_{j+1}[k]$. Using this expression, let's define the intermediary formulae within the given range for $j$ ($i\leq j< i+7$): $\quad$ $$ \begin{matrix} expr_{j, 0} = s_{j+1}[0] \end{matrix} $$ $\quad$ If the above polynomial holds true, we can confirm that the first components of the state in the table align with the expressions we've delineated for the partial round. Additionally, the constraints pertaining to the gate's output are as follows (where $k=0,1,2$): $$ \begin{matrix} expr_{i+6, k} = s_{i+7}[k] \end{matrix} $$ - $\color{red}{check(Questions)}$ - In the transition round, $t\_in_{i-3}$ is currently set to 0 in the code. It's mentioned as a helper cell, suggesting that leveraging this cell could reduce the polynomial's degree, but the exact methodology is unclear. - The fourth-round constants and the MDS matrix are set using values defined within the gates themselves during circuit design. However, I'm uncertain whether all public constants can be the part of gates in this manner. - `selectors` Though the constraints for each gate were previously described as being defined by individual selector columns, in the actual implementation, only one selector column is used. This single selector can be toggled to either consider all gates in every row or not. ### Combining all constraints The sequence and count of each gate invocation are as follows: 1. [full round] * 3 2. [transition round] * 1 3. [septuple round] * 8 4. [full round] * 3 5. [transition round] * 1 | $i$ | $\color{#DAA7AA}{\vec{s}[0]}$ | $\color{#DAA7AA}{\vec{s}[1]}$ | $\color{#DAA7AA}{\vec{s}[2]}$ | $\color{#DAA7AA}{\vec{t}}$ | $\color{#DAA7AA}{\vec{s'}[0]}$ | $\color{#DAA7AA}{\vec{s'}[1]}$ | $\color{#DAA7AA}{\vec{s'}[2]}$ | $\color{#DAA7AA}{\vec{sp}[0]}$ | $\color{#DAA7AA}{\vec{sp}[1]}$ | $\color{#DAA7AA}{\vec{sp}[2]}$ | $\color{#DAA7AA}{\vec{sp}[3]}$ | $\color{#DAA7AA}{\vec{sp}[4]}$ | $\color{#DAA7AA}{\vec{sp}[5]}$ | $\color{#87AAB3}{\vec{rc}[0]}$ | $\color{#87AAB3}{\vec{rc}[1]}$ | $\color{#87AAB3}{\vec{rc}[2]}$ | $\color{#87AAB3}{\vec{rc'}}$ | $\color{#87AAB3}{\vec{rp}[0]}$ | $\color{#87AAB3}{\vec{rp}[1]}$ | $\color{#87AAB3}{\vec{rp}[2]}$ | $\color{#87AAB3}{\vec{rp}[3]}$ | $\color{#87AAB3}{\vec{rp}[4]}$ | $\color{#87AAB3}{\vec{rp}[5]}$| $\color{#87AAB3}{\vec{is}_{last}}$ | |:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:| | $0$ | $\color{#DAA7AA}{s_0[0]}$ | $\color{#DAA7AA}{s_0[1]}$ | $\color{#DAA7AA}{s_0[2]}$ | $\color{#DAA7AA}{0}$ | $\color{#DAA7AA}{s_5[0]}$ | $\color{#DAA7AA}{s_5[1]}$ | $\color{#DAA7AA}{s_5[2]}$ | $\color{#DAA7AA}{s_6[0]}$ | $\color{#DAA7AA}{s_7[0]}$ | $\color{#DAA7AA}{s_8[0]}$ | $\color{#DAA7AA}{s_9[0]}$ | $\color{#DAA7AA}{s_{10}[0]}$ | $\color{#DAA7AA}{s_{11}[0]}$ | $\color{#87AAB3}{rc'_0[0]}$ | $\color{#87AAB3}{rc'_0[1]}$ | $\color{#87AAB3}{rc'_0[2]}$ | $\color{#87AAB3}{rc'_5[0]}$ | $\color{#87AAB3}{rc'_6[0]}$ | $\color{#87AAB3}{rc'_7[0]}$ | $\color{#87AAB3}{rc'_8[0]}$ | $\color{#87AAB3}{rc'_9[0]}$ | $\color{#87AAB3}{rc'_{10}[0]}$ | $\color{#87AAB3}{rc'_{11}[0]}$ | $\color{#87AAB3}{0}$ | | $1$ | $\color{#DAA7AA}{s_1[0]}$ | $\color{#DAA7AA}{s_1[1]}$ | $\color{#DAA7AA}{s_1[2]}$ | $\color{#DAA7AA}{s_4[0]}$ | $\color{#DAA7AA}{s_{12}[0]}$ | $\color{#DAA7AA}{s_{12}[1]}$ | $\color{#DAA7AA}{s_{12}[2]}$ | $\color{#DAA7AA}{s_{13}[0]}$ | $\color{#DAA7AA}{s_{14}[0]}$ | $\color{#DAA7AA}{s_{15}[0]}$ | $\color{#DAA7AA}{s_{16}[0]}$ | $\color{#DAA7AA}{s_{17}[0]}$ | $\color{#DAA7AA}{s_{18}[0]}$ | $\color{#87AAB3}{rc'_1[0]}$ | $\color{#87AAB3}{rc'_1[1]}$ | $\color{#87AAB3}{rc'_1[2]}$ | $\color{#87AAB3}{rc'_{12}[0]}$ | $\color{#87AAB3}{rc'_{13}[0]}$ | $\color{#87AAB3}{rc'_{14}[0]}$ | $\color{#87AAB3}{rc'_{15}[0]}$ | $\color{#87AAB3}{rc'_{16}[0]}$ | $\color{#87AAB3}{rc'_{17}[0]}$ | $\color{#87AAB3}{rc'_{18}[0]}$ | $\color{#87AAB3}{0}$ | | $2$ | $\color{#DAA7AA}{s_2[0]}$ | $\color{#DAA7AA}{s_2[1]}$ | $\color{#DAA7AA}{s_2[2]}$ | $\color{#DAA7AA}{s_4[1]}$ | $\color{#DAA7AA}{s_{19}[0]}$ |$\color{#DAA7AA}{…}$ |$\color{#DAA7AA}{…}$|$\color{#DAA7AA}{…}$|$\color{#DAA7AA}{…}$|$\color{#DAA7AA}{…}$|$\color{#DAA7AA}{…}$|$\color{#DAA7AA}{…}$|$\color{#DAA7AA}{s_{25}[0]}$| $\color{#87AAB3}{rc'_2[0]}$ | $\color{#87AAB3}{rc'_2[1]}$ | $\color{#87AAB3}{rc'_2[2]}$ | $\color{#87AAB3}{rc'_{19}[0]}$ | $\color{#87AAB3}{…}$ | $\color{#87AAB3}{…}$ | $\color{#87AAB3}{…}$ | $\color{#87AAB3}{…}$ | $\color{#87AAB3}{…}$ | $\color{#87AAB3}{rc'_{25}[0]}$ | $\color{#87AAB3}{0}$ | | $3$ | $\color{#DAA7AA}{s_3[0]}$ | $\color{#DAA7AA}{s_3[1]}$ | $\color{#DAA7AA}{s_3[2]}$ | $\color{#DAA7AA}{s_4[2]}$ | $\color{#DAA7AA}{s_{26}[0]}$ |$\color{#DAA7AA}{…}$ |$\color{#DAA7AA}{…}$|$\color{#DAA7AA}{…}$|$\color{#DAA7AA}{…}$|$\color{#DAA7AA}{…}$|$\color{#DAA7AA}{…}$|$\color{#DAA7AA}{…}$|$\color{#DAA7AA}{s_{32}[0]}$| $\color{#87AAB3}{rc'_3[0]}$ | $\color{#87AAB3}{rc'_3[1]}$ | $\color{#87AAB3}{rc'_3[2]}$ | $\color{#87AAB3}{rc'_{26}[0]}$ | $\color{#87AAB3}{…}$ | $\color{#87AAB3}{…}$ | $\color{#87AAB3}{…}$ | $\color{#87AAB3}{…}$ | $\color{#87AAB3}{…}$ | $\color{#87AAB3}{rc'_{32}[0]}$ | $\color{#87AAB3}{0}$ | | $4$ | $\color{#DAA7AA}{s_{61}[0]}$ | $\color{#DAA7AA}{s_{61}[1]}$ | $\color{#DAA7AA}{s_{61}[2]}$ | $\color{#DAA7AA}{0}$ |$\color{#DAA7AA}{s_{33}[0]}$ |$\color{#DAA7AA}{…}$ |$\color{#DAA7AA}{…}$|$\color{#DAA7AA}{…}$|$\color{#DAA7AA}{…}$|$\color{#DAA7AA}{…}$|$\color{#DAA7AA}{…}$|$\color{#DAA7AA}{…}$|$\color{#DAA7AA}{s_{39}[0]}$| $\color{#87AAB3}{rc'_{61}[0]}$ | $\color{#87AAB3}{rc'_{61}[1]}$ | $\color{#87AAB3}{rc'_{61}[2]}$ | $\color{#87AAB3}{rc'_{33}[0]}$ | $\color{#87AAB3}{…}$ | $\color{#87AAB3}{…}$ | $\color{#87AAB3}{…}$ | $\color{#87AAB3}{…}$ | $\color{#87AAB3}{…}$ | $\color{#87AAB3}{rc'_{39}[0]}$ | $\color{#87AAB3}{0}$ | | $5$ | $\color{#DAA7AA}{s_{62}[0]}$ | $\color{#DAA7AA}{s_{62}[1]}$ | $\color{#DAA7AA}{s_{62}[2]}$ | $\color{#DAA7AA}{s_{65}[0]}$ |$\color{#DAA7AA}{s_{40}[0]}$ |$\color{#DAA7AA}{…}$ |$\color{#DAA7AA}{…}$|$\color{#DAA7AA}{…}$|$\color{#DAA7AA}{…}$|$\color{#DAA7AA}{…}$|$\color{#DAA7AA}{…}$|$\color{#DAA7AA}{…}$|$\color{#DAA7AA}{s_{46}[0]}$| $\color{#87AAB3}{rc'_{62}[0]}$ | $\color{#87AAB3}{rc'_{62}[1]}$ | $\color{#87AAB3}{rc'_{62}[2]}$ | $\color{#87AAB3}{rc'_{40}[0]}$ | $\color{#87AAB3}{…}$ | $\color{#87AAB3}{…}$ | $\color{#87AAB3}{…}$ | $\color{#87AAB3}{…}$ | $\color{#87AAB3}{…}$ | $\color{#87AAB3}{rc'_{46}[0]}$ | $\color{#87AAB3}{0}$ | | $6$ | $\color{#DAA7AA}{s_{63}[0]}$ | $\color{#DAA7AA}{s_{63}[1]}$ | $\color{#DAA7AA}{s_{63}[2]}$ |$\color{#DAA7AA}{s_{65}[1]}$|$\color{#DAA7AA}{s_{47}[0]}$ |$\color{#DAA7AA}{…}$ |$\color{#DAA7AA}{…}$|$\color{#DAA7AA}{…}$|$\color{#DAA7AA}{…}$|$\color{#DAA7AA}{…}$|$\color{#DAA7AA}{…}$|$\color{#DAA7AA}{…}$|$\color{#DAA7AA}{s_{53}[0]}$| $\color{#87AAB3}{rc'_{63}[0]}$ | $\color{#87AAB3}{rc'_{63}[1]}$ | $\color{#87AAB3}{rc'_{63}[2]}$ | $\color{#87AAB3}{rc'_{47}[0]}$ | $\color{#87AAB3}{…}$ | $\color{#87AAB3}{…}$ | $\color{#87AAB3}{…}$ | $\color{#87AAB3}{…}$ | $\color{#87AAB3}{…}$ | $\color{#87AAB3}{rc'_{53}[0]}$| $\color{#87AAB3}{0}$ | | $7$ | $\color{#DAA7AA}{s_{64}[0]}$ | $\color{#DAA7AA}{s_{64}[1]}$ | $\color{#DAA7AA}{s_{64}[2]}$ |$\color{#DAA7AA}{s_{65}[2]}$| $\color{#DAA7AA}{s_{54}[0]}$ | $\color{#DAA7AA}{s_{54}[1]}$ | $\color{#DAA7AA}{s_{54}[2]}$ | $\color{#DAA7AA}{s_{55}[0]}$ | $\color{#DAA7AA}{s_{56}[0]}$ | $\color{#DAA7AA}{s_{57}[0]}$ | $\color{#DAA7AA}{s_{58}[0]}$ | $\color{#DAA7AA}{s_{59}[0]}$ |$\color{#DAA7AA}{s_{60}[0]}$ | $\color{#87AAB3}{rc'_{64}[0]}$ | $\color{#87AAB3}{rc'_{64}[1]}$ | $\color{#87AAB3}{rc'_{64}[2]}$ | $\color{#87AAB3}{rc'_{54}[0]}$ | $\color{#87AAB3}{rc'_{55}[0]}$ | $\color{#87AAB3}{rc'_{56}[0]}$ | $\color{#87AAB3}{rc'_{57}[0]}$ | $\color{#87AAB3}{rc'_{58}[0]}$ | $\color{#87AAB3}{rc'_{59}[0]}$ | $\color{#87AAB3}{rc'_{60}[0]}$ | $\color{#87AAB3}{1}$ | In the aforementioned table, ${s_i[k]}$ and ${rc'_i[k]}$ represent the state and round constants used at round $i$, respectively. The column assigning values varies with each round; for the full round, we have ${\vec{s}_i[k]}$ and ${\vec{rc'}_i[k]}$, for the transition round, there's ${\vec{t}}$, and for the partial round, values are assigned to all columns excluding the selector. The sequence in which the table values are populated according to the progression of rounds is as follows: 1. The initial full round fills rows 0-3. 2. The transition round populates rows 1-3 with the output state of the fourth round from the full round. 3. The partial round begins with the output from the transition and fills rows 0-7 of the respective column. 4. The final full round follows the values populated in step 1 to fill rows 4-7. 5. Following step 3, the transition round fills the final output state in rows 5-7. For the transition round, the values assigned to rows 5-7 of column ${\vec{t}}$, i.e., ${s_{65}[0], s_{65}[1], s_{65}[2]}$, are the output of the permutation operation. Consequently, the resulting table comprises 8 rows, 13 advice columns, and 11 fixed columns. $\quad$ ## 4. Code Analysis ### 1) SeptidonChip ```rust // Comprises chips for each round and a control chip to manage them pub struct SeptidonChip { control_chip: ControlChip, // one fixed column transition_chip: TransitionRoundChip, // one advice column full_round_chip: FullRoundChip, partial_round_chip: SeptupleRoundChip, } ``` The `SeptidonChip` defines the permutation for Poseidon. The `control_chip` consists of a single selector column and is used to define the constraints of the gates that make up the circuit. The `full_round_chip`, `partial_round_chip`, and `transition_chip` respectively define the gates for the full round, partial round, and transition round. ```rust impl SeptidonChip { // Chip creation for permutation pub fn configure<F: CachedConstants>(cs: &mut ConstraintSystem<F>) -> Self { // Creating signals using the control chip let (control_chip, signals) = ControlChip::configure(cs); let q = || signals.selector.clone(); let (full_round_chip, full_round_loop_body) = FullRoundChip::configure(cs); let (partial_round_chip, partial_round_loop_body) = SeptupleRoundChip::configure(cs, q()); let transition_chip = { // Using the output of the transition as input for the partial round let output = partial_round_chip.input(); TransitionRoundChip::configure(cs, signals.transition_round, output) }; { // The output of the column where the full's break_full_rounds signal is 1 is connected as input to the transition let output = transition_chip.input(); LoopChip::configure( cs, q(), full_round_loop_body, signals.break_full_rounds, output, ) }; { // In the column where break_partial_rounds is 1, the output of the partial is used again as input to the full let full_round_sboxes = &full_round_chip.0 .0; let output: [Cell; 3] = [ full_round_sboxes[0].input.rotated(-3), full_round_sboxes[1].input.rotated(-3), full_round_sboxes[2].input.rotated(-3), ]; LoopChip::configure( cs, q(), partial_round_loop_body, signals.break_partial_rounds, output, ) }; Self { control_chip, transition_chip, full_round_chip, partial_round_chip, } } ... } ``` Within `SeptidonChip`, the `configure` function composes all the chips necessary for the permutation and defines the interconnection relationships between the inputs and outputs of each gate. - Output of `full_round_chip` $\rightarrow$ Input of `transition_chip` - Output of `transition_chip` $\rightarrow$ Input of `partial_round_chip` - Output of `partial_round_chip` $\rightarrow$ Input of `full_round_chip` $\quad$ ### 2) ControlChip & LoopChip - `ControlChip` & `ControlSignals` ```rust pub struct ControlChip { is_last: Column<Fixed>, } pub struct ControlSignals<F: FieldExt> { pub break_full_rounds: Expression<F>, pub transition_round: Expression<F>, pub break_partial_rounds: Expression<F>, // A selector that can disable all chips on all rows. pub selector: Expression<F>, } ``` - `ControlChip`: Defined with the `is_last` selector, it's hard-coded during the circuit's `synthesize` phase to have a value of 1 in the last row (7th row). - `ControlSignals`: It defines the break points of each round by referring to the relative row position from the row where `is_last` has a value of 1. (This can be observed in the `configure` function below.) $\quad$ ```rust impl ControlChip { pub fn configure<F: FieldExt>(cs: &mut ConstraintSystem<F>) -> (Self, ControlSignals<F>) { let is_last = cs.fixed_column(); let signals = query(cs, |meta| { let signal_middle = meta.query_fixed(is_last, Rotation(4)); // Seen from the middle row. let signal_last = meta.query_fixed(is_last, Rotation::cur()); // Assuming no overlapping region between the two signals above. let middle_or_last = signal_middle.clone() + signal_last.clone(); ControlSignals { break_full_rounds: middle_or_last, transition_round: signal_middle, break_partial_rounds: signal_last, selector: Self::derive_selector(is_last, meta), } }); let chip = Self { is_last }; (chip, signals) } ... } ``` The code above uses the `is_last` selector to define the following elements of `ControlSignals`: - `break_full_rounds`: Signal in the row where the full round ends. - `transition_round`: Signal in the row where the transition round begins. - `break_partial_rounds`: Signal in the row where the partial round ends. - `selector`: A selector that can either consider or disregard all row constraints. While the 'selector' determines the validity check of constraints defined in a gate, the 'signal' modifies the output constraint of the gate, thereby distinguishing between the two. - `LoopChip` ```rust pub struct LoopChip {} pub struct LoopBody<F> { pub next_state: [Expression<F>; 3], // relative to the break signal. pub output: [Expression<F>; 3], } impl LoopChip { pub fn configure<F: FieldExt>( cs: &mut ConstraintSystem<F>, q: Expression<F>, body: LoopBody<F>, break_signal: Expression<F>, output: [Cell; 3], ) -> Self { cs.create_gate("loop", |meta| { let constraints = (0..3) .map(|i| { let destination = select::expr( break_signal.clone(), output[i].query(meta, 0), // selector = 1 (input from external source) body.next_state[i].clone(), // selector = 0 (next round state) ); destination - body.output[i].clone() }) .collect::<Vec<_>>(); Constraints::with_selector(q, constraints) }); Self {} } } ``` The `LoopChip` isn't used on its own but rather, it's employed within the `full_round_chip` or `partial_round_chip` to define gates. Alongside each iteration, the `LoopChip` is defined with an end signal. - `break_signal` $== 1$: A constraint is defined ensuring the input `output` cell is congruent with the current round's computational result. - `break_signal` $== 0$: The `next_state` cell is constrained to be identical to the current round's execution outcome. $\quad$ ### 3) full_round - `full round` ```rust pub struct FullRoundChip(pub FullState); impl FullRoundChip { pub fn configure<F: CachedConstants>(cs: &mut ConstraintSystem<F>) -> (Self, LoopBody<F>) { let chip = Self(FullState::configure(cs)); let loop_body = query(cs, |meta| { let next_state = chip.0.map(|sbox| sbox.input.query(meta, 1)); // Querying the position of the next row (offset=1) of the sbox's advice column. let output = chip.full_round_expr(meta); LoopBody { next_state, output } }); (chip, loop_body) } ... } ``` The `configure` function of `FullRoundChip` yields a `LoopBody` with the following elements: - `next_state`: Cell of the advice column's subsequent row. - `output`: The resulting state after one full round. If the break signal for a full round is 0, the `next_state` is constrained to be the same as `output`. If it's 1, the `output` is constrained to be the same as the input of another gate. For instance, it's feasible to impose a constraint where the input of a transition round matches the `output`. $\quad$ For clarity, in section 3, the description of the `full round` circuit pertains to when the signal is zero. $\quad$ ### 4) Transition Round - `Transition Round` ```rust pub struct TransitionRoundChip { column: Column<Advice>, } impl TransitionRoundChip { pub fn configure<F: CachedConstants>( cs: &mut ConstraintSystem<F>, signal: Expression<F>, next_state: [Cell; 3], ) -> Self { let chip = Self { column: cs.advice_column(), }; cs.create_gate("transition round", |meta| { // The input cells are relative to the signal. let input = chip.input(); let input = [ input[0].query(meta, 0), input[1].query(meta, 0), input[2].query(meta, 0), ]; let output = Self::first_partial_round_expr(&input); // Get the next_state from the point of view of the signal. let next_state = [ next_state[0].query(meta, -3), next_state[1].query(meta, -3), next_state[2].query(meta, -3), ]; let constraints = vec![ output[0].clone() - next_state[0].clone(), output[1].clone() - next_state[1].clone(), output[2].clone() - next_state[2].clone(), ]; Constraints::with_selector(signal, constraints) }); chip } fn first_partial_round_expr<F: CachedConstants>( input: &[Expression<F>; 3], ) -> [Expression<F>; 3] { let rc = Expression::Constant(Self::round_constant()); let sbox_out = [ params::sbox::expr(input[0].clone(), rc), input[1].clone(), input[2].clone(), ]; matmul::expr(mds(), sbox_out) } fn round_constant<F: CachedConstants>() -> F { round_constant(4)[0] } pub fn input(&self) -> [Cell; 3] { // The input of the transition round is allocated vertically. [ Cell::new(self.column, -2), Cell::new(self.column, -1), Cell::new(self.column, 0), ] } ... } ``` Inputs to the transition round are allocated vertically. This input is received from the output of the full round. As can be seen from the `transition round` table in Chapter 3, since the `next_state` query offset is '$-3$', the output state corresponds to the position '$i-3$'. The gate creation function defines constraints according to the partial round operation, and the round constant used is directly defined in the gate rather than being allocated to a separate column. $\quad$ ### 5) septuple_round - `septuple round` ```rust pub fn configure<F: CachedConstants>( cs: &mut ConstraintSystem<F>, q: Expression<F>, ) -> (Self, LoopBody<F>) { let chip = Self { // s_i[0], rc_i[0] first_sbox: SBox::configure(cs), // s_i[1], s_i[2] first_linears: [Cell::configure(cs), Cell::configure(cs)], // s_{i+1}[0], ..., s_[i+6][0] following_sboxes: (0..6) .map(|_| SBox::configure(cs)) .collect::<Vec<_>>() .try_into() .unwrap(), }; let input = chip.input(); // input_state = s_i // next_state = s_{i+7} let (input_state, next_state) = query(cs, |meta| { ( [ input[0].query(meta, 0), input[1].query(meta, 0), input[2].query(meta, 0), ], [ input[0].query(meta, 1), input[1].query(meta, 1), input[2].query(meta, 1), ], ) }); let output = { let mut checked_sbox = &chip.first_sbox; let mut state = input_state; cs.create_gate("septuple_round", |meta| { let mut constraints = vec![]; for sbox_to_check in &chip.following_sboxes { // Calculate the state of the next partial round state = Self::partial_round_expr(meta, checked_sbox, &state); // Compare the witness assigned to following_sboxes and the calculated state value let witness = sbox_to_check.input_expr(meta); constraints.push(state[0].clone() - witness.clone()); checked_sbox = sbox_to_check; } // Output the state calculated as the result of the i+7th round state = Self::partial_round_expr(meta, checked_sbox, &state); Constraints::with_selector(q, constraints) }); state }; let loop_body = LoopBody { next_state, output }; (chip, loop_body) } fn partial_round_expr<F: CachedConstants>( meta: &mut VirtualCells<'_, F>, sbox: &SBox, input: &[Expression<F>; 3], ) -> [Expression<F>; 3] { let sbox_out = [sbox.output_expr(meta), input[1].clone(), input[2].clone()]; matmul::expr(mds(), sbox_out) } pub fn input(&self) -> [Cell; 3] { [ self.first_sbox.input.clone(), self.first_linears[0].clone(), self.first_linears[1].clone(), ] } ``` `septuple round` defines constraints for the partial rounds. The input state of the gate is denoted by $s_i[k]$ and the output by $s_{i+7}[k]$. This allows representation of constraints for $7*8=56$ partial rounds. Specifically, from the $i+1$ round to the $i+6$ round, only the constraints for the first state undergoing the s-box and the round constant are considered. This series of 8 repeated gates is also designed using the `LoopChip` such that upon encountering a break signal, the output of the partial round becomes the input of the full round. $\quad$ ## 5. Conclusion The configuration of the Poseidon hash's advice and fixed columns, both before and after the optimization, is as follows: - Before: 38 rows, 13 columns - After: 8 rows, 24 columns While the optimization led to approximately doubling the number of columns, it achieved a reduction to about a quarter of the original row count. The core principles behind the optimized Poseidon hash circuit design can be summarized as: - Consider circuit designs that aim to reduce the number of rows. - If the benefits gained from reducing the row count outweigh the increase in columns, proceed with that design approach. - Whenever possible, minimize the number of selectors in the circuit design.