# Field Merkle Tree
## Merkle Tree
A Merkle tree serves as a cryptographic commitment mechanism, particularly in the context of validity proof systems. A Merkle tree of a given *depth* can store up to $2^{depth}$ values, and proofs of membership for these values are compact: their size is proportional to the depth.
The construction of a Merkle tree begins at the leaves, where each value in the set is hashed using a secure, collision-resistant hash function. These hashed values are then recursively combined in pairs and hashed, moving up the tree level by level. This process continues until a single hash, known as the Merkle root, is derived at the top of the tree. The Merkle root serves as the commitment to the entire set of values. Consider a set $\{v_1, v_2, \ldots, v_7\}$ and assume $H$ and $C$ are our chosen hash and compression functions (the two may even be the same function). Each value $v_i$ is hashed to form the tree's leaf nodes. These are then recursively compressed in pairs, forming the subsequent levels of the tree.
![image](https://hackmd.io/_uploads/SJt7RwdK6.png)
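To make the construction concrete, here is a minimal sketch in Rust. It uses `DefaultHasher` as a stand-in for both $H$ and $C$ purely for brevity (a real commitment scheme needs a collision-resistant hash such as SHA-256 or Poseidon), and the names `hash_one`, `compress`, and `merkle_root` are illustrative rather than any library's API.
```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Stand-in for the leaf hash H; a real tree would use e.g. SHA-256 or Poseidon.
fn hash_one(value: u64) -> u64 {
    let mut h = DefaultHasher::new();
    value.hash(&mut h);
    h.finish()
}

// Stand-in for the compression function C: combines two child digests into a parent.
fn compress(left: u64, right: u64) -> u64 {
    let mut h = DefaultHasher::new();
    left.hash(&mut h);
    right.hash(&mut h);
    h.finish()
}

// Build the Merkle root of a power-of-two-sized set of values.
fn merkle_root(values: &[u64]) -> u64 {
    assert!(values.len().is_power_of_two());
    // Leaf layer: hash each value with H.
    let mut layer: Vec<u64> = values.iter().map(|&v| hash_one(v)).collect();
    // Compress pairs with C, level by level, until a single root remains.
    while layer.len() > 1 {
        layer = layer
            .chunks(2)
            .map(|pair| compress(pair[0], pair[1]))
            .collect();
    }
    layer[0]
}

fn main() {
    // A depth-3 tree: 2^3 = 8 leaves, so a membership proof has 3 siblings.
    let root = merkle_root(&[1, 2, 3, 4, 5, 6, 7, 8]);
    println!("root = {root}");
}
```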
### Merkle Tree in STARKs
A Merkle Tree is utilized to commit to various components in a STARK, specifically the `execution trace`, the `quotient composition polynomial chunks`, and the intermediate stages of the FRI folding rounds. These components are inherently matrix-like structures; the execution trace and quotient composition polynomial chunks typically consist of multiple columns. In a straightforward, unoptimized approach, the prover would create an individual Merkle tree for each column of these matrices. Each tree's root would then be sent to the verifier as the commitment to the corresponding column, along with Merkle paths for the queried entries. However, this method presents a significant challenge in terms of proof size, as the number of Merkle trees scales with the number of columns in the matrices.
To address this scalability issue, a more efficient strategy is employed in production. Rather than creating a separate Merkle tree for each column, a single Merkle tree is constructed for the entire matrix. This is achieved by aggregating all the field elements in each row of the matrix and sequentially hashing them to form a single leaf node of the Merkle tree. This approach dramatically reduces the number of required Merkle trees to just one. A verifier can efficiently check the proof of any specific row by performing a single Merkle path verification, corresponding to the leaf of that row.
![image](https://hackmd.io/_uploads/BkGW-fFYT.png)
In our example, consider a trace matrix with 8 rows and 4 columns. Each row of this matrix is transformed into a leaf of the Merkle Tree by sequentially hashing its values. For example, the leaf at index $2$ is created by hashing the values of row $2$: $(v_{20}, v_{21}, v_{22}, v_{23})$. This is represented as:
$$
l_2 = H(v_{20}, v_{21}, v_{22}, v_{23})
$$
This approach streamlines the verification process significantly. By using a single Merkle path verification, we can efficiently provide a proof of membership of the contents of an entire row at a specific index.
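A minimal sketch of this row-to-leaf step, again with `DefaultHasher` as a stand-in hash and an illustrative function name rather than plonky3's actual API:
```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Hash each row of a row-major trace matrix into one Merkle leaf:
// l_i = H(v_{i0}, v_{i1}, ..., v_{i,width-1}).
fn row_leaves(trace: &[u64], width: usize) -> Vec<u64> {
    trace
        .chunks(width)
        .map(|row| {
            let mut h = DefaultHasher::new();
            // Sequentially absorb every field element of the row.
            for v in row {
                v.hash(&mut h);
            }
            h.finish()
        })
        .collect()
}
```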
## FieldMerkleTree in plonky3
In plonky3, a key idea is to batch-commit to the matrices of multiple STARKs without needing a separate Merkle tree for each. This principle is a cornerstone in the development of [Valida-VM](https://github.com/valida-xyz/valida). The team has devised an efficient method for batch committing to multiple matrices, despite their varying row and column counts.
:::info
One important note is that matrices whose heights round up to the same power of two (or their original height, if already a power of two) must have identical heights. Beyond this, there are no restrictions on their dimensions.
:::
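As a sketch of what this constraint means (the helper `heights_compatible` is hypothetical, not part of plonky3):
```rust
use std::collections::HashMap;

// Matrices whose heights round up to the same power of two must share the
// same height: e.g., heights [64, 64, 24, 24, 8] are fine, but [32, 24] are
// not, since both 32 and 24 round up to 32.
fn heights_compatible(heights: &[usize]) -> bool {
    let mut seen: HashMap<usize, usize> = HashMap::new();
    heights
        .iter()
        .all(|&h| *seen.entry(h.next_power_of_two()).or_insert(h) == h)
}
```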
### Construction of FieldMerkleTree
```rust
pub struct FieldMerkleTree<F: Field, const DIGEST_ELEMS: usize> {
    /// The matrices being committed to.
    pub(crate) leaves: Vec<RowMajorMatrix<F>>,
    /// Every digest layer, from the first (widest) layer up to the single root digest.
    pub(crate) digest_layers: Vec<Vec<[F; DIGEST_ELEMS]>>,
}
```
#### Managing Matrix Heights
The `FieldMerkleTree` struct is designed to handle `RowMajorMatrices` of varying sizes. Matrices are first sorted in descending order by height, ensuring that matrices of equal height are processed together, as sketched below.
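A small sketch of this step, operating on bare heights for brevity (in plonky3 the sort key would be each `RowMajorMatrix`'s height):
```rust
use std::cmp::Reverse;

// Sort tallest-first so matrices of equal height end up adjacent and are
// therefore processed together.
fn sort_heights_descending(heights: &mut [usize]) {
    heights.sort_by_key(|&h| Reverse(h));
}
```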
#### Creating the First Digest Layer
The first step selects the tallest matrices to form the initial digest layer (`first_digest_layer`). For each index in this layer, the rows at that index from all of these matrices are concatenated and then sequentially hashed. For example, with matrices $M1$, $M2$, and $M3$, each of height $64$, the $23^{rd}$ digest is: $$H(concat(M1[22], M2[22], M3[22]))$$ This description simplifies the actual process; in *plonky3* the sequential hashing is vectorized.
![image](https://hackmd.io/_uploads/HyMYXEKF6.png)
In the above example, we have $3$ matrices, each of height $4$ but with a differing number of columns. The first digest layer therefore contains four digests, and each digest, denoted $l_i$, is computed by sequentially hashing the corresponding rows across all three matrices:
$$
l_i = H(concat([x_{i0}, x_{i1}, x_{i2}], [w_{i0}, w_{i1}], [v_{i0}, v_{i1}, v_{i2}, v_{i3}]))
$$
Essentially, each digest is created by sequentially hashing the rows that are at the same position in each matrix, effectively combining their data into a single hash.
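The following sketch captures this step. The `Matrix` type is a hypothetical stand-in for plonky3's `RowMajorMatrix<F>`, and `DefaultHasher` again stands in for the (vectorized) sequential hasher:
```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Hypothetical row-major matrix over u64 "field elements".
pub struct Matrix {
    pub values: Vec<u64>,
    pub width: usize,
}

impl Matrix {
    pub fn height(&self) -> usize {
        self.values.len() / self.width
    }
    pub fn row(&self, i: usize) -> &[u64] {
        &self.values[i * self.width..(i + 1) * self.width]
    }
}

// First digest layer over the tallest matrices:
// l_i = H(concat(M1[i], M2[i], M3[i], ...)).
pub fn first_digest_layer(tallest: &[Matrix]) -> Vec<u64> {
    let height = tallest[0].height();
    (0..height)
        .map(|i| {
            let mut h = DefaultHasher::new();
            // Concatenate row i of every tallest matrix and hash sequentially.
            for m in tallest {
                for v in m.row(i) {
                    v.hash(&mut h);
                }
            }
            h.finish()
        })
        .collect()
}
```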
#### Constructing Subsequent Layers
The next phase builds the subsequent layers, each half the length of the previous one. Sibling digests from the previous layer are compressed first. If any matrices match the current layer's height, their rows are hashed and folded into the result. For instance, with $M4$ and $M5$ at height $32$, the digest at index $12$ (the $13^{th}$) is computed as:
```rust
// First, compress the two sibling digests from the previous layer.
next_layer_digest[12] = compress(prev_layer_digest[24], prev_layer_digest[25])
// Then hash the matching rows of the matrices at this height and fold them in.
tallest_digest = H(concat(M4[12], M5[12]))
next_layer_digest[12] = compress(next_layer_digest[12], tallest_digest)
```
If there are no matrices at the current height, the process is simplified:
```rust
next_layer_digest[12] = compress(prev_layer_digest[24], prev_layer_digest[25])
```
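Putting both cases together, here is a sketch of one round, reusing the hypothetical `Matrix` type and the stand-in `compress` from the earlier sketches (the names are illustrative, not plonky3's API). The bounds check on the matrix height also covers the shorter-matrix case discussed under "Handling Varying Heights" below:
```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// One round of layer construction: halve the previous layer by compressing
// sibling digests, injecting rows of any matrices that belong to this layer.
pub fn next_digest_layer(prev_layer: &[u64], injected: &[Matrix]) -> Vec<u64> {
    (0..prev_layer.len() / 2)
        .map(|i| {
            // Always compress the two children from the previous layer.
            let mut digest = compress(prev_layer[2 * i], prev_layer[2 * i + 1]);
            // Hash the i-th rows of the injected matrices, skipping any matrix
            // that is shorter than the current index.
            let mut h = DefaultHasher::new();
            let mut injected_any = false;
            for m in injected {
                if i < m.height() {
                    injected_any = true;
                    for v in m.row(i) {
                        v.hash(&mut h);
                    }
                }
            }
            if injected_any {
                digest = compress(digest, h.finish());
            }
            digest
        })
        .collect()
}
```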
![image](https://hackmd.io/_uploads/Byycy8FK6.png)
Continuing from the previous example where we constructed the first digest layer using three matrices of height 4, let's now consider a scenario with just one matrix of height 2 in the next iteration. Similar to our earlier approach, this single matrix's rows are sequentially hashed, resulting in a new digest layer comprising two digests.
Simultaneously, the algorithm compresses the first digest layer to create intermediate digests. These intermediate digests are then further compressed with the newly created digests from the sequential hashing of the matrix with height 2. This dual process of compression and integration forms the final digest layer for this iteration, effectively capturing the essence of both layers in a consolidated form.
If there are no matrices present in a given round, the algorithm simply carries the intermediate digests forward, adopting them as the final digest layer for that round.
![image](https://hackmd.io/_uploads/ryM08UFF6.png)
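Finally, a sketch of the driver loop tying the pieces together, reusing `Matrix`, `first_digest_layer`, and `next_digest_layer` from the earlier sketches, and assuming for simplicity that the tallest height is already a power of two (plonky3's actual construction is vectorized and handles padding more carefully):
```rust
use std::cmp::Reverse;

// Hash the tallest matrices into the first layer, then repeatedly halve it,
// injecting matrices whose padded height matches the next layer's length,
// until a single root digest remains.
pub fn build_digest_layers(mut matrices: Vec<Matrix>) -> Vec<Vec<u64>> {
    // Tallest-first, so matrices of equal height sit next to each other.
    matrices.sort_by_key(|m| Reverse(m.height()));
    let max_height = matrices[0].height();
    let first_shorter = matrices
        .iter()
        .position(|m| m.height() < max_height)
        .unwrap_or(matrices.len());
    let shorter = matrices.split_off(first_shorter);
    let mut remaining: &[Matrix] = &shorter;

    // First layer: sequentially hash rows across all tallest matrices.
    let mut layers = vec![first_digest_layer(&matrices)];
    while layers.last().unwrap().len() > 1 {
        let next_len = layers.last().unwrap().len() / 2;
        // Matrices whose height rounds up to this layer's length get injected.
        let n = remaining
            .iter()
            .take_while(|m| m.height().next_power_of_two() == next_len)
            .count();
        let (inject, rest) = remaining.split_at(n);
        remaining = rest;
        let next = next_digest_layer(layers.last().unwrap(), inject);
        layers.push(next);
    }
    layers
}
```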
#### Handling Varying Heights
A matrix can be shorter than the layer it is injected into; for example, a matrix of height $24$ is injected into a layer of length $32$, since $24$ rounds up to $32$. Its rows are injected only at indices below its height; for indices at or beyond it, the previous layer's digests are simply compressed on their own, exactly as the bounds check in the sketch above does.
### Opening Proof