## Merkle Proofs in SSZ-QL
SSZ-QL needs of Merkle Proof so as to cryptographically verify the provided data. A Merkle Prover relying on SSZ Info data is very powerful and versatile tool. With this in mind, a recursive Merkle Prover has successfully been implemented.
For this Generic Merkle Prover many different approaches have been researched. All of them have different benefits/drawbacks that are to be explain in this write-up.
### ❌ Approach 1: Fastssz's wrapper + merkle tree build + proof collection
This was the first considered approach. By using the already battle tested Fastssz's wrapper to wrapper all the leaves in the Merkle Tree we were just one step far from building the entire Merkle Tree and another step far from producing the (multi)proofs.
This approach was implemented in this [PR#16121 SSZ-QL: merkle tree using fastssz wrapper and Ssz Info object](https://github.com/OffchainLabs/prysm/pull/16121).
All this implementation relies on building the Merkle Tree of a given `SszInfo` object with the from `SszInfo.source`. To do so, there is a new method added to `SszInfo` called [`MerkleTree()`](https://github.com/fernantho/prysm/blob/bd05f8d0b0afa4624c7d4fd0eed0e26f4c2330a8/encoding/ssz/query/merkle_tree.go#L14).
This `MerkleTree()` calls the `func buildTree(info *SszInfo, v reflect.Value, w *ssz.Wrapper) error` which is the general dispatcher to traverse the whole `SszInfo` object while building the tree.
For each different `sszType`, there is a different helper function that allows as to build those specific leaves and other helpers to pack basic elements, add padding and little endian encoding.
The implementation is pretty straightforward. The outcomes are terribly bad ❌
```bash
❯ go test -v -run 'TestQueryBeaconState_withProof' ./beacon-chain/rpc/prysm/beacon/... -timeout 300s -count 1 2>&1 | head -30
=== RUN TestQueryBeaconState_withProof
ssz_query_test.go:153: Loaded state in 265.865708ms
=== RUN TestQueryBeaconState_withProof/.slot
signal: killed
FAIL github.com/OffchainLabs/prysm/v7/beacon-chain/rpc/prysm/beacon 21.665s
FAIL
```
It just get killed anytime we try to build the Merkle Tree. The reason is clear. It tries to build the *total entire absolute intact integral* Merkle Tree which is a very gross approach. Honestly, this finding made me rethought everything and this is what it made me confused about not assessing properly the (in)famous `st, _ := util.DeterministicGenesisState(t, N)` "where N > 100.000".
At this point I **discarded the 2-step solution** (building the whole tree and collecting the proofs) and started working the on the second approach. The main issue with building the whole Merkle Tree are in the list where the `limit` is very large and the real element used are very little compared to the `limit` e.g. `validators` contains around ~1M validators in mainnet (~1.2M validators in Hoodi), but the [`VALIDATOR_REGISTRY_LIMIT`](https://github.com/ethereum/consensus-specs/blob/dev/specs/phase0/beacon-chain.md#state-list-lengths) is `uint64(2**40) (= 1,099,511,627,776)`.
With first approach, when we built the tree, [there is a call](https://github.com/fernantho/prysm/blob/bd05f8d0b0afa4624c7d4fd0eed0e26f4c2330a8/encoding/ssz/query/merkle_tree.go#L159):
`w.CommitWithMixin(start, length, int(nextPowerOfTwo(uint64(limit))))`
From this `CommitWithMixin` there is another call to:
```golang
func TreeFromNodesWithMixin(leaves []*Node, num, limit int) (*Node, error) {
numLeaves := len(leaves)
if !isPowerOfTwo(limit) {
return nil, errors.New("size of tree should be a power of 2")
}
allLeaves := make([]*Node, limit)
emptyLeaf := NewNodeWithValue(make([]byte, 32))
for i := 0; i < limit; i++ {
if i < numLeaves {
allLeaves[i] = leaves[i]
} else {
allLeaves[i] = emptyLeaf
}
}
mainTree, err := TreeFromNodes(allLeaves)
if err != nil {
return nil, err
}
// Mixin len
countLeaf := LeafFromUint64(uint64(num))
return NewNodeWithLR(mainTree, countLeaf), nil
}
```
And this is the highly inefficient part of the tree building as it pads all the empty leaves.
### ✅ Approach 1B: Fastssz's wrapper fork + merkle tree build + proof collection
At a final stage, I tried to improve this approach. Building the whole Merkle Tree is not ideal and by the end of the second approach, I was used to using `trie.Zerohashes`. With that knowledge, I forked fastssz and [implemented a new method in the wrapper](https://github.com/fernantho/fastssz/tree/sparse-tree-from-nodes):
```golang
func (w *Wrapper) SparseCommitWithMixin(i, num, limit int) {
res, err := SparseTreeFromNodesWithMixin(w.nodes[i:], num, limit)
if err != nil {
panic(err)
}
// remove the old nodes
w.nodes = w.nodes[:i]
// add the new node
w.AddNode(res)
}
```
The `SparseTreeFromNodesWithMixin` function points to [`buildSparseTree`](https://github.com/fernantho/fastssz/blob/b32813952e703514f38120dbe3cc73ecfd702b9f/tree.go#L381) which is in charge of building the tree benefiting from the precalculated zerohashes.
In the end, by not building the entire merkle tree, the improvements are massive:
- it does not get killed by running out of memory.
- proof production takes between 3-4 seconds.
```bash
❯ go test -v -run 'TestQueryBeaconState_withProof' ./beacon-chain/rpc/prysm/beacon/... -timeout 300s -count 1 2>&1 | head -30
=== RUN TestQueryBeaconState_withProof
ssz_query_test.go:153: Loaded state in 280.041125ms
=== RUN TestQueryBeaconState_withProof/.slot
=== RUN TestQueryBeaconState_withProof/.latest_block_header
=== RUN TestQueryBeaconState_withProof/.validators
=== RUN TestQueryBeaconState_withProof/.validators[0]
=== RUN TestQueryBeaconState_withProof/.validators[0].withdrawal_credentials
=== RUN TestQueryBeaconState_withProof/.validators[0].effective_balance
--- PASS: TestQueryBeaconState_withProof (21.57s)
--- PASS: TestQueryBeaconState_withProof/.slot (3.36s)
--- PASS: TestQueryBeaconState_withProof/.latest_block_header (3.51s)
--- PASS: TestQueryBeaconState_withProof/.validators (3.80s)
--- PASS: TestQueryBeaconState_withProof/.validators[0] (3.38s)
--- PASS: TestQueryBeaconState_withProof/.validators[0].withdrawal_credentials (3.37s)
--- PASS: TestQueryBeaconState_withProof/.validators[0].effective_balance (3.29s)
PASS
```
### Approach 2: recursively building the Merkle Tree and collecting the hashes in one sweep
The key of this approach is to building the Merkle Tree and collecting the hashes based on the siblings' generalized indices.
There have been several refactors before coming up with this final implementation:
1. at the beginning, whenever there was a hash there was an assessment to infer if that hash belong to the Merkle Proof or not. This assessment involved a tone of looping from the `targetIndex` up to the root.
2. in a second step, siblings' generalized indices where precalculated at the beginning of the `Prove` function and updated the `ProofCollector` object:
```golang
type ProofCollector struct {
// Collected hashes
siblings map[uint64][32]byte
leaves map[uint64][32]byte
}
```
This `ProofCollector` worked fine but it was kind of simple. It contained just two generalized index key to hash value, one for siblings and another one for leaves.
3. the last version is protected for the concurrency. It has a `mutex` and two different mappings for siblings and two for leaves:
```golang
type ProofCollector struct {
mu sync.Mutex
// Required gindices (registered before merkleization)
requiredSiblings map[uint64]struct{}
requiredLeaves map[uint64]struct{}
// Collected hashes
siblings map[uint64][32]byte
leaves map[uint64][32]byte
}
```
`requiredSiblings` and `requiredLeaves` are fields initialized at the beginning and used for reading while traversing the whole Merkle Tree. SAFE
`siblings` and `leafs` are the fields that hold the proof hashes linked to their generalized index.
In the current implementation, the proof collector is created in [`Prove` function](https://github.com/fernantho/prysm/blob/00273a643124f577fb49dad2a9f6d5d05fe4de7e/encoding/ssz/query/merkle_proof.go#L17) which is the entrypoint for Merkle Proof production.
```golang
func (info *SszInfo) Prove(gindex uint64) (*fastssz.Proof, error) {
if info == nil {
return nil, fmt.Errorf("nil SszInfo")
}
collector := NewProofCollector()
collector.registerRequiredSiblings(gindex)
// info.source is guaranteed to be valid and dereferenced by AnalyzeObject
v := reflect.ValueOf(info.source).Elem()
if _, err := collector.merkleize(info, v, 1); err != nil {
return nil, err
}
return collector.toProof(), nil
}
```
Opposed to approach 1 where we built the Merkle Tree recursively, in this approach 2 we do all the (merkleization+proof collection) recursively. To do so, there is a general dispatcher [`merkleize` function](https://github.com/fernantho/prysm/blob/00273a643124f577fb49dad2a9f6d5d05fe4de7e/encoding/ssz/query/proof_collector.go#L166) which redirect the execution flow to any of the helper functions based on the `sszType`:
```golang
func (pc *ProofCollector) merkleize(info *SszInfo, v reflect.Value, currentGindex uint64) ([32]byte, error) {
if info.sszType.isBasic() {
return pc.merkleizeBasicType(info.sszType, v, currentGindex)
}
switch info.sszType {
case Container:
return pc.merkleizeContainer(info, v, currentGindex)
case List:
return pc.merkleizeList(info, v, currentGindex)
case Vector:
return pc.merkleizeVector(info, v, currentGindex)
case Bitlist:
return pc.merkleizeBitlist(info, v, currentGindex)
case Bitvector:
return pc.merkleizeBitvector(info, v, currentGindex)
default:
return [32]byte{}, fmt.Errorf("unsupported SSZ type: %v", info.sszType)
}
}
```
I'd highlight two methods from the current implementation:
- [`MerkleizeVectorAndCollect`](https://github.com/fernantho/prysm/blob/00273a643124f577fb49dad2a9f6d5d05fe4de7e/encoding/ssz/query/proof_collector.go#L587): this one replicates the `MerkleizeVector` from [Prysm](https://github.com/fernantho/prysm/blob/00273a643124f577fb49dad2a9f6d5d05fe4de7e/encoding/ssz/merkleize.go#L138) with the addition of the Proofs collector.
- [`MerkleizeVectorBody`](https://github.com/fernantho/prysm/blob/00273a643124f577fb49dad2a9f6d5d05fe4de7e/encoding/ssz/query/proof_collector.go#L296): this one introduces all the parallelism that make the major improvements.
#### Outcomes
- **Approach 2 without parallelism**:
```bash
❯ go test -v -run 'TestQueryBeaconState_withProof' ./beacon-chain/rpc/prysm/beacon/... -timeout 8000s
=== RUN TestQueryBeaconState_withProof
ssz_query_test.go:151: Loaded state in 252.138291ms
=== RUN TestQueryBeaconState_withProof/.slot
ssz_query_test.go:243: SSZ Query with proof for path '.slot' completed in 2.04800675s
=== RUN TestQueryBeaconState_withProof/.latest_block_header
ssz_query_test.go:243: SSZ Query with proof for path '.latest_block_header' completed in 2.26211225s
=== RUN TestQueryBeaconState_withProof/.validators
ssz_query_test.go:243: SSZ Query with proof for path '.validators' completed in 2.384333875s
=== RUN TestQueryBeaconState_withProof/.validators[0]
ssz_query_test.go:243: SSZ Query with proof for path '.validators[0]' completed in 1.908323375s
=== RUN TestQueryBeaconState_withProof/.validators[0].withdrawal_credentials
ssz_query_test.go:243: SSZ Query with proof for path '.validators[0].withdrawal_credentials' completed in 1.914882125s
=== RUN TestQueryBeaconState_withProof/.validators[0].effective_balance
ssz_query_test.go:243: SSZ Query with proof for path '.validators[0].effective_balance' completed in 1.927740459s
--- PASS: TestQueryBeaconState_withProof (13.53s)
--- PASS: TestQueryBeaconState_withProof/.slot (2.05s)
--- PASS: TestQueryBeaconState_withProof/.latest_block_header (2.26s)
--- PASS: TestQueryBeaconState_withProof/.validators (2.41s)
--- PASS: TestQueryBeaconState_withProof/.validators[0] (1.91s)
--- PASS: TestQueryBeaconState_withProof/.validators[0].withdrawal_credentials (1.91s)
--- PASS: TestQueryBeaconState_withProof/.validators[0].effective_balance (1.93s)
PASS
ok github.com/OffchainLabs/prysm/v7/beacon-chain/rpc/prysm/beacon 13.999s
```
- **Approach 2 with parallelism**
```bash
❯ go test -v -run 'TestQueryBeaconState' ./beacon-chain/rpc/prysm/beacon/... -timeout 300s -count 1 2>&1 | head -30
=== RUN TestQueryBeaconState
ssz_query_test.go:47: Loaded state in 275.357542ms
=== RUN TestQueryBeaconState/.slot
=== RUN TestQueryBeaconState/.latest_block_header
=== RUN TestQueryBeaconState/.validators
=== RUN TestQueryBeaconState/len(validators)
=== RUN TestQueryBeaconState/.validators[0]
=== RUN TestQueryBeaconState/.validators[0].withdrawal_credentials
=== RUN TestQueryBeaconState/.validators[0].effective_balance
--- PASS: TestQueryBeaconState (9.17s)
--- PASS: TestQueryBeaconState/.slot (1.15s)
--- PASS: TestQueryBeaconState/.latest_block_header (1.15s)
--- PASS: TestQueryBeaconState/.validators (1.19s)
--- PASS: TestQueryBeaconState/len(validators) (1.20s)
--- PASS: TestQueryBeaconState/.validators[0] (1.17s)
--- PASS: TestQueryBeaconState/.validators[0].withdrawal_credentials (1.18s)
--- PASS: TestQueryBeaconState/.validators[0].effective_balance (1.16s)
```