## Merkle proofs Prysm & Fastssz & Dynssz For the generation of Merkle Proofs, the first approach has been to use Fastssz Libraries. To implement this fast, I've followed Dynssz [approach](https://github.com/pk910/dynamic-ssz/pull/45). #### TLDR Two approaches: - "merkleize" any SSZ type, record the nodes and issue the merkle proofs (fastssz and dynssz). - extend Prysm beacon state merkle proofs to a more generic approach, reducing extra computation. ### Prysm proofs In Prysm repo, there is a myriad of helpers to compute the Merkle Tree. In relation to the `BeaconState` object, `stateutil` package provides some helper functions like: - `initializeMerkleLayers` - `recomputeDirtyFields` - `MerkleizeTrieLeaves` - (...) These functions are used (directly or indirectly) for computing merkle proofs such as: - `CurrentSyncCommitteeProof` - `NextSyncCommitteeProof` - `FinalizedRootProof` These functions are for getting proofs of top-level fields because it relies on [`types.FieldIndex.RealPosition()`](https://github.com/OffchainLabs/prysm/blob/b2a9db0826ffba64193bdc3bdc97c4603f276931/beacon-chain/state/state-native/types/types.go#L124). As soon as the path descends into a list or a nested container (validators → validator → withdrawal_credentials) the “real position” trick breaks. Some progress was made by @jun's [PR](https://github.com/OffchainLabs/prysm/pull/15443/files#diff-0997947eabffab158f6800c26de0740b974d74d25d0ac7cf29dc25fbdc5bfffc). In here, it is provided a general purpose top-level field prover. This code has been studied deeply as it could be the root for this new developments. As aforementioned, a step forward is needed so as to compute merkle proofs of any arbitrary path. Prysm's beacon state object also contains field that contains the top-level field roots and fields that track whether certain fields has to be recomputed or not: ```diff type BeaconState struct { version int genesisTime uint64 genesisValidatorsRoot [32]byte slot primitives.Slot fork *ethpb.Fork (...) id uint64 lock sync.RWMutex + dirtyFields map[types.FieldIndex]bool + dirtyIndices map[types.FieldIndex][]uint64 + stateFieldLeaves map[types.FieldIndex]*fieldtrie.FieldTrie + rebuildTrie map[types.FieldIndex]bool valMapHandler *stateutil.ValidatorMapHandler + merkleLayers [][][]byte sharedFieldReferences map[types.FieldIndex]*stateutil.Reference } ``` In this code snippet I just highlighted the most relevant fields in relation to a generic prover: - `dirtyFields` and `dirtyIndices` are useful for re-computing only the fields that have changed and keep a dirty/outdated value. - `rebuildTrie`: marks which fields are needs to be rebuilt. - `stateFieldLeaves`: contains the corresponding trie for each field. - `merkleLayers`: merkle layers from the top-level fields up to the root. To build the proofs this way, we need a several helpers such as: - a generalized index rebaser: the generalized index computed from the path are an absolute value. Accessing each nested field would need to - hash recorder: to build the final merkle proofs walking the `merkleLayers` and the specific `stateFieldLeaves` [State-native approach](https://github.com/fernantho/prysm/blob/f019749b31f6fdcb4028ee13bc3663ed6ddd5a2c/beacon-chain/state/state-native/proofs.go#L172-L173): This is an attempt to build the proofs for any given element of the `BeaconState`. The core logic is behind these function: ```golang // ProofsByGeneralizedIndices constructs a multi-proof for the provided generalized indices. func (b *BeaconState) ProofsByGeneralizedIndices(ctx context.Context, gindices []uint64) (types.MultiProof, error) { var proof types.MultiProof // Input validation. if len(gindices) == 0 { return proof, fmt.Errorf("no generalized indices provided") } // Lock the beacon state for the duration of the proof generation. b.lock.Lock() defer b.lock.Unlock() // Get the fields included in the current version of the beacon state. fields := fieldsForVersion(b.version) if len(fields) == 0 { return proof, fmt.Errorf("unsupported state version %d", b.version) } // Precompute values involved in the proof generation fieldCount := uint64(len(fields)) // Top-level fields depth in the Merkle Tree. // depthTop := ssz.Depth(fieldCount) base := math.PowerOf2(fieldCount) // Compute top-level fields proofs first. // 1. Initialize merkle layers. if err := b.initializeMerkleLayers(ctx); err != nil { return proof, err } // 2. Recompute dirty fields. if err := b.recomputeDirtyFields(ctx); err != nil { return proof, err } proof.Indices = make([]uint64, 0, len(gindices)) proof.Leaves = make([][]byte, 0, len(gindices)) // seenProofs := make(map[string]struct{}) orderedProofs := make([][]byte, 0) for _, gi := range gindices { parentField, err := b.parentFieldFromGIndex(gi) if err != nil { return types.MultiProof{}, err } // Determine if the generalized index points to a top-level field // or a nested field. GI MUST always points to a terminal node. if gi <= base { // top-level field computedProof, err := b.proofByFieldIndex(ctx, parentField) if err != nil { return types.MultiProof{}, err } proof.Proofs = append(proof.Proofs, computedProof...) proof.Indices = append(proof.Indices, gi) leaves, err := b.ValueByFieldIndex(parentField) if err != nil { return types.MultiProof{}, err } proof.Leaves = append(proof.Leaves, leaves.([]byte)) } else { // 1. recompute the field trie b.stateFieldLeaves[parentField].RecomputeTrie([]uint64{gi}, parentField) // 2. generate the proof from the field trie fieldTrieLayers := b.stateFieldLeaves[parentField].GetFieldLayers() fmt.Printf("the fields layers are: %+v\n", fieldTrieLayers) // proof.Proofs = append(proof.Proofs, computedProof...) proof.Indices = append(proof.Indices, gi) leaves, err := b.ValueByFieldIndex(parentField) if err != nil { return types.MultiProof{}, err } proof.Leaves = append(proof.Leaves, leaves.([]byte)) } } proof.Proofs = orderedProofs return proof, nil } ``` There are some flaws that would need further thinking. This analysis is only for the BeaconState. For the BeaconBlock, more developments are needed as there are no such fields as `merkleLayers` and `stateFieldLeaves` ### Fastssz and Dynssz proofs On the one hand, Fastssz provides a general purpose prover for any field within any different SSZ objects. It basically provides the function `HashTreeRootWith` to "walk" the object while "recording" the different nodes. The problem here is that to enable Fastssz approach, we would need to re-generate Prysm's CL SSZ types in order to add a `HashTreeRootWith` we can pass a merkle tree node recorder. This would allow us to easily issue proofs afterwards. But, personally, I find this very pervasive into the whole Prysm client. On the contrary, Dynssz provides the same features as Fastssz with some extended functionality. In this case, it is very inspiring how to dynamically build the merkle trees prior to issue the proofs. [This recent PR](https://github.com/pk910/dynamic-ssz/pull/45) achieves this. The following [code snippet](https://github.com/pk910/dynamic-ssz/blob/2de54585126ade1361746c7229ad2291b9d81453/treeroot.go#L103-L118 ) shows the general idea of a recursive meklre tree computation: ```golang switch sourceType.SszType { case SszTypeWrapperType: err := d.buildRootFromTypeWrapper(sourceType, sourceValue, hh, pack, idt) if err != nil { return err } case SszContainerType: err := d.buildRootFromContainer(sourceType, sourceValue, hh, idt) if err != nil { return err } case SszProgressiveContainerType: err := d.buildRootFromProgressiveContainer(sourceType, sourceValue, hh, idt) if err != nil { return err } case SszVectorType, SszBitvectorType: err := d.buildRootFromVector(sourceType, sourceValue, hh, idt) if err != nil { return err } case SszListType, SszProgressiveListType: err := d.buildRootFromList(sourceType, sourceValue, hh, idt) if err != nil { return err } case SszBitlistType, SszProgressiveBitlistType: err := d.buildRootFromBitlist(sourceType, sourceValue, hh, idt) if err != nil { return err } case SszCompatibleUnionType: err := d.buildRootFromCompatibleUnion(sourceType, sourceValue, hh, idt) if err != nil { return err } (...) ``` Each type builds each own merkle root while recording each of the node by using a custom hasher. In our case, we can proceed in a similar manner as we already have analyzed the SSZ object providing us with the SSZInfo object that allows us to "walk" any given SSZ object. ## Conclusion Although the first approach using BeaconState `merkleLayers` and `stateFieldLeaves` is more performant as it reduces the need of hashing, I have opted for the more generic approach as a first step. Having the recursive merkle tree builder with the recorder hasher allows us to achieve the **"Generic Merkle Proofs"** for any given SSZ object.