# The Native Beacon State

## Objectives

We want to improve the execution speed and memory usage of various functions that deal with the beacon state, in particular those that make copies of the state. Benchmarking code from https://gist.github.com/rkapka/486daa18f440262c45a9e88c00b17ce1 shows that copying a native Go struct with fixed-size arrays is 21% faster and uses 29% less memory than copying a protocol buffers state with slices (a more detailed analysis can be found in https://github.com/prysmaticlabs/prysm/issues/8785#issuecomment-932176575). As such, we want to use a pure Go struct whenever possible and resort to a protocol buffers state representation only when necessary, such as when performing gRPC calls.

## Tasks

Here is a list of items that need to be completed to make the transition to a native beacon state:

- `BeaconState` definition
- Ability to convert between native and proto versions of the beacon state
- Replacement of proto beacon state with the native beacon state wherever possible
- Cleanup

Note that there are actually two `BeaconState` types, one for `v1` / `Phase 0` and one for `v2` / `Altair`. The same procedures can be followed in both cases.

### `BeaconState` definition

We need something similar to the structures defined in https://gist.github.com/rkapka/486daa18f440262c45a9e88c00b17ce1. The current `BeaconState` is defined as

```
type BeaconState struct {
	state                 *ethpb.BeaconState
	lock                  sync.RWMutex
	dirtyFields           map[types.FieldIndex]bool
	dirtyIndices          map[types.FieldIndex][]uint64
	stateFieldLeaves      map[types.FieldIndex]*fieldtrie.FieldTrie
	rebuildTrie           map[types.FieldIndex]bool
	valMapHandler         *stateutil.ValidatorMapHandler
	merkleLayers          [][][]byte
	sharedFieldReferences map[types.FieldIndex]*stateutil.Reference
}
```

It wraps a proto state in its `state` field and defines additional fields. All getters and setters internally use the proto state to get or set particular values.

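
As a rough illustration of that access pattern, the sketch below shows a getter and a setter that go through the wrapped proto state. `protoState`, `Slot()` and `SetSlot()` are simplified stand-ins chosen for this example, not the actual Prysm definitions, and the struct keeps only the two fields needed to show the idea.

```go
package main

import (
	"fmt"
	"sync"
)

// protoState is a hypothetical stand-in for *ethpb.BeaconState.
type protoState struct {
	Slot uint64
}

// BeaconState keeps only the fields needed for this sketch.
type BeaconState struct {
	state *protoState
	lock  sync.RWMutex
}

// Slot reads the value from the wrapped proto state under a read lock.
func (b *BeaconState) Slot() uint64 {
	b.lock.RLock()
	defer b.lock.RUnlock()
	return b.state.Slot
}

// SetSlot writes the value to the wrapped proto state under a write lock.
func (b *BeaconState) SetSlot(slot uint64) {
	b.lock.Lock()
	defer b.lock.Unlock()
	b.state.Slot = slot
}

func main() {
	b := &BeaconState{state: &protoState{}}
	b.SetSlot(42)
	fmt.Println(b.Slot())
}
```

Every access in this sketch goes through `b.state`; moving to a native state would mean such methods read and write fields defined directly on `BeaconState` instead.
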
What we should do is replace the wrapped `*ethpb.BeaconState` with the set of fields that are currently accessed indirectly through the proto state. Some of these fields are proto structs themselves, which means we will have to define native structs for them too, recursively all the way down. The main takeaway is that we will not create a new `BeaconState` type, but we will amend the existing one.

*Q: Duplicate structs such as `Fork` or reuse them between `v1` and `v2`?*

Because protocol buffers do not support fixed-size arrays, we currently have to use slices for all array fields. As an example, we have a `BlockRoots [][]byte` field in the state, although it is known from the official specification that this slice holds a `[8192][32]byte` value underneath. We want our state to have fixed-size arrays whenever possible. Arrays perform better than slices because the latter add a layer of indirection. As stated at the beginning of this document, switching to fixed-size arrays made copying beacon states both faster and less memory-intensive.

*Q: How does https://stackoverflow.com/questions/30525184/array-vs-slice-accessing-speed fit into the increased performance statement?*

Note that it will not be enough to change the field types of the beacon state. There are also getters, setters and other beacon state functions that will have to change. One example is the `Copy()` function, which returns a copy of the original beacon state. Internally it performs safe copies of slices, but we will have to define equivalent safe-copy versions for particular array sizes.

### Ability to convert between native and proto versions of the beacon state

Even though it would be best to get rid of the proto beacon state entirely, there are scenarios where this is not possible. One prominent example is gRPC calls, which use proto structs by definition.

The official Ethereum API specification includes the [GetStateV2 endpoint](https://ethereum.github.io/beacon-APIs/?urls.primaryName=dev#/Debug/getStateV2) which returns a full beacon state object. Although the endpoint itself returns a JSON or SSZ representation, not a protocol buffers object, we use the grpc-gateway library internally for gRPC <--> JSON communication. The library requires a proto beacon state for this endpoint to function properly.

The easiest way to handle such issues is to write custom conversion code between both representations. These helper functions should live next to the proto definitions.
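
A minimal sketch of what such conversion helpers could look like is shown below. All names (`protoState`, `nativeState`, `protoToNative`, `nativeToProto`) are hypothetical and exist only for this example, and the structs carry just two fields. The `[8192][32]byte` shape of `BlockRoots` is taken from the description earlier in this document.

```go
package main

import (
	"errors"
	"fmt"
)

const (
	rootLength       = 32
	blockRootsLength = 8192
)

// protoState is a hypothetical stand-in for the proto beacon state.
type protoState struct {
	Slot       uint64
	BlockRoots [][]byte
}

// nativeState is a hypothetical stand-in for the native beacon state.
type nativeState struct {
	Slot       uint64
	BlockRoots [blockRootsLength][rootLength]byte
}

// protoToNative validates slice lengths and copies the bytes into arrays.
func protoToNative(p *protoState) (*nativeState, error) {
	if p == nil {
		return nil, errors.New("nil proto state")
	}
	if len(p.BlockRoots) != blockRootsLength {
		return nil, fmt.Errorf("block roots: got %d entries, want %d",
			len(p.BlockRoots), blockRootsLength)
	}
	n := &nativeState{Slot: p.Slot}
	for i, root := range p.BlockRoots {
		if len(root) != rootLength {
			return nil, fmt.Errorf("block root %d: got %d bytes, want %d",
				i, len(root), rootLength)
		}
		copy(n.BlockRoots[i][:], root)
	}
	return n, nil
}

// nativeToProto copies each array into a freshly allocated slice, so the
// returned proto state shares no memory with the native state.
func nativeToProto(n *nativeState) *protoState {
	roots := make([][]byte, blockRootsLength)
	for i := range n.BlockRoots {
		roots[i] = make([]byte, rootLength)
		copy(roots[i], n.BlockRoots[i][:])
	}
	return &protoState{Slot: n.Slot, BlockRoots: roots}
}

func main() {
	p := &protoState{Slot: 7, BlockRoots: make([][]byte, blockRootsLength)}
	for i := range p.BlockRoots {
		p.BlockRoots[i] = make([]byte, rootLength)
	}
	n, err := protoToNative(p)
	if err != nil {
		panic(err)
	}
	back := nativeToProto(n)
	fmt.Println(back.Slot, len(back.BlockRoots))
}
```

The proto-to-native direction returns an error because slices carry no length guarantee, whereas the native-to-proto direction cannot fail on length grounds since the array sizes are fixed at compile time.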