Diederik Loerakker
# Eth2 Phase0 client implementation - from scratch

---------------

**Contribution must read**

Approach: start tackling each topic one by one. Others can assign themselves to certain sub-topics and help write this.

**Contribution rules**:
- Tag your work by adding your name under the title of the respective section.
- If you take a section that someone is already working on, *communicate with them*.
- If you would like to write something anyway, don't modify the previous work; write your own after the original instead.
- If you do not tag your work, expect it to be modified by others unknowingly.

**LICENSE**: by contributing here, you agree that your output is licensed as **CC0 1.0 Universal**, like the eth2.0 specs themselves. If you mix in work that is licensed differently, you either have to communicate it and make it very clear, or your contribution may be removed.

**FORMAT**:
- Sections are named by their contents.
- Sections or subsections are tagged, after the title, with the expected reader level.
- Valid levels:
  - **"General"**: for beginners, but not necessarily "noob", just out of the loop.
  - **"Intermediate"**: for those familiar with why/what, but looking to implement it.
  - **"Advanced"**: for those that want a deep understanding.
  - **"Experimental"**: for those that want to go beyond status quo, maybe even improve the specification.

**Maintainers / Contributors**

*Add yourself here if you contributed in some way*

Maintainers:
- @protolambda

Contributors:
- @protolambda
- @leobago
- @wemeetagain
- you?

---------------

## Table of Contents

[TOC]

---------------

## Client architecture

An eth2.0 client generally consists of two main components:
- Beacon node: syncs and verifies the Beacon-chain, and provides the Validator with consensus state.
- Validator node: produces new data to put onto the Beacon-chain.

Phase1 extends this with similar responsibilities for shards.
Phase2 is experimental, but would extend this further with nodes capable of processing shards beyond basic data-storage functionality.

#### Beacon node responsibilities

#### Validator node responsibilities

---------------

## Encoding and merkleization stack: SSZ

Simple Serialize (SSZ) is Eth2's type system. All Eth2 datastructures use SSZ-defined types that follow the SSZ standard for _efficient_ [serialization](https://en.wikipedia.org/wiki/Serialization) and _consistent_ [merkleization](https://en.wikipedia.org/wiki/Merkle_tree).

SSZ defines basic types
- *unsigned integer*
  - uint8, uint16, uint32, uint64, uint128, uint256
- *boolean*

as well as composite types
- *vector*
  - fixed-length array
  - has elements of a single type
- *list*
  - variable-length array with a max-length
  - has elements of a single type
- *container*
  - ordered heterogeneous collection of values

Special packed encoding is given to a *vector* and *list* of *boolean*, aliased *bitvector* and *bitlist*.

### hash-tree-root, with binary trees

One primary operation on SSZ datastructures is `hashTreeRoot`, the retrieval of the "root hash" (root) of the merkleized data.

A root of data is used extensively throughout Eth2 as:
1. a unique identifier of the underlying data
2. a compact, fixed-size stand-in for the underlying data

All SSZ-defined data can be unambiguously represented as a binary merkle tree, using sha256 as the hash function. This unambiguous representation gives us a tool to gain consensus about potentially large datasets by simply agreeing on a root. (See [Data-sharing and caching as first-class citizen](#Data-sharing-and-caching-as-first-class-citizen) on the advantages / tradeoffs of persisting the merkle tree as a representation of SSZ-defined data)

One key insight is that composite SSZ types may be viewed as merkle trees composed of merkle trees. Many Eth2 datastructures have elements that are roots of historical versions of structures, e.g. links to previous blocks or states.
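
The binary merkleization described in this section can be sketched as follows. This is a minimal illustration, assuming the data has already been split into 32-byte chunks; the helper names (`hash_pair`, `merkle_root`, `uint64_chunk`) are hypothetical and not taken from any client. The real `hashTreeRoot` also handles type-specific packing, list limits and length mix-ins, which are left out here.

```python
import hashlib
from typing import Sequence

BYTES_PER_CHUNK = 32
ZERO_CHUNK = b"\x00" * BYTES_PER_CHUNK


def hash_pair(left: bytes, right: bytes) -> bytes:
    """Parent node of two 32-byte children: sha256(left || right)."""
    return hashlib.sha256(left + right).digest()


def merkle_root(chunks: Sequence[bytes]) -> bytes:
    """Root of a binary merkle tree over 32-byte chunks.

    The bottom layer is padded with zero chunks up to the next power of two.
    """
    if any(len(chunk) != BYTES_PER_CHUNK for chunk in chunks):
        raise ValueError("every chunk must be exactly 32 bytes")
    size = 1
    while size < len(chunks):
        size *= 2
    layer = list(chunks) + [ZERO_CHUNK] * (size - len(chunks))
    # Hash pairs of nodes, layer by layer, until a single root remains.
    while len(layer) > 1:
        layer = [hash_pair(layer[i], layer[i + 1]) for i in range(0, len(layer), 2)]
    return layer[0]


def uint64_chunk(value: int) -> bytes:
    """A uint64 as a leaf: 8 little-endian bytes, right-padded to 32 bytes."""
    return value.to_bytes(8, "little").ljust(BYTES_PER_CHUNK, b"\x00")


# "Merkle trees composed of merkle trees": the root of an inner structure
# is used as a single leaf of the outer structure.
inner_root = merkle_root([uint64_chunk(1), uint64_chunk(2)])
outer_root = merkle_root([uint64_chunk(7), inner_root])
```

Because `inner_root` is just another 32-byte leaf, anyone holding `outer_root` can later be given the inner tree and check that it hashes to the same value.
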
A root of data is a stand-in for the full datastructure, as a root is simply the top node in a merkle tree representation. Large Eth2 SSZ types are generally heterogeneous structures which can be recursively expanded into deeper and deeper trees that traverse further into history and deeper into large, summarized datasets. Light client protocols make extensive use of this ability to expand roots into trees. (See [Block-header and Block: Hash-tree-root transparency](#Block-header-and-Block-Hash-tree-root-transparency) for more)

### (de)serialization, with offsets

### SSZ limit type data for encoding and consensus safety

### Block-header and Block: Hash-tree-root transparency

### Data-sharing and caching as first-class citizen

### Layered approach, typing abstracting away backings

#### Tree-backing encodings

#### Light client functionality, SSZ Partials

---------------

## Network stack

### Discovery/bootstrapping

(@leobago - General)

When a node tries to join the p2p network, it first tries to contact a **bootnode**: a node maintained (by the Ethereum Foundation, for example) with the specific purpose of serving a discovery endpoint. For instance, Geth has a list of bootnodes hardcoded into its implementation. After reaching the bootnode, the joining node asks it for a list of nodes close to itself (through the FindNode procedure), and then iteratively asks those nodes for other nodes that are closer still.

#### Methods

##### Kademlia DHT

(@leobago - General)

Kademlia is a peer-to-peer distributed hash table (DHT) intended to make it easy to discover nodes in a network. Kademlia uses an XOR-based metric topology that simplifies some of its features, such as reducing the number of configuration messages in the system. The simple communication protocol involves remote procedure calls (RPC) such as Ping, Store, FindNode and FindValue. For Eth2, only Ping and FindNode calls are essential, although this might change in the future.
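
To make the XOR-based metric concrete, here is a minimal sketch, assuming node ids are equal-length byte strings. The function names (`xor_distance`, `log_distance`, `closest`) are hypothetical, and the one-byte ids in the example are far shorter than real ids. `closest` mimics what a FindNode responder conceptually does: return the known nodes nearest to a target.

```python
from typing import List, Sequence


def xor_distance(a: bytes, b: bytes) -> int:
    """Kademlia-style distance: the two ids XORed, read as an unsigned integer."""
    if len(a) != len(b):
        raise ValueError("ids must have the same length")
    return int.from_bytes(a, "big") ^ int.from_bytes(b, "big")


def log_distance(a: bytes, b: bytes) -> int:
    """Position of the highest differing bit (0 when the ids are identical).

    Kademlia-style routing tables commonly group peers by this value.
    """
    return xor_distance(a, b).bit_length()


def closest(target: bytes, candidates: Sequence[bytes], count: int) -> List[bytes]:
    """The `count` candidates nearest to `target` under the XOR metric."""
    return sorted(candidates, key=lambda node: xor_distance(node, target))[:count]


# Toy example with 1-byte ids.
known_nodes = [b"\x01", b"\x09", b"\x0c", b"\x0a"]
nearest = closest(b"\x08", known_nodes, 2)
```

The metric is symmetric and is zero only for identical ids, so every node orders the id space consistently from its own point of view.
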
Kademlia uses SHA256(node id) as the Kademlia id, a 256-bit (32-byte) value. As a consequence of the **XOR metric** used in Kademlia, the routing table is a set of lists called k-buckets, in which each bucket holds a maximum of k endpoints.

##### Discovery v5

(@leobago - General)

Discovery 5 is the evolution of **Discovery 4**, the mechanism by which peers previously discovered each other in the p2p network. The Discovery 5 protocol is inspired by the Kademlia DHT, with a few differences (e.g., signed and versioned node records are used instead of DHT stores). Discovery 5 has three main functions:
* Discover all online nodes and build a global topology.
* Search for nodes related to a specific service/topic (e.g., Eth2).
* Authoritative resolution of node records through the record `sequence`.

Discovery 5 has several improvements over its predecessor (Discovery 4): for instance, it allows nodes to advertise specific topics and to store arbitrary node metadata.

#### Records

(@leobago - General)

Records are useful in a P2P network to maintain information about the nodes in the mesh. This information can be used to identify nodes as well as to rate their participation in the network. Ethereum 2 uses its own node records, which are presented in the following section.

##### ENR to enhance peer info

(@leobago - General)

Ethereum Node Records (ENR) hold signed and versioned information about nodes. This information usually relates to the network endpoints of a node, such as its IP addresses and ports, but it can also provide a way to classify different types of nodes in the network. The information is stored in the record as a simple list of key-value pairs. The list of pairs is then signed cryptographically, and the result is stored in the `signature` component of the record.
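
A toy sketch of such a signed, versioned list of key-value pairs is shown below. It is an illustration under loud assumptions: the `NodeRecord` class and its helpers are hypothetical, the byte encoding is made up (real ENRs use RLP), and HMAC-SHA256 with a shared secret stands in for the real public-key signature scheme, so this is not a valid ENR implementation.

```python
import hashlib
import hmac
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class NodeRecord:
    seq: int = 0                                          # version counter
    pairs: Dict[str, str] = field(default_factory=dict)   # key-value content
    signature: bytes = b""


def record_content(record: NodeRecord) -> bytes:
    """Deterministic toy encoding of seq plus the sorted key-value pairs."""
    items = "".join(f"{key}={value};" for key, value in sorted(record.pairs.items()))
    return f"{record.seq}|{items}".encode()


def sign_record(record: NodeRecord, secret: bytes) -> None:
    # ASSUMPTION: HMAC is only a stand-in for a real public-key signature.
    record.signature = hmac.new(secret, record_content(record), hashlib.sha256).digest()


def update_record(record: NodeRecord, secret: bytes, key: str, value: str) -> None:
    """Change one pair, bump the version, and re-sign the whole record."""
    record.pairs[key] = value
    record.seq += 1
    sign_record(record, secret)


def verify_record(record: NodeRecord, secret: bytes) -> bool:
    expected = hmac.new(secret, record_content(record), hashlib.sha256).digest()
    return hmac.compare_digest(record.signature, expected)


record = NodeRecord()
update_record(record, b"node-secret", "ip", "192.0.2.1")
update_record(record, b"node-secret", "tcp", "9000")
```

Since the version counter is part of the signed content, a receiver holding two valid records for the same node can prefer the one with the higher `seq`.
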
Every time the information changes (e.g., the IP address is updated), the sequence number `seq` (a 64-bit unsigned integer) is incremented, the record is signed again, and the `signature` is updated.

### Peer connections

(@leobago - General)

Upon first startup, clients MUST generate an RSA key pair in order to identify the client on the network. The SHA-256 multihash of the public key is the client's Peer ID, which is used to look up the client in libp2p's peer book and allows a client's identity to remain constant across network changes.

#### Introduce multi-addresses

(@leobago - General)

Multi-addresses (`multiaddr`) are a way to make addresses more future-proof by solving several of the limitations that current addresses have. The following are some of the features of multi-addresses:
* `multiaddr` supports addresses for several network protocols, increasing interoperability.
* `multiaddr` conforms to a simple syntax that is self-describing.
* `multiaddr` is both human-readable and machine-readable.
* `multiaddr` can be easily wrapped and unwrapped in several encapsulation layers.

##### Multi-addresses Example

(@leobago - General)

Eth2 clients are identified by `multiaddr`. For example, the human-readable `multiaddr` for a client located at example.com, available via TCP on port 8888, and with peer ID ehgyukGllbeWhyukyverG35T45GEg3G3GwfQWfewewefQU would look like this:

`/dns4/example.com/tcp/8888/p2p/ehgyukGllbeWhyukyverG35T45GEg3G3GwfQWfewewefQU`

#### Peer scoring

#### Peer limits and prunes

(@leobago - General)

In a P2P network, nodes can have an arbitrary number of connections. However, there is a *sweet spot*: too few connections could lead to messages being lost, and too many generate unnecessary traffic. Moreover, not all peers perform equally, and some of them could deliver messages unreliably or even deliver corrupted messages. To avoid a node being stuck with *bad peers*, frequent recalibration is necessary.
For this purpose, nodes can **Prune** a mesh link and set a *backoff timer* to avoid re-grafting a link with the pruned peer.

#### Multi-plexing connections

### Transports

(@leobago - General)

There are several transport protocols that can be used to relay information on a P2P network. In this section we describe the ones that have been considered for use in Eth2.

#### TCP

(@leobago - General)

Transmission Control Protocol (TCP) is the standard way for applications running on internet hosts to communicate in an ordered, reliable and error-checked fashion. To guarantee this, TCP implements packet retransmission and error detection, as well as a three-way handshake to establish connections. As usual, reliability features come with a performance overhead in terms of both latency and bandwidth.

#### QUIC

(@leobago - General)

QUIC is a transport-layer network protocol designed at Google in 2012 and implemented in the Chrome browser to speed up connections, compared to TCP, for various services such as Maps or YouTube. To achieve better performance, QUIC multiplexes several streams between two endpoints over the User Datagram Protocol (UDP), instead of using TCP. In addition to its high performance, the protocol is secured with encrypted communications through Transport Layer Security (TLS).

#### Websockets and WebRTC. Circuit relay mechanisms, browser nodes, etc.

### Security

#### SecIO

#### Noise

#### TLS 1.3

### RPC

#### Multiselect libp2p protocols

#### Request/response

##### Protocols as message types

##### Encoding negotiation through multiselect

##### Response codes

##### Chunkification

##### Length prefixes, compressed payloads

##### Goodbye messages

### PubSub, through libp2p GossipSub

(@leobago - General)

P2P networks usually implement Publish/Subscribe systems to distribute messages in an asynchronous fashion. Subscribers declare their interest in a specific topic, and publishers send messages to one of the existing topics.
In this way, senders and receivers are not in direct communication, but rather interact through the pub/sub system. P2P networks can be structured or unstructured. The first type has Super Nodes, which are assigned more responsibilities (e.g., relaying events, supporting routing, etc.) than normal Nodes. The second type (i.e., unstructured) does not have Super Nodes, implying that all nodes can be ephemeral (e.g., mobile devices), which makes it harder to guarantee reliable message delivery.

#### GossipSub vs floodsub

(@leobago - General)

Due to the constraints of unstructured P2P overlays, different protocols have been proposed to relay information over the network. The simplest one is **Floodsub**, which consists of forwarding every message to all subscribed peers. This creates a large amount of unnecessary traffic over the network, as peers receive the same message from several sources. To avoid such a waste of bandwidth, **GossipSub** proposes that peers forward only the *metadata* of the messages they have "seen", instead of the entire content of the message. In addition, GossipSub implements *lazy push*, in which peers that are interested in a message for which they have received the metadata request that message explicitly. Given the difference in size between message data and metadata, this technique can save a non-negligible part of the network bandwidth, improving scalability.

#### Content-based message IDs

#### Attestation aggregation subnets

##### Techniques

###### (Naive) Local aggregation

###### On-the-fly aggregation

###### About the privacy problem

###### Handel

###### Advanced gossip based aggregation

##### Committee shuffling and subnet backbones

#### Global aggregates collection topic

#### Beacon blocks topic

#### Misc. consensus operations topics

---------------

## State management

### Storage

#### Hash-tree-root cache metadata

#### Finalized storage

##### Pruning

##### Flattening

#### Hot storage

##### Data-sharing

##### Batched persists

##### Handling re-orgs

##### Handling long gaps

### In-memory hot state data

#### Data-sharing, caching

#### Lazy-load cold state

### Block imports

#### Queuing, resolve ancestors first

#### Processing. From checkpoint or hot state.

---------------

## Slashing detection

### Listening for attestations

### Surround and double vote detection

### Efficient storage: cover full weak-subjectivity period

### Efficient matching: find slashings quick

## Fork choice

On Eth2, as in any blockchain, there is a function that decides which chain is canonical when multiple options are proposed. Given that different validators in the network can produce and/or receive different blocks at different times, it is common for competing chains to exist for short periods of time, until one of them is finally selected by the fork choice. In Bitcoin, for instance, the algorithm selects the longest chain (more precisely, the chain with the most accumulated proof-of-work), which is the security strategy of a PoW blockchain. On Eth2, the chain that is selected is not the longest one, but rather the one with the most validator votes, following the PoS security strategy.

### LMD GHOST

### Fork versions

### Attestation processing

### Balance-weighting

### Justification/finalization

### Attack protection

---------------

## Eth1

### Eth1 data voting, voting periods

### Eth1 deposit contract processing

---------------

## BLS Signatures

(@leobago - General)

Validators have several roles in Eth2. One of them is to **attest** that the work of other validators has been done correctly; if it has not, those bad actors should be slashed. This involves verifying the signatures of thousands of validators at every single step, which could be extremely time consuming with conventional signature verification mechanisms.
Boneh-Lynn-Shacham (BLS) signatures have an interesting property that allows for signature aggregation, which speeds up signature verification dramatically and allows for much larger committee sizes.

### Pubkey store

### Lazy serialize/deserialize

### IETF standard

### Fast aggregate-verify

---------------

## Validator client

Validators play an essential role in Eth2: they are the new miners of the network, the ones in charge of maintaining the security of the system and verifying that all nodes follow the rules. In contrast with Eth1, and any PoW chain, validators do not need to compute millions of hashes per second in order to participate in the block creation process. The randomness comes from a completely different source, which makes the whole procedure much less energy hungry and hence more sustainable. Validators have multiple roles, and they follow a strict life cycle that has been manually tuned for security.

### Proposing

#### RANDAO participation

#### Eth1 deposits

#### Attestation inclusion

##### Aggregates value optimization

#### Slashing inclusion

#### Exits, and future withdrawals, transfers, etc.

### Attesting

#### Signing

##### Slashing protections

#### Selected as aggregator

#### Subnet switch after new shuffling

### Key management

#### BLS key standard

### Validator life cycle

#### Deposits

##### New validator

##### Top-up existing validator

#### Activation eligibility

#### Activation queue

#### Active

#### Exiting

#### Exit queue

#### Withdrawal

---------------

## Sync

### Status messages, sync peer selection

### Initial sync

#### Blocks-By-Range

### Catch-up sync

#### Blocks-By-Root

### Sync responses

---------------

## Consensus

### Beacon-chain transition

#### Optimizations

##### Proposers pre-computation

##### Committee shuffling pre-computation

##### Active-indices and committee count pre-computation

##### Attester-status pre-computation

##### Alternatives with Memoization
