---
title: 'Eth2.0 Light Clients: How light is light?'
description: >
  Eth2 light client design and rationale
slideOptions:
  allottedMinutes: 20
---
<!-- .slide: data-background="#283032" -->
## Eth2.0 Light Clients
### How Light is Light?
note:
light clients are an integral part of eth2
we'll see why they're important and how they will work
this will be an overview of eth2 light clients with enough background to understand why things are the way they are
---
## Introduction
---
## About Me
Cayman Nava - Eth2.0 developer @ ChainSafe Systems
Twitter: [@caymannan](https://twitter.com/caymannan)
Github: [wemeetagain](https://github.com/wemeetagain)
<section>
<img width="200" src="https://pbs.twimg.com/profile_images/1111758438990835713/aE4FTs5F_400x400.png" />
<img width="200" src="https://avatars1.githubusercontent.com/u/1348242?s=460&v=4" />
</section>
---
## About Lodestar
##### Typescript Eth2.0 Ecosystem
Beacon Chain
Light Client
Developer Tooling
https://github.com/ChainSafe/lodestar
[![Discord](https://img.shields.io/discord/593655374469660673.svg?label=Discord&logo=discord)](https://discord.gg/aMxzVcr)
<section>
<img width="100" src="https://raw.githubusercontent.com/remojansen/logo.ts/master/ts.png" />
<img width="100" src="https://pbs.twimg.com/profile_images/1098120092280336384/4EUmYuFd_400x400.png" />
</section>
---
## About ChainSafe
Toronto-based blockchain protocol development
Twitter: [@chainsafeth](https://twitter.com/chainsafeth)
Github: [ChainSafe](https://github.com/ChainSafe)
![](https://avatars2.githubusercontent.com/u/27474093?s=200&v=4)
---
## Overview
- Motivation
  - What and why?
- Background
  - PoW vs PoS Light Clients
  - Merkle Proofs, Multiproofs, and SSZ
- Eth2 Light Client
  - Sync Protocol
  - Data/Proof Requests
- Open questions
note:
this will be an overview of eth2 light clients with enough background to understand why things are the way they are
we'll start with motivation for light clients,
what they are, why we need them in eth2
Then, if we need to, we'll cover a little background,
we'll cover what merkle trees and merkle proofs are
(and how they're useful for light clients)
And the difference between PoW and PoS light clients
how to think about pos light clients
Then we'll dive into eth2 light clients
the meat of things
starting with the sync protocol,
followed by some specifics with data requests
and end with some open questions
---
## Motivation
---
## What is a Light Client?
> Software looking to **securely consume blockchain data** with requirements that scale **logarithmically** to total blockchain state.
note:
eg:
squaring the number of transactions
should only double a light client’s cost.
(eg. going from 1,000 tx/day to 1,000,000 tx/day)
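a quick sanity check of that claim, as a sketch: it assumes the cost is simply proportional to log2 of the transaction count (`light_client_cost` is a made-up name for illustration, not a real cost model)

```python
import math


def light_client_cost(num_transactions: int) -> float:
    """Hypothetical cost model: cost grows with log2 of the transaction count."""
    return math.log2(num_transactions)


small = light_client_cost(1_000)      # 1,000 tx/day
large = light_client_cost(1_000_000)  # 1,000 squared tx/day
ratio = large / small                 # log(n^2) / log(n) = 2

print(f"cost ratio: {ratio:.2f}")
```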
----
<iframe src="https://www.desmos.com/calculator/wtbuqpiia9?embed" width="500px" height="500px" style="border: 1px solid #ccc" frameborder=0></iframe>
---
## Why Light Clients in Eth2?
Light clients are first class citizens
----
### Resource-constrained Environments
- mobile phones
- embedded systems
- websites
- etc.
note:
raise your hand if you have a smartphone
keep it raised if you have a synced ethereum blockchain on your phone
----
### Decentralization
- dreaded Infura single point of failure (:heart:)
note:
for all you who raised your hand a second ago, how many have metamask installed?
----
### Blockchains as Light Clients
eth2 can peer with other blockchains
(**eth1**, cosmos, polkadot, etc.)
note:
if the requirements are light enough, another blockchain can run an eth2 light client
that will be really important if any other blockchains want to verify anything from eth2, eth1 included
----
### More Shards, More Data
- 1024 shards, too much state
- validators are light clients-ish of other shards
note:
light clients are baked into the design of eth2,
in the sense that most regular folks, even validators, won't have all of the eth2 state.
validators will need to sync recent shard state as part of their duties
they'll be using some of the techniques we describe
---
## Background
note:
start with background, if we don't know the background, we're going to be lost with the actual light client protocol
i'm going to cover a few seemingly disparate topics, they all connect
---
### Merkle Proofs
Verify the authenticity of a chunk of data with a proof whose size is **logarithmic** in the number of chunks
note:
in either case, we make extensive use of merkle proofs
poll audience: who needs a refresher on merkle proofs
the proof is succinct, it grows logarithmically with the total number of chunks
----
```graphviz
digraph mygraph {
nodesep=0.5
node [color=black,fontname=Courier,shape=box]
edge [color=black, style=dashed, dir=none]
"h(h(h(a,b),h(c,d)),h(h(e,f),h(g,h)))"
"h(h(h(a,b),h(c,d)),h(h(e,f),h(g,h)))"->"h(h(a,b),h(c,d))"
"h(h(h(a,b),h(c,d)),h(h(e,f),h(g,h)))"->"h(h(e,f),h(g,h))"
"h(h(a,b),h(c,d))"
"h(h(e,f),h(g,h))"
"h(h(a,b),h(c,d))"->"h(a,b)"
"h(h(a,b),h(c,d))"->"h(c,d)"
"h(h(e,f),h(g,h))"->"h(e,f)"
"h(h(e,f),h(g,h))"->"h(g,h)"
"h(a,b)"
"h(c,d)"
"h(e,f)"
"h(g,h)"
"h(a,b)"->a
"h(a,b)"->b
"h(c,d)"->c
"h(c,d)"->d
"h(e,f)"->e
"h(e,f)"->f
"h(g,h)"->g
"h(g,h)"->h
c [style=filled,fillcolor=pink]
{rank=same;a b c d e f g h}
}
```
note:
when we want some particular data, we assume it's part of a merkle tree
----
```graphviz
digraph mygraph {
nodesep=0.5
node [color=black,fontname=Courier,shape=box]
edge [color=black, style=dashed, dir=none]
"h(h(h(a,b),h(c,d)),h(h(e,f),h(g,h)))" [style=filled,fillcolor="#90ee90"]
"h(h(h(a,b),h(c,d)),h(h(e,f),h(g,h)))"->"h(h(a,b),h(c,d))"
"h(h(h(a,b),h(c,d)),h(h(e,f),h(g,h)))"->"h(h(e,f),h(g,h))"
"h(h(a,b),h(c,d))"
"h(h(e,f),h(g,h))"
"h(h(a,b),h(c,d))"->"h(a,b)"
"h(h(a,b),h(c,d))"->"h(c,d)"
"h(h(e,f),h(g,h))"->"h(e,f)"
"h(h(e,f),h(g,h))"->"h(g,h)"
"h(a,b)"
"h(c,d)"
"h(e,f)"
"h(g,h)"
"h(a,b)"->a
"h(a,b)"->b
"h(c,d)"->c
"h(c,d)"->d
"h(e,f)"->e
"h(e,f)"->f
"h(g,h)"->g
"h(g,h)"->h
c [style=filled,fillcolor=pink]
{rank=same;a b c d e f g h}
}
```
note:
very important: this is a merkle tree that we have the root of, and we "trust" that root; the scheme only works if we have the root
merkle roots are often stored in a blockchain
----
```graphviz
digraph mygraph {
nodesep=0.5
node [color=black,fontname=Courier,shape=box]
edge [color=black, style=dashed, dir=none]
"h(h(h(a,b),h(c,d)),h(h(e,f),h(g,h)))" [style=filled,fillcolor="#90ee90"]
h0 [label="?"]
h1 [label="?"]
h2 [label="?"]
h3 [label="?"]
h4 [label="?"]
h5 [label="?"]
h6 [label="?"]
h7 [label="?"]
h7 [label="?"]
h8 [label="?"]
h9 [label="?",style=filled,fillcolor=pink]
h10 [label="?"]
h11 [label="?"]
h12 [label="?"]
h13 [label="?"]
h14 [label="?"]
"h0"->"h1"
"h0"->"h2"
"h1"->"h3"
"h1"->"h4"
"h2"->"h5"
"h2"->"h6"
"h3"->"h7"
"h3"->"h8"
"h4"->"h9"
"h4"->"h10"
"h5"->"h11"
"h5"->"h12"
"h6"->"h13"
"h6"->"h14"
}
```
note:
when verifying a merkle proof, the root is the only thing known and trusted
that's usually why you're requesting data in the first place
----
```graphviz
digraph mygraph {
nodesep=0.5
node [color=black,fontname=Courier,shape=box]
edge [color=black, style=dashed, dir=none]
"h(h(h(a,b),h(c,d)),h(h(e,f),h(g,h)))" [style=filled,fillcolor="#90ee90"]
h0 [label="?"] [style=filled, fillcolor=lightyellow]
h1 [label="?"]
h2 [label="?"]
h3 [label="?"]
h4 [label="?"]
h5 [label="?"]
h6 [label="?"]
h7 [label="?"]
h7 [label="?"]
h8 [label="?"]
h9 [label="c",style=filled,fillcolor=pink]
h10 [label="?"]
h11 [label="?"]
h12 [label="?"]
h13 [label="?"]
h14 [label="?"]
"h0"->"h1" [style=bold,color=red]
"h0"->"h2"
"h1"->"h3"
"h1"->"h4" [style=bold,color=red]
"h2"->"h5"
"h2"->"h6"
"h3"->"h7"
"h3"->"h8"
"h4"->"h9" [style=bold,color=red]
"h4"->"h10"
"h5"->"h11"
"h5"->"h12"
"h6"->"h13"
"h6"->"h14"
}
```
note:
when we're given a chunk of data, which we don't necessarily trust,
we have to be able to link it back up the tree to the root, which we do trust.
we need to be given the intermediate nodes required to recreate the root
these intermediate nodes are the proof
----
```graphviz
digraph mygraph {
nodesep=0.5
node [color=black,fontname=Courier,shape=box]
edge [color=black, style=dashed, dir=none]
"h(h(h(a,b),h(c,d)),h(h(e,f),h(g,h)))" [style=filled,fillcolor="#90ee90"]
h0 [label="?"] [style=filled, fillcolor=lightyellow]
h1 [label="?"]
h2 [label="?"]
h3 [label="?"]
h4 [label="?"]
h5 [label="?"]
h6 [label="?"]
h7 [label="?"]
h7 [label="?"]
h8 [label="?"]
h9 [label="c",style=filled,fillcolor=pink]
h10 [label="d",style=filled,fillcolor=lightblue]
h11 [label="?"]
h12 [label="?"]
h13 [label="?"]
h14 [label="?"]
"h0"->"h1" [style=bold,color=red]
"h0"->"h2"
"h1"->"h3"
"h1"->"h4" [style=bold,color=red]
"h2"->"h5"
"h2"->"h6"
"h3"->"h7"
"h3"->"h8"
"h4"->"h9" [style=bold,color=red]
"h4"->"h10" [style=bold,color=blue]
"h5"->"h11"
"h5"->"h12"
"h6"->"h13"
"h6"->"h14"
}
```
note:
we need one intermediate node per level in the tree, starting from the bottom
----
```graphviz
digraph mygraph {
nodesep=0.5
node [color=black,fontname=Courier,shape=box]
edge [color=black, style=dashed, dir=none]
"h(h(h(a,b),h(c,d)),h(h(e,f),h(g,h)))" [style=filled,fillcolor="#90ee90"]
h0 [label="?"] [style=filled, fillcolor=lightyellow]
h1 [label="?"]
h2 [label="?"]
h3 [label="?"]
h4 [label="h()",style=filled,fillcolor=pink]
h5 [label="?"]
h6 [label="?"]
h7 [label="?"]
h7 [label="?"]
h8 [label="?"]
h9 [label="c",style=filled,fillcolor=pink]
h10 [label="d",style=filled,fillcolor=lightblue]
h11 [label="?"]
h12 [label="?"]
h13 [label="?"]
h14 [label="?"]
"h0"->"h1" [style=bold,color=red]
"h0"->"h2"
"h1"->"h3"
"h1"->"h4" [style=bold,color=red]
"h2"->"h5"
"h2"->"h6"
"h3"->"h7"
"h3"->"h8"
"h4"->"h9" [style=bold,color=red]
"h4"->"h10" [style=bold,color=blue]
"h5"->"h11"
"h5"->"h12"
"h6"->"h13"
"h6"->"h14"
}
```
note:
with each intermediate node in the proof, we're able to create the immediate parent
----
```graphviz
digraph mygraph {
nodesep=0.5
node [color=black,fontname=Courier,shape=box]
edge [color=black, style=dashed, dir=none]
"h(h(h(a,b),h(c,d)),h(h(e,f),h(g,h)))" [style=filled,fillcolor="#90ee90"]
h0 [label="?"] [style=filled, fillcolor=lightyellow]
h1 [label="?"]
h2 [label="?"]
h3 [label="h()",style=filled,fillcolor=lightblue]
h4 [label="h()",style=filled,fillcolor=pink]
h5 [label="?"]
h6 [label="?"]
h7 [label="?"]
h7 [label="?"]
h8 [label="?"]
h9 [label="c",style=filled,fillcolor=pink]
h10 [label="d",style=filled,fillcolor=lightblue]
h11 [label="?"]
h12 [label="?"]
h13 [label="?"]
h14 [label="?"]
"h0"->"h1" [style=bold,color=red]
"h0"->"h2"
"h1"->"h3" [style=bold,color=blue]
"h1"->"h4" [style=bold,color=red]
"h2"->"h5"
"h2"->"h6"
"h3"->"h7"
"h3"->"h8"
"h4"->"h9" [style=bold,color=red]
"h4"->"h10" [style=bold,color=blue]
"h5"->"h11"
"h5"->"h12"
"h6"->"h13"
"h6"->"h14"
}
```
----
```graphviz
digraph mygraph {
nodesep=0.5
node [color=black,fontname=Courier,shape=box]
edge [color=black, style=dashed, dir=none]
"h(h(h(a,b),h(c,d)),h(h(e,f),h(g,h)))" [style=filled,fillcolor="#90ee90"]
h0 [label="?"] [style=filled, fillcolor=lightyellow]
h1 [label="h()",style=filled,fillcolor=pink]
h2 [label="?"]
h3 [label="h()",style=filled,fillcolor=lightblue]
h4 [label="h()",style=filled,fillcolor=pink]
h5 [label="?"]
h6 [label="?"]
h7 [label="?"]
h7 [label="?"]
h8 [label="?"]
h9 [label="c",style=filled,fillcolor=pink]
h10 [label="d",style=filled,fillcolor=lightblue]
h11 [label="?"]
h12 [label="?"]
h13 [label="?"]
h14 [label="?"]
"h0"->"h1" [style=bold,color=red]
"h0"->"h2"
"h1"->"h3" [style=bold,color=blue]
"h1"->"h4" [style=bold,color=red]
"h2"->"h5"
"h2"->"h6"
"h3"->"h7"
"h3"->"h8"
"h4"->"h9" [style=bold,color=red]
"h4"->"h10" [style=bold,color=blue]
"h5"->"h11"
"h5"->"h12"
"h6"->"h13"
"h6"->"h14"
}
```
----
```graphviz
digraph mygraph {
nodesep=0.5
node [color=black,fontname=Courier,shape=box]
edge [color=black, style=dashed, dir=none]
"h(h(h(a,b),h(c,d)),h(h(e,f),h(g,h)))" [style=filled,fillcolor="#90ee90"]
h0 [label="?"] [style=filled, fillcolor=lightyellow]
h1 [label="h()",style=filled,fillcolor=pink]
h2 [label="h()",style=filled,fillcolor=lightblue]
h3 [label="h()",style=filled,fillcolor=lightblue]
h4 [label="h()",style=filled,fillcolor=pink]
h5 [label="?"]
h6 [label="?"]
h7 [label="?"]
h7 [label="?"]
h8 [label="?"]
h9 [label="c",style=filled,fillcolor=pink]
h10 [label="d",style=filled,fillcolor=lightblue]
h11 [label="?"]
h12 [label="?"]
h13 [label="?"]
h14 [label="?"]
"h0"->"h1" [style=bold,color=red]
"h0"->"h2" [style=bold,color=blue]
"h1"->"h3" [style=bold,color=blue]
"h1"->"h4" [style=bold,color=red]
"h2"->"h5"
"h2"->"h6"
"h3"->"h7"
"h3"->"h8"
"h4"->"h9" [style=bold,color=red]
"h4"->"h10" [style=bold,color=blue]
"h5"->"h11"
"h5"->"h12"
"h6"->"h13"
"h6"->"h14"
}
```
----
```graphviz
digraph mygraph {
nodesep=0.5
node [color=black,fontname=Courier,shape=box]
edge [color=black, style=dashed, dir=none]
"h(h(h(a,b),h(c,d)),h(h(e,f),h(g,h)))" [style=filled,fillcolor="#90ee90"]
h0 [label="h()"] [style=filled, fillcolor=pink]
h1 [label="h()",style=filled,fillcolor=pink]
h2 [label="h()",style=filled,fillcolor=lightblue]
h3 [label="h()",style=filled,fillcolor=lightblue]
h4 [label="h()",style=filled,fillcolor=pink]
h5 [label="?"]
h6 [label="?"]
h7 [label="?"]
h7 [label="?"]
h8 [label="?"]
h9 [label="c",style=filled,fillcolor=pink]
h10 [label="d",style=filled,fillcolor=lightblue]
h11 [label="?"]
h12 [label="?"]
h13 [label="?"]
h14 [label="?"]
"h0"->"h1" [style=bold,color=red]
"h0"->"h2" [style=bold,color=blue]
"h1"->"h3" [style=bold,color=blue]
"h1"->"h4" [style=bold,color=red]
"h2"->"h5"
"h2"->"h6"
"h3"->"h7"
"h3"->"h8"
"h4"->"h9" [style=bold,color=red]
"h4"->"h10" [style=bold,color=blue]
"h5"->"h11"
"h5"->"h12"
"h6"->"h13"
"h6"->"h14"
}
```
note:
at that point, you can compare your trusted root against this newly computed root, and iff the roots match
----
```graphviz
digraph mygraph {
nodesep=0.5
node [color=black,fontname=Courier,shape=box]
edge [color=black, style=dashed, dir=none]
"h(h(h(a,b),h(c,d)),h(h(e,f),h(g,h)))" [style=filled,fillcolor="#90ee90"]
h0 [label="h()"] [style=filled, fillcolor="#90ee90"]
h1 [label="h()",style=filled,fillcolor="#90ee90"]
h2 [label="h()",style=filled,fillcolor=lightblue]
h3 [label="h()",style=filled,fillcolor=lightblue]
h4 [label="h()",style=filled,fillcolor="#90ee90"]
h5 [label="?"]
h6 [label="?"]
h7 [label="?"]
h7 [label="?"]
h8 [label="?"]
h9 [label="c",style=filled,fillcolor="pink"]
h10 [label="d",style=filled,fillcolor=lightblue]
h11 [label="?"]
h12 [label="?"]
h13 [label="?"]
h14 [label="?"]
"h0"->"h1" [style=bold,color=red]
"h0"->"h2" [style=bold,color=blue]
"h1"->"h3" [style=bold,color=blue]
"h1"->"h4" [style=bold,color=red]
"h2"->"h5"
"h2"->"h6"
"h3"->"h7"
"h3"->"h8"
"h4"->"h9" [style=bold,color=red]
"h4"->"h10" [style=bold,color=blue]
"h5"->"h11"
"h5"->"h12"
"h6"->"h13"
"h6"->"h14"
}
```
note:
then you've verified that the data is correct
you've "verified the proof"
one thing to reiterate is that you only need these blue pieces, one per level, that number grows logarithmically with the total number of leaves
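the walk-through above can be sketched in a few lines of python. this is a minimal sketch, not any client's implementation: it assumes sha256 as the hash, a power-of-two number of leaves, and made-up helper names (`build_tree`, `make_proof`, `verify_proof`)

```python
from hashlib import sha256
from typing import List


def h(left: bytes, right: bytes) -> bytes:
    """Hash two child nodes into their parent (sha256 is an assumption)."""
    return sha256(left + right).digest()


def build_tree(leaves: List[bytes]) -> List[List[bytes]]:
    """Return every level of the tree, leaves first, root level last."""
    levels = [leaves]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        levels.append([h(prev[i], prev[i + 1]) for i in range(0, len(prev), 2)])
    return levels


def make_proof(levels: List[List[bytes]], index: int) -> List[bytes]:
    """Collect one sibling per level, starting from the bottom."""
    proof = []
    for level in levels[:-1]:
        proof.append(level[index ^ 1])  # the sibling sits next to us
        index //= 2
    return proof


def verify_proof(leaf: bytes, index: int, proof: List[bytes], root: bytes) -> bool:
    """Recreate the root from the leaf and the proof, then compare to the trusted root."""
    node = leaf
    for sibling in proof:
        node = h(node, sibling) if index % 2 == 0 else h(sibling, node)
        index //= 2
    return node == root


leaves = [sha256(x.encode()).digest() for x in "abcdefgh"]
levels = build_tree(leaves)
root = levels[-1][0]           # the trusted root
proof = make_proof(levels, 2)  # proof for leaf "c"

print(len(proof), verify_proof(leaves[2], 2, proof, root))
```

the proof for 8 leaves has 3 entries (d, h(a,b), and the right half of the tree), one per level, which is where the logarithmic size comes from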
---
## Merkle Multiproofs
note:
merkle multiproofs are merkle proofs
----
```graphviz
digraph mygraph {
nodesep=0.5
node [color=black,fontname=Courier,shape=box]
edge [color=black, style=dashed, dir=none]
h0 [label="?"]
h1 [label="?"]
h2 [label="?"]
h3 [label="?"]
h4 [label="?"]
h5 [label="?"]
h6 [label="?"]
h7 [label="?"]
h7 [label="?",style=filled,fillcolor=pink]
h8 [label="?"]
h9 [label="?",style=filled,fillcolor=pink]
h10 [label="?"]
h11 [label="?"]
h12 [label="?"]
h13 [label="?"]
h14 [label="?"]
"h0"->"h1"
"h0"->"h2"
"h1"->"h3"
"h1"->"h4"
"h2"->"h5"
"h2"->"h6"
"h3"->"h7"
"h3"->"h8"
"h4"->"h9"
"h4"->"h10"
"h5"->"h11"
"h5"->"h12"
"h6"->"h13"
"h6"->"h14"
}
```
note:
multiproofs are proofs for multiple leaves in the tree
----
```graphviz
digraph mygraph {
nodesep=0.5
node [color=black,fontname=Courier,shape=box]
edge [color=black, style=dashed, dir=none]
h0 [label="?"] [style=filled, fillcolor=lightyellow]
h1 [label="?"]
h2 [label="?"]
h3 [label="?"]
h4 [label="?"]
h5 [label="?"]
h6 [label="?"]
h7 [label="?"]
h7 [label="a",style=filled,fillcolor=pink]
h8 [label="?"]
h9 [label="c",style=filled,fillcolor=pink]
h10 [label="?"]
h11 [label="?"]
h12 [label="?"]
h13 [label="?"]
h14 [label="?"]
"h0"->"h1" [style=bold,color=red]
"h0"->"h2"
"h1"->"h3" [style=bold,color=red]
"h1"->"h4" [style=bold,color=red]
"h2"->"h5"
"h2"->"h6"
"h3"->"h7" [style=bold,color=red]
"h3"->"h8"
"h4"->"h9" [style=bold,color=red]
"h4"->"h10"
"h5"->"h11"
"h5"->"h12"
"h6"->"h13"
"h6"->"h14"
}
```
note:
we construct the multiproof in a similar way to how we construct the individual proofs.
Identify the elements needed to recreate the roots from each leaf
BUT the idea is that we can share the elements needed for each leaf
----
```graphviz
digraph mygraph {
nodesep=0.5
node [color=black,fontname=Courier,shape=box]
edge [color=black, style=dashed, dir=none]
h0 [label="h()"] [style=filled, fillcolor=lightyellow]
h1 [label="h()",style=filled,fillcolor=lightyellow]
h2 [label="h()",style=filled,fillcolor=lightblue]
h3 [label="h()",style=filled,fillcolor=lightyellow]
h4 [label="h()",style=filled,fillcolor=lightyellow]
h5 [label="?"]
h6 [label="?"]
h7 [label="a",style=filled,fillcolor=pink]
h8 [label="b",style=filled,fillcolor=lightblue]
h9 [label="c",style=filled,fillcolor=pink]
h10 [label="d",style=filled,fillcolor=lightblue]
h11 [label="?"]
h12 [label="?"]
h13 [label="?"]
h14 [label="?"]
"h0"->"h1" [style=bold,color=red]
"h0"->"h2" [style=bold,color=blue]
"h1"->"h3" [style=bold,color=red]
"h1"->"h4" [style=bold,color=red]
"h2"->"h5"
"h2"->"h6"
"h3"->"h7" [style=bold,color=red]
"h3"->"h8" [style=bold,color=blue]
"h4"->"h9" [style=bold,color=red]
"h4"->"h10" [style=bold,color=blue]
"h5"->"h11"
"h5"->"h12"
"h6"->"h13"
"h6"->"h14"
}
```
note:
so a proof for just A here would need 3 nodes in the tree
and a proof for just C would need 3 nodes in the tree
but instead of needing 6 nodes for the multiproof, we only need 3
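one way to see the sharing is to number the tree nodes (root = 1, children of node i are 2i and 2i+1) and compute which nodes must be supplied. this is a sketch under that numbering assumption, with illustrative function names; with 8 leaves, a is node 8 and c is node 10

```python
from typing import List, Set


def branch_indices(gindex: int) -> List[int]:
    """Siblings of every node on the path from gindex up to (not including) the root."""
    out = []
    while gindex > 1:
        out.append(gindex ^ 1)
        gindex //= 2
    return out


def path_indices(gindex: int) -> List[int]:
    """The node itself and its ancestors, excluding the root."""
    out = []
    while gindex > 1:
        out.append(gindex)
        gindex //= 2
    return out


def multiproof_helpers(gindices: List[int]) -> List[int]:
    """Nodes that must be supplied: all siblings, minus anything we can compute ourselves."""
    needed: Set[int] = set()
    computable: Set[int] = set()
    for gindex in gindices:
        needed.update(branch_indices(gindex))
        computable.update(path_indices(gindex))
    return sorted(needed - computable, reverse=True)


LEAF_A, LEAF_C = 8, 10
helpers = multiproof_helpers([LEAF_A, LEAF_C])
print(helpers)
```

the helpers are nodes 11 (d), 9 (b), and 3 (the right half of the tree): 3 nodes instead of 3 + 3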
---
## PoW vs. PoS Light Clients
note:
briefly look at some of the differences, how the eth2 light client will differ from an eth1 light client
---
## PoW Light Clients
Easy because headers can be verified with only protocol rules
note:
the headers have everything we need
----
<div style="background: white">
<img src="https://blog.ethereum.org/wp-content/uploads/2015/01/pow_header.png" />
</div>
1. download headers
2. verify (block by block)
3. request merkle proofs
note:
we hash the header
verify the proof of work
verify the next header's previous hash
once we get to the head, we have the relevant merkle roots, and we can request data and merkle proofs
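the three steps can be sketched with a toy chain. everything here is an assumption made for illustration (the `ToyHeader` fields, sha256 as the PoW hash, the fixed `TARGET`); real eth1 headers and ethash look nothing like this, but the shape of the loop is the same

```python
from dataclasses import dataclass
from hashlib import sha256
from typing import List

TARGET = 2 ** 248  # toy difficulty: roughly 1 in 256 hashes qualifies


@dataclass
class ToyHeader:
    parent_hash: bytes
    state_root: bytes
    nonce: int

    def hash(self) -> bytes:
        payload = self.parent_hash + self.state_root + self.nonce.to_bytes(8, "little")
        return sha256(payload).digest()


def has_valid_pow(header: ToyHeader) -> bool:
    """The header hash, read as a number, must fall below the target."""
    return int.from_bytes(header.hash(), "big") < TARGET


def mine(parent_hash: bytes, state_root: bytes) -> ToyHeader:
    """Brute-force a nonce so the header satisfies the toy proof of work."""
    nonce = 0
    while True:
        header = ToyHeader(parent_hash, state_root, nonce)
        if has_valid_pow(header):
            return header
        nonce += 1


def verify_chain(trusted_genesis_hash: bytes, headers: List[ToyHeader]) -> bool:
    """Block by block: check the parent link, then check the proof of work."""
    parent = trusted_genesis_hash
    for header in headers:
        if header.parent_hash != parent or not has_valid_pow(header):
            return False
        parent = header.hash()
    return True


genesis_hash = bytes(32)
chain: List[ToyHeader] = []
parent = genesis_hash
for i in range(3):
    header = mine(parent, sha256(bytes([i])).digest())
    chain.append(header)
    parent = header.hash()

print(verify_chain(genesis_hash, chain))
```

once the last header checks out, its `state_root` is the trusted root that merkle proofs get verified against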
---
## PoS Light Clients
Headers alone aren't sufficient to verify proof of stake
We need to track stake
note:
In the PoS world we're governed by some sort of (super)majority stake
We must ensure we're on the chain with the most stake
which means we need to track balances and votes
votes are cryptographic signatures
this is a different beast than PoW light clients, there's an opportunity to do things a little bit differently
---
## SSZ (Simple Serialize)
note:
this is an eth2 spec designed around consistent and easy merkleization
----
#### SSZ Example: Checkpoint
```python
class Checkpoint(Container):
    epoch: Epoch
    root: Hash
```
```graphviz
digraph mygraph {
nodesep=0.5
node [color=black,fontname=Courier,shape=box]
edge [color=black, style=dashed, dir=none]
h0 [label="h()"]
h1 [label="epoch",style=filled,fillcolor=lightblue]
h2 [label="root",style=filled,fillcolor=lightblue]
"h0"->"h1"
"h0"->"h2"
}
```
----
#### SSZ Example: Crosslink
```python
class Crosslink(Container):
    shard: Shard
    parent_root: Hash
    start_epoch: Epoch
    end_epoch: Epoch
    data_root: Hash
```
```graphviz
digraph mygraph {
nodesep=0.5
node [color=black,fontname=Courier,shape=box]
edge [color=black, style=dashed, dir=none]
h0 [label="h()"]
h1 [label="h()"]
h2 [label="h()"]
h3 [label="h()"]
h4 [label="h()"]
h5 [label="h()"]
h6 [label="h()"]
h7 [label="shard",style=filled,fillcolor=lightblue]
h8 [label="parent_root",style=filled,fillcolor=lightblue]
h9 [label="start_epoch",style=filled,fillcolor=lightblue]
h10 [label="end_epoch",style=filled,fillcolor=lightblue]
h11 [label="data_root",style=filled,fillcolor=lightblue]
h12 [label="0x0"]
h13 [label="0x0"]
h14 [label="0x0"]
"h0"->"h1"
"h0"->"h2"
"h1"->"h3"
"h1"->"h4"
"h2"->"h5"
"h2"->"h6"
"h3"->"h7"
"h3"->"h8"
"h4"->"h9"
"h4"->"h10"
"h5"->"h11"
"h5"->"h12"
"h6"->"h13"
"h6"->"h14"
}
```
----
#### SSZ Example: AttestationData
```python
class AttestationData(Container):
    # LMD GHOST vote
    beacon_block_root: Hash
    # FFG vote
    source: Checkpoint
    target: Checkpoint
    # Crosslink vote
    crosslink: Crosslink
```
```graphviz
digraph mygraph {
nodesep=0.5
node [color=black,fontname=Courier,shape=box]
edge [color=black, style=dashed, dir=none]
hash [label="beacon_block_root",style=filled,fillcolor=lightblue]
a0 [label="source",style=filled,fillcolor=lightblue]
a1 [label="epoch"]
a2 [label="root"]
b0 [label="target",style=filled,fillcolor=lightblue]
b1 [label="epoch"]
b2 [label="root"]
h0 [label="crosslink",style=filled,fillcolor=lightblue]
h1 [label="h()"]
h2 [label="h()"]
h3 [label="h()"]
h4 [label="h()"]
h5 [label="h()"]
h6 [label="h()"]
h7 [label="shard"]
h8 [label="parent_root"]
h9 [label="start_epoch"]
h10 [label="end_epoch"]
h11 [label="data_root"]
h12 [label="0x0"]
h13 [label="0x0"]
h14 [label="0x0"]
j0 [label="h()"]
j1 [label="h()"]
j2 [label="h()"]
j0->j1
j0->j2
j1->hash
j1->a0
j2->b0
j2->h0
"h0"->"h1"
"h0"->"h2"
"a0"->"a1"
"a0"->"a2"
"b0"->"b1"
"b0"->"b2"
"h1"->"h3"
"h1"->"h4"
"h2"->"h5"
"h2"->"h6"
"h3"->"h7"
"h3"->"h8"
"h4"->"h9"
"h4"->"h10"
"h5"->"h11"
"h5"->"h12"
"h6"->"h13"
"h6"->"h14"
}
```
note:
when you create the merkle root of this AttestationData, you include the root of the underlying crosslink
and the crosslink includes the data_root, which is the merkle root of some shard data
eth2 data structures include merkle roots in many places because it's useful, and necessary, to be able to create proofs
beacon state -> beacon blocks
beacon blocks -> beacon state
shard blocks -> beacon blocks
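the trees on the SSZ slides can be reproduced with a small sketch. it assumes `Epoch` and `Shard` are 64-bit unsigned integers, `Hash` is 32 bytes, and sha256 is the hash; the helper names (`uint64_chunk`, `merkleize`, `checkpoint_root`, `crosslink_root`) are illustrative, not an SSZ library API

```python
from hashlib import sha256
from typing import List

ZERO_CHUNK = bytes(32)  # the "0x0" padding leaves in the diagrams


def h(left: bytes, right: bytes) -> bytes:
    return sha256(left + right).digest()


def uint64_chunk(value: int) -> bytes:
    """A uint64 as 8 little-endian bytes, right-padded to a 32-byte chunk."""
    return value.to_bytes(8, "little").ljust(32, b"\x00")


def merkleize(chunks: List[bytes]) -> bytes:
    """Pad with zero chunks to the next power of two, then hash pairwise up to the root."""
    size = 1
    while size < len(chunks):
        size *= 2
    layer = chunks + [ZERO_CHUNK] * (size - len(chunks))
    while len(layer) > 1:
        layer = [h(layer[i], layer[i + 1]) for i in range(0, len(layer), 2)]
    return layer[0]


def checkpoint_root(epoch: int, root: bytes) -> bytes:
    """Two fields, two leaves, one hash."""
    return merkleize([uint64_chunk(epoch), root])


def crosslink_root(shard: int, parent_root: bytes, start_epoch: int,
                   end_epoch: int, data_root: bytes) -> bytes:
    """Five fields are padded to eight leaves with zero chunks."""
    return merkleize([
        uint64_chunk(shard),
        parent_root,
        uint64_chunk(start_epoch),
        uint64_chunk(end_epoch),
        data_root,
    ])


print(checkpoint_root(0, bytes(32)).hex())
```

a container that holds another container (like AttestationData holding a Crosslink) just uses the inner container's root as one of its own leaves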
---
## Eth2 Light Client
---
## Eth2 Sync Protocol
note:
explain by asking questions and figuring out what makes sense
----
#### Motivating Questions
How do we get updated _trusted_ merkle roots?
Can we do this _succinctly_?
note:
pow strategy not sufficient
we need to get up-to-date "trusted" merkle roots
but in PoS, that means we need stake
in the name of light clients, let's make this as lightweight as possible
----
#### What is trusted?
Roots attested by 2/3 of stake.
Staked votes give weight to the chain.
note:
PoS requirement
----
#### Key insight 1:
Instead of syncing headers by hashing one by one
Use staked vote balance, skip ahead to a current header*
note:
we don't need to sync headers one by one
verifying each one by checking the parent hash
we're in a PoS world, where we're governed by a 2/3 majority
we can use votes as verification of recent headers
instead of checking hashes for pow validity
we have to track validator stake/votes
----
#### Key insight 2:
Instead of tracking all validator balances + votes
Track a subset of validators (committee)
note:
track a committee
----
## Where do eth2 validators validate?
----
### Crosslink Committees
- fast, changing every epoch (~6 min)
- attest to recent shard data (crosslink)
- attest to beacon block root (but only as total validator set)
- attest to recent checkpoints (FFG data)
note:
this doesn't really work for light clients because we still need the whole validator set to authenticate recent block checkpoints
----
### Shard Committees
- slow, period committees change every shard period (~27 hrs)
- attest to shard block root
note:
shard block header contains a beacon block root
better candidate, changes slowly, only need to update every ~27 hours
---
### Sync Protocol
- Sync by shard block root
- with items needed to verify weight*
- Shard root gives us beacon block header (via merkle proof)
- Every ~27 hrs, update period committee
note:
Track the period committees assigned to that shard
shard committees change fully every ~27 hours
who voted for that shard block
how much of the total stake voted for the block
we can jump 27 hrs at a time
let's look briefly at the datastructures involved in syncing
----
```python
class LightClientMemory(object):
    # Randomly initialized and retained forever
    shard: Shard
    # Beacon header which is not expected to revert
    header: BeaconBlockHeader
    # period committees corresponding to the beacon header
    previous_committee: CompactCommittee
    current_committee: CompactCommittee
    next_committee: CompactCommittee
```
note:
This is data we retain for syncing
this is the minimum amount of data to store
the shard tells us which shard we're tracking
(this is random and we shouldn't actually care which one since we're just using the shard to get to the beacon block header)
the header is the key trusted piece of data we use to verify merkle proofs against (just like in PoW light clients)
from a beacon block, we can use merkle proofs to verify data about shards, beacon state, everything
the committees are stored to keep track of pubkeys/balances of those who are voting on recent shard block roots
a shard committee is a blend of two underlying period committees, which change every ~27 hrs
----
```python
class LightClientUpdate(Container):
    # Shard block root (and authenticating signature data)
    shard_block_root: Hash
    fork_version: Version
    aggregation_bits: Bitlist
    signature: BLSSignature
    # Updated beacon header (and authenticating branch)
    header: BeaconBlockHeader
    header_branch: MerkleProof
    # Updated period committee (and authenticating branch)
    committee: CompactCommittee
    committee_branch: MerkleProof
```
note:
This is the data we request to stay synced
We need one of these every ~27 hours
The top section is the shard block root and authenticating information
let's start from the bottom of the section
the signature is an aggregated signature that contains the signatures of all attesters in the committee who voted for the shard block root
the aggregation bits tell us who in the committee signed
the fork version lets us make sure the votes are for the fork we think we're on
and the shard block root is the updated block root
The next section is the new beacon block header, our new key to the castle. If all goes well, we'll update our light client memory with this header.
The header branch is a merkle proof that we run against the shard block root.
and the committee is the new period committee. If all goes well, we'll update our light client memory with this new committee.
The committee branch is a merkle proof that we run against the header
----
```python
def update_memory(
    memory: LightClientMemory,
    update: LightClientUpdate
):
    # Verify the update does not skip a period
    # Verify shard attestations
    #   - vote is for the shard root
    # Verify shard committee votes pass 2/3 threshold
    #   - vote has sufficient weight
    # Verify update header against shard block root and header branch
    #   - header is valid
    # Update period committees if entering a new period
    #   - verify committee against header
    # Update header
    ...
```
note:
can't skip a period - we need to track all committee changes so we keep track of all stake for that shard
ensure that the vote is for the shard root we just got in the update
and that it has sufficient weight
at this point, we 'trust' the shard root
now we can use the shard root and proof to prove the validity of the included beacon block header
that way, we now trust the beacon block header
and once the header is trusted, we can use it and a proof to prove the validity of the committee update
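the "sufficient weight" step can be sketched on its own. this is a simplified sketch, not the spec's code: `ToyCommittee` is a stand-in for the compact committee, BLS signature verification is left out entirely, and treating the threshold as "at least 2/3" (rather than strictly more) is an assumption

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ToyCommittee:
    """Stand-in for a period committee: one pubkey and one balance per member."""
    pubkeys: List[bytes]
    balances: List[int]  # in Gwei


def attesting_balance(committee: ToyCommittee, aggregation_bits: List[bool]) -> int:
    """Sum the balances of the members whose bit is set."""
    assert len(aggregation_bits) == len(committee.balances)
    return sum(bal for bal, bit in zip(committee.balances, aggregation_bits) if bit)


def has_supermajority(committee: ToyCommittee, aggregation_bits: List[bool]) -> bool:
    """Integer form of attesting / total >= 2/3, which avoids rounding."""
    total = sum(committee.balances)
    return 3 * attesting_balance(committee, aggregation_bits) >= 2 * total


GWEI_PER_ETH = 10 ** 9
committee = ToyCommittee(
    pubkeys=[bytes([i]) * 48 for i in range(4)],
    balances=[32 * GWEI_PER_ETH] * 4,
)

print(has_supermajority(committee, [True, True, True, False]))
print(has_supermajority(committee, [True, True, False, False]))
```

in a real update, the aggregate signature would be checked against the pubkeys selected by the same bits before any of this weight is counted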
----
### Update data size
----
- shard_block_root: 32 bytes
- fork_version: 4 bytes
- aggregation_bits: 16 bytes
- signature: 96 bytes
- header: 8 + 32 + 32 + 32 + 96 = 200 bytes
- header_branch: 4 * 32 = 128 bytes
- committee: 128 * (48 + 8) = 7,168 bytes
- committee_branch: (5 + 10) * 32 = 480 bytes
Total: **8,124 bytes** per **~27 hours**
----
Total: 8,124 bytes per ~27 hours or
**~0.083 bytes per second**
vs.
Bitcoin SPV: 80 bytes per ~560 seconds
**~0.143 bytes per second**
note:
Our light client sync requires approx 0.083 bytes per second
For reference, bitcoin's light client protocol requires
approx 0.143 bytes per second
So we're doing pretty good
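the arithmetic behind these numbers, as a sketch. the field sizes are the ones from the slide; rounding the period to exactly 27 hours is an assumption, so the rate only lands near the ~0.083 figure

```python
UPDATE_FIELD_BYTES = {
    "shard_block_root": 32,
    "fork_version": 4,
    "aggregation_bits": 16,
    "signature": 96,
    "header": 8 + 32 + 32 + 32 + 96,
    "header_branch": 4 * 32,
    "committee": 128 * (48 + 8),
    "committee_branch": (5 + 10) * 32,
}

PERIOD_SECONDS = 27 * 60 * 60  # "~27 hours", rounded (assumption)
BITCOIN_HEADER_BYTES = 80
BITCOIN_BLOCK_SECONDS = 560    # figure used on the slide

total_bytes = sum(UPDATE_FIELD_BYTES.values())
eth2_rate = total_bytes / PERIOD_SECONDS
bitcoin_rate = BITCOIN_HEADER_BYTES / BITCOIN_BLOCK_SECONDS

print(f"update size: {total_bytes} bytes")
print(f"eth2:    {eth2_rate:.3f} bytes/s")
print(f"bitcoin: {bitcoin_rate:.3f} bytes/s")
```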
---
## Data/Proof Requests
Once we're synced, now what?
A: Gimme proofs
----
### Who has what data
All validators have recent beacon chain state
1/1024 validators have recent shard state
Relayers/State providers have EE state
note:
this affects how we will request data
----
### Proof sizes: Token EE Balance Example
1. light client update (w/o new committee)? 476 bytes
2. finalized block root? 224 bytes
3. state root? 128 bytes
4. crosslink data? 672 bytes
5. ee state root? ~736 bytes
6. token balance? ~1024 bytes
note:
let's think about what a light client would need to get their updated balance on some shard on some ee
let's look at the path we would need, and roughly what the size is
total ~3.2kb
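adding up the path, as a sketch using the sizes from the slide (the last two are approximate there too). the first step is the light client update minus the committee and its branch

```python
# light client update without the committee and committee branch
update_without_committee = 32 + 4 + 16 + 96 + 200 + 128

PROOF_STEP_BYTES = [
    ("light client update (w/o new committee)", update_without_committee),
    ("finalized block root", 224),
    ("state root", 128),
    ("crosslink data", 672),
    ("ee state root", 736),   # approximate
    ("token balance", 1024),  # approximate
]

total_proof_bytes = sum(size for _, size in PROOF_STEP_BYTES)

for name, size in PROOF_STEP_BYTES:
    print(f"{name}: {size} bytes")
print(f"total: {total_proof_bytes} bytes")
```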
---
## Open questions
* Request structure
  * How do clients request which data they need?
* Networking
  * How do clients pick servers to connect to?
  * How does a client quickly connect to a new shard?
* Incentivization
  * How do clients pay servers?
  * Should shard nodes be light client servers by default?
---
<!-- .slide: data-background="#283032" -->
## End
github.com/chainsafe/lodestar
----
References
- https://blog.ethereum.org/2015/01/10/light-clients-proof-stake/
- https://arxiv.org/pdf/1710.09437.pdf
- https://blog.ethereum.org/2014/11/25/proof-stake-learned-love-weak-subjectivity/
- https://github.com/ethereum/eth2.0-specs/blob/dev/specs/light_client/sync_protocol.md
- https://medium.com/@jgm.orinoco/understanding-sparse-merkle-multiproofs-9b9f049e8f08
- @ralexstokes and @dannyryan