Lighthouse Security Review

This document is intended to provide security reviewers an overview of the Lighthouse project.

Project Overview

The primary goal of the https://github.com/sigp/lighthouse project is to serve the following three groups of users:

Eth2 stakers.
Those who wish to obtain a view of the eth2 network via our API.
Those who wish to run a beacon node (or boot node) for the good of the network.

We serve these users via a single binary named lighthouse.

The binary

The lighthouse binary has the following subcommands:

lighthouse bn: provides a "beacon node" (BN)
- Heavy-weight, long-running service
- Connects to libp2p
- Verifies messages from network
- Serves an HTTP API
lighthouse vc: provides a "validator client" (VC)
- Light-weight, long-running service
- Requires one connection: a lighthouse bn serving the HTTP API
- Holds secret keys in memory.
- Signs consensus messages.
- Keeps track of the message-production schedule of one or more eth2 validators: requests messages from the BN, signs them, then sends them back to the BN to propagate on the network.
lighthouse account: provides commands for managing validators:
- CLI tool (i.e., not a long-running service)
- Creates, encrypts, decrypts secrets for eth2 validators.
- Submits deposits to the official EF deposit smart contract via an eth1 node (geth, etc.).
- Uses EIP-2335 for BLS JSON keystore format.
- Uses EIP-2386 (draft) for mnemonic-based (BIP-39) wallets which can derive multiple, deterministic keystores (via EIP-2334).

Getting familiar with the application

Lighthouse has documentation here: https://lighthouse-book.sigmaprime.io/

To get familiar with Lighthouse I suggest the following:

Easy: Sync the Medalla testnet
- Medalla is the current eth2 testnet. It has 3+ different implementations and has been running for over a month. The EF is actively involved in this.
- First install lighthouse, then you can just run lighthouse bn --http and watch the logs.
Medium: start your own local eth2 network (brand-new chain) using the Simple Local Testnet scripts.
Harder: create a validator on the Medalla testnet with these docs.
- Ask us if you need Goerli ETH to submit a deposit.
- It doesn't matter if you don't keep this validator alive; there are ~50k validators currently.

Diagrams

We have two quite basic diagrams at this draw.io document, which demonstrate how messages from the network and the HTTP API are processed and stored in databases.

Scope

Basically everything in https://github.com/sigp/lighthouse is in-scope, except lcli (it's our tool for doing development operations and is not user-facing).

We're particularly interested in finding:

Deadlocks/concurrency issues.
Integer overflows.
OOB array access.
Error propagation issues (i.e., error suppression).
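
As a rough illustration of the last three bug classes in Rust terms, the sketch below contrasts operations that can wrap or panic with their checked equivalents, and propagates errors rather than discarding them. All names here (SketchError, add_reward, balance_at, rewarded_balance) are hypothetical and are not taken from the Lighthouse codebase.

```rust
/// Hypothetical error type for this sketch; not a Lighthouse type.
#[derive(Debug, PartialEq)]
pub enum SketchError {
    Overflow,
    OutOfBounds,
}

/// Adds a reward to a balance. `checked_add` returns `None` on overflow,
/// where `+` would panic in debug builds and wrap in release builds.
pub fn add_reward(balance: u64, reward: u64) -> Result<u64, SketchError> {
    balance.checked_add(reward).ok_or(SketchError::Overflow)
}

/// Reads a balance by index. `get` returns `None` for an out-of-bounds
/// index, where `balances[index]` would panic.
pub fn balance_at(balances: &[u64], index: usize) -> Result<u64, SketchError> {
    balances.get(index).copied().ok_or(SketchError::OutOfBounds)
}

/// Propagates both kinds of error to the caller with `?` instead of
/// suppressing them (e.g. with `let _ = ...` or a silent default).
pub fn rewarded_balance(
    balances: &[u64],
    index: usize,
    reward: u64,
) -> Result<u64, SketchError> {
    let balance = balance_at(balances, index)?;
    add_reward(balance, reward)
}
```

Patterns of the opposite shape (unchecked arithmetic on untrusted values, direct indexing, and discarded Result values) are the kind of thing we would like flagged.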

Note on BLS dependencies

Lighthouse can be configured at compile time to use one of the two BLS libraries:

We use (1) as the default since it's currently the fastest. However, we have maintained (2) for quite some time.

Whilst I think a cursory glance would be useful, we do not require a full review of these two libraries at this point. (1) is going to have a review organised by the authors and it's not clear we will use (2) in production at this stage.

Networking

The entire networking stack is in scope, except for the external rust-libp2p dependency (which also contains our gossipsub implementation).

Specifically, we would be interested in reviews of the eth2_libp2p and network crates inside lighthouse as well as the external sigp/discv5 repository which houses our discovery v5 implementation.

We are interested in all kinds of attacks. Of particular interest are DoS vectors, possible panics arising from external maliciously crafted packets, feasible eclipse attacks or sybils.

Hot spots

Here we've listed some components that we think are "hot spots" for vulns. We've rated them on a scale of 1 to 3 chillis, where 3 means we think it's the riskiest.

Block processing
- Verification before importing to database 🌶️🌶️🌶️
- Verification before gossiping 🌶️
- Skipping slots is a DoS vector 🌶️
Attestation processing
- Verification before importing to op pool/fork choice. 🌶️🌶️🌶️
- Verification before gossiping 🌶️
Shuffling algo 🌶️
Caches in BeaconChain
- Consistency/accuracy 🌶️🌶️🌶️
- Memory ballooning 🌶️
Operation pool
- Could produce invalid blocks 🌶️🌶️
  - Note: we're not double-checking signature validity.
- Memory ballooning 🌶️
SSZ decoding 🌶️🌶️🌶️
Fork choice
- Risk for deviation from spec. 🌶️
- Risk for memory blowout. 🌶️
Key management
- Wallets, keystores & key derivation. 🌶️🌶️🌶️
- File permissions, etc. 🌶️🌶️🌶️
Slashing protection
- Don't want people to get slashed. 🌶️🌶️🌶️
Finding genesis from eth1 contract
- Inconsistency from spec 🌶️
Database inconsistency 🌶️
Validator client
- Locking/stopping 🌶️
Syncing (quite complex logically and in code form. This may require significant effort to audit) 🌶️🌶️
- Edge cases that arise from malicious or unforeseen STATUS messages from peers.
Peer Management 🌶️
- We have a rudimentary scoring system. The handling of newly connected peers and of timeouts may disrupt the peer management database. There may be edge cases or ways to game the current system.
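
For the slashing protection hot spot above, the following is a minimal sketch of the attestation check a validator client needs before signing, based on the double-vote and surround-vote conditions in the eth2 spec. The types and the check_vote function are hypothetical, and comparing signing roots is used here as an assumed stand-in for comparing attestation data. This is not Lighthouse's implementation.

```rust
/// Hypothetical record of a previously signed attestation.
#[derive(Debug, Clone, PartialEq)]
pub struct SignedVote {
    pub source_epoch: u64,
    pub target_epoch: u64,
    /// Assumed stand-in for "the attestation data that was signed".
    pub signing_root: [u8; 32],
}

/// Reasons a new attestation would be slashable if signed.
#[derive(Debug, PartialEq)]
pub enum Unsafe {
    /// Same target epoch as a previous vote, but different data.
    DoubleVote,
    /// The new vote surrounds a previous vote.
    Surrounds,
    /// The new vote is surrounded by a previous vote.
    Surrounded,
}

/// Returns `Ok(())` only if signing `new` cannot be combined with any
/// vote in `history` to form a slashable pair.
pub fn check_vote(history: &[SignedVote], new: &SignedVote) -> Result<(), Unsafe> {
    for prev in history {
        if prev.target_epoch == new.target_epoch && prev.signing_root != new.signing_root {
            return Err(Unsafe::DoubleVote);
        }
        if new.source_epoch < prev.source_epoch && prev.target_epoch < new.target_epoch {
            return Err(Unsafe::Surrounds);
        }
        if prev.source_epoch < new.source_epoch && new.target_epoch < prev.target_epoch {
            return Err(Unsafe::Surrounded);
        }
    }
    Ok(())
}
```

A linear scan is used only to keep the sketch short. The properties worth reviewing in a real implementation are that the history is persisted before the signature is released and that concurrent signers cannot race past the check.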

Fuzzing Notes

Fuzzing has been a priority of the team. Please see the following branches (some of these are quite behind master now!):
- fuzzing-state-transition
- rpc-fuzz
- fuzz-ssz-snappy-codec
- arbitrary-fuzzing-fuzzer
A lot of the fuzzing work has been moved here, as part of the Beacon Fuzz project.
It mostly targets state transition functions, although other parts of the codebase have also been fuzzed (e.g. networking).
Thanks to this PR, we can now perform structural fuzzing, i.e. use well-formed instances of custom types by deriving and implementing the Arbitrary trait, allowing us to create structured inputs from raw byte buffers.
We've been using the following fuzzing engines (as you probably know they all provide different mutation algorithms and have detected different bugs):
- libFuzzer
- Honggfuzz
- AFL
The latest Beacon Fuzz update might be quite useful in understanding where we're currently at:
- https://blog.sigmaprime.io/beacon-fuzz-07.html
You can check out the Trophies section of the Beacon Fuzz README for a (non-exhaustive) list of bugs identified
Other interesting fuzzing targets (networking):
- discv5: https://github.com/sigp/discv5/tree/fuzz-5.1
- enr: https://github.com/rust-ethereum/enr/tree/fuzz
- gossipsub-v1.1: https://github.com/sigp/rust-libp2p/tree/gossipsub-v1.1-fuzz
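
To make the structural fuzzing idea mentioned above concrete, here is a self-contained sketch of building a well-formed typed value from a raw fuzzer byte buffer. The FromFuzzBytes trait and SketchHeader struct are hypothetical stand-ins written for this example. The real Arbitrary trait has a different signature, and the fuzzing branches should be consulted for the actual derives.

```rust
/// Hypothetical stand-in for a structure-aware fuzzing trait.
/// It only illustrates the idea; it is not the `Arbitrary` trait's API.
pub trait FromFuzzBytes: Sized {
    /// Consumes bytes from the front of `data`, advancing the slice.
    /// Returns `None` when the buffer is too short.
    fn from_fuzz_bytes(data: &mut &[u8]) -> Option<Self>;
}

impl FromFuzzBytes for u64 {
    fn from_fuzz_bytes(data: &mut &[u8]) -> Option<Self> {
        if data.len() < 8 {
            return None;
        }
        let (head, tail) = data.split_at(8);
        *data = tail;
        let mut buf = [0u8; 8];
        buf.copy_from_slice(head);
        Some(u64::from_le_bytes(buf))
    }
}

/// Hypothetical custom type that a fuzz target wants as input.
#[derive(Debug, PartialEq)]
pub struct SketchHeader {
    pub slot: u64,
    pub proposer_index: u64,
}

impl FromFuzzBytes for SketchHeader {
    fn from_fuzz_bytes(data: &mut &[u8]) -> Option<Self> {
        // Each field consumes its own bytes, so any sufficiently long
        // buffer yields a structurally valid value.
        Some(SketchHeader {
            slot: u64::from_fuzz_bytes(data)?,
            proposer_index: u64::from_fuzz_bytes(data)?,
        })
    }
}
```

A fuzz target would build its input this way and pass the typed value straight to the function under test, so mutations are spent on field values rather than on getting past the decoder.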

Eth2 Resources

Here are some resources for the broader Eth2 specification.

The spec repo

The canonical eth2 spec is here: https://github.com/ethereum/eth2.0-specs/tree/v0.12.2

Notice that I've given you a link to the v0.12.2 tag. This is the latest version and what Lighthouse is using. Don't make the mistake of accidentally using the default dev branch, although everyone eventually does!

Interesting bits:

The Beacon Chain: genesis, block processing, other operation processing. Most of your consensus questions are here.
Fork Choice: defines the eth2 fork choice rule.
Honest Validator: "The Beacon Chain" section tells you what you can and can't do, but this tells you what you should do as a validator. Very useful.
Networking spec: defines a bunch of networking stuff. Although core networking is out of the scope of this review, this section contains the "should forward … message" logic that is in the Lighthouse BeaconChain. I think this is definitely worth looking at.
SimpleSerialize defines:
- Encoding/decoding bytes.
- Merkle hashing.
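
As a small worked example of the encoding/decoding side, SSZ serializes a uint64 as 8 little-endian bytes, and a list of fixed-size items as the concatenation of its items. The sketch below decodes both from untrusted bytes, returning errors rather than panicking. The function and error names are hypothetical and do not reflect the API of Lighthouse's SSZ crate.

```rust
/// Hypothetical decode errors for this sketch.
#[derive(Debug, PartialEq)]
pub enum DecodeError {
    /// The input length does not match what the type requires.
    InvalidLength { len: usize, expected: usize },
    /// The list holds more items than its declared maximum.
    TooManyItems { count: usize, max: usize },
}

/// Decodes an SSZ uint64: exactly 8 bytes, little-endian.
pub fn decode_u64(bytes: &[u8]) -> Result<u64, DecodeError> {
    if bytes.len() != 8 {
        return Err(DecodeError::InvalidLength { len: bytes.len(), expected: 8 });
    }
    let mut buf = [0u8; 8];
    buf.copy_from_slice(bytes);
    Ok(u64::from_le_bytes(buf))
}

/// Decodes an SSZ list of uint64 with at most `max_len` items.
/// The length checks run before any allocation, so an oversized
/// input is rejected without growing memory.
pub fn decode_u64_list(bytes: &[u8], max_len: usize) -> Result<Vec<u64>, DecodeError> {
    if bytes.len() % 8 != 0 {
        // Report the next multiple of 8 below the input length as "expected".
        let expected = bytes.len() - bytes.len() % 8;
        return Err(DecodeError::InvalidLength { len: bytes.len(), expected });
    }
    let count = bytes.len() / 8;
    if count > max_len {
        return Err(DecodeError::TooManyItems { count, max: max_len });
    }
    bytes.chunks_exact(8).map(decode_u64).collect()
}
```

Variable-size containers add offset tables on top of this, which is where most of the interesting bounds checks live.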

Protolambda

Diederik (@protolambda) is an EF researcher and does lots of useful and interesting things.

His Eth 2.0 education resources repo has some great diagrams that can help with understanding SSZ, hashing, state processing, etc.

Eth R&D Discord

The Eth2 R&D Discord is where you can reach the EF researchers and ask questions.

Lighthouse Development Update

We release development updates ~once a month in the form of blog posts here.

The latest blog post is available here