# Obol Report
*Prepared by: Alex Wade (Ethereal Ventures)
Date: Jan 2023*
Over the past month, I worked with Obol to review their software development practices in preparation for their upcoming security audits. My goals were to review and analyze:
* Software development processes
* Vulnerability disclosure and escalation procedures
* Key personnel risk
The information in this report was collected through a series of interviews with Obol's project leads.
**Contents:**
[TOC]
### Background Info
:::info
*Each team lead was asked to describe Obol in terms of its goals, objectives, and key features.*
:::
**What is Obol?**
Obol builds DVT (Distributed Validator Technology) for Ethereum.
**What is Obol's goal?**
Obol's goal is to solve a classic distributed systems problem: uptime.
Rather than requiring Ethereum validators to stake on their own, Obol allows groups of operators to stake together. Using Obol, a single validator can be run cooperatively by multiple people across multiple machines.
In theory, this architecture provides validators with some redundancy against common issues: server and power outages, client failures, and more.
**What are Obol's objectives?**
Obol's business objective is to provide base-layer infrastructure to support a distributed validator ecosystem. As Obol provides base layer technology, other companies and projects will build on top of Obol.
Obol's business model is to eventually capture a portion of the revenue generated by validators that use Obol infrastructure.
**What is Obol's product?**
Obol's product consists of three main components, each run by its own team: a webapp, a client, and smart contracts.
* **Launchpad**: webapp to create and manage distributed validators.
* **Charon**: middleware client that enables operators to run distributed validators.
* **Solidity**: withdrawal and fee recipient contracts for use with distributed validators.
### Analysis - Cluster Setup and DKG
The Launchpad guides users through the process of creating a cluster config file, which defines important parameters like the validator's fee recipient and withdrawal addresses, as well as the identities of the other operators in the cluster. In order to ensure their cluster configuration is correct, users need to rely on a few different factors.
**First, users need to trust the Charon client** to perform DKG correctly, and validate things like:
* Config file is well-formed and is using the expected version
* Signatures and ENRs from other operators are valid
* Cluster config hash is correct
* DKG succeeds in producing valid signatures
* Deposit data is well-formed and is correctly generated from the cluster config and DKG.
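Charon defines its own hashing scheme for the cluster definition, but the principle behind the config-hash check is easy to sketch: every operator recomputes a deterministic digest over the same canonical encoding and compares values. A minimal illustration (the JSON-plus-SHA-256 encoding here is an assumption for demonstration, not Charon's actual scheme):

```python
import hashlib
import json

def config_digest(cluster: dict) -> str:
    """Deterministic digest over a canonically-encoded cluster definition.

    Illustrative only -- Charon uses its own config-hash scheme. The point
    is that each operator can recompute the hash locally and compare it
    against the value shown by the Launchpad.
    """
    canonical = json.dumps(cluster, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode()).hexdigest()

# Two operators encoding the same definition get the same hash...
definition = {"name": "demo-cluster", "threshold": 3, "num_validators": 1}
assert config_digest(definition) == config_digest(dict(definition))

# ...and any tampering (e.g. a lowered signing threshold) is detectable.
tampered = {**definition, "threshold": 2}
assert config_digest(tampered) != config_digest(definition)
```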
However, Charon's validation is limited to technical correctness: signature checks, cluster file syntax, and so on. It does NOT help would-be operators determine whether the other operators listed in their cluster definition are the real people with whom they intend to start a DVT cluster. So -
**Second, users need to come to social consensus with fellow operators.** While the cluster is being set up, it's important that each operator is an *active* participant. Each member of the group must validate and confirm that:
* the cluster file correctly reflects their address and node identity, and reflects the information they received from fellow operators
* the cluster parameters are as expected -- namely, the number of validators and signing threshold
**Finally, users need to perform independent validation.** Each user should perform their own validation of the cluster definition:
* Is my information correct? (address and ENR)
* Does the information I received from the group match the cluster definition?
* Is the ETH2 deposit data correct, and does it match the information in the cluster definition?
* Are the withdrawal and fee recipient addresses correct?
These final steps are potentially the most difficult, and may require significant technical knowledge.
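At least one of these checks can be done mechanically. For 0x01-type (execution-address) withdrawal credentials, the consensus spec fixes the format: a `0x01` byte, eleven zero bytes, then the 20-byte withdrawal address. A small sketch of that check -- it validates only the credentials field, not BLS signatures or the deposit data root:

```python
def check_withdrawal_credentials(credentials_hex: str, withdrawal_address_hex: str) -> bool:
    """Check that 0x01-type withdrawal credentials commit to the expected
    execution-layer address: 0x01 || 11 zero bytes || 20-byte address."""
    creds = bytes.fromhex(credentials_hex.removeprefix("0x"))
    addr = bytes.fromhex(withdrawal_address_hex.removeprefix("0x"))
    return (
        len(creds) == 32
        and len(addr) == 20
        and creds[0] == 0x01
        and creds[1:12] == b"\x00" * 11
        and creds[12:] == addr
    )

# Hypothetical example values:
withdrawal_address = "0x" + "ab" * 20
good_creds = "0x01" + "00" * 11 + "ab" * 20
assert check_withdrawal_credentials(good_creds, withdrawal_address)

bad_creds = "0x01" + "00" * 11 + "cd" * 20  # credentials point somewhere else
assert not check_withdrawal_credentials(bad_creds, withdrawal_address)
```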
#### Key Risks
##### 1. Validation of Contract Deployment and Deposit Data Relies Heavily on Launchpad
From my interviews, it seems that the user deploys both the withdrawal and fee recipient contracts through the Launchpad.
What I'm picturing is that during the first parts of the cluster setup process, the user is prompted to sign one or more transactions deploying the withdrawal and fee recipient contracts to mainnet. The Launchpad apparently uses an npm package to deploy these contracts: `0xsplits/splits-sdk`, which I assume provides either JSON artifacts or a factory address on chain. The Launchpad then places the deployed contracts into the cluster config file, and the process moves on.
If an attacker has published a malicious update to the Launchpad (or compromised an underlying dependency), the contracts deployed by the Launchpad may be malicious. The questions I'd like to pose are:
* How does the group creator know the Launchpad deployed the correct contracts?
* How does the rest of the group know the creator deployed the contracts through the Launchpad?
My understanding is that this ultimately comes down to the independent verification that each of the group's members performs during and after the cluster's setup phase.
At its worst, this verification might consist solely of the cluster creator confirming to the others that, yes, *those addresses match the contracts I deployed through the Launchpad.*
A more sophisticated user might verify that *not only do the addresses match, but the deployed source code looks roughly correct.* However, this step is far out of the realm of many would-be validators. To be really certain that the source code is correct would require auditor-level knowledge.
The risk is that:
* the deployed contracts are NOT the correctly-configured 0xsplits waterfall/fee splitter contracts
* most users are ill-equipped to make this determination themselves
* we don't want to trust the Launchpad as the single source of truth
In the worst case, the cluster may end up depositing with malicious withdrawal or fee recipient credentials. If unnoticed, this may net an attacker the entire withdrawal amount, once the cluster exits.
Note that the same (or similar) risks apply to validation of deposit data, which has the potential to be similarly difficult. I'm a little fuzzy on which part of the Obol stack actually generates the deposit data / deposit transaction, so I can't speak to this as much. However, I think the mitigation for both of these is roughly the same - read on!
*Mitigation:*
It's certainly a good idea to make it harder to deploy malicious updates to the Launchpad, but this may not be entirely possible. A higher-yield strategy may be to educate and empower users to perform independent validation of the DVT setup process - without relying on information fed to them by Charon and the Launchpad.
I've outlined some ideas for this in [#R1](#R1-Users-should-deploy-cluster-contracts-through-a-known-on-chain-entry-point) and [#R2](#R2-Users-should-deposit-to-the-beacon-chain-through-a-pool-contract).
##### 2. Social Consensus, aka "Who sends the 32 ETH?"
Depositing to the beacon chain requires a total of 32 ETH. Obol's product allows multiple operators to act as a single validator together, which means would-be operators need to agree on how to fund the 32 ETH needed to initiate the deposit.
It is my understanding that currently, this process comes down to trust and loose social consensus. Essentially, the group needs to decide who chips in what amount together, and then trust someone to take the 32 ETH and complete the deposit process correctly (without running away with the money).
Granted, the initial launch of Obol will be open only to a small group of people as the kinks in the system get worked out - but in preparation for an eventual public release, the deposit process needs to be much simpler and far less reliant on trust.
*Mitigation:* See [#R2](#R2-Users-should-deposit-to-the-beacon-chain-through-a-pool-contract).
#### Potential Attack Scenarios
During the interview process, I learned that each of Obol's core components has its own GitHub repo, and that each repo has roughly the same structure in terms of organization and security policies. For each repository:
* Admin access is held by both Oisin and that product's lead
* In order to merge PRs, the submitter needs:
* CI/CD checks to pass
* Review from one person (anyone at Obol)
* Approval from the repo admin
Of course, admin access also means the ability to change these settings - so repo admins could theoretically merge PRs without needing checks to pass, and without review/approval.
The following scenarios describe the impact an attack may have.
##### 1. Publishing a malicious version of the Launchpad, or compromising an underlying dependency
:::danger
* *Reward:* High
* *Difficulty:* Medium-Low
:::
As described in [Key Risks](#Key-Risks), publishing a malicious version of the Launchpad has the potential to net the largest payout for an attacker. By tampering with the cluster's deposit data or withdrawal/fee recipient contracts, an attacker stands to gain 32 ETH or more per compromised cluster.
During the interviews, I learned that merging PRs to `main` in the Launchpad repo triggers an action that publishes to the site. Given that merges can be performed by either Edax or Oisin (as repo admins), both are prime targets for social engineering attacks.
Additionally, the use of the `0xsplits/splits-sdk` NPM package to aid in contract deployment may represent a supply chain attack vector. It may be that this applies to other Launchpad dependencies as well.
In any case, with a fairly large surface area and high potential reward, this scenario represents a credible risk to users during the cluster setup and DKG process.
See [#R1](#R1-Users-should-deploy-cluster-contracts-through-a-known-on-chain-entry-point), [#R2](#R2-Users-should-deposit-to-the-beacon-chain-through-a-pool-contract), and [#R3](#R3-Raise-the-barrier-to-entry-to-push-an-update-to-the-Launchpad) for some ideas to address this scenario.
##### 2. Publishing a malicious version of Charon to new operators
:::warning
* *Reward:* Medium
* *Difficulty*: High
:::
During the cluster setup process, Charon is responsible both for validating the cluster configuration produced by the Launchpad and for performing a DKG ceremony between a group's operators.
If new operators use a malicious version of Charon to perform this process, it may be possible to tamper with both of these responsibilities, or even get access to part or all of the underlying validator private key created during DKG.
However, the difficulty of this type of attack seems quite high. An attacker would first need to carry out the same type of social engineering attack described in scenario 1 to publish and tag a new version of Charon. Crucially, users would also need to install the malicious version - unlike the Launchpad, an update here is not pushed directly to users.
As long as Obol is clear and consistent with communication around releases and versioning, it seems unlikely that a user would both install a brand-new, unannounced release, and finish the cluster setup process before being warned about the attack.
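One concrete habit that blunts this scenario: operators verify a downloaded Charon release against a published checksum before installing it. This assumes Obol publishes checksums out-of-band (e.g. in release notes) -- I haven't verified the current release process, so treat this as a suggestion, not a description:

```python
import hashlib

def sha256_file(path: str) -> str:
    """Stream a file through SHA-256 and return the hex digest."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(8192), b""):
            h.update(chunk)
    return h.hexdigest()

def verify_release(path: str, published_checksum: str) -> bool:
    """Compare a downloaded binary against the published checksum."""
    return sha256_file(path) == published_checksum.lower()

# Demo with a stand-in "binary":
import os, tempfile
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(b"charon-binary-bytes")
    path = f.name
expected = hashlib.sha256(b"charon-binary-bytes").hexdigest()
assert verify_release(path, expected)
assert not verify_release(path, "00" * 32)
os.unlink(path)
```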
##### 3. Publishing a malicious version of Charon to existing validators
:::info
* *Reward:* Low
* *Difficulty:* High
:::
Once a distributed validator is up and running, much of the danger has passed. As a middleware client, Charon sits between a node's consensus client and validator client. As such, it shouldn't have direct access to a validator's withdrawal keys.
If existing validators update to a malicious version of Charon, it's likely the worst thing an attacker could do is slash the validator.
This is not likely to be particularly motivating to potential attackers - and paired with the high difficulty described above, this scenario seems unlikely to cause significant issues.
### Recommendations
#### R1: Users should deploy cluster contracts through a known on-chain entry point
During setup, users should only sign one transaction via the Launchpad - to a contract located at an Obol-held ENS (e.g. `launchpad.obol.eth`). This contract should deploy everything needed for the cluster to operate, like the withdrawal and fee recipient contracts. It should also initialize them with the provided reward split configuration (and any other config needed).
Compared to using an NPM library to supply a factory address or JSON artifacts, this approach has the benefit of being both:
* **Harder to compromise**: as long as the user knows `launchpad.obol.eth`, it's pretty difficult to trick them into deploying the wrong contracts.
* **Easier to validate for non-technical users**: the Obol contract can be queried for deployment information via etherscan. For example:
![](https://i.imgur.com/dEWB9Pv.png)
Note that in order for this to be successful, Obol needs to provide detailed steps for users to perform manual validation of their cluster setups. Users should be able to treat this as a "checklist:"
- [ ] Did I send a transaction to `launchpad.obol.eth`?
- [ ] Can I use the ENS name to locate and query the deployment manager contract on etherscan?
- [ ] If I input my address, does etherscan report the configuration I was expecting?
- [ ] withdrawal address matches
- [ ] fee recipient address matches
- [ ] reward split configuration matches
As long as these steps are plastered all over the place (i.e. not just on the Launchpad) and Obol puts in effort to educate users about the process, this approach should allow users to validate cluster configurations themselves - regardless of Launchpad or NPM package compromise.
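To make the checklist concrete, here is a behavioral sketch (in Python, not Solidity) of what the deployment manager behind `launchpad.obol.eth` might expose: a single entry point that deploys the cluster contracts and records the configuration, queryable later by creator address. All names here are hypothetical:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class ClusterConfig:
    withdrawal_address: str
    fee_recipient: str
    reward_split: tuple  # e.g. ((operator_address, share_pct), ...)

class DeploymentManager:
    """Hypothetical model of the on-chain deployment manager contract."""

    def __init__(self):
        self._deployments = {}

    def deploy_cluster(self, creator: str, config: ClusterConfig) -> None:
        """One transaction: deploy the withdrawal and fee recipient
        contracts and record the configuration under the creator."""
        self._deployments[creator] = config

    def get_config(self, creator: str) -> Optional[ClusterConfig]:
        """What a user would read back via etherscan to validate setup."""
        return self._deployments.get(creator)

# A cluster member validates by querying with the creator's address
# and comparing the result against what the group agreed on:
manager = DeploymentManager()
agreed = ClusterConfig("0x" + "aa" * 20, "0x" + "bb" * 20, (("op1", 50), ("op2", 50)))
manager.deploy_cluster("0x" + "11" * 20, agreed)
assert manager.get_config("0x" + "11" * 20) == agreed
```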
#### R2: Users should deposit to the beacon chain through a pool contract
Once cluster setup and DKG is complete, a group of operators should deposit to the beacon chain by way of a pool contract. The pool contract should:
* Accept ETH from any of the group's operators
* Stop accepting ETH when the contract's balance hits (32 ETH * number of validators)
* Make it easy to pull the trigger and deposit to the beacon chain once the critical balance has been reached
* Offer all of the group's operators a "bail" option at any point before the deposit is triggered
Ideally, this contract is deployed during the setup process described in [#R1](#R1-Users-should-deploy-cluster-contracts-through-a-known-on-chain-entry-point), as another step toward allowing users to perform independent validation of the process.
Rather than relying on social consensus, this should:
* Allow operators to fund the validator without needing to trust any single party
* Make it harder to mess up the deposit or send funds to some malicious actor, as the pool contract should know what the beacon deposit contract address is
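The bullet points above amount to a small state machine, which can be modeled directly. A behavioral sketch (Python, not Solidity -- the method names are hypothetical):

```python
# Mainnet beacon chain deposit contract -- hardcoding this in the pool is
# what prevents funds from being routed to a malicious address.
DEPOSIT_CONTRACT = "0x00000000219ab540356cBB839Cbe05303d7705Fa"

class DepositPool:
    """Hypothetical model of the proposed deposit pool contract."""

    def __init__(self, num_validators: int):
        self.target = 32 * num_validators  # ETH required before deposit
        self.balances = {}
        self.deposited = False

    @property
    def total(self) -> int:
        return sum(self.balances.values())

    def contribute(self, operator: str, amount: int) -> None:
        """Accept ETH from any operator, capped at the target."""
        if self.deposited:
            raise RuntimeError("deposit already sent")
        if self.total + amount > self.target:
            raise ValueError("contribution would exceed the target balance")
        self.balances[operator] = self.balances.get(operator, 0) + amount

    def bail(self, operator: str) -> int:
        """Refund an operator in full at any point before the deposit."""
        if self.deposited:
            raise RuntimeError("deposit already sent")
        return self.balances.pop(operator, 0)

    def deposit(self) -> str:
        """Once full, funds can only ever go to the known deposit contract."""
        if self.total != self.target:
            raise RuntimeError("pool not yet full")
        self.deposited = True
        return DEPOSIT_CONTRACT

# Two operators fund a single validator without trusting each other:
pool = DepositPool(num_validators=1)
pool.contribute("alice", 16)
pool.contribute("bob", 16)
assert pool.deposit() == DEPOSIT_CONTRACT
```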
#### R3: Raise the barrier to entry to push an update to the Launchpad
Currently, either repo admin (Edax or Oisin) can publish an update to the Launchpad unchecked.
Given the risks and scenarios outlined above, consider amending this process so that the sole compromise of either admin is not sufficient to publish to the Launchpad site. It may be worthwhile to require both Edax and Oisin to approve publishing to the site.
Along with simply adding additional prerequisites to publish an update to the Launchpad, ensure that both Edax and Oisin have enabled some level of multi-factor authentication on their GitHub accounts.
### Additional Notes
#### Vulnerability Disclosure
During the interviews, I got some conflicting information when asking about Obol's vulnerability disclosure process.
Some interviewees directed me towards Obol's security repo, which details security contacts: [`ObolNetwork/obol-security`](https://github.com/ObolNetwork/obol-security), while some answered that disclosure should happen primarily through Immunefi. While these may both be part of the correct answer, it seems that Obol's disclosure process may not be as well-defined as it could be. Here are some notes:
* I wasn't able to find information about Obol on Immunefi. I also didn't find any reference to a security contact or disclosure policy in Obol's docs.
* When looking into the `obol-security` repo, I noticed broken links in a few of the sections in `README.md` and `SECURITY.md`:
* [Security policy](https://github.com/ObolNetwork/obol-security#security-policy)
* [More Information](https://github.com/ObolNetwork/obol-security/blob/master/SECURITY.md#more-information)
* Some of the text and links in the [Bug Bounty Program](https://github.com/ObolNetwork/obol-security/blob/master/SECURITY.md#bug-bounty-program) don't seem to apply to Obol (see text referring to Vaults and Strategies).
* The [Receiving Disclosures](https://github.com/ObolNetwork/obol-security/blob/master/SECURITY.md#receiving-disclosures) section does not include a public key with which submitters can encrypt vulnerability information.
It's my understanding that these items are probably lower priority due to Obol's initial closed launch - but these should be squared away soon!
#### Key Personnel Risk - Oisin
In every interview, I asked about "key personnel" at Obol - people that have company account access, or access to specific private keys needed to access funds or carry out on-chain responsibilities on behalf of Obol. The common denominator in each of these conversations was Oisin.
On one hand, it's great that access to Obol's core permissions and functions isn't widely available to anyone at Obol. On the other hand, it also makes Oisin a high-value target for phishing and malware. I don't have the expertise to recommend mitigations to the risk this poses, but wanted to include some notes here to get the conversation started.
##### Account Access
Oisin has the highest level of access to Obol's accounts by far. In particular, I want to bring attention to:
* Register365 (DNS provider for .tech domain)
* GitHub organization admin role
These two accounts are particularly important to both uptime and security for Obol - especially considering the scenarios and risks described in this document.
For Register365, Oisin mentioned that the signup email for this account uses an @kind.eu domain. In the event this account is taken over or Oisin is unavailable, it may be more difficult for other members of Obol to prove their identities. It might be worth it to contact Register365 and see if they can enable additional security or recovery mechanisms on accounts.
The GitHub organization admin role is important for many reasons, but especially given that the Launchpad is built and deployed through GitHub. It's probably a good idea to explore additional ways to lock down this account.
Finally, Oisin mentioned that he's only using one laptop for both personal and work functions, and that he's currently signed in to pretty much everything on this machine. As it stands right now, access to this machine would probably give an attacker the ability to publish/tag a Charon release, as well as push a new version of the Launchpad to the site.
In addition to the other recommendations in this report, consider exploring Oisin's personal security as a final layer in protecting Obol's product.