###### Vyperlang team with special thanks to Omniscia team[^byline-note]
[^byline-note]: Special thanks to the [Omniscia](https://omniscia.io/) team, which, while not directly affiliated with Vyper, contributed substantial co-authorship, feedback and review of this post-mortem report
On the 30th of July, 2023, multiple Curve.Fi liquidity pools were exploited as a result of a latent vulnerability in the Vyper compiler, [specifically in versions `0.2.15`, `0.2.16`, and `0.3.0`](https://github.com/vyperlang/vyper/security/advisories/GHSA-5824-cm3x-3c38). While bug was identified and patched by the [`v0.3.1`](https://github.com/vyperlang/vyper/releases/tag/v0.3.1) release, the impact to protocols using the vulnerable compilers was not realized at the time and they were not explicitly notified. The vulnerability itself was an improperly implemented re-entrancy guard that could be bypassed under certain conditions which we will delve into in this report.
While the hacks themselves have been sufficiently covered in other post-mortems [including the official one by Curve.Fi](https://hackmd.io/@LlamaRisk/BJzSKHNjn), we would like to take a deep-dive into what exactly went wrong with the Vyper compiler itself, why the vulnerability was hard to spot, and what the ecosystem as a whole can learn from these incidents.
If you are familiar with the blockchain space and why Vyper exists, **we recommend skipping the Background** section as it contains very basic information that you most likely are aware of.
# Background
## Vyper
Vyper is a contract-oriented, domain-specific, pythonic programming language which targets the Ethereum Virtual Machine (EVM). Its goals include simplicity, pythonicity, security and auditability.
## EVM: A Single-Threaded Non-Concurrent Machine
A common problem that is a concern for code deployed on the EVM is the concept of *re-entrancy*. In contrast to traditional programs, the control flow of a "blockchain program" will be relinquished to the "active" program being executed at any given moment. From hereon, a "blockchain program" will be referred to as a contract.
To elaborate, we can think of all blockchain programs as operating on a single thread with no support for concurrency. Whenever a program invokes another, the entire control flow is passed on to the invoked program.
## Re-Entrancy: A Widespread Web 3.0 Problem
This means that the original caller's execution is essentially frozen in time until the invoked program concludes, at which point the caller resumes at the exact point they were left in. While different types of vulnerabilities can arise from this behaviour, the most well-known one is a *re-entrancy*.
As the control flow is relinquished to the invoked contract, the invoked contract can *re-enter* the original caller while it is frozen. The contracts that are vulnerable to this type of attack will contain state updates after their external contract invocation, meaning that when they are frozen **their state is outdated and incorrect**.
## Solutions
The ecosystem has come up with two ways to combat re-entrancy attacks and essentially render them ineffectual; the Checks-Effects-Interactions pattern (CEI) and re-entrancy guards.
### Checks-Effects-Interactions (CEI) Pattern
The CEI pattern is a programming methodology which dictates that a function's code should execute its security checks first, perform any effects in its storage after, and finally perform interactions with external contracts at the end of the function.
Should this pattern be followed strictly, the state of a contract during an "interaction" (i.e. control flow relinquishment) will be up-to-date and correct rendering any would-be exploitation impossible regardless of how the contract is re-entered.
### Re-Entrancy Guards
In most cases the CEI pattern is sufficient, however, the DeFi ecosystem is multi-faceted and oftentimes functions rely **on the result of an external call to proceed with their own execution**. In such cases, the CEI pattern is inapplicable and **a re-entrancy guard must be set in place**.
As one of the core principles in the Vyper language is *security*, Vyper decided to introduce a re-entrancy guard directly at the language-level via the special `@nonreentrant` function decorator. The re-entrancy guard has been a core feature of the language ever since the `v0.1.0-beta.9` release, one of the earliest versions of Vyper.
At their core, both implementations function identically; they set a storage value between two states (activated, inactive). When a function that is marked as `@nonreentrant` is invoked, the flag is:
- Ensured to NOT be in its active state
- Set to its active state
Once the function's invocation concludes, the flag is:
- Set to its inactive state
With this mechanism, `@nonreentrant` users can ensure that the function can only be re-invoked **after it concludes**, meaning that **no re-entrancy can occur regardless of what external calls are performed**. More complex forms of re-entrancy attacks exist (i.e. `view` only re-entrancies, cross-contract re-entrancies) but for the purposes of this vulnerability the basic case is what matters.
# Vyper Vulnerability Historical Timeline
## `@nonreentrant`: Tag-Based Re-Entrancy Guards
Ever since their introduction, `@nonreentrant` decorators [always supported a `<key>` to be set](https://github.com/vyperlang/vyper/pull/1264) offering more flexibility compared to a nonreentrancy lock which only applies globally at the contract level.
A simple implementation could be to use a `mapping` that takes the `key` and sets the relevant re-entrancy flag on it, however, such an approach would invoke extra cost due to the `keccak256` gas cost of `mapping` lookups.
As Vyper is a language which does not provide raw storage access to users, it will be fully aware of all storage slots a contract uses when compiling it. As such, it takes on the job of allocating storage slots, which includes ensuring that slots for storage variables and re-entrancy key locks do not overlap with each other.
[PR#1264](https://github.com/vyperlang/vyper/pull/1264) which introduced this feature in the [`v0.1.0-beta.9`](https://github.com/vyperlang/vyper/releases/tag/v0.1.0-beta.9) release of Vyper used a simple approach to ensure no overlap, storing re-entrancy flags at a specific offset from the contract's original slots (`0xFFFFFF` to be precise).
## Refactoring the Compiler
In tandem with new feature development, beginning in 2018, the Vyper compiler started a [multi-](https://github.com/vyperlang/vyper/issues/1024) [year](https://github.com/vyperlang/vyper/issues/1806) [effort](https://github.com/vyperlang/vyper/issues/2062) to refactor the then single-pass architecture into a multiple pass architecture which would separate the concerns of type checking and semantic analysis into a front-end, distinct from the code generating backend. As with most large refactoring projects, this effort was incremental and piecemeal, being worked on alongside other bugfixes and feature development until finally culminating in 2023 with [PR#3390](https://github.com/vyperlang/vyper/pull/3390).
## Location Optimization: Smarter Allocation of Storage Slots
[PR#2308](https://github.com/vyperlang/vyper/pull/2308) which was part of the Vyper [`v0.2.9`](https://github.com/vyperlang/vyper/releases/tag/v0.2.9) release was meant to make storage allocation smarter by utilizing the first unallocated storage slot available after all slots for regular storage variables of a contract have been processed, instead of starting from the `0xFFFFFF` constant. This would save bytecode space, because in the bytecode, the `PUSH` instructions which precede any load or store of a storage slot by pushing the location of the nonreentrancy key to the stack could use fewer bytes.
## Corruption Avoidance: Proper Offset Calculation
The above PR of the `v0.2.9` release worked well and would guarantee no overlap between re-entrancy flag slots and storage slots so long as the variables that are allocated sequentially in the (physical) front of the storage layout do not span multiple sequential slots.
As the Vyper language and codebase were undergoing significant refactoring at the time, [PR#2361](https://github.com/vyperlang/vyper/pull/2361) (part of the `v0.2.13` release) introduced a more efficient way to store variables that could span more than 1 storage slot (32 bytes) within a contract. As part of the larger refactoring effort, it also moved slot calculation for regular storage variables out of the existing code generation pass into the new front-end pass, but left slot calculation for re-entrancy keys as it was. Because the slot calculation for re-entrancy keys depended on the results of allocation for the regular storage variables, it ended up keeping two different allocator implementations for regular storage variable allocation between the front-end and codegen passes. This resulted in the offset calculations of [PR#2308](https://github.com/vyperlang/vyper/pull/2308) to be incorrect and require an update.
The update was introduced in [PR#2379](https://github.com/vyperlang/vyper/pull/2379) (part of the `v0.2.14` release), and was meant to calculate the storage offsets for re-entrancy flags correctly by taking into account the correct size of variables declared in storage rather than assuming all of them occupy a single slot (which was true in earlier implementations). However, the second update still had a bug stemming from the differences between the front-end and codegen allocator implementations, which we will describe below.
Because of these bugs, both `v0.2.13` and `v0.2.14` releases were "yanked"[^yank-explanation] shortly after release.
[^yank-explanation]: Briefly speaking, "yanking" means that the tags are available in the repository for historical purposes, but the releases are not published for download. For more information, please see [PEP-592](https://peps.python.org/pep-0592/).
## Decisive Event: Re-Entrancy Guard Corruption in `v0.2.14`
Shortly after `v0.2.14` was released, a Vyper user opened [issue #2393](https://github.com/vyperlang/vyper/issues/2393) in the Vyper GitHub repository denoting that re-entrancy guard tests were failing when they were upgrading the Yearn vault code to `0.2.14`.
Taking a snapshot of [the latest available release of Yearn](https://github.com/yearn/yearn-vaults/tree/efb47d8a84fcb13ceebd3ceb11b126b323bcc05d) when the user opened their issue, proceeded to compile it with `v0.2.14`, and inspecting the **decompiled bytecode** using the [EtherVM Decompiler](https://ethervm.io/decompile) will reveal that the pseudo-code storage offset `storage[0x2e]` is being utilized as the "flag" for the `@nonreentrant("withdraw")` keyword in both the `def deposit` and `def withdraw` instances of the `Vault.vy` file.
However, the same storage offset is utilized for the contract-level `managementFee` variable. This can be validated by evaluating the decompiled functions for the `managementFee()` getter function as well as the `setManagementFee` setter function which will re-use the same storage offset.
Compiling the same codebase with `v0.2.13` shows that that the re-entrancy guards were working as expected and were not overlapping in storage. However, [PR#2379](https://github.com/vyperlang/vyper/pull/2379) of the `v0.2.14` release did not fully solve the issue of corruption.
The code of the `v0.2.14` release assigning storage slots for `@nonreentrant` decorators still produced an incorrect interaction between the new front-end code and the existing codegen allocator. Re-entrancy slots still ended up overlapping with regular storage variables due to the different allocation strategies for mapping types between the front-end and codegen allocators. **The data-corruption code of `v0.2.14` is as follows**:
```python
def get_nonrentrant_counter(self, key):
"""
Nonrentrant locks use a prefix with a counter to minimise deployment cost of a contract.
We're able to set the initial re-entrant counter using the sum of the sizes
of all the storage slots because all storage slots are allocated while parsing
the module-scope, and re-entrancy locks aren't allocated until later when parsing
individual function scopes. This relies on the deprecated _globals attribute
because the new way of doing things (set_data_positions) doesn't expose the
next unallocated storage location.
"""
if key in self._nonrentrant_keys:
return self._nonrentrant_keys[key]
else:
counter = (
sum(v.size for v in self._globals.values() if not isinstance(v.typ, MappingType))
+ self._nonrentrant_counter
)
self._nonrentrant_keys[key] = counter
self._nonrentrant_counter += 1
return counter
```
Compare this to the front-end code at the time for calculating the storage layout for regular storage variables.
```python
available_slot = 0
for node in vyper_module.get_children(vy_ast.AnnAssign):
type_ = node.target._metadata["type"]
type_.set_position(StorageSlot(available_slot))
available_slot += math.ceil(type_.size_in_bytes / 32)
```
While this code would correctly consume the `key` value and yield the same `@nonreentrant` storage offset for the same `key` value, it improperly calculated the storage offset.
Specifically, the old allocator did not allocate a storage slot for `MappingType` entries (i.e. `HashMap`), while the new allocator did. `MappingType` storage entries are never written to but are reserved regardless by the compiler (ref: [Issue 2436](https://github.com/vyperlang/vyper/issues/2436)). This resulted in an inconsistency between the non-reentrant key allocator and the front-end allocator which led to the reported storage corruption.
## Vulnerability Introduced: Malfunctioning Re-Entrancy Locks in `v0.2.15`
After `v0.2.14` was yanked, in an effort to correct the `v0.2.14` release's corruption of re-entrancy guards, [PR#2391](https://github.com/vyperlang/vyper/pull/2391) included in the `v0.2.15` release was meant to fix the bug introduced in from the previously mentioned [PR#2379](https://github.com/vyperlang/vyper/pull/2379) by moving re-entrancy keys to be allocated physically in front of regular storage variables. In addition, in an effort to reduce the chance of these kinds of issues being introduced again, it completed the removal of the storage slot allocation logic from the codegen pass by moving the logic into the same function in the front-end as regular storage variable allocation. However, in doing so, it removed the old `self._nonreentrant_keys` data structure and, crucially, the accompanying logic which ensured only one lock was allocated per nonreentrant key:
```python
if key in self._nonrentrant_keys:
# --> SAFE. only allocate one slot per key <--
return self._nonrentrant_keys[key]
```
**The actual vulnerability is introduced in the following code of `v0.2.15`**:
```python
# Allocate storage slots from 0
# note storage is word-addressable, not byte-addressable
storage_slot = 0
for node in vyper_module.get_children(vy_ast.FunctionDef):
type_ = node._metadata["type"]
if type_.nonreentrant is not None:
# --> BUG! should check nonreentrant key not already allocated <--
type_.set_reentrancy_key_position(StorageSlot(storage_slot))
# TODO use one byte - or bit - per reentrancy key
# requires either an extra SLOAD or caching the value of the
# location in memory at entrance
storage_slot += 1
```
The vulnerability arises from how the `storage_slot` offsets of re-entrancy keys were ignoring the actual `<key>` of the `@nonreentrant(<key>)` decorator and were simply reserving a new slot for each seen `@nonreentrant` decorator regardless of what "key" was utilized.
## Latent Period: `v0.2.15`, `v0.2.16` and `v0.3.0`
The vulnerability that was introduced in `v0.2.15` went undetected during the interim releases `v0.2.16` as well as `v0.3.0` due to insufficient tests in the Vyper codebase at the time to detect it, a 4- month period between Jul 21, 2021 and Nov 30, 2021.
**All Vyper contracts that have been compiled with versions `v0.2.15`, `v0.2.16`, and `v0.3.0` are vulnerable to the malfunctioning re-entrancy guard**.
## Remediation: `v0.3.1` Release
The `v0.3.1` release resolved this vulnerability by adjusting how the compiler was allocating data slots to each variable within a contract. The vulnerability was fixed in **two different PRs**.
### PR#2439: Fix unused storage slots
The first PR that partially fixed the vulnerability was [PR#2439](https://github.com/vyperlang/vyper/pull/2439) which contains the following description:
> this is not a semantic bug but an optimization bug since we allocate
more slots than we actually need, leading to "holes" in the slot
allocator -- slots which are allocated but unused.
This description does not actually clearly describe the issue. The description of "holes" arises from inspecting how the compilation output's `layout` was yielding a single `slot` value for each re-entrant key. To better understand what happened, let's look at the data allocation function in `v0.3.0`:
```python
for node in vyper_module.get_children(vy_ast.FunctionDef):
type_ = node._metadata["type"]
if type_.nonreentrant is not None:
type_.set_reentrancy_key_position(StorageSlot(storage_slot))
# TODO this could have better typing but leave it untyped until
# we nail down the format better
variable_name = f"nonreentrant.{type_.nonreentrant}"
ret[variable_name] = {
"type": "nonreentrant lock",
"location": "storage",
"slot": storage_slot,
}
# TODO use one byte - or bit - per reentrancy key
# requires either an extra SLOAD or caching the value of the
# location in memory at entrance
storage_slot += 1
```
The problem with this code is that it was setting the re-entrancy key position of each `type_` (i.e. individual `@nonreentrant` key) to the latest value of `storage_slot`, incrementing on each iteration. This meant that the unique instances of `@nonreentrant(<key>)` were all using a different `storage_slot` value, however, the `ret` entry of `variable_name` was being overwritten on each iteration.
As such, the `layout` output of the compiler contained a single `nonreentrant.<key>` entry and a single storage offset, meaning that inspecting the compiler's output would appear to simply be "skipping" the storage slots of consecutive `@nonreentrant(<key>)` declarations, as per the [PR's original rationale](https://github.com/vyperlang/vyper/pull/2439).
**The non-vulnerable code that *partially* patched the vulnerability in the `v0.3.1` release is as follows**:
```python
for node in vyper_module.get_children(vy_ast.FunctionDef):
type_ = node._metadata["type"]
if type_.nonreentrant is None:
continue
variable_name = f"nonreentrant.{type_.nonreentrant}"
# a nonreentrant key can appear many times in a module but it
# only takes one slot. ignore it after the first time we see it.
if variable_name in ret:
continue
type_.set_reentrancy_key_position(StorageSlot(storage_slot))
# TODO this could have better typing but leave it untyped until
# we nail down the format better
ret[variable_name] = {
"type": "nonreentrant lock",
"location": "storage",
"slot": storage_slot,
}
# TODO use one byte - or bit - per reentrancy key
# requires either an extra SLOAD or caching the value of the
# location in memory at entrance
storage_slot += 1
```
The code will now properly allocate a single `storage_slot` the first time it identifies a duplicate re-entrancy key. However, it will not invoke the `set_reentrancy_key_position` function on each `type_` with the same offset, meaning that any `@nonreentrant(<key>)` entries beyond the first would have an "undefined" storage offset to utilize.
This led to a compiler panic[^compiler-panic] when attempting to compile contracts with `@nonreentrant` decorators. To rectify this, a further change was necessary to ensure that all `@nonreentrant` decorators were properly aware of the storage slot they need to operate on.
[^compiler-panic]: That is, the compiler would simply error out instead of producing any code at all. While annoying for the user, a compiler panic is considered a "safe" error because it will not emit code which makes it to production.
### PR#2514: fix codegen failure with nonreentrant keys
The final PR that completed the alleviation of the `@nonreentrant` vulnerability was [PR#2514](https://github.com/vyperlang/vyper/pull/2514). In detail, it expanded the above code segment to ensure that the `set_reentrancy_key_position` function is properly invoked with the correct slot allocated for the given `@nonreentrant` lock.
**The final non-vulnerable code of the `v0.3.1` Vyper release is as follows**:
```python
for node in vyper_module.get_children(vy_ast.FunctionDef):
type_ = node._metadata["type"]
if type_.nonreentrant is None:
continue
variable_name = f"nonreentrant.{type_.nonreentrant}"
# a nonreentrant key can appear many times in a module but it
# only takes one slot. after the first time we see it, do not
# increment the storage slot.
if variable_name in ret:
_slot = ret[variable_name]["slot"]
type_.set_reentrancy_key_position(StorageSlot(_slot))
continue
type_.set_reentrancy_key_position(StorageSlot(storage_slot))
# TODO this could have better typing but leave it untyped until
# we nail down the format better
ret[variable_name] = {
"type": "nonreentrant lock",
"location": "storage",
"slot": storage_slot,
}
# TODO use one byte - or bit - per reentrancy key
# requires either an extra SLOAD or caching the value of the
# location in memory at entrance
storage_slot += 1
```
As one can observe in the above segment, the `set_reentrancy_key_position` is now properly invoked for each `type_` entry that is `nonreentrant` and will correctly utilize the same `storage_slot` whenever the same `key` is specified in a `@nonreentrant(<key>)` decorator.
Furthermore, alongside the above fix, the PR included **a much needed and missing test** from the Vyper repository; a dedicated unit test **that evaluates cross-function re-entrancy**:
```python
@external
@nonreentrant('protect_special_value')
def protected_function(val: String[100], do_callback: bool) -> uint256:
self.special_value = val
if do_callback:
self.callback.updated_protected()
return 1
else:
return 2
@external
@nonreentrant('protect_special_value')
def protected_function2(val: String[100], do_callback: bool) -> uint256:
self.special_value = val
if do_callback:
# call other function with same nonreentrancy key
# --> (revert expected here) <--
Self(self).protected_function(val, False)
return 1
return 2
```
However, while the bug was identified, fixed and tested for within the compiler codebase, the impact to production contracts was not realized at the time, and protocols potentially using the relevant compiler versions were not explicitly notified.
The notion of unique `<key>` values that can be re-used across `@nonreentrant` decorators only exists for a single purpose; cross-function re-entrancy protection. The fact that such a test was missing from the Vyper repository until its `0.3.1` release was one factor in contributing to the vulnerability being introduced in the first place and continuing to be undetected for as long as it did.
# Vulnerability in Summary
- Versions Affected: `v0.2.15`, `v0.2.16`, `v0.3.0`
- Root Cause: Improper remediations to re-entrancy guard data corruption issues introduced in `v0.2.13`
- Vulnerability in Brief: All `@nonreentrant` decorators within a Vyper contract will utilize a **unique storage offset regardless of their `key`**, meaning that cross-function re-entrancy is possible on all contracts compiled with the susceptible versions
## Conditions for a Profitable Vulnerability
While the vulnerability itself is simple to identify and has been observed across a variety of live contracts, its profitability arises from a very specific set of conditions that need to be met. Specifically:
- A `.vy` contract compiled with either of the following `vyper` versions: `0.2.15`, `0.2.16`, `0.3.0`
- A primary function that utilizes the `@nonreentrant` decorator with a specific `key` and does not strictly follow the CEI pattern (i.e. contains an external call to an untrusted party before storage updates)
- A secondary function that utilizes the same `key` and would be affected by the improper state caused by the primary function
Unfortunately, these conditions are the exact ones that manifested in [the Curve.Fi liquidity pools that were exploited](https://hackmd.io/@LlamaRisk/BJzSKHNjn) as they needed to perform an external distribution of native `ETH` (which, on the EVM, can only be done with an execution context-transferring CALL[^ether-transfers]) before sensitive storage updates were performed in functions that would otherwise be protected by a properly functioning `@nonreentrant` guard.
[^ether-transfers]: Technically there are other ways to transfer ether, but they are not applicable here at the time of this writing. [EIP-5920](https://eips.ethereum.org/EIPS/eip-5920) may be a positive development here.
# Summary and Takeaways
Bugs are an unfortunate and stark reality of any large, production software project. What we can do is to try and mitigate bugs and their associated risks to the greatest extent possible.
There are several, practical steps we can take to improve correctness of smart contracts compiled with Vyper going forward:
1. Improved testing of the compiler, including continuing to improve coverage, comparing compiler output with a language spec, and utilizing formal verification (FV) tools for compiler bytecode verification.
2. Providing developers with tools to make it easier to take a multi-faceted approach to testing their code -- including source- and bytecode- level tests.
2. Tighter, two-way feedback with protocols using Vyper
But it is not enough to just focus on the correctness of the latest version of the compiler; due to the immutable nature of smart contracts, contracts compiled with past versions of Vyper could be securing a significant amount of funds.
Securing past releases of Vyper **is therefore another important new axis of focus which we will be dedicating significant resources to in the future**. It is just as important as introducing new features, providing bugfixes and refactoring for the latest release.
Ultimately, **going forward we want to take the lessons learned from recent events to make sure Vyper is the most rock-solid and secure smart contract language and compiler project in the world.** Therefore, these goals will be supported by a variety of new security-related initiatives within and beyond our team including:
- **A short-term, competitive audit** which will focus on the most recent version of Vyper in partnership with [Codehawks](https://www.codehawks.com/)
- **Short-term and long-term (open-ended) bug bounty programs** spanning all versions of the Vyper compiler in partnership with [Immunefi](https://immunefi.com/)
- **The Vyper Security Alliance**, a coordinated, multi-protocol bounty program for helping discover present and past compiler bugs affecting versions of Vyper securing live TVL
- **Collaboration with multiple audit firms** including [ChainSecurity](https://chainsecurity.com/), [OtterSec](https://osec.io/), [Statemind](https://statemind.io) and [Certora](https://www.certora.com/) to review past versions of Vyper which secure substantial live TVL and to help review the compiler on an ongoing basis going forward
- **Expansion of the team**; including a dedicated security engineering role meant to improve the security tooling of Vyper across the board, both internally and user-facing
- **Collaboration with existing security toolkits** that are offered for Solidity and would benefit the Vyper ecosystem greatly
- **Design of a language specification** which will contribute to efforts to formally verify and help testing efforts of the compiler itself
We hope to see you using Vyper soon :). Please stay tuned for more announcements about these initiatives in the coming weeks! To follow along with further announcements, please follow the [official Vyper twitter](https://twitter.com/vyperlang). To help contribute, please see the [Vyper Github](https://github.com/vyperlang/vyper)! If you are excited about Vyper and want to help with funding -- or just want to chat -- please reach out to us via the [Vyper discord](https://discord.gg/6tw7PTM7C2) and we will always be happy to welcome you to the community.