On the 30th of July, 2023, multiple Curve.Fi liquidity pools were exploited as a result of a latent vulnerability in the Vyper compiler, specifically in versions 0.2.15
, 0.2.16
, and 0.3.0
. While bug was identified and patched by the v0.3.1
release, the impact to protocols using the vulnerable compilers was not realized at the time and they were not explicitly notified. The vulnerability itself was an improperly implemented re-entrancy guard that could be bypassed under certain conditions which we will delve into in this report.
While the hacks themselves have been sufficiently covered in other post-mortems including the official one by Curve.Fi, we would like to take a deep-dive into what exactly went wrong with the Vyper compiler itself, why the vulnerability was hard to spot, and what the ecosystem as a whole can learn from these incidents.
If you are familiar with the blockchain space and why Vyper exists, we recommend skipping the Background section as it contains very basic information that you most likely are aware of.
Vyper is a contract-oriented, domain-specific, pythonic programming language which targets the Ethereum Virtual Machine (EVM). Its goals include simplicity, pythonicity, security and auditability.
A common problem that is a concern for code deployed on the EVM is the concept of re-entrancy. In contrast to traditional programs, the control flow of a "blockchain program" will be relinquished to the "active" program being executed at any given moment. From hereon, a "blockchain program" will be referred to as a contract.
To elaborate, we can think of all blockchain programs as operating on a single thread with no support for concurrency. Whenever a program invokes another, the entire control flow is passed on to the invoked program.
This means that the original caller's execution is essentially frozen in time until the invoked program concludes, at which point the caller resumes at the exact point they were left in. While different types of vulnerabilities can arise from this behaviour, the most well-known one is a re-entrancy.
As the control flow is relinquished to the invoked contract, the invoked contract can re-enter the original caller while it is frozen. The contracts that are vulnerable to this type of attack will contain state updates after their external contract invocation, meaning that when they are frozen their state is outdated and incorrect.
The ecosystem has come up with two ways to combat re-entrancy attacks and essentially render them ineffectual; the Checks-Effects-Interactions pattern (CEI) and re-entrancy guards.
The CEI pattern is a programming methodology which dictates that a function's code should execute its security checks first, perform any effects in its storage after, and finally perform interactions with external contracts at the end of the function.
Should this pattern be followed strictly, the state of a contract during an "interaction" (i.e. control flow relinquishment) will be up-to-date and correct rendering any would-be exploitation impossible regardless of how the contract is re-entered.
In most cases the CEI pattern is sufficient, however, the DeFi ecosystem is multi-faceted and oftentimes functions rely on the result of an external call to proceed with their own execution. In such cases, the CEI pattern is inapplicable and a re-entrancy guard must be set in place.
As one of the core principles in the Vyper language is security, Vyper decided to introduce a re-entrancy guard directly at the language-level via the special @nonreentrant
function decorator. The re-entrancy guard has been a core feature of the language ever since the v0.1.0-beta.9
release, one of the earliest versions of Vyper.
At their core, both implementations function identically; they set a storage value between two states (activated, inactive). When a function that is marked as @nonreentrant
is invoked, the flag is:
Once the function's invocation concludes, the flag is:
With this mechanism, @nonreentrant
users can ensure that the function can only be re-invoked after it concludes, meaning that no re-entrancy can occur regardless of what external calls are performed. More complex forms of re-entrancy attacks exist (i.e. view
only re-entrancies, cross-contract re-entrancies) but for the purposes of this vulnerability the basic case is what matters.
@nonreentrant
: Tag-Based Re-Entrancy GuardsEver since their introduction, @nonreentrant
decorators always supported a <key>
to be set offering more flexibility compared to a nonreentrancy lock which only applies globally at the contract level.
A simple implementation could be to use a mapping
that takes the key
and sets the relevant re-entrancy flag on it, however, such an approach would invoke extra cost due to the keccak256
gas cost of mapping
lookups.
As Vyper is a language which does not provide raw storage access to users, it will be fully aware of all storage slots a contract uses when compiling it. As such, it takes on the job of allocating storage slots, which includes ensuring that slots for storage variables and re-entrancy key locks do not overlap with each other.
PR#1264 which introduced this feature in the v0.1.0-beta.9
release of Vyper used a simple approach to ensure no overlap, storing re-entrancy flags at a specific offset from the contract's original slots (0xFFFFFF
to be precise).
In tandem with new feature development, beginning in 2018, the Vyper compiler started a multi- year effort to refactor the then single-pass architecture into a multiple pass architecture which would separate the concerns of type checking and semantic analysis into a front-end, distinct from the code generating backend. As with most large refactoring projects, this effort was incremental and piecemeal, being worked on alongside other bugfixes and feature development until finally culminating in 2023 with PR#3390.
PR#2308 which was part of the Vyper v0.2.9
release was meant to make storage allocation smarter by utilizing the first unallocated storage slot available after all slots for regular storage variables of a contract have been processed, instead of starting from the 0xFFFFFF
constant. This would save bytecode space, because in the bytecode, the PUSH
instructions which precede any load or store of a storage slot by pushing the location of the nonreentrancy key to the stack could use fewer bytes.
The above PR of the v0.2.9
release worked well and would guarantee no overlap between re-entrancy flag slots and storage slots so long as the variables that are allocated sequentially in the (physical) front of the storage layout do not span multiple sequential slots.
As the Vyper language and codebase were undergoing significant refactoring at the time, PR#2361 (part of the v0.2.13
release) introduced a more efficient way to store variables that could span more than 1 storage slot (32 bytes) within a contract. As part of the larger refactoring effort, it also moved slot calculation for regular storage variables out of the existing code generation pass into the new front-end pass, but left slot calculation for re-entrancy keys as it was. Because the slot calculation for re-entrancy keys depended on the results of allocation for the regular storage variables, it ended up keeping two different allocator implementations for regular storage variable allocation between the front-end and codegen passes. This resulted in the offset calculations of PR#2308 to be incorrect and require an update.
The update was introduced in PR#2379 (part of the v0.2.14
release), and was meant to calculate the storage offsets for re-entrancy flags correctly by taking into account the correct size of variables declared in storage rather than assuming all of them occupy a single slot (which was true in earlier implementations). However, the second update still had a bug stemming from the differences between the front-end and codegen allocator implementations, which we will describe below.
Because of these bugs, both v0.2.13
and v0.2.14
releases were "yanked"[2] shortly after release.
v0.2.14
Shortly after v0.2.14
was released, a Vyper user opened issue #2393 in the Vyper GitHub repository denoting that re-entrancy guard tests were failing when they were upgrading the Yearn vault code to 0.2.14
.
Taking a snapshot of the latest available release of Yearn when the user opened their issue, proceeded to compile it with v0.2.14
, and inspecting the decompiled bytecode using the EtherVM Decompiler will reveal that the pseudo-code storage offset storage[0x2e]
is being utilized as the "flag" for the @nonreentrant("withdraw")
keyword in both the def deposit
and def withdraw
instances of the Vault.vy
file.
However, the same storage offset is utilized for the contract-level managementFee
variable. This can be validated by evaluating the decompiled functions for the managementFee()
getter function as well as the setManagementFee
setter function which will re-use the same storage offset.
Compiling the same codebase with v0.2.13
shows that that the re-entrancy guards were working as expected and were not overlapping in storage. However, PR#2379 of the v0.2.14
release did not fully solve the issue of corruption.
The code of the v0.2.14
release assigning storage slots for @nonreentrant
decorators still produced an incorrect interaction between the new front-end code and the existing codegen allocator. Re-entrancy slots still ended up overlapping with regular storage variables due to the different allocation strategies for mapping types between the front-end and codegen allocators. The data-corruption code of v0.2.14
is as follows:
def get_nonrentrant_counter(self, key):
"""
Nonrentrant locks use a prefix with a counter to minimise deployment cost of a contract.
We're able to set the initial re-entrant counter using the sum of the sizes
of all the storage slots because all storage slots are allocated while parsing
the module-scope, and re-entrancy locks aren't allocated until later when parsing
individual function scopes. This relies on the deprecated _globals attribute
because the new way of doing things (set_data_positions) doesn't expose the
next unallocated storage location.
"""
if key in self._nonrentrant_keys:
return self._nonrentrant_keys[key]
else:
counter = (
sum(v.size for v in self._globals.values() if not isinstance(v.typ, MappingType))
+ self._nonrentrant_counter
)
self._nonrentrant_keys[key] = counter
self._nonrentrant_counter += 1
return counter
Compare this to the front-end code at the time for calculating the storage layout for regular storage variables.
available_slot = 0
for node in vyper_module.get_children(vy_ast.AnnAssign):
type_ = node.target._metadata["type"]
type_.set_position(StorageSlot(available_slot))
available_slot += math.ceil(type_.size_in_bytes / 32)
While this code would correctly consume the key
value and yield the same @nonreentrant
storage offset for the same key
value, it improperly calculated the storage offset.
Specifically, the old allocator did not allocate a storage slot for MappingType
entries (i.e. HashMap
), while the new allocator did. MappingType
storage entries are never written to but are reserved regardless by the compiler (ref: Issue 2436). This resulted in an inconsistency between the non-reentrant key allocator and the front-end allocator which led to the reported storage corruption.
v0.2.15
After v0.2.14
was yanked, in an effort to correct the v0.2.14
release's corruption of re-entrancy guards, PR#2391 included in the v0.2.15
release was meant to fix the bug introduced in from the previously mentioned PR#2379 by moving re-entrancy keys to be allocated physically in front of regular storage variables. In addition, in an effort to reduce the chance of these kinds of issues being introduced again, it completed the removal of the storage slot allocation logic from the codegen pass by moving the logic into the same function in the front-end as regular storage variable allocation. However, in doing so, it removed the old self._nonreentrant_keys
data structure and, crucially, the accompanying logic which ensured only one lock was allocated per nonreentrant key:
if key in self._nonrentrant_keys:
# --> SAFE. only allocate one slot per key <--
return self._nonrentrant_keys[key]
The actual vulnerability is introduced in the following code of v0.2.15
:
# Allocate storage slots from 0
# note storage is word-addressable, not byte-addressable
storage_slot = 0
for node in vyper_module.get_children(vy_ast.FunctionDef):
type_ = node._metadata["type"]
if type_.nonreentrant is not None:
# --> BUG! should check nonreentrant key not already allocated <--
type_.set_reentrancy_key_position(StorageSlot(storage_slot))
# TODO use one byte - or bit - per reentrancy key
# requires either an extra SLOAD or caching the value of the
# location in memory at entrance
storage_slot += 1
The vulnerability arises from how the storage_slot
offsets of re-entrancy keys were ignoring the actual <key>
of the @nonreentrant(<key>)
decorator and were simply reserving a new slot for each seen @nonreentrant
decorator regardless of what "key" was utilized.
v0.2.15
, v0.2.16
and v0.3.0
The vulnerability that was introduced in v0.2.15
went undetected during the interim releases v0.2.16
as well as v0.3.0
due to insufficient tests in the Vyper codebase at the time to detect it, a 4- month period between Jul 21, 2021 and Nov 30, 2021.
All Vyper contracts that have been compiled with versions v0.2.15
, v0.2.16
, and v0.3.0
are vulnerable to the malfunctioning re-entrancy guard.
v0.3.1
ReleaseThe v0.3.1
release resolved this vulnerability by adjusting how the compiler was allocating data slots to each variable within a contract. The vulnerability was fixed in two different PRs.
The first PR that partially fixed the vulnerability was PR#2439 which contains the following description:
this is not a semantic bug but an optimization bug since we allocate
more slots than we actually need, leading to "holes" in the slot
allocator – slots which are allocated but unused.
This description does not actually clearly describe the issue. The description of "holes" arises from inspecting how the compilation output's layout
was yielding a single slot
value for each re-entrant key. To better understand what happened, let's look at the data allocation function in v0.3.0
:
for node in vyper_module.get_children(vy_ast.FunctionDef):
type_ = node._metadata["type"]
if type_.nonreentrant is not None:
type_.set_reentrancy_key_position(StorageSlot(storage_slot))
# TODO this could have better typing but leave it untyped until
# we nail down the format better
variable_name = f"nonreentrant.{type_.nonreentrant}"
ret[variable_name] = {
"type": "nonreentrant lock",
"location": "storage",
"slot": storage_slot,
}
# TODO use one byte - or bit - per reentrancy key
# requires either an extra SLOAD or caching the value of the
# location in memory at entrance
storage_slot += 1
The problem with this code is that it was setting the re-entrancy key position of each type_
(i.e. individual @nonreentrant
key) to the latest value of storage_slot
, incrementing on each iteration. This meant that the unique instances of @nonreentrant(<key>)
were all using a different storage_slot
value, however, the ret
entry of variable_name
was being overwritten on each iteration.
As such, the layout
output of the compiler contained a single nonreentrant.<key>
entry and a single storage offset, meaning that inspecting the compiler's output would appear to simply be "skipping" the storage slots of consecutive @nonreentrant(<key>)
declarations, as per the PR's original rationale.
The non-vulnerable code that partially patched the vulnerability in the v0.3.1
release is as follows:
for node in vyper_module.get_children(vy_ast.FunctionDef):
type_ = node._metadata["type"]
if type_.nonreentrant is None:
continue
variable_name = f"nonreentrant.{type_.nonreentrant}"
# a nonreentrant key can appear many times in a module but it
# only takes one slot. ignore it after the first time we see it.
if variable_name in ret:
continue
type_.set_reentrancy_key_position(StorageSlot(storage_slot))
# TODO this could have better typing but leave it untyped until
# we nail down the format better
ret[variable_name] = {
"type": "nonreentrant lock",
"location": "storage",
"slot": storage_slot,
}
# TODO use one byte - or bit - per reentrancy key
# requires either an extra SLOAD or caching the value of the
# location in memory at entrance
storage_slot += 1
The code will now properly allocate a single storage_slot
the first time it identifies a duplicate re-entrancy key. However, it will not invoke the set_reentrancy_key_position
function on each type_
with the same offset, meaning that any @nonreentrant(<key>)
entries beyond the first would have an "undefined" storage offset to utilize.
This led to a compiler panic[3] when attempting to compile contracts with @nonreentrant
decorators. To rectify this, a further change was necessary to ensure that all @nonreentrant
decorators were properly aware of the storage slot they need to operate on.
The final PR that completed the alleviation of the @nonreentrant
vulnerability was PR#2514. In detail, it expanded the above code segment to ensure that the set_reentrancy_key_position
function is properly invoked with the correct slot allocated for the given @nonreentrant
lock.
The final non-vulnerable code of the v0.3.1
Vyper release is as follows:
for node in vyper_module.get_children(vy_ast.FunctionDef):
type_ = node._metadata["type"]
if type_.nonreentrant is None:
continue
variable_name = f"nonreentrant.{type_.nonreentrant}"
# a nonreentrant key can appear many times in a module but it
# only takes one slot. after the first time we see it, do not
# increment the storage slot.
if variable_name in ret:
_slot = ret[variable_name]["slot"]
type_.set_reentrancy_key_position(StorageSlot(_slot))
continue
type_.set_reentrancy_key_position(StorageSlot(storage_slot))
# TODO this could have better typing but leave it untyped until
# we nail down the format better
ret[variable_name] = {
"type": "nonreentrant lock",
"location": "storage",
"slot": storage_slot,
}
# TODO use one byte - or bit - per reentrancy key
# requires either an extra SLOAD or caching the value of the
# location in memory at entrance
storage_slot += 1
As one can observe in the above segment, the set_reentrancy_key_position
is now properly invoked for each type_
entry that is nonreentrant
and will correctly utilize the same storage_slot
whenever the same key
is specified in a @nonreentrant(<key>)
decorator.
Furthermore, alongside the above fix, the PR included a much needed and missing test from the Vyper repository; a dedicated unit test that evaluates cross-function re-entrancy:
@external
@nonreentrant('protect_special_value')
def protected_function(val: String[100], do_callback: bool) -> uint256:
self.special_value = val
if do_callback:
self.callback.updated_protected()
return 1
else:
return 2
@external
@nonreentrant('protect_special_value')
def protected_function2(val: String[100], do_callback: bool) -> uint256:
self.special_value = val
if do_callback:
# call other function with same nonreentrancy key
# --> (revert expected here) <--
Self(self).protected_function(val, False)
return 1
return 2
However, while the bug was identified, fixed and tested for within the compiler codebase, the impact to production contracts was not realized at the time, and protocols potentially using the relevant compiler versions were not explicitly notified.
The notion of unique <key>
values that can be re-used across @nonreentrant
decorators only exists for a single purpose; cross-function re-entrancy protection. The fact that such a test was missing from the Vyper repository until its 0.3.1
release was one factor in contributing to the vulnerability being introduced in the first place and continuing to be undetected for as long as it did.
v0.2.15
, v0.2.16
, v0.3.0
v0.2.13
@nonreentrant
decorators within a Vyper contract will utilize a unique storage offset regardless of their key
, meaning that cross-function re-entrancy is possible on all contracts compiled with the susceptible versionsWhile the vulnerability itself is simple to identify and has been observed across a variety of live contracts, its profitability arises from a very specific set of conditions that need to be met. Specifically:
.vy
contract compiled with either of the following vyper
versions: 0.2.15
, 0.2.16
, 0.3.0
@nonreentrant
decorator with a specific key
and does not strictly follow the CEI pattern (i.e. contains an external call to an untrusted party before storage updates)key
and would be affected by the improper state caused by the primary functionUnfortunately, these conditions are the exact ones that manifested in the Curve.Fi liquidity pools that were exploited as they needed to perform an external distribution of native ETH
(which, on the EVM, can only be done with an execution context-transferring CALL[4]) before sensitive storage updates were performed in functions that would otherwise be protected by a properly functioning @nonreentrant
guard.
Bugs are an unfortunate and stark reality of any large, production software project. What we can do is to try and mitigate bugs and their associated risks to the greatest extent possible.
There are several, practical steps we can take to improve correctness of smart contracts compiled with Vyper going forward:
But it is not enough to just focus on the correctness of the latest version of the compiler; due to the immutable nature of smart contracts, contracts compiled with past versions of Vyper could be securing a significant amount of funds.
Securing past releases of Vyper is therefore another important new axis of focus which we will be dedicating significant resources to in the future. It is just as important as introducing new features, providing bugfixes and refactoring for the latest release.
Ultimately, going forward we want to take the lessons learned from recent events to make sure Vyper is the most rock-solid and secure smart contract language and compiler project in the world. Therefore, these goals will be supported by a variety of new security-related initiatives within and beyond our team including:
We hope to see you using Vyper soon :). Please stay tuned for more announcements about these initiatives in the coming weeks! To follow along with further announcements, please follow the official Vyper twitter. To help contribute, please see the Vyper Github! If you are excited about Vyper and want to help with funding – or just want to chat – please reach out to us via the Vyper discord and we will always be happy to welcome you to the community.
Special thanks to the Omniscia team, which, while not directly affiliated with Vyper, contributed substantial co-authorship, feedback and review of this post-mortem report ↩︎
Briefly speaking, "yanking" means that the tags are available in the repository for historical purposes, but the releases are not published for download. For more information, please see PEP-592. ↩︎
That is, the compiler would simply error out instead of producing any code at all. While annoying for the user, a compiler panic is considered a "safe" error because it will not emit code which makes it to production. ↩︎
Technically there are other ways to transfer ether, but they are not applicable here at the time of this writing. EIP-5920 may be a positive development here. ↩︎