Vyperlang team with special thanks to Omniscia team^[1]

On the 30th of July, 2023, multiple Curve.Fi liquidity pools were exploited as a result of a latent vulnerability in the Vyper compiler, specifically in versions 0.2.15, 0.2.16, and 0.3.0. While bug was identified and patched by the v0.3.1 release, the impact to protocols using the vulnerable compilers was not realized at the time and they were not explicitly notified. The vulnerability itself was an improperly implemented re-entrancy guard that could be bypassed under certain conditions which we will delve into in this report.

While the hacks themselves have been sufficiently covered in other post-mortems including the official one by Curve.Fi, we would like to take a deep-dive into what exactly went wrong with the Vyper compiler itself, why the vulnerability was hard to spot, and what the ecosystem as a whole can learn from these incidents.

If you are familiar with the blockchain space and why Vyper exists, we recommend skipping the Background section as it contains very basic information that you most likely are aware of.

Background

Vyper

Vyper is a contract-oriented, domain-specific, pythonic programming language which targets the Ethereum Virtual Machine (EVM). Its goals include simplicity, pythonicity, security and auditability.

EVM: A Single-Threaded Non-Concurrent Machine

A common problem that is a concern for code deployed on the EVM is the concept of re-entrancy. In contrast to traditional programs, the control flow of a "blockchain program" will be relinquished to the "active" program being executed at any given moment. From hereon, a "blockchain program" will be referred to as a contract.

To elaborate, we can think of all blockchain programs as operating on a single thread with no support for concurrency. Whenever a program invokes another, the entire control flow is passed on to the invoked program.

Re-Entrancy: A Widespread Web 3.0 Problem

This means that the original caller's execution is essentially frozen in time until the invoked program concludes, at which point the caller resumes at the exact point they were left in. While different types of vulnerabilities can arise from this behaviour, the most well-known one is a re-entrancy.

As the control flow is relinquished to the invoked contract, the invoked contract can re-enter the original caller while it is frozen. The contracts that are vulnerable to this type of attack will contain state updates after their external contract invocation, meaning that when they are frozen their state is outdated and incorrect.

Solutions

The ecosystem has come up with two ways to combat re-entrancy attacks and essentially render them ineffectual; the Checks-Effects-Interactions pattern (CEI) and re-entrancy guards.

Checks-Effects-Interactions (CEI) Pattern

The CEI pattern is a programming methodology which dictates that a function's code should execute its security checks first, perform any effects in its storage after, and finally perform interactions with external contracts at the end of the function.

Should this pattern be followed strictly, the state of a contract during an "interaction" (i.e. control flow relinquishment) will be up-to-date and correct rendering any would-be exploitation impossible regardless of how the contract is re-entered.

Re-Entrancy Guards

In most cases the CEI pattern is sufficient, however, the DeFi ecosystem is multi-faceted and oftentimes functions rely on the result of an external call to proceed with their own execution. In such cases, the CEI pattern is inapplicable and a re-entrancy guard must be set in place.

As one of the core principles in the Vyper language is security, Vyper decided to introduce a re-entrancy guard directly at the language-level via the special @nonreentrant function decorator. The re-entrancy guard has been a core feature of the language ever since the v0.1.0-beta.9 release, one of the earliest versions of Vyper.

At their core, both implementations function identically; they set a storage value between two states (activated, inactive). When a function that is marked as @nonreentrant is invoked, the flag is:

Ensured to NOT be in its active state
Set to its active state

Once the function's invocation concludes, the flag is:

Set to its inactive state

With this mechanism, @nonreentrant users can ensure that the function can only be re-invoked after it concludes, meaning that no re-entrancy can occur regardless of what external calls are performed. More complex forms of re-entrancy attacks exist (i.e. view only re-entrancies, cross-contract re-entrancies) but for the purposes of this vulnerability the basic case is what matters.

Vyper Vulnerability Historical Timeline

`@nonreentrant`: Tag-Based Re-Entrancy Guards

Ever since their introduction, @nonreentrant decorators always supported a <key> to be set offering more flexibility compared to a nonreentrancy lock which only applies globally at the contract level.

A simple implementation could be to use a mapping that takes the key and sets the relevant re-entrancy flag on it, however, such an approach would invoke extra cost due to the keccak256 gas cost of mapping lookups.

As Vyper is a language which does not provide raw storage access to users, it will be fully aware of all storage slots a contract uses when compiling it. As such, it takes on the job of allocating storage slots, which includes ensuring that slots for storage variables and re-entrancy key locks do not overlap with each other.

PR#1264 which introduced this feature in the v0.1.0-beta.9 release of Vyper used a simple approach to ensure no overlap, storing re-entrancy flags at a specific offset from the contract's original slots (0xFFFFFF to be precise).

Refactoring the Compiler

In tandem with new feature development, beginning in 2018, the Vyper compiler started a multi- year effort to refactor the then single-pass architecture into a multiple pass architecture which would separate the concerns of type checking and semantic analysis into a front-end, distinct from the code generating backend. As with most large refactoring projects, this effort was incremental and piecemeal, being worked on alongside other bugfixes and feature development until finally culminating in 2023 with PR#3390.

Location Optimization: Smarter Allocation of Storage Slots

PR#2308 which was part of the Vyper v0.2.9 release was meant to make storage allocation smarter by utilizing the first unallocated storage slot available after all slots for regular storage variables of a contract have been processed, instead of starting from the 0xFFFFFF constant. This would save bytecode space, because in the bytecode, the PUSH instructions which precede any load or store of a storage slot by pushing the location of the nonreentrancy key to the stack could use fewer bytes.

Corruption Avoidance: Proper Offset Calculation

The above PR of the v0.2.9 release worked well and would guarantee no overlap between re-entrancy flag slots and storage slots so long as the variables that are allocated sequentially in the (physical) front of the storage layout do not span multiple sequential slots.

As the Vyper language and codebase were undergoing significant refactoring at the time, PR#2361 (part of the v0.2.13 release) introduced a more efficient way to store variables that could span more than 1 storage slot (32 bytes) within a contract. As part of the larger refactoring effort, it also moved slot calculation for regular storage variables out of the existing code generation pass into the new front-end pass, but left slot calculation for re-entrancy keys as it was. Because the slot calculation for re-entrancy keys depended on the results of allocation for the regular storage variables, it ended up keeping two different allocator implementations for regular storage variable allocation between the front-end and codegen passes. This resulted in the offset calculations of PR#2308 to be incorrect and require an update.

The update was introduced in PR#2379 (part of the v0.2.14 release), and was meant to calculate the storage offsets for re-entrancy flags correctly by taking into account the correct size of variables declared in storage rather than assuming all of them occupy a single slot (which was true in earlier implementations). However, the second update still had a bug stemming from the differences between the front-end and codegen allocator implementations, which we will describe below.

Because of these bugs, both v0.2.13 and v0.2.14 releases were "yanked"^[2] shortly after release.

Decisive Event: Re-Entrancy Guard Corruption in `v0.2.14`

Shortly after v0.2.14 was released, a Vyper user opened issue #2393 in the Vyper GitHub repository denoting that re-entrancy guard tests were failing when they were upgrading the Yearn vault code to 0.2.14.

Taking a snapshot of the latest available release of Yearn when the user opened their issue, proceeded to compile it with v0.2.14, and inspecting the decompiled bytecode using the EtherVM Decompiler will reveal that the pseudo-code storage offset storage[0x2e] is being utilized as the "flag" for the @nonreentrant("withdraw") keyword in both the def deposit and def withdraw instances of the Vault.vy file.

However, the same storage offset is utilized for the contract-level managementFee variable. This can be validated by evaluating the decompiled functions for the managementFee() getter function as well as the setManagementFee setter function which will re-use the same storage offset.

Compiling the same codebase with v0.2.13 shows that that the re-entrancy guards were working as expected and were not overlapping in storage. However, PR#2379 of the v0.2.14 release did not fully solve the issue of corruption.

The code of the v0.2.14 release assigning storage slots for @nonreentrant decorators still produced an incorrect interaction between the new front-end code and the existing codegen allocator. Re-entrancy slots still ended up overlapping with regular storage variables due to the different allocation strategies for mapping types between the front-end and codegen allocators. The data-corruption code of v0.2.14 is as follows:

def get_nonrentrant_counter(self, key):
    """
    Nonrentrant locks use a prefix with a counter to minimise deployment cost of a contract.
    We're able to set the initial re-entrant counter using the sum of the sizes
    of all the storage slots because all storage slots are allocated while parsing
    the module-scope, and re-entrancy locks aren't allocated until later when parsing
    individual function scopes. This relies on the deprecated _globals attribute
    because the new way of doing things (set_data_positions) doesn't expose the
    next unallocated storage location.
    """
    if key in self._nonrentrant_keys:
        return self._nonrentrant_keys[key]
    else:
        counter = (
            sum(v.size for v in self._globals.values() if not isinstance(v.typ, MappingType))
            + self._nonrentrant_counter
        )
        self._nonrentrant_keys[key] = counter
        self._nonrentrant_counter += 1
        return counter

Compare this to the front-end code at the time for calculating the storage layout for regular storage variables.

    available_slot = 0
    for node in vyper_module.get_children(vy_ast.AnnAssign):
        type_ = node.target._metadata["type"]
        type_.set_position(StorageSlot(available_slot))
        available_slot += math.ceil(type_.size_in_bytes / 32)

While this code would correctly consume the key value and yield the same @nonreentrant storage offset for the same key value, it improperly calculated the storage offset.

Specifically, the old allocator did not allocate a storage slot for MappingType entries (i.e. HashMap), while the new allocator did. MappingType storage entries are never written to but are reserved regardless by the compiler (ref: Issue 2436). This resulted in an inconsistency between the non-reentrant key allocator and the front-end allocator which led to the reported storage corruption.

Vulnerability Introduced: Malfunctioning Re-Entrancy Locks in `v0.2.15`

After v0.2.14 was yanked, in an effort to correct the v0.2.14 release's corruption of re-entrancy guards, PR#2391 included in the v0.2.15 release was meant to fix the bug introduced in from the previously mentioned PR#2379 by moving re-entrancy keys to be allocated physically in front of regular storage variables. In addition, in an effort to reduce the chance of these kinds of issues being introduced again, it completed the removal of the storage slot allocation logic from the codegen pass by moving the logic into the same function in the front-end as regular storage variable allocation. However, in doing so, it removed the old self._nonreentrant_keys data structure and, crucially, the accompanying logic which ensured only one lock was allocated per nonreentrant key:

    if key in self._nonrentrant_keys:
        # --> SAFE. only allocate one slot per key <--
        return self._nonrentrant_keys[key]

The actual vulnerability is introduced in the following code of v0.2.15:

# Allocate storage slots from 0
# note storage is word-addressable, not byte-addressable
storage_slot = 0

for node in vyper_module.get_children(vy_ast.FunctionDef):
    type_ = node._metadata["type"]
    if type_.nonreentrant is not None:
        # --> BUG! should check nonreentrant key not already allocated <--
        type_.set_reentrancy_key_position(StorageSlot(storage_slot))
        # TODO use one byte - or bit - per reentrancy key
        # requires either an extra SLOAD or caching the value of the
        # location in memory at entrance
        storage_slot += 1

The vulnerability arises from how the storage_slot offsets of re-entrancy keys were ignoring the actual <key> of the @nonreentrant(<key>) decorator and were simply reserving a new slot for each seen @nonreentrant decorator regardless of what "key" was utilized.

Latent Period: `v0.2.15`, `v0.2.16` and `v0.3.0`

The vulnerability that was introduced in v0.2.15 went undetected during the interim releases v0.2.16 as well as v0.3.0 due to insufficient tests in the Vyper codebase at the time to detect it, a 4- month period between Jul 21, 2021 and Nov 30, 2021.

All Vyper contracts that have been compiled with versions v0.2.15, v0.2.16, and v0.3.0 are vulnerable to the malfunctioning re-entrancy guard.

Remediation: `v0.3.1` Release

The v0.3.1 release resolved this vulnerability by adjusting how the compiler was allocating data slots to each variable within a contract. The vulnerability was fixed in two different PRs.

PR#2439: Fix unused storage slots

The first PR that partially fixed the vulnerability was PR#2439 which contains the following description:

this is not a semantic bug but an optimization bug since we allocate
more slots than we actually need, leading to "holes" in the slot
allocator – slots which are allocated but unused.

This description does not actually clearly describe the issue. The description of "holes" arises from inspecting how the compilation output's layout was yielding a single slot value for each re-entrant key. To better understand what happened, let's look at the data allocation function in v0.3.0:

for node in vyper_module.get_children(vy_ast.FunctionDef):
    type_ = node._metadata["type"]
    if type_.nonreentrant is not None:
        type_.set_reentrancy_key_position(StorageSlot(storage_slot))

        # TODO this could have better typing but leave it untyped until
        # we nail down the format better
        variable_name = f"nonreentrant.{type_.nonreentrant}"
        ret[variable_name] = {
            "type": "nonreentrant lock",
            "location": "storage",
            "slot": storage_slot,
        }

        # TODO use one byte - or bit - per reentrancy key
        # requires either an extra SLOAD or caching the value of the
        # location in memory at entrance
        storage_slot += 1

The problem with this code is that it was setting the re-entrancy key position of each type_ (i.e. individual @nonreentrant key) to the latest value of storage_slot, incrementing on each iteration. This meant that the unique instances of @nonreentrant(<key>) were all using a different storage_slot value, however, the ret entry of variable_name was being overwritten on each iteration.

As such, the layout output of the compiler contained a single nonreentrant.<key> entry and a single storage offset, meaning that inspecting the compiler's output would appear to simply be "skipping" the storage slots of consecutive @nonreentrant(<key>) declarations, as per the PR's original rationale.

The non-vulnerable code that partially patched the vulnerability in the v0.3.1 release is as follows:

for node in vyper_module.get_children(vy_ast.FunctionDef):
    type_ = node._metadata["type"]
    if type_.nonreentrant is None:
        continue

    variable_name = f"nonreentrant.{type_.nonreentrant}"

    # a nonreentrant key can appear many times in a module but it
    # only takes one slot. ignore it after the first time we see it.
    if variable_name in ret:
        continue

    type_.set_reentrancy_key_position(StorageSlot(storage_slot))

    # TODO this could have better typing but leave it untyped until
    # we nail down the format better
    ret[variable_name] = {
        "type": "nonreentrant lock",
        "location": "storage",
        "slot": storage_slot,
    }

    # TODO use one byte - or bit - per reentrancy key
    # requires either an extra SLOAD or caching the value of the
    # location in memory at entrance
    storage_slot += 1

The code will now properly allocate a single storage_slot the first time it identifies a duplicate re-entrancy key. However, it will not invoke the set_reentrancy_key_position function on each type_ with the same offset, meaning that any @nonreentrant(<key>) entries beyond the first would have an "undefined" storage offset to utilize.

This led to a compiler panic^[3] when attempting to compile contracts with @nonreentrant decorators. To rectify this, a further change was necessary to ensure that all @nonreentrant decorators were properly aware of the storage slot they need to operate on.

PR#2514: fix codegen failure with nonreentrant keys

The final PR that completed the alleviation of the @nonreentrant vulnerability was PR#2514. In detail, it expanded the above code segment to ensure that the set_reentrancy_key_position function is properly invoked with the correct slot allocated for the given @nonreentrant lock.

The final non-vulnerable code of the v0.3.1 Vyper release is as follows:

for node in vyper_module.get_children(vy_ast.FunctionDef):
    type_ = node._metadata["type"]
    if type_.nonreentrant is None:
        continue
    variable_name = f"nonreentrant.{type_.nonreentrant}"

    # a nonreentrant key can appear many times in a module but it
    # only takes one slot. after the first time we see it, do not
    # increment the storage slot.
    if variable_name in ret:
        _slot = ret[variable_name]["slot"]
        type_.set_reentrancy_key_position(StorageSlot(_slot))
        continue

    type_.set_reentrancy_key_position(StorageSlot(storage_slot))
    # TODO this could have better typing but leave it untyped until
    # we nail down the format better
    ret[variable_name] = {
        "type": "nonreentrant lock",
        "location": "storage",
        "slot": storage_slot,
    }
    # TODO use one byte - or bit - per reentrancy key
    # requires either an extra SLOAD or caching the value of the
    # location in memory at entrance
    storage_slot += 1

As one can observe in the above segment, the set_reentrancy_key_position is now properly invoked for each type_ entry that is nonreentrant and will correctly utilize the same storage_slot whenever the same key is specified in a @nonreentrant(<key>) decorator.

Furthermore, alongside the above fix, the PR included a much needed and missing test from the Vyper repository; a dedicated unit test that evaluates cross-function re-entrancy:

@external
@nonreentrant('protect_special_value')
def protected_function(val: String[100], do_callback: bool) -> uint256:
    self.special_value = val

    if do_callback:
        self.callback.updated_protected()
        return 1
    else:
        return 2

@external
@nonreentrant('protect_special_value')
def protected_function2(val: String[100], do_callback: bool) -> uint256:
    self.special_value = val
    if do_callback:
        # call other function with same nonreentrancy key
        # --> (revert expected here) <--
        Self(self).protected_function(val, False)
        return 1
    return 2

However, while the bug was identified, fixed and tested for within the compiler codebase, the impact to production contracts was not realized at the time, and protocols potentially using the relevant compiler versions were not explicitly notified.

The notion of unique <key> values that can be re-used across @nonreentrant decorators only exists for a single purpose; cross-function re-entrancy protection. The fact that such a test was missing from the Vyper repository until its 0.3.1 release was one factor in contributing to the vulnerability being introduced in the first place and continuing to be undetected for as long as it did.

Vulnerability in Summary

Versions Affected: v0.2.15, v0.2.16, v0.3.0
Root Cause: Improper remediations to re-entrancy guard data corruption issues introduced in v0.2.13
Vulnerability in Brief: All @nonreentrant decorators within a Vyper contract will utilize a unique storage offset regardless of their key, meaning that cross-function re-entrancy is possible on all contracts compiled with the susceptible versions

Conditions for a Profitable Vulnerability

While the vulnerability itself is simple to identify and has been observed across a variety of live contracts, its profitability arises from a very specific set of conditions that need to be met. Specifically:

A .vy contract compiled with either of the following vyper versions: 0.2.15, 0.2.16, 0.3.0
A primary function that utilizes the @nonreentrant decorator with a specific key and does not strictly follow the CEI pattern (i.e. contains an external call to an untrusted party before storage updates)
A secondary function that utilizes the same key and would be affected by the improper state caused by the primary function

Unfortunately, these conditions are the exact ones that manifested in the Curve.Fi liquidity pools that were exploited as they needed to perform an external distribution of native ETH (which, on the EVM, can only be done with an execution context-transferring CALL^[4]) before sensitive storage updates were performed in functions that would otherwise be protected by a properly functioning @nonreentrant guard.

Summary and Takeaways

Bugs are an unfortunate and stark reality of any large, production software project. What we can do is to try and mitigate bugs and their associated risks to the greatest extent possible.

There are several, practical steps we can take to improve correctness of smart contracts compiled with Vyper going forward:

Improved testing of the compiler, including continuing to improve coverage, comparing compiler output with a language spec, and utilizing formal verification (FV) tools for compiler bytecode verification.
Providing developers with tools to make it easier to take a multi-faceted approach to testing their code – including source- and bytecode- level tests.
Tighter, two-way feedback with protocols using Vyper

But it is not enough to just focus on the correctness of the latest version of the compiler; due to the immutable nature of smart contracts, contracts compiled with past versions of Vyper could be securing a significant amount of funds.

Securing past releases of Vyper is therefore another important new axis of focus which we will be dedicating significant resources to in the future. It is just as important as introducing new features, providing bugfixes and refactoring for the latest release.

Ultimately, going forward we want to take the lessons learned from recent events to make sure Vyper is the most rock-solid and secure smart contract language and compiler project in the world. Therefore, these goals will be supported by a variety of new security-related initiatives within and beyond our team including:

A short-term, competitive audit which will focus on the most recent version of Vyper in partnership with Codehawks
Short-term and long-term (open-ended) bug bounty programs spanning all versions of the Vyper compiler in partnership with Immunefi
The Vyper Security Alliance, a coordinated, multi-protocol bounty program for helping discover present and past compiler bugs affecting versions of Vyper securing live TVL
Collaboration with multiple audit firms including ChainSecurity, OtterSec, Statemind and Certora to review past versions of Vyper which secure substantial live TVL and to help review the compiler on an ongoing basis going forward
Expansion of the team; including a dedicated security engineering role meant to improve the security tooling of Vyper across the board, both internally and user-facing
Collaboration with existing security toolkits that are offered for Solidity and would benefit the Vyper ecosystem greatly
Design of a language specification which will contribute to efforts to formally verify and help testing efforts of the compiler itself

We hope to see you using Vyper soon :). Please stay tuned for more announcements about these initiatives in the coming weeks! To follow along with further announcements, please follow the official Vyper twitter. To help contribute, please see the Vyper Github! If you are excited about Vyper and want to help with funding – or just want to chat – please reach out to us via the Vyper discord and we will always be happy to welcome you to the community.

Special thanks to the Omniscia team, which, while not directly affiliated with Vyper, contributed substantial co-authorship, feedback and review of this post-mortem report ↩︎
Briefly speaking, "yanking" means that the tags are available in the repository for historical purposes, but the releases are not published for download. For more information, please see PEP-592. ↩︎
That is, the compiler would simply error out instead of producing any code at all. While annoying for the user, a compiler panic is considered a "safe" error because it will not emit code which makes it to production. ↩︎
Technically there are other ways to transfer ether, but they are not applicable here at the time of this writing. EIP-5920 may be a positive development here. ↩︎

Vyperlang team with special thanks to Omniscia team[1]