# Understanding Arweave's endowment: Exploration and Simulation

The Arweave network uses a novel form of storage endowment to ensure the permanence of the information that it stores. In this post we will detail and discuss how the storage endowment works, then study its properties and risk profile using Markov chain simulations of its execution.

This post gets deep into the weeds. If you are looking for introductory material, you may want to check out the main [Arweave website](https://arweave.org).

Let's dive in!

## Background: What is the endowment?

In the Arweave [yellow paper](https://arweave.org/yellow-paper.pdf) draft of 2019, we described Arweave's endowment structure (see section 3.2.2). The central logic of Arweave's endowment goes like this:

1. The cost of storage provision has been declining at a strong, exponential rate since the inception of information encoding. From papyrus, to the Gutenberg press, to magnetic drum memory, floppy disks, and flash drives, the cost of encoding and recalling information has been falling for thousands of years. In the digital era, we call this the `Kryder rate`.

2. While the exact rate of declining costs is variable, the pattern is reliable and has significant room for growth: the theoretical data density limits alone are a factor of 10^51 greater than our current achievements. Further, we do not foresee a slowdown in the desire to store data more efficiently, as humans and machines always tend to be more effective if they can access and process more information.

3. Given these factors, we observe that by extrapolating an extremely conservative Kryder rate we are able to price permanent storage at a single fee. We achieve this by charging the user a base fee of 200 years of storage at present costs; then, as the cost of storage declines, the storage purchasing power of this endowment contribution increases.
   As long as the Kryder rate remains above 0.5%, the storage purchasing power in the endowment at the end of the year will be greater than that at the start.

4. Once the protocol nears the end of its life, the size and cost of the dataset will drop to an extremely low level. Owing to its small size, we expect that it will be altruistically 'imported' into the next permanent information storage system, continuing the replication of the data. This follows the same pattern that led the Gopher archives to be found on the modern web, etc.

You can check out the full details and math that underlie this approach [here](https://arwiki.wiki/#/en/storage-endowment).

### Defining the Kryder+ Rate

In practice, the Arweave network utilizes a modification of the raw Kryder rate, which we will refer to as the 'Kryder+' rate in this document. The Kryder+ rate includes not just raw data storage, but also the other factors that are required in order to keep a network like Arweave online: replications, electricity, and operational costs. Each of these, we note, is affected by the same underlying decay in storage costs:

- **Replications**: Each new replica of the dataset inherits the same declining storage costs as the first.
- **Power Usage**: Changes in data density and reliability (the factors that most prominently affect the Kryder rate) are rarely, if ever, accompanied by increases in power usage. Consequently, as storage mediums increase in capacity, the relative energy cost of storing a given quantity of data declines, too.
- **Operational Expenses**: As with power usage, as the efficiency of individual digital storage mediums increases, the number of devices needed to store a piece of data (and thus, the operational overhead to maintain them) declines.
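
The 0.5% threshold quoted in the endowment logic above follows from simple arithmetic, which the sketch below works through. It is a simplified illustration only: it measures the endowment in "years of storage at the current yearly cost", ignores token price movements, and the function name `purchasing_power_after_one_year` is our own, not part of the Arweave codebase.

```python
def purchasing_power_after_one_year(years_funded: float, kryder_rate: float) -> float:
    """Years of storage the endowment can still buy after one year has passed.

    years_funded: endowment size, in years of storage at the current yearly cost.
    kryder_rate:  fractional yearly decline in storage cost (0.005 == 0.5%).
    """
    remaining = years_funded - 1.0          # pay for this year's storage
    return remaining / (1.0 - kryder_rate)  # next year's storage is cheaper


BASE_YEARS = 200.0  # base fee from the text: 200 years at present cost

for rate in (0.0, 0.005, 0.01, 0.02):
    after = purchasing_power_after_one_year(BASE_YEARS, rate)
    print(f"Kryder rate {rate:.1%}: {after:.3f} years of storage remain funded")
```

After paying for one year, 199 years' worth of value remains. Dividing by `1 - kryder_rate` re-expresses that value at next year's cheaper cost, and the result exceeds 200 exactly when the rate is above 1/200 = 0.5%.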

In the present version of the Arweave network (2.5.3), 45 replicas of the dataset are targeted in the Kryder+ rate (defined [here](https://github.com/ArweaveTeam/arweave/blob/fb01d6e5b9107c264113d111b6afa5ae721dcec1/apps/arweave/include/ar_pricing.hrl#L14)), along with a 2x storage overhead for operational and power expenses (see [here](https://github.com/ArweaveTeam/arweave/blob/b6794691daed9d3cf7b064ac287f298c0c521730/apps/arweave/src/ar_pricing.erl#L46)).

After the [Arweave 2.6](https://2-6-spec.arweave.dev) upgrade, the network will automatically derive the Kryder+ rate by reacting to the price at which miners are willing to provide storage. The network can orchestrate a trustless oracle for this price because miners are incentivized to minimize it, in competition with one another.

Notably absent from Arweave's formulation of the Kryder+ rate are bandwidth costs. Arweave covers these using a separate set of karma-based incentives -- see [here](https://arwiki.wiki/#/en/karma).

## Simulating the endowment

Now that we have covered the theoretical background of Arweave's endowment, as well as its practical implementation in the live network, we can consider a simulation of this mechanism to observe likely real-world outcomes. To assist in this effort, we utilize a Markov chain-based simulation technique. This model runs many individual iterations of potential futures year-by-year, then collates the results. The code to execute and modify this simulation yourself is linked at the end of this page.

### Simulation factors

The Kryder+ rate is a primary factor in any simulation of Arweave's endowment. In this model, we utilize a dataset of hard drive costs over time (found [here](https://arweave.net/wufZ10dlzwfPFTNKr3uRAyeMRfMdkNx1iG9yjolRbv8)) as our base. From this data, we observe an average Kryder rate of ~38%.
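
As a rough illustration of how an average yearly decline can be extracted from a cost series, the sketch below computes a geometric-mean rate from the first and last observations. The post does not state which averaging method was applied to the hard drive dataset, so this is one plausible approach rather than the method actually used; the function name `average_yearly_decline` and the synthetic series are ours.

```python
from typing import Sequence


def average_yearly_decline(costs: Sequence[float]) -> float:
    """Geometric-mean yearly cost decline across a series of yearly costs.

    costs: cost per unit of storage, one entry per consecutive year, oldest first.
    Returns a fraction, e.g. 0.38 for a 38% average yearly decline.
    """
    if len(costs) < 2:
        raise ValueError("need at least two yearly observations")
    if min(costs) <= 0:
        raise ValueError("costs must be positive")
    intervals = len(costs) - 1
    return 1.0 - (costs[-1] / costs[0]) ** (1.0 / intervals)


# Synthetic series built from a constant 38% decline (NOT the real dataset),
# used only to check that the function recovers the rate it was built with.
synthetic = [100.0 * (1.0 - 0.38) ** year for year in range(10)]
print(f"{average_yearly_decline(synthetic):.2%}")
```

A geometric mean is used here because cost declines compound from year to year; an arithmetic mean of year-over-year changes would give a different figure on noisy data.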

On top of real-world data, we add the ability to create a layer of 'pessimism' about future vs past advancements, allowing us to stress-test how the endowment would operate in less fortunate periods. We describe this 'pessimism' factor as the % of previous storage cost declines that we anticipate will continue into the future. For example, a pessimism rate of 10% implies that we think that the future will only be 10% as effective at lowering the cost of storage as the past was.

Another important factor in the simulation of Arweave's endowment is the volatility of its token price. Arweave uses a floating-price token for its endowment for two primary reasons:

- Centralized stablecoins are highly likely to collapse or cease operation long before the last block in the Arweave network is mined. Further, a decentralized stablecoin architecture built into the Arweave protocol itself would be susceptible to under-collateralization in the event of extreme market volatility.
- Conversely, Arweave's native token has a strong claim to utility and operates independently of any external chains or services. This lack of interdependence helps ensure that the Arweave protocol can continue unaffected by external factors for extremely long periods of time.

One of the effects of the floating nature of the token's price, however, is that the 'fiat value' of the endowment is volatile. In order to model this in our simulation, we assume a pessimistic, price-neutral volatility in the endowment value. That is, all of the simulated fluctuations to the endowment's value should average to zero in totality, but individually will move the price higher and lower in the meantime.

To allow each individual simulation to terminate in a reasonable period, execution is halted once 10,000 years have elapsed, or the endowment reaches zero.
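
To make the factors above concrete, here is a minimal sketch of a year-by-year simulation loop. It is not the simulator used for this post. It assumes that the effective Kryder+ rate is the historical rate scaled by the pessimism factor, that the yearly price shock is drawn uniformly from `[-max_volatility, +max_volatility]`, and that storage is paid for before the shock is applied. All function and constant names are ours.

```python
import random

HISTORICAL_KRYDER_RATE = 0.38  # average yearly decline quoted in the text
BASE_YEARS = 200.0             # base fee: 200 years of storage at present cost
MAX_YEARS = 10_000             # simulation cut-off from the text


def simulate_lifetime(kryder_plus: float, max_volatility: float,
                      rng: random.Random) -> int:
    """Return the number of full years one simulated endowment can pay for."""
    endowment = BASE_YEARS  # value, in units of the year-0 yearly storage cost
    yearly_cost = 1.0
    for year in range(MAX_YEARS):
        if endowment < yearly_cost:
            return year  # endowment exhausted
        endowment -= yearly_cost
        # ASSUMPTION: zero-mean uniform shock; its mean absolute size is
        # max_volatility / 2.
        endowment *= 1.0 + rng.uniform(-max_volatility, max_volatility)
        # Storage gets cheaper every year at the effective Kryder+ rate.
        yearly_cost *= 1.0 - kryder_plus
    return MAX_YEARS


def mean_lifetime(pessimism: float, max_volatility: float,
                  runs: int = 20, seed: int = 0) -> float:
    """Average lifetime over several independent runs of one scenario."""
    rng = random.Random(seed)
    kryder_plus = HISTORICAL_KRYDER_RATE * pessimism  # ASSUMPTION: linear scaling
    lifetimes = [simulate_lifetime(kryder_plus, max_volatility, rng)
                 for _ in range(runs)]
    return sum(lifetimes) / runs


if __name__ == "__main__":
    for pessimism in (0.0, 0.02, 0.05, 0.10):
        for max_volatility in (0.0, 0.1, 0.3):
            years = mean_lifetime(pessimism, max_volatility)
            print(f"pessimism={pessimism:.0%} volatility={max_volatility:.0%} "
                  f"mean lifetime={years:.0f} years")
```

Each year's state depends only on the previous year's endowment value and storage cost, which is what makes the process a Markov chain. With both pessimism and volatility set to zero, the sketch returns exactly 200 years, matching the base fee. Its outputs for other scenarios depend on the assumptions listed above and should not be expected to match the published plots.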

### Endowment Lifetimes

The simplest way to understand the behavior of the endowment is to look at the average number of years that the endowment survives under various external conditions.

![](https://i.imgur.com/qi2QODX.png)

![](https://imgur.com/j21obF5.png)

Above we see a plot of endowment lifetimes with varying levels of yearly maximum token price volatility (horizontal), against changes in the effective Kryder+ rate (vertical, also listed with their 'pessimism' values against real-world data). Scenarios in which every run (20 per combination) resulted in a lifetime over 10,000 years are denoted with a dark green color.

The first important cell to note in this rendering is at 0% volatility and 0% pessimism. A rate of 0% for pessimism/Kryder+ implies that we assume that the cost of storage will never decrease again. In this instance, the network should nonetheless hold user data for at least 200 years with functioning economics. This parameter was chosen such that even those who are deeply skeptical of future technological advancements can trust that their data will be economically viable to store for at least ~3 generations before requiring altruistic storage.

Another important observation from this rendering comes from the 30% volatility and 2/4% Kryder+ zone. In our simulation, 30% maximum volatility in token price implies an average yearly token price change of 15% -- extremely close to the S&P 500's average volatility of 14.4% per year from 1950 to 2015. Assuming this average rate of volatility in the network's token price, we see that a Kryder+ rate of just ~2% will yield an endowment run-length of almost 2,000 years, and a slightly higher rate yields a run-length over 10,000 years.

![](https://i.imgur.com/IFZz5RM.png)

Further, if eventual commodity-like average volatilities are assumed (approximately 2-5%, according to [World Bank estimates](https://thedocs.worldbank.org/en/doc/825481461938593619-0050022016/original/CMO2014Julyanalysis.pdf)), we observe that even a Kryder+ rate of less than 0.76% will lead to an endowment runtime of longer than 10,000 years.

### Deflation Probabilities

As can be seen above, in a large number of scenarios the endowment still contains tokens to continue incentivizing data storage after the simulation terminates at 10,000 years. If we dig deeper into the execution of each individual run, we see that the vast majority of tokens are taken from the endowment in the early years of storage:

![](https://i.imgur.com/x3HroEC.png)

Given this behavior, we can note that when users place tokens into the endowment to back the data they are storing, there is a very high likelihood that some of those tokens will never be released again.

![](https://i.imgur.com/vI0Bp12.png)

Above, we see a plot of the likely quantity of tokens that are never released from the endowment at various levels of pessimism about future storage cost declines relative to the present.

### Run it yourself

The simulator used in this post can be found [here](). Please check it out, learn about the model, and share your own modified simulations! It can handle running approximately 10,000 full-length executions in a couple of minutes on a single thread, so it can be used to simulate many different scenarios quickly.