# A data-availability layer for Tezos
The Mumbai protocol, which introduced smart rollups, marked the
inception of Layer-2 solutions in the blockchain landscape of
Tezos. Layer-2 is fundamentally designed to enhance the scalability of
blockchain systems by increasing transactional throughput, commonly
measured in transactions per second (TPS).
In an earlier announcement, we unveiled the potential to achieve a
staggering 1 million TPS via these smart rollups. Facilitating such
elevated throughput, however, requires moving the content of
operations off-chain. Indeed, if every transaction of a rollup passed
through Layer 1 (L1), the maximum throughput of a rollup would be
approximately 5000 TPS[^compute] for the Nairobi protocol, a figure
that pales in comparison to the stated 1 million TPS. Moving data
off-chain, in turn, confronts us with a critical question of
consensus: once the data is off-chain, how can we agree on its
content, and on whether it actually exists and can be deemed
available? This is the data-availability problem.
The data-availability conundrum is a recognized issue within the
blockchain sphere, and several solutions have already been proposed to
address this complexity.
One such methodology proposes the use of a specialized committee that
operates independently of the Layer-1 (L1) stake. This concept is
known as a Data-Availability Committee, or DAC for short. A DAC
solution has been meticulously crafted for Tezos, and is explored in
more detail in this article (link to DAC article).
Because the committee can be composed of arbitrary members (typically
governed by a multisig), this solution is not decentralized. Even
though this can be suitable in practice for some use-cases, others
call for a fully decentralized solution. This is the data-availability
layer: a decentralized solution that aims to solve the
data-availability problem.
[^compute]: Assuming an L2 transaction is `10` bytes. Roughly, an L1
    block can carry at most 512 KB of transactions, meaning we can fit at
    most 51.2K transactions per block. Since we have a block every 15
    seconds, this yields a throughput of roughly 3.4K TPS, i.e. at most
    on the order of 5K TPS. This is far from the 1M TPS advertised.
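For concreteness, here is the footnote's arithmetic as a short Python snippet; the values are rough assumptions, not precise protocol constants:

```python
# Back-of-the-envelope estimate from the footnote (illustrative values only).
l2_tx_bytes = 10          # assumed size of one L2 transaction
block_bytes = 512_000     # ~512 KB of transaction data per L1 block
block_time_s = 15         # one L1 block every 15 seconds

txs_per_block = block_bytes // l2_tx_bytes   # 51,200 transactions per block
tps = txs_per_block / block_time_s           # ~3,413 TPS
print(f"at most ~{tps:,.0f} TPS")            # far from the advertised 1M TPS
```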
## The data-availability layer
The Data Availability Layer, or DAL, is a solution designed to achieve
consensus on data availability, relying on Layer-1 (L1)
stakeholders. This approach introduces an independent peer-to-peer
(P2P) network where data can be both submitted and retrieved. Unlike
the P2P protocol employed by L1, where each node receives all data,
the P2P protocol[^protocol] utilized by the DAL is designed such that
nodes only receive data of interest to them. This effectively
circumvents the bandwidth limitation inherent to L1, optimizing data
transfer and accessibility.
[^protocol]: A re-implementation of the [gossipsub algorithm](https://docs.libp2p.io/concepts/pubsub/overview/), which we will detail in a later blog post.
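To see why topic-based dissemination saves bandwidth, consider the following deliberately simplified toy model. It is not the actual gossipsub implementation used by the DAL, only the core idea: a node receives messages solely for the topics it subscribed to.

```python
# Toy publish/subscribe network (not the real gossipsub implementation):
# a node only receives messages for the topics it subscribed to, e.g. the
# slot indices it cares about.
from collections import defaultdict

class ToyPubSub:
    def __init__(self):
        self.subscribers = defaultdict(list)  # topic -> list of node inboxes

    def subscribe(self, topic, inbox):
        self.subscribers[topic].append(inbox)

    def publish(self, topic, message):
        # Unlike the L1 P2P layer, where every node receives all data,
        # only the subscribers of `topic` receive this message.
        for inbox in self.subscribers[topic]:
            inbox.append((topic, message))

net = ToyPubSub()
node_a, node_b = [], []
net.subscribe("slot-0", node_a)   # node A only follows slot 0
net.subscribe("slot-1", node_b)   # node B only follows slot 1
net.publish("slot-0", b"a shard of slot 0's data")
assert node_a and not node_b      # node B never downloaded slot 0's data
```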
The high-level workflow for using the DAL is as follows:
1. A user desiring to submit new data posts a commitment to that data
(similar to a *hash*[^hash]) onto the L1.
2. Following validation of this hash by L1, the user proceeds to
submit the actual data to the DAL P2P network.
3. L1 attesters verify the availability of this data and communicate
their findings using their respective attestations.
4. Finally, L1 collates these results to ascertain the availability of
the data.
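Schematically, the four steps above could be pictured as follows. Every function name, value, and the two-thirds quorum below are hypothetical stand-ins for illustration, not actual Octez APIs or protocol rules.

```python
# Hypothetical sketch of the DAL workflow; names and rules are illustrative.

def publish_commitment_on_l1(commitment):        # step 1
    print("L1: commitment included:", commitment)

def publish_data_on_dal(slot_index, shards):     # step 2
    print(f"DAL P2P: {len(shards)} shards published for slot {slot_index}")

def attest(attester, slot_index):                # step 3
    return (attester, slot_index, True)          # "I could download my shards"

def l1_collates(attestations, quorum=2 / 3):     # step 4
    seen = sum(1 for (_, _, ok) in attestations if ok)
    return seen / len(attestations) >= quorum

publish_commitment_on_l1("commitment-of-the-data")  # a KZG commitment, see below
publish_data_on_dal(slot_index=0, shards=[b"chunk"] * 16)
attestations = [attest(a, 0) for a in ("alice", "bob", "carol")]
print("deemed available:", l1_collates(attestations))
```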
It is worth noting that what is transferred to the DAL isn't the raw
data in its initial form. Instead, the submitted elements can be
broadly characterized as chunks of the original data, accompanied by
an erasure code. This arrangement ensures that even if attesters on
the DAL each receive only a portion of these chunks, rather than all
of them, it remains feasible to reconstruct the original data.
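To make the erasure-coding idea concrete, here is a toy Reed-Solomon-style sketch; the production implementation is far more efficient, and only the mathematical idea is shown. The original chunks are read as the coefficients of a polynomial over a prime field, more evaluations than chunks are published, and any `k` of the `n` shards suffice to reconstruct the data.

```python
# Toy Reed-Solomon-style erasure code: k chunks become the coefficients of a
# degree-(k-1) polynomial; n > k evaluations ("shards") are published, and
# any k of them recover the chunks by Lagrange interpolation.
P = 2**31 - 1  # a prime modulus (toy-sized for the example)

def poly_mul(a, b):
    # multiply two polynomials given as coefficient lists (mod P)
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] = (out[i + j] + ai * bj) % P
    return out

def encode(chunks, n):
    # shard at x = evaluation of the polynomial with coefficients `chunks`
    return [(x, sum(c * pow(x, j, P) for j, c in enumerate(chunks)) % P)
            for x in range(1, n + 1)]

def decode(shards, k):
    # Lagrange interpolation through any k shards recovers the k coefficients
    pts = shards[:k]
    chunks = [0] * k
    for i, (xi, yi) in enumerate(pts):
        num, den = [1], 1  # i-th Lagrange basis: numerator poly, denominator
        for j, (xj, _) in enumerate(pts):
            if i != j:
                num = poly_mul(num, [(-xj) % P, 1])
                den = den * ((xi - xj) % P) % P
        inv_den = pow(den, P - 2, P)  # modular inverse (Fermat's little theorem)
        for d in range(k):
            chunks[d] = (chunks[d] + yi * num[d] * inv_den) % P
    return chunks

data = [104, 101, 108, 108]             # 4 original chunks
shards = encode(data, n=8)              # 8 shards, any 4 of which suffice
assert decode(shards[4:], k=4) == data  # reconstruct from the last 4 only
```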
Once L1 has confirmed the availability of this data, it can be
utilized by various applications, such as smart rollups.
The preliminary requirement for the user to post a commitment allows
consensus to be reached on the data that is eligible for submission to
the DAL. This precautionary step effectively deters an attacker from
flooding the DAL with arbitrary data.
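As a rough intuition, here is a toy sketch of the idea with SHA-256 standing in for the real KZG commitment scheme: L1 stores only a short commitment, and any data on the DAL network that does not match a committed value can simply be discarded.

```python
# Toy stand-in for the commitment step (the real DAL uses KZG commitments).
import hashlib

data = b"some slot data, potentially up to a full slot in size"
commitment = hashlib.sha256(data).hexdigest()  # short, fixed-size, posted on L1

spam = b"arbitrary data from an attacker"
assert hashlib.sha256(spam).hexdigest() != commitment  # rejected by DAL nodes
assert hashlib.sha256(data).hexdigest() == commitment  # the committed data
```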
[^hash]: Technically, this is not really a hash but a [KZG commitment](https://dankradfeist.de/ethereum/2020/06/16/kate-polynomial-commitments.html).
## The smart-rollups integration
Smart rollups are designed to be compatible with the DAL, and will be
able to leverage it once it is activated. The integration of the DAL
with smart rollups relies on the reveal data channel[^reveal]. This
particular channel enables the PVM (Proof-generating Virtual Machine)
to request the data corresponding to a hash or, more generally, to a
unique identifier. From the kernel's perspective, data submitted onto
the DAL that has been duly recognized as available can be uniquely
identified via three numerical values:
- The level at which the commitment to the data was submitted
- The slot index: This parameter will be elaborated upon in the
following section
- The page index: This is a technical detail arising from the fact
  that the WASM PVM can import at most 4 KiB of data at a time, while
  data submitted to the DAL can be larger (such as 1 MiB). The
  submitted data is therefore split into pages of 4 KiB each, and the
  page index identifies the specific page of the original data to be
  imported, as sketched below.
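To fix intuition, here is how these three values pin down a 4 KiB page; all concrete numbers below are made up for the example, and the dictionary layout is not the actual API.

```python
# Hypothetical illustration of the (published_level, slot_index, page_index)
# triple identifying one 4 KiB page of a DAL slot.
PAGE_SIZE = 4 * 1024                        # the WASM PVM imports at most 4 KiB

slot_size = 1024 * 1024                     # e.g. a 1 MiB slot
num_pages = slot_size // PAGE_SIZE          # -> 256 pages of 4 KiB each

request = {
    "published_level": 1_234_567,           # level of the commitment on L1
    "slot_index": 3,                        # which slot at that level
    "page_index": 42,                       # which 4 KiB page of the slot
}
offset = request["page_index"] * PAGE_SIZE  # bytes [offset, offset + 4 KiB)
print(num_pages, offset)
```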
The kernel is allowed to request any data that has been validated on
the DAL, ranging from any point in the past up to the current
level. The obligation falls on the rollup node to fetch this data from
the DAL P2P network. This will require a DAL node as discussed in the
Infrastructure section. More details about this will be thoroughly
discussed in an upcoming blog article.
## Slots
The L1 maintains control over the maximum quantity of data that can be
submitted to the DAL per L1 block level. From L1's perspective, data
introduced to the DAL is organized into slots. Each user aiming to
submit data to the DAL must specify the slot for which they are
posting the data. This allows multiple users at each level to
submit data to the DAL simultaneously. Consequently, a smart rollup
operator intending to utilize the DAL need not depend on a third
party for submitting the data, since operators can do it themselves.
In instances where two different users aspire to submit data for the
same slot index at the same level, only the first operation will prove
successful. Given that the order is determined by the baker, and the
baker uses fees to sequence operations, it follows that the fee market
will effectively determine which operation claims the slot.
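The rule can be pictured with a small toy model (hypothetical code, shown after the baker's fee-based ordering has already been applied):

```python
# Toy model of the "first operation wins" rule for claiming a slot
# (hypothetical code, not the actual protocol implementation).
slot_headers = {}  # (level, slot_index) -> commitment

def publish_slot_header(level, slot_index, commitment):
    key = (level, slot_index)
    if key in slot_headers:
        return False                    # slot already claimed at this level
    slot_headers[key] = commitment      # the first claim wins
    return True

assert publish_slot_header(100, 0, "commit-A")      # succeeds
assert not publish_slot_header(100, 0, "commit-B")  # same slot and level: fails
assert publish_slot_header(100, 1, "commit-B")      # a different slot is free
```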
## The DAL is made to evolve
The implementation of the DAL hinges on several key parameters, such as:
- The number of slots
- The size of a slot
Both of these are parameters established by the economic protocol. These two constants are pivotal, as they control the DAL's bandwidth:
$$\text{bandwidth}=\frac{\text{number of slots} \times \text{size of a slot}}{\text{time between blocks}}$$
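For illustration, plugging in hypothetical parameters (not the actual protocol constants): 256 slots of 128 KiB each, with one block every 15 seconds, would give
$$\frac{256 \times 128\ \text{KiB}}{15\ \text{s}} \approx 2.1\ \text{MiB/s}$$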
The DAL's architecture is designed to accommodate a bandwidth of at
least $10$ MiB/s. However, as will be outlined in the DAL's roadmap, we
intend to initially launch the DAL with a lower bandwidth,
incrementally expanding it over time. The advantage of this approach
lies in our ability to first verify the seamless operation of the P2P
protocol, and bolster its resilience on the test network as we
gradually enhance the bandwidth.
Latency is another critical factor to consider. Given that it takes
time for attesters to fetch data from the DAL, a certain degree of
latency is to be expected between the moment the data's commitment is
posted on Layer-1 (L1) and the time when the data is officially marked
as available by L1. This latency is governed by a parameter known as
`attestation_lag`, which can be fine-tuned in tandem with the
`time_between_blocks` parameter.
The attestation lag regulates the time window (in terms of blocks)
that an attester has to retrieve the data. Reducing the time interval
between blocks necessitates an increase in the attestation
lag. However, as a general rule, the latency should ideally be kept
under one minute.
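To make this concrete: the end-to-end latency is roughly $\text{attestation\_lag} \times \text{time\_between\_blocks}$. With illustrative values (not the actual constants) of a 4-block attestation lag and 15-second blocks, data committed at some level would be marked available about $4 \times 15 = 60$ seconds later, right at the one-minute target.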
## Infrastructure
Primarily, there will be three types of users engaging with the DAL:
- Slot producers will submit a commitment of the data to Layer-1 (L1)
and then proceed to upload the data to the DAL. For the
smart rollup use-case, a rollup operator can be a slot producer.
- Attesters will confirm the availability of the data submitted by the
slot producers.
- Slot consumers will fetch the data uploaded to the DAL, typically
data that has been validated as available by L1. For the smart
rollup use-case, a slot consumer can be anybody interested in
tracking the activity of a given rollup.
Given that the DAL employs a different P2P protocol than the one used
by L1, we've decided to implement a separate binary to facilitate
connection to the DAL network, namely the DAL node. Anyone wishing to
utilize the DAL will need to operate a DAL node (we aim to make the
command line interface and the configuration closely resemble those of
the octez-node).
Initially, this means that both rollup operators and attesters will
need to run a DAL node. However, based on feedback from the community,
we might consider adjusting this user experience in the future. For
instance, we could directly integrate the DAL node with the baker or
the rollup node.
## Roadmap for the DAL
Our current agenda involves rolling out the DAL on Mondaynet by the
end of June. Concurrently, we will be preparing the DAL for production
and conducting stress tests on the DAL P2P protocol. As mentioned
earlier, for safety reasons, we prefer to initially launch the DAL
with a lower bandwidth and then gradually increase it over time.
We anticipate a release on Mainnet towards the end of the year, or at
the onset of the following year.
For those intrigued by the technical aspects of the DAL, you can delve
into our design document available [here](https://hackmd.io/@p-cUv0l5RNaDKBCowZ0IzA/HJgFgSzpo/https%3A%2F%2Fhackmd.io%2FUQuA_59QRdOjU47fGM9CsQ).
## Conclusion
The DAL represents a ground-breaking solution that offers a
decentralized data-availability approach for Tezos. We plan to
introduce the DAL on test networks by the end of June, with an
ultimate goal of launching on the mainnet by the end of this year or
at the start of the next.
Additional blog articles will follow, delving deeper into the design
of the DAL and explaining the technical compromises we've decided
upon. We also intend to describe other data-availability solutions
that have been proposed within the blockchain ecosystem and contrast
them with ours.
[^reveal]: The reveal data channel was already described in the DAC article here TODO.