# Cometh - Swarm Collab

## Objective

To have a rollup use Swarm for Data Availability.

## Context & Motivation

[Cometh](https://www.cometh.io/) is a French company building a platform for Web3 games and apps. They [raised 10M USD](https://www.nftgators.com/cometh-secures-10m-round-co-led-by-white-star-capital-ubisoft-and-stake-capital/) from relevant investors such as Ubisoft, and have customers like Lacoste.

Their technology (codenamed 'Muster') uses Bedrock, the official release of the OP Stack, developed and maintained by the Optimism Foundation. In this setup, the sequencer no longer publishes transaction data onto Ethereum L1 Mainnet but rather sends the data to whitelisted private servers: DACs (Data Availability Committees).

![](https://hackmd.io/_uploads/S1_VARQqn.png)

This is a good example to explore how Swarm can provide Data Availability guarantees in a cost-effective way.

### Cometh DAC requirements

This is where they are at the moment. The process is taken from the Alembic/Muster [documentation](https://docs.alembic.tech/optimistic-rollup/data-availability-committee-dac), and apparently it follows the [Arbitrum AnyTrust](https://developer.arbitrum.io/inside-anytrust) specs for DACs, as used in Arbitrum Nova.

- The DAC has N members, of which at least one is assumed to be honest: a 1-of-N trust assumption.
- The sequencer sends the data plus the expiry time of the batch to all DAC members simultaneously via RPC.
- Each DAC member stores the data using the data's hash value as index, and signs the data with its BLS key over the BLS12-381 curve. It returns the signature, along with a success indicator, to the DAS (Data Availability Server).
- Once the DAS has gathered a sufficient number of signatures, it aggregates them into a valid Data Availability Certificate.
- The certificate is posted to the DataAvailabilityInboxAddress (EOA) on Ethereum L1 and used for subsequent chain derivation.
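The steps above can be sketched as follows. This is a hypothetical mock, not the Cometh/Arbitrum code: real AnyTrust members sign with BLS keys over BLS12-381 and the DAS cryptographically aggregates the signatures, whereas here each member "signs" with a keyed hash so the sketch stays self-contained. All names (`DacMember`, `assemble_certificate`, `QUORUM`) are illustrative assumptions.

```python
import hashlib

QUORUM = 2  # signatures needed out of N members (e.g. 2-of-3); assumed value

class DacMember:
    def __init__(self, secret: bytes):
        self.secret = secret
        self.store: dict[bytes, bytes] = {}  # data hash -> data

    def receive(self, data: bytes, expiry: int) -> tuple[bytes, bool]:
        """Store data indexed by its hash; return (signature, success flag)."""
        h = hashlib.sha256(data).digest()
        self.store[h] = data
        # Keyed hash stands in for a BLS signature over (hash, expiry).
        sig = hashlib.sha256(self.secret + h + expiry.to_bytes(8, "big")).digest()
        return sig, True

def assemble_certificate(data: bytes, expiry: int, members: list[DacMember]):
    """DAS side: collect member signatures; succeed once QUORUM is reached."""
    sigs = []
    for m in members:
        sig, ok = m.receive(data, expiry)
        if ok:
            sigs.append(sig)
        if len(sigs) >= QUORUM:
            # A real DAS would BLS-aggregate the signatures here; we just
            # bundle them with the data hash as a stand-in certificate.
            return {"data_hash": hashlib.sha256(data).digest(), "sigs": sigs}
    return None  # not enough signatures: fall back to posting data on L1

members = [DacMember(bytes([i]) * 32) for i in range(3)]
cert = assemble_certificate(b"batch-of-txs", expiry=1_700_000_000, members=members)
```

The `None` return corresponds to the "fall back to rollup" path: if the quorum is never reached, the sequencer posts the full batch to L1 instead.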
If the system fails to collect enough signatures within a few minutes, it abandons the attempt to use the Committee and "falls back to rollup" by posting the full data directly to the L1 chain.

# Proposal

The following states the main value proposition of Swarm in terms of Data Availability and gives a high-level sequence of events for the processing within the network.

### Data Availability levels

Swarm's core functionality already provides several assurances that data is kept secure and available for some time, and we plan to add further guarantees to match and overtake current solutions.

#### LEVEL 1 - Native

1. Redundancy: data is stored within neighborhoods, with 4 nodes per neighborhood incentivised for retrieval by default.
2. Erasure coding: further assurance that data will not be lost or corrupted.

#### LEVEL 2 - Upcoming

3. Multi-year stamps: guarantee a minimum persistence even after drastic price changes in network capacity. The price oracle itself prevents such volatility.
4. Insurance contract: an on-chain contract that provides further guarantees for the stored data, allowing quality-of-service (QoS) performance for that data.

### How does it work

Here is a high-level overview of the flow:

- The insurance is an on-chain (L1) commitment to storing a particular piece of data identified by some hash that can be checked by the L1 EVM.
- It can be challenged by an L1 transaction (at cost), to which the insurer has a certain time to respond, also with an L1 transaction.
- If there is no response within the allotted time, the insurer's stake is slashed.
- The response actually contains the data that was claimed to be missing. Thus, both the challenger and the insurer pay transaction costs.
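The challenge-response flow above can be sketched as a small state machine. This is a minimal illustration under assumed parameters, not Swarm's actual insurance contract: the class name, the block-denominated `RESPONSE_WINDOW`, and the use of sha256 for the committed hash are all assumptions made for the sketch.

```python
import hashlib

RESPONSE_WINDOW = 100  # blocks the insurer has to respond; assumed value

class InsuranceContract:
    """Toy model of the on-chain commitment described above."""

    def __init__(self, data_hash: bytes, insurer_stake: int):
        self.data_hash = data_hash        # hash the L1 EVM can check
        self.stake = insurer_stake
        self.challenge_block = None       # block at which a challenge opened

    def challenge(self, current_block: int) -> None:
        """Challenger pays an L1 tx to claim the data is missing."""
        self.challenge_block = current_block

    def respond(self, data: bytes, current_block: int) -> bool:
        """Insurer posts the claimed-missing data itself within the window."""
        if self.challenge_block is None:
            return False
        in_time = current_block <= self.challenge_block + RESPONSE_WINDOW
        if in_time and hashlib.sha256(data).digest() == self.data_hash:
            self.challenge_block = None   # challenge settled, stake kept
            return True
        return False

    def settle_timeout(self, current_block: int) -> int:
        """If the window passed without a valid response, slash the stake."""
        if (self.challenge_block is not None
                and current_block > self.challenge_block + RESPONSE_WINDOW):
            slashed, self.stake = self.stake, 0
            return slashed
        return 0
```

Since both `challenge` and `respond` cost L1 gas, a rational challenger only challenges data that is genuinely unavailable, which is exactly the incentive alignment described in the next bullet.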
- Therefore, if the data is actually available in Swarm, there is no reason for the challenger to challenge; hence, the insurer is incentivized not only to store the data (so that their stake is not slashed upon a challenge) but also to keep it available on Swarm (so that there are no challenges at all).

From a Cometh perspective:

- The sequencer uploads a batch of transactions to Swarm and buys insurance for them for a given amount of time.
- How to query the data?

### Costs

#### Options

The upload, the insurance and the on-chain commitment together have to be cheaper than including all transaction data on the L1 blockchain (as is the case today).

- Probably the cheapest option is to include only the number of transactions in the batch and the Merkle root of the array of transaction hashes. In this case, the challenge is simply the index of the missing transaction, and the response is the transaction together with its Merkle proof of inclusion.
- A more costly alternative would be to also include the transaction hashes (possibly truncated to 128 bits) in the on-chain commitment transaction and have this hash array signed by the insurer. The cost of creating the commitment would then be increased by the cost of including this data (16 bytes, i.e. 256 gas per tx), hashing this data (0.5 words, i.e. 3 gas per tx), and a fixed overhead for including and verifying the signature (about 4000 gas).

#### Other DA solutions

A nice [diagram](https://blog.celestia.org/ethereum-off-chain-data-availability-landscape/) of where DACs stand in terms of cost:

![](https://hackmd.io/_uploads/ryv3N6752.jpg)

Arguably, Swarm can match the low cost of DACs while providing stronger DA guarantees comparable to Celestia's, positioning us at their level but at a lower cost (is this true? How many gas/byte?).

# Pending Questions

#### Specifically for Cometh DA requirements:

The hash of the published data is submitted to L1 in the form of a certificate. They would like these certificates to work as an index.
To do so, a server (the DAS) will query the data through the certificate. What could be the equivalent in Swarm for this feature?

#### Swarm's whitelabel solution for DA (general):

- How to verify that data is available?
- How to verify that it is erasure coded properly? (That is, that the original data was extended properly.)
- In Swarm, does it make sense to speak about a trust assumption (1-of-N)?