# Instant Transactions and Infinite Scaling (Braindump)

Can we have the simplicity of a debit card payment while transacting in Eth? How can we have near-instant confirmation times while using a blockchain? How can we have arbitrary throughput, with fixed onchain costs, and trustless withdrawals? By solving this, we could compete with payment networks like Visa for in person transactions, achieve massive global adoption, and potentially make Eth a true money.

To get instant transactions, we propose a simple addition to rollups, where a centralised sequencer can give instant guarantees about the rollup's *future* state, while individual users can still trustlessly withdraw their funds after a delay. Imagine Alice is buying an apple from Bob. Instead of dealing with the blockchain, Alice simply sends her transaction to the sequencer, who sends back a receipt guaranteeing that the transaction will go through soon. Alice gives the receipt to Bob, who checks the amount and the sequencer's signature, and accepts the payment.

To get infinite scaling we propose "escape pods." When the sequencer updates the state, they also post the root of all transactions in that update. If the sequencer stops responding, the rollup is frozen and the shutdown sequence begins. You have a few weeks to prove you owned coin number `x` at time `t`. If you spent that coin, any subsequent owner will be able to post their proof and supersede your claim. Once the claim period runs out, the money is returned to the rightful owners on L1. There is no double spending issue because we use an extended UTXO model where every token is numbered, and can only be withdrawn once.

First we will sketch the details of instant payments in the traditional rollup setting where all data is posted on L1, making the shutdown process trivial. Then we describe the escape mechanism in detail, giving a fully scalable trustless rollup.

## Instant Payments

### Censorship Resistance

Having a centralised sequencer is crucial to allow instant transactions. In order to make guarantees about future payments, the sequencer must be sure that no one else can manipulate the state before they finalise it on chain. However, users need to be able to send transactions permissionlessly to circumvent censorship by the sequencer. Usually, rollups achieve censorship resistance by decentralising the sequencer, but since we need a centralised sequencer, we instead use an onchain queue.

Transactions must be enqueued, say, 100 blocks before they can enter the state. To enforce this, they are stored alongside `block.number + 100`. Only the sequencer can update the state root, and when they include a transaction from the queue in the new state, they can dequeue it on chain. If a transaction is left in the queue too long, the sequencer can be slashed for their entire stake. When the sequencer is slashed we disable deposits and L2->L2 payments, freeze the state, and let people withdraw their funds.

This achieves trustlessness and self-sovereignty in the sense that we can always force any transaction we like while the sequencer is operating, and when the sequencer fails, we can always return our funds to the (essentially trustless) L1. We use the onchain queue if the sequencer is censoring us, but in the cooperative case we simply send the transaction to the sequencer offchain, who includes it in the next state, skipping the queue.
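To make the queue mechanics concrete, here is a minimal Python sketch of the forced-inclusion logic, not an actual L1 contract: the names `ForcedInclusionQueue`, `enqueue`, `dequeue`, and `check_slashable`, and the 100-block window, are illustrative assumptions rather than a spec.

```python
# Minimal sketch of the forced-inclusion queue described above.

INCLUSION_WINDOW = 100  # blocks the sequencer has to include a queued tx

class ForcedInclusionQueue:
    def __init__(self):
        self.pending: dict[bytes, int] = {}  # tx_hash -> deadline block number

    def enqueue(self, tx_hash: bytes, current_block: int) -> None:
        """Any user can enqueue a transaction; it must enter the state
        within INCLUSION_WINDOW blocks."""
        self.pending[tx_hash] = current_block + INCLUSION_WINDOW

    def dequeue(self, tx_hash: bytes, sequencer_sig_ok: bool) -> None:
        """Only the sequencer dequeues, when posting a state root that
        includes this transaction."""
        assert sequencer_sig_ok, "only the sequencer may dequeue"
        self.pending.pop(tx_hash, None)

    def check_slashable(self, current_block: int) -> bool:
        """If any queued transaction is past its deadline, the sequencer
        can be slashed and the rollup enters shutdown."""
        return any(deadline < current_block for deadline in self.pending.values())
```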
In the basic version, the sequencer posts each transaction as calldata onchain so that everyone can create exit proofs when the state is frozen. This costs gas linear in the number of transactions, and is fixed [below](TODO).

### Offchain Guarantees

With only a single designated sequencer, and a delayed onchain queue, the sequencer can be certain about the future state up to 100 blocks in the future. Moreover, by allowing the sequencer to jump the queue, it can instantaneously alter the future state and make *staked guarantees* about that state, without fear that regular users will force any alterations.

Suppose Alice is buying an apple from Bob. If she uses the onchain queue, Bob must wait, say, 10 blocks' worth of confirmations to be sure that the transaction has made it to the queue. Even worse, he must wait a further 100 blocks after the sequencer's ZKP has been verified, to be sure that Alice and the sequencer won't intercede with a queue-skipping transaction that spends the same UTXO. This is clearly not instant!

Instead of using the chain, Alice can send her transaction directly to the sequencer, who can sign it, attesting that the UTXO *will* go to Bob by block X. This is called the receipt. Alice gives the receipt to Bob, who takes it as a strong guarantee of payment, and gives the apple to Alice. Bob holds on to the receipt until block X, and checks that his UTXO is indeed in the state. If it is, he discards the sequencer's signature. But if it is not, he can take the signature onchain, demonstrate that the sequencer has misbehaved, slash the sequencer's stake, and earn a reward. In the simple version with onchain calldata, Bob can easily verify that his transaction was not included and prove that the sequencer misbehaved. In the scalable version, Bob needs [another interactive game](TODO) to find out whether his transaction was included and collect evidence of misbehaviour.

### Incentivising the Sequencer

The sequencer is given a cut of each offchain payment to incentivise them to confirm offchain transactions. However, if Alice simply signs a TX giving Bob 99% and the sequencer 1%, then the sequencer is instantly certain that they will receive their cut, and has no incentive to return the signed receipt. Remember, the signed receipt is essential to prove to Bob that the transaction will be completed.

Instead, we implement a simple game between Alice and the sequencer. Alice sends a TX giving 98% to Bob, and burning 2%. This TX has the special property that it can be converted to a new TX giving 98% to Bob, 1% to the sequencer and 1% back to Alice, as long as the sequencer and Alice mutually agree. So, in order to earn their fee, the sequencer signs the receipt and sends it to Alice. Then Alice signs it to get her cut back, and sends it back to the sequencer. Now Alice has a receipt that she can send to Bob proving that the transaction will go through.

[TODO]: <> (diagram of messages between Alice, Bob, and the Sequencer)

Note, Alice could spite the sequencer and force them to include the transaction without being paid their fee, but that would cost her 1% of the original transaction, and it's unlikely that many users would do that. This is more problematic for a rollup with onchain calldata, as the sequencer pays a non-negligible gas cost to include a transaction, and could be forced to make a loss.
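To make the handshake concrete, here is a minimal Python sketch of the message flow under the assumptions above. The `Tx` fields, the `sign` stand-in, and helper names like `mutually_convert` are hypothetical and only illustrate the ordering of signatures, not a wire format.

```python
# Minimal sketch of the receipt / fee-splitting handshake described above.

from dataclasses import dataclass

def sign(key: str, msg: str) -> str:
    return f"sig({key},{msg})"          # stand-in for a real signature

@dataclass
class Tx:
    sender: str
    recipient: str
    amount: int                         # total amount Alice is spending
    burn_pct: int                       # 2 in the default, queue-skipping form
    split_with_sequencer: bool = False  # True once mutually converted

def alice_creates_tx() -> Tx:
    # 98% to Bob, 2% burned unless Alice and the sequencer agree to split it.
    return Tx(sender="alice", recipient="bob", amount=100, burn_pct=2)

def sequencer_issues_receipt(tx: Tx, deadline_block: int) -> str:
    # The receipt: the sequencer attests the tx will be in the state by `deadline_block`.
    return sign("sequencer_key", f"{tx}|included_by={deadline_block}")

def mutually_convert(tx: Tx, receipt: str, alice_ok: bool) -> Tx:
    # Once Alice holds the receipt, she countersigns so the burned 2%
    # becomes 1% to the sequencer and 1% back to her.
    if alice_ok and receipt:
        tx.split_with_sequencer = True
    return tx

tx = alice_creates_tx()
receipt = sequencer_issues_receipt(tx, deadline_block=1_000_100)
tx = mutually_convert(tx, receipt, alice_ok=True)
# Alice now forwards `receipt` to Bob as the instant payment guarantee.
```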
This game can be enforced within the rollup's state transition function (i.e., in the circuit for a ZK rollup). We do this by having only three valid kinds of transactions:

- Any transaction from the queue
- Queue-skipping transactions that burn 2%
- Queue-skipping transactions that would have burned 2%, but where the sender and sequencer mutually decided to split the 2% between them

## Infinite Scaling

The conventional method to scale data for rollups is DA sampling. This introduces an additional m-of-n security assumption to withdraw funds. This may be the best solution for a Turing complete, VM-based rollup, because you may need the whole state to prove ownership of your funds. However, for the payments rollup described below, a user can simply use their receipts to prove ownership, while relying only on L1.

### UTXO Payment Rollups

[The UTXO model](TODO), popularised by Bitcoin, is a model for digital currencies. Valid tokens are called UTXOs (unspent transaction outputs), and they can either be minted (i.e., by mining), or created by spending previous UTXOs.

[::]: <> (TODO: show a diagram of the UTXO model)

We can succinctly represent the state of a UTXO payment rollup as two Merkle trees. The first tree contains transactions, the second contains spent transactions. To make a payment, you must prove that you have a transaction in the transaction tree that *isn't* in the spent transaction tree, then the two trees are updated accordingly.

[::]: <> (TODO: diagram of the two tree model)

This is a reasonable model for our scaled rollup. To update the state, the sequencer posts the roots of both trees (probably hashed together to save gas), and a ZK proof showing that it applied valid transactions to create the new state. However, these Merkle trees are insufficient as an escape hatch. In order to make a withdrawal, the user would have to prove membership in the transaction tree, and non-membership in the spent transaction tree. The problem is that the roots change as transactions are processed, and earlier Merkle proofs are invalidated. This is a particularly big problem for non-membership in the spent transaction tree.

There is a generalisation of Merkle trees called [accumulators](TODO). [Some accumulators](https://eprint.iacr.org/2020/777.pdf) allow everyone to update their proofs with the same data, but due to a [well known impossibility result](TODO) the amount of data required for this is linear in the number of changes. To achieve arbitrary scaling with this method we'd have to make that data available, probably through a data availability committee, which defeats the purpose of this construction.

### Escape Pods

To add an escape hatch to the two tree model, we use escape pods. An escape pod is simply the Merkle root of all the valid transactions in the current block. When Bob's transaction is included in a block, he asks the sequencer for a proof that his transaction is in the escape pod; we call this a ticket for the escape pod. When the rollup shuts down, he uses his ticket to claim his funds, which are given to him once everyone has had the chance to make their claims.

If the sequencer doesn't give Bob his ticket, he can use an onchain queue to request it with his receipt. If the sequencer doesn't provide a correct ticket in time, they are slashed. More complex game theoretic mechanisms could be added where the sequencer earns a fee for each ticket, but the simple queue mechanism is probably sufficient: the sequencer should simply answer the ticket request offchain, to avoid having to pay gas to dequeue it.
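As a concrete illustration, here is a minimal Python sketch of an escape pod as a Merkle root and a ticket as an inclusion path. SHA-256 and the pair-hashing layout are arbitrary illustrative choices, not the rollup's actual hash function or tree layout.

```python
# Minimal sketch: the escape pod is the Merkle root over this block's
# transactions, and Bob's ticket is a Merkle inclusion path against it.

import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def build_pod(tx_hashes: list[bytes]) -> list[list[bytes]]:
    """Return all Merkle layers; layers[-1][0] is the escape pod root."""
    layers = [tx_hashes[:]]
    while len(layers[-1]) > 1:
        prev = layers[-1]
        if len(prev) % 2:                      # duplicate last node if odd
            prev = prev + [prev[-1]]
        layers.append([h(prev[i] + prev[i + 1]) for i in range(0, len(prev), 2)])
    return layers

def ticket(layers: list[list[bytes]], index: int) -> list[bytes]:
    """Merkle path for the transaction at `index` (Bob's ticket)."""
    path = []
    for layer in layers[:-1]:
        layer = layer if len(layer) % 2 == 0 else layer + [layer[-1]]
        path.append(layer[index ^ 1])          # sibling at this level
        index //= 2
    return path

def verify_ticket(tx_hash: bytes, index: int, path: list[bytes], root: bytes) -> bool:
    node = tx_hash
    for sibling in path:
        node = h(node + sibling) if index % 2 == 0 else h(sibling + node)
        index //= 2
    return node == root

txs = [h(f"tx{i}".encode()) for i in range(5)]
layers = build_pod(txs)
root = layers[-1][0]                           # posted onchain with the state update
assert verify_ticket(txs[3], 3, ticket(layers, 3), root)
```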
### LoW (Lots of Wei)

What if I get an escape pod ticket, then spend my tokens? I would be able to claim tokens I've already spent! What if I spend tokens to an address I own and get tickets for both? Then I'd be able to double-spend by withdrawing the same token with both tickets.

The simple solution is to number every individual token. Every ticket specifies which particular tokens it claims, and the escape pods are numbered in time. Tickets for later escape pods supersede earlier ones, and if I am the latest owner of a token, I know that no one can produce a ticket for a later escape pod than mine.

This can be thought of as an alternative to the UTXO model. The balances of all users are represented as an ordered list of tokens and the users who own them. When money is deposited in the rollup, new tokens are minted at the end of the list. When tokens are withdrawn from the rollup, they are added to a Merkle tree of unusable tokens. It's important that the unusable tokens are specified onchain during normal withdrawals, as the list of withdrawn tokens is crucial to prevent double spending during shutdown.

[::]: <> (diagram showing "user x owns tokens 0-154, user y owns tokens 155-800, ..., tokens 1590-1778 are unspendable, ..., total tokens 2999234882")

The LoW model is not as naturally amenable to efficient data structures like Merkle trees as the UTXO model. But note, we are not particularly compute constrained, as the computation of new updates only needs to be done on a single machine. We do have some limitations because the computation needs to be done in a ZKP, but this is a tractable engineering problem.

One basic idea for implementing LoW is as a [sparse Merkle tree](TODO). Each entry in the tree is `[start_token, end_token, address]`, and if our rollup can handle `2^256` [wei](TODO) in its lifetime, and Ethereum addresses are 160 bits long, the Merkle proofs for normal operation are `256 + 256 + 160 = 672` elements long. This is fine since none of these proofs ever go directly onchain, and with [folding](TODO) or [recursion](TODO) and [SNARK friendly hash functions](TODO) this could be quite efficient. The major cost is probably in verifying traditional ECDSA signatures in the SNARK to prove ownership of coins. For this, we can hopefully use [spartan-ecdsa](TODO).

### Shutdown

Shutdown occurs when the sequencer fails to perform its duties and is slashed. The shutdown period has two phases: claiming and resolution.

The claiming phase lasts long enough for everyone to reasonably claim their funds; somewhere between 2 weeks and 2 months seems reasonable. During that time, the only action available on the rollup is to add claims to an onchain queue. These claims include the transaction ID, Merkle proof, and escape pod number. An optimised version would use an accumulator with short constant sized membership proofs, like [KZG](TODO).

TODO: is KZG really a constant sized proof? Also, since it needs a trusted setup of fixed size, is it acceptable for potentially arbitrarily sized escape pods?

During resolution, we determine who owns which tokens and send them to the appropriate people. This can be done in a smart contract, but should probably be done optimistically or in a SNARK. In a SNARK, for example, we would write a circuit that accepts the list of withdrawn funds, the list of escape pods, and the list of claims, and outputs who is owed how much. The contract would verify the SNARK, make sure the inputs to the SNARK match the actual onchain values, and then send the funds.
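To illustrate the resolution rule, here is a toy Python sketch that applies "later pods supersede earlier ones" and excludes withdrawn tokens. A real implementation (or SNARK circuit) would work over token ranges rather than individual ids, and the `Claim` fields are assumptions for the example.

```python
# Toy sketch of shutdown resolution: token ranges are claimed with an escape
# pod number, later pods supersede earlier ones, and withdrawn tokens can
# never be claimed. We walk individual token ids purely for clarity.

from dataclasses import dataclass

@dataclass
class Claim:
    pod: int          # escape pod number (time ordered)
    start: int        # first token id claimed (inclusive)
    end: int          # last token id claimed (inclusive)
    owner: str

def resolve(claims: list[Claim], withdrawn: set[int]) -> dict[str, int]:
    """Return how many tokens each owner is owed after shutdown."""
    latest: dict[int, tuple[int, str]] = {}   # token id -> (pod, owner)
    for c in claims:
        for token in range(c.start, c.end + 1):
            if token in withdrawn:
                continue                      # already left the rollup
            if token not in latest or c.pod > latest[token][0]:
                latest[token] = (c.pod, c.owner)
    owed: dict[str, int] = {}
    for _, owner in latest.values():
        owed[owner] = owed.get(owner, 0) + 1
    return owed

claims = [
    Claim(pod=7, start=0,  end=99, owner="alice"),   # alice's old ticket
    Claim(pod=9, start=50, end=99, owner="bob"),     # bob received 50-99 later
]
print(resolve(claims, withdrawn={0, 1, 2}))
# {'alice': 47, 'bob': 50} -- bob's later pod supersedes alice on 50-99
```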
The proving should be decentralised. The simplest way to do this is to allow anyone to produce a proof, and give them a small reward. This results in a race, which may waste overall effort, but is simple. To make sure the proof is eventually provided, the reward could gradually increase with the block number.

### Security Assumptions

Let's return to Alice and Bob. What exactly do they know about their funds during their transaction?

#### Finality

Alice signs a transaction and sends it to the sequencer. The sequencer returns a receipt signing her transaction. Alice gives the receipt to Bob.

*Bob knows that his payment will be included in the next block, assuming the sequencer stays online and doesn't want to be slashed with the receipt.*

Bob waits until the next block of L2 transactions. Bob asks the sequencer for a proof that his transaction is included in the most recent escape pod. The sequencer returns the proof.

*Bob knows that he can trustlessly withdraw his funds, no matter what the sequencer does.*

#### Instantaneity

Alice signs a transaction and sends it to the sequencer. The sequencer doesn't reply. Now, to make sure the transaction goes through, Alice uses the onchain queue.

*By looking at the queue, Bob knows his payment will be in the next block, assuming the sequencer stays online.*

So instantaneity is not guaranteed. Alice merely expects an instant response, because otherwise the sequencer has to pay gas to respond to the onchain queue. The fallback is confirmed at the speed of a normal L1 transaction.

### Graceful Shutdown

If the rollup is no longer profitable, or a far more efficient new design appears, it would be useful to shut the rollup down safely. The exit game puts a lot of responsibility on users to hold onto their receipts and make the appropriate claims. It's likely that some users will not receive all the funds they're owed - perhaps they forgot about them, or they were on a long holiday during the claiming period, etc. Instead, there should be a function that lets the sequencer return all funds to their owners on L1. This should not be immediately callable, as it breaks assumptions about instantaneity. Instead, the sequencer should have to declare shutdown well in advance, so that people have time to disable payments in their stores etc. and move to the newer system. This is essentially an upgrade mechanism that allows fully static and reliable contracts.

## Further Work

### Soft Censorship

The sequencer can't entirely censor transactions in this design, but it can "softly" censor by forcing the user to use the onchain queue. This is not sensible for a rational actor unless the sequencer gains something by censoring a particular user (suppose they're being paid or threatened to censor someone). One possible solution is to send a commitment to a transaction, rather than the transaction itself. The sequencer would sign the commitment, and once the user has the receipt they can reveal the transaction. Further work is needed to make sure the game theory still works, and to see whether the commitment needs to be accompanied by a ZKP proving that it's a commitment to a valid transaction. Moreover, there is a griefing attack where the user gets the sequencer to commit to including an invalid transaction, which means the sequencer has to pay gas to resolve the transaction without earning a fee. This should probably be solved in the slashing game, i.e., when the user tries to slash the sequencer for not including a transaction, the sequencer has a window to respond that the transaction was invalid. It would be even better to make all transactions entirely private, a la Tornado Cash etc.
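Here is a minimal sketch of the commit-reveal idea, assuming a salted hash commitment and treating the sequencer's signature as a stand-in string; whether the commitment must also be accompanied by a validity ZKP is the open question above.

```python
# Minimal sketch of commit-reveal for soft censorship: the user asks the
# sequencer to sign a hiding commitment, and only reveals the transaction
# once they hold the receipt.

import hashlib
import os

def commit(tx_bytes: bytes) -> tuple[bytes, bytes]:
    """Return (commitment, salt); the random salt keeps the commitment hiding."""
    salt = os.urandom(32)
    return hashlib.sha256(salt + tx_bytes).digest(), salt

def sequencer_sign_commitment(commitment: bytes, deadline_block: int) -> str:
    # Stand-in for a real signature: the sequencer promises to include
    # whatever transaction this commitment opens to, by `deadline_block`.
    return f"sig(sequencer, {commitment.hex()}, by={deadline_block})"

def reveal_is_valid(commitment: bytes, salt: bytes, tx_bytes: bytes) -> bool:
    return hashlib.sha256(salt + tx_bytes).digest() == commitment

tx = b"alice pays bob 1 ETH"
c, salt = commit(tx)
receipt = sequencer_sign_commitment(c, deadline_block=1_000_100)
# Only now does Alice reveal (tx, salt); the sequencer could not have
# selectively censored based on the transaction's contents.
assert reveal_is_valid(c, salt, tx)
```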
### Consolidation

#### Swap Transaction

Unfortunately, the cost to claim ownership during shutdown increases linearly with the number of separate coin ranges you want to claim. One solution is to create a new type of transaction where people can trade coins to create contiguous ranges. There is a subtlety - we don't want to regularly invalidate our tickets, risking loss of all funds, just to save worst case gas costs. Instead we can make a new transaction type, where the entire pod is built ahead of time, and can only be executed *when the new escape pod tickets are signed* by their owners. This means that the owners definitely have their new escape pod tickets before the pod goes on chain.

One could imagine a market based mechanism, where users set their consolidation preferences (i.e., the fee they're willing to pay to consolidate their coins), and the software automatically interacts with collectors (including the prover, probably), who propose consolidations and collect fees. A market is useful here, because if you were relying only on the prover to consolidate coins, the prover would have the power to dictate the fees.

#### Dimensionality

The version described above essentially imagines all coins on a long line, where tickets specify ranges along that line. Instead, we can use a higher dimensional object like a plane, or a volume, or, in the most extreme case, a boolean hypercube. For a toy example, suppose we have bitstrings of length 4, representing 2^4 = 16 total coins in the whole system. The two numbers in our "range" actually specify the corners of a subcube in 4 dimensions. I.e., [0000, 1111] specifies the whole cube, and [0110, 0110] specifies only the coin at 0110. This should give a lot more freedom to construct succinct tickets from diverse collections of coins. In particular, it avoids being blocked by a single immovable coin. More work is needed to characterise the tradeoffs of different dimensionalities, and explore downsides.

For example, consider a line where Alice owns coins 100-200 and 300-400, and Bob won't give up 200-300. Alice needs 4 numbers to represent her range. In fact, no one can ever represent the combined claims of 100-200 and 300-400 without including Bob's coins. However, if we add another dimension we can work around Bob. Let's use a diagram. Say there are 10 coins in total, and they're owned by Alice, Bob, and Charlie.

```
1  C C B C C
0  A A B A A
   0 1 2 3 4
```

In the above situation, if Bob doesn't cooperate, both Alice and Charlie need 2 tickets to represent their claims. I.e., Charlie has the rectangles [[0,1], [1,1]] and [[3,1], [4,1]]. However, they can cooperate to simplify their tickets.

```
1  A A B C C
0  A A B C C
   0 1 2 3 4
```

Now Charlie can represent all his coins with the single ticket [[3,0], [4,1]]. The open question is: does our freedom improve even more as we add dimensions? Note, we hold the total number of coins fixed, so that the comparison between dimensionalities is fair.
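As a toy illustration of higher-dimensional tickets, the sketch below treats a ticket as an axis-aligned box given by two corners and checks coverage coordinate-wise. The grid and coin coordinates mirror the Alice/Bob/Charlie diagrams above and are purely illustrative.

```python
# Toy sketch: a ticket is a box [lo, hi], and a coin is claimable iff it lies
# inside the box in every dimension.

def in_ticket(coin: tuple[int, ...], lo: tuple[int, ...], hi: tuple[int, ...]) -> bool:
    """A coin is covered iff lo[d] <= coin[d] <= hi[d] in every dimension d."""
    return all(l <= c <= h for c, l, h in zip(coin, lo, hi))

def covers_exactly(coins: set, lo, hi, universe: set) -> bool:
    """True iff the box [lo, hi] covers `coins` and nothing else in `universe`."""
    covered = {c for c in universe if in_ticket(c, lo, hi)}
    return covered == coins

# After cooperating with Alice, Charlie owns the 2x2 block on the right,
# so a single ticket [[3,0], [4,1]] covers exactly his coins:
universe = {(x, y) for x in range(5) for y in range(2)}
charlie = {(3, 0), (4, 0), (3, 1), (4, 1)}
assert covers_exactly(charlie, lo=(3, 0), hi=(4, 1), universe=universe)

# On a 1D line, no single interval covers 100-200 and 300-400 without
# also sweeping up Bob's coins in 200-300:
alice = {(x,) for x in range(100, 201)} | {(x,) for x in range(300, 401)}
assert not covers_exactly(alice, lo=(100,), hi=(400,),
                          universe={(x,) for x in range(500)})
```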
### Fees

How do we determine the fees? In blockchains we normally have markets to determine this, but here we have just one prover, who has monopoly power but can be constrained.

### Comparison To Lightning

Pro: far lower liveness requirement. Lightning requires users to be vigilant so that their counterparties don't falsely close the channels. This ties the withdrawal time to the vigilance requirement. In our system, by contrast, counterparties can't do a false withdrawal in normal operation, only during shutdown. This ties the vigilance period to the shutdown period, which can be on the order of months.

Pro: arbitrary payment sizes. In Lightning, you can only own up to the amount held in your state channels. Therefore you can only receive the amount currently owned by your counterparties. In our system you can receive any amount that's locked in the entire system.

Pro: minimal capital requirement. As explained above, Lightning needs large deposits in state channels to process large transactions. This ties up capital. This effect is probably not huge, as there is likely to be a large amount of unused capital on the Lightning network anyway, but it's still an effect! By contrast, our system only requires the central server to stake enough capital that it would be highly painful to lose it. This should be an extremely small fraction of the total funds processed, and indeed doesn't need to scale with the funds processed, only with the wealth of the server, which should be proportional to its income, which is in turn proportional to transaction fees.

Pro: latency is probably slightly better, as every transaction on Lightning requires routing and negotiation between multiple parties, whereas our system only requires cooperation between the server and the user. Moreover, our system is highly optimisable in a web2 style. For example, a user can probably finalise a payment just by talking to an edge server (i.e., a server owned by the prover in their geographic region), provided that responsibility for various users is carefully distributed across the cloud.

Con: monopoly fee pricing power. Lightning is decentralised, which means that operators have to compete with one another for fees, which lowers the fees (in principle) to an efficient market rate. The prover in our system is also subject to economic pressure, but only in the form of the escape hatch, and competing payment systems at an abstract level (i.e., the prover may have to undercut Visa and compete with Lightning as part of a wider business strategy).

### Comparison to other low fee systems

| | State Channels | DA Rollup | This |
| ----------- | ----------- | ----------- | ----------- |
| Escape Hatch | ✅ Individual | ❌ m-of-n | ✅ Individual |
| Finality | ✅ Instant | ❌ L1 | ✅ Instant |
| Capital Lockup | ❌ O(throughput) | ✅ O(1) | ✅ O(1) |
| Payment Limit | ❌ Channel Balance | ✅ None | 🦺 ~8 ETH/TX |
| Theft Checkup | ❌ ~Daily | ✅ Never | 🦺 ~Monthly |

### Summary of Guarantees

- Most transactions are processed offchain: because the prover wants to collect fees and avoid paying for gas.
- Transactions can't be censored: because users can use the onchain queue.
- Transaction promises are trustworthy: because the prover doesn't want to be slashed.
- The escape hatch is safe: because users have proof that they were the most recent owners of their coins.

<!---

### Freeze Attack

The prover can participate as a user, and there's nothing we can do to prevent that as long as participation is open, which it should be. So the prover can give convincing receipts, then stop producing state roots, and falsely claim the money that they spent, since the recipient can't have an escape pod ticket until the next state root is posted.
This means that the prover can potentially steal all the money in transit (i.e., where the escape ticket hadn't been given yet), which is at least the throughput per block. The most basic way to improve security here is to increase the number of blocks, but of course that increases gas costs. Unfortunately, there's no way of making sure the money in transit makes it to the "real" recipient, as we can't distinguish between txs signing the money to different recipients.

If we can guarantee that money in transit will be burned if the rollup stops, then the prover can't use this kind of attack. We can tell if money was in transit by resolving the escape pods, then having another period where the recipient can provide a signature spending that UTXO, and thus can burn it. However, the recipient's incentive is to hold on to the signature as a threat and negotiate with the prover (or other previous owner) to recover some of the money for themselves.

We should instead create an incentive for the recipient to distribute the tx *before* the prover stops. We can have a permissionless group of staked "witnesses" who earn a small fee by signing transactions in transit. We can require, on pain of slashing, that they post a root of all the witnessed transactions from the current block, and a ZKP showing which escape pods are invalid. If they provide the wrong root a user can slash them, and should earn a reward for that. Ideally, the reward is greater than what the user could earn by negotiating with the witness, which puts a limit on tx values. In principle, even with a limit, a user could have many txs with the same witness, then negotiate to not slash the witness, which gives the prover some stolen funds. This can probably be mitigated with some kind of random witness selection (TODO). We incentivise the recipient to have a witness by burning some money if they don't.

The trouble with random witness selection is that, firstly, we don't want to make the witnesses a censorship bottleneck, and secondly we want to make sure there's nothing the sender and recipient can jointly do to avoid sending their tx to a witness that neither owns or influences. Let's assume 50% of the witnesses aren't owned by the prover or recipient and there is reliable onchain randomness. The prover can have as many addresses as it likes and distribute its money however it likes too. Suppose the chain randomly chooses a subset of witnesses for each sender address (which can probably be uniquely determined by a single nonce). The first question is: what is the chance that the prover owns all `n` witnesses in a randomly chosen subset? Also, it'd be better to not let any particular witness be a bottleneck, so we should let an `m` of `n` subset of the randomly chosen set be sufficient. What's the chance that the prover owns m-of-n randomly chosen witnesses? Then we should ask how many addresses we need before we can have a reasonable chance (say, 5%?) of having one address where we own all witnesses.

Actually, this whole notion is broken. If TXs in transit are invalid, then the prover has unilateral power to burn all TXs in the next block, which can be used in an extortion attack. Therefore we should just make it clear that instant finality isn't total! And that you can follow up with people later! Thus, there might in fact, unfortunately, be a little bit of underwriting necessary for merchants to produce a guarantee.
But the opportunity cost of capital is very low, as we only have to provide enough to cover the TXs in the next block, which might be as much as an hour's worth of payments.

--->

<!---

### Rules

#### Server Alive

- Server must respond to a TX request with a promise to include the TX
- Server must prove execution of TXs every `x` blocks
- Server must provide a Merkle root of an escape pod
- Server must prove that it included each promised TX
- Server must give an escape pod ticket for every coin

#### Server Dead

- Everyone posts their tickets on-chain for a month
- Old tickets are invalidated when new tickets come along
- To claim a ticket, you must prove membership in an escape pod (i.e., a Merkle tree)
- Claims are frozen
- Everyone claims their money

-->