# From Offchain to Client-Side: The Scaling Limit of Blockchains

Scalability is an eternal topic for blockchain systems. Across several cycles of market booms and busts, new projects emerge in each bull cycle with higher-performance blockchains as their selling point. Although many achieve this by sacrificing censorship resistance, resistance to single points of failure, and decentralization, their popularity underscores user demand for better performance.

Designing a high-performance, monolithic blockchain with rich state often seems to lead inexorably toward a centralized ordering layer and nodes with extremely high hardware requirements. But if we focus on a blockchain dedicated purely to payments, and refuse to compromise on decentralization, what is the maximum possible TPS? Given that stablecoin transfers still make up a major share of blockchain activity (consider PayFi, for instance), the question remains highly relevant.

Moreover, in the era of AI agents, the demand for asset flows between agents may be tens of thousands of times that of humans. Agents are limited only by computing power and network bandwidth, whereas carbon-based humans face various physiological and physical constraints. If we want to meet AI-to-AI requirements for asset flow, current transfer systems must become even faster.

The first question to analyze is: for a blockchain system dedicated to payments, where are the bottlenecks? Is it transaction execution, network communication, or exploding state size? For simple transfers, most ordinary machines today can already execute transactions at very high TPS, so execution is not the constraint. Therefore, I believe the two primary issues are network communication and state size:

- **Network communication**: The key is reducing the number of bytes each transaction contributes to DA (Data Availability) synchronization across the entire network.
- **State size**: The key is minimizing the on-chain space that each user’s information occupies. Hence, we need stateless blockchains or client-side state.

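
To make the network-communication bottleneck concrete, here is a minimal sketch of the throughput ceiling when DA bandwidth is the only constraint. The function name `da_bound_tps` is hypothetical, and the block size, block time, and per-transaction byte counts are figures chosen for illustration, not measurements of any particular chain.

```python
def da_bound_tps(block_bytes: int, block_time_s: float, bytes_per_tx: int) -> float:
    """Upper bound on TPS if publishing transaction bytes is the only limit."""
    if block_time_s <= 0 or bytes_per_tx <= 0:
        raise ValueError("block_time_s and bytes_per_tx must be positive")
    bytes_per_second = block_bytes / block_time_s
    return bytes_per_second / bytes_per_tx


# Illustrative assumptions: a 1 MiB block every 12 seconds.
BLOCK_BYTES = 1024 * 1024
BLOCK_TIME_S = 12

# A plain transfer (assumed 109 bytes) versus a compressed one (assumed 14 bytes).
plain_tps = da_bound_tps(BLOCK_BYTES, BLOCK_TIME_S, 109)
compressed_tps = da_bound_tps(BLOCK_BYTES, BLOCK_TIME_S, 14)

print(f"plain: {int(plain_tps)} TPS, compressed: {int(compressed_tps)} TPS")
```

The bound scales inversely with bytes per transaction, which is why every byte removed from the DA footprint translates directly into headroom.
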
Next, I will explore some existing solutions in the market and share a few ideas for designing a payment-focused blockchain system.

------

## Hermez 1.0

Hermez 1.0 was a zkRollup system dedicated to payments. Although it has been decommissioned and the Hermez team was acquired by Polygon to develop zkEVM, studying Hermez sheds light on the performance limits of zkRollups for payment-specific use cases.

![bytesbreakdown2](https://hackmd.io/_uploads/r1PE_fuUJg.png)

A typical ETH transfer requires at least 109 bytes, and stablecoin transfers require even more. Hermez 1.0, however, manages to reduce that to just 14 bytes per transaction. Because the signature is verified inside the zk proof, it does not need to be published to DA, which alone saves 65 bytes.

Let’s assume a block size of 1 MB (1,048,576 bytes) and a 12-second block time. This equates to a throughput of 87,381 bytes per second. With each Hermez transaction being only 14 bytes, the theoretical limit is 6,241 TPS. Of course, we still need to account for zk proof generation time and various delays. According to Hermez’s claims, its testnet achieved around 2,000 TPS in real-world conditions.

Since this approach already strips each transaction down to very few bytes, we can take the upper bound for a non-client-side, payment-only zkRollup that preserves a certain degree of decentralization to be on the order of a few thousand to around ten thousand TPS.

------

## RGB and Taproot Assets

Shifting our focus to the BTC ecosystem, RGB stands out as the pioneer of Client-Side Validation (CSV), offering a new perspective: let most of the work be done client-side, thus minimizing on-chain traces.

The core ideas behind RGB are Client-Side Validation and Single-Use Seals. Essentially, an off-chain state is tied to on-chain UTXOs. Because a UTXO can be spent only once, double-spends are prevented. Meanwhile, the client receiving an RGB asset must independently validate the entire historical transaction chain of that asset.
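
The recipient-side check described above can be sketched as a toy model. This is not RGB’s actual data structure or API: `Transition`, `validate_history`, and the string outpoints are hypothetical, and the on-chain commitment to each transition is reduced to a plain set of spent outpoints.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transition:
    closes: str  # outpoint of the UTXO (seal) spent by this transfer
    opens: str   # outpoint of the UTXO (seal) that holds the asset afterwards


def validate_history(genesis_seal: str,
                     history: list[Transition],
                     spent_on_chain: set[str]) -> bool:
    """Walk the whole off-chain history, as a receiving client would."""
    current = genesis_seal
    closed = set()
    for t in history:
        if t.closes != current:
            return False  # transfer does not spend the seal holding the asset
        if t.closes in closed:
            return False  # a seal may be closed only once
        if t.closes not in spent_on_chain:
            return False  # the seal was never actually spent on-chain
        closed.add(t.closes)
        current = t.opens
    # The final seal must still be unspent for the recipient to own the asset.
    return current not in spent_on_chain


history = [Transition("utxo:0", "utxo:1"), Transition("utxo:1", "utxo:2")]
print(validate_history("utxo:0", history, {"utxo:0", "utxo:1"}))
```

Note that the work grows with the length of the history, since every earlier transfer has to be rechecked by each new recipient.
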
Through this, RGB ensures the transaction sequence is double-spend-resistant and valid.

Taproot Assets is another CSV asset protocol. It builds on Taproot’s Taptree, with modifications to better support asset issuance and transfer, but its fundamental principle is still CSV plus Single-Use Seals.

Compared to Inscriptions or Ordinals, RGB has a smaller on-chain footprint and offers better privacy. Other than the parties involved in a transaction, no one can see from the chain how UTXOs link to off-chain states. But once a transfer occurs, the receiving party can view the transaction history of the entire chain of custody.

That said, one limitation of such assets is that each off-chain transaction must map 1:1 to an on-chain transaction, meaning the protocol only adds asset-issuance functionality to Bitcoin without actually providing scalability benefits. The only advantage is that it can represent more complex states without occupying much on-chain space.

Looking ahead, however, CSV opens the door to a new direction in scalability, one we will revisit when discussing further solutions.

------

## Shielded CSV

[ShieldedCSV/ShieldedCSV](https://github.com/ShieldedCSV/ShieldedCSV/releases/tag/2024-09-20)

Shielded CSV extends the privacy of RGB. Not only are outsiders unable to see the transfer chain on-chain, but even the recipient of a transaction cannot see earlier transactions in that chain. It uses recursive zero-knowledge proofs so that the recipient, by verifying a single zk proof, can confirm the validity of the entire transaction history.

Like other zk-privacy solutions, Shielded CSV maintains an ever-growing nullifier set; all nodes must listen to the entire network for nullifiers to avoid accepting invalid transactions. Because each node in Shielded CSV needs to monitor the BTC network in real time, it is less practical.
However, a major takeaway here is that by employing recursive zero-knowledge proofs, a CSV protocol can verify the validity of the entire transaction chain using only a single recursive proof rather than the entire chain of transactions. Indeed, RGB is currently exploring this approach ([STARKy RGB: making RGB compatible with zk-STARKs · RGB-WG · Discussion #265](https://github.com/orgs/RGB-WG/discussions/265)).

------

## Payment Channel

Payment channels were arguably the first Layer 2 solution, and the Lightning Network has long been seen as a potential scaling savior for Bitcoin. Yet its adoption has been slow.

The mechanism is simple: two parties lock funds into a multi-signature address, and for a period, all transactions happen off-chain between those two parties. Various game-theoretic mechanisms allow any transaction to be settled immediately, enabling near-instant finality.

With HTLCs and PTLCs, payments can be routed across channels. If A-B and B-C each have a channel, B can relay transactions between A and C even though A and C have no channel with each other.

In some sense, I think payment channels are very well-suited for transactions between AI agents. Their latency is extremely low, and speed depends only on network bandwidth and local processing power.

However, current payment channels still face problems that hamper adoption. Take the Lightning Network, for example. Its core issues include:

- **Inbound liquidity**: When two parties open a channel, the liquidity locked in determines each party’s maximum receivable amount. If A and B open a channel funded with 0.01 BTC from A and 0.005 BTC from B, then A can receive at most 0.005 BTC from B, while B can receive at most 0.01 BTC from A.
- **Capital efficiency**: A new node on the Lightning Network may not have any inbound liquidity at all. Providing inbound liquidity requires others to lock their funds in that channel, and without an incentive, few will do so.
- **Rebalancing**: Real-world payment flows are rarely balanced.
Some nodes primarily receive, others primarily send, and eventually the network becomes unbalanced. Rebalancing often requires on-chain transactions, each of which may consume hundreds of bytes of on-chain data availability, making the user experience and cost far from ideal.

To address this, one approach is economic incentives, such as UTXOStack’s ongoing work. By using staking, idle funds can be channeled into the Lightning Network to earn fees. Another approach is combining payment channels with other compression schemes to dramatically reduce the cost of adjusting and rebalancing channels.

------

## IntMax2

IntMax2 is an intriguing project that combines many of the aforementioned techniques, including client-side state, recursive zero-knowledge proofs, and zkRollup. Users maintain a local balance set and the associated validity proofs. When sending a transaction, they construct a validity proof from their own balances. Upon receiving, the other party updates their balances with the relevant proofs.

![image](https://hackmd.io/_uploads/Hy05uMOU1l.png)

In theory, IntMax2 can achieve high TPS. However, one significant downside is that a single transfer involves too many rounds of interaction (between sender and block builder, and between sender and recipient), which hurts its usability in practice.

------

## Polygon Miden

Compared to IntMax2, Polygon Miden is a more comprehensive zkRollup. It integrates client-side proving and a hybrid UTXO and account-based state model, addresses state explosion, and maintains transaction privacy.

Polygon Miden features two types of state representation: **Accounts** and **Notes**. An **Account** is similar to an EVM contract account with callable interfaces. A **Note** carries a script, much like a UTXO. In each transaction, an Account can consume some Notes and produce new ones. Communication between Accounts can only happen via Notes, which is asynchronous and somewhat resembles the Actor Model.
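
The Account and Note interaction can be pictured with a toy model. This is only a sketch of the message-passing idea, not Miden’s API: the `Account` and `Note` classes, their fields, and the `consumed` set (a stand-in for whatever mechanism prevents a Note from being consumed twice) are all hypothetical, and no proofs or scripts are modeled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Note:
    note_id: int
    recipient: str
    amount: int


@dataclass
class Account:
    name: str
    balance: int = 0

    def send(self, to: str, amount: int, note_id: int) -> Note:
        """First transaction: the sender debits itself and produces a Note."""
        if amount <= 0 or amount > self.balance:
            raise ValueError("invalid amount")
        self.balance -= amount
        return Note(note_id, to, amount)

    def consume(self, note: Note, consumed: set[int]) -> None:
        """Second, later transaction: the recipient consumes the Note."""
        if note.recipient != self.name:
            raise PermissionError("note is addressed to another account")
        if note.note_id in consumed:
            raise ValueError("note already consumed")
        consumed.add(note.note_id)
        self.balance += note.amount


consumed_notes: set[int] = set()
alice, bob = Account("alice", 100), Account("bob")

note = alice.send("bob", 30, note_id=1)  # Alice's transaction
bob.consume(note, consumed_notes)        # Bob's transaction, at any later time
print(alice.balance, bob.balance)
```

The two accounts never call each other directly. The Note is the only thing that crosses between them, which is what makes the model asynchronous.
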
There are three trees in Polygon Miden’s state:

1. **Account Tree**: An SMT (sparse Merkle tree) that maintains the set of existing accounts. For on-chain accounts, all state data is held in the zkRollup’s state so that the entire network can execute against it. Off-chain accounts, by contrast, exist on-chain only as a single account hash, and all details are stored on the client. In other words, no matter how many assets an off-chain account holds, it occupies only one hash on-chain.
2. **Notes Tree**: A Merkle Mountain Range storing all created Notes.
3. **Nullifier Tree**: An SMT storing all consumed Notes. A similar nullifier concept is seen in many UTXO-based privacy blockchains.

Through this heterogeneous state model, Polygon Miden allows users who want privacy and low cost to use off-chain accounts exclusively. A user can create a Note to transfer tokens to another account. The receiving account then consumes that Note to add the asset to its own balance. During this process, the sender’s account data, the recipient’s account data, and the Note content remain hidden from the zkRollup. The only visible changes are a modified account hash in the Account Tree and a new entry in the Notes Tree.

![note-life-cycle](https://hackmd.io/_uploads/rJbnOz_Iyx.png)

Because the entire execution is compressed by recursive zero-knowledge proofs, on-chain DA is minimized. Hence, Polygon Miden achieves smaller on-chain overhead via recursive zk proofs and less on-chain state via client-side state storage.

------

## Client-Side zkRollup for Channels

From the above analysis, we see that recursive zkRollups can compress the **space dimension** of transactions, spreading the on-chain DA overhead across many transactions, while payment channels can compress the **time dimension** of transactions, allowing many transactions over a period to share the same on-chain DA overhead.
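
A back-of-the-envelope comparison shows how the two kinds of amortization differ. The function names and every number below are hypothetical and chosen only to illustrate the shape of the trade-off.

```python
def rollup_da_per_tx(per_tx_bytes: float, batch_overhead_bytes: float,
                     txs_in_batch: int) -> float:
    """Space dimension: a fixed batch overhead is shared by all txs in the batch,
    but every tx still publishes its own compressed bytes."""
    return per_tx_bytes + batch_overhead_bytes / txs_in_batch


def channel_da_per_payment(open_bytes: float, close_bytes: float,
                           payments: int) -> float:
    """Time dimension: only opening and closing touch the chain,
    so their cost is shared by every payment made in between."""
    return (open_bytes + close_bytes) / payments


# Assumed figures: 14 bytes per tx plus 1,000 bytes of batch overhead over 1,000 txs.
rollup_cost = rollup_da_per_tx(14, 1_000, 1_000)

# Assumed figures: 300 bytes to open, 300 bytes to close, 10,000 off-chain payments.
channel_cost = channel_da_per_payment(300, 300, 10_000)

print(rollup_cost, channel_cost)
```

In the rollup case the cost per transaction has a floor set by the per-transaction bytes, whereas in the channel case it keeps shrinking as more payments flow through the same channel.
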
So, if we combine zkRollups and payment channels into a “Client-Side zkRollup for Channels,” might we achieve extreme scalability by compressing both the space and time dimensions?

Let’s outline a forward-looking design for such a blockchain system. It’s a UTXO-model Rollup with the following characteristics:

1. All UTXOs owned by each user are committed to in a per-user SMT. On-chain, there is only a `RootHash` for each user’s UTXO tree. Each user keeps their UTXO set locally. (An Account Tree that records all of a user’s assets within a tree would work similarly.)
2. A user can initiate a transaction to create a “floating UTXO” from their UTXO tree.
3. The floating UTXO’s script can have some programmable capabilities. Alternatively, with a circuit dedicated to payment channels, two users can use their floating UTXOs to construct a channel funding transaction and carry out Lightning-like payment channel operations.
4. If more funds are needed, a user can create a new floating UTXO and inject its liquidity into the channel. Thanks to recursive proof compression, as long as creating the floating UTXO and injecting it happen in the same batch, the intermediate floating UTXO does not occupy on-chain DA space.
5. When withdrawing funds, the multisig-controlled UTXO is split into multiple floating UTXOs, and each user adds the corresponding assets back into their own UTXO tree.
6. If we only care about payments and not maximal privacy, we need not design a Nullifier Tree.

Such a design minimizes on-chain DA usage while maximizing TPS. The only operations that truly consume on-chain DA are channel creation, funding, and exit. Because payment channels provide near-instant confirmation, the extra layer does not affect instant finality. In principle, this can also be built directly as a Layer 1 chain, similar to how Mina operates.

In this system, user UTXO sets and transactions are never fully stored on-chain; the chain only sees that certain account hashes changed, or that a floating UTXO was created and consumed.
Put differently, the chain sees only the **state diffs**, and within a single block, those state diffs are compressed.

By merging the strengths of recursive zkRollups and channels, we could potentially push the limits of blockchain scalability, achieving both minimal on-chain overhead and robust, instant finality for payments.
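
As a rough illustration of the design outlined in this section, the toy model below counts how often the chain is touched over the life of one channel. Everything in it is an assumption made for the sketch: `commit` hashes a sorted UTXO list as a stand-in for an SMT root, the `Chain`, `User`, and `Channel` classes are hypothetical, and signatures, zk proofs, batching, and dispute handling are left out entirely.

```python
import hashlib
from dataclasses import dataclass, field


def commit(utxos: dict[str, int]) -> str:
    """Stand-in for an SMT root: a hash over the sorted UTXO set."""
    data = ",".join(f"{k}:{v}" for k, v in sorted(utxos.items()))
    return hashlib.sha256(data.encode()).hexdigest()


@dataclass
class Chain:
    roots: dict[str, str] = field(default_factory=dict)  # user -> RootHash
    da_events: int = 0  # number of times anything was published on-chain

    def post_root(self, user: str, root: str) -> None:
        self.roots[user] = root
        self.da_events += 1


@dataclass
class User:
    name: str
    utxos: dict[str, int]  # kept client-side only

    def float_utxo(self, utxo_id: str, chain: Chain) -> int:
        """Take a UTXO out of the local tree; the chain sees only the new root."""
        amount = self.utxos.pop(utxo_id)
        chain.post_root(self.name, commit(self.utxos))
        return amount

    def absorb(self, utxo_id: str, amount: int, chain: Chain) -> None:
        """Put funds back into the local tree on channel exit."""
        self.utxos[utxo_id] = amount
        chain.post_root(self.name, commit(self.utxos))


@dataclass
class Channel:
    balances: dict[str, int]
    updates: int = 0  # off-chain payments, never published

    def pay(self, sender: str, receiver: str, amount: int) -> None:
        if amount <= 0 or self.balances[sender] < amount:
            raise ValueError("insufficient channel balance")
        self.balances[sender] -= amount
        self.balances[receiver] += amount
        self.updates += 1


chain = Chain()
alice = User("alice", {"a0": 50})
bob = User("bob", {"b0": 20})

# Funding: each side floats one UTXO into the channel.
channel = Channel({"alice": alice.float_utxo("a0", chain),
                   "bob": bob.float_utxo("b0", chain)})

# Many off-chain payments, none of which touch the chain.
for _ in range(500):
    channel.pay("alice", "bob", 1)
    channel.pay("bob", "alice", 1)
channel.pay("alice", "bob", 10)

# Exit: balances return to each user's local UTXO tree.
alice.absorb("a1", channel.balances["alice"], chain)
bob.absorb("b1", channel.balances["bob"], chain)

print(channel.updates, "payments,", chain.da_events, "on-chain updates")
```

In this sketch the number of on-chain updates depends only on funding and exit, not on how many payments pass through the channel in between.
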