# Trust and Upgradability

*Special thanks to [Cláudio Silva](https://x.com/claudioengdist), [Danilo Tuler](https://x.com/dtuler) and [Milton Jonathan](https://x.com/miltonjonat) for reviewing this article!*

*Hello, my name is Guilherme. I've been part of the on-chain team behind [Cartesi](https://cartesi.io/) Rollups since late 2021. Since then, one topic that has always sparked my interest and that of fellow Cartesi contributors is upgradability. What are the trust assumptions at play? How can we minimize them, while also taking into consideration other aspects such as implementation complexity, UX, infrastructure costs, and L1 fees? Let's dig in.*

## Introduction

![](https://hackmd.io/_uploads/rkRj3y-cR.jpg)

In the software industry, when a severe security vulnerability is reported and fixed, users are strongly encouraged to upgrade to the patched version. On web2, users are often not given much choice besides agreeing to the new terms and conditions. They are, therefore, often coerced into accepting them, despite possible setbacks in privacy and transparency. This is even more severe with vendor lock-in practices, which are so common in this industry.

On web3, employing this policy would be incompatible with the philosophy of zero trust. That is why, once a smart contract is deployed, its code cannot be modified, not even by the deployer. It can, indeed, self-destruct, but the conditions for that to happen are specified by the smart contract itself, much like how a paper contract would state the conditions of its conclusion. Furthermore, any upgrade/migration logic must be described by the smart contract from day 1.

You could argue that certain elaborate proxy contract schemes allow authorized parties to switch between implementations with relative ease, but this flexibility comes at the cost of users having to trust the proxy owner.
Some users might not mind the backdoor, or might not even know of it, and some might even deposit real L1 assets, but more seasoned web3 users would avoid interacting with such an application, as they have seen enough hacks taking advantage of this design in the past. So, even though Ethereum allows for pseudo-trustless applications, users tend to place more value on applications in line with the web3 ethos.

Likewise, Cartesi Rollups inherits some of the principles set forth by Ethereum, by establishing the genesis state of the machine as an immutable on-chain value, and by allowing anyone to add inputs, validate/execute outputs, and deploy new applications with whatever code they choose. We have to admit that, just like many other scaling solutions, we're still not at the gold standard of trustlessness, and still employ so-called ["training wheels"](https://ethereum-magicians.org/t/proposed-milestones-for-rollups-taking-off-training-wheels/11571). However, we're making great strides with [Dave](https://cartesi.io/blog/grokking-dave/), our fraud proof system, which will allow permissionless and trustless arbitration of disputes on-chain, with great properties in favor of honest actors with average hardware and modest staked funds.

## No upgrades

![](https://hackmd.io/_uploads/H1n3nyWqR.jpg)

What if our application did not implement any upgrade strategy whatsoever? If our application were validated by Dave, our fraud proof system, then, as long as there is one honest validator, it would stay alive and safe. Alternatively, if our application were validated by an authority, then they would either have to commit to the task of validating it forever, or eventually pull the plug. In the latter case, the authority can give a generous heads-up for users to withdraw their assets before the shutdown, but it would always be possible for some inattentive user to miss the announcement and only realize, after the shutdown, that their assets are locked forever.
You could remedy this worst-case scenario by allowing an authorized party to trigger a system-wide withdrawal request. This would harm trustlessness, but, then, if your application is being validated by an authority, users already need to trust that they will keep the application alive, right? And if you're going to pull the plug either way, you might as well guard users against locking their assets.

Having no upgrades in place is definitely the simplest option from the implementation perspective, but it can have some drawbacks. If a user wanted to migrate from one version to the next, then they would have to issue a withdrawal request, wait for the epoch to close, and execute as many vouchers as asset types they'd like to bridge from L2 to L1. Then, they would have to deposit these assets into the newest version of the application. This whole process can be very costly, and, honestly, is also not great from a UX point of view. [ERC-4337](https://eips.ethereum.org/EIPS/eip-4337) might improve on this, with bundling.

In terms of trust, though, it would be very much in line with the web3 ethos. The application user wouldn't have to trust the code of the next version. But what if they do, and want to migrate? Can we have some improvement over the current model, with similar trust assumptions?

## Individual upgrades

![](https://hackmd.io/_uploads/SJyA21Z5R.jpg)

So, imagine that a user wanted to migrate their assets from App v1 to App v2. How could the process be smoother? Well, App v1 could support a special type of withdrawal request that bridges the assets directly to another application. The back-end of App v1 could be totally unaware of the existence of App v2. If the App v1 front-end can be easily updated, it could suggest App v2 as a possible bridging destination.

In more technical terms, App v1 would accept a direct input from users requesting migration, which would emit one voucher for each asset type.
In the case of ERC-20 tokens, for example, it would first emit an approval voucher to the ERC-20 token contract, granting the ERC-20 portal a high enough allowance, and then another voucher to the ERC-20 portal, depositing the user's L2 balance. A similar approach would be possible for other token standards.

:::spoiler :wrench: **Technical detail: Ether withdrawals**
As of Cartesi Rollups SDK v1, it would be a bit convoluted for an application to deposit Ether through the `EtherPortal` contract, because the `withdrawEther` function doesn't take a `payload` parameter, which would be used to encode the Solidity function call. For Cartesi Rollups SDK v2, we are planning to add a `value` field to all vouchers, so Ether withdrawals and payable function calls will be much more straightforward.
:::

This approach already saves the user potentially many L1 transactions. Instead of executing a withdrawal voucher and depositing an asset manually, the user would execute a voucher that would deposit the asset directly into the new application. In the `execLayerData` field, the application would specify the user address, so that the receiving application would know in which account to deposit the tokens.

One problem with this approach is that every user who wants to migrate from v1 to v2 still has to execute as many vouchers as assets they would like to bridge from one application to the other.

## Group upgrades

![](https://hackmd.io/_uploads/SJ_C31-c0.jpg)

What if we could transfer the assets of a whole group of users at once? This can reduce the total amount of L1 fees spent on voucher execution during migrations. You could sum the balances of all users that want to migrate to the same application, and specify in the `execLayerData` field how much each user owns from that sum.

If the size of `execLayerData` grew with the size of the group (for example, if you encoded it as an array of address-balance tuples), then so would the cost of executing that voucher.
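
To make the array-of-tuples encoding concrete, here is a minimal Python sketch of how the App v1 back-end might pack a group into `execLayerData`, and how the receiving application might unpack it. The function names `encode_group` and `decode_group`, and the exact layout (a 20-byte address followed by a 32-byte big-endian balance), are assumptions made for illustration. They are not part of the Cartesi Rollups SDK.

```python
from typing import Dict, List, Tuple

ADDRESS_SIZE = 20  # bytes in an Ethereum address
BALANCE_SIZE = 32  # bytes in a uint256
ENTRY_SIZE = ADDRESS_SIZE + BALANCE_SIZE  # 52 bytes per group member


def encode_group(balances: Dict[str, int]) -> Tuple[int, bytes]:
    """Return the total amount to deposit and the packed address-balance tuples."""
    total = 0
    payload = b""
    for address, balance in sorted(balances.items()):
        hex_part = address[2:] if address.startswith("0x") else address
        raw_address = bytes.fromhex(hex_part)
        if len(raw_address) != ADDRESS_SIZE:
            raise ValueError("addresses must be 20 bytes long")
        payload += raw_address + balance.to_bytes(BALANCE_SIZE, "big")
        total += balance
    return total, payload


def decode_group(payload: bytes) -> List[Tuple[str, int]]:
    """Split the payload back into (address, balance) tuples."""
    if len(payload) == 0 or len(payload) % ENTRY_SIZE != 0:
        raise ValueError("payload must be a positive multiple of 52 bytes")
    entries = []
    for offset in range(0, len(payload), ENTRY_SIZE):
        raw_address = payload[offset:offset + ADDRESS_SIZE]
        raw_balance = payload[offset + ADDRESS_SIZE:offset + ENTRY_SIZE]
        entries.append(("0x" + raw_address.hex(), int.from_bytes(raw_balance, "big")))
    return entries


# Hypothetical group of two users headed to the same application.
group = {"0x" + "11" * 20: 5, "0x" + "22" * 20: 7}
total, payload = encode_group(group)
```

For these two members, the payload is 104 bytes long and the deposited total is 12. Each additional member adds another 52 bytes, which whoever executes the voucher pays for on L1.
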
Still, you would be saving some gas per group user, since you would be splitting a fixed cost among everyone. This fixed cost includes retrieving the claim from the consensus contract, checking a validity proof, and marking the voucher as executed.

For small groups, this approach might be good enough. For larger groups, however, it might be unfeasible. Enter Merkle trees.

## Mass upgrades

![](https://hackmd.io/_uploads/Syr1T1Zc0.jpg)

If an application reaches a sizeable userbase, then per-user and per-group upgrades might not scale very well, due to the high cost of L1 `calldata`. With mass upgrades, however, the size of `execLayerData` would not grow with the size of the group. It would, in fact, have constant size.

When the application decides to trigger a mass upgrade, having a considerable number of compliant users, it would, for each asset type, construct a Merkle tree out of address-balance pairs, and compute its root hash. This root hash would be the `execLayerData` field of the deposit voucher.

We could also construct a sparse Merkle tree, in which the leaf index is the user address itself. It would contain $2^{160}$ leaves, but we know an efficient way to compute the root hash of sparse Merkle trees with complexity $O(n)$, where $n$ is the number of non-zero leaves.

On the other side, App v2 would receive a deposit coming from App v1 with this Merkle root as `execLayerData`. With only the Merkle commitment, it would, of course, not be able to open it up and assign balances to the respective accounts. So, users would have to submit either Merkle proofs, or all the address-balance pairs.

A nice property of this strategy is that it is App v2 that decides how the Merkle root commitment can be opened up. As we'll later see in section "Impact on L1 fees", there are many different ways this can be done, each with advantages and disadvantages.
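
The commitment and its opening can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the leaf encoding (a 20-byte address followed by a 32-byte big-endian balance), the zero padding of unused leaves, and all function names are hypothetical, and SHA-256 stands in for Keccak-256 so that the sketch runs with the standard library alone. It builds a dense tree rather than the sparse variant.

```python
import hashlib
from typing import List, Tuple

Pair = Tuple[bytes, int]  # (20-byte address, balance)
EMPTY_LEAF = bytes(32)    # assumed padding value for unused leaves


def h(data: bytes) -> bytes:
    # Stand-in hash: an Ethereum-facing implementation would use Keccak-256.
    return hashlib.sha256(data).digest()


def leaf_hash(pair: Pair) -> bytes:
    address, balance = pair
    return h(address + balance.to_bytes(32, "big"))


def build_levels(pairs: List[Pair]) -> List[List[bytes]]:
    """Return every tree level, from the leaves (index 0) up to the root."""
    level = [leaf_hash(pair) for pair in pairs]
    size = 1
    while size < len(level):
        size *= 2
    level += [EMPTY_LEAF] * (size - len(level))  # pad to a power of two
    levels = [level]
    while len(level) > 1:
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels


def merkle_root(pairs: List[Pair]) -> bytes:
    """What App v1 would place in `execLayerData` (32 bytes)."""
    return build_levels(pairs)[-1][0]


def merkle_proof(pairs: List[Pair], index: int) -> List[bytes]:
    """Sibling hashes on the path from leaf `index` up to the root."""
    siblings = []
    for level in build_levels(pairs)[:-1]:
        siblings.append(level[index ^ 1])
        index //= 2
    return siblings


def verify(root: bytes, pair: Pair, index: int, siblings: List[bytes]) -> bool:
    """What App v2 would check before crediting a single account."""
    node = leaf_hash(pair)
    for sibling in siblings:
        node = h(node + sibling) if index % 2 == 0 else h(sibling + node)
        index //= 2
    return node == root


# Hypothetical group of five users.
pairs = [(bytes([i]) * 20, 100 * i) for i in range(1, 6)]
root = merkle_root(pairs)
proof = merkle_proof(pairs, 3)
```

Verifying a single proof lets App v2 credit one account at a time, while recomputing `merkle_root` from the full list of pairs and comparing it against the received root opens the whole commitment at once.
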
:::spoiler :wrench: **Technical discussion: EIP-4844**
If the application uses L1 `calldata` as the only source of data availability, then you could argue that L1 blockspace is still being competed for when users have to wake up their dormant accounts by providing sizable proofs as inputs. Alternatively, in the future, [EIP-4844](https://eips.ethereum.org/EIPS/eip-4844) could allow for multiple proofs or a large array to be sent in one input, given that each blob has a fixed size of 128 KB. Support for EIP-4844 as data availability for Cartesi Rollups SDK is still experimental, and will only be possible in the upcoming release of SDK v2. You can check out this [repository](https://github.com/guidanoli/cartesi-blobs) for implementation details and resources of [my Experiment Week project](https://www.youtube.com/watch?v=IJcFfhOj8SM&list=PL-srLb8IDxZXCs13yBF1Dnq8rNA2hEXSh&index=3).
:::

## Impact on implementation complexity

![](https://hackmd.io/_uploads/By3gakW9R.jpg)

Of all the strategies, the easiest one to implement is, by far, having no upgrades at all. Deposits and withdrawals are actions that Cartesi application developers already know how to implement.

With individual upgrades, we need the application to emit one voucher to the token contract approving a transfer, and another targeting the appropriate portal for the asset type being bridged. In the `execLayerData` field of the second voucher, the application must specify the address of the user ($20$ bytes).

With group upgrades, we need to have some group formation logic. A simple implementation is to add two actions: join group and close group. The first action would create a group headed to an application, if there isn't one already, or join an already existing one. The second action would trigger the migration of the group to the target application.
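
A minimal sketch of these two actions, written in Python, could look like the following. The class and method names are hypothetical, balances are kept in a plain dictionary for a single asset type, and who is allowed to close a group is left open, since that is a policy decision for each application.

```python
from typing import Dict


class MigratingApp:
    """Toy L2 ledger for a single asset type, with join/close group actions."""

    def __init__(self, balances: Dict[str, int]) -> None:
        self.balances = dict(balances)               # user -> L2 balance
        self.groups: Dict[str, Dict[str, int]] = {}  # target app -> members

    def join_group(self, user: str, target_app: str) -> None:
        """Move the user's whole balance into the group headed to target_app."""
        amount = self.balances.get(user, 0)
        if amount == 0:
            raise ValueError("user has nothing to migrate")
        del self.balances[user]
        # Create the group if there isn't one already, or join the existing one.
        group = self.groups.setdefault(target_app, {})
        group[user] = group.get(user, 0) + amount

    def close_group(self, target_app: str) -> Dict[str, int]:
        """Remove the group and return its members, ready to be migrated."""
        if target_app not in self.groups:
            raise KeyError("no open group for this application")
        return self.groups.pop(target_app)


# Hypothetical users and target application address.
app = MigratingApp({"0xalice": 10, "0xbob": 5, "0xcarol": 7})
app.join_group("0xalice", "0xAppV2")
app.join_group("0xbob", "0xAppV2")
members = app.close_group("0xAppV2")
```

Here, `members` holds the balances of the two users who joined, while the third user keeps their balance in App v1. Closing the group is what hands these balances over to the migration step.
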
Then, the application would have to sum the balances of all group members, and emit a single voucher to the appropriate portal with the addresses and balances of all members in the `execLayerData` field ($52n$ bytes, for $n \ge 1$).

With mass upgrades, you could use the same group formation logic. Once a group is closed, the application would construct a Merkle tree on top of all address-balance pairs, emit the leaves in a report, and calculate the root hash, which would be passed to the target application via the `execLayerData` field ($32$ bytes). The report would be used for later commitment reveal in App v2, which may take many different forms.

Since App v2 could receive multiple Merkle root commitments from App v1, they could be distinguished by a commitment index. App v2 could accept a reveal request, containing the address of the old application and the commitment index. The request could carry all address-balance pairs, from which the Merkle root would be recomputed and checked against the one sent by App v1, marking the commitment as "used". It could instead carry a proof for an individual user, marking that specific user as "woken up". Alternatively, using the mixed method, it could carry a proof of a subtree, marking that subgroup of users as "woken up".

The receiving application would already know the addresses of accepted "old" versions, and would be able to distinguish between upgrade types simply by the size of the `execLayerData`.

| `execLayerData` size (bytes) | Upgrade type |
| :-: | :-: |
| $20$ | Individual |
| $32$ | Mass |
| $52n,\ n \ge 1$ | Group |

## Impact on UX

![](https://hackmd.io/_uploads/SkKbp1b5A.jpg)

Let's analyze how the strategies we've enumerated in the previous sections compare in terms of UX. By that, we mean that we'd like to minimize user interaction as much as possible, while still giving users control over their assets.
All of them (except the "no upgrade" strategy) require at least one input from the user telling the application they agree to migrate.

| Strategy | # Transactions | L1 fee | Notes |
| :-: | :-: | :-: | :-: |
| No upgrades | 2-3 | $O(1)$ | User must sign 1-2 TXs |
| Individual upgrades | 1-2 | $O(1)$ | |
| Group upgrades | 1-2 per group of $n$ | $O(n)$ | |
| Mass upgrades | 1-2 per group of $n$ | (see note) | Requires later commit reveal |

Note: Without EIP-4844, mass upgrades cost $O(n)$ calldata. With EIP-4844, they cost $O(1)$ calldata and $O(n)$ blobs. More on this in the section "Impact on L1 fees".

## Impact on infrastructure cost

![](https://hackmd.io/_uploads/rJPGakZq0.jpg)

In a scenario in which Dave is being used, as long as there is one interested honest party, the application will stay alive and safe. In a scenario in which an application is being validated, instead, by an authority, there would be two possible outcomes:

- The authority keeps validating both versions, increasing infrastructure costs.
- The authority stops validating the old version, and starts validating the new one.

In the latter case, users would need to withdraw their assets before the shutdown. As suggested before, the application may even have a shutdown feature, in which all assets are withdrawn at once.

The application could also support an opt-in policy for upgrades. Past some date, those that have opted in would have their assets migrated to the latest version, while the rest would have their assets withdrawn.

## Impact on L1 fees

![](https://hackmd.io/_uploads/HyXXay-9A.jpg)

How upgrades are carried out can directly affect the amount of L1 fees that users have to spend. Let's compare the aforementioned upgrade strategies, for bridging one asset type between applications.

Without upgrades, the user has to manually execute 1 withdrawal voucher and do 1 deposit. Depending on the asset type and allowance, the user may also need to do 1 approval to the portal.
This is the default way to bridge assets between applications.

With individual upgrades, users would need to execute as few as 1 deposit voucher. Depending on the asset type and allowance, though, they might need to execute 1 extra approval voucher. In practice, however, the application could emit a single maxed-out approval voucher that covers multiple deposit vouchers.

Now, with group upgrades, we start to amortize the fixed cost of executing a deposit voucher among the group. So, even though the `execLayerData` field would grow linearly with the group size (52 bytes per group member), the total amount of L1 fees would still be smaller, when compared to individual upgrades.

Finally, with mass upgrades, we might get the cheapest L1 fee, because the `execLayerData` field would have a constant, smaller size of 32 bytes, regardless of the group size. If proofs are delivered through L1 `calldata`, then the total L1 cost might be worse than in the group strategy. If they are delivered through EIP-4844 "blobs", on the other hand, this strategy might have an advantage, assuming blobs are cheaper than `calldata`.

If the constructed Merkle tree is sparse on the Ethereum address space, the proof would have the following structure, occupying a bit over 5 KB ($20 + 32 + 160 \times 32 = 5172$ bytes). Each blob could support, therefore, 25 users.

```solidity
struct Proof {
    address user;          // 20 bytes
    uint256 balance;       // 32 bytes
    bytes32[160] siblings; // one sibling per bit of the address
}
```

Alternatively, the Merkle tree could have address-balance tuples as leaves. So, for a group of $\le 2^H$ users, we could bring down the number of siblings from $160$ to $H$, which would, in turn, allow us to increase the number of proofs per blob. With this, an extra leaf index field is needed in the proof.

```solidity
struct Proof {
    address user;
    uint256 balance;
    uint256 index;       // position of the leaf in the tree
    bytes32[H] siblings; // H is the height of the tree
}
```

Finally, there is another option: to send the whole array of address-balance tuples ($20+32=52$ bytes per tuple).
For small groups, this is totally affordable with `calldata`, but, for larger groups, EIP-4844 would really shine as a cheaper data availability alternative.

```solidity
struct Leaf {
    address user;
    uint256 balance;
}

struct Proof {
    Leaf[N] leaves; // N is the number of users in the group
}
```

The following table shows how many EIP-4844 blobs would be needed to send all the proofs or the whole array of address-balance pairs, for different values of $H$. Sizes are in bytes. Also note that, with respect to the group size $n$, the proof size is $O(\log n)$, while the array size is $O(n)$. Therefore, to migrate the whole group, sending individual proofs takes size $O(n \log n)$, while sending the whole array takes $O(n)$.

| Group size | $H$ | Proof size (bytes) | # blobs (proofs) | Array size (bytes) | # blobs (array) |
| :-: | :-: | :-: | :-: | :-: | :-: |
| 1 | 0 | 84 | 1 | 52 | 1 |
| 2 | 1 | 116 | 1 | 104 | 1 |
| 3-4 | 2 | 148 | 1 | 156-208 | 1 |
| 5-8 | 3 | 180 | 1 | 260-416 | 1 |
| 9-16 | 4 | 212 | 1 | 468-832 | 1 |
| 17-32 | 5 | 244 | 1 | 884-1664 | 1 |
| 33-64 | 6 | 276 | 1 | 1716-3328 | 1 |
| 65-128 | 7 | 308 | 1 | 3380-6656 | 1 |
| 129-256 | 8 | 340 | 1 | 6708-13312 | 1 |
| 257-512 | 9 | 372 | 1-2 | 13364-26624 | 1 |
| 513-1024 | 10 | 404 | 2-4 | 26676-53248 | 1 |
| 1025-2048 | 11 | 436 | 4-7 | 53300-106496 | 1 |
| 2049-4096 | 12 | 468 | 8-15 | 106548-212992 | 1-2 |
| 4097-8192 | 13 | 500 | 16-32 | 213044-425984 | 2-4 |
| 8193-16384 | 14 | 532 | 34-67 | 426036-851968 | 4-7 |
| 16385-32768 | 15 | 564 | 71-142 | 852020-1703936 | 7-13 |

From the table above, we can see that arrays make more efficient use of blobspace, compared to individual proofs (2520 users/blob vs. 352 users/blob at $H = 9$). This is easy to see given that proofs also contain a leaf index and a siblings array.

Arrays may be more space efficient, but they need to be complete to compute the Merkle root. Meanwhile, proofs allow progressive upgrades, in batches of 352 users.
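
The arithmetic behind these figures can be reproduced with a short Python sketch. The helper names are hypothetical, and the sketch assumes that the full 128 KB of a blob are usable, that proofs are packed whole into blobs, and that the array is sent as one contiguous byte string. These are assumptions that happen to match the table, not a statement of how blobs must be filled.

```python
BLOB_SIZE = 128 * 1024  # bytes per EIP-4844 blob
ADDRESS_SIZE = 20       # bytes in an address
WORD_SIZE = 32          # bytes in a uint256 or bytes32
TUPLE_SIZE = ADDRESS_SIZE + WORD_SIZE  # 52 bytes per address-balance tuple


def tree_height(group_size: int) -> int:
    """Smallest H such that group_size <= 2**H."""
    return (group_size - 1).bit_length()


def proof_size(height: int) -> int:
    """Address, balance, leaf index, and `height` siblings."""
    return ADDRESS_SIZE + 2 * WORD_SIZE + height * WORD_SIZE


def proofs_per_blob(height: int) -> int:
    return BLOB_SIZE // proof_size(height)


def blobs_for_proofs(group_size: int) -> int:
    per_blob = proofs_per_blob(tree_height(group_size))
    return -(-group_size // per_blob)  # ceiling division


def blobs_for_array(group_size: int) -> int:
    return -(-(group_size * TUPLE_SIZE) // BLOB_SIZE)  # ceiling division


tuples_per_blob = BLOB_SIZE // TUPLE_SIZE
```

With these helpers, `tuples_per_blob` is 2520 and `proofs_per_blob(9)` is 352, while `blobs_for_proofs` ranges from 2 to 4 for groups of 513 to 1024 users. The per-proof figures are the ones that matter for the progressive, proof-based option.
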
Sending proofs would only make sense for groups larger than 2520 users, in which case 1 blob would not be enough to fit the entire array. Still, in this case, it would take at least 9 blobs to fit all proofs.

There is also a mixed approach, in which you provide the leaves of a subtree, and a proof of inclusion of that subtree in the bigger tree. It seems that the maximum subgroup size is $2^{11}=2048$ users/blob. The proof then contains $H-11$ siblings, for $11 \le H \le 160$, which all still fit in a blob. This is a considerable improvement, compared to the $352$ users/blob ratio we had previously.

This mixed approach allows for a users/blob ratio comparable to the pure array strategy (2048 vs. 2520 users/blob), while also allowing progressive upgrades, and requiring, in practice, $O(n)$ blobs.

```solidity
struct Leaf {
    address user;
    uint256 balance;
}

struct Proof {
    uint256 index;            // position of the subtree in the bigger tree
    Leaf[1 << 11] leaves;     // the 2048 leaves of the subtree
    bytes32[H - 11] siblings; // path from the subtree root to the tree root
}
```

## Conclusion

![](https://hackmd.io/_uploads/rJIEp1bcR.jpg)

Upgrades are a vital part of traditional software development. We cannot escape from them, even in the web3 era. However, extra caution must be taken when translating traditional web2 procedures to a paradigm of zero trust.

The blockchain also imposes several technical limitations on the amount of data that can be transported between applications that use L1 as data availability. These should also be accounted for, as they impact experience and migration costs for users.

Furthermore, we should also consider designs that allow for gradual or progressive upgrades, especially with large userbases. We should also look forward to EIP-4844 making large inputs cheaper to send on L1.