# LazyTower LazyTower is a data structure for proof of membership. It is a low-gas alternative to on-chain Merkle trees. The amortized cost for adding an item is O(1). (99424 gas for W=4, including 21000 and the event cost) The cost for ZK proving is O(1). The cost for non-ZK proving is O(log(N)). # The Core Idea Level full => hash (digest) https://www.youtube.com/clip/UgkxqvXNsWp3h21aFWJwCruP0UEJJdxPx3NC Every value in the tower is a Merkle root https://www.youtube.com/clip/Ugkx0kC1Q7JiLB-YJCpSqQkCAbx4WxTDEAhB The core idea in Javascript (bottom-up): ```Javascript var W = 4; // width var levels = []; // levels of the tower function add(item) { _add(0, item); // add to level 0 } function _add(lv, v) { if (lv == levels.length) { // new level levels.push([v]); } else if (levels[lv].length < W) { // not full levels[lv].push(v); } else { // full var d = digest(levels[lv]); levels[lv] = [v]; _add(lv + 1, d); // go up one level } } ``` Unless otherwise specified, we assume W = 4. LazyTower has the following properties: ## Changes seldom propagate to upper levels Most of the time we only change levels[0], the bottom level. levels[1] will only be changed every 4^1=4 add() calls. levels[2] will only be changed every 4^2=16 add() calls. levels[3] will only be changed every 4^3=64 add() calls. levels[4] will only be changed every 4^4=256 add() calls. levels[5] will only be changed every 4^5=1024 add() calls. ... This diagram shows the relationship between add() calls and the number of changed levels: ![](https://hackmd.io/_uploads/B12a-WwH2.png) ## The amortized cost for adding an item is O(1) If changing a level will incur a cost C, then the amortized cost for adding an item will be: C + 1/4 C + 1/16 C + 1/64 C + ... = (1 + 1/3) C for W=4 The result can also be observed from the gray area in the diagram above. The total cost for adding N items will never exceed the loose bound 2N. ## Every value in the tower is a Merkle root. ![](https://hackmd.io/_uploads/HJmyF0YNh.png) Every value at levels[1] is a Merkle root of 4^1=4 leafs. Every value at levels[2] is a Merkle root of 4^2=16 leafs. Every value at levels[3] is a Merkle root of 4^3=64 leafs. Every value at levels[0] is a Merkle root of itself. (4^0 = 1) These trees are all disjoint. ## Every item belongs to a single root in the tower. Roots and ranges https://www.youtube.com/watch?v=jhZVq0vWolQ For example: after adding item 0 ~ item 117 into the tower, item 94 belongs to the root at `levels[2][1]` . ![](https://hackmd.io/_uploads/ByqLxckSh.png) ## Values are append-only at each level. It can be observed from the code and the video above. ## The new value to be added to a level only depends on the very single level below it. For example: after adding item 0 ... item 19 into the tower, the tower looks like ``` levels[1] 0~3 4~7 8~11 12~15 levels[0] 16 17 18 19 ``` Now when we add item 20, the value to be added at levels[2] will only depend on levels[1] (i.e. item 0~15). It has nothing to do with item 16~19 and the newly added item 20. The value to be added at levels[1] will only depend on levels[0] (i.e. item 16~19). It has nothing to do with the newly added item 20. Hence we have the following property: ## The tower can be updated bottom-up or top-down. In the [video](https://www.youtube.com/clip/UgkxqvXNsWp3h21aFWJwCruP0UEJJdxPx3NC) above, we update the tower in the top-down order. In the Javascript above, we update the tower in the bottom-up order. Both orders are totally valid. # Build a Single Digest for Proof-of-Membership Now we have a tower full of Merkle roots. Since every added item belongs to a single root in the tower, we can provide a Merkle proof targeting that root. We now have a gas-efficient replacement of Merkle tree. If we want to prove the membership without revealing the item, then proving toward a single root is not good enough. We should provide a proof targeting all the roots without showing which specific root our item belongs to. However, loading all the roots from storage and feeding them into a circuit are both expensive on-chain operations. Can we build a single digest to represent all the roots in the tower? ## Build a digest for each level first For each level `levels[lv]`, we can build a corresponding `digests[lv]`. If the digest algorithm is collision resistent, then we can safely use that single value `digests[lv]` to represent the `W` values in `levels[lv]`. Why do we build the digests level by level? Because upper levels seldom change, we usually only have to update one or two digests of the bottom levels. ## Use an incremental hash function to build the level digests When values `a, b, c, d` are being sequentially added into the tower, these values will be appended to `levels[0]`. If we use a 4-input Poseidon hash `P4`, then the digest of `levels[0]` will be ``` P4(a, 0, 0, 0) P4(a, b, 0, 0) P4(a, b, c, 0) P4(a, b, c, d) ``` The on-chain cost of Poseidon hash is roughly linear to the number of inputs. The above steps will cost `4*4` units. Since the values are append-only, we can employ an **incremental hash function** to reduce the cost. For example, a **hash chain** composed with 2-input Poseidon hash `P2` can be used to build the level digest: ``` a P2(a, b) P2(P2(a, b), c) P2(P2(P2(a, b), c), d) ``` The above steps will only cost `0 + 2 + 2 + 2` units on hash. To compute the new digest when `d` is being added, we only have to keep the previous digest `P2(P2(a, b), c)`. We don't have to store and load `a, b, c`. Note that now we also have to keep the level length. For example, the two element list `[a, b]` and the single element list `[P2(a, b)]` both have the same digest. If we don't keep track of the length, a malicious prover could claim that `P2(a, b)` has previously been added to the tower. The digest function can be applied to all the levels of the tower. We can create an array of level digests `digests[lv] = digest(levels[lv])`. ## Keep the `digests[]`, drop the `levels[]` Since 1. the single value `digests[lv]` can represent the whole `levels[lv]` 2. the new digest value only depends on the previous digest value (hash chain is incremental) 3. when a level is full, the single value that will be added to the upper level is the level digest , we can safely keep only the `digests[]` and drop the `levels[]` . ![](https://hackmd.io/_uploads/HyC9D_xrn.png =237x360) ## Build a digest of digests Now all the `levels[]` has been digested into `digests[]`. We can futher digest the array into a single value. Let's use hash chain again. We can apply it on the reverse of `digests[]`, i.e. in a top-down order. Assuming that the values in `digests[]` are `a, b, c, d, 0, 0, 0 ...`. The digest of digests would be `P2(P2(P2(d, c), b), a)`. Why don't we apply it in the order of `P2(P2(P2(a, b), c), d)`? It's becuase lower digests `digests[0] digests[1]` change more frequently. If we put them at the head of the hash chain, then we have to recompute the whole chain everytime. If we put them at the tail of the chain, then most of the time we only have to compute one or two `P2`. ## On-chain cost analysis Since we use hash chain to compute both the level digests and digest of digests, the cost of updating a single level is bounded by a constant. The cost of a single `add()` call is propotional to the number of levels being updated. ![](https://hackmd.io/_uploads/B12a-WwH2.png) Since the upper levels seldom change, the amortized cost for adding an item while maintaining a single digest for all the items is still bounded by a constant. ## Circuit complexity If the height and weight of the circuit is H x W, then the circuit complexity will roughly equal to the complexity of H x W Poseidon hashes. Number of non-linear constraints for W = 4: | Height | Capacity | #Constraints | | ------ | ------------------- | ------------ | | 4 | 340 | 3741 | | 8 | 87,380 | 7693 | | 12 | 22,369,620 | 11645 | | 16 | 5,726,623,060 | 15597 | | 20 | 1,466,015,503,700 | 19549 | | 24 | 375,299,968,947,540 | 23501 | ![](https://hackmd.io/_uploads/SkbSDxkPh.png) # Comparing with Merkle Trees ## There is no capacity / cost tradeoff The tower can grow upward dynamically. The only limit is the contract storage. Since the height H does not affect the amortized on-chain cost, we can pick a big H upfront. For example, if we choose H = 24 and W = 4, then the capacity will be 375,299,968,947,540 items. The amortized cost will not be affected. ## Pay as you go Even if we choose H = 24, it doesn't mean that we have to use a fixed circuit with the same height. The client and contract could dynamically select a circuit that can accommodate the current data. For example, we can use a H = 4 circuit while there are less than 340 items being added. We can use a H = 8 circuit while there are less than 87380 items being addes. It will greatly reduce the time needed for generating the proof. # Future works * Find a more efficient incremental hash to further reduce the gas usage * Emit and fetch intermidiate nodes to reduce client side computation * Use private information retrieval to reduce event fetching and computation # Summary In this article, we have illustrated the idea of LazyTower. By localizing the changes, we can build gas-efficient data structures for both ZK and non-ZK proof-of-membership. The code is released under MIT and GPL-v3. https://github.com/privacy-scaling-explorations/zk-kit/tree/dev/packages # Acknoledgements (from @LCamel) I would like to thank @Chance for his time and effort in reviewing and discussing this with me. He makes LazyTower much better. I thank @levs57 for giving critical security advices. I learn how a ZK project works from @cedoor's work. I thank my wife for supporting me. I couldn't have done this without her. This project receives grants from the Ethereum Foundation ESP.