LazyTower

LazyTower is a data structure for proof of membership. It is a low-gas alternative to on-chain Merkle trees.

The amortized cost for adding an item is O(1).
(99424 gas for W=4, including 21000 and the event cost)

The cost for ZK proving is O(1).
The cost for non-ZK proving is O(log(N)).

The Core Idea

Level full => hash (digest)
https://www.youtube.com/clip/UgkxqvXNsWp3h21aFWJwCruP0UEJJdxPx3NC

Every value in the tower is a Merkle root
https://www.youtube.com/clip/Ugkx0kC1Q7JiLB-YJCpSqQkCAbx4WxTDEAhB

The core idea in Javascript (bottom-up):

var W = 4;                               // width
var levels = [];                         // levels of the tower

function add(item) {
    _add(0, item);                       // add to level 0
}
function _add(lv, v) {
    if (lv == levels.length) {           // new level
        levels.push([v]);                
    } else if (levels[lv].length < W) {  // not full
        levels[lv].push(v);
    } else {                             // full
        var d = digest(levels[lv]);
        levels[lv] = [v];
        _add(lv + 1, d);                 // go up one level
    }
}

Unless otherwise specified, we assume W = 4.

LazyTower has the following properties:

Changes seldom propagate to upper levels

Most of the time we only change levels[0], the bottom level.

levels[1] will only be changed every 4^1=4 add() calls.
levels[2] will only be changed every 4^2=16 add() calls.
levels[3] will only be changed every 4^3=64 add() calls.
levels[4] will only be changed every 4^4=256 add() calls.
levels[5] will only be changed every 4^5=1024 add() calls.
โ€ฆ

This diagram shows the relationship between add() calls and the number of changed levels:

Image Not Showing Possible Reasons
  • The image was uploaded to a note which you don't have access to
  • The note which the image was originally uploaded to has been deleted
Learn More โ†’

The amortized cost for adding an item is O(1)

If changing a level will incur a cost C, then the amortized cost for adding an item will be:
C + 1/4 C + 1/16 C + 1/64 C + โ€ฆ = (1 + 1/3) C for W=4

The result can also be observed from the gray area in the diagram above. The total cost for adding N items will never exceed the loose bound 2N.

Every value in the tower is a Merkle root.

Image Not Showing Possible Reasons
  • The image was uploaded to a note which you don't have access to
  • The note which the image was originally uploaded to has been deleted
Learn More โ†’

Every value at levels[1] is a Merkle root of 4^1=4 leafs.
Every value at levels[2] is a Merkle root of 4^2=16 leafs.
Every value at levels[3] is a Merkle root of 4^3=64 leafs.

Every value at levels[0] is a Merkle root of itself. (4^0 = 1)

These trees are all disjoint.

Every item belongs to a single root in the tower.

Roots and ranges
https://www.youtube.com/watch?v=jhZVq0vWolQ

For example: after adding item 0 ~ item 117 into the tower, item 94 belongs to the root at levels[2][1] .

Image Not Showing Possible Reasons
  • The image was uploaded to a note which you don't have access to
  • The note which the image was originally uploaded to has been deleted
Learn More โ†’

Values are append-only at each level.

It can be observed from the code and the video above.

The new value to be added to a level only depends on the very single level below it.

For example: after adding item 0 โ€ฆ item 19 into the tower, the tower looks like

levels[1]  0~3  4~7  8~11 12~15
levels[0]  16   17   18   19

Now when we add item 20, the value to be added at levels[2] will only depend on levels[1] (i.e. item 0~15). It has nothing to do with item 16~19 and the newly added item 20.

The value to be added at levels[1] will only depend on levels[0] (i.e. item 16~19). It has nothing to do with the newly added item 20.

Hence we have the following property:

The tower can be updated bottom-up or top-down.

In the video above, we update the tower in the top-down order.

In the Javascript above, we update the tower in the bottom-up order.

Both orders are totally valid.

Build a Single Digest for Proof-of-Membership

Now we have a tower full of Merkle roots.

Since every added item belongs to a single root in the tower, we can provide a Merkle proof targeting that root. We now have a gas-efficient replacement of Merkle tree.

If we want to prove the membership without revealing the item, then proving toward a single root is not good enough. We should provide a proof targeting all the roots without showing which specific root our item belongs to.

However, loading all the roots from storage and feeding them into a circuit are both expensive on-chain operations. Can we build a single digest to represent all the roots in the tower?

Build a digest for each level first

For each level levels[lv], we can build a corresponding digests[lv]. If the digest algorithm is collision resistent, then we can safely use that single value digests[lv] to represent the W values in levels[lv].

Why do we build the digests level by level? Because upper levels seldom change, we usually only have to update one or two digests of the bottom levels.

Use an incremental hash function to build the level digests

When values a, b, c, d are being sequentially added into the tower, these values will be appended to levels[0].

If we use a 4-input Poseidon hash P4, then the digest of levels[0] will be

P4(a, 0, 0, 0)
P4(a, b, 0, 0)
P4(a, b, c, 0)
P4(a, b, c, d)

The on-chain cost of Poseidon hash is roughly linear to the number of inputs. The above steps will cost 4*4 units.

Since the values are append-only, we can employ an incremental hash function to reduce the cost. For example, a hash chain composed with 2-input Poseidon hash P2 can be used to build the level digest:

         a
      P2(a, b)
   P2(P2(a, b), c)
P2(P2(P2(a, b), c), d)

The above steps will only cost 0 + 2 + 2 + 2 units on hash.

To compute the new digest when d is being added, we only have to keep the previous digest P2(P2(a, b), c). We don't have to store and load a, b, c.

Note that now we also have to keep the level length. For example, the two element list [a, b] and the single element list [P2(a, b)] both have the same digest. If we don't keep track of the length, a malicious prover could claim that P2(a, b) has previously been added to the tower.

The digest function can be applied to all the levels of the tower. We can create an array of level digests digests[lv] = digest(levels[lv]).

Keep the digests[], drop the levels[]

Since

  1. the single value digests[lv] can represent the whole levels[lv]
  2. the new digest value only depends on the previous digest value (hash chain is incremental)
  3. when a level is full, the single value that will be added to the upper level is the level digest

, we can safely keep only the digests[] and drop the levels[] .

Image Not Showing Possible Reasons
  • The image was uploaded to a note which you don't have access to
  • The note which the image was originally uploaded to has been deleted
Learn More โ†’

Build a digest of digests

Now all the levels[] has been digested into digests[]. We can futher digest the array into a single value.

Let's use hash chain again. We can apply it on the reverse of digests[], i.e. in a top-down order. Assuming that the values in digests[] are a, b, c, d, 0, 0, 0 .... The digest of digests would be P2(P2(P2(d, c), b), a).

Why don't we apply it in the order of P2(P2(P2(a, b), c), d)? It's becuase lower digests digests[0] digests[1] change more frequently. If we put them at the head of the hash chain, then we have to recompute the whole chain everytime. If we put them at the tail of the chain, then most of the time we only have to compute one or two P2.

On-chain cost analysis

Since we use hash chain to compute both the level digests and digest of digests, the cost of updating a single level is bounded by a constant. The cost of a single add() call is propotional to the number of levels being updated.

Image Not Showing Possible Reasons
  • The image was uploaded to a note which you don't have access to
  • The note which the image was originally uploaded to has been deleted
Learn More โ†’

Since the upper levels seldom change, the amortized cost for adding an item while maintaining a single digest for all the items is still bounded by a constant.

Circuit complexity

If the height and weight of the circuit is H x W, then the circuit complexity will roughly equal to the complexity of H x W Poseidon hashes.

Number of non-linear constraints for W = 4:

Height Capacity #Constraints
4 340 3741
8 87,380 7693
12 22,369,620 11645
16 5,726,623,060 15597
20 1,466,015,503,700 19549
24 375,299,968,947,540 23501

Comparing with Merkle Trees

There is no capacity / cost tradeoff

The tower can grow upward dynamically. The only limit is the contract storage. Since the height H does not affect the amortized on-chain cost, we can pick a big H upfront. For example, if we choose H = 24 and W = 4, then the capacity will be 375,299,968,947,540 items. The amortized cost will not be affected.

Pay as you go

Even if we choose H = 24, it doesn't mean that we have to use a fixed circuit with the same height. The client and contract could dynamically select a circuit that can accommodate the current data. For example, we can use a H = 4 circuit while there are less than 340 items being added. We can use a H = 8 circuit while there are less than 87380 items being addes. It will greatly reduce the time needed for generating the proof.

Future works

  • Find a more efficient incremental hash to further reduce the gas usage
  • Emit and fetch intermidiate nodes to reduce client side computation
  • Use private information retrieval to reduce event fetching and computation

Summary

In this article, we have illustrated the idea of LazyTower. By localizing the changes, we can build gas-efficient data structures for both ZK and non-ZK proof-of-membership.

The code is released under MIT and GPL-v3.
https://github.com/privacy-scaling-explorations/zk-kit/tree/dev/packages

Acknoledgements

(from @LCamel)

I would like to thank @Chance for his time and effort in reviewing and discussing this with me. He makes LazyTower much better.

I thank @levs57 for giving critical security advices.

I learn how a ZK project works from @cedoor's work.

I thank my wife for supporting me. I couldn't have done this without her.

This project receives grants from the Ethereum Foundation ESP.