Try   HackMD

This document proposes a way forward with regards to what should be built to aid in the proof of liabilities effort.

Summation Merkle Tree

Short overview of the evolution of Summation Merkle Trees (SMT)

These types of trees were first proposed by Maxwell in 2013[1].

Basic idea: one takes a regular Merkle tree but each node carries a number as well as a hash. The numbers at the leaf node are set by the tree creator, and each non-leaf node adds together the numbers of it's children to get it's own number.

How this is useful for PoL: each leaf node represents a user (hash of the user ID, for example) and the number attached to these nodes are the users' balances. This way the total liability sum will be at the root node. The exchange constructs the tree and publishes the root node and total liability sum. Users are supposed to check themselves that their balance is correctly represented in the tree by requesting that the exchange give them an inclusion proof (i.e. their path up the tree). The more users that check the more confident everyone is that the tree truthfully shows the exchange's liabilities.

Security flaw: the original Maxwell tree had a security flaw that was explained and fixed by Hu, Zhang, and Guo in 2019[2] to give the Maxwell+ tree.

Privacy concerns: the tree hides neither the user balances, the total liability sum, nor the number of users of the exchange. Even though the full tree lives with the exchange any user who gets an inclusion proof will know 1 leaf node other than their own, thus they learn of another user's balance. They will also know the tree height which means they can put an upper bound on the number of users of the exchange. And of course the total asset sum is visible to everyone because it was made public.

Hiding balances and number of users: the Maxwell++ tree takes each leaf node and splits it some number of times, randomly distributing the balance. Doing this increases the height of the tree, making the upper bound on number of users much higher. Balances of individual users are also now obfuscated from other users who request inclusion proofs. This version of the tree is even made fully public by some exchanges (e.g. BitMEX) as it is considered to not leak any important information.

Hiding liability sum: an SMT with Pedersen commitments was first proposed by Camacho in 2014[3] and perfected by Ji & Chalkias in 2021[4]. This version of the tree hides every balance in a Pedersen commitment. Range proofs have to be given to ensure that the addition of 2 values does not end up being greater than the order of the group, which would allow a "reset to zero", so to speak. The tree is intended to be kept with the exchange.

Summa: a different approach to achieve full privacy is taken by the Summa team[5]. A Maxwell+ tree is used and the exchange provides a snark as inclusion proof when users request, as apposed to giving the Merkle path.

What other exchanges are currently doing

A full list of other exchanges that use PoL protocols can be found on Nic Carter's website[6] but here are some examples:

  • Maxwell++ tree: BitMEX (fully public tree)
  • List of balances: Deribit (fully public, balances are split)
  • Merkle tree: Kraken (with help from audit firm Armanino)
  • Kate commitment with snark: OKX

Exchanges have yet to use the most private of the SMT variants: dapol+ or Summa.

Options and proposal

Although the SMT is not the only option (Kate commitment with snark being another example) it is definitely the most widely used, the simplest to understand, and the best in time/space performance.

There already exist open source libraries for the Maxwell++ tree that are being used in production (e.g. BitMEX's code) so there is not much work to be done here. There is, however, no production-ready open source code for the dapol+ tree.

The code for dapol+ is partly written (by Ji & Chalkias, for benchmarking their paper) and needs some work to get it to a production-ready state. The code is open source and is ready to be forked / added to. Since this is the state-of-the-art in the literature right now it seems that getting this code production ready is the best thing we can do to aid the PoL effort.

The other option would be to write a library for the Kate commitment protocol but this would take longer since it will be from scratch and there is no academic paper with security proofs backing the protocol as with dapol+.

Current state of the dapol+ code

In the paper there are a few proposed accumulators, each of which has different properties. Only the NDM-SMT (non-deterministic SMT) accumulator was implemented in the code, so the other accumulators will need to be written. Writing these accumulators will have to be done from scratch.

For the Pedersen commitments the Ristretto255 group on top of Curve25519 was used (as it is the curve used in the Bulletproofs library). If any other groups/curves are required then these need to be imported (should be fairly easy to do this). Similarly with the hash function only Blake3 is used so if others are required then support will need to be added. And finally if any other range proof besides Bulletproofs are required then this will require extra work too (most likely a simple import of another library).

There are some tests in the code but more will have to be written to get a comprehensive coverage.