POD2 "outside circuit" has three data types: integers, strings, and Merkle roots. In-circuit, any integer, string, or Merkle root is represented by a 4-tuple of elements of a 64-bit field.
A Merkle root represents a Merkle tree. Here we specify the structure of Merkle trees.
A Merkle tree stores an unordered collection of (key, value)
pairs. Keys in a Merkle tree must be distinct: the same key
cannot be entered twice. (In this sense a Merkle tree exposes the same functionality as a Python dictionary.)
Each key
or value
can again be an integer, string, or a Merkle root. In-circuit, of course, the key and value are each a 4-tuple of field elements.
A Merkle tree is implemented as a binary tree.
The insertion path of any key is given by a deterministic rule: given key
and a nonnegative integer depth
, the rule determines that key
belongs to either the left
or right
branch at depth depth
.
The precise rule is an implementation detail; here is one possibility. In-circuit, key
will be represented as a 4-tuple of field elements
Write the field elements in binary (in little-endian order):
At the root, key
goes to the left subtree if b_0 = 0
, otherwise the right subtree. At depth 1, key
goes to the left subtree if b_1 = 0
, otherwise the right subtree, and similarly for higher depth.
A Merkle tree with no entry at all is represented by the hash value
root = hash(0).
(For security, the hash function hash
must output a 4-tuple of field elements.)
A Merkle tree with a single entry (key, value)
is called a "leaf". It is represented by the hash value
root = hash((key, value, 1)).
A Merkle tree tree
with more than one entry is required to have two subtrees, left
and right
. It is then represented by the hash value
root = hash((left_root, right_root, 2)).
(The role of the constants 1 and 2 is to prevent collisions between leaves and non-leaf Merkle roots. If the constants were omitted, a large Merkle tree could be dishonestly interpreted as a leaf, leading to security vulnerabilities.)
Suppose we want to build a Merkle tree with three key-value pairs:
The keys are integers, so the are represented in-circuit by themselves. In little-endian order, the first three bits of the keys are:
The resulting tree looks like:
The intermediate roots are computed as hashes of their subroots: