Vector Commitment Scheme - High Level
Familiarity with binary merkle trees is assumed.
Commitment Scheme
Commitment schemes in general are at the heart of every scenario where you want to prove something to another person. Let's list two examples from our daily lives.
Lottery
Before you are able to see the winning results of a lottery, you must first commit to your choice of numbers. This commitment will allow you to prove that you did indeed choose these numbers before seeing the results. This commitment is often referred to as a lottery ticket.
We cannot trust people to be honest about their results, or more generously, we cannot trust people to attest to the truth; they could have bad memory.
If you trust everyone to tell the truth or if it is not advantageous for a rational actor to lie, then you might be able to omit the commitment scheme. This is not usually the case, especially in a scenario where it may be impossible to find out the truth.
Sometimes we cannot even assume that actors will behave rationally!
There are certain features that a lottery ticket must have like not being able to edit it after the fact. Many of these features draw a parallel with vector commitment schemes.
Registration and Login
A lot of social applications require you to prove your digital identity to use them. There are two stages:
Registration: This is where you put in your details such as your email address, name, password and phone number. You can think of this as a commitment to a particular identity.
Login: This is where you use the email address and password from registration to prove that you are the same person. Ideally, only you know these login details.
Without the registration phase, you would not be able to later prove your digital identity.
As you can see, commitment schemes are crucial where one needs to prove something after an event has happened. This analogy also carries over to the cryptographic settings we will consider.
Why do we need a commitment scheme?
For the lottery example, one could call it a ticket commitment scheme.
For the registration example, one could call it an identity commitment scheme.
For verkle trees and indeed merkle trees, we need a vector commitment scheme.
Analogously, this means that we need to commit to a vector and later attest to values in that vector.
As a spoiler, with verkle/merkle trees, when one is tasked with proving that a particular value is in the tree, we can reduce this to many instances of proving that particular values are in a vector.
Brief overview of a vector
Think of a vector as a list of items where the length of the vector and the position of each item is also a part of the definition.
Example 1
Here the vectors and are not equal because the first and second items in the vectors are not equal. This may seem obvious but it is not true for mathematical objects such as sets.
Example 2
Here the vectors are also not equal, because their lengths are not equal. Note also that as a set, they would be considered equal.
We will later see that vector commitment schemes must encode both of these properties (the position of each item and the length of the vector) when committing to a vector.
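The difference between vectors and sets can be checked directly. The sketch below uses illustrative values chosen here (they are not the vectors from the examples above) and relies only on Python's built-in list and set equality.

```python
# Illustrative values only; these are assumptions, not the vectors from the examples.
v1 = [1, 2, 3]
v2 = [2, 1, 3]       # same items as v1, with the first two swapped
v3 = [1, 2, 3, 3]    # same items as v1, with one item repeated

# Position is part of a vector's definition, but not of a set's.
vectors_equal_when_swapped = (v1 == v2)
sets_equal_when_swapped = (set(v1) == set(v2))

# Length is part of a vector's definition, but not of a set's.
vectors_equal_when_longer = (v1 == v3)
sets_equal_when_longer = (set(v1) == set(v3))
```

The two vector comparisons are false while both set comparisons are true, which is exactly the pair of properties a vector commitment has to preserve.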
First bring your attention to in Figure 1. One can define some function which takes both of these values as inputs and transforms them into a single output value .
Encoding the position
We specify that should not be equal to . This means that the function implicitly encodes the positions of its input values. In this case conveys the fact that is first and is second.
Encoding the length
Another property of is that should not equal , meaning that should also encode the number of inputs, which is equivalently the length of the vector. (Even if has a value of )
Elaborating, if there are two items as inputs, one should not get the same answer when there are three items. No matter what the third input is.
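Both requirements can be illustrated with a small sketch. The function `commit` below is hypothetical: it assumes SHA-256 as the underlying hash and prefixes the item count and each item's length, which is one possible way (not necessarily the one used in any real tree) to make position and length part of the output.

```python
import hashlib
from typing import List

def commit(items: List[bytes]) -> bytes:
    """Hypothetical commitment: hash the item count, then each length-prefixed item."""
    h = hashlib.sha256()
    h.update(len(items).to_bytes(4, "big"))      # encodes the length of the vector
    for item in items:
        h.update(len(item).to_bytes(4, "big"))   # keeps item boundaries unambiguous
        h.update(item)                           # the order of updates encodes position
    return h.digest()

a, b = b"a", b"b"

# Swapping the inputs changes the output, so position is encoded.
swapped_differs = commit([a, b]) != commit([b, a])

# Adding a third input changes the output, even when that input is empty.
longer_differs = commit([a, b]) != commit([a, b, b""])
```

The explicit count prefix is what keeps a two-item input distinct from a three-item input whose last item is empty.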
Committing to a vector
We now ask the reader to view and as two elements in a vector, i.e. . The function allows us to commit to such a vector, encoding the length of the vector and the position of each element in the vector. In the above merkle tree, one can repeatedly use until we arrive at the top of the tree. The final output at the top is denoted as the root.
By induction, we can argue that the root is a summary of all of the items below it. Whether the summary is succinct depends on .
Popular choices for include the following hash functions: SHA-256, BLAKE2s and Keccak. But one could just as easily define it to be the concatenation of the inputs.
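Repeatedly applying the commitment function up to the root can be sketched as follows. This is a minimal illustration that assumes SHA-256 over two fixed-size 32-byte inputs as the two-element commitment; the names `f` and `merkle_root` and the eight sample leaves are assumptions made for this example.

```python
import hashlib
from typing import List

def f(left: bytes, right: bytes) -> bytes:
    """Commit to a 2-element vector of fixed-size values (assumed choice: SHA-256)."""
    return hashlib.sha256(left + right).digest()

def merkle_root(leaves: List[bytes]) -> bytes:
    """Apply f pairwise, level by level, until a single value (the root) remains."""
    if not leaves or len(leaves) & (len(leaves) - 1):
        raise ValueError("this sketch expects a power-of-two number of leaves")
    level = leaves
    while len(level) > 1:
        level = [f(level[i], level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]

# Eight hypothetical 32-byte leaves, giving a tree with three levels above them.
leaves = [hashlib.sha256(bytes([i])).digest() for i in range(8)]
root = merkle_root(leaves)
```

Because every input to `f` here has a fixed size, plain concatenation already fixes both the order and the number of inputs. With concatenation alone as the commitment, the root would be as long as all the leaves combined, which is why the choice of function decides whether the summary is succinct.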
Opening a value
Say we are given the root in Figure 1 and we want to show that is indeed a part of the tree that this root represents.
To show that is in the tree with root , we can do it by showing:
is the first element in the vector and applying to this vector yields
Then we can show that is the first element in the vector and applying to the vector yields
Finally, we can show that is the second element in the vector and applying to the vector yields
We now define a new function to show that an element is in a certain position in a vector and that when is applied to said vector, it yields an expected value
takes four arguments:
A commitment to a vector . This is the output of on a vector.
An index,
An element in some vector,
A proof attesting to the fact that is the commitment to , and is the element at index of .
returns true if for some vector :
is the commitment of . i.e.
The i'th element in is indeed . i.e.
Example
Let's use to demonstrate checking that:
is the first element in the vector and applying to this vector yields
(zero indicates the first element)
If returns true, then we can be sure that commits to some vector using and at the first index of that vector, we have the value .
We must trust that was computed correctly, i.e. it corresponds to the tree in question. This is outside the scope of verkle/merkle trees in general and is usually handled by some higher level protocol.
What is ?
For a binary merkle tree, would be . Now given and , we can apply to check that . This also allows us to check that is the first element in the vector.
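For the binary case, the four-argument check can be sketched as below. The function names `f` and `verify`, the argument order, and the choice of SHA-256 are assumptions made for illustration; the proof is taken to be the other element of the two-element vector.

```python
import hashlib

def f(left: bytes, right: bytes) -> bytes:
    """Commit to a 2-element vector of fixed-size values (assumed choice: SHA-256)."""
    return hashlib.sha256(left + right).digest()

def verify(commitment: bytes, index: int, element: bytes, proof: bytes) -> bool:
    """Binary case: rebuild the vector from the element and its sibling, then recommit."""
    if index == 0:
        return f(element, proof) == commitment
    if index == 1:
        return f(proof, element) == commitment
    return False  # a 2-element vector has no other positions

# Hypothetical sample values.
x = hashlib.sha256(b"x").digest()
y = hashlib.sha256(b"y").digest()
c = f(x, y)

opens_at_first_position = verify(c, 0, x, y)
opens_at_second_position = verify(c, 1, x, y)
```

The same element and proof are accepted at index 0 and rejected at index 1, so the check covers the position as well as membership.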
Proof cost For Binary Merkle Tree
For a binary merkle tree, our vectors have size and so only has to contain 1 extra element to show . If we had a hexary merkle tree, where our vector had 16 elements, would need to contain 15 elements. Hence the proof grows in proportion to the vector sizes that we are using for merkle trees.
Even more discouraging is the fact that there is not just one . In our case there are actually 3 to show is in the tree. The overall proof size thus also grows with the number of levels in the tree.
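A sketch of how the per-level proofs chain together is given below. It assumes a binary tree with eight hypothetical leaves (so three levels), SHA-256 as the two-element commitment, and helper names (`build_levels`, `open_path`, `verify_path`) invented for this example.

```python
import hashlib
from typing import List, Tuple

def f(left: bytes, right: bytes) -> bytes:
    return hashlib.sha256(left + right).digest()

def build_levels(leaves: List[bytes]) -> List[List[bytes]]:
    """Return every level of the tree, from the leaves up to the root."""
    levels = [leaves]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        levels.append([f(prev[i], prev[i + 1]) for i in range(0, len(prev), 2)])
    return levels

def open_path(levels: List[List[bytes]], leaf_index: int) -> List[Tuple[int, bytes]]:
    """One (position, sibling) pair per level: the proof for each 2-element vector."""
    path, i = [], leaf_index
    for level in levels[:-1]:
        path.append((i % 2, level[i ^ 1]))  # i ^ 1 is the index of the sibling
        i //= 2
    return path

def verify_path(root: bytes, leaf: bytes, path: List[Tuple[int, bytes]]) -> bool:
    """Recompute each parent commitment in turn and compare the last one to the root."""
    node = leaf
    for position, sibling in path:
        node = f(node, sibling) if position == 0 else f(sibling, node)
    return node == root

leaves = [hashlib.sha256(bytes([i])).digest() for i in range(8)]
levels = build_levels(leaves)
root = levels[-1][0]
path = open_path(levels, 5)
```

For eight leaves, `path` holds three sibling values, one for each level between the leaf and the root.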
In general, we can compute the overall proof size from two quantities: the number of items in the tree, also known as the tree width $t$, and the size of our vectors, sometimes referred to as the node width $k$. The proof then contains $(k - 1) \log_k(t)$ elements: $k - 1$ sibling elements at each of the $\log_k(t)$ levels.
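As a rough numeric check of this count, the sketch below assumes a full tree, calls the number of items `tree_width` and the vector size `node_width` (names chosen here for illustration), and multiplies the `node_width - 1` sibling elements needed per level by the number of levels.

```python
def depth(tree_width: int, node_width: int) -> int:
    """Smallest d with node_width ** d >= tree_width, using integer arithmetic only."""
    d, capacity = 0, 1
    while capacity < tree_width:
        capacity *= node_width
        d += 1
    return d

def merkle_proof_elements(tree_width: int, node_width: int) -> int:
    """Sibling elements revealed when opening one item in a hash-based tree."""
    return (node_width - 1) * depth(tree_width, node_width)

small_binary = merkle_proof_elements(8, 2)     # 1 sibling  * 3 levels
binary = merkle_proof_elements(2 ** 16, 2)     # 1 sibling  * 16 levels
hexary = merkle_proof_elements(2 ** 16, 16)    # 15 siblings * 4 levels
```

For the same number of items, widening the nodes from 2 to 16 reduces the depth from 16 to 4 but raises the number of revealed elements from 16 to 60.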
Verkle Tree Improvements
The problem with being a hash function like sha256 in the case of a merkle tree is that in order to attest to a single value that was hashed, we need to reveal everything in the hash. The main reason being that these functions by design do not preserve the structure of the input. For example, + != .
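One reading of the example above is that hashing two values separately and adding the results does not give the hash of their sum. The sketch below checks this for SHA-256 on two arbitrary small integers; the helper `h`, the 32-byte encoding, and the reduction modulo 2^256 are assumptions made for illustration.

```python
import hashlib

def h(n: int) -> int:
    """SHA-256 of a 32-byte big-endian integer, read back as an integer."""
    digest = hashlib.sha256(n.to_bytes(32, "big")).digest()
    return int.from_bytes(digest, "big")

a, b = 3, 4  # arbitrary sample inputs

# If the hash preserved additive structure, these two values would agree.
structure_preserved = (h(a) + h(b)) % 2 ** 256 == h(a + b)
```

Because no such relation holds, a verifier cannot check one hashed input in isolation and has to be given all of them.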
Fortunately, we only require a property known as collision resistance, and there are many other vector commitment schemes in the literature which are more efficient and do not require all values for the opening. Depending on the one you choose, there are different trade-offs to consider.
Some trade-offs to consider are:
Proof creation time: how long it takes to make
Proof verification time: how long it takes to verify
Moreover, with some of the schemes in the wider literature, it is possible to aggregate many proofs together so one only needs to verify a single proof . With this in mind, it may be unsurprising that with verkle trees, the node width/vector size has increased substantially, since the proof size in the chosen scheme does not grow linearly with the node width.
Summary
Merkle trees use a vector commitment scheme whose proofs are inefficient: opening one value requires revealing every other element of each vector along the path.
Verkle trees use a commitment scheme which has better efficiency for proof size and allows one to minimise the proof size using aggregation.
Verkle trees also increase the node width, which decreases the depth of the tree.