https://github.com/crate-crypto/rust-verkle
A stateless client is a key milestone on the Ethereum roadmap. The state of the current Ethereum execution layer is stored in an MPT structure. Although an MPT could also support stateless clients, the corresponding proof size is too large to be broadcast to all nodes within one block interval:
Verkle trees solve a key problem standing in the way of Ethereum being stateless-client-friendly: witness sizes. A witness accessing an account in today’s hexary Patricia tree is, in the average case, close to 3 kB, and in the worst case it may be three times larger. Assuming a worst case of 6000 accesses per block (15m gas / 2500 gas per access), this corresponds to a witness size of ~18 MB, which is too large to safely broadcast through a p2p network within a 12-second slot. Verkle trees reduce witness sizes to ~200 bytes per account in the average case, allowing stateless client witnesses to be acceptably small.
Verkle trees, by contrast, greatly reduce the proof size.
A Verkle tree ("Vector commitment" + "Merkle Tree") is a trie-based data structure. Its goal is to store a collection of key-value pairs, support inserting, updating, and reading them efficiently, and provide a proof for any given set of KVs showing that the value returned for each key is indeed the stored value, or that the key is absent.
Here is a brief primer for readers who are not familiar with commitments and proofs.
Consider a scenario in which Alice and Bob play a game:
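The concrete steps of the game appear to have been lost here; the following is a minimal sketch of the standard KZG flow it refers to (the notation is mine, not taken from the original):

1. Alice has a polynomial $f(x)$ of degree $< n$ and publishes a commitment $C = [f(s)]_1$, where $s$ is a secret point baked into the trusted setup $\{[s^i]_1\}$.
2. Bob picks a random challenge $z$ and sends it to Alice.
3. Alice replies with $y = f(z)$ and a proof $\pi = \left[\frac{f(s) - y}{s - z}\right]_1$.
4. Bob checks the pairing equation
$$e(C - [y]_1,\ [1]_2) = e(\pi,\ [s - z]_2),$$
which holds only if the committed polynomial really evaluates to $y$ at $z$.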
The above is a simplified description of KZG. For details, please refer to https://hackmd.io/DA-UGXFFRzCCznM1JBuZcA
To sum up, Bob never learns the coefficients of $f(x)$, but with high probability he is convinced that Alice knows $f(x)$. Which $f(x)$ does he believe Alice knows? The one corresponding to the commitment $C$.
Back to the Verkle tree: you do not need to know the whole state (all the KV pairs). You only need to know the state root in advance, which acts like a commitment to all the KVs, plus the proofs for the relevant KVs, and you can then verify that those KVs are indeed under the tree represented by that state root.
Figure 1
A typical Verkle tree is shown in the figure above and contains three types of nodes; following the naming in rust-verkle, they are the `Branch` node, the `Stem` node, and the `Leaf` node. Note that although the values inside the nodes in the figure are written with the same symbols, the values in different nodes are not necessarily equal. Here $C$ denotes the commitment of a `Branch` node, which is a point on the elliptic curve; $H_c$ denotes the hash of that commitment, which is an element of the curve's scalar field; $C_1$ and $C_2$ denote the commitments of a `Stem` node; and $V_i$ denotes the actual data to be stored.
We can draw conclusions based on the above figure:
- The root is a `Branch` node.
- The children of a `Branch` node may be `Branch` nodes or `Stem` nodes.
- The parent of a `Stem` node has multiple child nodes; if it had only one child, that child would be optimized into the stem.
- The children of a `Stem` node are `Leaf` nodes, and the parent of a `Leaf` node is a `Stem` node.
- A `Branch`/`Stem` node has at most 256 child nodes.
- Keys are `[u8;32]`, so the entire tree is at most 32 levels deep.
Other information:

- The key of a `Stem` node is 31 bytes. Its path in the tree only yields a prefix of that key (from part of it up to all of it, depending on the node's depth), so the key must be stored in an additional field.
- A `Leaf` node's key is formed from these 31 bytes plus the index of its position under the `Stem` node, giving 32 bytes.
- Given a `Stem` node's key and its depth `d`, its path is `path = key[0..d]` (see the sketch below).
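A minimal sketch of these key/stem/path relationships (illustrative helpers, not the crate's API):

```rust
fn stem_of(key: &[u8; 32]) -> [u8; 31] {
    // The stem key is the first 31 bytes of the 32-byte leaf key.
    let mut stem = [0u8; 31];
    stem.copy_from_slice(&key[..31]);
    stem
}

fn leaf_index(key: &[u8; 32]) -> u8 {
    // The last byte is the leaf's index under its stem node.
    key[31]
}

fn path_of_stem(stem: &[u8; 31], depth: usize) -> &[u8] {
    // The path of a stem node at depth `d` is the first `d` bytes of its key.
    &stem[..depth]
}
```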
Questions:

- What is the purpose of the `Stem` node? It is an optimization of the trie; MPT has a similar optimization: when consecutive nodes in the trie each have only one child, they can be compressed into a single node, which reduces the depth of the tree.
- When a `2-ary` tree is used to represent the `256-ary` tree, the figure below is a simulation of the actual situation. Because all keys are obtained by a Pedersen hash of the address and other information, they can be regarded as random numbers, so leaf nodes are evenly distributed across the branches starting from the root. Without `Stem` nodes, beyond a depth of 3-4 most nodes would look as shown in that figure, i.e. they would degenerate into a linked list. Some data show that in practice the average depth of a Verkle tree is 3.8, which is reasonable ($256^{3.8} \approx 1.4\text{B}$, while the number of Ethereum addresses is about 0.268B), so the capacity is sufficient. I do have a question about the design of verkle trees: here a binary tree is used to represent a 256-ary tree. My understanding is that since all keys are the results of pedersen_hash and are relatively random, the keys will be evenly distributed in each branch starting from the root. In actual situations, the structure of the tree should be like Figure 1: the first 3-4 layers are almost a balanced tree, and the subsequent nodes degenerate into linked lists. Stem nodes are then used to represent the nodes that degenerate into linked lists, as an optimization. My questions are: 1. Why not put the nonce, balance, and even some storage slots of the same address under the same stem node? When a transaction occurs, these fields tend to be updated at the same time, and this would reduce the number of stem-node and branch-node updates required. The actual situation is that they are randomly distributed across different stem nodes. 2. Since all leaf nodes are randomly distributed, why was the layout of Figure 2 chosen, which branches into 256 child nodes after the stem node? I think the actual situation is that under a stem node, in most cases only one child has a value, and the others are Null.
- The drawback of the `Stem` node is that if Leaf nodes 2 and 4, or 3 and 4, exist at the same time, it no longer helps reduce the depth, and this is a possible DDoS attack surface. You do not even need the private key of the relevant address to attack: for example, by transferring a tiny amount of ether as balance to the accounts represented by Leaf nodes 2 and 3, I can make the Verkle tree deeper. The solution is that the key is not generated directly from the account address but is hashed first, which is reflected in EIP-6800.
- The key of a `Stem` node is designed to be 31 bytes; since a `Stem` node cannot be the root, couldn't at least its first byte be left unstored and obtained directly from the `Stem` node's path? My understanding is that this is just an implementation-level choice; memoryDB simply stores it as `[u8;31]`.
- With the `Verkle Tree`, there is no longer a need to create a separate storage tree for a contract account under the account. How is this achieved? The potential key space of the Verkle tree is very large ($2^{256}$), so the storage tree is embedded into the Verkle tree in the form described below.

The three types of nodes correspond to three tables: leaf_table holds the 32-byte KV pairs; stem_table maps a 31-byte key to the stem's metadata (its commitments and their hashes); branch_table uses variable-length bytes (the path) as the key, and the node under that path may be a `Stem` or a `Branch`.
The picture above shows the result of printing Figure 1: you can see 5 leaf nodes, 31 branch nodes, and 4 stem nodes. Note that branch_table contains both the 31 branch nodes and the 4 stem nodes.
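A rough sketch of this three-table layout; the type and field names below are assumptions for illustration, not the exact definitions in rust-verkle:

```rust
use std::collections::HashMap;

// Placeholder types: in rust-verkle these are a curve point and a scalar-field element.
type Element = [u8; 64];
type Fr = [u8; 32];

// Metadata kept for a stem node: its two suffix commitments, its own commitment,
// and the hashes of those commitments.
struct StemMeta {
    c_1: Element,
    hash_c1: Fr,
    c_2: Element,
    hash_c2: Fr,
    stem_commitment: Element,
    hash_stem_commitment: Fr,
}

// Metadata kept for a branch node: its commitment and the hash of that commitment.
struct BranchMeta {
    commitment: Element,
    hash_commitment: Fr,
}

// An entry under a path may be a child branch node or a stem.
enum BranchChild {
    Branch(BranchMeta),
    Stem([u8; 31]),
}

struct MemoryDb {
    leaf_table: HashMap<[u8; 32], [u8; 32]>,     // 32-byte key -> 32-byte value
    stem_table: HashMap<[u8; 31], StemMeta>,     // 31-byte stem key -> stem metadata
    branch_table: HashMap<Vec<u8>, BranchChild>, // variable-length path -> branch/stem
}
```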
In fact, Ethereum's state trie only needs to implement `TrieTrait`; that is, it only needs to expose the handful of functions listed below, regardless of whether the underlying structure is a Verkle Tree or SALT. Here we first analyze how this would be implemented with a Verkle Tree.
- `insert`: given a set of KV pairs, save them. In MemoryDB this means inserting into leaf_table.
- `insert_single`: calls `insert` directly.
- `get`: given a key, return its value. In MemoryDB this means reading the KV pair from leaf_table.
- `root_hash`: the hash of the root node's commitment.
- `root_commitment`: the root node's commitment.
- `create_verkle_proof`: given a set of keys, return a VerkleProof.

Apart from `insert_single`, let's look at how the remaining five interfaces are implemented (a sketch of the trait is given below).
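A sketch of what this interface looks like, following the descriptions above; the exact signatures and the real `Fr` / `Element` / `VerkleProof` types come from the crate and may differ in detail:

```rust
// Placeholder types standing in for the crate's scalar, curve point, and proof types.
type Fr = [u8; 32];
type Element = [u8; 64];
struct VerkleProof;

pub trait TrieTrait {
    /// Insert a batch of key/value pairs.
    fn insert(&mut self, kv: impl Iterator<Item = ([u8; 32], [u8; 32])>);

    /// Insert a single key/value pair; delegates to `insert`.
    fn insert_single(&mut self, key: [u8; 32], value: [u8; 32]);

    /// Return the value stored under `key`, if any.
    fn get(&self, key: [u8; 32]) -> Option<[u8; 32]>;

    /// The hash of the root node's commitment.
    fn root_hash(&self) -> Fr;

    /// The root node's commitment.
    fn root_commitment(&self) -> Element;

    /// Produce a proof for a set of keys.
    fn create_verkle_proof(&self, keys: impl Iterator<Item = [u8; 32]>) -> VerkleProof;
}
```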
First, let’s take a look at the definition of Trie. Trie has two fields, storage and committer. Their types are both generic. There are no requirements for the Storage type. The PolyCommit type also needs to implement the Committer trait. Let’s focus on storage first.
For a Trie whose `Storage` type implements the `ReadWriteHigherDb` trait, `get` simply calls `self.storage.get_leaf(key)`, because the `ReadWriteHigherDb` trait provides a `get_leaf` interface.

So how does MemoryDb implement `get_leaf`? The value can be fetched directly with a `get` on the leaf_table HashMap.
I think there is no need to convert types. It can be achieved like this:
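The snippet that originally followed seems to be missing; a minimal sketch of such a direct lookup, assuming leaf_table is a `HashMap<[u8; 32], [u8; 32]>`, might be:

```rust
use std::collections::HashMap;

struct MemoryDb {
    leaf_table: HashMap<[u8; 32], [u8; 32]>,
}

impl MemoryDb {
    fn get_leaf(&self, key: [u8; 32]) -> Option<[u8; 32]> {
        // Return the value straight from the HashMap, with no intermediate conversion.
        self.leaf_table.get(&key).copied()
    }
}
```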
Complexity: in the MemoryDB implementation, `get` is on average an $O(1)$ operation, because it directly calls the HashMap's `get`.
Similarly, for a Trie whose storage type `S` implements the `ReadWriteHigherDb` trait, since the trait provides a `get_branch_meta` interface, you can directly call `self.storage.get_branch_meta` with the root node's path `&[]` to obtain the root node's metadata, including its commitment and hash.

And how does MemoryDb implement `get_branch_meta`? Since branch_table is stored as a HashMap, the value can be fetched directly. However, as described before, the value has two variants, Branch and Stem; if it is a Stem the code simply panics, because the path you passed in belongs to a Stem while you are asking for a Branch here.
Complexity: in the MemoryDB implementation, `root_hash` and `root_commitment` are on average $O(1)$ operations, because they directly call the HashMap's `get`.
Insert is more complicated; we will analyze it using the schematic diagram together with the code.

From the implementation below you can see that it involves two stages: each KV pair to be inserted is first converted into a series of small operations (instructions), and then the instructions are applied to the tree.

Note, however, that the KV pairs are processed serially: the next KV is only handled after the previous one has been applied. One optimization idea is to preprocess the KVs so that KVs sharing a common path are processed together, which would reduce the number of updates to the stem and branch nodes on the shared path (a sketch of the grouping step follows).
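A sketch of the grouping step of that idea (not part of rust-verkle): collect the keys to be inserted by their 31-byte stem, so that keys sharing a stem, and therefore a path, can be applied together and the shared commitments updated once per group.

```rust
use std::collections::HashMap;

fn group_by_stem(
    kvs: Vec<([u8; 32], [u8; 32])>,
) -> HashMap<[u8; 31], Vec<([u8; 32], [u8; 32])>> {
    let mut groups: HashMap<[u8; 31], Vec<([u8; 32], [u8; 32])>> = HashMap::new();
    for (key, value) in kvs {
        // Keys that share the first 31 bytes live under the same stem node.
        let mut stem = [0u8; 31];
        stem.copy_from_slice(&key[..31]);
        groups.entry(stem).or_default().push((key, value));
    }
    groups
}
```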
There are three types of instructions: `UpdateLeaf`, `ChainInsert`, and `InternalNodeFallThrough`.
Updating the leaf node will update all nodes on the path from the root to the leaf node:
Information included:

- `key`, `new_leaf_value`: used for the first update.
- `depth`: the depth of the leaf node.
- `branch_id`: the path of the stem node this leaf node belongs to.
- `branch_child_index`: the position of the leaf node under the stem node.

One `UpdateLeaf` scenario is the one shown below: leaf node 6 is added on top of the existing tree. The fields above are sketched as a struct next.
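A sketch of the information an `UpdateLeaf` instruction carries, following the field list above (the exact definition in rust-verkle may differ):

```rust
struct UpdateLeaf {
    key: [u8; 32],            // the 32-byte leaf key
    new_leaf_value: [u8; 32], // the value to write; used for the first update
    depth: u8,                // depth of the leaf node in the tree
    branch_id: Vec<u8>,       // path of the stem node this leaf belongs to
    branch_child_index: u8,   // the leaf's index under its stem node
}
```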
Let's first look at how the `UpdateLeaf` instruction is processed in `process_instructions`:
As shown in the figure above, if the leaf node previously had a value, `update_leaf_table`, `update_stem_table`, and `update_branch_table` are called in sequence. If the return value is empty, only `update_leaf_table` is needed, which feels a bit strange.
Look at the implementation of these three updates:
update_leaf_table:
It calls the storage's `insert_leaf` method, which in turn calls the HashMap's `insert`: if the key already had a value, the old value is returned; otherwise it returns None.

To summarize: if the leaf node previously had no value, `update_leaf_table` returns None and `UpdateLeaf` executes only `update_leaf_table`; otherwise it returns the update result `LeafUpdated` and goes on to execute `update_stem_table` and `update_branch_table`.
update_stem_table

The children of a branch node are hashes, i.e. scalar-field elements in the range `[0, order)`, so they can be committed to directly. The children of a stem node, however, are KV pairs; the object being committed to is V, whose range is `[0, 2^256)`, and the part that exceeds the field order cannot be committed to directly. Therefore two commitments are required. How should the values be split? A natural idea is to put bits [0,127] of each of the 256 child values V into the first commitment $C_1$ and bits [128,255] into $C_2$; but then updating any single value V would require updating both $C_1$ and $C_2$. So a different partition is used: $C_1$ commits to the values with index 0-127, and $C_2$ commits to the values with index 128-255. The concrete implementation: a stem node has 256 leaf nodes $V_0 \sim V_{255}$ under it; assuming each $V_i$ is a `[u8; 32]`, then:
Vi_l = Fr::from_le_bytes_mod_order(V_i[0..=15]) + 2^128
Vi_h = Fr::from_le_bytes_mod_order(V_i[16..=31])
stem: [u8;31] = Ki[0..=30]
If the node with index `i` does not exist, `Vi` is taken to be 0.
Assume the commitment basis is $G_0, \dots, G_{255}$. In fact, the $C$ of every branch node and the $C_1$, $C_2$ of every stem node all use the same set of commitment bases. The calculation formula is:
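The formula itself appears to have been lost in extraction; reconstructing it from the value-splitting rules above and the general verkle tree design (treat the exact indexing as my assumption):

$$
C_1 = \sum_{i=0}^{127} \big( V_{i,l}\, G_{2i} + V_{i,h}\, G_{2i+1} \big), \qquad
C_2 = \sum_{i=128}^{255} \big( V_{i,l}\, G_{2(i-128)} + V_{i,h}\, G_{2(i-128)+1} \big),
$$

and the stem node's own commitment and hash are

$$
C = 1 \cdot G_0 + \mathrm{stem} \cdot G_1 + H(C_1)\, G_2 + H(C_2)\, G_3, \qquad H_c = H(C),
$$

where $H(\cdot)$ maps a curve point to a scalar-field element.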
This way of calculating the commitments is also the source of the green nodes in the figure below,
from https://blog.ethereum.org/2021/12/02/verkle-tree-structure
I won't paste all of the code here since there is quite a lot of it; you can view it yourself at:
https://github.com/crate-crypto/rust-verkle/blob/e07b17b0538fcd227c221d2388e5468417116f00/verkle-trie/src/trie.rs#L537-L675
Note that the additive homomorphism of points on the elliptic curve is used as an optimization. Assume that $V_i$ is updated; then:
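(The equations that followed appear to be missing; a reconstruction of the delta update under the indexing assumed above, for an index $0 \le i \le 127$:)

$$
C_1^{\text{new}} = C_1^{\text{old}} + \big(V_{i,l}^{\text{new}} - V_{i,l}^{\text{old}}\big)\, G_{2i} + \big(V_{i,h}^{\text{new}} - V_{i,h}^{\text{old}}\big)\, G_{2i+1},
$$

$$
C^{\text{new}} = C^{\text{old}} + \big(H(C_1^{\text{new}}) - H(C_1^{\text{old}})\big)\, G_2 .
$$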
The following is my step-by-step calculation, based on the formula above, checking that the result equals the commitments in the updated trie:
https://github.com/flyq/verkle_tree_example/blob/da877d33309ee036661c4408a519b27357dbd335/src/main.rs#L14-L47
Finally, call insert_stem to insert directly on the stem_table of MemoryDb:
update_branch_table
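The formula that introduced this section appears to be missing; a reconstruction of the branch-node commitment, per the general verkle design (notation mine):

$$
C = \sum_{i=0}^{255} H(C_i)\, G_i, \qquad H_c = H(C),
$$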
where $C_i$ denotes the commitment of the $i$-th child node under this branch node, which may be a branch node or a stem node.
The above code implementation also takes advantage of the additive homomorphism of elliptic curve points:
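(Again the equation seems to be missing; the delta update implied by the additive homomorphism, assuming the child at index $j$ changed:)

$$
C^{\text{new}} = C^{\text{old}} + \big(H(C_j^{\text{new}}) - H(C_j^{\text{old}})\big)\, G_j .
$$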
The following is my step-by-step calculation, based on the original formula, checking that the result equals the value in the trie:
This is used to update C and Hc of a special type of branch node. The child nodes of this type of branch node are also branch nodes, not stem nodes. And after the trie has updated the child nodes of the branch node, the current branch node is updated.
This instruction is used to update a series of branch nodes under a certain path. I won’t post the source code, which is in: https://github.com/crate-crypto/rust-verkle/blob/e07b17b0538fcd227c221d2388e5468417116f00/verkle-trie/src/trie.rs#L362-L485
The verkle proof relies on the implementation of IPA (the inner-product argument).

Let's walk through the `create` function in `ipa-multipoint/src/ipa.rs`:
Parameters:

- `transcript`: used for the Fiat-Shamir transform, i.e. the interactive part of IPA (generating random challenge values) is replaced by a random oracle. The transcript plays the role of the random oracle, and the verifier can obtain the same challenge values by following the same steps, so the transcript does not need to be transmitted.
- `crs`: the Common Reference String, used here to hold the bases of the Pedersen commitments, i.e. 256 + 1 linearly independent points (`Element`).
- `a_vec`: the vector $\vec{a}$.
- `a_comm`: the commitment to $\vec{a}$, equal to $\sum_i a_i G_i$.
- `b_vec`: the vector $\vec{b}$.
- `input_point`: the value of the polynomial's argument, used only for the transcript.

Logic:
Output:
proof:
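The description of the output seems to have been truncated; from memory of the crate (treat the exact names as assumptions), the proof is essentially the per-round folding commitments plus the fully folded scalar:

```rust
// Placeholder types: in the crate these are a curve point and a scalar-field element.
type Element = [u8; 64];
type Fr = [u8; 32];

// Roughly: one (L, R) pair per halving round (8 rounds for vectors of length 256),
// plus the final folded element of vector a.
pub struct IPAProof {
    pub l_vec: Vec<Element>,
    pub r_vec: Vec<Element>,
    pub a: Fr,
}
```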
verify function
Parameters:

- `transcript`: same as in `create`.
- `crs`: same as in `create`.
- `b`: same as `b_vec` in `create`.
- `a_comm`: same as in `create`.
- `input_point`: the input point, as in `create`.
- `output_point`: the result, equal to `f(input_point)`, and also equal to $\langle \vec{a}, \vec{b} \rangle$.

Compared with `create`, the parameters omit the vector $\vec{a}$ and add `output_point`.

Logic:
To summarize the above:

The process: the vectors are repeatedly folded in half, with the prover sending an $L_j$ and an $R_j$ in each round, until the vectors have length 1. In addition, the final folded scalar $a$ is sent to the verifier.

Verification:
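The equations of this summary appear to be lost. As a reminder of the general shape of a Bulletproofs-style inner-product argument (one common formulation; rust-verkle's variant differs in details, for example $\vec b$ is public and is not committed):

In each round, split $\vec a = (\vec a_L, \vec a_R)$, $\vec b = (\vec b_L, \vec b_R)$, $\vec G = (\vec G_L, \vec G_R)$, and the prover sends

$$
L = \langle \vec a_L, \vec G_R \rangle + \langle \vec a_L, \vec b_R \rangle\, Q, \qquad
R = \langle \vec a_R, \vec G_L \rangle + \langle \vec a_R, \vec b_L \rangle\, Q .
$$

After receiving the challenge $x$, both sides fold

$$
\vec a' = \vec a_L + x\, \vec a_R, \qquad
\vec b' = \vec b_L + x^{-1} \vec b_R, \qquad
\vec G' = \vec G_L + x^{-1} \vec G_R,
$$

and the commitment folds as $P' = P + x^{-1} L + x\, R$. This repeats until the vectors have length 1; the prover sends the single scalar $a$, and the verifier checks

$$
P' \stackrel{?}{=} a\, G' + a\, b'\, Q .
$$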
verify_multiexp:
- `PrecomputedWeights.barycentric_weights`: in the verkle tree setting this holds 256 + 256 values (the barycentric weights for the 256 domain points and their inverses).
- `PrecomputedWeights.inverted_domain`: in the verkle tree setting this holds 255 + 255 values (the inverses $1/k$ and $-1/k$ for $k = 1, \dots, 255$).

For the polynomial $f$ obtained by Lagrange interpolation from the points $(x_i, y_i)$:

- `LagrangeBasis.domain` is 256.
- `LagrangeBasis.values` is the list of evaluations $y_i$ of $f$ on the domain.

Methods:

- `divide_by_linear_vanishing`: computes the quotient polynomial $\frac{f(x) - f(x_m)}{x - x_m}$, i.e. division by the linear vanishing polynomial of a domain point $x_m$.
- `evaluate_in_domain`: when the evaluation point is in the domain, directly take the stored value.
- `evaluate_lagrange_coefficients`: when the evaluation point $z$ is not in the domain, return the Lagrange coefficients $L_i(z)$, from which $f(z) = \sum_i L_i(z)\, y_i$.
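For reference, the barycentric form these structures support, assuming the domain is the points $x_i = i$ for $0 \le i \le 255$ (standard formulas, not copied from the code):

$$
A(x) = \prod_{i=0}^{255} (x - x_i), \qquad
L_i(z) = \frac{A(z)}{A'(x_i)\,(z - x_i)}, \qquad
f(z) = \sum_{i=0}^{255} L_i(z)\, y_i .
$$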
Having covered the dependent data structures, let's see how `open` is implemented.

`open()` -> `MultiPointProof`

Input:

- `crs`: the bases of the Pedersen commitment.
- `precomp`: precomputed values (the `PrecomputedWeights` described above).
- `transcript`: for the Fiat-Shamir transform.
- `queries`: an array of length $t$, representing $t$ polynomials.

Logic:
Polynomials opened at the same point are collected into a `HashMap<point, Vec<(f_i, r^i)>>`, which is then aggregated: all the $r^i f_i$ that share an evaluation point are summed. We then have a vector consisting of these aggregated polynomials (denote them $A_j$), one per distinct point. $h$ equals the number of distinct points, and is also the length of the vector above.
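For context, this aggregation matches the multipoint scheme rust-verkle follows (Dankrad Feist's multiproof technique); in my notation, with $z_j$ the distinct evaluation points:

$$
A_j(x) = \sum_{i:\, x_i = z_j} r^{i} f_i(x), \qquad
g(x) = \sum_{j=1}^{h} \frac{A_j(x) - A_j(z_j)}{x - z_j}
     = \sum_{i} r^{i}\, \frac{f_i(x) - y_i}{x - x_i} .
$$

The prover then commits to $g(x)$, and the rest of the protocol reduces these $h$ openings to a single IPA opening at a random point.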