# Lib_traced_storage
Most of the code related to these will be found in the `lib_traced_storage` library.
## Hashes
A hash is an abstract type that is under the hood just bytes. It does, however, expose the following functions:
```ocaml=
val empty : t
val to_bytes : t -> bytes
val digest : bytes -> t
val digest_couple : t -> t -> t
val of_bytes_exn : bytes -> t
val of_bytes_opt : bytes -> t option
```
allowing one to go back and forth beween bytes and `t` and to concat two hashes.
## Keys
A key will be a sequence of booleans reresenting a path in a binary tree. A key can be transformed back and forth from bytes and strings. that is
```ocaml=
key_to_string [ true; false; false] = "100"
```
## Common
This is the "common" type in the library. It has signature:
```ocaml=
type t
type nonrec key = Key.t
type nonrec value = bytes
val get_opt : t -> key -> t * value option
val get : t -> key -> t * value
val mem : t -> key -> t * bool
val set : t -> key -> value -> t
```
I will describe the implementation for each of these.
## Tree
This is the commons tree structure. Each of the modules bellow will implement this structure, possibly with some extra decoration.
```ocaml=
and t =
| Node of node_content
| Extender of extender_content
| Leaf of {content : bytes; hash : Hash.t}
| Empty
```
and
```ocaml=
type node_content = {left : t; right : t; hash : Hash.t}
and extender_content = {node : t; hash : Hash.t; key : Key.t}
```
It will also have some functions that allow for hashing.
Note that the leaf contains content as well as a hash. The nodes will only contain a hash that is the hash of the concatenation of the left and right hashes.
An extender will be a compact representation of an `empty` path. More precisely, the tree
```graphviz
strict graph {
node [margin=0 fontcolor=blue fixedsize=false shape=box style=filled]
0 [ label=" emptyemptyeemptya " ]
00 [ label="" shape=circle]
01 [ label=" emptyeemptya "]
010 [ label="" shape=circle]
011 [ label=" emptya " ]
0110 [ label="" shape=circle]
0111 [ label="a" ]
0 -- {01 00}
01 -- {011 010}
011 -- {0110 0111}
}
```
which should be represented by:
```ocaml=
Node {left = Empty;
right = Node
{left = Empty;
right = Node
{left = Empty;
right = Leaf
{content = "a";
hash = "a"};
hash = "emptyab"};
hash = "emptyemptyab"};
hash = "emptyemptyemptyab"}
```
will be represented by
```ocaml=
Extender {node = Leaf a;
key = [true; true; true];
hash = "a111"}
```
There will be three invariants that we need:
1. No `Node` can contain an empty child. That is
```ocaml=
Node {left = Empty; right = node ; hash = "empty"^h}
```
should be replaced by
```ocaml=
Extender {key =[true]; node ; hash = "h1"}
```
2. No extender should contain an extender as a node. More precisely
```ocaml=
Extender {key = key1;
node =
Extender {key = key2; node ; hash = "h"@key2};
hash = "h"^key2^key1}
```
Should be replaced by
```ocaml=
Extender {key = key1 @ key2;
node;
hash = "h"^(key1@key2)}
```
As a consequence of this, `Empty` in the empty tree or at the end of an `Extender`. We also want
3. No extender should en in `Empty`. If it does it should be replaced by `Empty`.
None of the 3 invariants are enforced in code but they cannot apper as consequences of `set`. If, later on we would want to remove values form leaves we will want to take care that these remain true. There are PBT tests checking these. Moreover there is some atavic code dealing with empty extenders just in case.
the Patricia trees are meant to be representations of `(key,value)` pairs. In particular one can get the value corrsponding with a key respectively set the value at a certain key.
## Patricia
### Getting values:
Given a key `k`, trying to `get tree k` will return:
1. `Failure` in the following two cases:
* The path corresponding to `k` continues pass a Leaf.
* The path corrsponding to `k` finishes either on a node or in the `middle` of an extender.
2. `None` if one of the two happen:
* The path corresponding to `k` continues past an empty.
* The path corresponding to `k` diverges from the path corrsponding to the key inside of an extender.
3. `Some a` if the path corrsponding to `k` ends on a Leaf whose content is `a`
### Examples:
Let us consider the following tree:
```graphviz
strict graph {
node [margin=0 fontcolor=blue fixedsize=false shape=box style=filled]
0 [ label="extender\n key = 101"; shape=oval ]
01 [ label="ab"; shape =diamond]
010 [ label="a"]
011 [ label=" b "]
0 -- {01}
01 -- {011 010}
}
```
Then `get tree path` gives:
* Failure if `path = 10, 101,10111, 10100,10100001` etc.
* `None if` `path =10000, 1100,11111` etc.
* `Some a` if `path = 1010` and `Some b` if `path = 1011`
### Setting values:
Given a key `k` and a value `val`, trying to `set tree k val` will return:
1. `Failure` in the following two cases:
* The path corresponding to `k` continues pass a Leaf.
* The path corrsponding to `k` finishes either on a node or in the `middle` of an extender.
2. If the path corrsponding to `k` ends on a Leaf or an empty, then we replace this last node with a leaf whose conent is `val`.
3. If the path corresponding to `k` continues past an empty, then one replaces the empty with an extender whose key is the remainder of `k` and whose node is leaf with content `value` as its node.
4. If the path corresponding to `k` diverges from the path corresponding to the key inside of an extender then we split the extender at that position and introduce a node with a dangling extender.
### Examples
Let us consider the following tree:
```graphviz
strict graph {
node [margin=0 fontcolor=blue fixedsize=false shape=box style=filled]
0 [ label="extender\n key = 101"; shape=oval ]
01 [ label="ab"; shape =diamond]
010 [ label="a"]
011 [ label=" b "]
0 -- {01}
01 -- {011 010}
}
```
then
* `set tree path val` will fail if path = 1, 10, 101, 1010000` etc.
* `set tree path "c"` where `path = 1110` gives:
```graphviz
strict graph {
node [margin=0 fontcolor=blue fixedsize=false shape=box style=filled]
0 [ label="extender\n key = 1"; shape=oval ]
00 [ label="ab"; shape =diamond]
01 [ label="ab"; shape =diamond]
000 [ label="extender\n key = 110"; shape=oval]
0000 [ label="c"]
010 [ label="a"]
011 [ label=" b "]
0 -- { 00}
00 --{000 01}
000 --{0000}
01 -- {011 010}
}
```
* `set tree path "c"` where `path = 1010` gives:
```graphviz
strict graph {
node [margin=0 fontcolor=blue fixedsize=false shape=box style=filled]
0 [ label="extender\n key = 101"; shape=oval ]
01 [ label="ab"; shape =diamond]
010 [ label="c"]
011 [ label=" b "]
0 -- {01}
01 -- {011 010}
}
```
For example consider the following valid tree (I will remove the visted as it is a technicality used in constructuctions. See [this](#The-visited-parameter) for some details):
```ocaml=
Node {left = Node {left = Empty ;
right = Leaf {content = "1"; hash = "h1"};
hash = "emptyh1";
}
right = Node {left = Leaf {content = "2"; hash = "h2"};
right = Leaf {content = "1"; hash = "h1"};
hash = "h2h3";};
hash ="emptyh1h2h3" }
```
## Streams
There are two types of streams, `Stream_produce.t` and `Stream_consume.t`. The types are abstract but, behind the hood they are lists of stream elements, that is elements of type:
```ocaml=
type stream_el =
| Empty
| Leaf of Bytes.t
| Node of Hash.t * Hash.t
| Extender of Hash.t * Key.t
```
They are meant to be a compact representation of traversing a piece of the tree.
`Stream_produce` has access to a function that `produce` of type `:t-> stream_el ->t` , (a function that adds a stream element to the stream) and to a function `to_consume` of type `:t-> Stream_consume.t`. In practice, produce preappends the stream element to the list and `to_consume` reverses the stream. There is also an encoding function.
`Stream_consume` has acces to an encoding function and to a `consume` function of type `t -> stream_el * t`. In practice the latter splits the head form the tail of the list.
## Patricia_Produce
This is the type that produces proofs for gets and sets. Its main type will contain a tree and a stream (the proof):
``` ocaml
t = tree * Stream_produce.t
```
where
```ocaml=
tree = {tree : VTree.tree; visited : bool}
```
and `VTree.tree` is a tree as above.
It also has methods for `get` and `set`. The result will be similar with above but, in addition, the functions will also add to stream. The stream will encode the nodes it has encountered in the path. The visited tag avoids overly eager streaming. The point of this is that a Stream_produce stream will be consumed in Patricia_consume allowing one to recover the important parts of the tree form just the root_hash, hence verifying proofs. The fact that a node has been visited means that it is aleady unfolded in the consumend part so you do not need to write again. Here is a simple example. Suppose $t$ is the following tree. We will color green to mean visited. Initially there are no visited nodes.
```graphviz
strict graph {
node [margin=0 fontcolor=blue fixedsize=false shape=box style=filled]
0 [ label="extender\n key = 101"; shape=oval ]
ab [ label="cdb"; shape =diamond]
cd [ label="cd"]
b [ label=" b "]
0 -- {ab}
ab -- {cd b}
cd -- {c d}
}
```
We now do
```ocaml=
get (t, Stream_produce.empty) 1011 = ((t', stream), Some b)
```
where `t' = {tree, visited=true}` is the following tree:
```graphviz
strict graph {
node [margin=0 fontcolor=blue fixedsize=false shape=box style=filled]
0 [ label="extender\n key = 101"; shape=oval; color=green]
ab [ label="cdb"; shape =diamond; color=green]
cd [ label="cd" ]
b [ label=" b "; color=green]
c []
0 -- {ab}
ab -- {cd b}
cd -- {c d}
}
```
and
```ocaml=
stream = [Leaf "b";
Node ("cd", "b");
Extender ("101", "cdb")]
```
Now if we apply
```ocaml=
get (t', stream) 10101 = ((t'', stream''), Some d)
```
Here t'' is
```graphviz
strict graph {
node [margin=0 fontcolor=blue fixedsize=false shape=box style=filled]
0 [ label="extender\n key = 101"; shape=oval; color=green]
ab [ label="cdb"; shape =diamond; color=green]
cd [ label="cd" ; color=green]
b [ label=" b "color=green]
c []
d [color=green]
0 -- {ab}
ab -- {cd b}
cd -- {c d}
}
```
and
and
```ocaml=
stream = [Leaf "d";
Node ("c", "d");
Leaf "b";
Node ("cd", "b");
Extender ("101", "cdb")]
```
Note that only the last two steps were added to the stream. We will see how this is consumed in teh next section.
## Patricia_consume
This is the main type that is exposed to the protocol. It is the verifier type. As before, its type will be `tree * stream` where stream is a `Stream_consume.t` while tree has type
```ocaml=
type tree = Hash of Hash.t
| Full of tree_view
```
and `tree_view` is essentially a `Tree.t`. The point is that `Hash h` represents the root_hash of a subtree. Apllying the `get` respectively `set` not only finds the value but also verifies the proofs. For example let assume that we are trying the examples from the Patricia_produce. Suppose we have the reverse of the final stream in that example.
```ocaml=
stream = [Extender ("101", "cdb");
Node ("cd", "b");
Leaf "b";
Node ("c", "d");
Leaf "d"]
```
if we apply
```ocaml=
get (Hash "cdb101", stream) 1011 = ((t', stream'), Some b)
```
where t' is the tree (we use circle hor a hash node):
```graphviz
strict graph {
node [margin=0 fontcolor=blue fixedsize=false shape=box style=filled]
0 [ label="extender\n key = 101"; shape=oval]
ab [ label="cdb"; shape =diamond]
cd [ label=" Hash cd " shape=circle]
b [ label="b"]
0 -- {ab}
ab -- {cd b}
}
```
and of course
```ocaml=
stream' = [Node ("c", "d");
Leaf "d"]
```
The point is that `consume_stream stream tree` does not consume the stream in the case when `tree` is of type `Full t` but only in the case of `Hash h`. Therefore applying
```ocaml=
get (t', stream') 10101 = ((t'', []), Some d)
```
where `t''` is almost the initial tree (the `Leaf c` is still a `Hash`):
```graphviz
strict graph {
node [margin=0 fontcolor=blue fixedsize=false shape=box style=filled]
0 [ label="extender\n key = 101"; shape=oval ]
ab [ label="cdb"; shape =diamond]
cd [ label="cd"]
b [ label=" b "]
c [shape=circle label=" Hash c "]
0 -- {ab}
ab -- {cd b}
cd -- {c d}
}
```