# Codec Generator
slide: https://hackmd.io/@wv-WdfVgQBqEZA28x12W8g/rJwZVyHbc
---
# Background
----
## Background: Tezos
- Tezos type $\cong$ Octez type
- self-amendment $\implies$ volatile type definitions
- up-to-date definitions: written in OCaml
----
## Background: Client Libraries
- Locally defined Octez-compatible types
- Must be updated with each new release
- Hard to define without reading OCaml
- Bugs are obscure, vary between libraries
---
# Terminology
----
## Schema
Description of the serial format of a particular type
* Abstract: model of the binary encoding used for a type
* Concrete: value of type `'a Encoding.t`
----
## Codec
A description/definition of:
* A type
* How to encode it
* How to decode it
* Tied to a particular encoding format
  * Here: binary (`data-encoding`)
----
## Target
* A possible output format for *codecs*
* Usually: a programming language
  * "target language"
* Also: non-code formats
  * 'Prose': human-readable type description
---
# Bigger Picture
```graphviz
digraph {
  compound=true
  rankdir=TB
  graph [ fontname="Source Sans Pro", fontsize=20 ];
  node [ fontname="Source Sans Pro", fontsize=16 ];
  edge [ fontname="Source Sans Pro", fontsize=12 ];
  subgraph precursor {
    c [label="Concept"] [shape=oval]
    m [label="Model"] [shape=box]
    c -> m
  }
  subgraph existing {
    d [label="Definition"] [shape=box]
    e [label="Schema"] [shape=box]
  }
  m -> e
  m -> d
  h [label="Hand-Coded"]
  cg [label="codec_generator"]
  e -> h
  d -> h
  cdc [label="Codec"] [shape=diamond]
  e -> cg -> cdc
  h -> cdc
}
```
---
# Case Study
---
## The Conceptual Type
The full name of a person
e.g.
- <i>Haskell Brooks Curry</i>
- <i>Alonzo Church</i>
---
## The Type Model
Record type with three fields:
- `first`: personal/given name
- `middle`: (optional) middle name
- `last`: family/surname
---
## The Definition
```ocaml=
open Data_encoding
module Person = struct
  type t = { first : string
           ; middle : string option
           ; last : string }
  let encoding : t Encoding.t =
    conv
      (fun {first; middle; last} -> (first, middle, last))
      (fun (first, middle, last) -> {first; middle; last})
      (obj3 (req "first" string)
            (opt "middle" string)
            (req "last" string))
end
```
---
## The Schema
| Field Name | Byte-Length | Type |
|-------------|---------------|---------------------------|
| `L0` | 4 bytes | 30-bit uint |
| first | `L0` | string |
| `o` | 1 byte | tag (`0x00` \| `0xff`) |
| `L1` | `o ? 4 : 0` | 30-bit uint |
| middle | `L1` | string |
| `L2` | 4 bytes | 30-bit uint |
| last | `L2` | string |
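
The table can be read as a byte layout. The following is a minimal sketch that writes the fields in table order; it is not the generated codec. The function names `put_dyn_string` and `layout_person` are hypothetical, and the sketch assumes that length prefixes are big-endian and that the tag byte is `0xff` when `middle` is present. Under those assumptions, *Alonzo Church* (no middle name) occupies 21 bytes: 4 + 6 for `first`, 1 tag byte, and 4 + 6 for `last`.

```rust
// Append a 4-byte length prefix followed by the UTF-8 bytes of `s`.
// Assumption: the prefix is big-endian; the schema limits it to 30 bits.
fn put_dyn_string(buf: &mut Vec<u8>, s: &str) {
    assert!(s.len() < (1 << 30), "length must fit in a 30-bit uint");
    buf.extend_from_slice(&(s.len() as u32).to_be_bytes());
    buf.extend_from_slice(s.as_bytes());
}

// Lay out a person in the order given by the schema table:
// L0, first, o, [L1, middle], L2, last.
pub fn layout_person(first: &str, middle: Option<&str>, last: &str) -> Vec<u8> {
    let mut buf = Vec::new();
    put_dyn_string(&mut buf, first);
    match middle {
        // Tag 0x00: no middle name, so L1 and middle take zero bytes.
        None => buf.push(0x00),
        // Tag 0xff: L1 and middle follow.
        Some(m) => {
            buf.push(0xff);
            put_dyn_string(&mut buf, m);
        }
    }
    put_dyn_string(&mut buf, last);
    buf
}
```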
---
## Simplification
- Input:
  - Schema:
    - `'a Encoding.t`
- Output:
  - Intermediate representation:
    - `Common.Simplified.t`
----
```ocaml
utop# open Common.Simplified;;
utop# of_encoding Person.encoding;;
- : t =
Prod
  (Record
    {id = None;
     fields =
       [("first", ... );
        ("middle", Comp (Opt (...)));
        ("last", ... )
       ]
    }
  )
(* ... = Comp (Dyn (`Uint30, Base (Var VString))) *)
```
---
## Generation
- Input:
  - Name to use for codec:
    - `string`
  - Simplified schema:
    - `Common.Simplified.t`
- Output:
  - Text content of \[Rust\] codec module
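
To make the input/output shape concrete, here is a toy sketch of the generation step. It is written in Rust for illustration only, whereas the simplified schema itself is an OCaml value. The types `Ty` and `Record` and the functions `render_ty` and `render_struct` are hypothetical, cover only the two shapes needed for `Person`, and do not reproduce `Common.Simplified.t` or the generator's actual code.

```rust
// Hypothetical, much-reduced mirror of a simplified schema.
pub enum Ty {
    // A string with a 30-bit length prefix.
    DynString,
    // An optional value of the inner type.
    Opt(Box<Ty>),
}

// A record is an ordered list of named fields.
pub struct Record {
    pub fields: Vec<(String, Ty)>,
}

// Render the Rust type expression for one field.
fn render_ty(ty: &Ty) -> String {
    match ty {
        Ty::DynString => "Dynamic<u30,String>".to_string(),
        Ty::Opt(inner) => format!("Option<{}>", render_ty(inner)),
    }
}

// Produce the text of a struct definition from a codec name and a record.
pub fn render_struct(name: &str, rec: &Record) -> String {
    let mut out = String::new();
    out.push_str("#[derive(Debug, Decode, Encode, Estimable)]\n");
    out.push_str(&format!("pub struct {} {{\n", name));
    for (field, ty) in &rec.fields {
        out.push_str(&format!("    {}: {},\n", field, render_ty(ty)));
    }
    out.push_str("}\n");
    out
}
```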
----
### Module Header
```rust=
extern crate rust_runtime;
use rust_runtime::{
    Decode,
    Dynamic,
    Encode,
    Estimable,
    ParseResult,
    Parser,
    u30
};
use decode_derive::Decode;
use encode_derive::Encode;
use estimable_derive::Estimable;
```
----
### Type Definition
```rust=14
#[derive(Debug, Decode, Encode, Estimable)]
pub struct Person {
    first: Dynamic<u30,String>,
    middle: Option<Dynamic<u30,String>>,
    last: Dynamic<u30,String>
}
```
----
### Encoding Function
```rust=20
pub fn person_write
    (val: &Person, buf: &mut Vec<u8>)
{
    Dynamic::<u30,String>
        ::write(&val.first, buf);
    Option::<Dynamic<u30,String>>
        ::write(&val.middle, buf);
    Dynamic::<u30,String>
        ::write(&val.last, buf);
}
```
----
### Decoding Function
```rust=30
pub fn person_parse<P: Parser>
    (p: &mut P) -> ParseResult<Person>
{
    Ok(Person {
        first: Dynamic::<u30,String>::parse(p)?,
        middle: Option::<Dynamic<u30,String>>::parse(p)?,
        last: Dynamic::<u30,String>::parse(p)?
    })
}
```
----
### Trait Implementations
(alternative to using `#[derive(Encode, Decode)]`)
```rust
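impl Encode for Person {
    // Delegate to the free-standing encoding function.
    fn write(&self, buf: &mut Vec<u8>) {
        person_write(self, buf)
    }
}
impl Decode for Person {
    // `person_parse` returns `ParseResult<Person>`, so `parse` must too.
    fn parse<P: Parser>(p: &mut P) -> ParseResult<Person> {
        person_parse(p)
    }
}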
```
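
For readers without the runtime crate at hand, the sketch below shows one way such traits could fit together end to end. Everything in it is a hypothetical stand-in: `ParseError`, `SliceParser`, the `take` method, and `DynString` (used in place of `Dynamic<u30,String>`) are illustrative names, the big-endian length prefix is an assumption, and the real `rust_runtime` definitions may differ.

```rust
#[derive(Debug, PartialEq)]
pub enum ParseError { UnexpectedEof, BadTag(u8), BadUtf8 }
pub type ParseResult<T> = Result<T, ParseError>;

// A parser hands out the next `n` bytes or fails.
pub trait Parser {
    fn take(&mut self, n: usize) -> ParseResult<Vec<u8>>;
}

pub struct SliceParser<'a> { data: &'a [u8], pos: usize }

impl<'a> SliceParser<'a> {
    pub fn new(data: &'a [u8]) -> Self { SliceParser { data, pos: 0 } }
}

impl<'a> Parser for SliceParser<'a> {
    fn take(&mut self, n: usize) -> ParseResult<Vec<u8>> {
        let end = self.pos.checked_add(n).ok_or(ParseError::UnexpectedEof)?;
        if end > self.data.len() { return Err(ParseError::UnexpectedEof); }
        let out = self.data[self.pos..end].to_vec();
        self.pos = end;
        Ok(out)
    }
}

pub trait Encode { fn write(&self, buf: &mut Vec<u8>); }
pub trait Decode: Sized { fn parse<P: Parser>(p: &mut P) -> ParseResult<Self>; }

// Stand-in for a length-prefixed string (assumed 4-byte big-endian prefix).
#[derive(Debug, PartialEq)]
pub struct DynString(pub String);

impl Encode for DynString {
    fn write(&self, buf: &mut Vec<u8>) {
        buf.extend_from_slice(&(self.0.len() as u32).to_be_bytes());
        buf.extend_from_slice(self.0.as_bytes());
    }
}
impl Decode for DynString {
    fn parse<P: Parser>(p: &mut P) -> ParseResult<Self> {
        let l = p.take(4)?;
        let len = u32::from_be_bytes([l[0], l[1], l[2], l[3]]) as usize;
        let raw = p.take(len)?;
        String::from_utf8(raw).map(DynString).map_err(|_| ParseError::BadUtf8)
    }
}

// Optional values: one tag byte, then the payload if present.
impl<T: Encode> Encode for Option<T> {
    fn write(&self, buf: &mut Vec<u8>) {
        match self {
            None => buf.push(0x00),
            Some(v) => { buf.push(0xff); v.write(buf); }
        }
    }
}
impl<T: Decode> Decode for Option<T> {
    fn parse<P: Parser>(p: &mut P) -> ParseResult<Self> {
        let tag = p.take(1)?[0];
        match tag {
            0x00 => Ok(None),
            0xff => Ok(Some(T::parse(p)?)),
            other => Err(ParseError::BadTag(other)),
        }
    }
}

#[derive(Debug, PartialEq)]
pub struct Person {
    pub first: DynString,
    pub middle: Option<DynString>,
    pub last: DynString,
}

impl Encode for Person {
    fn write(&self, buf: &mut Vec<u8>) {
        self.first.write(buf);
        self.middle.write(buf);
        self.last.write(buf);
    }
}
impl Decode for Person {
    fn parse<P: Parser>(p: &mut P) -> ParseResult<Self> {
        Ok(Person {
            first: DynString::parse(p)?,
            middle: Option::<DynString>::parse(p)?,
            last: DynString::parse(p)?,
        })
    }
}
```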
---
# Insights
(Observations)
----
### Insights: Input
- Not restricted to Octez schemas
- Can be tested more thoroughly
- Future-compatible:
  - Schemas in future versions
- Can link against `tezos-*` packages...
- ...but only with care
----
### Insights: Simplification
- Independent of supported/selected targets
- Lossy: models structure, not intent
- Sufficient for majority of schemas
- May require fixes in certain cases
----
### Insights: (Rust) Generation
- Generator, AST based on `rust_runtime`:
  - Library crate: common utility code
  - Included in `codec_generator` directory
---
# Future
----
## Internal Development
- Debugging/Verification:
  - Fix known bugs
  - More rigorous testing
- Facilitate Adoption:
  - Improve documentation
  - Optimize for external use
----
## Rust Client Libraries
- Supported Target-Language Libraries
- Review of design by Rust developers
- Adapt codecs, runtime API as needed
- Downstream pressure for active development
----
## Non-Rust Client Libraries
- Unsupported Target-Language Libraries
- Advice regarding new target language
- Existing hand-written codecs as model
- Runtime API closer to client code
---
# Questions
(Intermission)
- https://hackmd.io/@wv-WdfVgQBqEZA28x12W8g/ByBv6EVJq