# Codec Generator

slide: https://hackmd.io/@wv-WdfVgQBqEZA28x12W8g/rJwZVyHbc

---

# Background

----

## Background: Tezos

- Tezos type $\cong$ Octez type
- self-amendment $\implies$ volatile type definitions
- up-to-date definitions: written in OCaml

----

## Background: Client Libraries

- Locally defined Octez-compatible types
- Must be updated with each new release
- Hard to define without reading OCaml
- Bugs are obscure, vary between libraries

---

# Terminology

----

## Schema

Description of the serial format of a particular type

* Abstract: model of the binary encoding used for a type
* Concrete: value of type `'a Encoding.t`

----

## Codec

A description/definition of:

* A type
* How to encode it
* How to decode it
* Tied to a particular encoding format
  * = binary (`data-encoding`)

----

## Target

* A possible output format for *codecs*
* Usually: a programming language
  * "target language"
* Also: non-code formats
  * 'Prose': human-readable type description

---

# Bigger Picture

```graphviz
digraph {
  compound=true
  rankdir=TB

  graph [ fontname="Source Sans Pro", fontsize=20 ];
  node [ fontname="Source Sans Pro", fontsize=16];
  edge [ fontname="Source Sans Pro", fontsize=12 ];

  subgraph precursor {
    c [label="Concept"] [shape=oval]
    m [label="Model"] [shape=box]
    c -> m
  }

  subgraph existing {
    d [label="Definition"] [shape=box]
    e [label="Schema"] [shape=box]
  }

  m -> e
  m -> d

  h [label="Hand-Coded"]
  cg [label="codec_generator"]

  e -> h
  d -> h

  cdc [label="Codec"] [shape=diamond]

  e -> cg -> cdc
  h -> cdc
}
```

---

# Case Study

---

## The Conceptual Type

The full name of a person, e.g.
- <i>Haskell Brooks Curry</i>
- <i>Alonzo Church</i>

---

## The Type Model

Record type with three fields:

- `first`: personal/given name
- `middle`: (optional) middle name
- `last`: family/surname

---

## The Definition

```ocaml
open Data_encoding

module Person = struct
  type t = { first : string ; middle: string option ; last: string }

  let encoding : t Encoding.t =
    conv
      (* record -> tuple, used when encoding *)
      (fun {first;middle;last} -> (first,middle,last))
      (* tuple -> record, used when decoding *)
      (fun (first,middle,last) -> {first;middle;last})
      (obj3
        (req "first" string)
        (opt "middle" string)
        (req "last" string))
end
```

---

### The Schema

| Field Name  | Byte-Length   | Type                      |
|-------------|---------------|---------------------------|
| `L0`        | 4 bytes       | 30-bit uint               |
| first       | `L0`          | string                    |
| `o`         | 1 byte        | tag (`0x00` \| `0xff`)    |
| `L1`        | `o ? 4 : 0`   | 30-bit uint               |
| middle      | `L1`          | string                    |
| `L2`        | 4 bytes       | 30-bit uint               |
| last        | `L2`          | string                    |

---

## Simplification

- Input:
  - Schema:
    - `'a Encoding.t`
- Output:
  - Intermediate representation:
    - `Common.Simplified.t`

----

```ocaml
utop# open Common.Simplified;;
utop# of_encoding Person.encoding;;
- : t =
Prod
  (Record
    {id = None;
     fields =
      [("first", ... );
       ("middle", Comp (Opt (...)));
       ("last", ... )
      ]
    }
  )
(* ...
   = Comp (Dyn (`Uint30, Base (Var VString))) *)
```

---

## Generation

- Input:
  - Name to use for codec:
    - `string`
  - Simplified schema:
    - `Common.Simplified.t`
- Output:
  - Text content of \[Rust\] codec module

----

### Module Header

```rust
extern crate rust_runtime;
use rust_runtime::{
    Decode, Dynamic, Encode, Estimable,
    ParseResult, Parser, u30
};
use decode_derive::Decode;
use encode_derive::Encode;
use estimable_derive::Estimable;
```

----

### Type Definition

```rust
#[derive(Debug, Decode, Encode, Estimable)]
pub struct Person {
    first: Dynamic<u30, String>,
    middle: Option<Dynamic<u30, String>>,
    last: Dynamic<u30, String>,
}
```

----

### Encoding Function

```rust
pub fn person_write(val: &Person, buf: &mut Vec<u8>) {
    // Fields are written in schema order.
    Dynamic::<u30, String>::write(&val.first, buf);
    Option::<Dynamic<u30, String>>::write(&val.middle, buf);
    Dynamic::<u30, String>::write(&val.last, buf);
}
```

----

### Decoding Function

```rust
pub fn person_parse<P: Parser>(p: &mut P) -> ParseResult<Person> {
    // Fields are parsed in schema order; `?` propagates parse errors.
    Ok(Person {
        first: Dynamic::<u30, String>::parse(p)?,
        middle: Option::<Dynamic<u30, String>>::parse(p)?,
        last: Dynamic::<u30, String>::parse(p)?,
    })
}
```

----

### Trait Implementations

(alternate strategy to using `#[derive(Encode, Decode)]`)

```rust
impl Encode for Person {
    fn write(&self, buf: &mut Vec<u8>) {
        person_write(self, buf)
    }
}

impl Decode for Person {
    // Delegates to the free function; failures propagate via `ParseResult`.
    fn parse<P: Parser>(p: &mut P) -> ParseResult<Person> {
        person_parse(p)
    }
}
```

---

# Insights (Observations)

----

### Insights: Input

- Not restricted to Octez schemas
- Can be tested more thoroughly
- Future-compatible:
  - Schemas in future versions
- Can link against `tezos-*` packages...
- ...but only with care

----

### Insights: Simplification

- Independent of supported/selected targets
- Lossy: models structure, not intent
- Sufficient for majority of schemas
- May require fixes in certain cases

----

### Insights: (Rust) Generation

- Generator, AST based on `rust_runtime`:
  - Library crate: common utility code
  - Included in `codec_generator` directory

---

# Future

----

## Internal Development

- Debugging/Verification:
  - Fix known bugs
  - More rigorous testing
- Facilitate Adoption:
  - Improve documentation
  - Optimize for external use

----

## Rust Client Libraries

- Supported Target-Language Libraries
  - Review of design by Rust developers
  - Adapt codecs, runtime API as needed
  - Downstream pressure for active development

----

## Non-Rust Client Libraries

- Unsupported Target-Language Libraries
  - Advice regarding new target language
  - Existing hand-written codecs as model
  - Runtime API closer to client code

---

# Questions (Intermission)

- https://hackmd.io/@wv-WdfVgQBqEZA28x12W8g/ByBv6EVJq
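
---

## Appendix: The Schema by Hand

The byte layout in the schema table for `Person` can be sketched by hand as follows. This is only an illustration of the layout, not output of `codec_generator` and not the `rust_runtime` API: the names `encode_person`, `decode_person`, `write_string`, `read_string`, and `take` are hypothetical, plain `String` stands in for `Dynamic<u30,String>`, and the length prefixes are assumed to be big-endian.

```rust
// Hand-written sketch of the `Person` layout from the schema table.
// Assumptions: 4-byte length prefixes are big-endian; the 30-bit cap
// on lengths is not checked here.

#[derive(Debug, PartialEq)]
pub struct Person {
    pub first: String,
    pub middle: Option<String>,
    pub last: String,
}

/// Append a length-prefixed string: 4-byte length, then the raw bytes.
fn write_string(s: &str, buf: &mut Vec<u8>) {
    buf.extend_from_slice(&(s.len() as u32).to_be_bytes());
    buf.extend_from_slice(s.as_bytes());
}

pub fn encode_person(p: &Person) -> Vec<u8> {
    let mut buf = Vec::new();
    write_string(&p.first, &mut buf);
    match &p.middle {
        // Tag byte `o`: 0x00 means absent, 0xff means present.
        None => buf.push(0x00),
        Some(m) => {
            buf.push(0xff);
            write_string(m, &mut buf);
        }
    }
    write_string(&p.last, &mut buf);
    buf
}

/// Split off the first `n` bytes of `input`, or return `None` if too short.
fn take<'a>(input: &mut &'a [u8], n: usize) -> Option<&'a [u8]> {
    let data: &'a [u8] = *input;
    if data.len() < n {
        return None;
    }
    let (head, tail) = data.split_at(n);
    *input = tail;
    Some(head)
}

/// Read a 4-byte length, then that many bytes as UTF-8.
fn read_string(input: &mut &[u8]) -> Option<String> {
    let b = take(input, 4)?;
    let len = u32::from_be_bytes([b[0], b[1], b[2], b[3]]) as usize;
    let bytes = take(input, len)?;
    String::from_utf8(bytes.to_vec()).ok()
}

pub fn decode_person(mut input: &[u8]) -> Option<Person> {
    let first = read_string(&mut input)?;
    let middle = match take(&mut input, 1)?[0] {
        0x00 => None,
        0xff => Some(read_string(&mut input)?),
        _ => return None, // any other tag value is rejected
    };
    let last = read_string(&mut input)?;
    // Reject trailing bytes so that decoding consumes the whole input.
    if input.is_empty() {
        Some(Person { first, middle, last })
    } else {
        None
    }
}
```

Encoding a `Person` with no middle name writes only the `0x00` tag for that field, which is the `o ? 4 : 0` row of the table. Every field read and write above is the kind of code a client library otherwise has to keep in sync by hand with each release.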