# CESROX Roadmap
CESR stands for Composable Event Streaming Representation.
CESROX stands for the Rust implementation of the CESR protocol specification. Alternate spellings include CESRox, CESR-ox, or cesrox.
[CESROX Working Group Bi-Weekly Meeting Agenda and Notes](https://hackmd.io/UQaEI0w8Thy_xRF7oYX03Q?edit)
[KERI and ACDC Roadmap](https://hackmd.io/yYpd2uhRTpCadsGVw3Rl-A?view)
[KERI WebOfTrust repo](https://github.com/WebOfTrust/keri)
## Summary
CESROX is the premier Rust implementation of the [CESR](https://github.com/WebOfTrust/ietf-cesr) protocol specification. It provides efficient, performant, and usable serialization and deserialization, in a composable way, to and from the qualified cryptographic primitives used in the KERI and ACDC space. Extensibility to domains beyond KERI and ACDC is planned for and expected.
## Naming
### -ox suffix in CESROX
The "-ox" suffix evokes the word "oxide": chemically, rust is an iron oxide.
# Design Requirements
## Binary and Text Composability
- The single most important feature is binary and text composability of cryptographic primitives and other data structures across the JSON, CBOR, and MessagePack (MGPK) formats.
- Binary and text composability means that any set of text-domain encoded primitives may be converted en masse to the binary domain, further composed there with other binary-domain single or group encodings, and then either decoded or converted back to the text domain en masse without loss.
- Fully round trippable starting in either binary or text domain.
- Text Domain is a fully qualified self-framing Base64 encoding.
- Binary domain is also self-framing.
- Regarding composability of concretely defined types and code tables, any number of code tables can be used for different applications as long as the encodings satisfy the composability property.
- Therefore the core feature of CESRox is a flexible serialization and deserialization - using [serde](https://serde.rs/) - for composable encodings using specified code tables for concrete types.
- While code tables specify the actual encoding formats, any encoding is composable with any other encoding, which provides the capability for asynchronous, parallel stream processing.
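
The round-trip property above can be illustrated with a minimal sketch. It assumes that every text-domain primitive is a multiple of 4 Base64 characters (24 bits), so no padding is ever required. The function names `text_to_binary` and `binary_to_text` and the placeholder primitives are hypothetical and not part of any CESROX API.

```rust
/// URL-safe Base64 alphabet (RFC 4648, section 5), used here without padding.
const B64: &[u8; 64] = b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_";

/// Converts text-domain characters to binary-domain bytes.
/// Returns `None` unless the input is a multiple of 4 valid Base64 characters.
fn text_to_binary(text: &str) -> Option<Vec<u8>> {
    if text.len() % 4 != 0 {
        return None;
    }
    let mut out = Vec::with_capacity(text.len() / 4 * 3);
    for quad in text.as_bytes().chunks(4) {
        let mut acc: u32 = 0;
        for &c in quad {
            // Look up the 6-bit value of each character.
            let value = B64.iter().position(|&b| b == c)? as u32;
            acc = (acc << 6) | value;
        }
        out.extend_from_slice(&[(acc >> 16) as u8, (acc >> 8) as u8, acc as u8]);
    }
    Some(out)
}

/// Converts binary-domain bytes back to text-domain characters.
/// Returns `None` unless the input is a multiple of 3 bytes.
fn binary_to_text(bytes: &[u8]) -> Option<String> {
    if bytes.len() % 3 != 0 {
        return None;
    }
    let mut out = String::with_capacity(bytes.len() / 3 * 4);
    for triple in bytes.chunks(3) {
        let acc = (triple[0] as u32) << 16 | (triple[1] as u32) << 8 | triple[2] as u32;
        for shift in [18, 12, 6, 0] {
            out.push(B64[((acc >> shift) & 0x3f) as usize] as char);
        }
    }
    Some(out)
}

fn main() {
    // Two hypothetical 44-character primitives (placeholder content, not real keys).
    let a = format!("D{}", "A".repeat(43));
    let b = format!("E{}", "B".repeat(43));
    let text = format!("{}{}", a, b);

    // Converting the concatenation en masse equals concatenating the conversions.
    let composed = text_to_binary(&text).expect("valid text stream");
    let mut separate = text_to_binary(&a).expect("valid primitive");
    separate.extend(text_to_binary(&b).expect("valid primitive"));
    assert_eq!(composed, separate);

    // And the conversion back to the text domain is lossless.
    assert_eq!(binary_to_text(&composed).as_deref(), Some(text.as_str()));
}
```

Because each primitive aligns on a 24-bit boundary, the boundaries between primitives survive the conversion in both directions, which is what makes the en-masse conversion safe.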
## Extensibility and Use Case Agnosticism
- Ensure CESROX is use case agnostic. Currently CESR is KERI specific. Therefore when the CESR protocol gets new master code tables, or is extended by another spec like CESR-proofs, CESROX must be open to extension.
- The implementation hides the format of input streams from clients. In other words, CESROX encapsulates both the payload type (JSON, CBOR, MsgPack, ...) and whether the stream is text or binary.
- Clients willing to use CESROX are willing to implement mappings to client-specific data models. CESROX is use case agnostic and therefore data model agnostic.
- CESROX provides and exposes all attachment-related types that are defined in the master code tables of the CESR protocol.
## Code Tables
- To support multiple code tables the parsing of the codes should be table based.
- Example: use the first char (in the text domain) or hextet of bits (in the binary domain) to select which code table to use. From that table, determine how many more chars or hextets are needed to parse the full primitive or group code. Given the full code, determine how many more characters (or hextets or bytes) make up the remainder of the primitive or group.
- There are three basic types of codes in the KERI master code table: fixed-length primitives, variable-length primitives, and group codes.
- One selector is still TBD: "_", which is reserved for opcodes that serialize scripted operations on adjacent primitives or groups.
- In addition any number of context specific code tables may be supported.
- KERI has only one: the indexed code table, which has only fixed-length primitives, but whose codes look like group codes because they include a count.
- Context-specific code tables may have the same structure as the master code table, that is, support for fixed-length primitives, variable-length primitives, or groups of primitives, where the first char (in the text domain) or first hextet (in the binary domain) is the selector, which can then be looked up to parse the remainder of the code, and so forth.
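
The selector-then-table lookup described above could look roughly like the following text-domain sketch. Everything in it is an assumption made for illustration: the `Sizes` struct, the selector rules in `code_len_for_selector`, and the three table entries with their lengths are placeholders rather than an authoritative copy of the master code table, and only fixed-length primitives are handled.

```rust
use std::collections::HashMap;

/// Sizes for one code, in text-domain characters (hypothetical layout).
#[derive(Clone, Copy, Debug, PartialEq)]
struct Sizes {
    /// Length of the code itself.
    code_len: usize,
    /// Length of the whole primitive, code included.
    total_len: usize,
}

/// Maps the selector (first character) to the length of the full code.
/// Assumed rules for this sketch: letters are 1-char codes, '0' starts a
/// 2-char code, and '1' starts a 4-char code.
fn code_len_for_selector(selector: char) -> Option<usize> {
    match selector {
        'A'..='Z' | 'a'..='z' => Some(1),
        '0' => Some(2),
        '1' => Some(4),
        _ => None,
    }
}

/// A tiny illustrative table of fixed-length primitives.
fn example_table() -> HashMap<&'static str, Sizes> {
    HashMap::from([
        ("D", Sizes { code_len: 1, total_len: 44 }),
        ("E", Sizes { code_len: 1, total_len: 44 }),
        ("0B", Sizes { code_len: 2, total_len: 88 }),
    ])
}

/// Splits one primitive off the front of `stream`.
/// Returns `(code, value, rest)`, or `None` if the code is unknown or the
/// stream is too short to hold the whole primitive.
fn parse_primitive<'a>(
    stream: &'a str,
    table: &HashMap<&'static str, Sizes>,
) -> Option<(&'a str, &'a str, &'a str)> {
    let selector = stream.chars().next()?;
    let code_len = code_len_for_selector(selector)?;
    let code = stream.get(..code_len)?;
    let sizes = table.get(code)?;
    let primitive = stream.get(..sizes.total_len)?;
    Some((code, &primitive[sizes.code_len..], &stream[sizes.total_len..]))
}

fn main() {
    let table = example_table();
    // Placeholder stream: one "D" primitive followed by one "0B" primitive.
    let stream = format!("D{}0B{}", "A".repeat(43), "C".repeat(86));

    let mut rest = stream.as_str();
    while let Some((code, value, tail)) = parse_primitive(rest, &table) {
        println!("code {} with {} value chars", code, value.len());
        rest = tail;
    }
    assert!(rest.is_empty());
}
```

Supporting another code table would then mean supplying another map, while the parsing loop stays unchanged.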
## Comparison to [JSON - RFC8259](https://www.rfc-editor.org/rfc/rfc8259)
- Since JSON is only a string format, CESROX provides superior performance in addition to canonical ordering of properties for deterministic, round-trippable serialization and deserialization.
## Comparison to [MessagePack](https://msgpack.org/index.html)
- This selector-based, self-framing code table approach to parsing is very similar to how CBOR and MsgPack are implemented under the hood. The difference is that CBOR and MsgPack are not composable: their codes only have meaning in the binary domain, they do not support pipelining (group codes), and they do not support stream parsers (i.e. each component of a CBOR or MGPK serialization must be parsed and extracted individually from a stream, not en masse).
## Comparison to [CBOR](https://cbor.io/)
- Unlike MsgPack (MGPK), CBOR supports user-defined "tags", which provide some flexibility in parsing user-defined primitives. Likewise, CESR supports user-defined entries in code tables, which provide even more expressive parsing. For example, a single code entry in a CESR table could support something like CBOR tags in addition to everything else. This means that CESR-encoded CBOR is possible and would provide a way to tunnel CBOR in CESR and thereby imbue such tunnels with composing, streaming, and pipelining capability in non-KERI applications. This is in addition to the fact that native CBOR is a supported serialization for a KERI message body in a KERI stream (although not for a message attachment).
# Priorities and TODOs
## Priorities
[KERI+ACDC Status Table](https://hackmd.io/yYpd2uhRTpCadsGVw3Rl-A?view)
1. CESROX Requirements (type lists, code tables, serialization and deserialization)
2. CESROX Type and Code Table Implementations
3. CESROX Unit Tests
4. KERIOX Unit Tests
5. KERIPY Unit Tests
6. KERIOX Integration Tests
7. KERIPY Integration Tests
| Acronym | Specification | Technical Design | Python | Rust | Leads | Report/Notes | Next date |
|---|---|---|---|---|---|---|---|
| [CESR](https://github.com/trustoverip/acdc/wiki/composable-event-streaming-representation) | ACCEPTANCE-
## TODOs
1. Extract all logic related to prefix management from KERIOX into a new crate (i.e. `BasicPrefix`, `SelfSigningPrefix`, ...). In essence, extract the `prefix` module.
2. Once the above task is done, extract:
- CESR parsing in the `event_parsing` module (including the `nom` logic);
- `SignedEventData` from KERIOX into CESROX, making it a generic type `SignedEventData<T, U>`. `SignedEventData<T, U>` is then the primary type exposed by CESROX after deserializing a CESR stream. Example below:
```rust
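use serde::{de::DeserializeOwned, Serialize};

// `Deserialize` takes a lifetime parameter, so the bound uses `DeserializeOwned`.
#[derive(Clone, Debug, PartialEq)]
pub struct SignedEventData<T, U> where T: Serialize + DeserializeOwned, U: Attachment {
    pub payload: T,
    pub attachments: Vec<U>,
}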
```
3. Implement in CESROX:
```rust
impl Attachment for KERIAttachment {…}
```
4. From the client perspective, i.e. KERIOX, make sure the proper generics are selected for the given use case:
```rust
use cesrox::{KERIAttachment, SignedEventData};
// this comes from the client side, so from KERIOX
use keri::event_parsing::EventType;
type KERIBasedSED = SignedEventData<EventType, KERIAttachment>;
impl TryFrom<KERIBasedSED> for Message {…}
```
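
To see how steps 2 to 4 might fit together, here is a dependency-free sketch. It deliberately leaves out the `serde` bounds so that it compiles on its own. The `Attachment` trait's `code` method, the single `IndexedSignatures` variant, and the `EventType` and `Message` stand-ins are all assumptions made for illustration, not the actual KERIOX or CESROX definitions.

```rust
use std::convert::TryFrom;

/// Hypothetical attachment trait; the real trait is still to be designed.
pub trait Attachment {
    /// Returns the code that introduces this attachment in a stream.
    fn code(&self) -> &'static str;
}

/// Generic container exposed by the library side.
#[derive(Clone, Debug, PartialEq)]
pub struct SignedEventData<T, U: Attachment> {
    pub payload: T,
    pub attachments: Vec<U>,
}

/// Stand-in for `cesrox::KERIAttachment` with a single variant.
#[derive(Clone, Debug, PartialEq)]
pub enum KERIAttachment {
    IndexedSignatures(Vec<String>),
}

impl Attachment for KERIAttachment {
    fn code(&self) -> &'static str {
        match self {
            KERIAttachment::IndexedSignatures(_) => "-A",
        }
    }
}

// Client side: stand-ins for the KERIOX `EventType` and `Message` types.
#[derive(Clone, Debug, PartialEq)]
pub struct EventType(pub String);

#[derive(Debug, PartialEq)]
pub struct Message {
    pub event: EventType,
    pub signatures: Vec<String>,
}

type KERIBasedSED = SignedEventData<EventType, KERIAttachment>;

impl TryFrom<KERIBasedSED> for Message {
    type Error = String;

    /// Maps the generic container into the client data model,
    /// rejecting events that carry no signatures.
    fn try_from(sed: KERIBasedSED) -> Result<Self, Self::Error> {
        let mut signatures = Vec::new();
        for attachment in sed.attachments {
            match attachment {
                KERIAttachment::IndexedSignatures(sigs) => signatures.extend(sigs),
            }
        }
        if signatures.is_empty() {
            return Err("event carries no signatures".to_string());
        }
        Ok(Message { event: sed.payload, signatures })
    }
}

fn main() {
    let sed: KERIBasedSED = SignedEventData {
        payload: EventType("icp".to_string()),
        attachments: vec![KERIAttachment::IndexedSignatures(vec!["sig0".to_string()])],
    };
    let message = Message::try_from(sed).expect("signed event");
    println!("{:?}", message);
}
```

The point of the split is that the library only knows about `T` and `U`, while the mapping into `Message` lives entirely on the client side.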
### Usage from TypeScript example
```ts=
import {cesrDeserializer, addCodeTables, RawMsg } from "cesrox";
import * as fs from 'fs';
// Code Tables are provided from various sources:
let codeTable1 = {
"A": {"description": "...", codeLength: ..., totalLength: ...},
"-D": { codeLength: 4, indexLength: 2, totalLength: 4, "protocol": ["E", "F", "E", "A"] }
};
/* `codeTable1` above contains a `protocol` entry that unambiguously
expresses the intent of an attachment that consists
of other primitives,
i.e. in the case of `-D` it is `pre+snu+dig+sig`
*/
// Read the second table as a string; `addCodeTables` takes JSON strings.
let codeTable2 = fs.readFileSync("some_file.json", "utf8");
await addCodeTables([JSON.stringify(codeTable1), codeTable2]);
// CESR stream obtained from somewhere else...
let cesrStream: Buffer = Buffer.from("...");
// Deserializes CESR stream into an array of rawMsgs.
let rawMsgs: RawMsg[] = await cesrDeserializer(cesrStream);
// rawMsg body:
// payload: string, (ie. JSON),
// attachments: [
// {"code": "-A", "value": [{ "code": "1AAA", "value": "some signature..." },
// { ... }]},
// {"code": "-D", "value": [
// [{}, {},{}, {}],
// [{}, {},{}, {}], ...
// ]}
// ]
rawMsgs.map(rawMsg => { /* map rawMsg into the client data model */ });
```
__Discussion:__
- `addCodeTables(codeTables: string[])` in the example above injects code tables into CESROX. These are the source of truth for all codes received via a CESR stream. For tuples, triples, quadruples, etc., `protocol` lists the expected primitives.
- `cesrDeserializer(cesrStream: Buffer)` returns an array of one or more messages, each with its attachments, extracted from `cesrStream`. On top of `RawMsg[]`, the client runs its own processing and business logic, including `RawMsg` payload deserialization, validation, organization of `attachments`, and so on.
Over TCP/UDP the stream may arrive in chunks, e.g. `cesrStream1` and `cesrStream2`, that only together constitute one event with attachments.
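
One possible way to cope with chunked delivery is to buffer incoming bytes and release only complete frames. The sketch below is an assumption, not a decided design: `ChunkBuffer` is a hypothetical name, and it takes a fixed frame length for simplicity, whereas a real parser would derive each frame's length from the code table.

```rust
/// Accumulates chunks and releases only complete frames.
/// Assumption for this sketch: every frame is exactly `frame_len` bytes.
struct ChunkBuffer {
    frame_len: usize,
    pending: Vec<u8>,
}

impl ChunkBuffer {
    fn new(frame_len: usize) -> Self {
        assert!(frame_len > 0, "frame length must be positive");
        ChunkBuffer { frame_len, pending: Vec::new() }
    }

    /// Appends a chunk and returns every frame that is now complete.
    /// Leftover bytes stay buffered until a later chunk completes them.
    fn push(&mut self, chunk: &[u8]) -> Vec<Vec<u8>> {
        self.pending.extend_from_slice(chunk);
        let mut frames = Vec::new();
        while self.pending.len() >= self.frame_len {
            // Keep the tail as the new pending buffer, emit the head as a frame.
            let rest = self.pending.split_off(self.frame_len);
            frames.push(std::mem::replace(&mut self.pending, rest));
        }
        frames
    }
}

fn main() {
    let mut buffer = ChunkBuffer::new(4);
    // The first chunk is too short to form a frame.
    assert!(buffer.push(b"AB").is_empty());
    // The second chunk completes the first frame and supplies a whole second one.
    let frames = buffer.push(b"CDEFGH12");
    assert_eq!(frames, vec![b"ABCD".to_vec(), b"EFGH".to_vec()]);
}
```

Self-framing is what makes this workable: once the code has been read, the parser knows how many more bytes to wait for.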
# Implementation Notes
## Decisions and Arguments
### Generics vs Traits for data structure serialization
#### Reasoning for generics vs. traits in CESROX
We favor generics over traits. At runtime CESROX will use at most a few different `(T, U)` pairs, and in most cases only one. In other words, the additional machine code generated to support every known `T` and `U` is minimal. It is therefore reasonable to accept a very limited number of `T` and `U` instantiations rather than the additional memory footprint that comes with maintaining trait support.
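
The trade-off can be made concrete with a small comparison. This sketch reads "traits" as trait objects (`dyn`), which is an interpretation rather than something stated above, and all type names in it are hypothetical.

```rust
/// Hypothetical attachment trait used only for this comparison.
trait Attachment {
    fn code(&self) -> &'static str;
}

struct IndexedSignature;

impl Attachment for IndexedSignature {
    fn code(&self) -> &'static str {
        "-A"
    }
}

/// Generic version: the compiler emits one copy of the code per concrete
/// `(T, U)` pair and resolves `code()` at compile time.
struct GenericSed<T, U: Attachment> {
    payload: T,
    attachments: Vec<U>,
}

impl<T, U: Attachment> GenericSed<T, U> {
    fn codes(&self) -> Vec<&'static str> {
        self.attachments.iter().map(|a| a.code()).collect()
    }
}

/// Trait-object version: one copy of the code, but each attachment is
/// heap-allocated and `code()` is resolved through a vtable at runtime.
struct DynSed<T> {
    payload: T,
    attachments: Vec<Box<dyn Attachment>>,
}

impl<T> DynSed<T> {
    fn codes(&self) -> Vec<&'static str> {
        self.attachments.iter().map(|a| a.code()).collect()
    }
}

fn main() {
    let generic = GenericSed {
        payload: "{}",
        attachments: vec![IndexedSignature, IndexedSignature],
    };
    let dynamic = DynSed {
        payload: "{}",
        attachments: vec![Box::new(IndexedSignature) as Box<dyn Attachment>],
    };
    println!("{} -> {:?}", generic.payload, generic.codes());
    println!("{} -> {:?}", dynamic.payload, dynamic.codes());
}
```

With only one or two `(T, U)` pairs in practice, the generic version generates little extra code while avoiding per-attachment boxing and dynamic dispatch.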