# Serialization Overview for Ethereum CL Devs

From a CL developer's perspective, there are three serialization formats for consensus structures (e.g. `BeaconBlock`, `ExecutionPayload`):

| Format | Primary Use | Primary Goal | Endianness | Specs |
|---|---|---|---|---|
| SSZ | P2P comms & hashing | Compact on the wire, [bijective](https://ethereum.stackexchange.com/a/82056) | Little-endian | [ethereum/consensus-specs](https://github.com/ethereum/consensus-specs/blob/dev/ssz/simple-serialize.md) |
| CL JSON | Beacon Node HTTP API | Human readable | N/A | [ethereum/beacon-apis](https://github.com/ethereum/beacon-apis/issues) |
| EL JSON | CL <> EL comms | Human readable | Big-endian | [ethereum/execution-apis](https://github.com/ethereum/execution-apis/blob/main/src/engine/paris.md) |

> Note: RLP has been omitted since it's not very relevant for CL devs. RLP is the EL equivalent of SSZ: a compact encoding designed for the P2P network.

## A Brief Chronology

The "EL JSON" format is the oldest of the three; I believe it existed when Ethereum launched in 2015. It's the encoding used in the classic "Ethereum JSON RPC" (e.g. what `geth attach` gives). The EL JSON format is designed to be human readable so users can type stuff into the JSON RPC console and read the answers they get back.

Next came SSZ in 2018/2019, during the specification of the Beacon Chain. The goal of SSZ is to be *simple* and *compact*. It should be *simple* so it's easy to grasp conceptually and implement correctly. It should be *compact* because it's used for transmitting blocks/attestations over the P2P network, where there's lots of broadcast amplification. SSZ is *not* human readable; it's impractical to type out an SSZ message and read what you get back. It's just raw bytes without field names. (Fun fact: Danny and Vitalik are conversationally fluent in SSZ and use it to converse privately during ACD calls.)

After (or around the same time as) SSZ came "CL JSON". It was created as the method of comms for consensus layer APIs (e.g., the Beacon Node HTTP API). It aims to be human readable, so anyone (who reads English) can `curl` things from the BN API and read the response. CL JSON doesn't care much about compactness since it's intended for local comms that happen via a LAN rather than P2P comms across the Internet.

## Why are EL and CL JSON different?

It seemed like a good idea at the time 😬. It wasn't clear that we'd end up with the EL+CL ecosystem back then. CL JSON was an attempt to address some oddities in EL JSON and [start fresh with an improved standard](https://xkcd.com/927/).

## Differences

There are many different types one can communicate via these three serialization formats: structs, lists, integers, byte-arrays, etc. The EL and CL JSON formats are very close, whilst the SSZ encoding is very different.

I don't think we should describe *all* types in this document. Instead, I'll focus on the two I think are the most confusing: integers and byte-arrays.

First, I'll present a table of differences for each type, and then I'll show some examples to demonstrate these differences. Byte values inside `[...]` are written in decimal throughout.

### Differences: Integers

| Format | Integer Representation | Representation of `Uint64(1337)` |
|---|---|---|
| SSZ | **Little-endian** bytes | `[57, 5, 0, 0, 0, 0, 0, 0]` |
| CL JSON | Decimal string | `"1337"` |
| EL JSON | Hexadecimal, **big-endian** string with leading zeros stripped | `"0x539"` |

Note that `1337 == 0x0539`, so the little-endian bytes are `0x39` (57) followed by `0x05` (5).

### Differences: Byte-Arrays

| Format | Byte-Array Representation | Representation of `List([0, 42])` |
|---|---|---|
| SSZ | Simply an array of bytes | `[0, 42]` |
| CL JSON | 0x-prefixed hexadecimal string | `"0x002a"` |
| EL JSON | 0x-prefixed hexadecimal string | `"0x002a"` |

## Examples

Let's use this imaginary structure to demonstrate:

```python
class SpecialMessage:
    # A sequential integer used to order messages.
    message_id: Uint64
    # Some special message of up to 8 bytes. Maybe a UTF-8 string, maybe not.
    body: List[Uint8, 8]
```

Let's instantiate it:

```python
my_message = SpecialMessage(message_id=1337, body=[0,0,0,0,0,0,0,42])
```

Now we'll view it in the three different formats:

### `SpecialMessage` as SSZ

```python
ssz_encode(my_message) == [57, 5, 0, 0, 0, 0, 0, 0, 12, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 42]
```

For readability I'll take the message and split it across lines with comments:

```python
[
    # This is the 64-bit integer 1337 encoded as little-endian.
    57, 5, 0, 0, 0, 0, 0, 0,
    # This is a 4-byte little-endian SSZ "offset", saying the value of the list starts at index 12.
    12, 0, 0, 0,
    # Starting at index 12, this is the value of the `body` field.
    0, 0, 0, 0, 0, 0, 0, 42
]
```

Note that this encoding is *schemaless*. The SSZ encoding tells you nothing about the structure of the bytes; it's assumed that you know what you're decoding, so you should know where each field starts and ends. This sucks for readability, but clearly reduces the size of the message.

The SSZ message is 20 bytes in total.

### `SpecialMessage` as CL JSON

```python
cl_json_encode(my_message) == '{"message_id":"1337","body":"0x000000000000002a"}'
```

Once again, let's split the message across lines and comment it (comments are illegal in JSON, but YOLO):

```json
{
    # An integer presented as a string because JavaScript doesn't
    # natively support 64-bit integers.
    "message_id": "1337",
    # A byte-array represented as `0x`-prefixed hex.
    "body": "0x000000000000002a"
}
```

The CL JSON message is 49 bytes in total (without whitespace).

### `SpecialMessage` as EL JSON

```python
el_json_encode(my_message) == '{"message_id":"0x539","body":"0x000000000000002a"}'
```

The message split across lines for readability (with YOLO comments):

```json
{
    # The decimal integer 1337 represented as `0x`-prefixed big-endian hex,
    # with leading zeros stripped.
    "message_id": "0x539",
    # A byte-array represented as `0x`-prefixed hex.
    "body": "0x000000000000002a"
}
```

The `QUANTITY` and `DATA` encoding formats described in the [Encoding](https://github.com/ethereum/execution-apis/blob/94164851c1630ff0a9c31d8d7d3d4fb886e196c0/src/engine/common.md#encoding) section of the Engine API spec are critical to understanding this format. In our example, the `QUANTITY` encoding is used for `message_id` whilst the `DATA` encoding is used for the `body`. `QUANTITY` uses the most compact hex representation, so no leading zeros are allowed, whereas `DATA` always uses two hex digits per byte.

The EL JSON message is 50 bytes in total (without whitespace).