# Recap of Ethereum RLP Serialization

**RLP** (Recursive Length Prefix) is the primary encoding method used for serializing and deserializing data in Ethereum's execution layer. It is applied throughout Ethereum, including blocks, transactions, account states, and network protocol messages. RLP is a simple serialization format that is well [documented](https://ethereum.org/en/developers/docs/data-structures-and-encoding/rlp/) and [explained](https://medium.com/coinmonks/data-structure-in-ethereum-episode-1-recursive-length-prefix-rlp-encoding-decoding-d1016832f919). In this article, I will provide a brief summary of RLP and write some tests in TypeScript using [@ethereumjs/rlp](https://www.npmjs.com/package/@ethereumjs/rlp) to illustrate its encoding rules.

## Data Types

RLP processes only two types of data: **String** and **List**.

A String is essentially a **Byte Array**. In this article, we use TypeScript string literals to represent the hex form of RLP Strings, such as `'0x1234ABCD'`.

A List can contain Items that are either Strings or other Lists. In other words, Lists can be nested to any depth. We represent RLP Lists using TypeScript Arrays. Here are some examples:

* `[ ]` - an empty list.
* `[ '0x1234', '0x5678' ]` - a list containing two strings.
* `[ '0x1234', [ '0x5678', ['0xABCD', '0xCAFE'] ] ]` - a nested list.

That's all about RLP data types. Next, I will summarize the encoding rules of RLP.

## Serialization

RLP does not require a **Schema**; in other words, the encoded data is self-describing. Given a piece of encoded data, its first byte tells us what kind of item follows and how to find its length. Let's discuss the possible scenarios through examples.

### Single Byte

If a string consists of only one byte, and this byte is less than `0x80`, then the encoded data is the byte itself. For example, `RLP.encode('0x12') == '0x12'`.

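
The boundary at `0x80` matters: a one-byte string whose value is `0x80` or above is not its own encoding. It falls under the short-string rule of the next section and picks up a `0x81` prefix. Here is a minimal, dependency-free sketch of that distinction. `encodeOneByte` and `hexOf` are hypothetical helpers written for this illustration; they are not part of `@ethereumjs/rlp`.

```ts
// Hypothetical helper (not from @ethereumjs/rlp):
// RLP-encode a string that is exactly one byte long.
function encodeOneByte(b: number): Uint8Array {
  if (!Number.isInteger(b) || b < 0x00 || b > 0xff) {
    throw new RangeError('expected a byte in the range 0x00..0xff');
  }
  // Bytes below 0x80 are their own encoding.
  if (b < 0x80) {
    return Uint8Array.of(b);
  }
  // Otherwise this is a short string of length 1: prefix 0x80 + 1, then the byte.
  return Uint8Array.of(0x81, b);
}

// Hypothetical helper: render bytes as a 0x-prefixed lowercase hex string.
function hexOf(bytes: Uint8Array): string {
  return '0x' + Array.from(bytes, (x) => x.toString(16).padStart(2, '0')).join('');
}

console.log(hexOf(encodeOneByte(0x12))); // 0x12
console.log(hexOf(encodeOneByte(0x7f))); // 0x7f
console.log(hexOf(encodeOneByte(0x80))); // 0x8180
```
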
```ts
import assert from 'node:assert';
import { RLP } from '@ethereumjs/rlp';
import { bytesToHex } from '@ethereumjs/util';

function testSingleByte() {
  // Every byte in 0x00..0x7f is its own encoding.
  for (let i = 0x00; i <= 0x7F; i++) {
    const str = i.toString(16).padStart(2, '0');
    assert.equal(
      bytesToHex(RLP.encode('0x' + str)),
      '0x' + str,
    );
  }
}
```

### Short String

Otherwise, if the string is `n` bytes long and `n` is less than 56, we start with the byte `0x80 + n` and then follow it with the string. For example, `RLP.encode('0x') == '0x80'`, and `RLP.encode('0x123456') == '0x83123456'`.

```ts
function testShortString() {
  for (let i = 0; i <= 55; i++) {
    // i bytes of 0xab; 0xab >= 0x80, so i == 1 is not a "Single Byte" case.
    const str = 'ab'.repeat(i);
    assert.equal(
      bytesToHex(RLP.encode('0x' + str)),
      '0x' + (0x80 + i).toString(16) + str,
    );
  }
}
```

### Long String

Otherwise, if the length `n` of the string can be represented in `m` bytes (`m <= 8`), we start with the byte `0xB7 + m`, followed by `n` encoded as a big-endian integer in `m` bytes, and then the string itself.

Here is the test for `m == 1`:

```ts
function testLongString1() {
  for (let i = 56; i <= 0xFF; i++) {
    const str = 'ab'.repeat(i);
    assert.equal(
      bytesToHex(RLP.encode('0x' + str)),
      '0xb8' + (i).toString(16) + str,
    );
  }
}
```

And here is the test for `m == 2`:

```ts
function testLongString2() {
  // Covers only part of the two-byte range (0x0100..0xFFFF).
  for (let i = 0x100; i <= 0x0FFF; i++) {
    const str = 'ab'.repeat(i);
    assert.equal(
      bytesToHex(RLP.encode('0x' + str)),
      '0xb9' + (i).toString(16).padStart(4, '0') + str,
    );
  }
}
```

### Short List

Similar to the encoding of strings, if the total length `n` of the list's encoded items (the payload, in bytes) is less than 56, we start with the byte `0xC0 + n` and then follow it with the encoded data of the items. Obviously, `RLP.encode([]) == '0xC0'`.

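
Note that `n` counts the bytes of the already-encoded items, not the number of items. A minimal, dependency-free sketch makes this visible. `concatBytes`, `encodeShortString`, `encodeShortList`, and `asHex` are hypothetical helpers that cover only the short forms (payloads under 56 bytes); they are not the API of `@ethereumjs/rlp`.

```ts
// Hypothetical helpers for illustration only; short forms (payload < 56 bytes).
function concatBytes(parts: Uint8Array[]): Uint8Array {
  const out = new Uint8Array(parts.reduce((sum, p) => sum + p.length, 0));
  let offset = 0;
  for (const p of parts) {
    out.set(p, offset);
    offset += p.length;
  }
  return out;
}

function encodeShortString(bytes: Uint8Array): Uint8Array {
  if (bytes.length >= 56) {
    throw new RangeError('long strings are out of scope for this sketch');
  }
  const first = bytes[0];
  // A single byte below 0x80 is its own encoding.
  if (bytes.length === 1 && first !== undefined && first < 0x80) {
    return bytes;
  }
  return concatBytes([Uint8Array.of(0x80 + bytes.length), bytes]);
}

function encodeShortList(encodedItems: Uint8Array[]): Uint8Array {
  // The prefix is derived from the payload size in bytes, not the item count.
  const payload = concatBytes(encodedItems);
  if (payload.length >= 56) {
    throw new RangeError('long lists are out of scope for this sketch');
  }
  return concatBytes([Uint8Array.of(0xc0 + payload.length), payload]);
}

function asHex(bytes: Uint8Array): string {
  return '0x' + Array.from(bytes, (x) => x.toString(16).padStart(2, '0')).join('');
}

// Two items, but the payload is 6 bytes (0x821234 and 0x825678), so the prefix is 0xc6.
const twoItems = encodeShortList([
  encodeShortString(Uint8Array.of(0x12, 0x34)),
  encodeShortString(Uint8Array.of(0x56, 0x78)),
]);
console.log(asHex(twoItems)); // 0xc6821234825678
```

When every item is a single byte below `0x80`, each item encodes to exactly one byte, so the payload length and the item count coincide.
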
Let's write a test case for lists of single-byte items, where the payload length equals the number of items:

```ts
function testShortList() {
  for (let i = 0; i <= 55; i++) {
    // Each '0x72' item encodes to one byte, so the payload is i bytes long.
    assert.equal(
      bytesToHex(RLP.encode(Array(i).fill('0x72'))),
      '0x' + (0xC0 + i).toString(16) + '72'.repeat(i),
    );
  }
}
```

### Long List

Otherwise, if the payload length `n` of the list can be represented using `m` bytes (`m <= 8`), we start with the byte `0xF7 + m`, followed by `n` encoded as a big-endian integer in `m` bytes, and then the (recursively) encoded data of the list items.

Below is the test for the case when `m == 1`:

```ts
function testLongList1() {
  for (let i = 56; i <= 0xFF; i++) {
    assert.equal(
      bytesToHex(RLP.encode(Array(i).fill('0x72'))),
      '0xf8' + (i).toString(16) + '72'.repeat(i),
    );
  }
}
```

## Examples

Here are some more [examples](https://ethereum.org/en/developers/docs/data-structures-and-encoding/rlp/#examples):

```ts
function testExamples() {
  assert.equal(bytesToHex(RLP.encode('dog')), '0x83646f67');
  assert.equal(bytesToHex(RLP.encode(['cat', 'dog'])), '0xc88363617483646f67');
  assert.equal(bytesToHex(RLP.encode(null)), '0x80');
  assert.equal(bytesToHex(RLP.encode([])), '0xc0');
  assert.equal(bytesToHex(RLP.encode(0)), '0x80');
  assert.equal(bytesToHex(RLP.encode('\x00')), '0x00');
  assert.equal(bytesToHex(RLP.encode('\x0f')), '0x0f');
  assert.equal(bytesToHex(RLP.encode([ [], [[]], [ [], [[]] ] ])), '0xc7c0c1c0c3c0c1c0');
  assert.equal(bytesToHex(RLP.encode('Lorem ipsum dolor sit amet, consectetur adipisicing elit')),
    '0xb8384c6f72656d20697073756d20646f6c6f722073697420616d65742c20636f6e7365637465747572206164697069736963696e6720656c6974');
}
```

## Summary

RLP is relatively simple and easy to understand. In this article, I have summarized the types that RLP can handle, as well as the encoding rules. The complete RLP encoding rules are shown in the figure below:

![eth-rlp](https://hackmd.io/_uploads/HyRgf7NGye.png)

Buy me a cup of coffee if this article has been helpful to you:

* EVM: `0x8f7BEE940b9F27E8d12F6a4046b9EC57c940c0FA`