Erasure encoding using polynomials

*When Chuck Norris uses erasure code, he doesn't just protect data redundancy; he creates a fortress of redundancy so impenetrable that even a single byte missing feels like an apology to him.* # Erasure code using polynomial interpolation :::info **Definition** An erasure code is a forward error correction (FEC) code under the assumption of bit erasures (rather than bit errors), which transforms a message of `k` symbols into a longer message (code word) with `n` symbols such that the original message can be recovered from a subset of the `n` symbols. ::: We'll now see an example of erasure core using polynomials. :::warning **Warning** This sections lacks a lot of mathematical formalism, and hides a lot of details about real implementation (like the fact that we'll in reality operate on polynomials over finite fields, while this section operates on polynomials over the real numbers). However, this section lets the reader to grasp the idea of erasure code using polynomial interpolation. ::: ## Problem setup Alice has the following 4-bytes message `0x160C0B0D` that she wants to transmit to Bob. To do that, she uses an unreliable network (the internet) with a terrible, very lossy connection. Alice will encode the message into a 6-bytes encoded message, and Bob will be able to reconstruct the initial 4-bytes message if he receives only **ANY** 75% of the encoded 6-bytes message. The word **ANY** is really important here. Bob must be able to reconstruct the initial data, if he receives from example: - Only the first 75% of the encoded message, or - Only the last 75% of the encoded message, or - Any combination, as long as Bob received at least 75% (6 byte) of the encoded message :::info **Note** In this section, we will use a code rate of 3/4 (75%), meaning that the initial message size will increase by 50%. Consequently, only 3/4 (75%) of the coded message will be needed to reconstruct the initial message. ::: ## Coding Let's divide Alice's 4-bytes message into 4 bytes: `[0x16, 0x0C, 0x0B, 0x0D]`. Each byte will represent one polynomial constraint. We have 4 constraints. Let's define a polynomial `P` will the following constraints: - `P(0) = 22` (`0x16`) - `P(1) = 12` (`0x0C`) - `P(2) = 11` (`0x0B`) - `P(3) = 13` (`0x0D`) `x` evaluated values are integers (0, 1, 2, 3). `P(x)` evaluated values are the bytes of the message Alice wants to encode. Using the polynomial Lagrange interpolation, Alice finds the unique polynomial of degree at most `4-1 = 3` which complies with these 4 constraints. She gets: $$ P(x) =-x^3 + \frac{15}{2}x^2 - \frac{33}{2}x + 22 $$ ![image](https://hackmd.io/_uploads/H1aaB8bxA.png) ## Extension Alice adds 2 extension points. She chooses 2 points corresponding to `x=3` and `x=4`. Evaluating `P` at these points gives: - `P(4) = 12` (`0x0C`) - `P(5) = 2` (`0x02`) ![image](https://hackmd.io/_uploads/BJ8NUI-eR.png) ## Communication Instead sending `0x160C0B0D` to Bob, Alice will: - Agrees with Bob she will send him 6 `(x, P(x))` data points, with `P` a polynomial of degree at most `3`. - Sends to Bob: `[(0x00, 0x16), (0x01, 0x0C), (0x02, 0x0B), (0x03, 0x0D), (0x04, 0x0C), (0x05, 0x02)]` corresponding to 6 `(x, P(x))` tuples. - Agree with Bob that the initial data is: `[P(0), P(1), P(2), P(3)]`. Because the connection between Alice and Bob is lossy, Bob receives only 4 tuples out of 6: - `(0x00, 0x16)`, - `(0x02, 0x0B)`, - `(0x04, 0x0C)`, and - `(0x05, 0x02)` ## Reconstruction From received bytes, Bob deduces: - `P(0) = 22` (`0x16`) - `P(2) = 11` (`0x0B`) - `P(4) = 12` (`0x0C`) - `P(5) = 2` (`0x02`) ![image](https://hackmd.io/_uploads/BJu2cLWg0.png) Bob knows the original message is `[P(0), P(1), P(2), P(3)]`. However, `P(1)`and `P(3)`are missing. Bob knows the 4 points belong to the the unique polynomial of degree at most `3`, so he can reconstruct the polynomial using the Lagrange interpolation (just as Alice previously did, but with the `4` points corresponding to the 4-bytes initial message). Bob ends up with this polynomial (which is exactly the same Alice computed previously): $$ P(x) =-x^3 + \frac{15}{2}x^2 - \frac{33}{2}x + 22 $$ ![image](https://hackmd.io/_uploads/By4yn8beR.png) Bobs now evaluates `P` at `x=1` and `x=3`. He finds: - `P(1) = 12` (`0x0C`) - `P(3) = 13` (`0x0D`) ![image](https://hackmd.io/_uploads/SJRN2LZgR.png) Bob is now able to reconstruct the first 4 tuples: `[(0x00, 0x16), (0x01, 0x0C), (0x02, 0x0B), (0x03, 0x0D)]` and the initial message: `0x160C0B0D`. :::info **Note** If Bob receives less than 4 points, then he will be unable to reconstruct the corresponding unique polynomial with degree at most `3`. Actually, he will be able to reconstructe an infinity of polynomials with degree at most `3`, including the original polynomial. However, he won't know exactly which one is the good one. ::: :::info **Note** The advantage of using such an erasure code technique may not be obvious at the first sight: - Without the erasure code, Alice sends 4 bytes of data to Bob. - With the erasure code, Alice sends 6 bytes of data (+ `x` coordinates) to Bob. **Erasure code consumes more data.** However, the important point here is: - Without erasure code, Bob must receive **exactly** 4 bytes out of 4 (100%) sent by Alice. - With erasure code, Bob may receive only **any** 4 bytes out of 6 (75%) sent by Alice. ::: ## Some erasure code real life usage ### QR codes This QR codes embeds the offchainlabs.com link. ![image](https://hackmd.io/_uploads/HyqTZPZeR.png) If we deteriorate some parts of this QR code, it is still possible to retrieve the (erasure coded) information embedded inside it. ![image](https://hackmd.io/_uploads/ryI4QPWlR.png) ### Compact disc A scratches deteriorated compact disc (up to a certain extent...) may continue to be readable, because data stored into is erasure coded. ### Ethereum - beacon chain blobs peer DAS This precise case will be studied in details.

Syntax	Example	Reference
# Header	Header	基本排版
- Unordered List	Unordered List
1. Ordered List	Ordered List
- [ ] Todo List	Todo List
> Blockquote	Blockquote
Bold font	Bold font
Italics font	Italics font
~~Strikethrough~~	~~Strikethrough~~
19^th^	19^th
H~2~O	H₂O
++Inserted text++	Inserted text
==Marked text==	Marked text
[link text](https:// "title")	Link
![image alt](https:// "title")	Image
`Code`	`Code`	在筆記中貼入程式碼
```javascript var i = 0; ```	`var i = 0;`
:smile:		Emoji list
{%youtube youtube_id %}	Externals
$L^aT_eX$	L^aT_eX
:::info This is a alert area. :::	This is a alert area.