Error-Correcting Codes(ECC)

--- tags: System Software --- # Error-Correcting Codes(ECC) 這篇文章起於 [成大的系統軟體系列課程的一次作業](https://hackmd.io/@sysprog/2020-quiz4)，因為不知不覺內容就變得相當豐富了，因此就作為一個主題獨立出來。目前的內容中，我們僅會先探討比較容易理解的 Hamming codes，並主要針對 Reed–Solomon codes 及其在 linux 中的應用作探討。如果未來有機會碰觸到相關的議題，也許就可以更新進來XD 在閱讀這篇文章前，小小提醒一下: 這篇文章偏向記錄自己所學習的內容，對於不了解的部分，我會試著先留空，避免造成錯誤的概念 (如果有機會弄懂也希望可以把它補齊><)。不過難保會無意間對某些內容穿鑿附會了，因此有疑惑的地方務必回到原文! 如果有錯誤也請務必不吝指正！ ## 甚麼是 ECC? Error-Correcting Codes(ECC) 被應用在通訊等資料傳輸的過程中，補正因為資料損壞、雜訊等原因導致的錯誤問題。 ## Single–Error Correcting(SEC) Codes > 此節內容摘錄並重新整理自 [錯誤更正碼簡介](https://w3.math.sinica.edu.tw/math_media/d184/18404.pdf) 其中的一類被稱 Single–Error Correcting(SEC) Codes，顧名思義，這個方法可以修正編碼中的單位元的錯誤，其中一個經典的例子為 [(7, 4) hamming code](https://en.wikipedia.org/wiki/Hamming(7,4))，意指資料以 7 個位元編碼的資料中，有 4 位是真實的資料，其做法大致為: 對於一個資料 `d1 d2 d3 d4`，加上額外的 parity `p1 p2 p3`，此 parity 定義為使下圖中的三個大圓中的 1 數量為偶數，則在僅有至多一碼錯誤的前提下: * 僅有其中一個大圓內 1 的個數不符合，這表示是 `p1` or `p2` or `p3` 產生錯誤，因此反轉 `p1` or `p2` or `p3` 即可 * 其中兩個大圓內的 1 的個數不符合，表示是 `d1` or `d2` or `d3` 產生錯誤，因此反轉 `d1` or `d2` or `d3` * 如果是 `d4` 產生錯誤，三個大圓內的 1 個數皆不符合，反轉 `d4` ![](https://i.imgur.com/3IHl0OH.png) ### Hamming codes 的數學結構以數學的角度來探討，為甚麼 (7, 4) hamming code 可以修正一個 bit 的錯誤呢? 首先假設 d1,d2,d3,d4 對應到 x4,x3,x2,x1，而 p3,p2,p1 對應到 x5,x6,x7，則表示其需滿足(注意這裡的加法指的是 2 進位 single bit 的加法): $$ \left\{ \begin{array}{c} x1 + x2 + x3 + x5 = 0 \\ x1 + x2 + x4 + x6 = 0 \\ x1 + x3 + x4 + x7 = 0 \\ \end{array} \right. $$ 把這個線性方程式寫成矩陣的話，那就是 $$ \begin{pmatrix} 1 & 1 & 1 & 0 & 1 & 0 & 0 \\ 1 & 1 & 0 & 1 & 0 & 1 & 0 \\ 1 & 0 & 1 & 1 & 0 & 0 & 1\\ \end{pmatrix} $$ 令 $C$ 為所有編碼資料的集合即是 H 的 null space，也就是 $\forall x \in C$　$Hx^T = 0$，又 rank($H$) = 3，根據 [Rank–nullity theorem](https://en.wikipedia.org/wiki/Rank%E2%80%93nullity_theorem) 則 rank($C$) = 4 ，表示對於上述式子的解，事實上只要給定 x1~x7 中任意 4 個，其他 3 個就會因應產生，又這任意 4 個可以為任意的 0 或 1，因此共有 $2^4$ = 16 個可能，對應可以編碼的 4 個位元。接著，我們可以從 Hamming distance 和 Hamming weight 來思考。 * Hamming distance: 指兩個 bitstring 0 和 1 相異的個數，例如 x = 0100, y = 1100，則 Hamming distance $d_H(x,y)$ = 1 * Hamming weight: 一個 bitsting 中 1 的數量，例如 x = 0100，其 Hamming weight $w_{H}(x)$ = 1 * 我們可以從這兩個定義推廣，發現到 x 和 y 的 Hamming distance 事實上就是 x + y 的 Hamming weight 則我們可以定義一段 codeword 的 minimum distance 為此 codeword 集合中任兩個編碼的最小 Hamming distance，也就是 $$ d_{min}{C} = \min \left\{ d_{H}(x,y) \mid x,y \in C, x \neq y \right\} $$ 而 minimum weight，則定義為 $$ w_{min}{C} = \min \left\{ w_{H}(x)\mid x \in C \right\} $$ 由於 null space $C$ 是同時也是一個 [linear subspace](https://en.wikipedia.org/wiki/Linear_subspace)，這表示集合中的任兩個 x、y 相加仍會存在集合 C 中，又前面提到 x、y 相加的 Hamming weight 會等同 x、y 的 Hamming distance，因此可以歸納出 $d_{min}{C} = w_{min}{C}$。那麼回過頭看，$C$ 的 minimum weight 會是多少呢?，我們知道 $C$ 是 $H$ 的 nullspace，由此下去推理的話: * 1 是不可能的，因為 H 的任一個 column 中都不存在 [0 0 0] * 2 是不可能的，因為 H 的任兩個 column 各異，任取兩個相加後不會得到 [0 0 0] * 如果取第 1、2、7 個 column 相加，可以得到 [0 0 0]，因此 $C$ 的 minimum weight / minumum distance 為 3 由於對於 d 個 bits 的錯誤，其 minumum distance 必須至少要是 2d + 1，因此我們知道這個演算法僅能修正 1 bit 的錯誤(詳情請參考[錯誤更正碼簡介](https://w3.math.sinica.edu.tw/math_media/d184/18404.pdf)第 7 頁) ### Hamming codes 的矩陣運算 Hamming codes 的編碼與解碼操作可以被轉換成矩陣的運算，首先要先定義編碼的生成矩陣 $G$ 和檢驗的矩陣 $H$ $$ \begin{split} (d4, \space d3, \space d2, \space d1) \\ G = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ 1 & 1 & 1 & 0 \\ 1 & 1 & 0 & 1 \\ 1 & 0 & 1 & 1 \\ \end{pmatrix}\end{split} $$ 可以看到前四列形成一個單位矩陣，後三列則分別是 p1、p2、p3 所在的大圓中，根據包含的 $d_{n}$ 為何，其對應的 bit 就設為 1 形成的。而 $H$ 就是我們前面的線性方程式所形成的矩陣： $$ H = \begin{pmatrix} 1 & 1 & 1 & 0 & 1 & 0 & 0 \\ 1 & 1 & 0 & 1 & 0 & 1 & 0 \\ 1 & 0 & 1 & 1 & 0 & 0 & 1\\ \end{pmatrix} $$ 則對於一段資料，舉例來說 p = $[1 \space 1 \space 0\space 1]^T$，其編碼的計算就是。再次注意到，這裡的矩陣運算是對單 bit 的加乘法: $$ x = Gp = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ 1 & 1 & 1 & 0 \\ 1 & 1 & 0 & 1 \\ 1 & 0 & 1 & 1 \\ \end{pmatrix}\ \begin{pmatrix} 1 \\ 1 \\ 0 \\ 1 \end{pmatrix}\ = \begin{pmatrix} 1 \\ 1 \\ 0 \\ 1 \\ 0 \\ 1 \\ 0\end{pmatrix}\ $$ 假設經過傳輸後得到的向量為 r，如果在傳輸過程中沒有錯誤的 bit 產生，則會得到 0 向量： $$ z = Hr = \begin{pmatrix} 1 & 1 & 1 & 0 & 1 & 0 & 0 \\ 1 & 1 & 0 & 1 & 0 & 1 & 0 \\ 1 & 0 & 1 & 1 & 0 & 0 & 1\\ \end{pmatrix}\begin{pmatrix} 1 \\ 1 \\ 0 \\ 1 \\ 0 \\ 1 \\ 0\end{pmatrix}\ = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix}\ $$ 如果 r 產生 1 bit 的錯誤，舉例來說，第 6 個 bit 有錯，則輸入第 6 個 bit 有錯的 codeword 之後，得到的結果為: $$ z = Hr = \begin{pmatrix} 1 & 1 & 1 & 0 & 1 & 0 & 0 \\ 1 & 1 & 0 & 1 & 0 & 1 & 0 \\ 1 & 0 & 1 & 1 & 0 & 0 & 1\\ \end{pmatrix}\begin{pmatrix} 1 \\ 1 \\ 0 \\ 1 \\ 0 \\ 0 \\ 0\end{pmatrix}\ = \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix}\ $$ 因為我們知道對第 6 個 bit 為 1 的向量運算會得到: $$ z = Hr = \begin{pmatrix} 1 & 1 & 1 & 0 & 1 & 0 & 0 \\ 1 & 1 & 0 & 1 & 0 & 1 & 0 \\ 1 & 0 & 1 & 1 & 0 & 0 & 1\\ \end{pmatrix}\begin{pmatrix} 0 \\ 0 \\ 0 \\ 0 \\ 0 \\ 1 \\ 0\end{pmatrix}\ = \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix}\ $$ 則表示輸入的 codeword 第 6 個 bit 出錯，因此需要對其取補數調整，就可以得到原始的 codeword。(如果我們把 parity bit 擺在 2^n 的位置的話，事實上這個矩陣運算可以得到更美妙(?)的結果，詳情請參考[(7, 4) hamming code](https://en.wikipedia.org/wiki/Hamming(7,4))) ## Reed–Solomon error correction > 此節內容引用譯自 [Reed-Solomon error-correcting code decoder](https://www.nayuki.io/page/reed-solomon-error-correcting-code-decoder#preliminaries) Reed–Solomon error correction 也是 ECC 的一種，透過在原始的資料中加上額外的資訊，則在定義內的錯誤數量以下，在資料傳輸後產生的錯仍可以被修正，這個 ECC 算法被廣泛使用於諸多領域。 ### 先備知識 1. Reed-solomon codes 的操作是對於在 [finite field](https://en.wikipedia.org/wiki/Finite_field) 集合中的 symbol。對於使用者定義的 field $F$ (以電腦領域來說，通常是 $GF(2^8)$)，存在 primitive element / generator $\alpha$，這表示 field 中的所有數值可以被表示成非零且不重複的 {$\alpha^0、\alpha^1、...\alpha^{\lvert F\rvert -2}$} :::info 對於 primitive element 的敘述可以參考 [Primitive element (finite field)](https://en.wikipedia.org/wiki/Primitive_element_(finite_field)) 和 [Primitive root modulo n](https://en.wikipedia.org/wiki/Primitive_root_modulo_n)。 ::: 2. 定義資料的長度為 k，k 需滿足 $1\leq k < \lvert F\rvert$，意及每個要編碼的資料都是來自 field $F$ 的 k 值數量的 block 3. 定義 m 為延伸的糾正碼長度，m 為滿足 $1 \leq m < \lvert F\rvert - k$ 的整數，這可以糾正上至 $\lfloor m/2 \rfloor$ 的錯誤數量 4. 因此，最終我們會定義 $n = k + m$ 為最後的 block size \ codeword 長度，n 需要滿足 $2 \leq n < \lvert F\rvert$，這表示如果我們想要糾正大量的錯誤，則需要愈大的 field $F$ ### Systematic encoder 1. Reed-Solomon error-correcting codes 具有不同的形式與處理方式，這裡所說明的是比較普遍的 [BCH code](https://en.wikipedia.org/wiki/BCH_code)。這意味著原始資料會視為是一個多項式的係數，而接續的糾正碼也會接續著成為新的多項式項。 2. 根據 $\alpha$ 和 m 定義一個生成多項式 $$ g(x) = \prod_{i=0}^{m-1} (x-\alpha^i) $$ 則此多項式為 m degree，且最大冪次項 $x^m$ 的係數為 1 3. 假設原始的資料是由 k 個值組成的 ($M_0$, $M_1$, ... ,$M_k-1$)，$M_i$ 是 field F 中的一個數值，則將以這 k 個值形成的多項式定義為 $$ M(x) = \sum\limits_{i= 0}^{k-1}{M_ix^i} $$ 4. 則 Reed-solomon codeword 的編碼被定義為 $$ s(x) = M(x)x^m - [(M(x)x^m) \space mod \space g(x)] $$ 這裡我們可以看到， $M(x)x^m$ 會形成 $x^m$, $x^{m+1}$ 到 $x^{m+k-1}$，總共 k 項，而這 k 項的係數其實也是資料的內容。而因為 $[(M(x)x^m) \space mod \space g(x)]$ 會是 degree $m - 1$ 更低，因此不會有和 $M(x)x^m$ 的計算互動，因此整個 s(x) 的 degree 會為 m + k - 1 也就是 n - 1 4. 因此，最後的編碼即為 $$ s(x) = \sum\limits_{i= 0}^{n-1}{s_ix^i} $$ 多項式的係數，也就是 $(s_0,s_1,...,s_{n-1})$ ### Peterson-Gorenstein-Zierler decoder #### Calculating syndromes 1. 假收我們收到的編碼為 $(r_0,r_1,...r_{n-1})$，這此編碼形成的多項式為： $$ r(x) = \sum\limits_{i= 0}^{n-1}{r_ix^i} $$ 2. 對於接收者而言，我們並不知道 $s_0, s_1, ... , s_{n-1}$ 是多少，但我們可以定義誤差 $e_0 = r_0 -s_0, e1 = r_1 - s1,...,e_{n-1} = r_{n-1}-s_{n-1}$，則誤差多項式為 $$ e(x) = r(x) - s(x ) = \sum\limits_{i= 0}^{n-1}{e_ix^i} $$ 3. 定義 m 個 syndrome value $S_i$ 中，$0 \leq i < m$ ，則 $$ S_i = r(\alpha^i) = s(\alpha^i) + e(\alpha^i) = 0 + e(\alpha^i) = e(\alpha^i) $$ * 這是因為 s(x) 可以被 $g(x)=(x-\alpha^0)(x-\alpha^1)...(x-\alpha^{m-1})$ 整除(參考 Systematic encoder 的第四點)，代表 syndrome 的值只會根據 error 而改變 * 所以，如果所有的 syndrome value 皆為 0 的話，這表示得到的編碼是正確的，不需要做任何的修正。 4. 如果把所有的 syndrome 方程式都列出來，那就是 $$ \left\{ \begin{array}{r} S_0 =& e_0\alpha^{0 \times 0} + e_1\alpha^{1 \times 0} + ... + e_{n-1}\alpha^{(n-1) \times 0}\\ S_1 =& e_0\alpha^{0 \times 1} + e_1\alpha^{1 \times 1} + ... + e_{n-1}\alpha^{(n-1) \times 1}\\ ... \\ S_{m-1} =& e_0\alpha^{0 \times (m-1)} + e_1\alpha^{1 \times (m-1)} + ... + e_{n-1}\alpha^{(n-1) \times (m-1)} \end{array} \right. $$ 5. 並重寫成矩陣形式： $$ \begin{bmatrix} e_0\alpha^{0 \times 0} + e_1\alpha^{1 \times 0} + ... + e_{n-1}\alpha^{(n-1) \times 0} \\ e_0\alpha^{0 \times 1} + e_1\alpha^{1 \times 1} + ... + e_{n-1}\alpha^{(n-1) \times 1} \\ ...\\ e_0\alpha^{0 \times (m-1)} + e_1\alpha^{1 \times (m-1)} + ... + e_{n-1}\alpha^{(n-1) \times (m-1)} \end{bmatrix} = \begin{bmatrix} S_0 \\ S_1 \\ ... \\ S_{m-1} \end{bmatrix} $$ 再把誤差項單獨取出 $$ \begin{bmatrix} \alpha^{0 \times 0} & \alpha^{1 \times 0} & ... & \alpha^{(n-1) \times 0} \\ \alpha^{0 \times 1} & \alpha^{1 \times 1} & ... & \alpha^{(n-1) \times 1} \\ ... \\ \alpha^{0 \times (m-1)} & \alpha^{1 \times (m-1)} & ... & \alpha^{(n-1) \times (m-1)} \end{bmatrix} \begin{bmatrix} e_0 \\ e_1 \\ ... \\ e_{n-1} \end{bmatrix} = \begin{bmatrix} S_0 \\ S_1 \\ ... \\ S_{m-1} \end{bmatrix} $$ 由於 $\alpha$ 是已知的，且 $S_i$ 也為已知，理論上我們可以對 $e_i$ 求解，不過可以看到對於 n - 1 個未知數，僅有 m - 1 個方程式，顯然該如何得到誤差項(或者比較準確的說，預測)就需要一些技巧了 #### Finding error locations 1. 我們需要先找到錯誤的位置。假設訊息中有 $\nu$ 個錯誤，如我們在前提中所說，$\nu$ 需滿足 $1 \leq \nu \leq \lfloor\frac{m}{2}\rfloor$。所以說，除非有時間或空間的限制，我們會期望 $\nu$ 可以大一些，可以最大化能糾正的錯誤 2. 假設這 $\nu$ 錯誤的位置在 $I_0,I_1,...I_{\nu-1}$，$I_i$ 之間是沒有順序的且滿足 $0 \leq I_i < n$，這表示在這位置上的誤差 $e_{I_0},e_{I_1},...,e_{I_{\nu-1}}$ 會是任意值(可能是 0)，而其他的 $e_i$ 則必須等於 0 3. 接著，再根據錯誤的地方 $I_i$ 定義，對於 $0 \leq i < \nu$ $$ X_i = \alpha^{I_i} \\ Y_i = e_{I_i} $$ 4. 因為我們知道其他的 $e_i$ 都為 0，因此可以重新定義 syndrome，把前面的矩陣方程式改寫成 $$ \begin{bmatrix} X^{0}_0 & X^{0}_1 & ... & X^{0}_{\nu-1} \\ X^{1}_0 & X^{1}_1 & ... & X^{1}_{\nu-1}\\ ... \\ X^{m-1}_0 & X^{m-1}_1 & ... & X^{m-1}_{\nu-1} \end{bmatrix} \begin{bmatrix} Y_0 \\ Y_1 \\ ... \\ Y_{\nu-1}\end{bmatrix} = \begin{bmatrix} S_0 \\ S_1 \\ ... \\ S_{m-1}\end{bmatrix} $$ 5. 接著，定義一個 error locator polynomial $$ \Lambda(x) = \prod_{i=0}^{\nu-1} (1-X_ix) \\ = 1+\Lambda_{1} x+\Lambda_{2} x^2+...\Lambda_{\nu} x^\nu $$ 6. 因為 $(1-X_iX_i^{-1}) = 0$，因此我們知道 $$ 0 = \Lambda(X_{i}^{-1}) = 1+\Lambda_{1} X_i^{-1}+\Lambda_{2} X_i^{-2}+...\Lambda_{\nu} X_i^{-\nu} $$ 7. 現在，對於 $0 \leq i < \nu$ 且 j $\in \mathbb{Z}$，將兩側同乘上 $Y_iX_i^{j+\nu}$ $$ 0 = Y_iX_i^{j+\nu}\Lambda(X_{i}^{-1}) = Y_iX_i^{j+\nu} + \Lambda_{1}Y_iX_i^{j+\nu} + ... \Lambda_{\nu}Y_iX_i^{j+\nu} $$ 8. 然後把範圍中的所有 i 得到的結果相加 $$ \begin{aligned} 0 &= \sum\limits_{i= 0}^{\nu-1}{Y_iX_i^{j+\nu}\Lambda(X_{i}^{-1})} \\ &= \sum\limits_{i= 0}^{\nu-1}{(Y_iX_i^{j+\nu} + \Lambda_1 Y_iX_i^{j+\nu - 1} + \Lambda_2 Y_iX_i^{j+\nu - 2} + ... +\Lambda_\nu Y_iX_i^{j})}\\ &= \sum\limits_{i= 0}^{\nu-1}{Y_iX_i^{j+\nu}} + \Lambda_1 \sum\limits_{i= 0}^{\nu-1}{Y_iX_i^{j+\nu - 1}} + ... +\Lambda_\nu \sum\limits_{i= 0}^{\nu-1}{Y_iX_i^{j}} \\ &= S_{j+\nu} + \Lambda_1 S_{j+\nu - 1} + ... + \Lambda_\nu S_{j} \end{aligned} $$ 然後再將式子調整成 $$ -S_{j+\nu} = \Lambda_\nu S_{j} +\Lambda_{\nu-1} S_{j+1}+ ... +\Lambda_1 S_{j+\nu - 1} $$ 其中 $0 \leq j < \nu$ 9. 則如果將每個 j 都列出等式： $$ \left\{ \begin{array}{r} \Lambda_\nu S_0 + \Lambda_{\nu-1} S_1 + ... + \Lambda_{1} S_{\nu-1} = -S_\nu \\ \Lambda_\nu S_1 + \Lambda_{\nu-1} S_2 + ... + \Lambda_{1} S_{\nu} = -S_{\nu+1}\\ ...\\ \Lambda_\nu S_{\nu-1} + \Lambda_{\nu} S_{\nu} + ... + \Lambda_{1} S_{2\nu-2} = -S_{2\nu-1} \end{array} \right. $$ 一樣可以將此轉換成矩陣運算的形式： $$ \begin{bmatrix} S_0 & S_1 & ... & S_{\nu-1} \\ S_1 & S_2 & ... & S_{\nu} \\ ...\\ S_{\nu-1} & S_\nu & ... & S_{2\nu-2} \\ \end{bmatrix} \begin{bmatrix} \Lambda_\nu \\ \Lambda_{\nu-1} \\ ... \\ \Lambda_{1}\end{bmatrix} = \begin{bmatrix} -S_\nu \\ -S_{\nu+1} \\ ... \\ -S_{2\nu-1}\end{bmatrix} $$ 10. 現在，我們終於得到了可以求解的方程式了，我們可以將其寫成擴增矩陣： $$ \begin{bmatrix} S_0 & S_1 & ... & S_{\nu-1} & -S_\nu\\ S_1 & S_2 & ... & S_{\nu} & -S_{\nu+1}\\ ...\\ S_{\nu-1} & S_\nu & ... & S_{2\nu-2} & -S_{2\nu-1}\\ \end{bmatrix} $$ 並且透過 [Gauss-Jordan elimination](https://online.stat.psu.edu/statprogram/reviews/matrix-algebra/gauss-jordan-elimination) 求解。如果對此線性矩陣的解為 [inconsistent](https://en.wikipedia.org/wiki/Consistent_and_inconsistent_equations)(無解)，那表示 codeword 中有超過 $\nu$ 數量的錯誤，因此無法得到解。如果我們的 $\nu$ 可以更大，則應該將其調大並且重新解碼，否則，就是無法完成 error correction。如果是 consistent 但是 under-determined(無限多解)，這可能發生於錯誤少於 $\nu$ 個錯誤時，因為任何一個 $e_i = 0$ 的位置都可以被視為是一個 virtual error，此時就必須把 $\nu$ 降低並重新解碼。 11. 一旦我們得到係數 $\Lambda_1$、$\Lambda_2$、...$\Lambda_\nu$，則此時可以把 x = $\alpha^{-i}$($0 \leq i < n$) 代入 $\Lambda(x)$ 中進行運算看看 $\Lambda(\alpha^{-i})$ 是否為 0，表示此位置有誤差，即 $e^i$ 不為 0，把這個 i 加入 $I_i$ 如果找到不為 0 的 $e^i$ 多餘 $\nu$ 個，表示傳輸後的資料錯誤太多，無法修正。如果少於 $\nu$ 個，表示誤差比預計中的少，我們只需將 $\nu$ 重新定義為較低的數字，並刪除編號較高的 $I_i$ 變量即可。對於大於等於 n 的 i 毋需被代入，這是因為錯誤不能出現在長度為 n 的資料以外。 #### Calculating error values 1. 現在，我們已經得知了錯誤的位置 $I_0,I_1,...I_{\nu-1}$，因此我們就知道 $X_i$ = $\alpha^{I_i}$, $0 \leq i < \nu$ 2. 由於 $Y_i = e_{I_i}$ ，且我們已知 $S_i$，所以 $$ \begin{bmatrix} X^{0}_0 & X^{0}_1 & ... & X^{0}_{\nu-1} \\ X^{1}_0 & X^{1}_1 & ... & X^{1}_{\nu-1}\\ ... \\ X^{m-1}_0 & X^{m-1}_1 & ... & X^{m-1}_{\nu-1} \end{bmatrix} \begin{bmatrix} Y_0 \\ Y_1 \\ ... \\ Y_{\nu-1}\end{bmatrix} = \begin{bmatrix} S_0 \\ S_1 \\ ... \\ S_{m-1}\end{bmatrix} $$ 現在就可以求解了。同樣可以透過 [Gauss-Jordan elimination](https://online.stat.psu.edu/statprogram/reviews/matrix-algebra/gauss-jordan-elimination) 來求解。如果 linear system 是 consistent 且 independent 的，則存在唯一解，可以成功修正錯誤。如果是 inconsistent，則表示錯誤無法被修正，或者結果是 consistent 但 dependent，那表示沒有唯一的解（即使唯一解是必須的)。 3. 則對於修正後的訊息 $r'(x) = r'_0x^0+r'_1x^1+...r'_{n-1}x^{n-1}$，對於在錯誤位置 $I_i$ $0 \leq i < \nu$ 的部份，根據 $r'_{I_i} = r_{I_i} \!-\! Y_i = r_{I_i} \!-\! e_{I_i}$ 去修正，其他則維持 $r_i' = r_i$ 即可。 ### 註記 decode 的部份，Peterson-Gorenstein-Zierler decoder 是$Θ(m^3+mk)$ 的解，但在速度的考量上，則應該透過 [Berlekamp-Massey algorithm](https://en.wikipedia.org/wiki/Reed%E2%80%93Solomon_error_correction#Berlekamp%E2%80%93Massey%20decoder) 求出位置，加上 [Forney algorithm](https://en.wikipedia.org/wiki/Forney_algorithm) 還原錯誤，其中兩者的時間複雜度皆為 $Θ(m^2)$ ## 如何實現 Reed–Solomon codes? > 此節內容譯自: [Reed–Solomon codes for coders](https://en.wikiversity.org/wiki/Reed%E2%80%93Solomon_codes_for_coders) 在前面的章節中，我們從數學的角度探討了 Reed–Solomon codes 可行的原因，然而如果你跟我一樣數學都很差勁的話(萬分的抱歉...)，可能對於如何撰寫程式的層面上還是有點模糊。這裡我們引用另一篇文章，來理解相關的操作可以如何被實現。 ### 基本想法想像一下，資料在傳輸的過程中有資料遺失了，最終我們收到的資料叫作 `a p p * e`。不過我們可以猜到遺失的字母是甚麼，因為當我們把字典翻開，我們找到字典中最接近的字就是 `apple`。那麼如果我們收到的資料是 `b * d` 呢? 這時候它應該要是 `bad` 還是 `bed`? 為了解決這個問題，我們可以在原始資料後面加上一下額外的字母然後傳輸，例如： ``` bad -> badxyz bed -> bedqwe ``` 這時候，假如我們收到的資料有遺失，變成 `b * d x * z` ，那麼對照字典，最接近的字應該是 `badxyz`，所以我們就有足夠的證據猜測原始的資料應該是 `bad` 了。 ### BCH code [BCH code](https://en.wikipedia.org/wiki/BCH_code) 是 Reed–Solomon codes 經常使用的編碼形式，在前面的章節也有提到，可以簡單理解為是透過多項式的係數來表達資料。 #### BCH error detection BCH 的錯誤檢查類似整數長除法，不過減法的運算會改用 XOR。舉例來說，對於 000111101011001 這個 formating infomation 以及 10100110111 這個 generator，如果可以整除(餘數為 0)，那就是表示沒有錯誤發生，此時得到的商就是原始的資料，否則的話就表示有錯誤需要被修正。 :::info 因為原文中是以 QR 為例來說明的，關於 [formating infomation](https://en.wikiversity.org/wiki/Reed%E2%80%93Solomon_codes_for_coders#Formatting_information) 請參考原文，還蠻有趣的XD，有空我就把完整內容整理上來。我們可以先簡單理解成 000111101011001 的前 5 個 bit 是真正的資料，後面的 10 個 bit 則是用來修正錯誤的編碼。 ::: ``` 00011 ---------------- 10100110111 ) 000111101011001 ^ 10100110111 -------------- 010100110111 ^ 10100110111 ------------- 00000000000 ``` 對應的 python code 實作如下 ```=python def qr_check_format(fmt): g = 0x537 # = 0b10100110111 in python 2.6+ for i in range(4,-1,-1): if fmt & (1 << (i+10)): fmt ^= g << i return fmt ``` 所以我們也可以反過來用此來編碼 5 bit 的 formating infomation ```=python format = 0x3 encoded_format = (format<<10) ^ qr_check_format(format<<10) print(hex(encoded_format)) # 反過來可以得到 formating infomation 000111101011001 ``` #### BCH error correction 如果得到的餘數不為 0，表示資料有誤，那麼我們就要 "猜出" 原本的資料為何。對於 BCH 的解碼有許多演算法，不過這裡我們可以用一個簡單的方式來理解：因為我們的原始資料只有 32 種可能，因此我們可以一一比較，並以 hamming distance 最小的那個作為猜出的結果 ```=python def hamming_weight(x): weight = 0 while x > 0: weight += x & 1 x >>= 1 return weight def qr_decode_format(fmt): best_fmt = -1 best_dist = 15 for test_fmt in range(0,32): test_code = (test_fmt<<10) ^ qr_check_format(test_fmt<<10) test_dist = hamming_weight(fmt ^ test_code) if test_dist < best_dist: best_dist = test_dist best_fmt = test_fmt elif test_dist == best_dist: best_fmt = -1 return best_fmt ``` 事實上 Reed–Solomon 就是依據類似的概念來解碼的，不過顯然資料很大的時候，我們就不能採取這種搜尋方法了，一些比較複雜的作法，如 Berlekamp-Massey，就必須被使用。 ### Finite field arithmetic 要了解 Reed–Solomon，就不得不對 [finite field](https://en.wikipedia.org/wiki/Finite_field) 有一定的了解。我們需要定義對於一個 8 bit 數字加法、減法、乘法、除法，且經過這些運算後仍為 8 bit，一個比較 naive 的作法是，我們可以定義這些運算和自然數的作法相同，並 mod 256 以避免 overflow，對於這些範圍的數字，我們就將其叫作 Galois Field 2^8，這個作法事實上是可以成立的(你可能會質疑在除法的可行性？下面會解釋之)。要簡單的說明甚麼是 Galois Fields 的話: 一個 finite field 就是在有限集合中的一群數字，在這個 field 裡，我們需要定義加法、減法、乘法、除法滿足: [封閉性](https://en.wikipedia.org/wiki/Closure_(mathematics))、[結合律](https://en.wikipedia.org/wiki/Associative_property)、[交換律](https://en.wikipedia.org/wiki/Commutative_property)、[分配律](https://en.wikipedia.org/wiki/Distributive_property)、[Identity](https://simple.wikipedia.org/wiki/Identity_Property) 和 inverse 六大 property。例如，實數域 ℝ 是一個 field，但整數域 ℤ 就不是，因為不滿足乘法的 inverse property(找不到整數域中的 x 可以滿足 x * 4 = 5)。為了處理此問題，一個可行的作法是用質數來當作 modulo，例如 257，或者是任何質數的正整數數次方。也就是說，整數域 ℤ modulo 任何的質數都可以被稱為 Galois Field。而 modulo 2 則是有趣的一類，例如這裡的例子中，因為 256 = $2^8$，這是質數 2 的 8 次方，這使得 GF(2^8) 的計算定義有一些特性。 :::warning 細節可以參考[原文](https://en.wikiversity.org/wiki/Reed%E2%80%93Solomon_codes_for_coders#Introduction_to_mathematical_fields)以及 [Non-prime fields](https://en.wikipedia.org/wiki/Finite_field#Non-prime%20fields)，因我暫且還沒理解的很好，避免誤導就暫時不整理進來。 ::: #### 加減法對於 GF(2^n) ，加法和減法可以同時被 XOR 運算所替代，這很符合邏輯，因為相加 modulo 2 就像 XOR，而相減 modulo 2 和相加 modulo 2 並沒有不同。 ``` 0101 + 0110 = 0101 - 0110 = 0101 XOR 0110 = 0011 ``` 或者從多項式的角度去思考 $(x^2 + 1) + (x^2 + x) = 2 x^2 + x + 1 = 0 x^2 + x + 1 = x + 1$ 注意到這是因為我們是在 GF(2^n) 中，才可以用 XOR 取代加減法，在其他的 GF 中則不一定如此。 ```=python= def gf_add(x, y): return x ^ y def gf_sub(x, y): return x ^ y ``` #### 乘法乘法的運算是基於多項式的乘法，例如 10001001 乘以 00101010： $$ (x^7 + x^3 + 1) * (x^5 + x^3 + x^1)\\ = x^7 * (x^5 + x^3 + x^) + x^3 * (x^5 + x^3 + x^1) + 1 * (x^5 + x^3 + x^1)\\ = x^{12} + x^{10} + x^6 + x^5 + x^4 + x^3 + x^1 $$ 這等同於 ``` 10001001 * 00101010 ---------- 10001001 ^ 10001001 ^ 10001001 ------------- 1010001111010 ``` 注意要把加法替換成 XOR。顯然，此時得到的 bit 已經超過原本的 8 bits，因此我們需要 modulo 一個神祕的數字 100011101，同等的操作也可以寫成： ``` 1010001111010 ^ 100011101 ---------------- 0010110101010 ^ 100011101 ---------------- 00111011110 ^ 100011101 --------------- 011000011 ``` 對應的程式碼為： ```python def gf_mult_noLUT(x, y, prim=0): '''Multiplication in Galois Fields without using a precomputed look-up table (and thus it's slower) by using the standard carry-less multiplication + modular reduction using an irreducible prime polynomial''' ### Define bitwise carry-less operations as inner functions ### def cl_mult(x,y): '''Bitwise carry-less multiplication on integers''' z = 0 i = 0 while (y>>i) > 0: if y & (1<<i): z ^= x<<i i += 1 return z def bit_length(n): '''Compute the position of the most significant bit (1) of an integer. Equivalent to int.bit_length()''' bits = 0 while n >> bits: bits += 1 return bits def cl_div(dividend, divisor=None): '''Bitwise carry-less long division on integers and returns the remainder''' # Compute the position of the most significant bit for each integers dl1 = bit_length(dividend) dl2 = bit_length(divisor) # If the dividend is smaller than the divisor, just exit if dl1 < dl2: return dividend # Else, align the most significant 1 of the divisor to the most significant 1 of the dividend (by shifting the divisor) for i in range(dl1-dl2,-1,-1): # Check that the dividend is divisible (useless for the first iteration but important for the next ones) if dividend & (1 << i+dl2-1): # If divisible, then shift the divisor to align the most significant bits and XOR (carry-less subtraction) dividend ^= divisor << i return dividend ### Main GF multiplication routine ### # Multiply the gf numbers result = cl_mult(x,y) # Then do a modular reduction (ie, remainder from the division) with an irreducible primitive polynomial so that it stays inside GF bounds if prim > 0: result = cl_div(result, prim) return result ``` 為甚麼需要 mod 100011101 呢，其數學證明可能會有些複雜，不過簡而言之，這是為了確保 GF(2^8) 多項式是 "irreducible"(可以想成是一個質數的多項式，所以不能被任兩個 degree 小的多項式相乘而得到)，這個數字被稱做 primitive polynomial \ irreducible polynomial \ prime polynomial， 100011101 (0x11d) 是其中常用的一種。至於該如何生成這個數字呢，請參考 [Universal Reed-Solomon Codec](https://en.wikiversity.org/wiki/Reed%E2%80%93Solomon_codes_for_coders/Additional_information#Universal_Reed-Solomon_Codec)。 #### Multiplication with logarithms 雖然前面的作法可以達成 GF 中數字的乘法，然而並不是一個高效率的作法，一個更好的作法是透過查表，不過8 bit 數字會有 256 種結果，我們也沒必要把 256 x 256 的乘法結果完全的紀錄起來。下面介紹一種更精簡的作法: 首先，定義 $\alpha$ 為一個 generator number = 2，則對於 $\alpha^i$ 的計算就只是對 $\alpha$ 做位元的位移，然後和 100011101 做 XOR (如果有 overflow) 運算而已。我們可以得到如下： $$ α^0 = 00000001 \space α^4 = 00010000 \space α^8 = 00011101\space α^{12} = 11001101\\ α^1 = 00000010 \space α^5 = 00100000 \space α^9 = 00111010 \space α^{13} = 10000111\\ α^2 = 00000100 \space α^6 = 01000000 \space α^{10} = 01110100 \space α^{14} = 00010011\\ α^3 = 00001000 \space α^7 = 10000000 \space α^{11} = 11101000 \space α^{15} = 00100110\\ $$ 如果依此類推建立表格，從 $\alpha^0$ 直到 $\alpha^{255}$ 前的數值都不會有重複(這是 GF 的特性)，也就是說，除了 0 之外的數都會是某個 $\alpha^i$。那麼，我們知道 10001001 乘以 00101010 就會是某兩個 $\alpha$ 相乘 $$ 10001001 * 00101010 = \alpha^{74} \times \alpha^{142} = \alpha^{216} = 11000011 $$ 因此我們只要建立起對於 $\alpha^i$ 查到的值，以及反過來對於一個值對應的 $\alpha^i$ 就可以有效的進行乘法運算了！ ```python gf_exp = [0] * 512 # Create list of 512 elements. In Python 2.6+, consider using bytearray gf_log = [0] * 256 def init_tables(prim=0x11d): '''Precompute the logarithm and anti-log tables for faster computation later, using the provided primitive polynomial.''' # prim is the primitive (binary) polynomial. Since it's a polynomial in the binary sense, # it's only in fact a single galois field value between 0 and 255, and not a list of gf values. global gf_exp, gf_log gf_exp = [0] * 512 # anti-log (exponential) table gf_log = [0] * 256 # log table # For each possible value in the galois field 2^8, we will pre-compute the logarithm and anti-logarithm (exponential) of this value x = 1 for i in range(0, 255): gf_exp[i] = x # compute anti-log for this value and store it in a table gf_log[x] = i # compute log at the same time x = gf_mult_noLUT(x, 2, prim) # If you use only generator==2 or a power of 2, you can use the following which is faster than gf_mult_noLUT(): #x <<= 1 # multiply by 2 (change 1 by another number y to multiply by a power of 2^y) #if x & 0x100: # similar to x >= 256, but a lot faster (because 0x100 == 256) #x ^= prim # substract the primary polynomial to the current value (instead of 255, so that we get a unique set made of coprime numbers), this is the core of the tables generation # Optimization: double the size of the anti-log table so that we don't need to mod 255 to # stay inside the bounds (because we will mainly use this table for the multiplication of two GF numbers, no more). for i in range(255, 512): gf_exp[i] = gf_exp[i - 255] return [gf_log, gf_exp] ``` 稍微瀏覽一下程式碼，我們可以看出其中的 `gf_exp` 是用來計算從 $\alpha^1$ 到 $\alpha^{256}$ 的數值的，而 `gf_log` 則是反過來的 `index`。也可以看到程式碼中紀錄的事實上是 $\alpha^1$ 到 $\alpha^{512}$，這是使我們在查表時 gf_log[x] + gf_log[y] 超過範圍也可以找到對應的 i。 ```python def gf_mul(x,y): if x==0 or y==0: return 0 return gf_exp[gf_log[x] + gf_log[y]] ``` #### 除法透過對 $\alpha$ 建立的表格，則除法的運算也就可以一併加速了 ```python def gf_div(x,y): if y==0: raise ZeroDivisionError() if x==0: return 0 return gf_exp[(gf_log[x] + 255 - gf_log[y]) % 255] ``` 注意這裡 `+ 255` 可以保證 index > 0 #### Power and Inverse 同樣的，冪次和 inverse 也可以一併加速 ```python def gf_pow(x, power): return gf_exp[(gf_log[x] * power) % 255] def gf_inverse(x): return gf_exp[255 - gf_log[x]] # gf_inverse(x) == gf_div(1, x) ``` #### Polynomials 接下來，我們還需要再定義一些對於多項式係數的操作。這裡可能會有點混亂(至少我是真的很混亂啦XD)，因為GF 的元素本身，做為多項式的係數也是個多項式，所以勢必不要搞混了。以下面的多項式為例： $$ 00000001 x^4 + 00001111 x^3 + 00110110 x^2 + 01111000 x^1 + 01000000\\= 01 x^4 + 0f x^3 + 36 x^2 + 78 x^1 + 40 $$ 在程式中，我們可以把這個係數在矩陣中寫成 [0x01, 0x0f, 0x36, 0x78, 0x40]，根據實作的不同考量，也可以把順序反過來。則第一個 function 用來與一個 scalar 相乘 ```python def gf_poly_scale(p,x): r = [0] * len(p) for i in range(0, len(p)): r[i] = gf_mul(p[i], x) return r ``` 將兩個多項式相加(記得用 XOR) ```python def gf_poly_add(p,q): r = [0] * max(len(p),len(q)) for i in range(0,len(p)): r[i+len(r)-len(p)] = p[i] for i in range(0,len(q)): r[i+len(r)-len(q)] ^= q[i] return r ``` 將兩個多項式相乘 ```python= def gf_poly_mul(p,q): '''Multiply two polynomials, inside Galois Field''' # Pre-allocate the result array r = [0] * (len(p)+len(q)-2) # Compute the polynomial multiplication (just like the outer product of two vectors, # we multiply each coefficients of p with all coefficients of q) for j in range(0, len(q)): for i in range(0, len(p)): r[i+j] ^= gf_mul(p[i], q[j]) # equivalent to: r[i + j] = gf_add(r[i+j], gf_mul(p[i], q[j])) # -- you can see it's your usual polynomial multiplication return r ``` 最後，我們需要一個可以對特定的 x 求出多項式值的 function。這裡使用了 [Horner's method](https://en.wikipedia.org/wiki/Horner%27s_method) 使得 $x^i$ 的計算可以迭代累積，不需要每次都重複計算 ```python def gf_poly_eval(poly, x): '''Evaluates a polynomial in GF(2^p) given the value for x. This is based on Horner's scheme for maximum efficiency.''' y = poly[0] for i in range(1, len(poly)): y = gf_mul(y, x) ^ poly[i] return y ``` 最後，我們還需要一個較為複雜的多項式除法，這個留待下一節來講解。 ## linux kernel 中的 Reed–Solomon codes 可喜可賀，我們終於來到 Reed–Solomon codes 的關鍵實作了。在前一節的文章中，使用了 python 實作了 Reed–Solomon codes，作為認識該如何實作 RS 是一個很好的出發點，十分建議先瀏覽一遍，這裡暫時不對該部分多做介紹了。這裡我們從另一個應用來嘗試了解 Reed-solomon codes 的實作，那就是 linux 中的 [linux/lib/reed_solomon/](https://github.com/torvalds/linux/tree/master/lib/reed_solomon)。沒錯! linux 中是有相關的 API 可以使用的。下面，我們就以 [Reed-Solomon Library Programming Interface](https://www.kernel.org/doc/html/latest/core-api/librs.html#reed-solomon-library-programming-interface) 為出發點，嘗試從 API 的使用深入到程式內部吧! ### `rs_control` 我們可以在 [linux/include/linux/rslib.h](https://github.com/torvalds/linux/blob/0fe5f9ca223573167c4c4156903d751d2c8e160e/include/linux/rslib.h) 找到相關的資料結構: ```cpp struct rs_control { struct rs_codec *codec; uint16_t buffers[]; }; ``` * `codec`: 整個 Reed-Solomon 的編碼相關參數 * `buffer`: 在 `decode_rs()` 內部會使用的 buffer ```cpp struct rs_codec { int mm; int nn; uint16_t *alpha_to; uint16_t *index_of; uint16_t *genpoly; int nroots; int fcr; int prim; int iprim; int gfpoly; int (*gffunc)(int); int users; struct list_head list; }; ``` rs_codec 則是整個 Reed-Solomon 中最關鍵的資料結構： * `mm`: 每個 symbol 總共的 bit 數量 * `nn`: 每個 block 中共包含幾個 symbol (就是 `(1<<mm)-1`) * `alpha_to`: 為了增加在 [Galois field](https://en.wikipedia.org/wiki/Finite_field) 中乘法運算的效率，會建立每個 $\alpha^i$ 的表 * `index_of`: 同樣考量效率，對於每個值，也要可以反回去找到對應的 $alpha^i$ * `genpoly`: 生成式的係數矩陣 * `nroots`: 生成多項式的 degree，也就是額外的 parity 數量 * `fcr`: first consecutive root(index form)。這可以使計算出生成多項式時的 consecutive root 改變，通常都設為 0 * 舉例來說，對於 degree 3 的生成多項式，如果指定 fcr = 3，計算出的生成多項式就是 $g(x) = \prod_{i=3}^{5} (x-\alpha^i)$ * 也因此有範圍限制( 舉例來說 255 for GF(2^8))) * `prim`: 表示用 Galois field 中的 primitive element / generator $\alpha^{prim}$ 來當 generator，參照 [Primitive element (finite field)](https://en.wikipedia.org/wiki/Primitive_element_(finite_field)) :::info 在 linux 的實作中，如果不考慮 `gffunc` 的使用，則 α 固定 = 2。輸入的 `prim` 是對此的 index，因此實際的 genearator $\beta$ = $\alpha^{prim}$，於是 $\beta^2 = \alpha^{2 * prim}$... 以此類推。為了方便解說，後面我們也會把這裡設定的 primitive element 稱為 $\beta$ ::: * `iprim`: ?? * `gfpoly`: [Primitive polynomial](https://en.wikipedia.org/wiki/Primitive_polynomial_(field_theory)) 的多項式，這是為了把多項式制約在 Galois field 中。以 `0x409 = 0b0100 0000 1001` 為例，表示 primitive polynomial 為 $x^{10}+x^3+1$ ### `init_rs` 想要使用 Reed–Solomon codes，首先我們需要先初始化相關的資料結構。 ```cpp static struct rs_control *rs_decoder; rs_decoder = init_rs (10, 0x409, 0, 1, 6); ``` 我們可以在 [linux/include/linux/rslib.h](https://github.com/torvalds/linux/blob/0fe5f9ca223573167c4c4156903d751d2c8e160e/include/linux/rslib.h) 找到其定義，首先先來看相關的參數設定: ```cpp static inline struct rs_control *init_rs(int symsize, int gfpoly, int fcr, int prim, int nroots) { return init_rs_gfp(symsize, gfpoly, fcr, prim, nroots, GFP_KERNEL); } struct rs_control *init_rs_gfp(int symsize, int gfpoly, int fcr, int prim, int nroots, gfp_t gfp) { return init_rs_internal(symsize, gfpoly, NULL, fcr, prim, nroots, gfp); } ``` `init_rs(int symsize, int gfpoly, int fcr, int prim, int nroots)` * `symsize`: 每個 symbol 的 bit 數量 :::warning 根據 [Reed-Solomon Library Programming Interface](https://www.kernel.org/doc/html/latest/core-api/librs.html) > The databytes are expanded to the given symbol size on the fly. There is no support for decoding continuous bitstreams with a symbolsize != 8 at the moment. If it is necessary it should be not a big deal to implement such functionality. ::: * 其他的參數基本上就對應資料結構 `rs_codec` 中的同名稱 `init_rs` 會呼叫 `init_rs_gfp`，`init_rs_gfp` 再呼叫 `init_rs_internal`，針對額外的參數說明： ```cpp init_rs_internal(int symsize, int gfpoly, int (*gffunc)(int), int fcr, int prim, int nroots, gfp_t gfp) ``` * `gffunc`: 在 [non-canonical](https://en.wikipedia.org/wiki/Canonical_form) 模式下生成不同的 $\alpha$，不過在 `init_rs` 下會被設為 NULL(對此的使用還沒概念，暫時先忽略qq) * `gfp`: 使用 [kmalloc](https://www.kernel.org/doc/htmldocs/kernel-api/API-kmalloc.html) 時的 flag，會影響記憶體配置的區域和行為不同 ### `init_rs_internal` 初始 `rs_control` 結構。 ```cpp if (symsize < 1) return NULL; if (fcr < 0 || fcr >= (1<<symsize)) return NULL; if (prim <= 0 || prim >= (1<<symsize)) return NULL; if (nroots < 0 || nroots >= (1<<symsize)) return NULL; ``` 幾個基本的檢查被進行： * `symsize` 是每個 symbol 的 bit 數量，因此不能小於 1 * `fcr` 的範圍，參見前面的說明 * `prim` 的範圍，因為 primitive element 不能是 $\alpha^0 = 1$ 也不能是 GF 外的 $\alpha^i$ * 每個 block 只會有 `(1<<symsize)-1` 個 symbol，`nroots` 大於這個範圍是不合理的，小於 0 顯然也是 ```cpp /* * The decoder needs buffers in each control struct instance to * avoid variable size or large fixed size allocations on * stack. Size the buffers to arrays of [nroots + 1]. */ bsize = sizeof(uint16_t) * RS_DECODE_NUM_BUFFERS * (nroots + 1); rs = kzalloc(sizeof(*rs) + bsize, gfp); if (!rs) return NULL; ``` * decode 的時候，會需要一個固定的大小的 heap buffer 空間，可以看到 rs_control 中使用了 [Arrays of Length Zero](https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html) 技巧，因此可以在執行時期才決定分配的空間。`RS_DECODE_NUM_BUFFERS` = 8。 ```cpp mutex_lock(&rslistlock); /* Walk through the list and look for a matching entry */ list_for_each(tmp, &codec_list) { struct rs_codec *cd = list_entry(tmp, struct rs_codec, list); if (symsize != cd->mm) continue; if (gfpoly != cd->gfpoly) continue; if (gffunc != cd->gffunc) continue; if (fcr != cd->fcr) continue; if (prim != cd->prim) continue; if (nroots != cd->nroots) continue; /* We have a matching one already */ cd->users++; rs->codec = cd; goto out; } ``` * [linux/include/linux/list.h](https://github.com/torvalds/linux/blob/master/include/linux/list.h) 是 linux kernel 中強大的 doubly linked list 結構，它對於 linked list 的操作進行了許多封裝，因此程式開發者可以很容易的建立自定義的 struct linked list > 延伸參考: [Linux鏈結串列struct list_head 研究](https://myao0730.blogspot.com/2016/12/linux.html) * 因為 list 結構是 non thread-safe，所以可以看到首先要透過 `mutex_lock` 保護 critical section * `list_for_each(tmp, &codec_list)` 走過整個 linked list，檢查是否已經存在相同的編碼結構，如果有，則直接 reference 一個相同的即可，不需要另外 init 一份 ```cpp /* Create a new one */ rs->codec = codec_init(symsize, gfpoly, gffunc, fcr, prim, nroots, gfp); if (!rs->codec) { kfree(rs); rs = NULL; } out: mutex_unlock(&rslistlock); return rs; ``` * 如果此結構不存在，則呼叫 `codec_init` 去建立一個新的結構 ### `codec_init` ```cpp int i, j, sr, root, iprim; struct rs_codec *rs; rs = kzalloc(sizeof(*rs), gfp); if (!rs) return NULL; INIT_LIST_HEAD(&rs->list); rs->mm = symsize; rs->nn = (1 << symsize) - 1; rs->fcr = fcr; rs->prim = prim; rs->nroots = nroots; rs->gfpoly = gfpoly; rs->gffunc = gffunc; /* Allocate the arrays */ rs->alpha_to = kmalloc_array(rs->nn + 1, sizeof(uint16_t), gfp); if (rs->alpha_to == NULL) goto err; rs->index_of = kmalloc_array(rs->nn + 1, sizeof(uint16_t), gfp); if (rs->index_of == NULL) goto err; rs->genpoly = kmalloc_array(rs->nroots + 1, sizeof(uint16_t), gfp); if(rs->genpoly == NULL) goto err; ``` * 初始化一個 `rs_codec` 所需的空間，並透過 `INIT_LIST_HEAD` 初始化這個 linked list 節點 * 初始化相關成員、矩陣，關於這些成員的作用請參考前面 ```cpp /* Generate Galois field lookup tables */ rs->index_of[0] = rs->nn; /* log(zero) = -inf */ rs->alpha_to[rs->nn] = 0; /* alpha**-inf = 0 */ if (gfpoly) { sr = 1; for (i = 0; i < rs->nn; i++) { rs->index_of[sr] = i; rs->alpha_to[i] = sr; sr <<= 1; if (sr & (1 << symsize)) sr ^= gfpoly; sr &= rs->nn; } } else { /* else case 在 init_rs_non_canonical 才會用到， * 尚未研究，請先讓我省略這部份 qq */ ``` * 定義 `rs->alpha_to[rs->nn] = 0` * 接著，我們需要初始化 $\alpha^0$、$\alpha^1$、...、$\alpha^{2^{symsize}-1}$ 對應的數值，以及反過來的查找，這個加速我們之後編碼與解碼的運算 * 因為 $\alpha$ = 2，因此每次對 $\alpha^i$ 的運算就是把 $\alpha^{i-1}$ 左移 1 bit * `if (sr & (1 << symsize))` 判斷是否有 overflow，如果有，則需要額外對 `gfpoly`，也就是 primitive polynomial 做 XOR 運算，最後則 `sr &= rs->nn` 以確保得到的數字仍在 GF 中 ```cpp /* If it's not primitive, exit */ if(sr != rs->alpha_to[0]) goto err; /* Find prim-th root of 1, used in decoding */ for(iprim = 1; (iprim % prim) != 0; iprim += rs->nn); /* prim-th root of 1, index form */ rs->iprim = iprim / prim; ``` * 如果 sr 不等於 $\alpha^0$，表示 $\alpha^i$ 建立起的集合並非 GF，有錯誤 * 這裡要計算出 decode 用到的 `iprim` (？？) ```cpp /* Form RS code generator polynomial from its roots */ rs->genpoly[0] = 1; for (i = 0, root = fcr * prim; i < nroots; i++, root += prim) { rs->genpoly[i + 1] = 1; /* Multiply rs->genpoly[] by @**(root + x) */ for (j = i; j > 0; j--) { if (rs->genpoly[j] != 0) { rs->genpoly[j] = rs->genpoly[j -1] ^ rs->alpha_to[rs_modnn(rs, rs->index_of[rs->genpoly[j]] + root)]; } else rs->genpoly[j] = rs->genpoly[j - 1]; } /* rs->genpoly[0] can never be zero */ rs->genpoly[0] = rs->alpha_to[rs_modnn(rs, rs->index_of[rs->genpoly[0]] + root)]; } ``` * genpoly 的 index 從 0 到 rs->nroots，就對應多項式從 1 到 $x^{nroots-1}$ * 建立生成多項式，我們知道生成多項式長成 $g(x) = \prod_{i=fcr}^{fcr+nroots} (x-\beta^i)$，這裡顯然是在對其進行計算 * 對於 `for (j = i; j > 0; j--)` 迴圈中的運算，可以理解為例如當下的 genpoly 對應的多項式可能是 $a_j+ a_{j-1}x^1...+ a_2x^{j-2} + a_1x^{j-1}$，然後要乘上下一個 $(x-\beta^{root})$，因此新的多項式係數，假設是 $x$ 項好了，就會變成 $a_{j-1} \times \beta^{root} + a_j \times 1$ * 呈上，$a_{j-1}$ 其實就是某一個 $\alpha^i$，而 $\beta^{root}$ 也是某個 $\alpha^j$，則因為 $\alpha^{(i+j)}$ 可能會超出能 index 到的範圍，因此就要去使用 `rs_modnn(rs,rs->index_of[rs->genpoly[j]] + root)` 去調整 ```cpp for (i = 0; i <= nroots; i++) rs->genpoly[i] = rs->index_of[rs->genpoly[i]]; rs->users = 1; list_add(&rs->list, &codec_list); return rs; err: kfree(rs->genpoly); kfree(rs->index_of); kfree(rs->alpha_to); kfree(rs); return NULL; ``` * 前面計算出來的 `genpoly` 是係數的真實值，為了方便計算，透過 `index_of` 改成用 index 來代表 * `rs->users = 1` 計算在 reference 這個結構的數量 ### `rs_modnn` ```cpp static inline int rs_modnn(struct rs_codec *rs, int x) { while (x >= rs->nn) { x -= rs->nn; x = (x >> rs->mm) + (x & rs->nn); } return x; } ``` * 如果 x 大於 `rs->nn = (1<<mm)-1`，表示 overflow 所以不存在定義的 GF 中，此時需調整成 `x % rs->nn` * 這裡對取餘運算進行了優化，可以參考[Mersenne Numbers: Mod 3, Mod 7, Mod ($2^i$ – 1)](http://homepage.divms.uiowa.edu/~jones/bcd/mod.shtml#exmod3) (沒有找到比較明確的數學證明，待補) ### `encode_rs8` 初始化 RS 相關的資料結構後，接著我們就可以來進行編碼了，下面以 `encode_rs8` 為範例 ```cpp /* Parity buffer. Size = number of roots */ uint16_t par[6]; /* Initialize the parity buffer */ memset(par, 0, sizeof(par)); /* Encode 512 byte in data8. Store parity in buffer par */ encode_rs8 (rs_decoder, data8, 512, par, 0); ``` ```cpp int encode_rs8(struct rs_control *rsc, uint8_t *data, int len, uint16_t *par, uint16_t invmsk) ``` * `rsc` 就是我們初始化過的結構 * `data` 是要被 encode 的資料，其長度為 `len` * `par` 則是呈裝 parity data 的地方，注意需要由使用者自行初始化 * `invmsk` 可以讓我們在編碼時加上一個 mask(關於為甚麼需要 mask，可以稍微參考 [Reed–Solomon codes for coders: Masking](https://en.wikiversity.org/wiki/Reed%E2%80%93Solomon_codes_for_coders#Masking)) ```cpp pad = nn - nroots - len; if (pad < 0 || pad >= nn) return -ERANGE; ``` * 首先檢查 `nn - nroots - len`，也就是檢查資料數量加上 parity 數量是否超過總 symbol 數量，這是因為編碼數量必須小於 GF 的元素總量 $\lvert F\rvert$ ```c= for (i = 0; i < len; i++) { fb = index_of[((((uint16_t) data[i])^invmsk) & msk) ^ par[0]]; /* feedback term is non-zero */ if (fb != nn) { for (j = 1; j < nroots; j++) { par[j] ^= alpha_to[rs_modnn(rs, fb + genpoly[nroots - j])]; } } /* Shift */ memmove(&par[0], &par[1], sizeof(uint16_t) * (nroots - 1)); if (fb != nn) { par[nroots - 1] = alpha_to[rs_modnn(rs, fb + genpoly[0])]; } else { par[nroots - 1] = 0; } } ``` 以編碼而言，事實上就是把資料對生成多項式做長除法，並把餘數紀錄到 `par` 中，為了儘量說清楚，這裡我借用 [Polynomial_division](https://en.wikiversity.org/wiki/Reed%E2%80%93Solomon_codes_for_coders#Polynomial_division) 為例來解說，以這個長除法為例，假設我們的資料是 `12 34 56`，生成多項式為 `01 0f 36 78 40`，然後其 `par` 的大小為 4 個 uint8: ``` 12 da df --------------------- 01 0f 36 78 40 ) 12 34 56 00 00 00 00 ^ 12 ee 2b 23 f4 ---------------- da 7d 23 f4 00 ^ da a2 85 79 84 ----------------- df a6 8d 84 00 ^ df 91 6b fc d9 ---------------- 37 e6 78 d9 ``` * 先假設沒有設 invmsk * 那麼最外層迴圈的第 1 層，i = 0，一開始的 `par` 成員就是 `00 00 00 00` ，`fb` 先取出 `12` * 在第 5 行的 for 迴圈要計算出後面的 `ee 2b 23`，這會是透過 `par[j] ^= alpha_to[rs_modnn(rs, fb + genpoly[nroots - j])]` 來得到 * 我們要計算 `12` * `0f 36 78` 為何。在這裡 `12` 就是 $\alpha^{fb}$，而 generator 的 `0f 36 78` 會對應某個 $\alpha^{genpoly[nroots - j]}$ * 因此我們要計算的 `12` * `0f 36 78` 對應每個 $\alpha^{fb+{genpoly[nroots - j]}}$ ，也就是程式中的 `alpha_to[rs_modnn(rs, fb + genpoly[nroots - j])];` * 至此的 par 是 `12 ee 2b 23`，接著我們把它向左位移 (也就是這裡的 `memmove`)，變成 `ee 2b 23 23` (雖然這裡我說位移，不過最後一個 byte 沒有變成 0 ，這是因為事實上是使用 memmove，不過稍候它就會被覆蓋掉) * 第 13 行的概念如前，因此 par 變成 `ee 2b 23 f4` * 來到最外層迴圈的第二層，i = 1，fb 會是 `34 ^ ee = da`，然後就依尋前面的步驟，一直做到處理完整個 data * 最後得到的餘數會在 parity 中，那就是編碼的 parity :::warning 我知道可能會有點混亂...建議先把多項式長除法的思路理清。 ::: ### `decode_rs8` 最複雜的部份應該就是這裡了。對於 RS 的 decode，可以有許多不同的演算法被應用。不過我們可以大致總結成 5 點： 1. 計算出 syndromes polynomial，透過 Berlekamp-Massey 等演算法檢查資料是否有錯誤產生 2. 根據 syndromes 計算出 erasure/error locator polynomial，用 Berlekamp-Massey 找到確切是哪些位置產生了錯誤 3. 根據前兩者算出 erasure/error evaluator polynomial 有算出有多少 symbol 被竄改 4. 由前3個多項式計算 erasure/error magnitude polynomial， 5. 你可想像成這個 polynomial 就是揪出來的 noise，則我們把 noise 從接收到的資料後去除後，就可得到原始的資料 :::danger decode 的部份複雜到招架不住了 qq 這裡我會盡量把我可以理解的相關說明整理進來，不過更多的是各種我殘破的數學無法理解的演算法，只能說少時不努力，老大徒傷悲啊... ::: ```cpp int decode_rs8(struct rs_control *rsc, uint8_t *data, uint16_t *par, int len, uint16_t *s, int no_eras, int *eras_pos, uint16_t invmsk, uint16_t *corr) ``` * `rsc` 就是我們初始化好的 `rs_control` 結構 * 輸入長度為 `len` 的 `data` 跟 `par` * `s`: 解碼的過程中會需要 syndromes polynomial，找出發生錯誤的位置，因為 syndromes 的計算可能可以用硬體來計算，因此 API 提供此參數可以直接依據這個 syndromes 來修正錯誤 * 下面的範例先假設這個欄位為 NULL，也就是需要軟體負責計算 syndromes * `no_eras`、`eras_pos`: 有些 channel 中，錯誤會發生的情形是可以預知的，這種錯誤被稱做 erasure。要修正一個 erasure，只要添加一個 symbol 就可以做 [erasure correction](https://en.wikiversity.org/wiki/Reed%E2%80%93Solomon_codes_for_coders#Erasure_correction) 了 * 因此，我們可以告知 decoder 相關的訊息來做 erasure 修正 * 如果不知道錯誤的位置，那就稱為 error，這種情況下，就需要添加 2 個 symbol 才可以做修正 * `invmsk` 就等於我們在 encode 時所使用的 * `corr`：根據 `eras_pos` 可以要做 correction 相關的 bitmsk * 所以這個 API 可以不輸入 data 和 parity，僅輸入 `s`、`eras_pos` 等得到 `corr` 的 mask，再透過自定義的程式去處理問題 ```cpp enum { RS_DECODE_LAMBDA, RS_DECODE_SYN, RS_DECODE_B, RS_DECODE_T, RS_DECODE_OMEGA, RS_DECODE_ROOT, RS_DECODE_REG, RS_DECODE_LOC, RS_DECODE_NUM_BUFFERS }; ... /* * The decoder buffers are in the rs control struct. They are * arrays sized [nroots + 1] */ uint16_t *lambda = rsc->buffers + RS_DECODE_LAMBDA * (nroots + 1); uint16_t *syn = rsc->buffers + RS_DECODE_SYN * (nroots + 1); uint16_t *b = rsc->buffers + RS_DECODE_B * (nroots + 1); uint16_t *t = rsc->buffers + RS_DECODE_T * (nroots + 1); uint16_t *omega = rsc->buffers + RS_DECODE_OMEGA * (nroots + 1); uint16_t *root = rsc->buffers + RS_DECODE_ROOT * (nroots + 1); uint16_t *reg = rsc->buffers + RS_DECODE_REG * (nroots + 1); uint16_t *loc = rsc->buffers + RS_DECODE_LOC * (nroots + 1); /* Check length parameter for validity */ pad = nn - nroots - len; BUG_ON(pad < 0 || pad >= nn - nroots); ``` * 首先，還記得我們預先拿到的 `rs->buffers` 嗎？當時我們 allocate 了 `8 * (nroots + 1)` uint16_t 大小的空間，現在我們把它切在 8 等份給不同的需求使用 * encode 時相同的理由，做對應的檢查 ```cpp /* Does the caller provide the syndrome ? */ if (s != NULL) { for (i = 0; i < nroots; i++) { /* The syndrome is in index form, * so nn represents zero */ if (s[i] != nn) goto decode; } /* syndrome is zero, no errors to correct */ return 0; } ``` * 如果 caller 有輸入 syndrome 參數的話，檢查 syndrome 是否為 0 * 因為 syndrome 是 index 的，而我們在初始化時也定義了 `rs->alpha_to[rs->nn] = 0`，所以這邊只要檢查 syndrome 是否等於 `nn` 即可 ```cpp /* form the syndromes; i.e., evaluate data(x) at roots of * g(x) */ for (i = 0; i < nroots; i++) syn[i] = (((uint16_t) data[0]) ^ invmsk) & msk; for (j = 1; j < len; j++) { for (i = 0; i < nroots; i++) { if (syn[i] == 0) { syn[i] = (((uint16_t) data[j]) ^ invmsk) & msk; } else { syn[i] = ((((uint16_t) data[j]) ^ invmsk) & msk) ^ alpha_to[rs_modnn(rs, index_of[syn[i]] + (fcr + i) * prim)]; } } } for (j = 0; j < nroots; j++) { for (i = 0; i < nroots; i++) { if (syn[i] == 0) { syn[i] = ((uint16_t) par[j]) & msk; } else { syn[i] = (((uint16_t) par[j]) & msk) ^ alpha_to[rs_modnn(rs, index_of[syn[i]] + (fcr+i)*prim)]; } } } s = syn; /* Convert syndromes to index form, checking for nonzero condition */ syn_error = 0; for (i = 0; i < nroots; i++) { syn_error |= s[i]; s[i] = index_of[s[i]]; } if (!syn_error) { /* if syndrome is zero, data[] is a codeword and there are no * errors to correct. So return data[] unmodified */ return 0; } ``` * 如果沒有提供 `s` 參數，我們就要自行計算 syndromes polynomial * syndromes polynomial 的計算就是把 data(包含資料部份和 parity) 當成多項式，然後代入 $\beta^{fcr}$ 到 $\beta^{fcr+nroots}$ 看看是否為 0 * 這邊有使用 [Horner's method](https://en.wikipedia.org/wiki/Horner%27s_method) 避免直接計算 $x^n$ 做優化 * 把計算完的轉換為 index 形式方便後續的處理 * `syn_error` 計算是不是所有 syndrome 都是 0，如果不是就要去修正 ```cpp decode: memset(&lambda[1], 0, nroots * sizeof(lambda[0])); lambda[0] = 1; if (no_eras > 0) { /* Init lambda to be the erasure locator polynomial */ lambda[1] = alpha_to[rs_modnn(rs, prim * (nn - 1 - (eras_pos[0] + pad)))]; for (i = 1; i < no_eras; i++) { u = rs_modnn(rs, prim * (nn - 1 - (eras_pos[i] + pad))); for (j = i + 1; j > 0; j--) { tmp = index_of[lambda[j - 1]]; if (tmp != nn) { lambda[j] ^= alpha_to[rs_modnn(rs, u + tmp)]; } } } } for (i = 0; i < nroots + 1; i++) b[i] = index_of[lambda[i]]; ``` * 如果已知哪些位置出錯，如前所述要做 erasure correction * 首先初始 $\Lambda(x)$ = 1 * 如果有 erasure，則要計算 erasure locator polynomial $\Lambda(x) = \prod (1-\beta^ix)$ * 其中 i 的值是根據錯誤位置計算 * 要注意到例如我們的資料是 `01 02 03 04`，對應的多項式是 $01x^3 + 02x^2 + 03x + 04$，index 0 的地方是多項式的最高次方，但是如果錯誤是發生在 `03`(index = 2) 的地方，要代入的是 $\beta^1$ ，`nn - 1 - (eras_pos[0] + pad)` 就是在做此計算 * 則迴圈的計算邏輯就和前面的 g(x) 計算大致相同 * `b` 對應 Berlekamp-Massey algorithm 的多項式 B(x)，用 lamda 的 $\alpha$ index 初始化 ```cpp /* * Begin Berlekamp-Massey algorithm to determine error+erasure * locator polynomial */ r = no_eras; el = no_eras; while (++r <= nroots) { /* r is the step number */ /* Compute discrepancy at the r-th step in poly-form */ discr_r = 0; for (i = 0; i < r; i++) { if ((lambda[i] != 0) && (s[r - i - 1] != nn)) { discr_r ^= alpha_to[rs_modnn(rs, index_of[lambda[i]] + s[r - i - 1])]; } } discr_r = index_of[discr_r]; /* Index form */ ... ``` * 我們需要透過 [Berlekamp-Massey algorithm](https://en.wikipedia.org/wiki/Berlekamp%E2%80%93Massey_algorithm) 來找出未知位置的錯誤(error locator polynomial) * 迴圈要進行 `nroots` 減去已知的錯誤次(換句話說，如果沒有 erasure 的話就是進行 nroots 次) * 首先要計算每一步的 discrepancy，這個數字等於 $\sum_{i=0}^{r}\Lambda_iS_{r-i-1}$ ```cpp ... if (discr_r == nn) { /* 2 lines below: B(x) <-- x*B(x) */ memmove (&b[1], b, nroots * sizeof (b[0])); b[0] = nn; } else { /* 7 lines below: T(x) <-- lambda(x)-discr_r*x*b(x) */ t[0] = lambda[0]; for (i = 0; i < nroots; i++) { if (b[i] != nn) { t[i + 1] = lambda[i + 1] ^ alpha_to[rs_modnn(rs, discr_r + b[i])]; } else t[i + 1] = lambda[i + 1]; } if (2 * el <= r + no_eras - 1) { el = r + no_eras - el; /* * 2 lines below: B(x) <-- inv(discr_r) * * lambda(x) */ for (i = 0; i <= nroots; i++) { b[i] = (lambda[i] == 0) ? nn : rs_modnn(rs, index_of[lambda[i]] - discr_r + nn); } } else { /* 2 lines below: B(x) <-- x*B(x) */ memmove(&b[1], b, nroots * sizeof(b[0])); b[0] = nn; } memcpy(lambda, t, (nroots + 1) * sizeof(t[0])); } } ``` * `discr_r == nn` 表示原求的 discrepancy `d` 為 0 * 此時 $B(x) \Longleftarrow x*B(x)$ * 然後前進到下個 iteration * 如果 discrepancy 不為 0: * $\Lambda'(x) \Longleftarrow \Lambda(x) - d \times x \times B(x)$ * 可以看到程式中的 `t` 就是 $\Lambda(x) - d \times x \times B(x)$ 的暫存空間 $\Lambda'(x)$ * 每個 iteration 中的 `r + no_eras` 表示目前是第幾個 iteration (從 1 開始數) * `el` 的用途是用來判斷某些 iteration 中需要額外的調整 `B(x)`，在條件滿足的 `el` 下： * $el \Longleftarrow$ 現在的 iteration 數量 $- \space el$ * 此時 $B(x) \Longleftarrow d^{-1}\Lambda(x)$ * 除此之外 $B(x) \Longleftarrow x*B(x)$ * 最後，$\Lambda(x) \Longleftarrow \Lambda'(x)$，然後進入下一個 iteration ```cpp /* Convert lambda to index form and compute deg(lambda(x)) */ deg_lambda = 0; for (i = 0; i < nroots + 1; i++) { lambda[i] = index_of[lambda[i]]; if (lambda[i] != nn) deg_lambda = i; } if (deg_lambda == 0) { /* * deg(lambda) is zero even though the syndrome is non-zero * => uncorrectable error detected */ return -EBADMSG; } ``` * 最終，我們會得到一個 erasure /error locator polynomial $\Lambda(x)$ * 把 lambda 的值轉換成 index 形式，並且計算是幾階的多項式 `deg_lambda` * 如果結果為 0 階但是 syndrome 卻是非零值，那表示此錯誤不能被修正 ```cpp /* Find roots of error+erasure locator polynomial by Chien search */ memcpy(&reg[1], &lambda[1], nroots * sizeof(reg[0])); count = 0; /* Number of roots of lambda(x) */ for (i = 1, k = iprim - 1; i <= nn; i++, k = rs_modnn(rs, k + iprim)) { q = 1; /* lambda[0] is always 0 */ for (j = deg_lambda; j > 0; j--) { if (reg[j] != nn) { reg[j] = rs_modnn(rs, reg[j] + j); q ^= alpha_to[reg[j]]; } } if (q != 0) continue; /* Not a root */ if (k < pad) { /* Impossible error location. Uncorrectable error. */ return -EBADMSG; } /* store root (index-form) and error location number */ root[count] = i; loc[count] = k; /* If we've already found max possible roots, * abort the search to save time */ if (++count == deg_lambda) break; } ``` * 則對於 erasure /error locator polynomial $\Lambda(x)$，其根($\beta^i$ 使得 $\Lambda(x)$ = 1) 就表示錯誤發生的地方是 len(msg) - 1 - i * 這裡使用了 [Chien search](https://en.wikipedia.org/wiki/Chien_search) 加速求根 * ??? ```cpp if (deg_lambda != count) { /* * deg(lambda) unequal to number of roots => uncorrectable * error detected */ return -EBADMSG; } /* * Compute err+eras evaluator poly omega(x) = s(x)*lambda(x) (modulo * x**nroots). in index form. Also find deg(omega). */ deg_omega = deg_lambda - 1; for (i = 0; i <= deg_omega; i++) { tmp = 0; for (j = i; j >= 0; j--) { if ((s[i - j] != nn) && (lambda[j] != nn)) tmp ^= alpha_to[rs_modnn(rs, s[i - j] + lambda[j])]; } omega[i] = index_of[tmp]; } ``` * 如果找到的根數量和 $\Lambda$ 的階數不一致，表示錯誤無法被修正 * 接下來要求 error / evaluator polynomial $\Omega(x) = S(x) * \Lambda(x) \space mod \space x^{nroots}$，且存成 index form ```cpp /* * Compute error values in poly-form. num1 = omega(inv(X(l))), num2 = * inv(X(l))**(fcr-1) and den = lambda_pr(inv(X(l))) all in poly-form * Note: we reuse the buffer for b to store the correction pattern */ num_corrected = 0; for (j = count - 1; j >= 0; j--) { num1 = 0; for (i = deg_omega; i >= 0; i--) { if (omega[i] != nn) num1 ^= alpha_to[rs_modnn(rs, omega[i] + i * root[j])]; } if (num1 == 0) { /* Nothing to correct at this position */ b[j] = 0; continue; } num2 = alpha_to[rs_modnn(rs, root[j] * (fcr - 1) + nn)]; den = 0; /* lambda[i+1] for i even is the formal derivative * lambda_pr of lambda[i] */ for (i = min(deg_lambda, nroots - 1) & ~1; i >= 0; i -= 2) { if (lambda[i + 1] != nn) { den ^= alpha_to[rs_modnn(rs, lambda[i + 1] + i * root[j])]; } } b[j] = alpha_to[rs_modnn(rs, index_of[num1] + index_of[num2] + nn - index_of[den])]; num_corrected++; } ``` * [Forney algorithm](https://en.wikipedia.org/wiki/Forney_algorithm)?? (不確定有沒有關聯) * ?? ```cpp /* * We compute the syndrome of the 'error' and check that it matches * the syndrome of the received word */ for (i = 0; i < nroots; i++) { tmp = 0; for (j = 0; j < count; j++) { if (b[j] == 0) continue; k = (fcr + i) * prim * (nn-loc[j]-1); tmp ^= alpha_to[rs_modnn(rs, index_of[b[j]] + k)]; } if (tmp != alpha_to[s[i]]) return -EBADMSG; } ``` * 經過前面的步驟後，這裡我們會得到誤差多項式 e(x)(重用 `b` 來保存係數) $$ e(x) = e_0 + e_1x + ... + e_{n-1}x^{(n-1)} $$ * 理論上，我們前面得到的 syndromes $S_i$ 應該要等於 $e(\beta^i)$，這裡再做額外的檢查以確保正確性 ```cpp /* * Store the error correction pattern, if a * correction buffer is available */ if (corr && eras_pos) { j = 0; for (i = 0; i < count; i++) { if (b[i]) { corr[j] = b[i]; eras_pos[j++] = loc[i] - pad; } } } else if (data && par) { /* Apply error to data and parity */ for (i = 0; i < count; i++) { if (loc[i] < (nn - nroots)) data[loc[i] - pad] ^= b[i]; else par[loc[i] - pad - len] ^= b[i]; } } return num_corrected; ``` * 如果 input 是 `corr` 和 `eras_pos` 的話，就把該誤差多項式填入 `corr` 並把有錯的位置填入 `eras_pos`，當成回傳 * 如果是 input 是 `data` 和 `par` 的話，那就是看誤差項的位置在 `data` 還是 `par`，並且將結果修正回去 ## 後記在看完 Reed–Solomon codes 的實作之後，就發現自己的數學太不扎實了，導致許多地方，特別是 decode 的部分，因為涉及許多優化的數學演算法，都沒辦法很精準地解釋。希望這些部分有機會也可以慢慢的補起來，然後再回頭更精準的理解程式的行為和背後的目的。 ## Reference * [錯誤更正碼簡介](https://w3.math.sinica.edu.tw/math_media/d184/18404.pdf) * [Hamming(7,4)](https://en.wikipedia.org/wiki/Hamming(7,4)) * [Reed-Solomon error-correcting code decoder](https://www.nayuki.io/page/reed-solomon-error-correcting-code-decoder#preliminaries) * [Reed–Solomon error correction(wikipedia)](https://en.wikipedia.org/wiki/Reed%E2%80%93Solomon_error_correction) * [Reed–Solomon codes for coders](https://en.wikiversity.org/wiki/Reed%E2%80%93Solomon_codes_for_coders)