# Algebraic coding theory lecture: 0813119 王斾頤

---

### Agenda

* 16.5 Elements of Coding Theory
* 16.6 The Hamming Metric
* 16.7 The Parity-Check and Generator Matrices

---

## 16.5 Elements of Coding Theory

### History

![](https://i.imgur.com/2HDUTRY.png =200x)

:::warning
**Origin:** In general, a communication process can be represented by the following model:

![](https://i.imgur.com/ObuH73t.png =400x)

Data is transmitted through a channel. The channel contains interference and noise, so after passing through it the data becomes noisy data.
:::

* In 1948, Shannon published *A Mathematical Theory of Communication*, which inspired algebraic coding theory.
* We mainly study the transmission of 0-1 strings in the presence of noise, together with error detection and error correction.

![](https://i.imgur.com/dB8aLPB.png =500x)

---

### The Binary Symmetric Channel

* The signals here are binary (0 and 1), hence the name *binary*.
* A correct transmission sends 0 and receives 0, or sends 1 and receives 1; an incorrect transmission flips the bit.
* Let $p$ denote the probability of incorrect transmission, so $1-p$ is the probability of correct transmission.
* Because the error probability $p$ is the same for both 0 and 1, the channel is called *symmetric*.

![](https://i.imgur.com/ojTpdaT.png =300x)

---

### EXAMPLE 16.19

* Consider the string $c =$ 10110.
* We regard $c$ as an element of the group $Z^{5}_{2}$, formed from the direct product of five copies of ($Z_{2}$, $+$).
* When sending each bit of $c$ through the binary symmetric channel, we assume that **the probability of incorrect transmission is $p =$ 0.05**.
* The transmission of each signal does not depend in any way on the transmissions of prior signals, so **the probability of transmitting $c$ with no errors is (0.95)$^5 \doteq$ 0.77.**

:::warning
#### $Z^{5}_{2}$
$\rightarrow$ 2: the entries come from the two elements 0, 1; 5: the length of the string
:::

---

### EXAMPLE 16.19

> What is the probability that the party receiving the five-bit message receives the string $r =$ 00110 ($c$ with an error in the first position)?

* **(0.05)(0.95)$^4 \doteq$ 0.041 is the probability of sending $c =$ 10110 and receiving $r =$ 00110.**
* With $e =$ 10000, we can write $c + e = r$ and interpret $r$ as the result of the sum of the original message $c$ and the particular error pattern $e =$ 10000. We have $c + r = e$ and $r + e = c$.
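Addition in $Z^{n}_{2}$ is componentwise mod 2, i.e. bitwise XOR. A minimal Python sketch of the relations above (the language and the helper name `add` are our illustration, not part of the text):

```python
def add(x, y):
    """Componentwise sum in Z_2^n, i.e. bitwise XOR of two bit strings."""
    return "".join(str(int(a) ^ int(b)) for a, b in zip(x, y))

c, e = "10110", "10000"      # code word and error pattern from Example 16.19
r = add(c, e)                # received word r = c + e
print(r)                     # 00110
print(add(c, r))             # 10000  -> c + r = e
print(add(r, e))             # 10110  -> r + e = c
```

Because every element of $Z^{n}_{2}$ is its own additive inverse, any two of $c$, $r$, $e$ determine the third by the same operation.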
:::warning
$c =$ 10110, $r =$ 00110, $e =$ 10000
$\rightarrow c + r =$ 10110 $+$ 00110 $= e =$ 10000
$\rightarrow r + e =$ 00110 $+$ 10000 $= c =$ 10110
:::

> Finally, if we transmit $c =$ 10110, what is the probability that $r$ differs from $c$ **in exactly 2 places**?

* There are $\dbinom{5}{2}$ such patterns, so the probability of 2 errors in transmission is given by $\dbinom{5}{2} (0.05)^2(0.95)^3 \doteq$ 0.021.

---

### THEOREM 16.10

Let $c \in Z^{n}_{2}$. For the transmission of $c$ through a binary symmetric channel with probability $p$ of incorrect transmission,
1. the probability of receiving $r = c + e$, where $e$ is a particular error pattern **consisting of $k$ 1's and ($n - k$) 0's**, is $p^k (1 - p)^{n-k}$.
2. the probability that (exactly) $k$ errors are made in the transmission is $\dbinom{n}{k} p^k (1 - p)^{n-k}$.

---

### Improve the accuracy

> To improve the accuracy of transmission in a binary symmetric channel, certain types of coding schemes can be used where **extra bits are provided**.

* For $m, n \in Z^+$, let <mark>$n > m$</mark>. Consider $\phi \ne W \subseteq Z^{m}_{2}$.
* The set $W$ consists of the messages to be transmitted. To each $w \in W$ are **appended $n - m$ extra bits to form the code word $c$**, where $c \in Z^{n}_{2}$.

:::warning
This process is called **encoding** and is represented by a function $E: W \rightarrow Z^{n}_{2}$. Then $E(w) = c$ and $E(W) = C \subseteq Z^{n}_{2}$.
:::

Upon transmission, $c$ is received as $T(c)$, where $T(c) \in Z^{n}_{2}$. Unfortunately, $T$ is not a function, because $T(c)$ may be different at different transmission times (for the noise in the channel changes with time).

:::warning
Upon receiving $T(c)$, we want to apply a decoding function $D: Z^{n}_{2} \rightarrow Z^{m}_{2}$ to remove the extra bits and, we hope, obtain the original message $w$.
:::

Since this cannot be expected, we seek functions $E$ and $D$ such that there is a high probability of correctly decoding the received word $T(c)$ and recapturing the original message $w$.

![](https://i.imgur.com/5e7yupb.png)

---

**We want the ratio $\frac{m}{n}$ to be as large as possible** so that an excessive number of bits are not appended to $w$ in getting the code word $c = E(w)$. This ratio $\frac{m}{n}$ **measures the efficiency of our scheme** and is called **the rate of the code**.

:::warning
$\cfrac{m}{n} = \cfrac{\text{length of } w}{\text{length of } c}$ $\longrightarrow$ the fewer extra bits appended, the better the efficiency
:::

---

### EXAMPLE 16.20

> Consider the $(m + 1, m)$ block code for $m =$ 8. Let $W = Z^{8}_{2}$.
> For each $w = w_1 w_2 \cdots w_8 \in W$, define $E: Z^{8}_{2} \rightarrow Z^{9}_{2}$ by $E(w) = w_1 w_2 \cdots w_8 w_9$, where $w_9 = \sum^{8}_{i=1} w_i$.

For example, $E(11001101) =$ 11001101<mark>1</mark>, and $E(00110011) =$ 00110011<mark>0</mark>. For all $w \in Z^{8}_{2}$, **$E(w)$ contains an even number of 1's**.

So for $w =$ ++11010110++ and $E(w) =$ ++11010110++1, if we receive $T(c) = T(E(w))$ as 100101101, from the odd number of 1's in $T(c)$ we know that a mistake has occurred. Hence we are able to detect a ==single error== in transmission. But we seem to have no way to correct such errors.

The probability of sending the code word ++11010110++1 and making **at most one error** (no errors, or exactly 1 bit in error) in transmission is $(1 - p)^9 + \dbinom{9}{1} p (1 - p)^8$. For $p =$ 0.001 this gives $(0.999)^9 + \dbinom{9}{1}(0.001)(0.999)^8 \doteq 0.99996417$.

---

If we detect an error and we are able to relay a signal back to the transmitter to repeat the transmission of the code word, and **continue this process until the received word has an even number of 1's**, then the probability of sending the code word ++11010110++1 and receiving the correct transmission is approximately 0.99996393.
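The $(m + 1, m)$ parity-check encoding and single-error detection described above can be sketched as follows (a Python illustration; the function names are ours, not the text's):

```python
from math import comb

def encode(w):
    """(m+1, m) parity-check code: append w_{m+1} = sum of the bits mod 2."""
    return w + str(sum(map(int, w)) % 2)

def detect_error(r):
    """An odd number of 1's reveals that some error occurred in transmission."""
    return sum(map(int, r)) % 2 == 1

print(encode("11001101"))          # 110011011
print(encode("00110011"))          # 001100110
print(detect_error("100101101"))   # True: odd number of 1's

# Probability of at most one error in the 9 transmitted bits, for p = 0.001
p = 0.001
prob = (1 - p)**9 + comb(9, 1) * p * (1 - p)**8
print(round(prob, 8))              # 0.99996417
```

Note that `detect_error` returns `False` whenever an even number of errors occurs, which is exactly the blind spot of this scheme.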
:::warning
**Retransmit repeatedly until the received word passes the check (an even number of 1's):**

$p_{odd} = \dbinom{9}{1}(0.999)^8(0.001) + \dbinom{9}{3}(0.999)^6(0.001)^3 + \cdots + \dbinom{9}{9}(0.001)^9 \doteq 0.008928251$

With $q = (0.999)^9$ the probability of an error-free transmission,
$\rightarrow q + q \times p_{odd} + q \times (p_{odd})^2 + \cdots = q / (1 - p_{odd}) \doteq 0.99996393$
:::

---

Should an even positive number of errors occur in transmission, $T(c)$ is unfortunately accepted as the correct code word and we interpret its first eight components as the original message. This scheme is called the **$(m + 1, m)$ parity-check code** and is appropriate ==only when multiple errors are not likely to occur==.

---

If we send the message ++11010110++ through the channel, we have probability $(0.999)^8 \doteq$ 0.99202794 of correct transmission. By using this parity-check code, we increase our chances of getting the correct message to 0.99996393. However, an extra signal is sent and **the rate of the code has decreased from 1 to 8/9**.

---

### EXAMPLE 16.21

The $(3m, m)$ **triple repetition code** is one where we can ==both detect and correct single errors== in transmission.

With $m =$ 8 and $W = Z^8_2$, we define $E: Z^8_2 \rightarrow Z^{24}_2$ by $E(w_1w_2 \cdots w_7w_8) = w_1w_2 \cdots w_8w_1w_2 \cdots w_8w_1w_2 \cdots w_8$.

If $w =$ 10110111, then $c =$ ++==1==0110111++ ++==1==0110111++ ++==1==0110111++

:::warning
Computers work bit by bit. For any particular bit position, as long as a majority of the corresponding bits (e.g., positions 1, 9, 17) are correct, that bit can be corrected. **Although the rate of the code drops ($\cfrac{8}{9} \rightarrow \cfrac{8}{24}$) and transmission takes longer, the need for retransmission decreases!**
:::

---

## 16.6 The Hamming Metric

* In this section we develop the general principles for discussing the error-detecting and error-correcting capabilities of a coding scheme.
* These ideas were developed by Richard Wesley Hamming (1915–1998).

### A Scenario

We start by considering a code $C \subseteq Z^{4}_{2}$, where $c_1 =$ 0111, $c_2 =$ 1111 $\in C$. Now both the transmitter and the receiver know the elements of $C$.
![](https://i.imgur.com/MvIyN8J.png)

:::danger
:question: **Question**
The two code words are very close (they differ in only 1 bit):
:point_right: Does this raise the chance that one is mis-transmitted and mistaken for the other?
:::

---

### Definition 16.9

* For each element $x = x_1x_2 \cdots x_n \in Z^{n}_{2}$, where $n \in Z^+$, **the weight of $x$, denoted $wt(x)$, is the number of components $x_i$ of $x$**, for $1 \le i \le n$, ==where $x_i =$ 1==.
* If $y \in Z^{n}_{2}$, **the distance between $x$ and $y$, denoted $d(x, y)$**, is the number of components ==where $x_i \ne y_i$==, for $1 \le i \le n$.

---

### EXAMPLE 16.22

> For $n =$ 5, let $x =$ 01001 and $y =$ 11101.
> $wt(x) =$ 2, $wt(y) =$ 4, and $d(x, y) =$ 2.
> $x + y =$ 10100, so $wt(x + y) =$ 2.

For each $1 \le i \le 5$, $x_i + y_i$ contributes a count of 1 to $wt(x + y) \Leftrightarrow x_i \ne y_i \Leftrightarrow x_i$, $y_i$ contribute a count of 1 to $d(x, y)$, for all $x$, $y \in Z^{n}_{2}$.

![](https://i.imgur.com/635TWGT.png =200x)

When $x$, $y \in Z^{n}_{2}$, we write $d(x, y) = \sum^{n}_{i=1} d(x_i, y_i)$ where, for each $1 \le i \le n$,

$d(x_i, y_i) = \begin{cases} 0 \quad \text{if} \quad x_i = y_i\\ 1 \quad \text{if} \quad x_i \ne y_i\\ \end{cases}$

### LEMMA 16.2

> For all $x$, $y \in Z^{n}_{2}$, $wt(x + y) \le wt(x) + wt(y)$ ==this can be viewed as a triangle inequality==

:::info
Proof: We prove this lemma by **examining**, for each $1 \le i \le n$, the components $x_i$, $y_i$, $x_i + y_i$ of $x$, $y$, $x + y$, respectively. Only one situation would cause this inequality to be false: if $x_i + y_i =$ 1 while $x_i =$ 0 and $y_i =$ 0, for some $1 \le i \le n$. But $x_i + y_i =$ 1 means that one of $x_i$, $y_i$ is 0 and the other is 1, so this situation cannot occur. Hence, for all $x$, $y \in Z^{n}_{2}$, $wt(x + y) \le wt(x) + wt(y)$.
:::

Using **EXAMPLE 16.22** above ($x =$ 01001, $y =$ 11101):
$wt(x + y) = wt(10100) =$ 2 $<$ 2 $+$ 4 $= wt(01001) + wt(11101) = wt(x) + wt(y)$

---

### THEOREM 16.11

The distance function $d$ defined on $Z^{n}_{2} \times Z^{n}_{2}$ satisfies the following for all $x, y, z \in Z^{n}_{2}$.
1. $d(x, y) \ge$ 0
2. $d(x, y) =$ 0 $\Leftrightarrow x = y$
3. $d(x, y) = d(y, x)$
4.
$d(x, z) \le d(x, y) + d(y, z)$

**Proof of 4:** In $Z^{n}_{2}$, $y + y =$ 0, so
$d(x, z) = wt(x + z) = wt(x + (y + y) + z) = wt((x + y) + (y + z)) \le wt(x + y) + wt(y + z) = d(x, y) + d(y, z)$

:::warning
When a function satisfies the four properties mentioned above, it is called a **distance function** or **metric**, and we call $(Z^n_2, d)$ a **metric space**. This $d$ is often referred to as the **Hamming metric**.
:point_right: We can then define the minimum distance of a code: among the Hamming distances between all pairs of distinct code words in the code, the smallest value is the minimum distance of the code.
:::

---

### Definition 16.10

For $n$, $k \in Z^{+}$ and $x \in Z^{n}_{2}$, the sphere of radius $k$ centered at $x$ is defined as $S(x, k) = \{y \in Z^{n}_{2} \mid d(x, y) \le k\}$.

### EXAMPLE 16.23

For $n =$ 3 and $x =$ 110 $\in Z^{3}_{2}$,
$S(x, 1) = \{110, 010, 100, 111\}$ (the $y$ with $d(x, y) =$ 0, 1) and
$S(x, 2) = \{110, 010, 100, 111, 000, 101, 011\}$ (the $y$ with $d(x, y) =$ 0, 1, 2)

---

### THEOREM 16.12

Let $E: W \rightarrow C$ be an encoding function with the set of messages $W \subseteq Z^{m}_{2}$ and the set of code words $E(W) = C \subseteq Z^{n}_{2}$, where $m < n$. If our objective is **error detection**, then for $k \in Z^{+}$, we can detect all transmission errors of weight $\le k$ if and only if the minimum distance between code words is at least $k +$ 1.

:::warning
![](https://i.imgur.com/kAZKErZ.png)
A code scheme has a Hamming distance $d_{min} =$ 4. This code guarantees the detection of up to three errors ($d_{min} = s + 1$ with $s = 3$).
:::

:::info
Proof: The set $C$ is known to both the transmitter and the receiver, so if $w \in W$ is the message and $c = E(w)$ is transmitted, let $c \ne T(c) = r$. If the minimum distance between code words is at least $k +$ 1, then the transmission of $c$ can result in as many as $k$ errors and $r$ will not be listed in $C$. Hence we can detect all errors $e$ where $wt(e) \le k$.
Conversely, let $c_{1}$, $c_{2}$ be code words with $d(c_{1}, c_{2}) < k +$ 1. Then $c_{2} = c_{1} + e$ where $wt(e) \le k$.
(an error of weight $\le k$) If we send $c_{1}$ and $T(c_{1}) = c_{2}$, then we would feel that $c_{2}$ had been sent, thus failing to detect an error of weight $\le k$.
:::

---

### THEOREM 16.13

Let $E$, $W$, and $C$ be as in **Theorem 16.12**. If our objective is error correction, then for $k \in Z^{+}$, we can construct a decoding function $D: Z^{n}_{2} \rightarrow W$ that corrects all transmission errors of weight $\le k$ if and only if the minimum distance between code words is at least $2k +$ 1.

:::warning
![](https://i.imgur.com/25eKhfa.png)
For error correction, we definitely need more distance. It can be shown that to correct $t$ errors, we need to have $d_{min} =$ 2$t +$ 1. In other words, if we want to correct 3 bits in a packet, we need to make the minimum Hamming distance 7 bits.
:point_right: If the two spheres touch, there is a point within distance $\le t$ of $x$ and also within distance $\le t$ of $y$; then we cannot tell whether to correct the error to $x$ or to $y$.
:::

:::info
Proof: For $c \in C$, consider $S(c, k) = \{x \in Z^{n}_{2} \mid d(c, x) \le k\}$. Define $D: Z^{n}_{2} \rightarrow W$ as follows. If $r \in Z^{n}_{2}$ and $r \in S(c, k)$ for some code word $c$, then $D(r) = w$ where $E(w) = c$. **[Here $c$ is the (unique) code word nearest to $r$.]** If $r \notin S(c, k)$ for any $c \in C$, then we define $D(r) = w_{0}$, where $w_{0}$ is some arbitrary message that remains fixed once it is chosen.
The only problem we could face here is that $D$ might not be a function. This will happen if there is an element $r$ in $Z^{n}_{2}$ with $r$ in both $S(c_{1}, k)$ and $S(c_{2}, k)$ for distinct code words $c_{1}$, $c_{2}$. But $r \in S(c_{1}, k) \Rightarrow d(c_{1}, r) \le k$, and $r \in S(c_{2}, k) \Rightarrow d(c_{2}, r) \le k$, so $d(c_{1}, c_{2}) \le d(c_{1}, r) + d(r, c_{2}) \le k + k <$ 2$k +$ 1.
:::

---

Consequently, if the minimum distance between code words is at least $2k +$ 1, then $D$ is a function, and it will decode all possible received words, correcting any transmission error of weight $\le k$.
Conversely, if $c_{1}$, $c_{2} \in C$ and $d(c_{1}, c_{2}) \le$ 2$k$, then $c_{2}$ can be obtained from $c_{1}$ by making at most 2$k$ changes. Starting at code word $c_{1}$ we make approximately half of these changes. This brings us to $r = c_{1} + e_{1}$ with $wt(e_{1}) \le k$. Continuing from $r$, we make the remaining changes to get to $c_{2}$ and find $r + e_{2} = c_{2}$ with $wt(e_{2}) \le k$. But then $r = c_{2} + e_{2}$. Now with $c_{1} + e_{1} = r = c_{2} + e_{2}$ and $wt(e_{1})$, $wt(e_{2}) \le k$, how can one decide on the code word from which $r$ arises? This ambiguity results in a possible error of weight $\le k$ that cannot be corrected.

---

### EXAMPLE 16.24

With $W = Z^{2}_{2}$, let $E: W \rightarrow Z^{6}_{2}$ be given by:

\begin{matrix} E(00)=000000 & E(10)= 101010 & E(01)= 010101 & E(11)= 111111 \end{matrix}

The $d_{min}$ between code words is **3**, so we can correct all single errors (3 $\ge$ 2 $\times$ 1 $+$ 1).

$S(000000, 1) = \{x \in Z^{6}_{2} \mid d(000000, x) \le 1\}$
$= \{000000, 100000, 010000, 001000, 000100, 000010, 000001\}$.

The decoding function $D: Z^{6}_{2} \rightarrow W$ gives $D(x) =$ 00 for all $x \in S(000000, 1)$.
At this point our definition of $D$ accounts for 7 of the elements in $Z^{6}_{2}$. We continue by defining $D$ for the 7 $\times$ 3 elements in $S(010101, 1)$, $S(101010, 1)$, and $S(111111, 1)$.

:::danger
:question: If the minimum distance between code words is $2k +$ 1, can we detect all errors of weight $\le 2k$ and also correct all errors of weight $\le k$?
:::

---

> Error detection and error correction need not take place at the same time and at the maximum levels. Let's reconsider the (6, 2) triple repetition code of Example 16.24:

Since the minimum distance between any 2 elements of $Z^{2}_{2}$ is 1, it follows that the minimum distance between code words is 3. Suppose that our major objective is ==error correction== and that $r =$ 100000 $[\notin E(W)]$ is received. We see that $d(000000, r) =$ 1, $d(101010, r) =$ 2, $d(010101, r) =$ 4, and $d(111111, r) =$ 5.
Consequently, we should choose to decode $r$ as 000000, the unique code word nearest to $r$. Unfortunately, suppose that the actual message were 10, but we received $r =$ 100000. Upon correcting $r$ as 000000, we should then decode 000000 to get the incorrect message 00. And, in so doing, we have failed to detect an error of weight 2.

---

## 16.7 The Parity-Check and Generator Matrices

### EXAMPLE 16.25

Let $G$ be a 3 $\times$ 6 matrix over $Z_{2}$:

![](https://i.imgur.com/Lu3tl41.png =300x)

Letting $A$ denote the matrix formed from the last 3 columns of $G$, we write $G = [I_{3} \mid A]$ to denote its structure. The matrix $G$ is called a **generator matrix**.

We use $G$ to define an encoding function $E: Z^{3}_{2} \rightarrow Z^{6}_{2}$ as follows. For $w \in Z^{3}_{2}$, $E(w) = wG$ is the element in $Z^{6}_{2}$ obtained by multiplying $w$ by $G$. For example:

![](https://i.imgur.com/fcfCgAs.png =500x)
and
![](https://i.imgur.com/tSvUpZp.png =500x)

Note that $E(110)$ can be obtained by adding the first 2 rows of $G$ (since $w =$ 110), whereas $E(010)$ is simply the second row of $G$.

---

### EXAMPLE 16.25

The set of code words obtained by this method is
$C = \{000000, 100110, 010011, 001101, 110101, 101011, 011110, 111000\} \subseteq Z^{6}_{2}$,
and one can recapture the corresponding message by simply dropping the last 3 components of the code word (because the leading 3 $\times$ 3 block of $G$ is the identity matrix). In addition, the minimum distance between code words is 3, so we can detect errors of weight $\le$ 2 or correct single errors.

For all $w = w_{1}w_{2}w_{3} \in Z^{3}_{2}$, $E(w) = w_{1}w_{2}w_{3}w_{4}w_{5}w_{6} \in Z^{6}_{2}$. Since

![](https://i.imgur.com/MMsusLa.png =500x)

we have

\begin{matrix} w_{4} = w_{1} + w_{3} & w_{5} = w_{1} + w_{2} & w_{6} = w_{2} + w_{3} \\ \end{matrix}

and these equations are called the **parity-check equations**.
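The encoding $E(w) = wG$ can be sketched as follows, with $G$ reconstructed from the parity-check equations above (Python as an illustration; the names are ours, not the text's):

```python
# Generator matrix G = [I_3 | A] of Example 16.25, reconstructed from the
# parity-check equations w4 = w1 + w3, w5 = w1 + w2, w6 = w2 + w3.
G = [
    [1, 0, 0, 1, 1, 0],
    [0, 1, 0, 0, 1, 1],
    [0, 0, 1, 1, 0, 1],
]

def E(w):
    """Encode a 3-bit message string: E(w) = wG, with arithmetic mod 2."""
    bits = [int(b) for b in w]
    return "".join(
        str(sum(bits[i] * G[i][j] for i in range(3)) % 2) for j in range(6)
    )

# Enumerate all 8 code words of C.
code_words = sorted(E(f"{k:03b}") for k in range(8))
print(code_words)
# E("110") is the sum (mod 2) of rows 1 and 2 of G:
print(E("110"))   # 110101
```

Because $G = [I_{3} \mid A]$, the first 3 bits of every code word are the message itself, which is why decoding a clean code word is just truncation.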
---

Since $w_{i} \in Z_{2}$ for each $1 \le i \le 6$, it follows that $w_{i} = -w_{i}$; hence the parity-check equations can be rewritten as

\begin{aligned} &w_{1} \qquad +w_{3}+w_{4} &= 0 \\ &w_{1}+w_{2} \qquad\qquad+w_{5}&=0 \\ &\qquad \ w_{2}+w_{3} \qquad\qquad +w_{6} &= 0 \end{aligned}

Thus we find that:

![](https://i.imgur.com/mwqoIMI.png =500x)

where $(E(w))^{tr}$ denotes the transpose of $E(w)$. Consequently, if $r = r_{1}r_{2} \cdots r_{6} \in Z^{6}_{2}$, we can identify $r$ as a code word <mark>if and only if</mark> $H \cdot r^{tr} = \begin{bmatrix} 0\\0\\0 \end{bmatrix}$.

Writing $H = [B \mid I_{3}]$, where $B = A^{tr}$, the matrix $H$ is called the **parity-check matrix**.

---

Suppose we receive $r =$ 110110. We want to find the code word $c$ that is the nearest neighbor of $r$. We would be better off to first examine $H \cdot r^{tr}$, which is called the **syndrome** of $r$. Here

![](https://i.imgur.com/rsSvAIh.png =500x)

so $r$ is not a code word. Hence we at least detect an error. Looking back at the list of code words, we see that $d(100110, r) =$ 1. For all other $c \in C$, $d(r, c) \ge$ 2. Writing $r = c + e =$ 100110 $+$ 010000, we find that the transmission error occurs in the ==**second component**== of $r$.

Let $r = c + e$, where $c$ is a code word and $e$ is an error pattern of weight 1. Suppose that the 1 is in the $i$th component of $e$, where $1 \le i \le 6$. Then
$H \cdot r^{tr} = H \cdot (c + e)^{tr} = H \cdot (c^{tr} + e^{tr}) = H \cdot c^{tr} + H \cdot e^{tr}$
With $c$ a code word, it follows that $H \cdot c^{tr} = 0$, so $H \cdot r^{tr} = H \cdot e^{tr} =$ the ==$i$th column of matrix $H$==. Thus $c$ and $r$ differ only in the $i$th component, and we can determine $c$ by simply changing the $i$th component of $r$.

---

> Suppose that we receive $r =$ 000111. Computing the syndrome:

![](https://i.imgur.com/k1Rcd1W.png =500x)

We obtain a result that is not one of the columns of $H$.
If $H \cdot r^{tr}$ came from the sum of the first and sixth columns of $H$, correcting these components in $r$ results in the code word 100110. If we sum the third and fifth columns of $H$ to get this syndrome, upon changing the third and fifth components of $r$ we get a second code word, 001101. So we cannot expect $H$ to correct multiple errors. This is no surprise, since the minimum distance between code words is 3.

:::warning
To summarize the results of **Example 16.25** for the general situation: For $m, n \in Z^{+}$ with $m < n$, the encoding function $E: Z^{m}_{2} \rightarrow Z^{n}_{2}$ is given by an $m \times n$ matrix $G$ over $Z_{2}$. The generator matrix $G$ has the form $[I_{m} \mid A]$, where $A$ is an $m \times (n - m)$ matrix. Here $E(w) = wG$ for each message $w \in Z^{m}_{2}$, and the code is $C = E(Z^{m}_{2}) \subseteq Z^{n}_{2}$.
:::
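The syndrome computation and single-error correction of Example 16.25 can be sketched as follows (a Python illustration; the names `syndrome` and `correct_single_error` are ours, not the text's):

```python
# Parity-check matrix H = [B | I_3] of Example 16.25, where B = A^tr.
H = [
    [1, 0, 1, 1, 0, 0],
    [1, 1, 0, 0, 1, 0],
    [0, 1, 1, 0, 0, 1],
]

def syndrome(r):
    """H · r^tr over Z_2, returned as a tuple of 3 bits."""
    bits = [int(b) for b in r]
    return tuple(sum(H[i][j] * bits[j] for j in range(6)) % 2 for i in range(3))

def correct_single_error(r):
    """If the syndrome equals the i-th column of H, flip component i of r."""
    s = syndrome(r)
    if s == (0, 0, 0):
        return r                      # r is already a code word
    cols = [tuple(H[i][j] for i in range(3)) for j in range(6)]
    if s in cols:
        i = cols.index(s)
        return r[:i] + str(1 - int(r[i])) + r[i + 1:]
    return None                       # syndrome is no column of H: >1 error

print(correct_single_error("110110"))  # 100110: error in the 2nd component
print(correct_single_error("000111"))  # None: syndrome (1,1,1) is no column of H
```

The `None` branch mirrors the discussion above: a syndrome such as (1, 1, 1) can arise from several different double-error patterns, so the scheme detects the error but cannot correct it.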