---
# System prepended metadata

title: JPEG Decoder (3/3)
tags: [DataCompression]

---

# JPEG Decoder (3/3)

This article covers
1. Quantization table
2. Huffman table
3. Encoded data

and the order they appear in inside a JPEG file. The walkthrough follows [Understanding and Decoding a JPEG Image using Python](https://yasoob.me/posts/understanding-and-writing-jpeg-decoder-in-python/#huffman-encoding), fills in the gaps with the specification, and implements a JPEG decoder.

JPEG has 4 compression modes
- Baseline
- Extended Sequential
- Progressive
- Lossless

Each mode lays the file out differently; we assume the image is a **Baseline** JPEG.

![](https://i.imgur.com/DwAuA7b.png)

The figure above is by [Ange Albertini](https://yasoob.me/posts/understanding-and-writing-jpeg-decoder-in-python/#huffman-encoding). It shows the important segments in a JPEG file; most markers are followed by the length of the segment (the length does not count the marker itself). The more important markers are:

Hex|Marker|Marker Name|Description
-|-|-|-
0xFFD8|SOI|Start of Image|
0xFFE0|APP0|Application Segment 0|JFIF - JFIF JPEG image / AVI1 - Motion JPEG (MJPG)
0xFFDB|DQT|Define Quantization Table |
0xFFC0|SOF|Start of Frame 0|Baseline DCT 
0xFFC4|DHT|Define Huffman Table|
0xFFDA|SOS|Start of Scan|
0xFFD9|EOI|End of Image|

# 1. File Start & File End
- `SOI`: start of the image file
- `EOI`: end of the image file

`SOS` does not state the length of the compressed data, so the decoder keeps reading until it hits `EOI`.

Let's try a simple decoder. Apart from `SOI` and `EOI`, which need special handling, every segment follows the same rule: read the 2-byte marker, then read the 2-byte segment length:
```python=
from struct import unpack


marker_mapping = {
    0xffd8: "Start of Image",
    0xffe0: "Application Default Header",
    0xffdb: "Quantization Table",
    0xffc0: "Start of Frame",
    0xffc4: "Define Huffman Table",
    0xffda: "Start of Scan",
    0xffd9: "End of Image"
}


class JPEG:
    def __init__(self, image_file):
        with open(image_file, 'rb') as f:
            self.img_data = f.read()
    
    def decode(self):
        data = self.img_data
        while(True):
            marker, = unpack(">H", data[0:2])
            print(marker_mapping.get(marker))
            if marker == 0xffd8:  # soi
                data = data[2:]
            elif marker == 0xffd9:  # eoi
                return
            elif marker == 0xffda:  # sos
                self.decodeSOS(data[2:-2])
                data = data[-2:]
            else:
                lenchunk, = unpack(">H", data[2:4])
                chunk = data[4:2+lenchunk]
                data = data[2+lenchunk:]

                if marker == 0xffc4:
                    self.decodeHuffmanTable(chunk)
                elif marker == 0xffdb:
                    self.DefineQuantizationTables(chunk)
                elif marker == 0xffc0:
                    self.decodeFrameHeader(chunk)

            if len(data) == 0:
                break  

if __name__ == "__main__":
    img = JPEG('profile.jpg')
    img.decode()    

# OUTPUT:
# Start of Image
# Application Default Header
# Quantization Table
# Quantization Table
# Start of Frame
# Define Huffman Table
# Define Huffman Table
# Define Huffman Table
# Define Huffman Table
# Start of Scan
# End of Image
```

# 2. Application Default Header (`0xFFE0`)
Skipped; it does not affect decoding.

# 3. Define Quantization Table, DQT (`0xFFDB`)
[ISO/IEC 10918-1:1992](https://www.w3.org/Graphics/JPEG/itu-t81.pdf), pp. 39-40

![](https://i.imgur.com/5XMc5gU.png)
![](https://i.imgur.com/n83GffK.png)

Baseline:
Parameter | Value
-|-
Lq|2+65n
Pq|0 (8-bit $Q_k$)
Tq|0~3 (0: luminance, 1: chrominance)

Actual layout:
```
Lq        | PqTq | Qk...
0x00 0x43   0x01   0x01 0x02 0x03 ...
```
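The byte after Lq packs Pq in the high nibble and Tq in the low nibble; for Baseline, Pq is always 0, which is why the decoder below can get away with treating the whole byte as the table id. A minimal sketch of the split:

```python
pqtq = 0x01       # the Pq|Tq byte from the layout above
pq = pqtq >> 4    # sample precision: 0 means 8-bit Qk
tq = pqtq & 0x0F  # table destination id: 0~3
print(pq, tq)     # 0 1
```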

### 3.1 Decode DQT
```python3=
from struct import unpack

def GetArray(fmt, data, length):
    # unpack `length` single-byte items into a list
    s = fmt * length
    return list(unpack(s, data[:length]))
  
class JPEG:
    # ------

    def DefineQuantizationTables(self, data):
        offset = 0
        while offset < len(data):
            id, = unpack("B", data[offset: offset+1])
            self.quant[id] = GetArray("B", data[offset + 1:offset + 65], 64)
            print("#{} Table".format(id))
            print("Elements: ", self.quant[id])
            offset += 65


    def decode(self):
        # ...
        self.DefineQuantizationTables(chunk)
```

### 3.2 Encode DQT
使用全是1的quantization table進行測試
```python=
def encodeQuantizationTables(self, hdr):
    # DQT, Lq, (Pq|Tq), Qk
    return pack('>HHB'+"B"*64, 0xffdb, 2+1+64, hdr, *self.quant[hdr])

jpeg.quant[0] = np.ones(64, dtype='uint8').tobytes()
print('DQT: ', jpeg.encodeQuantizationTables(0))
```
```
# b'\xff\xdb\x00C\x00\x01\x01\x01\x01.......\x01'
```
The quantization table can also be one of the standard's default tables, scaled by a quality factor:
```python=
qf = 90
factor = 5000/qf if (qf < 50) else 200-2*qf
q = load_quantization_table('lum')
q = np.clip(q*factor/100,1,255)
```
It is applied like so:
```
(img // _jpeg_quantiz_matrix)
```
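Putting the pieces together, here is a sketch of IJG-style quality scaling. The `load_quantization_table` helper above is assumed elsewhere; this version simply inlines the luminance matrix from the standard's Table K.1:

```python
import numpy as np

# Luminance quantization matrix from ISO/IEC 10918-1 Table K.1
q_lum = np.array([
    [16, 11, 10, 16, 24, 40, 51, 61],
    [12, 12, 14, 19, 26, 58, 60, 55],
    [14, 13, 16, 24, 40, 57, 69, 56],
    [14, 17, 22, 29, 51, 87, 80, 62],
    [18, 22, 37, 56, 68, 109, 103, 77],
    [24, 35, 55, 64, 81, 104, 113, 92],
    [49, 64, 78, 87, 103, 121, 120, 101],
    [72, 92, 95, 98, 112, 100, 103, 99],
])

def scaled_table(qf):
    # IJG scaling: qf=50 keeps the table as-is, higher qf quantizes less
    factor = 5000 / qf if qf < 50 else 200 - 2 * qf
    return np.clip(q_lum * factor / 100, 1, 255).astype(np.uint8)

print(scaled_table(50)[0, 0])  # 16: unchanged at qf=50
print(scaled_table(90)[0, 0])  # qf=90 -> factor 20 -> 16*0.2 truncates to 3
```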
# 4. Start of Frame, SOF (`0xFFC0`)
[ISO/IEC 10918-1:1992](https://www.w3.org/Graphics/JPEG/itu-t81.pdf), pp. 35-36
![](https://i.imgur.com/n6ujDPm.png)
![](https://i.imgur.com/8nNSJt8.png)

---
![](https://i.imgur.com/uZEn8WC.png)
Cross-referencing with the figures above:

Parameter|Description|Value (For Baseline DCT)
-|-|-
SOFn|Marker|0xffc0
Lf|Frame header length|fixed part Lf+P+Y+X+Nf = 8 bytes, each component Ci+Hi+Vi+Tqi = 3 bytes, so Lf = 8+3*3 = 17
P|Sample precision|8
Y|Number of lines (height)|2
X|Number of samples per line (width)|6
Nf|Number of image components in frame|Y, Cb, Cr => Nf=3
Ci|Component identifier|1~3 (1=Y, 2=Cb, 3=Cr, 4=I, 5=Q)
Hi|Horizontal sampling factor|see below
Vi|Vertical sampling factor|see below
Tqi|Quantization table destination selector, i.e. which quantization table the component uses|Y maps to table 0; Cb and Cr share table 1

The meaning of `Y`, `X`, `Hi`, `Vi` needs [ISO/IEC 10918-1:1992](https://www.w3.org/Graphics/JPEG/itu-t81.pdf) A.1.1:
![](https://i.imgur.com/ruMaanx.png)

Suppose a 512x512 image has 3 components (`Y`, `Cb`, `Cr`). We want the number of `xi*yi` samples that make up each component (a sample is what we usually call a "pixel" of the image; in Baseline, 1 sample = 8 bits):
```
Component 0   H0=4  V0=1
Component 1   H1=2  V1=2
Component 2   H2=1  V2=1
```
With `X=512`, `Y=512`, `Hmax=4`, `Vmax=2`, the decoder computes each component's `xi = ⌈X·Hi/Hmax⌉` and `yi = ⌈Y·Vi/Vmax⌉`:
```
Component 0   x0=512  y0=256
Component 1   x1=256  y1=512
Component 2   x2=128  y2=256
```
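The steps above can be sketched directly from the A.1.1 formula; this reproduces the numbers just listed:

```python
import math

def component_size(X, Y, h, v, hmax, vmax):
    # A.1.1: each component is sampled proportionally to its factors
    return math.ceil(X * h / hmax), math.ceil(Y * v / vmax)

sizes = [component_size(512, 512, h, v, 4, 2)
         for h, v in [(4, 1), (2, 2), (1, 1)]]
print(sizes)  # [(512, 256), (256, 512), (128, 256)]
```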

So for the table in the section above, the corresponding `xi`, `yi` are:
```
Y=2 X=6 Hmax=2 Vmax=2
Component 1   H1=2  V1=2     x1=6 y1=2
Component 2   H2=1  V2=1  => x2=3 y2=1
Component 3   H3=1  V3=1     x3=3 y3=1
```
Notice that the chroma components store half the information both horizontally and vertically, i.e. [YUV 4:2:0](https://www.impulseadventure.com/photo/jpeg-decoder.html).

```
MCU1                  | MCU2
Y00 Y10 Y01 Y11 Cb Cr | Y...
```

### 4.1 Decode Frame Header 
```python=
subsample_mapping = {
    "11": "4:4:4",
    "21": "4:2:2",
    "41": "4:1:1",
    "22": "4:2:0"
}

class JPEG:
    
    def decodeFrameHeader(self, data):
        # BaselineDCT
        precision, self.height, self.width, self.components = unpack(
            ">BHHB", data[0:6])
        for i in range(self.components):
            id, factor, QtbId = unpack("BBB", data[6+i*3:9+i*3])
            h, v = (factor >> 4, factor & 0x0F)
            self.horizontalFactor.append(h)
            self.verticalFactor.append(v)
            self.quantMapping.append(QtbId)

        print("size {}x{}".format(self.width, self.height))
        # the subsampling ratio is luma factor / chroma factor
        print("subsampling {}".format(subsample_mapping.get("{}{}".format(
            self.horizontalFactor[0] // self.horizontalFactor[-1],
            self.verticalFactor[0] // self.verticalFactor[-1]
        ))))
```

### 4.2 Encode Frame Header
```python=
    def encodeFrameHeader(self):
        # BaselineDCT, 4:4:4, 3 components
        C = []
        for i in range(3):
            C.append(i+1)
            C.append(1 << 4 | 1)
            C.append(self.quantMapping[i])
        return pack('>HHBHHB'+'BBB'*3, 0xffc0, 8+3*3, 8, self.height, self.width, 3, *C)
```
A quick test of the output:
```python=
img.quantMapping=[0,1,1]
img.height=2
img.width=6
print(img.encodeFrameHeader()) # b'\xff\xc0\x00\x11\x08\x00\x02\x00\x06\x03\x01\x11\x00\x02\x11\x01\x03\x11\x01'
```

# 5. Define Huffman Table, DHT(`0xFFC4`)
[ISO/IEC 10918-1:1992](https://www.w3.org/Graphics/JPEG/itu-t81.pdf), pp. 40-41
![](https://i.imgur.com/GyvQgYt.png)
![](https://i.imgur.com/KIiJTmd.png)

Parameter|Description|Value (For Baseline DCT)
-|-|-
Lh|Huffman table definition length|2+17+mt
Tc|Table class|0=DC table, 1=AC table
Th|Huffman table destination identifier|0~1 (0=luminance, 1=chroma by convention; the standard doesn't actually mandate this, it only has to line up with the destinations referenced by the quantization tables and the scan header)
Li|Number of Huffman codes of length i|0~255
Vij|Value associated with each Huffman code|0~255

As mentioned earlier, JPEG stores a Huffman table in two parts, the codeword lengths and the decoded values, 16+n bytes in total; those are the Li and Vij.
```
00 01 05 01 01 01 01 01 01 00 00 00 00 00 00 00 | 00 01 02 03 04 05 06 07 08 09 0A 0B
                codeword lengths                |             decoded values
```

Recalling the earlier example, let's try to write the encoder/decoder.
![](https://i.imgur.com/4blWV32.png)

:::warning
The standard allows multiple Huffman tables to follow a single marker
:::
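Annex C of the standard derives the actual codewords from the length counts: codes of the same length are consecutive integers, and moving to the next length appends a 0 (left shift). A sketch of that generation procedure, assuming 16 length slots as in the DHT segment:

```python
def huffman_codes(lengths, values):
    # Annex C: assign consecutive codes per length, shift when the length grows
    codes, code, idx = {}, 0, 0
    for bits in range(1, 17):
        for _ in range(lengths[bits - 1]):
            codes[format(code, '0{}b'.format(bits))] = values[idx]
            code += 1
            idx += 1
        code <<= 1
    return codes

# two 1-bit codes: '0' and '1'
print(huffman_codes([2] + [0] * 15, [5, 6]))  # {'0': 5, '1': 6}
```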

### 5.1 Decode Huffman Table
For the Huffman decoder, [micro-jpeg-visualizer](https://github.com/aguaviva/micro-jpeg-visualizer) is nicely written (the whole program is only 250 lines and depends on no packages), so I simply borrowed it.

```python=
class HuffmanTable:
    def __init__(self):
        self.root = []
        self.elements = []

    def BitsFromLengths(self, root, element, pos):
        if isinstance(root, list):
            if pos == 0:
                if len(root) < 2:
                    root.append(element)
                    return True
                return False
            for i in [0, 1]:
                if len(root) == i:
                    root.append([])
                if self.BitsFromLengths(root[i], element, pos-1) == True:
                    return True
        return False

    def GetHuffmanBits(self, lengths, elements):
        self.elements = elements
        ii = 0
        for i in range(len(lengths)):
            for j in range(lengths[i]):
                self.BitsFromLengths(self.root, elements[ii], i)
                ii += 1

    def Find(self, st):
        r = self.root
        while isinstance(r, list):
            r = r[st.GetBit()]
        return r

    def GetCode(self, st):
        while(True):
            res = self.Find(st)
            if res == 0:
                return 0
            elif (res != -1):
                return res
```

`GetCode` needs to read the data bit by bit, so we add a `Stream` class to read the data one bit at a time:
```python=
class Stream:
    def __init__(self, data):
        self.data= data
        self.pos = 0

    def GetBit(self):
        b = self.data[self.pos >> 3]
        s = 7-(self.pos & 0x7)
        self.pos+=1
        return (b >> s) & 1

    def GetBitN(self, l):
        val = 0
        for i in range(l):
            val = val*2 + self.GetBit()
        return val
```

Suppose the Huffman table maps `0b000` to 0x04:
```python
    st = Stream([0x00])  # 0b00000000
    print('0b000: ', hf.GetCode(st))  # 4
    print('0b000: ', hf.GetCode(st))  # 4
```

Unpack `lengths` and `elements`, feed them into `HuffmanTable()` to build the Huffman tree (`root`), and `GetCode` can then decode codewords:
```python=
    def decodeHuffmanTable(self, data):
        offset = 0
        tbcount = 0
        while offset < len(data):
            tcth, = unpack("B", data[offset:offset+1])
            tc, th = (tcth >> 4, tcth & 0x0F)
            offset += 1

            # Extract the 16 bytes containing length data
            lengths = unpack("BBBBBBBBBBBBBBBB", data[offset:offset+16])
            offset += 16

            # Extract the elements after the initial 16 bytes
            elements = []
            for i in lengths:
                elements += (unpack("B"*i, data[offset:offset+i]))
                offset += i

            print("#{}: {} {} Table".format(tbcount, "Luma" if th ==
                  0 else "Chroma", "DC" if tc == 0 else "AC"))
            print("lengths: ", lengths)
            print("Elements: ", elements)

            # tc = 0(DC), 1(AC)
            # th = 0(Luminance), 1(Chroma)
            hf = HuffmanTable()
            hf.GetHuffmanBits(lengths, elements)
            self.huffman_tables[tc << 1 | th] = hf
            tbcount += 1
```
```
Define Huffman Table
Luma AC Table
lengths:  (0, 1, 4, 1, 3, 4, 2, 3, 0, 1, 5, 1, 0, 0, 0, 0)
Elements:  [3, 1, 2, 4, 5, 6, 7, 17, 18, 0, 8, 19, 33, 20, 34, 21, 35, 49, 50, 9, 22, 36, 65, 81, 23]
Define Huffman Table
Chroma DC Table
lengths:  (0, 2, 3, 1, 0, 3, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0)
Elements:  [6, 7, 4, 5, 8, 3, 0, 1, 2, 9]
Define Huffman Table
Chroma AC Table
lengths:  (0, 2, 1, 3, 3, 3, 2, 5, 3, 4, 1, 4, 3, 0, 0, 0)
Elements:  [1, 2, 3, 4, 5, 17, 0, 18, 33, 6, 49, 65, 19, 34, 7, 50, 81, 97, 113, 20, 35, 129, 21, 66, 145, 177, 161, 22, 37, 82, 209, 130, 193, 241]
```

### 5.2 Encode Huffman Table
```python=
    def encodeHuffmanTable(self, lengths, elements, tc, th):
        # DHT, Lh, (Tc|Th), Li, Vij
        Lh = 2 + 1 + 16 + len(elements)  # Lh excludes the marker itself
        tcth = tc << 4 | th
        return pack('>HHB'+'B'*16+'B'*len(elements), 0xffc4, Lh, tcth, *lengths, *elements)
```
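As a standalone sketch, the same segment can be packed for the standard's luminance DC table (lengths and values taken from Annex K, Table K.3):

```python
from struct import pack

# Table K.3: luminance DC codeword lengths and decoded values
lengths = [0, 1, 5, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
values = list(range(12))

tc, th = 0, 0                  # DC table, destination 0
Lh = 2 + 1 + 16 + len(values)  # Lh excludes the marker itself
dht = pack('>HHB', 0xffc4, Lh, tc << 4 | th) + bytes(lengths) + bytes(values)
print(dht.hex())
```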

# 6. Start of Scan, SOS (`0xFFDA`)
[ISO/IEC 10918-1:1992](https://www.w3.org/Graphics/JPEG/itu-t81.pdf), pp. 37-38

![](https://i.imgur.com/Ts5Y0ht.png)
![](https://i.imgur.com/aznVGVf.png)

### 6.1 Decode SOS
Before decoding, a few functions need some changes.

After color conversion, JPEG subtracts 128 before applying the DCT, mapping `[0, 255]` to `[-128, 127]`, so decoding has to add the 128 back first.
```python=
# Level-shifting
for i in range(3):
    self.img[i] += 128
```

#### 6.1.1 `Image`
The `Image` object that stores the picture only works with a 3-D array during color conversion; the rest of the time `img` is a 1-D list holding the 2-D data of Y, Cb, and Cr. `DrawMatrix` renders each decoded block onto the full canvas.
```python3=
class Image:
    def __init__(self, height, width):
        self.width = width
        self.height = height
        # [0] = Y, [1] = Cb, [2]=Cr
        self.img = []
        for i in range(3):
            self.img.append(np.zeros((self.height, self.width)))

    def ycbcr2rgb(self):
        # Level-shifting
        for i in range(3):
            self.img[i] += 128

        # Merge multiple 2D arrays into a 3D array
        self.img = np.dstack(tuple([i for i in self.img]))

        # Convert image from YCbCr to RGB
        self.img[:, :, 1:] -= 128
        m = np.array([
            [1.000,  1.000, 1.000],
            [0.000, -0.344136, 1.772],
            [1.402, -0.714136, 0.000],
        ])
        rgb = np.dot(self.img, m)
        self.img = np.clip(rgb, 0, 255)

        # Convert a 3D array to multiple 2D arrays
        self.img = [self.img[:, :, i] for i in range(3)]

    def DrawMatrix(self, y, x, L, Cb, Cr):
        _x = x*8
        _y = y*8
        self.img[0][_y:_y+8, _x:_x+8] = L.reshape((8, 8))
        self.img[1][_y:_y+8, _x:_x+8] = Cb.reshape((8, 8))
        self.img[2][_y:_y+8, _x:_x+8] = Cr.reshape((8, 8))
```
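A quick sanity check of the conversion matrix used in `ycbcr2rgb`: a neutral-chroma pixel (Cb = Cr = 0 after centering) must come out gray:

```python
import numpy as np

# Same matrix as in ycbcr2rgb; rows are the Y, Cb, Cr contributions
m = np.array([
    [1.000,  1.000,    1.000],
    [0.000, -0.344136, 1.772],
    [1.402, -0.714136, 0.000],
])

ycbcr = np.array([200.0, 0.0, 0.0])  # Y=200, centered Cb/Cr = 0
rgb = ycbcr @ m
print(rgb)  # [200. 200. 200.]
```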

#### 6.1.2 `decodeSOS`
> To help resiliency in the case of data corruption, the JPEG standard allows JPEG markers to appear in the huffman-coded scan data segment. Therefore, a JPEG decoder must watch out for any marker (as indicated by the `0xFF` byte, followed by a non-zero byte). If the huffman coding scheme needed to write a `0xFF` byte, then it writes a `0xFF` followed by a `0x00` -- a process known as adding a stuff byte.

JPEG allows markers (`0xFFXX`) to appear in the scan data for error resiliency. To tell data apart from markers, a data byte of `0xFF` is written as `0xFF00` (byte stuffing), a detail to watch for before Huffman decoding.

Decoding flow:
1. Read each component's Huffman table id and quantization table id
2. Replace `0xFF00` with `0xFF`
3. Call `BuildMatrix` for the components in MCU order to decode them against the tables; under YCbCr 4:4:4 that order is Y, Cb, Cr, Y, Cb, Cr, ...
4. Hand each decoded component to `DrawMatrix` to render onto the canvas `img`

```python=
    def decodeSOS(self, data):
        # BaselineDCT
        ls, ns = unpack(">HB", data[0:3])
        csj = unpack("BB"*ns, data[3:3+2*ns])
        dcTableMapping = []
        acTableMapping = []
        for i in range(ns):
            dcTableMapping.append(csj[i*2+1] >> 4)
            acTableMapping.append(csj[i*2+1] & 0x0F)
        data = data[6+2*ns:]

        # Replace 0xFF00 with 0xFF
        i = 0
        while i < len(data) - 1:
            m, = unpack(">H", data[i:i+2])
            if m == 0xff00:
                data = data[:i+1]+data[i+2:]
            i = i + 1

        img = Image(self.height, self.width)
        st = Stream(data)
        # each component keeps its own DC predictor
        oldlumdccoeff, oldCbdccoeff, oldCrdccoeff = 0, 0, 0
        for y in range(self.height//8):
            for x in range(self.width//8):
                matL, oldlumdccoeff = self.BuildMatrix(
                    st, dcTableMapping[0], acTableMapping[0], self.quant[self.quantMapping[0]], oldlumdccoeff)
                matCb, oldCbdccoeff = self.BuildMatrix(
                    st, dcTableMapping[1], acTableMapping[1], self.quant[self.quantMapping[1]], oldCbdccoeff)
                matCr, oldCrdccoeff = self.BuildMatrix(
                    st, dcTableMapping[2], acTableMapping[2], self.quant[self.quantMapping[2]], oldCrdccoeff)
                img.DrawMatrix(y, x, matL.base, matCb.base, matCr.base)

        img.ycbcr2rgb()
        self.decoded_data = img.img
```
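The stuffed-byte removal in step 2 can also be done in one pass; a sketch using `bytes.replace`:

```python
def remove_stuffing(data: bytes) -> bytes:
    # a data byte 0xFF is stored as 0xFF 0x00; drop the stuffed 0x00
    return data.replace(b'\xff\x00', b'\xff')

print(remove_stuffing(b'\x12\xff\x00\x34').hex())  # 12ff34
```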

#### 6.1.3 `BuildMatrix`
`BuildMatrix` reorders the DC/AC coefficients and performs the inverse DCT.

Two spots in [micro-jpeg-visualizer](https://github.com/aguaviva/micro-jpeg-visualizer)'s implementation are worth a closer look:
1. If the decoded DC coefficient is `0` (`EOB`), there is no need to read any additional bits. In `GetBitN`, passing `0` reads nothing from the `Stream` and returns `0`, and `DecodeNumber(0, 0)` also returns `0` directly, which saves checking whether `code` is 0.
    ```python3
    bits = st.GetBitN(code)
    dccoeff = DecodeNumber(code, bits) + olddccoeff
    i.base[0] = dccoeff
    ```
2. For the AC coefficients, reading ZRL (`0xF0`) should add 16 to the offset (`l`) and move on to the next code, but the implementation splits this into +15 and +1, again saving a branch.
    ```python3=
    if code > 15:
        l += code >> 4 # +15
        code = code & 0x0F

    bits = st.GetBitN(code)

    if l < 64:
        coeff = DecodeNumber(code, bits)
        # print(coeff)

        i.base[l] = coeff * quant[l]
        l += 1 # +1
    ```
    To make the JPEG logic easier to follow, I deliberately rewrote it here as:
    ```python3=
    if code == 0xF0:
        # ZRL
        l += 16 # +16
        continue
    elif code > 15:
        l += code >> 4
        code = code & 0x0F

    bits = st.GetBitN(code)

    if l < 64:
        coeff = DecodeNumber(code, bits)
        i.base[l] = coeff
        l += 1
    ```
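Both snippets rely on `DecodeNumber`, the spec's EXTEND procedure (F.2.2.1): `code` is the magnitude category and `bits` the additional bits, with values below the half-range folded into negatives. A sketch matching the behavior described above:

```python
def DecodeNumber(code, bits):
    # EXTEND (F.2.2.1): a category-`code` value spans [-(2^code - 1), 2^code - 1]
    l = 2 ** (code - 1)
    if bits >= l:
        return bits
    return bits - (2 * l - 1)

# category 3 covers -7..-4 and 4..7
print(DecodeNumber(3, 0b111), DecodeNumber(3, 0b011))  # 7 -4
```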
Not having noticed these two tricks, I dropped them when rewriting, and decoding went wrong a few MCUs in; a bit stream is very hard to debug, and in the end I found the problem by diffing against the original source. The whole thing:
```python=
class JPEG:
        
    def BuildMatrix(self, st, dcTableId, acTableId, quant, olddccoeff):
        i = DCT()

        code = self.huffman_tables[0b00 | dcTableId].GetCode(st)
        bits = st.GetBitN(code)
        dccoeff = DecodeNumber(code, bits) + olddccoeff
        i.base[0] = dccoeff

        l = 1
        while l < 64:
            code = self.huffman_tables[0b10 | acTableId].GetCode(st)
            if code == 0:
                # EOB
                break

            if code == 0xF0:
                # ZRL
                l += 16
                continue
            elif code > 15:
                l += code >> 4
                code = code & 0x0F

            bits = st.GetBitN(code)

            if l < 64:
                coeff = DecodeNumber(code, bits)
                i.base[l] = coeff
                l += 1

        i.base = np.multiply(i.base, quant)
        i.rearrange_using_zigzag()
        i.perform_IDCT()

        return i, dccoeff
```

#### 6.1.4 `DCT`
This implements the material from the previous article.
```python=
class DCT():
    def __init__(self):
        self.base = np.zeros(64)
        self.zigzag = np.array([
            [0, 1, 5, 6, 14, 15, 27, 28],
            [2, 4, 7, 13, 16, 26, 29, 42],
            [3, 8, 12, 17, 25, 30, 41, 43],
            [9, 11, 18, 24, 31, 40, 44, 53],
            [10, 19, 23, 32, 39, 45, 52, 54],
            [20, 22, 33, 38, 46, 51, 55, 60],
            [21, 34, 37, 47, 50, 56, 59, 61],
            [35, 36, 48, 49, 57, 58, 62, 63],
        ]).flatten()
        # Generate 2D-DCT matrix
        L = 8
        C = np.zeros((L, L))
        for k in range(L):
            for n in range(L):
                C[k, n] = np.sqrt(1/L)*np.cos(np.pi*k*(1/2+n)/L)
                if k != 0:
                    C[k, n] *= np.sqrt(2)
        self.dct = C

    def perform_DCT(self):
        self.base = np.kron(self.dct, self.dct) @ self.base

    def perform_IDCT(self):
        self.base = np.kron(self.dct.transpose(),
                            self.dct.transpose()) @ self.base

    def rearrange_using_zigzag(self):
        newidx = np.ones(64).astype('int8')
        for i in range(64):
            newidx[list(self.zigzag).index(i)] = i
        self.base = self.base[newidx]
```
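Since `C` is orthonormal, `kron(C, C)` is too, so `perform_IDCT` (multiplying by the transpose) exactly undoes `perform_DCT`. A standalone check:

```python
import numpy as np

# Same 2D-DCT construction as in the DCT class above
L = 8
C = np.zeros((L, L))
for k in range(L):
    for n in range(L):
        C[k, n] = np.sqrt(1 / L) * np.cos(np.pi * k * (1 / 2 + n) / L)
        if k != 0:
            C[k, n] *= np.sqrt(2)

T = np.kron(C, C)              # 64x64 2-D DCT on flattened blocks
x = np.arange(64, dtype=float)
print(np.allclose(T.T @ (T @ x), x))  # True
```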

#### 6.1.5 Reconstruct image
We need an image whose width and height are multiples of 8 to test the decoder:
1. Grab the [400x400 Fubuki avatar](https://twitter.com/shirakamifubuki) from Twitter
    ![](https://pbs.twimg.com/profile_images/1482580721466486784/q36x-Uux_400x400.jpg)
2. Open it with [GIMP](https://www.gimp.org/) and save a copy as an optimized JPG
    ![](https://i.imgur.com/CHSDxt3.png)
3. Decode it. `decode` produces a 1-D list of 2-D matrices, one per RGB component, so we still need `dstack` to stack them into a 3-D array
    ```python=
    jpeg = JPEG('sample2.jpg')
    jpeg.decode()
    img = np.dstack(tuple([i for i in jpeg.decoded_data])).astype(np.uint8)
    ```
4. Show the result with `pyplot`
    ```python=
    import matplotlib.pyplot as plt
    img = plt.imshow(img, interpolation='nearest')
    plt.axis('off')
    plt.savefig("output.png", bbox_inches='tight')
    ```
    ```
    real    0m5.922s
    user    0m6.220s
    sys     0m1.460s
    ```
    ![](https://imgur.com/rmoDoOn.jpg)
    
#### 6.1.6 Image resolution isn't a multiple of 8
When a JPEG's dimensions aren't multiples of 8, the image is extended (padded) until they are. How to extend is left to the implementation; padding by mirroring is common. Why mirroring? Consider an image with a black border: the hard edge produces widely spread coefficients in the frequency domain (recall the Fourier transform of a step function), which means many distinct bits to encode, so the border is filled by mirroring instead.

Let's pad an 8x2 black-and-white image by repetition (edge), mirroring (reflect), and symmetry (symmetric), implemented with [numpy.pad](https://numpy.org/doc/stable/reference/generated/numpy.pad.html).


```python=
import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import entropy

class DCT():
    def __init__(self):
        self.base = np.zeros(64)
        self.zigzag = np.array([
            [0, 1, 5, 6, 14, 15, 27, 28],
            [2, 4, 7, 13, 16, 26, 29, 42],
            [3, 8, 12, 17, 25, 30, 41, 43],
            [9, 11, 18, 24, 31, 40, 44, 53],
            [10, 19, 23, 32, 39, 45, 52, 54],
            [20, 22, 33, 38, 46, 51, 55, 60],
            [21, 34, 37, 47, 50, 56, 59, 61],
            [35, 36, 48, 49, 57, 58, 62, 63],
        ]).flatten()
        # Generate 2D-DCT matrix
        L = 8
        C = np.zeros((L, L))
        for k in range(L):
            for n in range(L):
                C[k, n] = np.sqrt(1/L)*np.cos(np.pi*k*(1/2+n)/L)
                if k != 0:
                    C[k, n] *= np.sqrt(2)
        self.dct = C

    def perform_DCT(self):
        self.base = np.kron(self.dct, self.dct) @ self.base


quant = np.array([
    [16, 11, 10, 16, 24, 40, 51, 61],
    [12, 12, 14, 19, 26, 58, 60, 55],
    [14, 13, 16, 24, 40, 57, 69, 56],
    [14, 17, 22, 29, 51, 87, 80, 62],
    [18, 22, 37, 56, 68, 109, 103, 77],
    [24, 35, 55, 64, 81, 104, 113, 92],
    [49, 64, 78, 87, 103, 121, 120, 101],
    [72, 92, 95, 98, 112, 100, 103, 99],
])

image = np.array([
    [0, 255],
    [0, 255],
    [0, 255],
    [0, 255],
    [0, 255],
    [0, 255],
    [0, 255],
    [0, 255]
])
seq = ((0, 8-image.shape[0]), (0, 8-image.shape[1]))
bins = np.linspace(-128, 127, 256)
for mode in ('edge', 'reflect', 'symmetric'):
    m = DCT()
    m.base = np.pad(image, seq, mode)
    m.base = m.base.flatten()
    m.base -= 128  # level-shifting
    m.perform_DCT()
    m.base = m.base // quant.flatten()
    plt.hist(m.base, bins, alpha=0.5, label=mode)
    
    unique, counts = np.unique(m.base,return_counts=True)
    freq = counts/np.sum(counts)
    print(mode, entropy(freq, base=2))
plt.legend(loc='upper right')
plt.show()
```


![](https://imgur.com/lbovgCx.jpg)
The result shows that reflect and symmetric produce much more concentrated coefficients, and computing the entropy of all three confirms that reflect and symmetric give a better compression ratio.
```
edge 1.7733292042478084
reflect 1.3967822215997983
symmetric 1.0960127271685762
```

# 7. Conclusion
This article covered how the data is laid out inside a JPEG and implemented a [decoder](). Given time, I will also implement an encoder, which needs to handle the important markers SOI, DQT, SOF, DHT, SOS, and EOI. Since the focus is on the quality factor and image differences rather than compression ratio, the luma and chroma Huffman tables come from Tables K.3~K.6 of [ISO/IEC 10918-1:1992](https://www.w3.org/Graphics/JPEG/itu-t81.pdf) and the quantization tables from Tables K.1~K.2, with [ghallak](https://github.com/ghallak/jpeg-python/tree/2fe1bd2244c3090543695b106866dfa0a3b48f6c) as an implementation reference.

Decoding also made something clear: even though processing the MCUs themselves is cheap, a JPEG decoder's speed is still bounded by the Huffman decoder, because the start of the next symbol is only known after decoding the current one. [t0rakka](https://t0rakka.silvrback.com/jpeg-decoding-benchmark) inserts RST markers to delimit MCUs and decodes with multiple threads to impressive effect, but not every image carries RST markers, so the approach is of limited use.

# Reference
- [Designing a JPEG Decoder & Source Code](https://www.impulseadventure.com/photo/jpeg-decoder.html)
- [JPEG Huffman Coding Tutorial](https://www.impulseadventure.com/photo/jpeg-huffman-coding.html)
- [Entropy from a computer-coding perspective](https://hackmd.io/@sXG2cRDpRbONCsrtz8jfqg/ry-0k0PwH)
- [DLCV Lab - Image Compression](http://ip.csie.ncu.edu.tw/course/IP/IP1607cp.pdf)