JPEG Decoder (2/3)

--- ###### tags: `DataCompression` --- # JPEG Decoder (2/3) 在上一部份處理完MCU的DCT及量化，這篇要介紹如何進行對係數進行編碼。一個8x8 MCU經過量化後可分為 - 1個在左上角雙向上頻率為0的**DC coefficient** - 63個**AC coefficient** 從2-D DCT可發現DC coefficient代表區塊內像素平均值，由於鄰近區塊的DC coefficient有相似性，以DPCM方式編碼使值縮小，可以在之後的壓縮取得不錯成效。 - $DiffDC(i) = DC(i) - DC(i-1), i=1,2,...$ - $DC(0)=0$ 例如 - DC(1)=-512 - DC(2)=1024 代表 - MCU 1: DC Coef=-512 - MCU 2: DC Coef=512。 AC coefficient藉由zig-zag順序來將低頻率的係數放在前面，有助於entropy coding。又因區塊小，跨區塊變化也小，高頻部份趨近於0，適合以RLE方式編碼。接者編碼AC/DC coefficient成Bit-stream。 # 1. Huffman Coding DC/AC coefficient在檔案中的存放方式 ``` | MCU 1 | | Y | Cb | Cr | | DC | AC | DC | AC | DC | AC | ... ``` DC/AC coefficient需經過Huffman encoding才寫入。下表範例是經過Huffman encoding的資料集以及其codeword Symbol|a|b|c|d|e -|-|-|-|-|- weights|0.10|0.15|0.30|0.16|0.29 codeword|0b010|0b011|0b11|0b00|0b10 ### 1.1 Huffman Table JPEG分兩部份存Huffman table，第一個部份存codeword length，第二個存decoded value，[實作](https://www.impulseadventure.com/photo/jpeg-decoder.html)上可使用兩個迴圈建出Huffman tree。[Expansion of DHT Table into binary bit strings](https://www.impulseadventure.com/photo/jpeg-huffman-coding.html)敘述如何利用codeword length及decoded value重建code word： ``` Class = 1 (AC Table) Destination ID = 0 Codes of length 01 bits (000 total): Codes of length 02 bits (002 total): 01 02 Codes of length 03 bits (001 total): 03 Codes of length 04 bits (003 total): 11 04 00 Codes of length 05 bits (003 total): 05 21 12 ... ``` ![](https://www.impulseadventure.com/photo/images/huffman/huff_tree1.gif) 步驟如下： 1. 0 bit (root)沒有value，生成左右node 2. 1 bit 沒有value，同樣生成左右node 3. 2 bit 有 0x01, 0x02，從左至右填上分支出來的位置，成為leaf node。剩下的node繼續分支。 4. 3 bit 有 0x03，填上剛生成的node，以此類推。了解上述流程才能理解Huffman table在JPEG內所佔空間為何是`16+n` bytes > In JPEG all Huffman code tables are defined in the image header. Each table requires 16 + n bytes, where n is the number of codewords in the table. **The first 16 bytes list the number of codewords of each length from 1 to 16 bits (codewords longer than 16 bits are forbidden).** The remaining n bytes list the decoded output values of the n codewords in ascending codeword order (n < 256). 以下方HEX Code為例，嘗試解出Huffman table： ``` 01 05 01 01 01 01 01 01 00 00 00 00 00 00 00 | 01 02 03 04 05 06 07 08 09 0A 0B codeword length | decoded value ``` Codeword length (bits)|1|2|3|4|5|6|7|8|9|...|16 -|-|-|-|-|-|-|-|-|-|-|- Decoded value| - | 0x00 | 0x01, 0x02, 0x03, 0x04, 0x05 | 0x06 | 0x07 | 0x08 | 0x09 | 0x0A | 0x0B | - | - Symbol|Codeword|Bit-length -|-|- 0x00|00|2 0x01|010|3 0x02|011|3 0x03|100|3 0x04|101|3 0x05|110|3 0x06|1110|4 0x07|11110|5 0x08|111110|6 0x09|1111110|7 0x0A|11111110|8 0x0B|111111110|9 :::info 當採用Optimized Luminance/Chrominance的DC/AC coefficient分別個一張表，共4張表 ::: ### 1.2 Optimized JPEG 一般來說，JPEG不會對各種係數各建一張表，除非儲存時選擇`Optimized`。 Optimized JPEG指的是對Y(Luminance), CbCr(Chrominance)不同DC/AC係數各算一張Huffman table，雖然在檔案中需額外定義，但整體空間省下[2.4~6.8%](https://www.impulseadventure.com/photo/optimized-jpeg.html)。現代高階相機具有輸出Optimized JPEG的能力，但考慮到耗時與節省的空間不成比例，多採用[JPEG ITU-T81D standard, 149-157](https://www.w3.org/Graphics/JPEG/itu-t81.pdf)提供的Huffman table。計算Optimized JPEG的Huffman table方式不只一種，最簡單的方式是採用先前提到的作法：將係數化成`Size`和`Additional bits`，再對`Size`進行編碼。也有的軟體有自家處理的方法，例如Photoshop CS2限制DC chrominance table不得有`0xA-0xB`、`EOB`為在DC luminance table內為5位元長度的codeword。 ### 1.3 Standard Huffman Tables 由於JPEG對於Huffman table的限制，DC table的`Size`為`0x0-0xB`，也就是最多12個code word；AC table的`Run`與`Size`分別為`0x0-0xF`、`0x1-0xA`，再加上`EOB (0x00)`與`ZRL (0xF0)`共162個code word。Standard huffman tables必須涵蓋所有code word。使用方式如下，拿Table K3的category對應到Luminance的DC係數的Size。例如Size = 3, 對應的code word就是`0b100`。 ![](https://i.imgur.com/E3pgEON.png) 但即使存儲影像時選擇`Standard`也未必採用Standard huffman table，例如Photoshop CS2輸出的檔案，其Huffman table會依據quaility而定。 # 2. DC coefficient Encoding DC coeffiecient並非直接經過Huffman encoding直接寫入檔案，先前提過由於相鄰的區塊DC coefficient有所關係，使用DPCM可降低entropy。 ![](https://i.imgur.com/WIxYdbV.png) 但經過DPCM轉換有著Difference過大的問題，想像一下原始圖片數值介於`[0,255]`，假設經過DCT放大8倍，Difference同樣擴大8倍，這會造成code table非常巨大。因此進一步採用floating-point representation。 ### 2.1 Floating-point representation **Decode** Huffman table解碼出的值，對應到下方表格的`Size`、。`Size=5`代表要讀取5 bits的`Additional bits`，再對應到實際的值。在給定`Size`下，每個`Additional bits`出現的機率相近，使得entropy coding效率高。 DC Value (Size) | Additional bits | Value (DC Coef Difference) -|-|- 0x05|0b00101|-26 - `Size = 0` 具有特殊意義，代表該MCU結束(End of block) ![](https://i.imgur.com/5uo3nnx.png) ```python3= # https://github.com/yasoob/Baseline-JPEG-Decoder/blob/master/decoder.py def DecodeNumber(code, bits): l = 2 ** (code - 1) if bits >= l: return bits else: return bits - (2 * l - 1) DecodeNumber(0x05,0b00101) # -26 DecodeNumber(0x04,0b0101) # -10 ``` **Encode** ```python3= import math def EncodeNumber(value): code = int(math.log(abs(value))/math.log(2)) + 1 if value > 0: return code, value else: return code, value+2**code-1 EncodeNumber(15) # 0x4, 0b1111 EncodeNumber(-26) # 0x5, 0b00101 ``` **Example** 以一小段hex作為[範例](https://www.impulseadventure.com/photo/jpeg-huffman-coding.html)，MCU 1, 2的DC Coef分別是 -512、+508 ![](https://i.imgur.com/HFcc6NH.png) --- # 3. AC coefficient Encoding Run-length encoding(RLE)具有以下特性： - 容易實作 - 無損壓縮 - 對資料內有連續出現的數值壓縮率佳影像經過DCT及Quantization，數值集中在左上角，AC coefficient以zig-zag順序排成一維陣列，進行 (RLE)可以得到不錯的效果。 ### 3.1 Zig-zag sequence ![](https://i.imgur.com/Ag1pmP1.png) 觀察zig-zag matrix - 每條斜線上的點座標x+y均同，左上到右下x+y是遞增的 - 在同一條斜線上，x+y是奇數方向為右上至左下，反之亦然我們可以拿[zigzag matrix](https://rosettacode.org/wiki/Zig-zag_matrix#Python)生成函式的一部份，窮舉所有座標對應的值，依照這些值進行遞增排序即為zig-zag order。預先生成8x8 zig-zag matrix： ```python= zigzag = [ [0, 1, 5, 6, 14, 15, 27, 28], [2, 4, 7, 13, 16, 26, 29, 42], [3, 8, 12, 17, 25, 30, 41, 43], [9, 11, 18, 24, 31, 40, 44, 53], [10, 19, 23, 32, 39, 45, 52, 54], [20, 22, 33, 38, 46, 51, 55, 60], [21, 34, 37, 47, 50, 56, 59, 61], [35, 36, 48, 49, 57, 58, 62, 63], ] ``` ### 3.2 Run-length encoding [JPEG Entropy Coding](http://www.mee.tcd.ie/~sigmedia/pmwiki/uploads/Teaching.4S1b/handout6_4s1.pdf): > The codeword represents the **run-length of zeros before a non-zero coef and the Size of that coef.** This is then followed by the Additional Bits which define the coef amplitude and sign precisely. Size and Additional Bits are defined just as for DC coefs. 編碼方式： ``` (Run, Size) codeword + coefficient additional bits ``` - Run: run-length of zeros - Size & Additional bits: 剛才提到的Floating-point representation 步驟如下： 1. 拿 `(Run, Size)` 組合而成的位元組進行Huffman encoding得到codeword，例如 `Run=0x0A, Size=0x03, (Run, Size)=0xA3` 2. 後面放上`Additional bit` 之所以將`Run`和`Size`放在一起進行編碼，是因為`Run`和`Size`關聯性高（較小的係數通常前面0多，較大的係數前面0較少）。 :::info 有些特殊的組合： - `EOB=(0,0)` - `ZRL=(15,0)` > The procedure “Append EHUFSI(X’F0’) bits of EHUFCO(X’F0’)” codes a run of 16 zero coefficients (ZRL code of Figure F.1). The procedure “Code EHUFSI(0) bits of EHUFCO(0)” codes the end-of-block (EOB code). If the last coefficient (K = 63) is not zero, the EOB code is bypassed. - [ISO/IEC 10918-1:19, 90-91](https://www.w3.org/Graphics/JPEG/itu-t81.pdf) `EOB`代表後續無係數，`ZRL`用來與其他`(Run, Size)`使用表示更長的run，例如： ``` (ZRL) (4,3) 010 ``` 等同 ``` |----- x16-----| |- 4 -| AC 0 0 0 .... 0 0 0 0 0 0 0 -5 ``` ::: 如果不想對`(Run, Size)`進行Huffman encoding，可以使用上方的Table K.3。 **Example** ![](https://i.imgur.com/u3HYSac.png) # 4. Conclusion ![](https://i.imgur.com/M8PEMAq.png) 這篇弄清了係數如何進行編碼AC/DC coeffieient，下篇介紹如何將這些東西塞進檔案。此外，我們並未進行down-sampling，但後續仍需處理這部份的JPEG header。 # 5. Reference - :star: [Understanding and Decoding a JPEG Image using Python ](https://yasoob.me/posts/understanding-and-writing-jpeg-decoder-in-python/) - [JPEG Huffman Coding Tutorial](https://www.impulseadventure.com/photo/jpeg-huffman-coding.html)

Read more

2022q1 Homework3 (quiz3)

2021q1 Homework4 (quiz4)

JPEG Decoder (3/3)

JPEG Decoder (1/3)