
JPEG Decoder (3/3)

This article covers:

  1. Quantization table
  2. Huffman table
  3. Encoded data

and how they are laid out inside a JPEG file. The walkthrough follows the example in Understanding and Decoding a JPEG Image using Python, fills in the gaps with the specification, and implements a JPEG decoder.

JPEG has four compression modes:

  • Baseline
  • Extended Sequential
  • Progressive
  • Lossless

Each mode lays its data out differently; here we assume the image is a Baseline JPEG.

The figure above is by Ange Albertini. It shows the important segments in a JPEG file; most markers are followed by the segment length (which does not include the marker itself). The important markers are:

| Hex | Marker | Marker Name | Description |
| --- | --- | --- | --- |
| 0xFFD8 | SOI | Start of Image | |
| 0xFFE0 | APP0 | Application Segment 0 | JFIF - JFIF JPEG image / AVI1 - Motion JPEG (MJPG) |
| 0xFFDB | DQT | Define Quantization Table | |
| 0xFFC0 | SOF0 | Start of Frame 0 | Baseline DCT |
| 0xFFC4 | DHT | Define Huffman Table | |
| 0xFFDA | SOS | Start of Scan | |
| 0xFFD9 | EOI | End of Image | |

1. File Start & File End

  • SOI: start of the image file
  • EOI: end of the image file

The SOS segment does not state the length of the compressed data, so the decoder keeps reading until it reaches EOI.

Let's try a simple decoder. SOI and EOI need special handling; every other segment follows the same rule: read a 2-byte marker, then a 2-byte segment length:

```python
from struct import unpack

marker_mapping = {
    0xffd8: "Start of Image",
    0xffe0: "Application Default Header",
    0xffdb: "Quantization Table",
    0xffc0: "Start of Frame",
    0xffc4: "Define Huffman Table",
    0xffda: "Start of Scan",
    0xffd9: "End of Image"
}

class JPEG:
    def __init__(self, image_file):
        with open(image_file, 'rb') as f:
            self.img_data = f.read()

    def decode(self):
        data = self.img_data
        while True:
            marker, = unpack(">H", data[0:2])
            print(marker_mapping.get(marker))
            if marker == 0xffd8:    # SOI
                data = data[2:]
            elif marker == 0xffd9:  # EOI
                return
            elif marker == 0xffda:  # SOS
                self.decodeSOS(data[2:-2])
                data = data[-2:]
            else:
                lenchunk, = unpack(">H", data[2:4])
                chunk = data[4:2+lenchunk]
                data = data[2+lenchunk:]
                if marker == 0xffc4:
                    self.decodeHuffmanTable(chunk)
                elif marker == 0xffdb:
                    self.DefineQuantizationTables(chunk)
                elif marker == 0xffc0:
                    self.decodeFrameHeader(chunk)
            if len(data) == 0:
                break

if __name__ == "__main__":
    img = JPEG('profile.jpg')
    img.decode()

# OUTPUT:
# Start of Image
# Application Default Header
# Quantization Table
# Quantization Table
# Start of Frame
# Define Huffman Table
# Define Huffman Table
# Define Huffman Table
# Define Huffman Table
# Start of Scan
# End of Image
```

2. Application Default Header (0xFFE0)

Skipped; it does not affect decoding.

3. Define Quantization Table, DQT(0xFFDB)

ISO/IEC 10918-1:1994, pp. 39-40


Baseline:

| Parameter | Value |
| --- | --- |
| Lq | 2 + 65n |
| Pq | 0 (8-bit Qk) |
| Tq | 0~3 (0: Luminance, 1: Chrominance) |

The actual layout:

```
Lq   | Pq|Tq | Qk...
0x43   0x01   0x01 0x02 0x03
```

3.1 Decode DQT

```python
from struct import unpack

def GetArray(type, l, length):
    s = ""
    for i in range(length):
        s = s + type
    return list(unpack(s, l[:length]))

class JPEG:
    # ------
    def DefineQuantizationTables(self, data):
        offset = 0
        while offset < len(data):
            id, = unpack("B", data[offset:offset+1])
            self.quant[id] = GetArray("B", data[offset+1:offset+65], 64)
            print("#{} Table".format(id))
            print("Elements: ", self.quant[id])
            offset += 65

    def decode(self):
        # ...
        self.DefineQuantizationTables(chunk)
```

3.2 Encode DQT

Test with a quantization table that is all 1s:

```python
def encodeQuantizationTables(self, hdr):
    # DQT, Lq, (Pq|Tq), Qk
    return pack('>HHB'+'B'*64, 0xffdb, 2+1+64, hdr, *self.quant[hdr])

jpeg.quant[0] = np.ones(64, dtype='uint8').tobytes()
print('DQT: ', jpeg.encodeQuantizationTables(0))
# b'\xff\xdb\x00C\x00\x01\x01\x01\x01.......\x01'
```

Instead of all 1s, you can use the default quantization table scaled by a quality factor:

```python
qf = 90
factor = 5000/qf if (qf < 50) else 200 - 2*qf
q = load_quantization_table('lum')
q = np.clip(q*factor/100, 1, 255)
```

It is then applied as:

```python
(img // _jpeg_quantiz_matrix)
```
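The `load_quantization_table` helper above isn't defined in this post. A minimal sketch, taking the function name and the `'lum'` key from the snippet and the values from the standard luminance table (Table K.1, the same matrix used later in this post):

```python
import numpy as np

def load_quantization_table(component):
    # Standard luminance quantization table (Table K.1); a real encoder
    # would also return the chrominance table for component != 'lum'.
    lum = np.array([
        [16, 11, 10, 16, 24, 40, 51, 61],
        [12, 12, 14, 19, 26, 58, 60, 55],
        [14, 13, 16, 24, 40, 57, 69, 56],
        [14, 17, 22, 29, 51, 87, 80, 62],
        [18, 22, 37, 56, 68, 109, 103, 77],
        [24, 35, 55, 64, 81, 104, 113, 92],
        [49, 64, 78, 87, 103, 121, 120, 101],
        [72, 92, 95, 98, 112, 100, 103, 99],
    ])
    return lum

# Scale by the quality factor as in the snippet above
qf = 90
factor = 5000/qf if qf < 50 else 200 - 2*qf
q = np.clip(load_quantization_table('lum') * factor / 100, 1, 255)
```

At qf = 90 the factor is 20, so every entry shrinks to a fifth of its original value, giving much finer quantization steps than the default table.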

4. Start of Frame, SOF(0xFFC0)

ISO/IEC 10918-1:1994, pp. 35-36



Cross-referencing with the figure above:

| Parameter | Description | Value (for Baseline DCT) |
| --- | --- | --- |
| SOFn | Marker | 0xffc0 |
| Lf | Frame header length | Lf+P+Y+X+Nf = 8 bytes, each Ci+Hi+Vi+Tqi = 3 bytes, so Lf = 8+3*3 = 17 |
| P | Sample precision | 8 |
| Y | Number of lines (height) | 2 |
| X | Number of samples per line (width) | 6 |
| Nf | Number of image components in frame | Y, Cb, Cr => Nf = 3 |
| Ci | Component identifier | 1~3 (1=Y, 2=Cb, 3=Cr, 4=I, 5=Q) |
| Hi | Horizontal sampling factor | |
| Vi | Vertical sampling factor | |
| Tqi | Quantization table destination selector: the quantization table each component maps to | Y maps to 0; Cb and Cr share table 1 |

The meaning of Y, X, Hi, and Vi requires ISO/IEC 10918-1:1994 A.1.1.

Suppose a 512x512 image with 3 components (Y, Cb, Cr). We want each component's size in samples, xi*yi (a sample is the "dot" an image is made of; in Baseline, 1 sample = 8 bits):

```
Component 0   H0=4  V0=1
Component 1   H1=2  V1=2
Component 2   H2=1  V2=1
```

With X=512, Y=512, Hmax=4, Vmax=2, the decoder computes each component's xi = ceil(X * Hi/Hmax) and yi = ceil(Y * Vi/Vmax):

```
Component 0   x0=512  y0=256
Component 1   x1=256  y1=512
Component 2   x2=128  y2=256
```

So for the SOF table above, the corresponding xi, yi are:

```
Y=2 X=6 Hmax=2 Vmax=2
Component 1   H1=2  V1=2     x1=6 y1=2
Component 2   H2=1  V2=1  => x2=3 y2=1
Component 3   H3=1  V3=1     x3=3 y3=1
```

Notice the chroma components carry half the information both horizontally and vertically, i.e. YUV 4:2:0.

```
MCU1                  | MCU2
Y00 Y10 Y01 Y11 Cb Cr | Y...
```
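The xi, yi computation above can be sketched directly from A.1.1 (`component_sizes` is a hypothetical helper, not part of the decoder):

```python
from math import ceil

def component_sizes(X, Y, factors):
    """factors: list of (Hi, Vi) per component; returns (xi, yi) per component."""
    Hmax = max(h for h, v in factors)
    Vmax = max(v for h, v in factors)
    # xi = ceil(X * Hi/Hmax), yi = ceil(Y * Vi/Vmax)
    return [(ceil(X * h / Hmax), ceil(Y * v / Vmax)) for h, v in factors]

# The A.1.1 example: 512x512, factors (4,1), (2,2), (1,1)
print(component_sizes(512, 512, [(4, 1), (2, 2), (1, 1)]))
# [(512, 256), (256, 512), (128, 256)]
```

Running it on the 6x2 example from the SOF table, `component_sizes(6, 2, [(2, 2), (1, 1), (1, 1)])`, gives `[(6, 2), (3, 1), (3, 1)]`, matching the table.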

4.1 Decode Frame Header

```python
subsample_mapping = {
    "11": "4:4:4",
    "21": "4:2:2",
    "41": "4:1:1",
    "22": "4:2:0"
}

class JPEG:
    def decodeFrameHeader(self, data):  # Baseline DCT
        precision, self.height, self.width, self.components = unpack(
            ">BHHB", data[0:6])
        for i in range(self.components):
            id, factor, QtbId = unpack("BBB", data[6+i*3:9+i*3])
            h, v = (factor >> 4, factor & 0x0F)
            self.horizontalFactor.append(h)
            self.verticalFactor.append(v)
            self.quantMapping.append(QtbId)
        print("size {}x{}".format(self.width, self.height))
        # Ratio of the luma sampling factors to the chroma sampling factors
        print("subsampling {}".format(subsample_mapping.get("{}{}".format(
            int(self.horizontalFactor[0]/self.horizontalFactor[1]),
            int(self.verticalFactor[0]/self.verticalFactor[1])))))
```

4.2 Encode Frame Header

```python
def encodeFrameHeader(self):  # Baseline DCT, 4:4:4, 3 components
    C = []
    for i in range(3):
        C.append(i+1)                   # Ci
        C.append(1 << 4 | 1)            # Hi=1, Vi=1
        C.append(self.quantMapping[i])  # Tqi
    return pack('>HHBHHB'+'BBB'*3, 0xffc0, 8+3*3, 8,
                self.height, self.width, 3, *C)
```

Output a short test:

```python
img.quantMapping = [0, 1, 1]
img.height = 2
img.width = 6
print(img.encodeFrameHeader())
# b'\xff\xc0\x00\x11\x08\x00\x02\x00\x06\x03\x01\x11\x00\x02\x11\x01\x03\x11\x01'
```

5. Define Huffman Table, DHT(0xFFC4)

ISO/IEC 10918-1:1994, pp. 40-41

| Parameter | Description | Value (for Baseline DCT) |
| --- | --- | --- |
| Lh | Huffman table definition length | 2+17+mt |
| Tc | Table class | 0 = DC table, 1 = AC table |
| Th | Huffman table destination identifier | 0-1 (0 = Luminance, 1 = Chroma). This assignment isn't actually mandated; it only has to match the destination selectors in the quantization tables and the scan header. |
| Li | Number of Huffman codes of length i | 0-255 |
| Vij | Value associated with each Huffman code | 0-255 |

As mentioned earlier, JPEG stores a Huffman table in two parts, the codeword lengths and the decoded values, 16 + n bytes in total; these are exactly Li and Vij:

```
00 01 05 01 01 01 01 01 01 00 00 00 00 00 00 00 | 00 01 02 03 04 05 06 07 08 09 0A 0B
               codeword lengths                 |            decoded values
```
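How the 16 length counts expand into actual codewords can be sketched as follows (`assign_codes` is an illustrative helper, not part of the decoder). Codes of the same length are assigned consecutively; moving to the next length shifts in a 0 bit:

```python
def assign_codes(lengths, values):
    """Build the canonical codeword -> value map from Li and Vij."""
    codes = {}
    code, k = 0, 0
    for i, n in enumerate(lengths):   # i+1 = codeword length in bits
        for _ in range(n):
            codes[format(code, '0{}b'.format(i + 1))] = values[k]
            code += 1
            k += 1
        code <<= 1                    # next length: append a 0 bit
    return codes

# The DC example above: 12 values, lengths 2..9
codes = assign_codes([0, 1, 5, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0],
                     list(range(12)))
print(codes['00'], codes['1110'])  # 0 6
```

This yields the familiar canonical table: `00` for value 0, `010`..`110` for 1-5, then `1110`, `11110`, and so on, one value per step down the tree.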

Revisiting that example, let's write the encoder/decoder.

Note that the standard allows multiple Huffman tables to follow a single DHT marker.

5.1 Decode Huffman Table

For the Huffman decoding itself, micro-jpeg-visualizer's implementation is quite good, so I simply reuse it; the whole program is only 250 lines and depends on no packages.

```python
class HuffmanTable:
    def __init__(self):
        self.root = []
        self.elements = []

    def BitsFromLengths(self, root, element, pos):
        if isinstance(root, list):
            if pos == 0:
                if len(root) < 2:
                    root.append(element)
                    return True
                return False
            for i in [0, 1]:
                if len(root) == i:
                    root.append([])
                if self.BitsFromLengths(root[i], element, pos-1) == True:
                    return True
        return False

    def GetHuffmanBits(self, lengths, elements):
        self.elements = elements
        ii = 0
        for i in range(len(lengths)):
            for j in range(lengths[i]):
                self.BitsFromLengths(self.root, elements[ii], i)
                ii += 1

    def Find(self, st):
        r = self.root
        while isinstance(r, list):
            r = r[st.GetBit()]
        return r

    def GetCode(self, st):
        while True:
            res = self.Find(st)
            if res == 0:
                return 0
            elif res != -1:
                return res
```

GetCode needs to read the data bit by bit, so we add a Stream class to sequentially consume each bit of the data:

```python
class Stream:
    def __init__(self, data):
        self.data = data
        self.pos = 0

    def GetBit(self):
        b = self.data[self.pos >> 3]
        s = 7 - (self.pos & 0x7)
        self.pos += 1
        return (b >> s) & 1

    def GetBitN(self, l):
        val = 0
        for i in range(l):
            val = val*2 + self.GetBit()
        return val
```

Suppose 0b000 maps to 0x04 in the Huffman table:

```python
st = Stream([0x00])  # 0b00000000
print('0b000: ', hf.GetCode(st))  # 4
print('0b000: ', hf.GetCode(st))  # 4
```

Unpack lengths and elements, feed them into HuffmanTable() to build the Huffman tree root, and GetCode can then decode codewords:

```python
def decodeHuffmanTable(self, data):
    offset = 0
    tbcount = 0
    while offset < len(data):
        tcth, = unpack("B", data[offset:offset+1])
        tc, th = (tcth >> 4, tcth & 0x0F)
        offset += 1
        # Extract the 16 bytes containing length data
        lengths = unpack("BBBBBBBBBBBBBBBB", data[offset:offset+16])
        offset += 16
        # Extract the elements after the initial 16 bytes
        elements = []
        for i in lengths:
            elements += (unpack("B"*i, data[offset:offset+i]))
            offset += i
        print("#{}: {} {} Table".format(
            tbcount,
            "Luma" if th == 0 else "Chroma",
            "DC" if tc == 0 else "AC"))
        print("lengths: ", lengths)
        print("Elements: ", elements)
        # tc = 0(DC), 1(AC); th = 0(Luminance), 1(Chroma)
        hf = HuffmanTable()
        hf.GetHuffmanBits(lengths, elements)
        self.huffman_tables[tc << 1 | th] = hf
        tbcount += 1
```
```
Define Huffman Table
Luma AC Table
lengths:  (0, 1, 4, 1, 3, 4, 2, 3, 0, 1, 5, 1, 0, 0, 0, 0)
Elements:  [3, 1, 2, 4, 5, 6, 7, 17, 18, 0, 8, 19, 33, 20, 34, 21, 35, 49, 50, 9, 22, 36, 65, 81, 23]
Define Huffman Table
Chroma DC Table
lengths:  (0, 2, 3, 1, 0, 3, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0)
Elements:  [6, 7, 4, 5, 8, 3, 0, 1, 2, 9]
Define Huffman Table
Chroma AC Table
lengths:  (0, 2, 1, 3, 3, 3, 2, 5, 3, 4, 1, 4, 3, 0, 0, 0)
Elements:  [1, 2, 3, 4, 5, 17, 0, 18, 33, 6, 49, 65, 19, 34, 7, 50, 81, 97, 113, 20, 35, 129, 21, 66, 145, 177, 161, 22, 37, 82, 209, 130, 193, 241]
```

5.2 Encode Huffman Table

```python
def encodeHuffmanTable(self, matrix, tc, th):
    L = []  # Li: count of codes for each length 1..16 (to be filled in)
    V = []  # Vij: values ordered by code length (to be filled in)
    Lh = 2 + 17 + len(V)  # per the table above: 2+17+mt
    tcth = tc << 4 | th
    return pack('>HHB'+'B'*16+'B'*len(V), 0xffc4, Lh, tcth, *L, *V)
```

6. Start of Scan, SOS (0xFFDA)

ISO/IEC 10918-1:1994, pp. 37-38


6.1 Decode SOS

Before decoding, a few functions need adjusting.

After color conversion, JPEG subtracts 128 before the DCT, mapping [0, 255] to [-128, 127], so the decoder has to add 128 back:

```python
# Level-shifting
for i in range(3):
    self.img[i] += 128
```

6.1.1 Image

The Image object stores the image data. It only works with a 3-D array during color conversion; otherwise it keeps the 2-D Y, Cb, Cr planes in the 1-D list img. DrawMatrix renders already-decoded blocks onto the canvas.

```python
class Image:
    def __init__(self, height, width):
        self.width = width
        self.height = height
        # [0] = Y, [1] = Cb, [2] = Cr
        self.img = []
        for i in range(3):
            self.img.append(np.zeros((self.height, self.width)))

    def ycbcr2rgb(self):
        # Level-shifting
        for i in range(3):
            self.img[i] += 128
        # Merge multiple 2D arrays into a 3D array
        self.img = np.dstack(tuple([i for i in self.img]))
        # Convert image from YCbCr to RGB
        self.img[:, :, 1:] -= 128
        m = np.array([
            [1.000, 1.000, 1.000],
            [0.000, -0.344136, 1.772],
            [1.402, -0.714136, 0.000],
        ])
        rgb = np.dot(self.img, m)
        self.img = np.clip(rgb, 0, 255)
        # Convert a 3D array to multiple 2D arrays
        self.img = [self.img[:, :, i] for i in range(3)]

    def DrawMatrix(self, y, x, L, Cb, Cr):
        _x = x*8
        _y = y*8
        self.img[0][_y:_y+8, _x:_x+8] = L.reshape((8, 8))
        self.img[1][_y:_y+8, _x:_x+8] = Cb.reshape((8, 8))
        self.img[2][_y:_y+8, _x:_x+8] = Cr.reshape((8, 8))
```

6.1.2 decodeSOS

> To help resiliency in the case of data corruption, the JPEG standard allows JPEG markers to appear in the huffman-coded scan data segment. Therefore, a JPEG decoder must watch out for any marker (as indicated by the 0xFF byte, followed by a non-zero byte). If the huffman coding scheme needed to write a 0xFF byte, then it writes a 0xFF followed by a 0x00, a process known as adding a stuff byte.

In other words, JPEG allows markers (0xFFXX) inside the scan data for resilience; to keep data bytes distinguishable from markers, a literal 0xFF data byte is written as 0xFF 0x00. This detail has to be handled before Huffman decoding.
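Ignoring restart markers (which this decoder does not handle), the unstuffing step can be sketched as (`remove_stuffing` is a hypothetical helper):

```python
def remove_stuffing(data: bytes) -> bytes:
    # Drop the 0x00 that the encoder writes after every literal 0xFF
    # in the entropy-coded data.
    return data.replace(b'\xff\x00', b'\xff')

print(remove_stuffing(b'\x12\xff\x00\x34'))
```

The decodeSOS implementation below does the same thing with an explicit sliding loop over the bytes.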

The decoding flow:

  1. Read each component's Huffman table id and quantization table id
  2. Replace 0xFF00 with 0xFF
  3. Call BuildMatrix for each component in MCU order to decode via table lookups; with YCbCr 4:4:4 that order is Y, Cb, Cr, Y, Cb, Cr, ...
  4. Hand each decoded component to DrawMatrix to render onto the canvas img

```python
def decodeSOS(self, data):  # Baseline DCT
    ls, ns = unpack(">HB", data[0:3])
    csj = unpack("BB"*ns, data[3:3+2*ns])
    dcTableMapping = []
    acTableMapping = []
    for i in range(ns):
        dcTableMapping.append(csj[i*2+1] >> 4)
        acTableMapping.append(csj[i*2+1] & 0x0F)
    data = data[6+2*ns:]

    # Replace 0xFF00 with 0xFF
    i = 0
    while i < len(data) - 1:
        m, = unpack(">H", data[i:i+2])
        if m == 0xff00:
            data = data[:i+1] + data[i+2:]
        i = i + 1

    img = Image(self.height, self.width)
    st = Stream(data)
    oldlumdccoeff, oldCbdccoeff, oldCrdccoeff = 0, 0, 0
    for y in range(self.height//8):
        for x in range(self.width//8):
            matL, oldlumdccoeff = self.BuildMatrix(
                st, dcTableMapping[0], acTableMapping[0],
                self.quant[self.quantMapping[0]], oldlumdccoeff)
            matCb, oldCbdccoeff = self.BuildMatrix(
                st, dcTableMapping[1], acTableMapping[1],
                self.quant[self.quantMapping[1]], oldCbdccoeff)
            matCr, oldCrdccoeff = self.BuildMatrix(
                st, dcTableMapping[2], acTableMapping[2],
                self.quant[self.quantMapping[2]], oldCrdccoeff)
            img.DrawMatrix(y, x, matL.base, matCb.base, matCr.base)

    img.ycbcr2rgb()
    self.decoded_data = img.img
```

6.1.3 BuildMatrix

BuildMatrix undoes the zigzag ordering of the DC/AC coefficients and performs the inverse DCT.

Two spots in micro-jpeg-visualizer's implementation are worth a closer look:

  1. If the decoded DC code is 0, there is no need to read any additional bits. In GetBitN, passing 0 reads nothing from the Stream and returns 0, and DecodeNumber(0, 0) also returns 0 directly, so no explicit check for code == 0 is needed.

    ```python
    bits = st.GetBitN(code)
    dccoeff = DecodeNumber(code, bits) + olddccoeff
    i.base[0] = dccoeff
    ```

  2. For AC coefficients, reading ZRL (0xF0) should add 16 to the offset l and fetch the next code, but the implementation splits this into +15 and +1, again saving a branch:

    ```python
    if code > 15:
        l += code >> 4   # +15
        code = code & 0x0F
    bits = st.GetBitN(code)
    if l < 64:
        coeff = DecodeNumber(code, bits)
        i.base[l] = coeff * quant[l]
        l += 1           # +1
    ```

    To make the JPEG logic easier to follow, I deliberately rewrote it as:

    ```python
    if code == 0xF0:  # ZRL
        l += 16       # +16
        continue
    elif code > 15:
        l += code >> 4
        code = code & 0x0F
    bits = st.GetBitN(code)
    if l < 64:
        coeff = DecodeNumber(code, bits)
        i.base[l] = coeff
        l += 1
    ```

Not having noticed these two tricks, I missed a few things in my rewrite, and decoding went wrong after a few MCUs. Bit-streams are very hard to debug; in the end I only found the problem by diffing against the original code. The full implementation:

```python
class JPEG:
    def BuildMatrix(self, st, dcTableId, acTableId, quant, olddccoeff):
        i = DCT()

        code = self.huffman_tables[0b00 | dcTableId].GetCode(st)
        bits = st.GetBitN(code)
        dccoeff = DecodeNumber(code, bits) + olddccoeff
        i.base[0] = dccoeff

        l = 1
        while l < 64:
            code = self.huffman_tables[0b10 | acTableId].GetCode(st)
            if code == 0:  # EOB
                break
            if code == 0xF0:  # ZRL
                l += 16
                continue
            elif code > 15:
                l += code >> 4
                code = code & 0x0F
            bits = st.GetBitN(code)
            if l < 64:
                coeff = DecodeNumber(code, bits)
                i.base[l] = coeff
                l += 1

        i.base = np.multiply(i.base, quant)
        i.rearrange_using_zigzag()
        i.perform_IDCT()
        return i, dccoeff
```
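BuildMatrix also relies on DecodeNumber, which isn't shown above; it implements the EXTEND procedure (F.12 in the spec) that turns a magnitude category plus its additional bits into a signed coefficient. micro-jpeg-visualizer's version is essentially:

```python
def DecodeNumber(code, bits):
    # EXTEND: 'code' is the magnitude category, 'bits' the additional bits.
    # Positive values are stored as-is; negative values are stored as
    # bits - (2^code - 1).
    l = 2 ** (code - 1)
    if bits >= l:
        return bits
    return bits - (2 * l - 1)

print(DecodeNumber(3, 0b101), DecodeNumber(3, 0b010))  # 5 -5
```

With code = 0 and bits = 0 it returns 0, which is the property note 1 above depends on.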

6.1.4 DCT

This implements the material from the previous article.

```python
class DCT():
    def __init__(self):
        self.base = np.zeros(64)
        self.zigzag = np.array([
            [0, 1, 5, 6, 14, 15, 27, 28],
            [2, 4, 7, 13, 16, 26, 29, 42],
            [3, 8, 12, 17, 25, 30, 41, 43],
            [9, 11, 18, 24, 31, 40, 44, 53],
            [10, 19, 23, 32, 39, 45, 52, 54],
            [20, 22, 33, 38, 46, 51, 55, 60],
            [21, 34, 37, 47, 50, 56, 59, 61],
            [35, 36, 48, 49, 57, 58, 62, 63],
        ]).flatten()
        # Generate 2D-DCT matrix
        L = 8
        C = np.zeros((L, L))
        for k in range(L):
            for n in range(L):
                C[k, n] = np.sqrt(1/L)*np.cos(np.pi*k*(1/2+n)/L)
                if k != 0:
                    C[k, n] *= np.sqrt(2)
        self.dct = C

    def perform_DCT(self):
        self.base = np.kron(self.dct, self.dct) @ self.base

    def perform_IDCT(self):
        self.base = np.kron(self.dct.transpose(), self.dct.transpose()) @ self.base

    def rearrange_using_zigzag(self):
        newidx = np.ones(64).astype('int8')
        for i in range(64):
            newidx[list(self.zigzag).index(i)] = i
        self.base = self.base[newidx]
```
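A quick sanity check on this construction: the 8-point DCT matrix C is orthonormal, so the 64x64 Kronecker operator used by perform_DCT is inverted by its transpose, which is exactly what perform_IDCT does. A standalone check:

```python
import numpy as np

# Rebuild the same orthonormal DCT-II matrix as DCT.__init__
L = 8
C = np.zeros((L, L))
for k in range(L):
    for n in range(L):
        C[k, n] = np.sqrt(1/L) * np.cos(np.pi*k*(1/2+n)/L)
        if k != 0:
            C[k, n] *= np.sqrt(2)

D = np.kron(C, C)                 # 2-D DCT acting on a flattened 8x8 block
x = np.arange(64, dtype=float)
assert np.allclose(D.T @ (D @ x), x)     # IDCT(DCT(x)) == x
assert np.allclose(D @ D.T, np.eye(64))  # orthonormality
```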

6.1.5 Reconstruct image

We need an image whose width and height are multiples of 8 to test the decoder:

  1. Grab the 400x400 Fubuki avatar from twitter
  2. Open it in GIMP and save a copy as an optimized JPG
  3. Decode it. decode produces a list of per-channel 2-D RGB matrices, so we still need dstack to stack them into one 3-D array

    ```python
    jpeg = JPEG('sample2.jpg')
    jpeg.decode()
    img = np.dstack(tuple([i for i in jpeg.decoded_data])).astype(np.uint8)
    ```

  4. Show the result with pyplot

    ```python
    import matplotlib.pyplot as plt
    img = plt.imshow(img, interpolation='nearest')
    plt.axis('off')
    plt.savefig("output.png", bbox_inches='tight')
    ```

    ```
    real    0m5.922s
    user    0m6.220s
    sys     0m1.460s
    ```

6.1.6 Image resolution isn't a multiple of 8

When the image dimensions aren't multiples of 8, JPEG extends them to the next multiple of 8; how to extend is left to the implementation, and some implementations pad by mirroring. Why mirroring? Take an image with a black border: a hard edge transforms into widely spread frequency-domain coefficients (recall the Fourier transform of a step function), which means many distinct bits to encode, so mirrored padding is used instead.

Let's pad an 8x2 black-and-white image by repetition (edge), mirroring (reflect), and symmetry (symmetric), using numpy.pad:

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import entropy

class DCT():
    def __init__(self):
        self.base = np.zeros(64)
        self.zigzag = np.array([
            [0, 1, 5, 6, 14, 15, 27, 28],
            [2, 4, 7, 13, 16, 26, 29, 42],
            [3, 8, 12, 17, 25, 30, 41, 43],
            [9, 11, 18, 24, 31, 40, 44, 53],
            [10, 19, 23, 32, 39, 45, 52, 54],
            [20, 22, 33, 38, 46, 51, 55, 60],
            [21, 34, 37, 47, 50, 56, 59, 61],
            [35, 36, 48, 49, 57, 58, 62, 63],
        ]).flatten()
        # Generate 2D-DCT matrix
        L = 8
        C = np.zeros((L, L))
        for k in range(L):
            for n in range(L):
                C[k, n] = np.sqrt(1/L)*np.cos(np.pi*k*(1/2+n)/L)
                if k != 0:
                    C[k, n] *= np.sqrt(2)
        self.dct = C

    def perform_DCT(self):
        self.base = np.kron(self.dct, self.dct) @ self.base

quant = np.array([
    [16, 11, 10, 16, 24, 40, 51, 61],
    [12, 12, 14, 19, 26, 58, 60, 55],
    [14, 13, 16, 24, 40, 57, 69, 56],
    [14, 17, 22, 29, 51, 87, 80, 62],
    [18, 22, 37, 56, 68, 109, 103, 77],
    [24, 35, 55, 64, 81, 104, 113, 92],
    [49, 64, 78, 87, 103, 121, 120, 101],
    [72, 92, 95, 98, 112, 100, 103, 99],
])

image = np.array([
    [0, 255],
    [0, 255],
    [0, 255],
    [0, 255],
    [0, 255],
    [0, 255],
    [0, 255],
    [0, 255]
])

# Pad the 8x2 image out to 8x8
seq = ((0, 8-image.shape[0]), (0, 8-image.shape[1]))
bins = np.linspace(-128, 127, 256)
for mode in ('edge', 'reflect', 'symmetric'):
    m = DCT()
    m.base = np.pad(image, seq, mode)
    m.base = m.base.flatten()
    m.base -= 128  # level-shifting
    m.perform_DCT()
    m.base = m.base // quant.flatten()
    plt.hist(m.base, bins, alpha=0.5, label=mode)
    unique, counts = np.unique(m.base, return_counts=True)
    freq = counts/np.sum(counts)
    print(mode, entropy(freq, base=2))
plt.legend(loc='upper right')
plt.show()
```


The histograms show that reflect and symmetric produce far more concentrated coefficients, and computing the entropy of all three confirms that reflect and symmetric give a better compression ratio:

```
edge 1.7733292042478084
reflect 1.3967822215997983
symmetric 1.0960127271685762
```

7. Conclusion

This article covered how data is laid out in a JPEG file and implemented a decoder. If time permits, I will implement an encoder next; it has to handle the important markers SOI, DQT, SOF, DHT, SOS, and EOI. Since the focus is the relationship between quality factor and image difference rather than compression ratio, the Luma and Chroma Huffman tables will use Tables K.3-K.6 of ISO/IEC 10918-1:1994, and the quantization tables will use Tables K.1-K.2; the implementation references ghallak.

While decoding I also found that even though processing MCUs is not expensive, a JPEG decoder's speed is still bounded by the Huffman decoder, because the current symbol must be decoded before the start of the next symbol is known. t0rakka inserts RST markers to separate MCUs and decodes with multiple threads, with impressive results; unfortunately, not every image contains RST markers, so this approach is of limited use.

Reference