Faster LIME - HackMD

# Faster LIME 本文先簡述[LIME(low-light image enhancement)](https://arxiv.org/abs/1605.05034)演算法內容，並提供程式碼連結，之後進入***主旨***：修改方程(1)來得到可簡易實作的改版。因這篇在arxiv上的論文並沒有提供演算法細節，僅簡略帶過內容。而github上有實作LIME，因此本文提供了修改後的numpy與torch版本。 ## 一、LIME概略 LIME(Low-light image enhancement)在2016提出，基於Retinex模型，改進對光照的估計來獲得物體反射(物體本身的顏色)，主要流程如下 - 光照的初始估計 $\hat{T}$ - 精煉光照並gamma correction $T\leftarrow f(\hat{T})^\gamma$，其中f為函數 - 獲得反射$R=L/(T+\epsilon)$，其中$L,\, R,\,T$分別為輸入影像、反射與光照。在Retinex模型「精煉」基本都會與模糊化有關，LIME改進對光照的精煉，其優點是 1. 色彩失真較小 2. 不太需要微調參數。設定$\alpha=2,\,\gamma=0.5$並迭代3~5次左右，就能在大多圖片有不錯效果。 3. 雜訊和光暈不會像MSR(multi-scale retinex)明顯其效果讓一些low-light enhancement的神經網路模型也會拿來與LIME比較。而迭代法讓計算速度成為主要缺點。 1. 初始光照 $\hat{T}$ 估計取RGB的最大值： $$ \hat{T}(x,y) = \max_{c\in\{r,\ g,\ b\}} L^c(x,y) $$ 2. 透過下列方程精煉初始值 $\hat{T}$ 來得到光照 $T$。原文認為好的解需要達到「保留結構特徵」並「平滑細節紋路」，可以理解為保留物體邊緣、抹平物體表面上的紋路。 $$ \tag{1} \arg\min_T \left\|{\hat{T}-T}\right\|_{\text{F}}^2 + \alpha \left\|\textbf{W}\circ\nabla T\right\|_1 $$ 因為包含$L_1\text{-norm}$，直接使用梯度下降的效果不佳，從下方連結可知是透過增加額外參數，將方程轉換成多個子問題來輪流更新。 3. 咖瑪校正(gamma correction)光照並求得反射：$R = L / (T^\gamma+\epsilon)$ ### 程式碼 python上的實作可參考 1. [aeinrw/LIME: Implementation of the paper, "LIME: Low-Light Image Enhancement via Illumination Map Estimation", which is for my graduation thesis.](https://github.com/aeinrw/LIME) 2. [numpy版本 johnny95731/Highlight-Removal](https://github.com/johnny95731/Highlight-Removal/blob/main/new/python/lime.py) 3. [torch版本 johnny95731/Highlight-Removal](https://github.com/johnny95731/Highlight-Removal/blob/main/new/python/lime_torch.py) 後兩個來自個人github，自連結1(原版)程式碼做修改： - rfft2替代fft，在二維計算且實數傅立葉轉換後僅需一半計算量 - 原本用矩陣計算微分與頻率域拉普拉斯算子，改為直接在2D計算 - in-place operation和memory preallocation，減少記憶體操作 numpy版在同樣迭代次數下，比原版時間減少一半。光照也模糊化得比較快，只需要比較少的迭代。而torch版在cpu上比numpy版快了將近10倍。[配備](###配備) 關於模糊化的部分，是在將subproblem_t的$\mathcal{F}\{\textbf{D}^\text{T}\textbf{D}\}$（$\mathcal{F}$為傅立葉轉換）更換為頻率域的拉普拉斯算子後，產生了明顯的改變。最後的結果與光照如下。影像來自ExDark資料集 Bicycle/2015_00003.png <figure> <img src="https://hackmd.io/_uploads/Hkc3OGM6Zg.png" alt="LIME結果"> <figcaption style="text-align: center;">LIME結果。aeinrw迭代5次。johnny95731迭代3次</figcaption> </figure> <figure> <img src="https://hackmd.io/_uploads/H1mTSZza-g.png" alt="LIME結果(局部)"> <figcaption style="text-align: center;">LIME結果(局部)</figcaption> </figure> <figure> <img src="https://hackmd.io/_uploads/HkXarbza-l.png" alt="光照估計"> <figcaption style="text-align: center;">光照估計</figcaption> </figure> ## 二、修改方程提出幾項改版後，在下面展示各版本的結果。 ### 1. Screened Poisson Equation 將原始公式 (1)中$L_1\text{-norm}$項替換為$L_2\text{-norm}$，並將權重$\textbf{W}$定為常數1。則公式(1)變為下式 $$ \tag{3} \begin{split} &\arg\min_T \left\|{\hat{T}-T}\right\|_{\text{F}}^2 + \alpha \left\|\nabla T\right\|_\text{F}^2 \\ = &\arg\min_T \iint \left|{\hat{T}(x,y)-T(x,y)}\right|^2 + \sum_{d\in\{x,y\}} \left|{D_{d}T(x,y)}\right|^2 \,\text{d}x\text{d}y \end{split} $$ (3)式對應的PDE為下列方程式，即screened Poisson equation的變形 $$ \text{T}-\alpha\Delta\text{T}=\hat{\text{T}} \ \ \text{or}\ \ (1-\alpha\Delta)\text{T} = \hat{\text{T}} $$ 經過傅立葉轉換$\mathcal{F}$之後，很簡單地就能求出光照$T=\mathcal{F}^{-1}\left\{\mathcal{F}\{\hat{T}\} / \mathcal{F}\{1-\alpha\Delta\}\right\}$。關於拉普拉斯算子，見[補充—拉普拉斯算子與傅立葉轉換](###拉普拉斯算子與傅立葉轉換)。 ### 2. 等價於Butterworth 透過傅立葉轉換[公式表](https://en.wikipedia.org/wiki/Fourier_transform#Tables_of_important_Fourier_transforms)可以得到 $$ \frac{1}{\mathcal{F}\{(1-\alpha\Delta)\}(u,v)}= \frac{1}{1 + 4\alpha\pi^2(u^2+v^2)} $$ 這等價於1/2階、cutoff frequency=$1/(4\alpha\pi^2)$的Butterworth lowpass filter。因此方程(3)的解實際上是對$\hat{\text{T}}$做Butterworth低通濾波。因此可以直接使用Butterworth濾波器並調整order與cuttoff frequency。 ### 3. 彌補L1與L2差異梯度由L1-norm變成L2-norm，數值變成平方增長，因此最小化能量時，對邊緣會有較強的抑制=平滑效果，這與原文認為的保留邊緣不同。如下所示，對T的梯度加上平滑化，以此來減少argmin對邊緣的懲罰。 $$ \tag{4} \arg\min_T \left\|{\hat{T}-T}\right\|_{\text{F}}^2 + \alpha \left\|G\ast\nabla T\right\|_\text{F}^2 $$ 如果G是Hermitian function $G(-x)=G^*(x)$，例如高斯函數，則上面對應的Euler-Lagrange equation為([參考](https://math.stackexchange.com/questions/1066495/minimization-of-variational-total-variation-tv-deblurring)) $$ T-\alpha (G*G^*)\ast\Delta T = \hat{T} $$ ### 4. 替換拉普拉斯算子 $-\mathcal{F}\{\Delta\}$是一種高通濾波器，因此可以嘗試用別的高通濾波器代替。 ### 結果 LIME iter=3: LIME演算法，迭代3次、alpha=2 screened: 使用方程(3)，alpha=2 Butterworth: 對初始光照使用巴特沃斯低通濾波，alpha=2、oreder=7 screened+blur: 使用方程(4)，alpha=2、sigma=2pi sharpening: 使用高斯高通濾波代替拉普拉斯算子，alpha=2、sigma=10 gamma皆為0.5 下方表格的時間為15次的平均值，在cpu上執行改良後的程式碼可以到[gist](https://gist.github.com/johnny95731/2faf668ed022c4e72ae5dbd3d75f78f7 )，以及額外需要的[頻率域濾波器](https://github.com/johnny95731/torch-imagetools/blob/master/src/imgtools/filters/rfft.py)。 <figure> <img src="https://hackmd.io/_uploads/SkoewowTZg.jpg" alt="結果1"> <figcaption style="text-align: center;">結果1。寬高=(350, 262)</figcaption> </figure> <figure> <img src="https://hackmd.io/_uploads/HJTJm2DaWg.jpg" alt="結果1的光照"> <figcaption style="text-align: center;">結果1的光照。</figcaption> </figure> 方法 | LIME iter=3 | screened | Butterworth | screened+blur | sharpening ----|-------------|----------|-------------|---------------|------------ 時間 | 0.00508 | 0.00132 | **0.00129** | 0.00176 | 0.00135 <figure> <img src="https://hackmd.io/_uploads/HkjgPiP6We.jpg" alt="結果2"> <figcaption style="text-align: center;">結果2。寬高=(2800, 1248)</figcaption> </figure> 方法 | LIME iter=3 | screened | Butterworth | screened+blur | sharpening ----|-------------|----------|-------------|---------------|------------ 時間 | 0.10508 | 0.01364 | **0.01332** | 0.02286 | 0.01532 <figure> <img src="https://hackmd.io/_uploads/Hy2xPjw6-e.jpg" alt="結果3"> <figcaption style="text-align: center;">結果3。寬高=(5600, 4200)</figcaption> </figure> 方法 | LIME iter=3 | screened | Butterworth | screened+blur | sharpening ----|-------------|----------|-------------|---------------|------------ 時間 | 0.93937 | 0.13060 | 0.13167 | 0.18815 | **0.13005** #### 配備 CPU: AMD Ryzen9 7945HX Memory: 16GB*2 ## 補充 ### 拉普拉斯算子與傅立葉轉換先說結果，下面提到的(cl)與(dl5)對LIME結果影響不大，至少90%的像素差異<1(影像在[0, 255]範圍內)；而(dl9)擴散較快，但陰影區域的中央可能會出現高強度雜訊。文章內皆為使用cl。 #### 座標關係下方提到的頻率$(u,v)\in[-1,1]^2$，程式上可用`fft.fftfreq(n, d=1.0)`或`torch.fft.fftfreq(n, d=1.0)`產生，其中n相當於原始圖片的長或寬。詳細參考下方 ```python img = cv2.imread(path) # (H, W, C) img_f = np.fft.rfft2(img) u_coor = np.fft.rfftfreq(img.shape[1]) v_coor = np.fft.fftfreq(img.shape[0]) ``` #### 不同公式頻率域的拉普拉斯算子為 $$ \tag{cl} \mathcal{F}\{\Delta\}(u,v)=-4\pi^2(u^2+v^2) $$ 而影像處理常用的5-point stencil拉普拉斯捲積核為 $$ \begin{bmatrix} 0 &1 &0 \\\ 1 &-4 &1\\\ 0 &1 &0 \end{bmatrix} $$ 其與影像L捲積後的結果可表示為 $$ L_{\text{laplace}}(x,y)=L(x+1,y)+L(x-1,y)+L(x,y+1)+L(x,y-1)-4L(x,y) $$ 傅立葉轉換後得到 $$ \tag{dl5} H_{\text{dl5}}(u,v) = 2\cos{(2\pi u)} + 2\cos{(2\pi v)}-4 $$ 另外，9-point stencil拉普拉斯卷積核如下，這是影像處理常用、旋轉對稱的版本。有興趣的話，在[9-point stencil](https://en.wikipedia.org/wiki/Nine-point_stencil)的維基可以找到數值微分的版本。 $$ \tag{dl9} \begin{bmatrix} 1 &1 &1 \\\ 1 &-8 &1\\\ 1 &1 &1 \end{bmatrix} $$ (cl)和(dl5)與9-point stencil (dl9)在頻率域處理的結果如下。 <figure> <img src="https://hackmd.io/_uploads/BkxOT1Sa-x.png" alt="Laplacian 1"> <figcaption style="text-align: center;">頻率域拉普拉斯1</figcaption> </figure> <figure> <img src="https://hackmd.io/_uploads/ryZOp1Bpbl.png" alt="Laplacian 2"> <figcaption style="text-align: center;">頻率域拉普拉斯2</figcaption> </figure> 在Gonzalez與Wood的數位影像處理中也有提到，頻率域使用拉普拉斯做sharpening比空間域銳利，但在LIME上(cl)與(dl5)似乎沒什麼影響，而(dl9)容易產生明顯雜訊。 ## 後記數學的部分只列出能量泛函與對應的PDE，不過變分法是看線上課程自學的，不算很有自信沒問題。不過寫成頻率域濾波器之後還蠻直觀的，並且程式上執行是效果不錯。一開始只想了替換成L2-norm的形式(方程(3))，不過內容偏少，就嘗試加點東西，刪除掉一些實質上等價的形式，或是效果不佳的內容，最後就是這些。加上捲積後的方程(4)，直覺上可以運用在去雜訊，Perona-Malik equation與shock filter也是要一直迭代，改良版計算量增加又實作複雜。未來考慮提一些Python加速，以及我以PyTorch實作的影像處理[repo](https://github.com/johnny95731/torch-imagetools)。

Syntax	Example	Reference
# Header	Header	基本排版
- Unordered List	Unordered List
1. Ordered List	Ordered List
- [ ] Todo List	Todo List
> Blockquote	Blockquote
Bold font	Bold font
Italics font	Italics font
~~Strikethrough~~	~~Strikethrough~~
19^th^	19^th
H~2~O	H₂O
++Inserted text++	Inserted text
==Marked text==	Marked text
[link text](https:// "title")	Link
![image alt](https:// "title")	Image
`Code`	`Code`	在筆記中貼入程式碼
```javascript var i = 0; ```	`var i = 0;`
:smile:		Emoji list
{%youtube youtube_id %}	Externals
$L^aT_eX$	L^aT_eX
:::info This is a alert area. :::	This is a alert area.