# 23: Super-resolution
From Deep Learning Foundations to Stable Diffusion
Main topic: deep learning models for image processing (applications / improvements)
## FID
First, he talked about a bug in the FID code in the chapter 23 notebook ~~([which took me forever to find](https://github.com/fastai/course22p2))~~
> What is the Fréchet Inception Distance? A key metric for evaluating the quality and diversity of images a model generates: the similarity between generated and real images in feature space.
> feature extraction > statistics > Fréchet distance computation
> Lower is better; if == 0, the generated and real image distributions are identical ~ perfect ~
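For reference, the standard formula behind that distance (not spelled out in my notes): fit a Gaussian to the Inception features of the real images and of the generated images, then compare the two Gaussians.

$$\mathrm{FID} = \lVert \mu_r - \mu_g \rVert_2^2 + \mathrm{Tr}\!\left(\Sigma_r + \Sigma_g - 2\,(\Sigma_r \Sigma_g)^{1/2}\right)$$

where $\mu_r, \Sigma_r$ are the mean and covariance of the real-image features and $\mu_g, \Sigma_g$ those of the generated ones.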
```
BUG:
the cos-schedule model's range is -0.5 to +0.5
sampling was applied x2
** the dependent variable is just "noise"
```
## U-Nets for super-resolution
Then on to Tiny Imagenet.
```
Fashion MNIST: the largest training images are only 28x28
Tiny Imagenet: 64x64 [downside: hard to find; only the Stanford copy turns up]
```
```
shutil <- used to unpack the archive ~
```
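A minimal sketch of that step, assuming the archive has already been downloaded (the paths here are illustrative, not from the lecture):

```
from pathlib import Path
import shutil

path = Path('data/tiny-imagenet-200')          # illustrative target directory
zipped = Path('data/tiny-imagenet-200.zip')    # illustrative downloaded archive
if not path.exists():
    # unpack_archive picks the right extractor from the file extension
    shutil.unpack_archive(str(zipped), 'data')
```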
```
# STEP 1: create the dataset
from pathlib import Path
from glob import glob

class TinyDS:
    def __init__(self, path):
        self.path = Path(path)
        # careful: recursive=True is needed for '**' to match subdirectories
        self.files = glob(str(self.path/'**/*.JPEG'), recursive=True)
    def __len__(self): return len(self.files)
    # the label is the grandparent directory name: train/<class>/images/<file>.JPEG
    def __getitem__(self, i): return self.files[i], Path(self.files[i]).parent.parent.name

tds = TinyDS(path/'train')
```
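Quick sanity check; the count is the known size of the Tiny Imagenet train split, the sample path is illustrative:

```
print(len(tds))   # 100000 (200 classes x 500 training images)
print(tds[0])     # e.g. ('.../train/n01443537/images/n01443537_0.JPEG', 'n01443537')
```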

For val, the labels are stored separately, unlike the train folder layout above.
> the Shift+Tab tooltip trick ~
### data augmentation
random resize crop:
pick one area inside the image and zoom into it
but Tiny Imagenet is too small / too fine-grained for that; the results are too poor
=> blurry

Cats are very cute, any cat gets a like, but the images were so blurry I thought I'd taken my glasses off.
So instead: pad around the images,
then take a random 64x64 crop from the padded result,
using a random array
then apply them with batch transforms, passing in those transforms
(nn.Sequential: calls each one in a row)
~~(I don't think it's anything that magical either...)~~
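A sketch of those batch transforms, assuming torchvision's tensor transforms (the padding amount is my guess, and the course uses its own callback to run this on each batch):

```
import torch.nn as nn
import torchvision.transforms as T

# pad, then take a random 64x64 crop, then maybe flip;
# nn.Sequential simply calls each transform in a row on the batch tensor
augs = nn.Sequential(
    T.Pad(4, padding_mode='reflect'),   # padding amount assumed
    T.RandomCrop(64),
    T.RandomHorizontalFlip(),
)
```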
get_dropmodel <= just reuse the one from before
> at the batch level, augmentation isn't necessarily applied every time
> it costs time, and it blurs the images

Train with AdamW and mixed precision.
Run the learning rate finder.
Trained it for 25 epochs.
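A plain-PyTorch sketch of that setup (the course wraps this in its Learner/callbacks; `model`, `train_dl` and `loss_fn` are assumed to exist, and the learning rate is whatever the LR finder suggested):

```
import torch

opt = torch.optim.AdamW(model.parameters(), lr=1e-2)  # lr from the LR finder
scaler = torch.cuda.amp.GradScaler()

for epoch in range(25):
    for xb, yb in train_dl:
        with torch.autocast('cuda'):       # mixed-precision forward pass
            loss = loss_fn(model(xb), yb)
        scaler.scale(loss).backward()      # scale to avoid fp16 underflow
        scaler.step(opt)
        scaler.update()
        opt.zero_grad()
```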

59.3%
## How can we do better?
Read the papers: how to get from 60% > 70%

Use a real ResNet ~~([I watched this video](https://youtu.be/o_3mboe1jYI?si=Oq3WPSWIOQRcHv7a))~~


```
1. identity path: no extra parameters
2. 1x1 convolution (x2) where the shapes change
3. halve the size (=> 32x32), using a stride of 2
```
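A minimal res block along those lines (standard PyTorch, not necessarily the course's exact `ResBlock`): the identity path adds no parameters, a 1x1 conv steps in only when the shapes change, and a stride of 2 halves the resolution.

```
import torch.nn as nn

def conv(ni, nf, ks=3, stride=1, act=True):
    layers = [nn.Conv2d(ni, nf, ks, stride=stride, padding=ks//2),
              nn.BatchNorm2d(nf)]
    if act: layers.append(nn.ReLU())
    return nn.Sequential(*layers)

class ResBlock(nn.Module):
    def __init__(self, ni, nf, stride=1):
        super().__init__()
        self.convs = nn.Sequential(conv(ni, nf, stride=stride),
                                   conv(nf, nf, act=False))
        # no extra parameters when shapes match; otherwise a 1x1 conv
        self.idconv = (nn.Identity() if ni == nf and stride == 1
                       else nn.Conv2d(ni, nf, 1, stride=stride))
        self.act = nn.ReLU()
    def forward(self, x):
        return self.act(self.convs(x) + self.idconv(x))
```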


5 downsamples!
3+2+2+1+1 = 9 res blocks
compared with before: more than double

61.8% now
## More augmentation: trivial augment
[paper](https://arxiv.org/pdf/2103.10158)

It didn't perform that well on this dataset,
so apply it one at a time (per image?! per item?!)

(do the augmentation first, then convert to tensor, then normalize)
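A sketch of that per-item pipeline using torchvision's built-in version of trivial augment (the course builds its own item transforms; the normalization stats here are placeholders):

```
import torchvision.transforms as T

item_tfms = T.Compose([
    T.TrivialAugmentWide(),   # augment while still a PIL image
    T.ToTensor(),             # then convert to tensor
    T.Normalize(mean=[0.48, 0.45, 0.40], std=[0.28, 0.27, 0.27]),  # placeholder stats
])
```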
conv -> normalization -> activation (optional)

64.9%

## 25: Super-resolution, not classification
the independent variable will be scaled down to 32x32 pixels
dependent variable: the original image
do a random crop within the padded image plus random flips: the cropping and flipping need to be exactly the same for the independent and dependent variables
The hard version of super-resolution can replace the deleted pixels with better ones.
Key point: the dataset can be simpler here; no need to load labels.
train and validation are no different ~
TfmDS: only applied to the independent var
64x64 -> 32x32 (each 2x2 block of pixels -> one) -> 64
if the quality is good: 32 > 64
or even -> 128, etc.
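A sketch of how the pairs could be built: average 2x2 pixel blocks to get the 32x32 input and keep the original as the target (the function name is mine, not the course's):

```
import torch.nn.functional as F

def to_superres(yb):
    # independent: 64x64 -> 32x32, each 2x2 pixel block averaged into one
    xb = F.avg_pool2d(yb, 2)
    return xb, yb   # dependent: the untouched original image
```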
## Denoising autoencoder (Review: ch8)


[Unet](https://arxiv.org/abs/1505.04597)!
Developed in 2015, originally for medical imaging.

32x32 pixels at the lowest resolution
(the -2 magic)
-> downsampling
^ upsampling
Question: nn.ModuleList?! It doesn't do anything?!
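On that question: `nn.ModuleList` really has no forward of its own; it only registers the submodules so their parameters are tracked. You iterate it yourself, which is exactly what lets a U-Net stash the down-path activations and concatenate them on the way back up. A minimal sketch (the down/mid/up blocks are assumed to have matching shapes):

```
import torch
import torch.nn as nn

class TinyUNet(nn.Module):
    def __init__(self, downs, mid, ups):
        super().__init__()
        # ModuleList registers parameters but does nothing in forward by itself
        self.downs, self.mid, self.ups = nn.ModuleList(downs), mid, nn.ModuleList(ups)
    def forward(self, x):
        saved = []
        for d in self.downs:   # downsampling path
            x = d(x)
            saved.append(x)    # keep each activation for the skip connection
        x = self.mid(x)
        for u in self.ups:     # upsampling path
            x = u(torch.cat([x, saved.pop()], dim=1))  # concat the matching skip
        return x
```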
## Unet

8.6% loss: that really is a lot lower
input:
output:
no difference to the naked eye ~~my sinful eyes~~

This one you can tell apart! But this is t (the target)
## Perceptual loss
Operation rescue-my-eyes ~
traditional: pixel loss: compares pixel values: comes out blurry (the safer option)
perceptual loss: an impression / a concept / a rough sketch
requirement: a classifier model

~~Or is it my glasses that are broken?~~
scaling factor: 0.1x
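A sketch of the idea (not the course's exact code): run both output and target through a frozen classifier truncated at some intermediate layer, compare the activations, and scale that term by 0.1 so it doesn't swamp the pixel loss.

```
import torch.nn.functional as F

class FeatureLoss:
    def __init__(self, feat_model):
        # feat_model: a pretrained classifier cut off at an intermediate layer
        self.feat_model = feat_model.eval()
        for p in self.feat_model.parameters(): p.requires_grad_(False)
    def __call__(self, pred, targ):
        pixel = F.mse_loss(pred, targ)
        percep = F.mse_loss(self.feat_model(pred), self.feat_model(targ))
        return pixel + 0.1 * percep   # the 0.1 scaling factor from above
```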



training
fast.ai's favourite trick:
gradually unfreezing pre-trained networks
use the actual pretrained weights for training and sampling
turn off requires_grad
trick: with random weights, set everything to unfrozen and do 20 epochs (see the sketch below)
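A sketch of the freeze/unfreeze schedule, assuming the U-Net's down path was initialised from the pretrained classifier (attribute names and epoch counts are illustrative; `learn.fit` is the course-style Learner):

```
# phase 1: freeze the pretrained down path, train only the new layers
for p in unet.downs.parameters(): p.requires_grad_(False)
learn.fit(1)

# phase 2: unfreeze everything and keep training
for p in unet.downs.parameters(): p.requires_grad_(True)
learn.fit(20)
```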
==============================
However, I still didn't get it working ~

Not waiting any longer ~