# BDSE33 - Deep Learning
# Lecture Notes and Code Downloads
* [Lecture notes](https://www.dropbox.com/scl/fi/ogxq336e9qmqazndl7tlq/BDSE_v2024.02.pdf?rlkey=p6g1kqut1lbbzytrkzhxf7q6f&dl=1)
* [Datasets and code](https://www.dropbox.com/scl/fi/d13don6ueknj220thx9g8/BDSE_v2024.02_data_and_codes.zip?rlkey=p1fvloqlfe1anknqikirrzkaj&dl=1)
---
## 20240219 Review
* VS Code setup: linter, formatter
* Loss: MSE
    * Mean Square Error:
        * Given the i-th sample $(x^{(i)}, y^{(i)})$ and the model's prediction $\hat{y}^{(i)}$ for $x^{(i)}$, define the i-th sample's error as $l^{(i)} := (y^{(i)} - \hat{y}^{(i)})^2$.
        * Accounting for all samples, the overall loss is $L = \frac{1}{N}\sum_{i=1}^N l^{(i)} := \frac{1}{N} \sum_{i=1}^N (y^{(i)} - \hat{y}^{(i)})^2$ (checked numerically below).
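A quick numeric check of the MSE formula (a minimal sketch; the values below are made up):
```python
import torch
import torch.nn.functional as F

y_true = torch.tensor([3.0, -0.5, 2.0])  # made-up targets
y_pred = torch.tensor([2.5, 0.0, 2.0])   # made-up predictions

# Per-sample errors l^(i) = (y^(i) - yhat^(i))^2, averaged over the N samples.
manual_mse = ((y_true - y_pred) ** 2).mean()
print(manual_mse)                  # tensor(0.1667)
print(F.mse_loss(y_pred, y_true))  # same value via the built-in loss
```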
* Optimizer: GD
    * What is a gradient? Informally, its direction is the steepest ascent direction at the point where you stand, and its magnitude is the slope at that point.
    * Update rule: $w := w + \Delta w$. For gradient descent, $\Delta w := -\eta \frac{\partial L}{\partial w}$.
    * Jargon:
        * Gradient: "the gradient of w" means $\frac{\partial L}{\partial w}$.
        * Learning rate: an optimizer hyperparameter. Too large or too small can both hurt; it needs tuning or careful scheduling.
        * Batch size (BS): each time the optimizer updates the model weights w, it needs an estimate of the gradient with respect to w.
            * Full gradient: $\frac{1}{N} \sum_{i=1}^N \frac{\partial (y^{(i)} - \hat{y}^{(i)})^2}{\partial w}$.
            * Approximate gradient: $\frac{1}{BS} \sum_{i=1}^{BS} \frac{\partial (y^{(i)} - \hat{y}^{(i)})^2}{\partial w}$ (shuffle the sample indices, then draw BS samples to estimate the gradient).
        * Stochastic gradient descent (SGD): estimate the gradient from a randomly drawn mini-batch of BS samples. The estimate is less accurate, but you avoid computing over the entire dataset, which is far more practical (otherwise you easily run out of memory). See the sketch after this list.
        * Iteration/step: "my model trained for 100 iterations/steps yesterday" means gradient descent ran 100 times.
        * Epoch: "my model trained for 100 epochs yesterday" means training passed over the entire dataset 100 times.
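A minimal sketch of one SGD iteration for a linear model under MSE (the toy relation `y = 2x` and all hyperparameter values are illustrative assumptions):
```python
import torch

N, BS, lr = 100, 16, 0.1
x = torch.randn(N, 1)  # N samples, 1 feature
y = 2.0 * x            # toy targets: the true weight is 2

w = torch.zeros(1, requires_grad=True)  # the single weight to learn

# One iteration/step = one weight update:
idx = torch.randperm(N)[:BS]           # shuffle indices, draw BS samples
y_hat = x[idx] * w                     # forward pass of the linear model
loss = ((y[idx] - y_hat) ** 2).mean()  # (1/BS) * sum of squared sample errors
loss.backward()                        # w.grad now holds the mini-batch gradient estimate

with torch.no_grad():
    w -= lr * w.grad  # gradient descent: w := w - eta * dL/dw
    w.grad.zero_()
```
Each pass through the block above is one iteration/step; enough passes to cover every sample once make one epoch.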
* Metrics:
    * Precision
        * Search engines: whatever is predicted 1 had better truly be 1. $\frac{TP}{TP+FP}$. Better to miss than to mislabel.
    * Recall
        * Bank fraud: whatever is truly 1 had better be predicted 1. $\frac{TP}{TP+FN}$. Better to mislabel than to miss. (Both metrics are computed in the sketch below.)
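A minimal sketch computing both metrics from made-up labels and predictions:
```python
import torch

y_true = torch.tensor([1, 0, 1, 1, 0, 1])  # made-up labels
y_pred = torch.tensor([1, 0, 0, 1, 1, 1])  # made-up predictions

tp = ((y_pred == 1) & (y_true == 1)).sum()  # predicted 1, truly 1
fp = ((y_pred == 1) & (y_true == 0)).sum()  # predicted 1, truly 0
fn = ((y_pred == 0) & (y_true == 1)).sum()  # predicted 0, truly 1

precision = tp / (tp + fp)  # of everything predicted 1, how much truly is 1
recall = tp / (tp + fn)     # of everything truly 1, how much did we catch
print(precision, recall)    # 3/4 and 3/4 for this toy data
```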
## 20240224 Review
* Cross Entropy
    * The i-th sample's error: $l^{(i)} = -\left(y^{(i)}\log\phi + (1-y^{(i)})\log(1-\phi)\right)$
    * Where does this come from? From maximizing the likelihood that each sample follows a Bernoulli distribution (a distribution over $0$ and $1$ only: if the probability that $y$ is $1$ is $\phi$, the probability that $y$ is $0$ is $1-\phi$). See the sketch below.
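A numeric check of the formula against PyTorch's built-in `F.binary_cross_entropy` (a minimal sketch; the labels and probabilities are made up):
```python
import torch
import torch.nn.functional as F

y = torch.tensor([1.0, 0.0, 1.0])    # made-up labels
phi = torch.tensor([0.9, 0.2, 0.6])  # made-up predicted probabilities

# l^(i) = -(y log(phi) + (1-y) log(1-phi)), averaged over the samples
manual = -(y * phi.log() + (1 - y) * (1 - phi).log()).mean()
print(manual)
print(F.binary_cross_entropy(phi, y))  # same value
```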
* Backpropagation
    * Uses the chain rule to propagate the error backward, yielding the gradient of every weight.
    * Process:
        * Forward + Backward
        * Given:
            * model
                * input shape = `[BS, num_features]`
                * output shape = `[BS, num_classes]` or `[BS, num_regression_target]`
            * `loss_func`: e.g., MSE or CE
            * `optimizer`: e.g., SGD
            * x (shape = `[BS, num_features]`)
            * y_true (shape = `[BS,]`)
        * After the forward pass, the error (gradient) is obtained at each neuron of the final layer. The error is then propagated backward through every neuron in every layer, and each neuron tells the weights connected in front of it how much they should be updated. The snippet below walks through these steps; a self-contained runnable version follows it.
```python
optimizer.zero_grad()  # clear gradients left over from the previous step
y_pred = model(x)  # Forward
loss = loss_func(y_pred, y_true)  # get loss (PyTorch losses take (prediction, target))
loss.backward()  # Backward: propagate the error so every weight gets its gradient
optimizer.step()  # apply the optimizer update, e.g., gradient descent
```
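The same steps, made self-contained and runnable (a sketch only: the toy data, the `nn.Linear` stand-in model, and the hyperparameters are all illustrative assumptions):
```python
import torch
from torch import nn

# Made-up regression data: num_features=4, one regression target.
x = torch.randn(64, 4)       # [BS, num_features]
y_true = torch.randn(64, 1)  # [BS, 1]

model = nn.Linear(4, 1)   # stand-in for a real model
loss_func = nn.MSELoss()  # could also be a CE loss for classification
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

for step in range(100):  # 100 iterations/steps
    optimizer.zero_grad()             # clear gradients from the previous step
    y_pred = model(x)                 # Forward
    loss = loss_func(y_pred, y_true)  # get loss
    loss.backward()                   # Backward: every weight receives its gradient
    optimizer.step()                  # apply the optimizer update (SGD here)
```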
## 20240225 To Do
- [x] ReLU
- [x] CNN Intro.
- [x] CNN Building blocks Construction: ResNet, DenseNet
- [x] Adaptive learning optimizers, BatchNorm
## 20240225 Class Supplements
* GeForce GPU FP32 compute throughput (TFLOPS): https://en.wikipedia.org/wiki/GeForce_40_series
* I/O
    * Conv2D input: `[BS, C, H, W]`
    * Linear input: `[BS, num_features]`
    * LSTM input: `[BS, num_time_steps, num_features]`
    * Softmax regressor over 10 classes: `[BS, 28, 28]` -> flatten -> `[BS, 784]` -> Linear -> `[BS, 10]` (sketched below)
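A sketch of the shape flow in the last item above (the 28x28 input mimics an MNIST image; batch size 32 is arbitrary):
```python
import torch
from torch import nn

x = torch.randn(32, 28, 28)     # [BS, 28, 28]
x = x.flatten(start_dim=1)      # -> [BS, 784]
logits = nn.Linear(784, 10)(x)  # -> [BS, 10], one logit per class
print(logits.shape)             # torch.Size([32, 10])
```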
* Don't let a class grow too large or hold too many things; it becomes hard to maintain (https://refactoring.guru/smells/large-class)
## 20240224 Class Supplements
* Check whether the environment has an NVIDIA GPU or an Apple M1/M2/M3 GPU
```python
import torch

if __name__ == "__main__":
    print(torch.cuda.is_available())  # True if an NVIDIA GPU is usable
    print(torch.backends.mps.is_available())  # True on Apple Silicon (MPS backend)
```
* Upgrade a package
```bash
pip3 install pandas --upgrade
```
* Derivatives
    Graph:
    `x -> f(x) -> out`
    What happens when `out.backward()` is called?
    For every gradient-carrying tensor that precedes `out` in the computational graph (typically the model weights), the derivative of `out` with respect to that tensor is computed.
    In the graph above, `out` is differentiated with respect to `x`, i.e., the gradient of `x` is computed: $\frac{\partial out}{\partial x}$ (if `x` here denotes a weight, then $\frac{\partial out}{\partial x}$ is the gradient of the weight `x`).
* Multivariable derivatives (verified in the sketch below)
    $\frac{\partial (x_1 + 2x_2)^2}{\partial x_1} = 2(x_1 + 2x_2)\frac{\partial (x_1 + 2x_2)}{\partial x_1} = 2(x_1 + 2x_2)$
    $\frac{\partial (x_1 + 2x_2)^2}{\partial x_2} = 2(x_1 + 2x_2)\frac{\partial (x_1 + 2x_2)}{\partial x_2} = 2(x_1 + 2x_2)\cdot 2 = 4(x_1 + 2x_2)$
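Both derivatives can be verified with autograd (a minimal sketch; the values of `x1` and `x2` are arbitrary):
```python
import torch

x1 = torch.tensor(1.0, requires_grad=True)
x2 = torch.tensor(3.0, requires_grad=True)

out = (x1 + 2 * x2) ** 2
out.backward()  # computes d(out)/d(x1) and d(out)/d(x2)

print(x1.grad)  # 2 * (x1 + 2*x2) = 14
print(x2.grad)  # 4 * (x1 + 2*x2) = 28
```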
* Penalty terms (L1 and L2) (see the sketch after this list)
    * L1 (Lasso): $\tilde{L} = L + \lambda \sum_i |w_i|$
    * L2 (Ridge): $\tilde{L} = L + \lambda \sum_i (w_i)^2$
    * If the penalty is too strong ($L \ll \lambda \sum_i |w_i|$), the main objective becomes a mere perturbation, and the effective objective is approximately: adjust the weights to minimize the penalty loss alone.
        * That is, find a set $\{w_1, w_2, ..., w_N\}$ that minimizes $\lambda \sum_i |w_i|$,
        * which is exactly $w_1 = w_2 = ... = w_N = 0$.
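A sketch of adding an L1 penalty to a loss in PyTorch (the model, data, and $\lambda$ below are illustrative). For L2, note that the `weight_decay` argument of `torch.optim.SGD` applies an equivalent penalty through the gradient.
```python
import torch
import torch.nn.functional as F
from torch import nn

model = nn.Linear(4, 1)  # illustrative model
x, y = torch.randn(8, 4), torch.randn(8, 1)  # made-up data
lam = 1e-3  # penalty strength lambda (illustrative)

mse = F.mse_loss(model(x), y)  # the original loss L
l1 = sum(p.abs().sum() for p in model.parameters())  # sum_i |w_i|
loss = mse + lam * l1  # L~ = L + lambda * sum_i |w_i|
loss.backward()
```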
## 20240220 Class Supplements
#### GPU Cores
* FP16 (half precision)
* FP32 `Core` (single precision) (AI training)
* FP64 `Core` (double precision) (scientific computing)
* INT8 `Core` (AI inference, edge computing)
* Tensor `Core` (FP16+FP32 mixed): performs FMA, a fused multiply-add in one operation (AI training). See the sketch below.
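Tensor Cores are typically exercised from PyTorch via automatic mixed precision. A minimal sketch (assumes an NVIDIA GPU is available; the model and data are illustrative):
```python
import torch
from torch import nn

device = "cuda"  # assumes an NVIDIA GPU is present
model = nn.Linear(512, 512).to(device)  # illustrative model
opt = torch.optim.SGD(model.parameters(), lr=0.01)
scaler = torch.cuda.amp.GradScaler()  # rescales the loss so FP16 gradients don't underflow

x = torch.randn(64, 512, device=device)
y = torch.randn(64, 512, device=device)

opt.zero_grad()
with torch.autocast(device_type="cuda", dtype=torch.float16):
    # Eligible ops (e.g., matmuls) run in FP16, engaging Tensor Cores where available.
    loss = nn.functional.mse_loss(model(x), y)
scaler.scale(loss).backward()  # backward on the scaled loss
scaler.step(opt)  # unscales the gradients, then runs the optimizer step
scaler.update()
```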
## 20240219 Class Supplements
* PyTorch Docker images:
* NGC: https://catalog.ngc.nvidia.com/orgs/nvidia/containers/pytorch
* DockerHub: https://hub.docker.com/r/pytorch/pytorch/tags
---
# Python Coding Style
* Linter: `pylance`
* Formatter: `black`, `isort`
* Configure Black (format on save)
![Untitled.png](https://hackmd.io/_uploads/S17227Q7a.png)
* Configure isort (sort on save)
Add the following to `settings.json`:
```json
"editor.codeActionsOnSave": {
"source.organizeImports": true
}
```
![Untitled2.png](https://hackmd.io/_uploads/rkOLAmX7p.png)
* Improve your code's readability
    * Make good use of docstrings and type hints (example below)
    * Follow, e.g., [Google's Python Style Guide](https://google.github.io/styleguide/pyguide.html)
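For instance, a small function documented in the Google style with type hints (an illustrative sketch):
```python
def moving_average(values: list[float], window: int) -> list[float]:
    """Computes the simple moving average of a sequence.

    Args:
        values: The input sequence of numbers.
        window: The number of trailing values to average over.

    Returns:
        One average per position, starting from index `window - 1`.
    """
    return [
        sum(values[i - window + 1 : i + 1]) / window
        for i in range(window - 1, len(values))
    ]
```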
* Useful VS Code extensions?
    * `even better toml`, `reload`, `remote ssh`, `Docker`, ...
* Jupyter notebook:
    * Search the settings for "notebook format on save" and check it
* Type hints:
    * Search the settings for "type hint", then under Python › Analysis › Inlay Hints check "Function Return Types"
---
# GPU Environment Setup
* CPU vs. GPU
    * CPU: high clock speed, few cores
    * GPU: low clock speed, many cores
* NVIDIA GPU ecosystem
![NVIDIA GPU ecosystem](https://hackmd.io/_uploads/rkRZmM77a.png)
* Possible setup approaches
    * Using Docker
        1. Linux → NVIDIA GPU Driver → Docker → launch a Docker image of a framework such as [TensorFlow](https://catalog.ngc.nvidia.com/orgs/nvidia/containers/tensorflow) or [PyTorch](https://catalog.ngc.nvidia.com/orgs/nvidia/containers/pytorch)
        2. Windows → NVIDIA GPU Driver → [WSL2](https://learn.microsoft.com/en-us/windows/wsl/install) → Docker → launch a Docker image of a framework such as [TensorFlow](https://catalog.ngc.nvidia.com/orgs/nvidia/containers/tensorflow) or [PyTorch](https://catalog.ngc.nvidia.com/orgs/nvidia/containers/pytorch)
    * Installing directly on the system
        * Windows → NVIDIA Driver (GPU driver) → CUDA (NVIDIA's GPU computing toolkit) → cuDNN (NVIDIA's neural-network library) → TensorFlow.
## GPU Packages - Installing Directly on Windows
Installation order: NVIDIA Driver (GPU driver) → CUDA (NVIDIA's GPU computing toolkit) → cuDNN (NVIDIA's neural-network library) → TensorFlow.
0. Install the [NVIDIA Driver](https://www.nvidia.com/Download/index.aspx)
(This step can be skipped; the classroom machines should already have the NVIDIA driver installed.)
1. Run `NVSMI` (NVIDIA System Management Interface)
Open a terminal (e.g., Git Bash or the Windows Command Prompt), type `nvidia-smi`, and press Enter to display GPU utilization and related information. If this runs successfully, the driver is installed correctly.
2. Install [CUDA v11.6.1](https://developer.nvidia.com/cuda-toolkit). [[Download link]](https://developer.download.nvidia.com/compute/cuda/11.6.1/local_installers/cuda_11.6.1_511.65_windows.exe)
3. Install [cuDNN v8.4.0 (must be compatible with CUDA v11.6.1)](https://developer.nvidia.com/cudnn). [[Download link]](https://www.dropbox.com/s/vg4yp1cf5hvf2p3/cudnn-windows-x86_64-8.4.0.27_cuda11.6-archive.zip?dl=1)
Copy the files from the extracted cuDNN folder, one by one, into `C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.6`.
4. Run the following to confirm the NVIDIA CUDA Compiler (NVCC) is installed:
`nvcc -V`
If it runs correctly, CUDA should be usable.
5. Install TensorFlow and other packages:
    1. Run `pip install -r requirements.txt` with the following `requirements.txt`:
```
matplotlib
seaborn
numpy>=1.23.4, <2.0.0
pandas>=1.5.3, <2.0.0
scipy>=1.9.1, <2.0.0
tensorflow<2.11
scikit-image>=0.20.0, <0.21.0
opencv-python>=4.6.0, <5.0.0
loguru>=0.6.0, <0.7.0
more-itertools>=8.14.0, <9.0.0
joblib>=1.2.0, <2.0.0
```
    2. Check that TensorFlow can place a tensor on the GPU
```python
import numpy as np
import tensorflow as tf
from loguru import logger

if __name__ == "__main__":
    logger.info(tf.__version__)
    t = np.random.normal(0, 1, (3, 3)).astype(np.float32)
    # Each float is stored in 32 bits, i.e., 32/8 = 4 bytes.
    # The line above creates a 3x3 random matrix whose elements each take 4 bytes.
    logger.info(t)
    logger.info(t.shape)
    logger.info(t.dtype)
    t = tf.constant(t)
    # Store the NumPy tensor with TensorFlow instead (ideally it lands in GPU memory).
    logger.info(t.device)  # show where the tensor is stored
```
If the output contains `GPU:0`, the GPU is working.
6. Fix a TensorFlow bug
TensorFlow `v2.10.1` raises an error (related to `zlibwapi.dll`) on Windows when running inference with convolutional layers. If you hit this, rename and copy `C:\Program Files\NVIDIA Corporation\Nsight Systems 2021.5.2\host-windows-x64\zlib.dll` to `C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.6\bin\zlibwapi.dll`.
After copying, simply feed a random tensor through a convolutional layer to confirm the problem is resolved.
```python
import numpy as np
from keras.layers import Conv2D
from keras.models import Sequential

if __name__ == "__main__":
    # Random normally distributed input: 10 samples of shape 5x5x3 (channels-last).
    rand_data = np.random.normal(0, 1, (10, 5, 5, 3))
    model = Sequential()
    model.add(
        Conv2D(
            filters=96,
            kernel_size=(3, 3),
            strides=(1, 1),
            padding="valid",
            input_shape=(5, 5, 3),
        )
    )
    # "valid" padding with a 3x3 kernel and stride 1 turns 5x5 into 3x3,
    # so the expected output shape is (10, 3, 3, 96).
    print(model.predict(rand_data).shape)
```
# Reference Links
* Write tests so every block of code is verified to work: https://docs.pytest.org/en/8.0.x/
* Refactoring Guru - code smells: https://refactoring.guru/refactoring/smells
* Design patterns, Python basics: https://www.youtube.com/@ArjanCodes
* Find state-of-the-art model implementations (papers with code :arrow_right: :100:): https://paperswithcode.com
* Understand CNN basics (:100: you must read this; understand every line): https://cs231n.github.io
* Andrej Karpathy: https://www.youtube.com/results?search_query=karparthy
* Learn deep learning basics (Wow! Learning by doing! :100:): https://zh.d2l.ai
* Machine learning theory basics (:100: for those who want to understand the general mathematics behind machine learning): https://cs229.stanford.edu/notes2022fall/main_notes.pdf
* Understand backpropagation: https://www.cs.toronto.edu/~hinton/absps/NatureDeepReview.pdf
* Python Machine Learning (Sebastian Raschka)
* Keras book: https://www.manning.com/books/deep-learning-with-python
---
wengchihung@gmail.com