# BDSE33 - Deep Learning

# Lecture Notes and Sample Code Downloads

* [Lecture notes](https://www.dropbox.com/scl/fi/ogxq336e9qmqazndl7tlq/BDSE_v2024.02.pdf?rlkey=p6g1kqut1lbbzytrkzhxf7q6f&dl=1)
* [Datasets and code](https://www.dropbox.com/scl/fi/d13don6ueknj220thx9g8/BDSE_v2024.02_data_and_codes.zip?rlkey=p1fvloqlfe1anknqikirrzkaj&dl=1)

---

## 20240219 Review

* VSCode setup: linter, formatter
* Loss: MSE
    * Mean Squared Error:
        * Given the $i$-th sample $(x^{(i)}, y^{(i)})$ and the model's prediction $\hat{y}^{(i)}$ for $x^{(i)}$, define the $i$-th sample's error as $l^{(i)} := (y^{(i)} - \hat{y}^{(i)})^2$.
        * Accounting for all samples, the final loss is $L = \frac{1}{N}\sum_{i=1}^N l^{(i)} := \frac{1}{N} \sum_{i=1}^N (y^{(i)} - \hat{y}^{(i)})^2$
* Optimizer: GD
    * What is a gradient? In plain terms: its direction is the steepest ascent direction at the point where you stand; its magnitude is the slope at that point.
    * Update rule: $w := w + \Delta_w$. For gradient descent, $\Delta_w := -\eta \frac{\partial L}{\partial w}$.
* Jargon:
    * Gradient: "the gradient of w" means $\frac{\partial L}{\partial w}$.
    * Learning rate: an optimizer hyperparameter. Both too large and too small can hurt; it needs tuning or careful scheduling.
    * Batch size (BS): the number of samples used each time the optimizer estimates the gradient of the weights w for one update.
        * Full gradient: $\frac{1}{N} \sum_{i=1}^N \frac{\partial (y^{(i)} - \hat{y}^{(i)})^2}{\partial w}$
        * Approximate gradient: $\frac{1}{BS} \sum_{i=1}^{BS} \frac{\partial (y^{(i)} - \hat{y}^{(i)})^2}{\partial w}$ (shuffle the sample indices first, then draw BS samples to estimate the gradient)
    * Stochastic gradient descent (SGD): estimate the gradient from a random draw of BS samples. The estimate is noisier, but you avoid computing over the entire dataset, which keeps training feasible (otherwise you easily hit OOM).
    * Iteration/step: "my model trained for 100 iterations/steps" means gradient descent was applied 100 times.
    * Epoch: "my model trained for 100 epochs" means training went through the entire dataset 100 times.
* Metrics:
    * Precision
        * Search engines: whatever is predicted 1 should really be 1. $\frac{TP}{TP+FP}$. Better to miss true positives than to flag false ones.
    * Recall
        * Bank fraud: everything that truly is 1 should be predicted 1. $\frac{TP}{TP+FN}$. Better to flag false positives than to miss true ones.

## 20240224 Review

* Cross Entropy
    * The $i$-th sample's loss: $l^{(i)} = -\left(y^{(i)}\log\phi + (1-y^{(i)})\log(1-\phi)\right)$
    * Where does it come from? From maximizing the likelihood of each sample under a Bernoulli distribution (outcomes are either 0 or 1: if the probability that $y$ is 1 is $\phi$, then the probability that $y$ is 0 is $1-\phi$).
* Backpropagation
    * Uses the chain rule to propagate the error backward, obtaining the gradients of all weights.
    * Procedure:
        * Forward + Backward
        * Given:
            * model
                * input shape = `[BS, num_features]`
                * output shape = `[BS, num_classes]` or `[BS, num_regression_target]`
            * `loss_func`: e.g., MSE or CE
            * `optimizer`: e.g., SGD
            * x (shape = `[BS, num_features]`)
            * y_true (shape = `[BS,]`)
        * After the forward pass, the error (gradient) is obtained at each neuron of the final layer. The error is then propagated backward to every neuron in every layer, and each neuron can then tell each weight connected in front of it how much it should be updated.

```python
optimizer.zero_grad()             # Clear gradients accumulated from the previous step
y_pred = model(x)                 # Forward
loss = loss_func(y_true, y_pred)  # Get loss
loss.backward()                   # Backward: propagate the error so every weight gets its gradient
optimizer.step()                  # Apply an optimizer such as gradient descent
```

## 20240225 To Do

- [x] ReLU
- [x] CNN intro
- [x] Constructing CNN building blocks: ResNet, DenseNet
- [x] Adaptive learning-rate optimizers, BatchNorm

## 20240225 In-class Supplement

* GeForce GPU FP32 throughput (TFLOPS): https://en.wikipedia.org/wiki/GeForce_40_series
* I/O
    * Conv2D input: `[BS, C, H, W]`
    * Linear input: `[BS, num_features]`
    * LSTM input: `[BS, num_time_steps, num_features]`
    * Softmax regressor over 10 classes: `[BS, 28, 28]` -> flatten -> `[BS, 784]` -> Linear -> `[BS, 10]`
* Don't let a class grow too large with too many responsibilities; it becomes hard to maintain (https://refactoring.guru/smells/large-class)

## 20240224 In-class Supplement

* Check whether the environment has an NVIDIA GPU or a Mac M1/M2/M3 GPU

```python=
import torch

if __name__ == "__main__":
    print(torch.cuda.is_available())
    print(torch.backends.mps.is_available())
```

* Upgrade a package

```bash
pip3 install pandas --upgrade
```

* Derivatives
    * Graph: `x -> f(x) -> out`
    * What happens when you call `out.backward()`? For every gradient-tracking tensor upstream of `out` in the computational graph (usually the model weights), the derivative of `out` with respect to that tensor is computed.
    * In the graph above, `out` is differentiated with respect to `x`, i.e., the gradient of `x` is computed: $\frac{\partial out}{\partial x}$. (If `x` here denotes a weight, then $\frac{\partial out}{\partial x}$ is exactly "the gradient of the weight `x`".)
* Multivariable derivatives
    * $\frac{\partial (x_1 + 2x_2)^2}{\partial x_1} = 2(x_1 + 2x_2)\frac{\partial (x_1 + 2x_2)}{\partial x_1} = 2(x_1 + 2x_2)$
    * $\frac{\partial (x_1 + 2x_2)^2}{\partial x_2} = 2(x_1 + 2x_2)\frac{\partial (x_1 + 2x_2)}{\partial x_2} = 2(x_1 + 2x_2)\cdot 2 = 4(x_1 + 2x_2)$
* Penalty terms (L1 and L2)
    * L1 (Lasso): $\tilde{L} = L + \lambda \sum_i |w_i|$
    * L2 (Ridge): $\tilde{L} = L + \lambda \sum_i (w_i)^2$
    * If the penalty is too strong ($L \ll \lambda \sum_i |w_i|$), the main objective becomes a mere perturbation, and the goal effectively reduces to adjusting the weights so as to minimize the penalty term alone.
        * That is, finding $\{w_1, w_2, ..., w_N\}$ minimizing $\lambda \sum_i |w_i|$,
        * which happens at $w_1 = w_2 = ... = w_N = 0$.
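To connect the `out.backward()` mechanics with the worked derivatives above, here is a minimal autograd sketch (my illustration, not from the course materials; the values 1.0 and 2.0 are arbitrary) that reproduces both partial derivatives of $(x_1 + 2x_2)^2$:

```python
import torch

if __name__ == "__main__":
    # Two scalar leaf tensors that track gradients, standing in for weights.
    x1 = torch.tensor(1.0, requires_grad=True)
    x2 = torch.tensor(2.0, requires_grad=True)

    out = (x1 + 2 * x2) ** 2  # out = (1 + 4)^2 = 25
    out.backward()            # fills x1.grad and x2.grad

    # Analytic values: 2*(x1 + 2*x2) = 10 and 4*(x1 + 2*x2) = 20
    print(x1.grad)  # tensor(10.)
    print(x2.grad)  # tensor(20.)
```

Note that `backward()` only populates `.grad` on leaf tensors created with `requires_grad=True`; in real training those are the model weights, and `optimizer.step()` then consumes the stored gradients.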
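Similarly, here is a minimal sketch of the penalty terms in practice, assuming a hypothetical tiny `nn.Linear` model and a hand-rolled L1 penalty (for the L2 variant, the `weight_decay` argument of optimizers such as `torch.optim.SGD` applies it for you):

```python
import torch
from torch import nn

if __name__ == "__main__":
    model = nn.Linear(4, 1)  # tiny model whose parameters play the role of the w_i
    x = torch.randn(8, 4)    # BS=8, num_features=4
    y_true = torch.randn(8, 1)

    lam = 1e-3               # penalty strength (lambda)
    base_loss = nn.functional.mse_loss(model(x), y_true)         # L
    # sum_i |w_i| over all parameters (bias included, for brevity)
    l1_penalty = sum(p.abs().sum() for p in model.parameters())
    loss = base_loss + lam * l1_penalty  # L-tilde = L + lambda * sum_i |w_i|
    loss.backward()          # every parameter's gradient now includes the penalty's contribution
```

If $\lambda$ is pushed into the $L \ll \lambda \sum_i |w_i|$ regime, the gradients are dominated by the penalty term and the weights are driven toward zero, matching the limiting case described above.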
## 20240220 In-class Supplement

#### GPU Cores

* FP16 (half precision)
* FP32 `Core` (single precision) (AI training)
* FP64 `Core` (double precision) (scientific computing)
* INT8 `Core` (AI inference, edge computing)
* Tensor `Core` (FP16+FP32 mixed): performs FMA, a fused multiply-add in a single operation (AI training)

## 20240219 In-class Supplement

* PyTorch Docker images:
    * NGC: https://catalog.ngc.nvidia.com/orgs/nvidia/containers/pytorch
    * DockerHub: https://hub.docker.com/r/pytorch/pytorch/tags

---

# Python Coding Style

* Linter: `pylance`
* Formatter: `black`, `isort`
* Set up Black (format on save)
  ![untitled.png](https://hackmd.io/_uploads/S17227Q7a.png)
* Set up isort (sort on save) by adding the following to `settings.json`:

```json
"editor.codeActionsOnSave": {
    "source.organizeImports": true
}
```

![untitled2.png](https://hackmd.io/_uploads/rkOLAmX7p.png)

* Improve the readability of your code
    * Make good use of docstrings and type hints
    * Follow, e.g., [Google's Python Style Guide](https://google.github.io/styleguide/pyguide.html)
* Commonly used VSCode extensions
    * `even better toml`, `reload`, `remote ssh`, `Docker`, ...
* Jupyter notebook:
    * Search the settings for "notebook format on save" and tick it
* Type hints:
    * Search the settings for "type hint", open Python > Analysis > Inlay Hints, and tick "Function Return Types"

---

# GPU Environment Setup

* CPU vs. GPU
    * CPU: high clock speed, few cores
    * GPU: low clock speed, many cores
* NVIDIA GPU ecosystem
  ![NVIDIA GPU ecosystem](https://hackmd.io/_uploads/rkRZmM77a.png)
* Possible setup paths
    * Using Docker
        1. Linux → NVIDIA GPU driver → Docker → run a [TensorFlow](https://catalog.ngc.nvidia.com/orgs/nvidia/containers/tensorflow) or [PyTorch](https://catalog.ngc.nvidia.com/orgs/nvidia/containers/pytorch) framework Docker image
        2. Windows → NVIDIA GPU driver → [WSL2](https://learn.microsoft.com/en-us/windows/wsl/install) → Docker → run a [TensorFlow](https://catalog.ngc.nvidia.com/orgs/nvidia/containers/tensorflow) or [PyTorch](https://catalog.ngc.nvidia.com/orgs/nvidia/containers/pytorch) framework Docker image
    * Installing directly on the system
        * Windows → NVIDIA driver → CUDA (NVIDIA's GPU computing toolkit) → cuDNN (NVIDIA's neural-network library for GPU computation) → TensorFlow

## GPU Packages - Installing Directly on Windows

Installation order: NVIDIA driver → CUDA (NVIDIA's GPU computing toolkit) → cuDNN (NVIDIA's neural-network library for GPU computation) → TensorFlow.

0. Install the [NVIDIA driver](https://www.nvidia.com/Download/index.aspx) (this step can be skipped: the classroom machines should already have the driver installed).
1. Launch `NVSMI` (NVIDIA System Management Interface).
   Open a terminal (e.g., Git Bash or the Windows Command Prompt), type `nvidia-smi`, and press Enter to display GPU utilization and other information. If this runs successfully, the driver is installed correctly.
2. Install [CUDA v11.6.1](https://developer.nvidia.com/cuda-toolkit). [[Download link]](https://developer.download.nvidia.com/compute/cuda/11.6.1/local_installers/cuda_11.6.1_511.65_windows.exe)
3. Install [cuDNN v8.4.0 (must be compatible with CUDA v11.6.1)](https://developer.nvidia.com/cudnn). [[Download link]](https://www.dropbox.com/s/vg4yp1cf5hvf2p3/cudnn-windows-x86_64-8.4.0.27_cuda11.6-archive.zip?dl=1)
   Copy the files from the extracted cuDNN folder, one by one, into `C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.6`.
4. Run the following command to confirm that the NVIDIA CUDA Compiler (NVCC) is installed:
   ```
   nvcc -V
   ```
   If it runs normally, CUDA should be ready to use.
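   As an optional extra check before the next step (a sketch of mine, not part of the original instructions), you can also call `nvidia-smi` from Python through the standard library:

   ```python
   import subprocess

   if __name__ == "__main__":
       # Prints the driver version, supported CUDA version, and GPU utilization;
       # works as soon as the NVIDIA driver is installed, no TensorFlow needed.
       print(subprocess.run(["nvidia-smi"], capture_output=True, text=True).stdout)
   ```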
5. Install TensorFlow and the other required packages:
    1. Run `pip install -r requirements.txt`

        ```
        matplotlib
        seaborn
        numpy>=1.23.4, <2.0.0
        pandas>=1.5.3, <2.0.0
        scipy>=1.9.1, <2.0.0
        tensorflow<2.11
        scikit-image>=0.20.0, <0.21.0
        opencv-python>=4.6.0, <5.0.0
        loguru>=0.6.0, <0.7.0
        more-itertools>=8.14.0, <9.0.0
        joblib>=1.2.0, <2.0.0
        ```

    2. Check whether TensorFlow can place a tensor on the GPU

        ```python
        import numpy as np
        import tensorflow as tf
        from loguru import logger

        if __name__ == "__main__":
            logger.info(tf.__version__)
            t = np.random.normal(0, 1, (3, 3)).astype(np.float32)
            # float32 stores each number in 32 bits, i.e., 32/8 = 4 bytes.
            # The line above creates a 3x3 random matrix whose elements each take 4 bytes.
            logger.info(t)
            logger.info(t.shape)
            logger.info(t.dtype)
            t = tf.constant(t)  # re-store the NumPy tensor with TensorFlow (ideally in GPU memory)
            logger.info(t.device)  # show where the tensor is stored
        ```

        If `GPU:0` appears in the output, the GPU is ready to use.

6. Fix a TensorFlow bug.
   TensorFlow `v2.10.1` on Windows hits an error (related to `zlibwapi.dll`) when running inference with convolutional layers. If you encounter this, rename and copy
   `C:\Program Files\NVIDIA Corporation\Nsight Systems 2021.5.2\host-windows-x64\zlib.dll`
   to
   `C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.6\bin\zlibwapi.dll`.
   After copying, simply feed a random tensor through a convolutional layer to confirm the problem is fixed:

```python=
import numpy as np
from keras.layers import Conv2D
from keras.models import Sequential

if __name__ == "__main__":
    rand_data = np.random.normal(0, 1, (10, 5, 5, 3))  # normally distributed random input: 10 3D samples
    model = Sequential()
    model.add(
        Conv2D(
            filters=96,
            kernel_size=(3, 3),
            strides=(1, 1),
            padding="valid",
            input_shape=(5, 5, 3),
        )
    )
    print(model.predict(rand_data).shape)  # inspect the output shape
```

# Reference Links

* Write tests to make sure every block of your code works correctly: https://docs.pytest.org/en/8.0.x/
* Design Pattern Guru - code smells: https://refactoring.guru/refactoring/smells
* Design patterns, Python basics: https://www.youtube.com/@ArjanCodes
* Find state-of-the-art model implementations (if papers come with code :arrow_right: :100:): https://paperswithcode.com
* Understand CNN basics (:100: you must read this and understand every line): https://cs231n.github.io
* Karpathy: https://www.youtube.com/results?search_query=karpathy
* Learn deep learning basics (Wow! Learning by doing! :100:): https://zh.d2l.ai
* Machine learning theory (:100: for those who want to understand all the general mathematical bits of machine learning): https://cs229.stanford.edu/notes2022fall/main_notes.pdf
* Understand backpropagation: https://www.cs.toronto.edu/~hinton/absps/NatureDeepReview.pdf
* Python Machine Learning (Sebastian Raschka)
* Keras book: https://www.manning.com/books/deep-learning-with-python

---

wengchihung@gmail.com