1. HUNG-YI LEE 2022 ML - NN introduction

tags: `Machine Learning`

Hung-yi Lee 2022 ML lecture notes
Course information 2022
Course information 2021
Course information 2019
Course information, all years
(Youtube)Machine Learning (2021) Mandarin Version https://youtube.com/playlist?list=PLJV_el3uVTsMhtt7_Y6sgTHGHp1Vb2P2J

ML study guide

[Machine Learning 2021] General guide to machine learning tasks

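Machine learning: training (on known data)

1. Function with unknown parameters

Terminology
- model: a function with unknown parameters
- y = f(data)
which becomes

y = b + w * x1
- x1: the feature, i.e. known data
- y: the model's output (the prediction)
- w and b are the unknown parameters
  - b, which is added: bias
  - w, which multiplies the feature: weight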
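
As a minimal sketch of the model above: the function name `linear_model` and the numbers chosen for `w`, `b`, and `x1` are hypothetical, picked only to show how the prediction is computed.

```python
def linear_model(x1: float, w: float, b: float) -> float:
    """Return the prediction y = b + w * x1."""
    return b + w * x1


# Hypothetical values: x1 is the known feature, while w (weight) and
# b (bias) are the unknown parameters that training is meant to find.
y = linear_model(x1=4.0, w=0.5, b=1.0)
print(y)  # 3.0
```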

Machine learning: prediction (on data unseen during training)

This model predicts from the previous day only, so it is not very accurate.

linear model

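Possible problem: model bias

Model bias: sometimes a linear model cannot represent the real relationship. The set of functions it can express is too small, so every candidate you find is a poor one and the best function is not in the pool (looking for a needle in a haystack when the needle is not in the haystack). A more flexible model is needed, which leads to other approaches (using more features, or deep learning).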

Piecewise linear curves:
* A line made of segments with multiple turning points; with enough segments it can approximate any continuous function (continuous curve)

constant + sum of a set of

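Functions of various shapes can be used. For example, a sigmoid function can approximate the hard sigmoid (the blue curve), because the hard sigmoid is harder to write down.
By adjusting, within the sigmoid function:
* w (changes the slope)
* b (shifts it left or right)
* c (changes the height)
you get different sigmoid functions. Summing them approximates any piecewise linear curve, and piecewise linear curves can in turn approximate any continuous function (continuous curve).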
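
The idea of "constant plus a sum of sigmoids" can be sketched as follows. The helper names `sigmoid` and `sum_of_sigmoids` and every parameter value are assumptions made for illustration; the point is only that each term has its own slope `w`, shift `b`, and height `c`.

```python
import math


def sigmoid(z: float) -> float:
    """Numerically safe logistic function 1 / (1 + exp(-z))."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def sum_of_sigmoids(x: float, b: float, terms) -> float:
    """y = b + sum_i c_i * sigmoid(b_i + w_i * x).

    `terms` is a list of (c_i, b_i, w_i) tuples, one per sigmoid.
    """
    return b + sum(c_i * sigmoid(b_i + w_i * x) for c_i, b_i, w_i in terms)


# Hypothetical parameters: three sigmoids with different heights (c),
# shifts (b) and slopes (w), added on top of a constant 0.5.
terms = [(1.0, 0.0, 4.0), (-2.0, -4.0, 4.0), (1.5, -8.0, 4.0)]
for x in [-2.0, 0.0, 1.0, 2.0, 3.0]:
    print(x, round(sum_of_sigmoids(x, 0.5, terms), 3))
```

Far to the left every sigmoid is close to 0, so the curve starts at the constant; far to the right every sigmoid is close to 1, so the curve ends at the constant plus the sum of the heights.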

New model: more features

Linear algebra notation

r passes through the sigmoid
- The number of sigmoid functions is not fixed at three; you choose it yourself. Having three input features does not mean you may only use three sigmoid functions
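
A minimal sketch of the matrix form, assuming the usual notation r = b + W x, a = sigmoid(r), y = b_out + c · a. The names `forward`, `b_vec`, and `b_out` and all numbers are hypothetical. The example deliberately uses three features but only two sigmoids, to show that the two counts are independent.

```python
import math


def sigmoid(z: float) -> float:
    """Numerically safe logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def forward(x, W, b_vec, c, b_out):
    """Compute y = b_out + c . sigmoid(b_vec + W x) with plain lists."""
    # r_i = b_i + sum_j W[i][j] * x[j]  (one r_i per sigmoid)
    r = [b_i + sum(w_ij * x_j for w_ij, x_j in zip(row, x))
         for row, b_i in zip(W, b_vec)]
    # a = sigmoid(r), applied element by element
    a = [sigmoid(r_i) for r_i in r]
    return b_out + sum(c_i * a_i for c_i, a_i in zip(c, a))


x = [1.0, 2.0, 3.0]            # three input features
W = [[0.1, -0.2, 0.3],         # 2 x 3 weight matrix: two sigmoids
     [0.0, 0.5, -0.1]]
b_vec = [0.0, 1.0]
c = [2.0, 4.0]
print(forward(x, W, b_vec, c, b_out=1.0))
```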

Result (the new step-1 function)

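Activation function

ReLU (rectified linear unit): two ReLUs added together form a hard sigmoid
Sigmoid: an S-shaped function, used to model irregular (e.g. nonlinear) relationships in the data
Tanh: similar in shape to the sigmoid (it also flattens out at both ends), but its output lies between -1 and 1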
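
To illustrate how two ReLUs combine into a hard sigmoid, here is a small sketch. The breakpoints at 0 and 1 are an arbitrary choice for this example, and the function names are hypothetical.

```python
import math


def relu(z: float) -> float:
    """Rectified linear unit: max(0, z)."""
    return max(0.0, z)


def hard_sigmoid(x: float) -> float:
    """Flat, then a ramp, then flat again, built from two ReLUs.

    The first ReLU starts the ramp at x = 0; the second one, subtracted,
    cancels the slope from x = 1 onward.
    """
    return relu(x) - relu(x - 1.0)


for x in [-1.0, 0.0, 0.25, 1.0, 3.0]:
    # tanh is printed alongside to show its output stays within (-1, 1)
    print(x, hard_sigmoid(x), round(math.tanh(x), 3))
```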

2. Define loss from training data

The loss is also a function; its inputs are b and w, i.e. the parameters.
The loss indicates how good a set of parameter values is.

L(b,w)

Define Loss from Training Data

Possible choices for the loss function: MAE / MSE / cross-entropy
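
A minimal sketch of L(b, w) with MAE and MSE, assuming the linear model y = b + w * x. The data and the helper names are made up for illustration, and the errors are averaged over the examples here (summing them instead only changes the loss by a constant factor). Cross-entropy is left out because it applies to classification outputs.

```python
def mae(preds, labels):
    """Mean absolute error between predictions and labels."""
    return sum(abs(y - y_hat) for y, y_hat in zip(preds, labels)) / len(labels)


def mse(preds, labels):
    """Mean squared error between predictions and labels."""
    return sum((y - y_hat) ** 2 for y, y_hat in zip(preds, labels)) / len(labels)


def loss(b, w, xs, labels, error=mae):
    """L(b, w): how badly the parameters (b, w) fit the training data."""
    preds = [b + w * x for x in xs]
    return error(preds, labels)


# Hypothetical training data generated from y = 1 + 2x.
xs = [1.0, 2.0, 3.0]
labels = [3.0, 5.0, 7.0]
print(loss(1.0, 2.0, xs, labels))             # 0.0 (perfect fit)
print(loss(0.0, 2.0, xs, labels))             # 1.0 (every prediction is off by 1)
print(loss(0.0, 2.0, xs, labels, error=mse))  # 1.0
```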

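New loss function

θ: all the unknown parameters
The definition is no different from the earlier loss function: fix a set of values for θ, feed in a set of features, and compute the error e between the estimate y and the label ŷ; adding up all the errors gives the loss.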

L(θ)

3. Optimization

Try different combinations of w and b to find the w and b that minimize the loss.

This course uses gradient descent.

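Take the derivative to get the slope, then adjust w in the downhill direction (the direction that lowers the loss).
Step size (set by you): the learning rate, one of the hyperparameters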
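
The update rule can be sketched as below, assuming the linear model y = b + w * x with an MSE loss. The data, the learning rate `lr`, the number of steps, and the zero starting point are all arbitrary choices for this example, not values from the course.

```python
def gradient_descent(xs, labels, lr=0.1, steps=2000):
    """Fit y = b + w * x by repeatedly stepping against the gradient of the MSE."""
    w, b = 0.0, 0.0  # arbitrary starting point
    n = len(xs)
    for _ in range(steps):
        # error of each prediction under the current parameters
        errs = [(b + w * x) - y_hat for x, y_hat in zip(xs, labels)]
        # partial derivatives of the mean squared error
        grad_w = 2.0 * sum(e * x for e, x in zip(errs, xs)) / n
        grad_b = 2.0 * sum(errs) / n
        # move downhill; the learning rate scales the step size
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# Hypothetical data generated from y = 1 + 2x.
w, b = gradient_descent([1.0, 2.0, 3.0], [3.0, 5.0, 7.0])
print(round(w, 3), round(b, 3))  # approaches w = 2, b = 1
```

With a learning rate that is too large the updates overshoot and diverge; with one that is too small they converge very slowly, which is why it has to be tuned by hand.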

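Possible optimization issue

Gradient descent may not be powerful enough to find the global minimum.
It may only find a local minimum! (Looking for a needle in a haystack: the needle is in the haystack, but you cannot find it.)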
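
A toy illustration of getting stuck: the one-parameter loss below is invented for this example and has two valleys, so the result of gradient descent depends on where it starts. The learning rate and step count are arbitrary.

```python
def toy_loss(w: float) -> float:
    """Made-up loss with a global minimum near w = -1.47 and a local one near w = 1.35."""
    return w ** 4 - 4.0 * w ** 2 + w


def toy_grad(w: float) -> float:
    """Derivative of toy_loss."""
    return 4.0 * w ** 3 - 8.0 * w + 1.0


def descend(w: float, lr: float = 0.01, steps: int = 1000) -> float:
    """Plain gradient descent on the single parameter w."""
    for _ in range(steps):
        w -= lr * toy_grad(w)
    return w


w_right = descend(2.0)    # starts in the right valley, stays there
w_left = descend(-2.0)    # starts in the left valley, reaches the global minimum
print(round(w_right, 3), round(toy_loss(w_right), 3))
print(round(w_left, 3), round(toy_loss(w_left), 3))
```

Both runs stop at a point where the gradient is zero, but only the run started on the left reaches the lower of the two valleys.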

Image Not Showing Possible Reasons
- The image file may be corrupted
- The server hosting the image is unavailable
- The image path is incorrect
- The image format is not supported
Learn More →
Image Not Showing Possible Reasons
- The image file may be corrupted
- The server hosting the image is unavailable
- The image path is incorrect
- The image format is not supported
Learn More →

New optimization

Still gradient descent.

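- gradient: the derivative of the loss function with respect to the parameters; conceptually, a slope
Try different values of θ to find the θ* that minimizes the loss.
- Computing the gradient: repeat until you decide to stop, or until the gradient is 0 and the parameters can no longer be updated
- Gradient descent is mainly used when there are many parameters; with only a few, brute-force enumeration is feasible. In this example there are many parameters, collected into θ, so gradient descent is the only practical option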

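The data is split into batches, and a loss is computed on each: l1, l2, l3, …
Q: Why split the data into batches? To be explained next week.
Each time the parameters are updated: one update
One pass over all the batches: one epoch

The number of updates in one epoch depends on the batch size you set!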
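
The relationship between batch size and updates per epoch can be checked with a short sketch. The helper names are hypothetical, and the sketch assumes a final, smaller batch is kept rather than dropped.

```python
def make_batches(data, batch_size):
    """Split data into consecutive batches; the last one may be smaller."""
    return [data[i:i + batch_size] for i in range(0, len(data), batch_size)]


def updates_per_epoch(num_examples: int, batch_size: int) -> int:
    """One update per batch, so this is the number of batches (rounded up)."""
    return (num_examples + batch_size - 1) // batch_size


print(updates_per_epoch(10000, 10))   # 1000
print(updates_per_epoch(1000, 100))   # 10
print(make_batches(list(range(10)), 4))  # [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9]]
```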

Other variant: modifying the model

Multiple layers: apply the activation function repeatedly, but note that each layer has its own parameters
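
Stacking layers can be sketched as a loop in which each layer uses its own weights and biases. ReLU is used as the activation here purely as an example, and the names `layer` and `deep_forward` and all numbers are hypothetical.

```python
def relu(z: float) -> float:
    return max(0.0, z)


def layer(x, W, b):
    """One hidden layer: a_i = relu(b_i + sum_j W[i][j] * x[j])."""
    return [relu(b_i + sum(w_ij * x_j for w_ij, x_j in zip(row, x)))
            for row, b_i in zip(W, b)]


def deep_forward(x, hidden_layers, c, b_out):
    """Pass x through every hidden layer in turn, then take a weighted sum."""
    a = x
    for W, b in hidden_layers:  # each layer has its own W and b
        a = layer(a, W, b)
    return b_out + sum(c_i * a_i for c_i, a_i in zip(c, a))


hidden_layers = [
    ([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]], [0.0, 0.0, -1.0]),  # 2 inputs -> 3 units
    ([[1.0, -1.0, 0.0], [0.0, 1.0, 1.0]], [0.5, 0.0]),         # 3 units -> 2 units
]
print(deep_forward([1.0, 2.0], hidden_layers, c=[1.0, 0.5], b_out=1.0))  # 3.0
```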

Summary:

Deep learning = a neural network with many hidden layers
Stacked deeper and deeper…
- Why a deep network rather than a fat network?

overfitting

You can survey regularization on your own
Early stopping and dropout will be taught later
But the model must not be constrained too much either, or you are back to model bias
- bias-complexity trade-off

cross validation

HW1: Regression

Implemented with PyTorch
Other people's notes

PyTorch nn tutorial
PyTorch Dataset Normalization - Torchvision.Transforms.Normalize() tutorial
Feature Selection in regression model
Difference between ReLU and sigmoid
- ReLU vs. Sigmoid Function in Deep Neural Networks
Batch Normalization
XGBoost?
- [lightgbm/xgboost/nn code roundup, part 4] Binary classification, multi-class classification, and regression with PyTorch
Hand-written L1/L2 regularization
- https://androidkt.com/how-to-add-l1-l2-regularization-in-pytorch-loss-function/
Adam-related
- Algorithm
  - Paper
- weight decay
  - Weight Decay and Its Peculiar Effects
  - Deep learning basics — weight decay
    - Paper
