1. HUNG-YI LEE 2022 ML - NN introduction
Hung-yi Lee 2022 ML lecture notes
Course information 2022
Course information 2021
Course information 2019
Course information, all years
(Youtube)Machine Learning (2021) Mandarin Version https://youtube.com/playlist?list=PLJV_el3uVTsMhtt7_Y6sgTHGHp1Vb2P2J
Table of contents
ML study guide
- [Machine Learning 2021] General guide for machine learning tasks
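Machine learning: training (on known data)
1. Function with unknown parameters
Terminology
- model: a function with unknown parameters
- y = f(data)
which becomes
y = b + w * x1
- x1: feature, known data
- y: the value the model predicts
- w and b are the unknown parameters
- b, which is added: bias
- w, which multiplies the feature: weight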
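As a minimal sketch of the model y = b + w * x1, the function below takes the feature and the two unknown parameters as arguments. The function name `linear_model` and the numeric values are hypothetical, chosen only for illustration.

```python
def linear_model(x1, w, b):
    """Predict y from a single feature x1 using y = b + w * x1."""
    return b + w * x1


# Hypothetical parameter values, for illustration only.
w, b = 0.5, 2.0
prediction = linear_model(4.0, w, b)
print(prediction)
```

Training amounts to finding values of `w` and `b` that make such predictions close to the known labels.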
Machine learning: testing (on data unseen during training)
This model always predicts from the previous day's value only, so it is not very accurate (linear model).
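Possible problem: model bias
- Model bias: sometimes a linear model cannot represent the real relationship. The set of candidate functions is too small, so every function you can find is a loser and the best one is not in the pool (looking for a needle in the sea when the needle is not in the sea). A more flexible model is needed, which motivates other approaches (using more features, or deep learning).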
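Piecewise linear curves:
* Line segments with several turning points; they can approximate any continuous function (a continuous curve)
* Piecewise linear curve = constant + sum of a set of [figure missing]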
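Functions of various shapes can be used. For example, a sigmoid function can approximate the hard sigmoid (the blue curve), because the hard sigmoid is harder to write down.
By adjusting the following in the sigmoid function:
* w (changes the slope)
* b (shifts the curve left or right)
* c (changes the height)
we obtain different sigmoid functions. Stacking them can approximate any piecewise linear curve, and piecewise linear curves can in turn approximate any continuous function (a continuous curve).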
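The stacking idea can be sketched as follows. Because the original figure is missing, the form y = b + sum_i c_i * sigmoid(b_i + w_i * x1) is written here as an assumption, and the helper name `sigmoid_sum` and all numeric values are hypothetical.

```python
import math


def sigmoid(z):
    """Standard logistic sigmoid."""
    return 1.0 / (1.0 + math.exp(-z))


def sigmoid_sum(x1, b, terms):
    """Compute y = b + sum_i c_i * sigmoid(b_i + w_i * x1).

    terms is a list of (c_i, b_i, w_i) tuples, one per sigmoid.
    """
    return b + sum(c * sigmoid(b_i + w_i * x1) for c, b_i, w_i in terms)


# Three sigmoids with different heights (c), shifts (b_i) and slopes (w_i).
terms = [(1.0, 0.0, 5.0), (-2.0, -10.0, 5.0), (1.5, -20.0, 5.0)]
for x1 in (-2.0, 1.0, 3.0, 6.0):
    print(x1, round(sigmoid_sum(x1, 0.5, terms), 3))
```

Changing `w_i` makes a step steeper or flatter, `b_i` moves where the step occurs, and `c` scales its height, which is how the sum can bend into different piecewise-linear-like shapes.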
New model: more features
Linear algebra representation
- r is passed through the sigmoid
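- The number of sigmoid functions is not fixed at three; you decide it yourself. Having three input features does not mean you may only use three sigmoid functions.
Result (the new step-1 function)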
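A sketch of the resulting function in vector form, assuming it reads y = b + c^T sigmoid(b_vec + W x), where r = b_vec + W x is the vector that goes through the sigmoid. The figure is missing, so this form and the names `forward`, `W`, `b_vec` and `c` are assumptions. The example deliberately uses three features with only two sigmoids.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(x, W, b_vec, c, b):
    """Compute y = b + c^T sigmoid(b_vec + W x) with plain Python lists.

    x: list of features; W: one row of weights per sigmoid;
    b_vec: one bias per sigmoid; c: one height per sigmoid; b: scalar bias.
    """
    # r = b_vec + W x
    r = [b_i + sum(w_ij * x_j for w_ij, x_j in zip(row, x))
         for row, b_i in zip(W, b_vec)]
    # a = sigmoid(r), applied element by element
    a = [sigmoid(r_i) for r_i in r]
    return b + sum(c_i * a_i for c_i, a_i in zip(c, a))


# Three input features, two sigmoids (hypothetical values).
x = [1.0, 2.0, 3.0]
W = [[0.1, -0.2, 0.3], [0.0, 0.5, -0.1]]
y = forward(x, W, b_vec=[0.0, -1.0], c=[1.0, 3.0], b=1.0)
print(y)
```

The number of rows in `W` sets the number of sigmoids, and the number of columns must match the number of features.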
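Activation function
- ReLU (rectified linear unit): two ReLUs stacked together form a hard sigmoid
- Sigmoid: an S-shaped function, used when the data are irregular (e.g. nonlinear)
- Tanh: similar in shape to the sigmoid (it also flattens out at both ends), but steeper around zero and with outputs between -1 and 1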
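To illustrate the ReLU bullet above, here is a small sketch that builds a hard sigmoid from two ReLUs. The breakpoints at 0 and 1 are an arbitrary choice for this example, not taken from the notes.

```python
def relu(z):
    """Rectified linear unit: max(0, z)."""
    return max(0.0, z)


def hard_sigmoid(x):
    """Hard sigmoid built from two ReLUs.

    Flat at 0 for x <= 0, rises with slope 1 on [0, 1], flat at 1 for x >= 1.
    """
    return relu(x) - relu(x - 1.0)


for x in (-1.0, 0.5, 3.0):
    print(x, hard_sigmoid(x))
```

The second ReLU cancels the slope of the first one after the upper breakpoint, which is what produces the flat top.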
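2. Define loss from training data
The loss is also a function; its inputs are the parameters b and w.
The loss indicates how good a set of parameter values is.
L(b, w)
- Possible loss functions: MAE / MSE / cross-entropy
New loss function
θ: all the unknown parameters
The definition is the same as the earlier loss function: fix a set of values θ, feed in a set of features, compute the error e between the prediction y and the label ŷ, and add up the errors over all examples to get the loss.
L(θ)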
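A minimal sketch of computing the loss for the linear model y = b + w * x1, using MAE and MSE as the per-example error. The errors are averaged over the examples here, which is an assumption of this sketch; the data and the function names are hypothetical.

```python
def mae(preds, labels):
    """Mean absolute error: average of |y - y_hat|."""
    return sum(abs(y - y_hat) for y, y_hat in zip(preds, labels)) / len(labels)


def mse(preds, labels):
    """Mean squared error: average of (y - y_hat)^2."""
    return sum((y - y_hat) ** 2 for y, y_hat in zip(preds, labels)) / len(labels)


def loss(w, b, xs, labels, error_fn=mae):
    """L(b, w): how good the pair (w, b) is on the training data."""
    preds = [b + w * x for x in xs]
    return error_fn(preds, labels)


# Toy training data (made up for illustration).
xs = [1.0, 2.0, 3.0]
labels = [2.0, 4.0, 6.0]
print(loss(1.0, 0.0, xs, labels, mae))
print(loss(1.0, 0.0, xs, labels, mse))
print(loss(2.0, 0.0, xs, labels, mae))
```

A smaller value means the parameter values fit the training data better.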
3. Optimization
Try different combinations of w and b to find the w and b that make the loss smallest.
This course uses gradient descent.
Compute the derivative to get the slope, then adjust w in the downhill direction.
Step size (set by you): the learning rate, which is one of the hyperparameters.
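A sketch of gradient descent for the two-parameter model y = b + w * x1, assuming an MSE loss so that the derivatives can be written by hand. The learning rate, the number of steps and the data are hypothetical choices for this example.

```python
def gradient_descent(xs, ys, lr=0.05, steps=2000):
    """Fit y = b + w * x by gradient descent on the mean squared error."""
    w, b = 0.0, 0.0  # arbitrary starting point
    n = len(xs)
    for _ in range(steps):
        # Prediction error for every training example.
        errs = [(b + w * x) - y for x, y in zip(xs, ys)]
        # Partial derivatives of the MSE with respect to w and b.
        grad_w = 2.0 / n * sum(e * x for e, x in zip(errs, xs))
        grad_b = 2.0 / n * sum(errs)
        # Move against the slope; lr controls the step size.
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# Toy data generated from y = 1 + 2 * x.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
w, b = gradient_descent(xs, ys)
print(round(w, 4), round(b, 4))
```

With a learning rate that is too large the updates can overshoot and diverge; with one that is too small many more steps are needed.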
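Possible optimization issue
Gradient descent may not be powerful enough to find the global minimum.
It may only find a local minimum! (Looking for a needle in the sea: the needle is in the sea, but you cannot fish it out.)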
New optimization
Still gradient descent
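- gradient: the derivative of the loss function, i.e. the notion of slope
- Try different values of θ to find θ*, the one that makes the loss smallest
- Computing gradients: stop when you no longer want to continue, or when g is 0 and no further update is possible
- Gradient descent is mainly used when there are many parameters; with few parameters, brute-force enumeration is possible. This example has many parameters, collected into θ, so gradient descent is the only practical option.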
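The update on the whole parameter vector and the two stopping conditions above can be sketched as follows. The loss here is a made-up quadratic, chosen so that the gradient becomes exactly zero after one update; `grad_fn`, `target` and `max_updates` are hypothetical names.

```python
def gradient_descent(grad_fn, theta, lr, max_updates):
    """Repeat theta <- theta - lr * g until g is zero or the budget runs out."""
    updates = 0
    while updates < max_updates:  # "until you no longer want to continue"
        g = grad_fn(theta)
        if all(g_i == 0 for g_i in g):  # zero gradient: nothing left to update
            break
        theta = [t - lr * g_i for t, g_i in zip(theta, g)]
        updates += 1
    return theta, updates


# Toy loss L(theta) = sum_i (theta_i - target_i)^2, with gradient 2 * (theta - target).
target = [1.0, -2.0, 3.0]


def grad_fn(theta):
    return [2.0 * (t - s) for t, s in zip(theta, target)]


theta_star, n_updates = gradient_descent(grad_fn, [0.0, 0.0, 0.0], lr=0.5, max_updates=100)
print(theta_star, n_updates)
```

In real problems the gradient rarely becomes exactly zero, so the update budget is usually the condition that ends training.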
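- Split the data into batches and compute a loss on each: L1, L2, L3, …
- Q: Why split the data into multiple batches? To be explained next week.
Each time the parameters are updated: one update
Seeing every batch once: one epoch
The number of updates in one epoch depends on the batch size you set!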
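The relation between batch size and the number of updates per epoch can be checked with a short sketch. The helper names and the example sizes are hypothetical.

```python
import math


def make_batches(data, batch_size):
    """Split data into consecutive batches; the last one may be smaller."""
    return [data[i:i + batch_size] for i in range(0, len(data), batch_size)]


def updates_per_epoch(num_examples, batch_size):
    """One update per batch, so an epoch has ceil(N / batch_size) updates."""
    return math.ceil(num_examples / batch_size)


print(updates_per_epoch(10000, 10))
print(updates_per_epoch(1000, 100))
print(len(make_batches(list(range(10)), 4)))
```

A smaller batch size therefore means more parameter updates per pass over the same data.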
Other variants: modifying the model
- Multiple layers: apply the activation function repeatedly, but note that each layer has its own parameters
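A sketch of stacking layers, assuming each layer computes activation(b_vec + W a) on the previous layer's output and that ReLU is the activation. The layer sizes, the values and the names `layer` and `network` are hypothetical.

```python
def relu(z):
    return max(0.0, z)


def layer(a, W, b_vec):
    """One layer: relu(b_vec + W a), element by element."""
    return [relu(b_i + sum(w_ij * a_j for w_ij, a_j in zip(row, a)))
            for row, b_i in zip(W, b_vec)]


def network(x, layers, c, b):
    """Apply every (W, b_vec) pair in turn, then combine with c and b."""
    a = x
    for W, b_vec in layers:  # each layer has its own parameters
        a = layer(a, W, b_vec)
    return b + sum(c_i * a_i for c_i, a_i in zip(c, a))


layers = [
    ([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]], [0.0, 0.0, -1.0]),  # 2 inputs -> 3 units
    ([[1.0, -1.0, 0.0], [0.0, 1.0, 1.0]], [0.5, 0.0]),         # 3 units -> 2 units
]
y = network([1.0, 2.0], layers, c=[1.0, 0.5], b=1.0)
print(y)
```

Adding another `(W, b_vec)` pair to `layers` adds one more layer, as long as its row length matches the previous layer's output size.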
Summary:
overfitting

- You can survey regularization on your own
- Early stopping and dropout will be taught later
- But the model must not be constrained too much either, or you end up back at model bias
- bias-complexity trade-off

cross validation

HW1: Regression
Implemented with PyTorch
Other people's notes