
1. HUNG-YI LEE 2022 ML - NN introduction

tags: Machine Learning

Hung-yi Lee 2022 ML lecture notes
Course description 2022
Course description 2021
Course description 2019
Course descriptions, all years
(Youtube)Machine Learning (2021) Mandarin Version https://youtube.com/playlist?list=PLJV_el3uVTsMhtt7_Y6sgTHGHp1Vb2P2J


ML study guide




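Machine learning: Training (on known data)

1. Function with unknown parameters

  • Terminology

    • model: a function with unknown parameters
    • y = f(data)
  • Written concretely as

    y = b + w * x1

    • x1: feature, the known data
    • y: the model's output (the prediction)
    • w and b are the unknown parameters
      • b, which is added: bias
      • w, which multiplies the feature: weight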
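
As a minimal sketch, the model above can be written as a one-line Python function. The function name and the numeric values are made up for illustration; they are not from the lecture.

```python
def linear_model(x1: float, w: float, b: float) -> float:
    """Model with unknown parameters: weight w and bias b. Returns y = b + w * x1."""
    return b + w * x1

# Hypothetical parameter values and input, chosen only to show the call.
y = linear_model(x1=4.0, w=0.5, b=1.0)
print(y)  # 3.0
```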

Machine learning: applying the model to data unseen during training


This model predicts only from the previous day's value, so it is not very accurate.

linear model


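Possible problem: model bias

  • Model bias: sometimes a linear model cannot describe the real relationship. The set of candidate functions is too small, so every function you can find is a loser and the best one is not in the pool (like fishing for a needle in the sea when the needle is not in the sea). A more flexible model is needed, which motivates other approaches (using more features, or deep learning).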

Piecewise linear curves:
* Line segments joined at several turning points; with enough pieces they can approximate any continuous function (any continuous curve)

constant + sum of a set of


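Functions of various shapes can be used. For example, a sigmoid function can approximate the hard sigmoid (the blue curve), because the hard sigmoid itself is awkward to write down.
By adjusting these quantities in the sigmoid function:
* w (changes the slope)
* b (shifts the curve left or right)
* c (changes the height)
we get different sigmoid functions. Summed together, they can approximate any piecewise linear curve, and piecewise linear curves can in turn approximate any continuous function (any continuous curve).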
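
The roles of w, b and c can be sketched in a few lines of Python, writing each curve as c * sigmoid(b + w * x) and adding a constant to the sum. The helper names and all parameter values below are assumptions picked for illustration.

```python
import math

def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))

def scaled_sigmoid(x: float, c: float, b: float, w: float) -> float:
    """c * sigmoid(b + w * x): w sets the slope, b shifts the curve, c sets the height."""
    return c * sigmoid(b + w * x)

def sum_of_sigmoids(x: float, const: float, params) -> float:
    """constant + sum of sigmoids; params is a list of (c, b, w) tuples."""
    return const + sum(scaled_sigmoid(x, c, b, w) for c, b, w in params)

# Hypothetical parameters: one rising step near x = 1 and one falling step near x = 3.
params = [(2.0, -5.0, 5.0), (-1.0, -15.0, 5.0)]
for x in [-2.0, 1.0, 2.0, 3.0, 6.0]:
    print(x, round(sum_of_sigmoids(x, 0.5, params), 3))
```

Far to the left the sum stays near the constant 0.5, and far to the right it settles near 0.5 + 2 - 1 = 1.5, with two bends in between, which is the shape of a smoothed piecewise linear curve.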

New model: more features



Linear algebra notation


  • r is passed through the sigmoid
    • The number of sigmoid functions is not fixed at three; you choose it yourself. Having three input features does not mean you can only use three sigmoid functions

Result (the new step-1 function)
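
A minimal sketch of this function in plain Python, assuming the form y = b + sum_i c_i * sigmoid(b_i + sum_j w_ij * x_j), which matches the w, b and c roles described earlier. The names `forward`, `W`, `b_vec` and `a` and all numbers are illustrative assumptions. The example deliberately uses 3 features with 2 sigmoids to show that the two counts are independent.

```python
import math

def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))

def forward(x, W, b_vec, c, b):
    """y = b + sum_i c[i] * sigmoid(b_vec[i] + sum_j W[i][j] * x[j])."""
    # r: one weighted sum per sigmoid (one row of W per sigmoid)
    r = [b_i + sum(w_ij * x_j for w_ij, x_j in zip(row, x))
         for row, b_i in zip(W, b_vec)]
    # a: r passed through the sigmoid
    a = [sigmoid(r_i) for r_i in r]
    return b + sum(c_i * a_i for c_i, a_i in zip(c, a))

# Hypothetical values: 3 input features, 2 sigmoids.
x = [1.0, 2.0, 3.0]
W = [[0.1, 0.2, 0.3],
     [0.0, -0.5, 0.5]]
b_vec = [0.0, -0.5]
c = [1.0, -2.0]
y = forward(x, W, b_vec, c, b=0.5)
print(y)
```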


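Activation function

  • ReLU (rectified linear unit): two ReLUs added together form a hard sigmoid
  • Sigmoid: an S-shaped function with outputs between 0 and 1, used when the relationship in the data is irregular (e.g. nonlinear)
  • Tanh: similar in shape to the sigmoid (it also flattens out at both ends), but steeper around 0 and with outputs between -1 and 1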
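
The three activation functions can be compared with a short Python sketch. The helper `hard_sigmoid_from_relus` and its `lo`/`hi` breakpoints are hypothetical names used here to show how two ReLUs combine into a hard sigmoid.

```python
import math

def relu(z: float) -> float:
    return max(0.0, z)

def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))

def hard_sigmoid_from_relus(x: float, lo: float = 0.0, hi: float = 1.0) -> float:
    """Two ReLUs: flat at 0 below lo, linear between lo and hi, flat at (hi - lo) above hi."""
    return relu(x - lo) - relu(x - hi)

for x in [-3.0, 0.25, 5.0]:
    # tanh is a rescaled sigmoid: tanh(x) = 2 * sigmoid(2x) - 1
    print(x, hard_sigmoid_from_relus(x), sigmoid(x), math.tanh(x))
```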

2. Define loss from training data

The loss is also a function; its inputs are b and w, i.e. the parameters.
The loss tells us how good a set of parameter values is.

L(b,w)

Define Loss from Training Data

  • Possible loss functions: MAE / MSE / cross-entropy
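
A minimal sketch of L(b, w) for the linear model y = b + w * x, using MAE or MSE as the per-example error. The function signature and the toy data are assumptions for illustration, and the errors are averaged rather than summed, which only rescales the loss.

```python
def loss(b: float, w: float, xs, labels, kind: str = "mae") -> float:
    """L(b, w): average error between predictions b + w * x and the labels."""
    errors = []
    for x, y_hat in zip(xs, labels):
        y = b + w * x                     # model prediction
        if kind == "mae":
            errors.append(abs(y - y_hat))     # mean absolute error term
        elif kind == "mse":
            errors.append((y - y_hat) ** 2)   # mean squared error term
        else:
            raise ValueError(f"unknown loss kind: {kind}")
    return sum(errors) / len(errors)

# Toy data, made up for illustration.
xs = [1.0, 2.0, 3.0]
labels = [2.0, 4.0, 7.0]
print(loss(0.0, 2.0, xs, labels, "mae"))
print(loss(0.0, 2.0, xs, labels, "mse"))
```

With b = 0 and w = 2 the predictions are 2, 4 and 6, so only the last example contributes an error of 1, and both losses come out to 1/3.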

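New loss function

θ: all the unknown parameters
The definition is no different from the earlier loss function: pick a set of values for θ, feed in a set of features, and compute the error e between the estimate y and the label ŷ (y hat). Adding up all the errors gives the loss.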

L(θ)

3. Optimization

Try different combinations of w and b to find the pair that makes the loss smallest.


This course uses gradient descent.


Take the derivative to get the slope, then adjust w in the downhill direction.
The step size is set by you: the learning rate is one of the hyperparameters.
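
The update rule can be sketched for the linear model y = b + w * x with an MSE loss. The choice of MSE, the starting point, the learning rate and the toy data are all assumptions for illustration.

```python
def gradient_descent(xs, labels, lr: float = 0.05, steps: int = 2000):
    """Minimize L(b, w) = (1/n) * sum((b + w*x - y_hat)^2) by gradient descent."""
    w, b = 0.0, 0.0          # arbitrary starting point
    n = len(xs)
    for _ in range(steps):
        # Partial derivatives of the loss with respect to w and b (the slopes)
        grad_w = sum(2 * (b + w * x - y_hat) * x for x, y_hat in zip(xs, labels)) / n
        grad_b = sum(2 * (b + w * x - y_hat) for x, y_hat in zip(xs, labels)) / n
        # Step downhill; lr (the learning rate) controls the step size
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

# Toy data generated from y = 1 + 2x, so the minimum is at w = 2, b = 1.
xs = [0.0, 1.0, 2.0, 3.0]
labels = [1.0, 3.0, 5.0, 7.0]
w, b = gradient_descent(xs, labels)
print(round(w, 4), round(b, 4))
```

A learning rate that is too large makes the updates overshoot and diverge, while one that is too small makes progress slow, which is why it has to be tuned by hand.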


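Possible optimization issue

Gradient descent may not be strong enough to find the global minimum.
It may only find a local minimum! (Fishing for a needle in the sea: the needle is in the sea, but you cannot fish it out.)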


New optimization

Still gradient descent


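    • gradient: the derivative of the loss function with respect to the parameters (the notion of slope)
  • Try different values of θ to find θ*, the one that makes the loss smallest

    • Computing gradients: keep updating until you decide to stop, or until the gradient g is 0 and no further update is possible
    • Gradient descent is mainly used when there are many parameters; with only a few, brute-force enumeration is feasible. This example has many parameters, collected into θ, so gradient descent is the only practical choice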


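  • Split the data into batches and compute a loss on each: L1, L2, L3, ...
  • Q: Why split into multiple batches? To be explained next week.
    Each time the parameters are updated: one update
    Seeing every batch once: one epoch

    The number of updates in one epoch depends on the batch size you set!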
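
The relationship between batches, updates and epochs can be sketched as a training loop that counts its updates. The linear model, the MSE gradient, the fixed batch order (real training usually shuffles each epoch) and the toy data are assumptions for illustration.

```python
def make_batches(examples, batch_size: int):
    """Split the examples into consecutive batches (the last one may be smaller)."""
    return [examples[i:i + batch_size] for i in range(0, len(examples), batch_size)]

def train(examples, batch_size: int, epochs: int, lr: float = 0.005):
    w, b = 0.0, 0.0
    updates = 0
    for _ in range(epochs):                       # one epoch = every batch seen once
        for batch in make_batches(examples, batch_size):
            # Gradient of the loss computed on this batch only
            grad_w = sum(2 * (b + w * x - y_hat) * x for x, y_hat in batch) / len(batch)
            grad_b = sum(2 * (b + w * x - y_hat) for x, y_hat in batch) / len(batch)
            w -= lr * grad_w
            b -= lr * grad_b
            updates += 1                          # one parameter update per batch
    return w, b, updates

# 10 toy examples from y = 1 + 2x.
examples = [(float(x), 1.0 + 2.0 * x) for x in range(10)]
_, _, n_updates = train(examples, batch_size=2, epochs=3)
print(n_updates)  # 5 batches per epoch * 3 epochs = 15 updates
```

With the same 10 examples, a batch size of 10 gives 1 update per epoch and a batch size of 1 gives 10.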

Other variant: modifying the model

  • Multiple layers: apply the activation function repeatedly, but note that each layer has its own parameters
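
Stacking layers can be sketched as a loop in which every layer applies its own weights and biases before the activation. The function names, the sigmoid activation and the layer sizes are assumptions for illustration.

```python
import math

def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))

def layer(x, W, b_vec):
    """One hidden layer: a[i] = sigmoid(b_vec[i] + sum_j W[i][j] * x[j])."""
    return [sigmoid(b_i + sum(w_ij * x_j for w_ij, x_j in zip(row, x)))
            for row, b_i in zip(W, b_vec)]

def deep_forward(x, hidden_layers, c, b):
    """Feed x through every hidden layer in turn, then combine with c and b."""
    a = x
    for W, b_vec in hidden_layers:      # each layer has its own W and b_vec
        a = layer(a, W, b_vec)
    return b + sum(c_i * a_i for c_i, a_i in zip(c, a))

# Hypothetical network: 2 features -> 3 units -> 2 units -> 1 output.
hidden_layers = [
    ([[0.5, -0.5], [1.0, 0.0], [0.0, 1.0]], [0.0, 0.1, -0.1]),
    ([[1.0, -1.0, 0.5], [0.2, 0.2, 0.2]], [0.0, 0.0]),
]
print(deep_forward([1.0, 2.0], hidden_layers, c=[1.0, -1.0], b=0.25))
```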

Summary:

  • Deep learning = a neural network with multiple hidden layers

  • Networks get stacked deeper and deeper

    • Why a deep network rather than a fat (wide) network?

overfitting


  • You can survey regularization on your own
  • Early stopping and dropout will be taught later
  • But the model must not be constrained too much either, or you are back to model bias
    • bias-complexity trade-off

cross validation


HW1:Regression

Implemented with PyTorch
Other people's notes