# ML(Hung-yi Lee)_Lecture01. Machine Learning Introduction
###### tags: `Machine Learning`
## What is ML?
1. Looking for a function
2. Types of tasks
* Regression : The function outputs a scalar (a single numeric value).
* Classification : Given options (classes), the function outputs the correct one.
* Structured Learning (Generation) : Create something with structure (e.g., an image or a document).
## How does a machine find the function?
**1. Write a function with unknown parameters**
(Guess what the formula looks like -> this function is called the Model.)
**2. Define Loss from training data**
(Loss is a function of the parameters -> it measures how good a set of values is.)
* How to calculate Loss : using the Training Data, put the inputs you already know into the function and check whether the output matches the label (the label is the ground-truth value).
* After calculating the Loss (L), you know whether the unknown parameters are good or bad (a larger L means the chosen parameters are worse; smaller is better).
There are two common ways to compute the Loss (L) :
1. e (the gap between the estimate and the true value) = $|y - \hat{y}|$ => Mean Absolute Error (MAE / L1 error)
    * Simple to compute; more **robust** to outliers (the result stays well-behaved under perturbed or uncertain data)
2. e = $(y - \hat{y})^2$ => Mean Squared Error (MSE / L2 error)
    * Penalizes large errors more heavily and is differentiable everywhere
* If $y$ and $\hat{y}$ are both probability distributions $\Longrightarrow$ use Cross-entropy
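The two loss definitions above can be sketched in a few lines of NumPy (a minimal illustration; the array values here are made up):

```python
import numpy as np

y_hat = np.array([3.0, -0.5, 2.0, 7.0])   # ground-truth labels
y     = np.array([2.5,  0.0, 2.0, 8.0])   # model predictions

e = y - y_hat                 # per-example error
mae = np.mean(np.abs(e))      # Mean Absolute Error (L1)
mse = np.mean(e ** 2)         # Mean Squared Error  (L2)

print(mae)  # 0.5
print(mse)  # 0.375
```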
#### Error Surface
* A contour plot of the Loss evaluated at many different parameter values
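An error surface can be built by evaluating the loss on a grid of (w, b) values (a minimal sketch; the tiny dataset is invented, generated from b = 1, w = 2 so the minimum is known):

```python
import numpy as np

# Tiny made-up dataset for the model y = b + w * x
x = np.array([1.0, 2.0, 3.0])
y_true = np.array([3.0, 5.0, 7.0])   # generated from b = 1, w = 2

ws = np.linspace(0, 4, 81)
bs = np.linspace(-1, 3, 81)
W, B = np.meshgrid(ws, bs)

# MAE loss at every (w, b) grid point -> this 2-D array IS the error surface
L = np.mean(np.abs(B[..., None] + W[..., None] * x - y_true), axis=-1)

i, j = np.unravel_index(L.argmin(), L.shape)
print(W[i, j], B[i, j])   # grid point closest to the true minimum (w≈2, b≈1)
```

Plotting `L` as contour lines over `W` and `B` gives exactly the contour picture described above.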

**3. Optimization (finding the best set of parameters)**

#### Gradient Descent (a method for optimization) => may get stuck in a local minimum and never reach the global minimum
Local Minimum : any point that is merely lower than its immediate neighbors on both sides
Global Minimum : the point where the Loss is truly smallest (the lowest point of the entire Error Surface)
* Differentiate the Loss to get the slope, then use it to search for the best unknown parameters
1. if slope > 0 => decrease the unknown parameter
2. if slope < 0 => increase the unknown parameter
* $\eta$ = Learning Rate (a value you set yourself; such settings are called hyperparameters)
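The update rule above (move against the sign of the slope, scaled by $\eta$) can be sketched on a one-parameter toy problem; the loss $L(w) = (w - 3)^2$ is invented here so the answer is known in advance:

```python
# Gradient descent on L(w) = (w - 3)^2, whose minimum is at w = 3.
def dL_dw(w):
    return 2 * (w - 3)         # slope of the loss at the current w

w = 0.0                        # initial guess for the unknown parameter
eta = 0.1                      # learning rate (a hyperparameter)
for _ in range(100):
    w = w - eta * dL_dw(w)     # slope > 0 -> w decreases; slope < 0 -> w increases

print(round(w, 4))  # → 3.0  (converges to the minimum)
```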
## Linear Model
$y = b + w * x$ (b = bias / w = weight / x = feature)
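Putting the three steps together for this linear model — write $y = b + w \cdot x$ with unknown $b, w$, use MSE as the loss, run gradient descent — gives a minimal sketch (the data is generated artificially from b = 1, w = 2):

```python
import numpy as np

x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = 1.0 + 2.0 * x              # labels generated from b = 1, w = 2

b, w = 0.0, 0.0                # step 1: model y = b + w * x, unknown b, w
eta = 0.05                     # learning rate
for _ in range(2000):
    e = (b + w * x) - y        # step 2: MSE loss L = mean(e^2)
    b -= eta * np.mean(2 * e)          # step 3: gradient descent on b
    w -= eta * np.mean(2 * e * x)      #         and on w

print(round(b, 3), round(w, 3))  # → 1.0 2.0
```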
## Overfitting
* A more complex model can achieve better results on the Training Data, but does not necessarily do better on the Testing Data.
* Regularization does not need to include the bias term, because penalizing it only shifts the function up or down and has nothing to do with smoothness.
* Overfitting = the error mainly comes from Variance
* Underfitting = the error mainly comes from Bias
#### How to tell whether your error comes from large Bias or large Variance
1. Large Bias => your model cannot even fit the training examples
2. Large Variance (Overfitting) => you can fit the training data, but the error on the testing data is large

* **How to solve it?**
* For large Bias, redesign your model:
1. Add more features as input
2. Use a more complex model
* For large Variance :
1. Add more data (very effective, but not always practical)
2. Regularization
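Regularization adds a penalty $\lambda \sum w^2$ to the loss so the learned function stays smooth; note the bias term is left out of the penalty, as mentioned above. A minimal ridge-regression sketch (closed form, with invented noisy data and an intentionally over-flexible polynomial model):

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0, 1, 20)
y = 1.0 + 2.0 * x + rng.normal(0, 0.1, size=x.shape)  # noisy line

# Degree-5 polynomial features -> a flexible model prone to overfitting
X = np.vander(x, 6, increasing=True)   # columns: 1, x, x^2, ..., x^5

lam = 0.1
# L2 penalty on every weight EXCEPT the bias (the column of ones)
P = lam * np.eye(X.shape[1])
P[0, 0] = 0.0
theta = np.linalg.solve(X.T @ X + P, X.T @ y)
print(theta)   # penalized weights shrink toward 0; the bias stays free
```

Compared with the unregularized least-squares fit, the penalized weights are smaller, which is exactly the "smoother curve, smaller variance" effect the notes describe.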