# Machine Learning
###### tags: `DLRM`
## Overview

## Supervised Learning
Supervised:"right answers" given
1. Regression:Predict continuous valued output
1. Classification:Discrete valued output (0 or 1)
### Linear Regression
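The hypothesis is a linear function of the features, and the cost is the squared error over the $m$ training examples (taking $x_0 = 1$ so that $\theta_0$ acts as the intercept):

$$h_\theta(x) = \theta^T x = \theta_0 + \theta_1 x_1 + \cdots + \theta_n x_n$$

$$J(\theta) = \frac{1}{2m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2$$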




#### feature scaling and mean normalization
We can speed up gradient descent by having each of our input values in roughly the same range.
$$x_i := \frac{x_i - \mu_i}{s_i}$$
where $\mu_i$ is the average of all the values for feature $i$ and $s_i$ is the range of values (max − min), or alternatively the standard deviation.
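A minimal NumPy sketch of both choices of $s_i$ (the feature values are made up for illustration):

```python
import numpy as np

# Made-up feature values for illustration (e.g. house size, bedrooms).
X = np.array([[2104.0, 5.0],
              [1416.0, 3.0],
              [1534.0, 3.0],
              [852.0,  2.0]])

mu = X.mean(axis=0)                      # mu_i: mean of each feature
s_range = X.max(axis=0) - X.min(axis=0)  # s_i as the range (max - min)
s_std = X.std(axis=0)                    # or s_i as the standard deviation

X_norm_range = (X - mu) / s_range
X_norm_std = (X - mu) / s_std
```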
#### Gradient Descent




Declare convergence if $J(\theta)$ decreases by less than $10^{-3}$ in one iteration.
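For linear regression the update rule, applied simultaneously to every $\theta_j$, is:

$$\theta_j := \theta_j - \alpha \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) x_j^{(i)}$$

A minimal NumPy sketch of batch gradient descent under these definitions (the function name and interface are illustrative):

```python
import numpy as np

def gradient_descent(X, y, alpha=0.01, tol=1e-3, max_iters=10000):
    """Batch gradient descent for linear regression.

    X: m x (n+1) design matrix (first column all ones), y: m targets.
    Stops when J(theta) decreases by less than tol in one iteration,
    matching the convergence test above.
    """
    m, n = X.shape
    theta = np.zeros(n)
    prev_cost = np.inf
    for _ in range(max_iters):
        errors = X @ theta - y                 # h_theta(x^(i)) - y^(i)
        cost = errors @ errors / (2 * m)       # J(theta)
        if prev_cost - cost < tol:
            break
        prev_cost = cost
        theta -= (alpha / m) * (X.T @ errors)  # simultaneous update of all theta_j
    return theta
```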
##### learning rate

If $\alpha$ is too small: slow convergence.
If $\alpha$ is too large: $J(\theta)$ may not decrease on every iteration; it may not converge.
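In practice one can try a geometric range of values such as $\ldots, 0.001, 0.003, 0.01, 0.03, 0.1, 0.3, 1, \ldots$ (each roughly 3× the previous), plot $J(\theta)$ against the number of iterations for each, and keep the largest $\alpha$ that still decreases $J(\theta)$ on every iteration.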
#### Normal equation
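The closed-form solution that minimizes $J(\theta)$, obtained by solving $\nabla_\theta J(\theta) = 0$:

$$\theta = (X^T X)^{-1} X^T y$$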

##### The X matrix and y vector
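With $m$ training examples and $n$ features, the design matrix stacks each example $(x^{(i)})^T$ as a row, with a leading 1 for the intercept term, and $y$ collects the targets:

$$X = \begin{bmatrix} 1 & x_1^{(1)} & \cdots & x_n^{(1)} \\ \vdots & \vdots & & \vdots \\ 1 & x_1^{(m)} & \cdots & x_n^{(m)} \end{bmatrix}, \qquad y = \begin{bmatrix} y^{(1)} \\ \vdots \\ y^{(m)} \end{bmatrix}$$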


##### What if $X^TX$ is non-invertible?
1. Redundant features (linearly dependent).
2. Too many features (e.g., $m \le n$).
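In this situation the pseudoinverse still gives a usable solution; a minimal NumPy sketch:

```python
import numpy as np

def normal_equation(X, y):
    """theta = pinv(X^T X) X^T y.

    The Moore-Penrose pseudoinverse (np.linalg.pinv) still yields a
    usable theta when X^T X is non-invertible, e.g. because of
    linearly dependent features.
    """
    return np.linalg.pinv(X.T @ X) @ X.T @ y
```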
#### Gradient Descent vs. Normal equation
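The usual trade-offs:

| Gradient Descent | Normal Equation |
| --- | --- |
| Need to choose $\alpha$ | No need to choose $\alpha$ |
| Needs many iterations | No iterations |
| Works well even when $n$ is large | Slow if $n$ is very large, since computing $(X^TX)^{-1}$ costs roughly $O(n^3)$ |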

### Classification
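Here $y$ takes one of a small number of discrete values. Fitting ordinary linear regression and thresholding its output works poorly for classification (a single outlier can shift the threshold), which motivates logistic regression.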

### Logistic regression
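The hypothesis passes a linear combination of the features through the sigmoid (logistic) function, so its output can be read as $P(y = 1 \mid x; \theta)$:

$$h_\theta(x) = g(\theta^T x) = \frac{1}{1 + e^{-\theta^T x}}$$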



#### boundary
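Predict $y = 1$ when $h_\theta(x) \ge 0.5$, which is exactly when $\theta^T x \ge 0$; the decision boundary is therefore the set of points where $\theta^T x = 0$.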


#### cost function
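The squared-error cost is non-convex for the sigmoid hypothesis, so logistic regression uses the log (cross-entropy) cost instead:

$$J(\theta) = -\frac{1}{m} \sum_{i=1}^{m} \left[ y^{(i)} \log h_\theta(x^{(i)}) + (1 - y^{(i)}) \log\left(1 - h_\theta(x^{(i)})\right) \right]$$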

#### gradient descent
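The update rule has the same form as for linear regression; only the definition of $h_\theta$ differs:

$$\theta_j := \theta_j - \alpha \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) x_j^{(i)}$$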

#### Advanced optimization
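Algorithms such as conjugate gradient, BFGS, and L-BFGS pick the step size automatically and often converge faster than plain gradient descent. A minimal sketch using `scipy.optimize.minimize` (the toy data and helper functions are made up for illustration):

```python
import numpy as np
from scipy.optimize import minimize

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cost(theta, X, y):
    # Logistic regression cost J(theta) from the section above.
    h = sigmoid(X @ theta)
    return -np.mean(y * np.log(h) + (1 - y) * np.log(1 - h))

def grad(theta, X, y):
    # Gradient of J: (1/m) * X^T (h_theta(x) - y).
    return X.T @ (sigmoid(X @ theta) - y) / len(y)

# Toy, non-separable data so the optimum is finite (made up for illustration).
X = np.array([[1.0, 0.5], [1.0, 1.5], [1.0, 2.5], [1.0, 3.5]])
y = np.array([0.0, 1.0, 0.0, 1.0])

result = minimize(cost, np.zeros(X.shape[1]), args=(X, y), jac=grad, method="BFGS")
theta = result.x  # optimized parameters; no learning rate had to be chosen
```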

#### One-vs-all
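For $K > 2$ classes, train one binary logistic regression classifier $h_\theta^{(k)}(x)$ per class $k$, treating class $k$ as positive and all other classes as negative, so that each classifier estimates $P(y = k \mid x; \theta)$. To classify a new input $x$, pick the class whose classifier outputs the highest probability:

$$\text{prediction} = \arg\max_k \, h_\theta^{(k)}(x)$$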

### underfitting and overfitting
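Underfitting (high bias): the hypothesis is too simple to capture the structure of the data. Overfitting (high variance): the hypothesis fits the training set very well, often because of too many features, but fails to generalize to new examples. Two standard remedies for overfitting are reducing the number of features and regularization (below).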



#### regularization
##### linear regression
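The regularized cost adds a penalty on the magnitude of the parameters (by convention $\theta_0$ is not penalized):

$$J(\theta) = \frac{1}{2m} \left[ \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2 + \lambda \sum_{j=1}^{n} \theta_j^2 \right]$$

which changes the gradient descent update for $j \ge 1$ to:

$$\theta_j := \theta_j \left( 1 - \alpha \frac{\lambda}{m} \right) - \alpha \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) x_j^{(i)}$$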


##### Normal equation
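The regularized closed-form solution, where $L$ is the identity matrix with its top-left entry zeroed so that $\theta_0$ is not penalized:

$$\theta = \left( X^T X + \lambda L \right)^{-1} X^T y, \qquad L = \begin{bmatrix} 0 & & & \\ & 1 & & \\ & & \ddots & \\ & & & 1 \end{bmatrix}$$

This also takes care of non-invertibility: for $\lambda > 0$, $X^TX + \lambda L$ is invertible.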


##### logistic regression
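The same penalty term is added to the logistic regression cost:

$$J(\theta) = -\frac{1}{m} \sum_{i=1}^{m} \left[ y^{(i)} \log h_\theta(x^{(i)}) + (1 - y^{(i)}) \log\left(1 - h_\theta(x^{(i)})\right) \right] + \frac{\lambda}{2m} \sum_{j=1}^{n} \theta_j^2$$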


## Neural Network
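A minimal NumPy sketch of a fully vectorized forward pass for one hidden layer, using the usual notation ($a^{(l)}$ for activations, $\Theta^{(l)}$ for weight matrices, sigmoid activations; the weight shapes below are assumptions for illustration):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(x, Theta1, Theta2):
    """Forward propagation for a 3-layer network (input, hidden, output).

    Theta1: (hidden_units, n_input + 1), Theta2: (n_output, hidden_units + 1);
    the +1 columns multiply the bias units.
    """
    a1 = np.concatenate(([1.0], x))            # input layer + bias unit
    a2 = np.concatenate(([1.0], sigmoid(Theta1 @ a1)))  # hidden activations + bias
    return sigmoid(Theta2 @ a2)                # h_Theta(x), the output layer

# Tiny usage example with made-up shapes: 2 inputs, 3 hidden units, 1 output.
Theta1 = np.random.randn(3, 3) * 0.1
Theta2 = np.random.randn(1, 4) * 0.1
h = forward(np.array([0.5, -1.2]), Theta1, Theta2)
```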










### Backpropagation algorithm
Forward propagation:

$$a^{(1)} = x, \qquad z^{(l+1)} = \Theta^{(l)} a^{(l)}, \qquad a^{(l+1)} = g(z^{(l+1)})$$

(adding the bias unit $a_0^{(l)} = 1$ to each layer before multiplying by $\Theta^{(l)}$).

Backpropagation:

$$\delta^{(L)} = a^{(L)} - y, \qquad \delta^{(l)} = \left( (\Theta^{(l)})^T \delta^{(l+1)} \right) \odot g'(z^{(l)}), \qquad g'(z^{(l)}) = a^{(l)} \odot (1 - a^{(l)})$$

$$\frac{\partial}{\partial \Theta_{ij}^{(l)}} J(\Theta) = a_j^{(l)} \delta_i^{(l+1)} \qquad \text{(ignoring regularization)}$$

Note: there is no $\delta^{(1)}$; the input layer is the observed data, so no error term is associated with it.
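A sketch of the corresponding gradient computation for the 3-layer network from the forward-propagation sketch above (one training example; the weight shapes are the same assumptions as there):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def backprop(x, y, Theta1, Theta2):
    """Gradients of the (unregularized) cost for one training example."""
    # Forward pass, keeping intermediate activations.
    a1 = np.concatenate(([1.0], x))                      # input + bias unit
    a2 = np.concatenate(([1.0], sigmoid(Theta1 @ a1)))   # hidden + bias unit
    a3 = sigmoid(Theta2 @ a2)                            # output h_Theta(x)

    # Backward pass: error terms.
    delta3 = a3 - y                                      # delta^(L) = a^(L) - y
    delta2 = (Theta2.T @ delta3)[1:] * a2[1:] * (1.0 - a2[1:])  # drop bias row

    # partial J / partial Theta^(l)_ij = a^(l)_j * delta^(l+1)_i
    return np.outer(delta2, a1), np.outer(delta3, a2)
```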

### Gradient checking
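Approximate each partial derivative numerically and compare it against the backpropagation gradients:

$$\frac{\partial}{\partial \theta_i} J(\theta) \approx \frac{J(\theta + \epsilon e_i) - J(\theta - \epsilon e_i)}{2\epsilon}, \qquad \epsilon \approx 10^{-4}$$

A minimal sketch (J is any callable returning the scalar cost):

```python
import numpy as np

def numerical_gradient(J, theta, eps=1e-4):
    """Two-sided finite-difference approximation of dJ/dtheta.

    Compare the result against the backpropagation gradients to verify
    them, then turn this off: it is far too slow for actual training.
    """
    grad = np.zeros_like(theta)
    for i in range(theta.size):
        theta_plus, theta_minus = theta.copy(), theta.copy()
        theta_plus[i] += eps
        theta_minus[i] -= eps
        grad[i] = (J(theta_plus) - J(theta_minus)) / (2 * eps)
    return grad
```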

### Random initialization
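Initializing all weights to zero makes every hidden unit compute the same function (symmetry), so the network never learns distinct features. Instead, initialize each $\Theta_{ij}^{(l)}$ uniformly in $[-\epsilon_{\text{init}}, \epsilon_{\text{init}}]$; a sketch:

```python
import numpy as np

def random_init(rows, cols, eps_init=0.12):
    """Uniform weights in [-eps_init, eps_init] to break symmetry.

    eps_init = 0.12 is a conventional small value; any small epsilon works.
    """
    return np.random.uniform(-eps_init, eps_init, size=(rows, cols))
```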

### Training a neural network
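A typical end-to-end recipe combining the pieces above:
1. Randomly initialize the weights.
2. Forward propagate to get $h_\Theta(x^{(i)})$ for every training example.
3. Compute the cost $J(\Theta)$.
4. Backpropagate to get the partial derivatives $\frac{\partial}{\partial \Theta_{ij}^{(l)}} J(\Theta)$.
5. Use gradient checking to confirm the backpropagation gradients, then disable it.
6. Minimize $J(\Theta)$ with gradient descent or an advanced optimizer. Note that $J(\Theta)$ is non-convex, so this finds a local minimum, which usually works well in practice.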


## Unsupervised Learning
Unsupervised: no labels given; find structure in the data (e.g., clustering).

