Week 6 - Neural Networks
===
###### [Source](https://www.youtube.com/watch?v=jjV0GnfWDZg&t=2s)
{%hackmd theme-dark %}
Basics
---
Neural networks are used when we can't draw a straight line (a linear decision boundary) to separate two classes.

Some problems, such as XOR, cannot be separated using linear models like linear regression; they require a more complex, non-linear decision boundary.
A deep neural network is a neural network with more than one hidden layer (layers other than the input and output layers).
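XOR is the classic illustration. As a minimal sketch (the step activations and hand-picked weights below are my own choices, not from the video), a network with one hidden layer can compute XOR even though no single linear boundary separates its classes:

```python
import numpy as np

def step(z):
    # Heaviside step activation: 1 if z > 0, else 0
    return (z > 0).astype(int)

def xor_net(x1, x2):
    x = np.array([x1, x2])
    # Hidden layer: h[0] acts as OR(x1, x2), h[1] acts as AND(x1, x2)
    h = step(np.array([[1, 1], [1, 1]]) @ x + np.array([-0.5, -1.5]))
    # Output layer: OR minus AND gives XOR
    return step(np.array([1, -1]) @ h - 0.5)

for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(a, b, "->", xor_net(a, b))  # prints 0, 1, 1, 0
```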

Each layer consists of nodes (neurons), with each node computing a weighted linear combination of its inputs.
Neuron
---
Each neuron's output is a linear combination of all of its inputs. This linear combination is passed through an activation function, $g$, to form the node's actual output, $\hat{y}$.

Perceptron: Forward Propagation
---

We usually add a bias term to the model, which shifts the decision boundary.
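Putting this together, the perceptron's forward pass can be written as (standard notation, with weights $w_i$ and bias $b$; the symbols are mine, not from the video):
$\hat{y} = g\left(\sum_i w_i x_i + b\right) = g(\mathbf{w}^\top \mathbf{x} + b)$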


An example of an activation function is the sigmoid function, where
$g(z) = \sigma(z) = \frac{1}{1+e^{-z}}$
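A minimal NumPy sketch of this forward pass (all values below are made-up examples):

```python
import numpy as np

def sigmoid(z):
    # g(z) = 1 / (1 + e^(-z))
    return 1.0 / (1.0 + np.exp(-z))

def perceptron_forward(x, w, b):
    # Weighted linear combination of the inputs plus bias, then activation
    z = np.dot(w, x) + b
    return sigmoid(z)

x = np.array([0.5, -1.2, 3.0])  # inputs
w = np.array([0.4, 0.1, -0.6])  # weights
b = 0.2                         # bias (shifts the boundary)
print(perceptron_forward(x, w, b))  # ~0.18
```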
### Common Activation Functions

Different activation functions and loss functions are used for different problems/objectives (minimal NumPy versions are sketched below):
- Regression
    - Linear (-∞, +∞) or ReLU [0, +∞)
    - Mean squared error loss
- Binary classification
    - Sigmoid (0, 1): confidence/probability
    - Binary cross-entropy loss
- Categorical (multi-class)
    - Softmax (0, 1) for each class, total sum = 1
    - Cross-entropy loss
- Multi-label categorical
    - Sigmoid (0, 1) for each class independently (softmax's sum-to-one constraint conflicts with multiple labels being active at once)
    - Binary cross-entropy loss (for every class)
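Here are the minimal NumPy versions of these activations referenced above (a sketch, not a library implementation):

```python
import numpy as np

def linear(z):
    return z                           # (-inf, +inf): regression

def relu(z):
    return np.maximum(0, z)            # [0, +inf): regression

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))    # (0, 1): per-class probability

def softmax(z):
    e = np.exp(z - np.max(z))          # subtract max for numerical stability
    return e / e.sum()                 # each in (0, 1), outputs sum to 1

print(softmax(np.array([2.0, 1.0, 0.1])))  # [0.659 0.242 0.099]
```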
Training a neural network means finding the optimal weights (and biases).
### Importance of Activation Functions
The main purpose of activation functions is to introduce non-linearity into the network, so that the model can learn more complex decision boundaries. Without non-linear activations, a stack of layers would collapse into a single linear transformation.
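A composition of linear maps is itself linear, so this collapse is easy to check numerically (the shapes below are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
W1 = rng.normal(size=(4, 3))  # "layer 1" weights
W2 = rng.normal(size=(2, 4))  # "layer 2" weights
x = rng.normal(size=3)

# Two stacked linear layers with no activation in between...
two_layers = W2 @ (W1 @ x)
# ...are equivalent to a single linear layer with W = W2 @ W1
one_layer = (W2 @ W1) @ x
print(np.allclose(two_layers, one_layer))  # True
```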