Week 6 - Neural Networks
===
###### [Source](https://www.youtube.com/watch?v=jjV0GnfWDZg&t=2s)

{%hackmd theme-dark %}

Basics
---
Neural networks are used when we can't draw a straight line to separate two classes.

![](https://i.imgur.com/zrrPb88.png)

Some problems, like XOR, cannot be separated by a linear model; they require a more complex, non-linear (e.g., polynomial) decision boundary.

A deep neural network is a neural network with more than one hidden layer (i.e., excluding the input and output layers).

![](https://i.imgur.com/wljozfm.png)

Each layer consists of nodes/neurons, with each node being a weighted linear combination of its inputs.

Neuron
---
Each neuron's output is a linear combination of all its inputs. This linear combination is put through an activation function, $g$, to form the actual output ($\hat{y}$) of the node.

![](https://i.imgur.com/wETq0S1.png)

Perceptron: Forward Propagation
---
![](https://i.imgur.com/eGVXBcJ.png)

We usually add a bias to the model, which shifts the model's output.

![](https://i.imgur.com/PSIIkCL.png)
![](https://i.imgur.com/lwMF5EV.png)

An example of an activation function is the sigmoid function, where $g(z) = \sigma(z) = \frac{1}{1+e^{-z}}$.

### Common Activation Functions
![](https://i.imgur.com/RggMckj.png)

Different activation functions and losses are used for different problems/objectives (sketches of these appear at the end of these notes):
- Regression
    - Linear (-∞, +∞) or ReLU (0, +∞)
    - Mean squared error loss
- Binary classification
    - Sigmoid (0, 1): confidence/probability
    - Binary cross-entropy loss
- Categorical (single label)
    - Softmax (0, 1) for each class, total sum = 1
    - Cross-entropy loss
- Multi-label categorical
    - Sigmoid (0, 1) for each class, independently
    - Binary cross-entropy loss (for every class)

Training a neural network means finding the optimal weights.

### Importance of Activation Functions
The main purpose of activation functions is to introduce non-linearities into the network, so that the model can learn more complex shapes.
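To make the forward propagation above concrete, here is a minimal NumPy sketch of a single neuron: a weighted sum of the inputs plus a bias, pushed through a sigmoid activation. The weights, bias, and inputs are made up purely for illustration.

```python
import numpy as np

def sigmoid(z):
    """Sigmoid activation: squashes any real z into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def neuron_forward(x, w, b, g=sigmoid):
    """One neuron: weighted sum of inputs plus bias, passed through g."""
    z = np.dot(w, x) + b   # linear combination z = w·x + b
    return g(z)            # y_hat = g(z)

# Illustrative (made-up) weights and inputs
x = np.array([2.0, -1.0])   # two input features
w = np.array([0.5, 0.3])    # one weight per input
b = 0.1                     # bias shifts the decision boundary

y_hat = neuron_forward(x, w, b)
print(y_hat)  # ≈ 0.69, since sigmoid(0.5*2 + 0.3*(-1) + 0.1) = sigmoid(0.8)
```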
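The common activation functions from the table above are each a one-liner. A small sketch follows; the shift by the max inside softmax is a standard numerical-stability trick, not something specific to these notes.

```python
import numpy as np

def linear(z):
    """Identity: output in (-inf, +inf), used for regression."""
    return z

def relu(z):
    """ReLU: clips negatives to 0, output in [0, +inf)."""
    return np.maximum(0.0, z)

def sigmoid(z):
    """Sigmoid: output in (0, 1), read as a confidence/probability."""
    return 1.0 / (1.0 + np.exp(-z))

def softmax(z):
    """Softmax: one value in (0, 1) per class, summing to 1."""
    e = np.exp(z - np.max(z))  # subtract max to avoid overflow
    return e / np.sum(e)

z = np.array([2.0, -1.0, 0.5])
print(softmax(z), softmax(z).sum())  # probabilities over 3 classes, sum = 1.0
```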
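Finally, a quick check of why the non-linearity matters: if $g$ is dropped, stacking layers collapses into a single linear layer, so the network can never learn a shape like XOR, no matter how deep it is. A minimal sketch with made-up layer sizes and random weights:

```python
import numpy as np

rng = np.random.default_rng(0)

# Two stacked layers with NO activation (made-up shapes and weights)
W1, b1 = rng.normal(size=(4, 3)), rng.normal(size=4)  # layer 1: 3 -> 4
W2, b2 = rng.normal(size=(2, 4)), rng.normal(size=2)  # layer 2: 4 -> 2

x = rng.normal(size=3)
deep = W2 @ (W1 @ x + b1) + b2   # "two-layer" network without g

# The same map expressed as a single linear layer
W, b = W2 @ W1, W2 @ b1 + b2
shallow = W @ x + b

print(np.allclose(deep, shallow))  # True: without g, depth adds nothing
```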