---
slideOptions:
spotlight:
enabled: true
---
---
# Introduction to Deep Learning (DL)
by Sumit Sah
---
## Overview
- Example Applications
- Basic Motivation for Neural Networks
- What is DL?
- Why do you need DL?
- Some Applications
---
## Motivation through examples
- **Problem:** Find if the incoming email is SPAM or NOT SPAM
- Feed the email to a computer, and it should tell you whether it is spam or not
- What characterizes spam/non-spam emails?
- Appearance of specific words
- Keep a bag/dictionary of words and count the occurrence of each of them
- By the way, throw away words like "if", "the", "of", "or", "is", etc.
- These counts are called features
- Collectively, this bunch of numbers is called a feature vector
- Typically, the feature vector has $1000$ entries, one per word in the bag of words (dictionary); see the sketch below
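
A minimal sketch of turning an email into such a feature vector; the dictionary, stop-word list, and example email below are made up purely for illustration:

```python
from collections import Counter

STOP_WORDS = {"if", "the", "of", "or", "is"}                    # words we throw away
DICTIONARY = ["free", "money", "meeting", "project", "winner"]  # toy bag of words

def feature_vector(email_text):
    """Count how often each dictionary word appears in the email."""
    words = [w.strip(".,!?") for w in email_text.lower().split()]
    words = [w for w in words if w and w not in STOP_WORDS]
    counts = Counter(words)
    return [counts[w] for w in DICTIONARY]

print(feature_vector("Free money! You are the winner of free money"))  # [2, 2, 0, 0, 1]
```

In a real system the dictionary would have on the order of $1000$ words, so the feature vector would have $1000$ entries.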
---
## How do we learn?
- Humans use experience (observations) to learn
- Machines should use data to learn
- Data consists of a lot of emails, and somebody should label them!
- Data: $\{\textbf{x}_i, y_i\}, i=1,2,\ldots, n$; $\textbf{x}_i$ is the feature vector for the $i$-th training sample and $y_i \in \{-1,+1\}$ is its label
- Assume $\textbf{x}_i$ has two components and plot the points, as in the sketch below
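
A minimal sketch of such a labelled dataset with two-component feature vectors; the numbers and labels are invented just for plotting:

```python
import numpy as np
import matplotlib.pyplot as plt

# Labelled data {x_i, y_i}: each x_i has two components, y_i is +1 (spam) or -1 (not spam)
X = np.array([[2.0, 3.0], [3.0, 2.5], [0.5, 1.0], [1.0, 0.5]])
y = np.array([+1, +1, -1, -1])

plt.scatter(X[y == +1, 0], X[y == +1, 1], marker="x", label="spam (+1)")
plt.scatter(X[y == -1, 0], X[y == -1, 1], marker="o", label="not spam (-1)")
plt.xlabel("feature 1")
plt.ylabel("feature 2")
plt.legend()
plt.show()
```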
---
## Classification (Linear versus non-linear)
- Problem: Find if the incoming email is SPAM or NOT SPAM
- Find a line that separates the two sets of points

- In the real world, we are not so lucky: the data are often not linearly separable (see the toy sketch below)
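
A toy illustration of the lucky and unlucky cases (the data points are invented):

```python
import numpy as np

# Lucky case: the line x1 + x2 = 3 separates the two sets of points perfectly
X_easy = np.array([[2, 3], [3, 2], [0, 1], [1, 0]])
y_easy = np.array([+1, +1, -1, -1])
print(np.all(np.sign(X_easy.sum(axis=1) - 3) == y_easy))  # True

# Unlucky (XOR-like) case: no single line classifies all four points correctly
X_hard = np.array([[0, 0], [1, 1], [0, 1], [1, 0]])
y_hard = np.array([+1, +1, -1, -1])
```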
---
## Linear/non-linear Classifier: How do we learn?
- Learn a line or a plane that separates the two classes of points using training data
- How do we know if the line/plane is good?
- Loss functions: indicator (if correct $0$, else $1$), squared error, cross-entropy loss and many more
- Learn with respect to the loss
- $\min_{LINE} \frac{1}{n} \sum_{i=1}^n \underbrace{\mathbf{1}\{LINE(\textbf{x}_i) \neq y_i\}}_{\text{error}}$ OR $\min_{f} \frac{1}{n} \sum_{i=1}^n \underbrace{\mathbf{1}\{f(\textbf{x}_i) \neq y_i\}}_{\text{error}}$
- We need to know how to solve this minimization (there are inbuilt functions for it); a minimal sketch follows
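
A minimal sketch of "learn a line and measure the error", using scikit-learn's `Perceptron` as one example of an inbuilt solver; the data are the toy points from the earlier sketch:

```python
import numpy as np
from sklearn.linear_model import Perceptron

X = np.array([[2.0, 3.0], [3.0, 2.5], [0.5, 1.0], [1.0, 0.5]])
y = np.array([+1, +1, -1, -1])

f = Perceptron().fit(X, y)           # learns a line w.x + b = 0 from the training data
errors = np.mean(f.predict(X) != y)  # (1/n) * sum of 1{f(x_i) != y_i}, the indicator loss
print("training error:", errors)     # 0.0 here, since this toy data is linearly separable
```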
