# Types of Machine learning

* ![image](https://hackmd.io/_uploads/BkrOamnWC.png)
* ![image](https://hackmd.io/_uploads/Sy-RPQ3Z0.png)
* ![image](https://hackmd.io/_uploads/SJi6YQ3-R.png)
* ![image](https://hackmd.io/_uploads/HJ2U0Q3ZC.png)

## Supervised (input -> output)

### Regression (output is continuous)

* ==Linear Regression==
* Support Vector Machine (Support Vector Regression)
* K Nearest Neighbours
* Decision Tree
* Random Forest

### Classification (output is discrete)

* ==Logistic Regression==
* ==Naïve Bayes==
* Support Vector Machine (Support Vector Classification)
* K Nearest Neighbours
* Decision Tree
* Random Forest

## Unsupervised (learns patterns and structures from unlabeled data)

### Clustering

* K-Means
    * Identify groups of data points that are similar to each other within the same cluster while being different from data points in other clusters.

### Dimension reduction

* Principal Component Analysis (PCA)
    * Find the best way to tell a story using just a few important pictures, instead of showing every single detail and tons of words.

## Reinforcement (teaching a robot/dog to play a game: it learns by trying different things, getting rewards when it does well, and figuring out how to do better next time.)
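
As a rough illustration of the supervised and unsupervised categories above (reinforcement learning is not covered here), the sketch below fits one model of each kind with scikit-learn. The choice of datasets (`load_diabetes`, `load_iris`) and the settings `max_iter=1000`, `n_clusters=3`, and `n_components=2` are assumptions made for this example, not part of the notes.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import load_diabetes, load_iris
from sklearn.decomposition import PCA
from sklearn.linear_model import LinearRegression, LogisticRegression

# Supervised, regression: the target is a continuous number.
X_reg, y_reg = load_diabetes(return_X_y=True)
regressor = LinearRegression().fit(X_reg, y_reg)
reg_pred = regressor.predict(X_reg[:3])

# Supervised, classification: the target is a discrete class label.
X_clf, y_clf = load_iris(return_X_y=True)
classifier = LogisticRegression(max_iter=1000).fit(X_clf, y_clf)
clf_pred = classifier.predict(X_clf[:3])

# Unsupervised, clustering: no labels are passed to fit().
# n_clusters=3 is an assumption chosen for this example.
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X_clf)
cluster_ids = kmeans.labels_

# Unsupervised, dimensionality reduction: 4 features compressed to 2.
X_2d = PCA(n_components=2).fit_transform(X_clf)

print("Regression predictions:", reg_pred)
print("Classification predictions:", clf_pred)
print("Shape before and after PCA:", X_clf.shape, X_2d.shape)
```

Note that the supervised models receive both `X` and `y` in `fit`, while K-Means and PCA only receive `X`.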
## [Watch this video to review Supervised and Unsupervised models.](https://youtu.be/yN7ypxC7838?si=bdQkkpQH0B8I0AhF)

# Steps of Machine learning

* Data Collection
* Data Preparation (preprocessing)
    * Handling missing values and data formats
        * ![image](https://hackmd.io/_uploads/S10VzkaWA.png)
        * Country name: Germany, GERMANY, germany, Deutschland, DE, De, de
        * True/false: TRUE, true, True, FALSE, False, false, 1, 0
    * Feature selection
    * Dimensionality reduction
    * Normalization
* Choice of Model
* Training of Model
    * Remember to split the data into a training set and a testing set
* Evaluation of Model
* Parameter Tuning and Optimization
* Predictions and Deployment

# Common Libraries

* [pandas](https://pandas.pydata.org/)
    * Reads and writes CSV, Excel, JSON, SQL, HTML, and other formats
* [NumPy](https://numpy.org/)
    * Scientific computing
* [Matplotlib](https://matplotlib.org/)
    * Plotting library
* [scikit-learn](https://scikit-learn.org/stable/)
    * Machine learning library which offers many supervised and unsupervised algorithms.
    * [Interactive cheat sheet](https://scikit-learn.org/stable/tutorial/machine_learning_map/index.html)

# Ways of importing libraries

```python
# 1. Import the class directly.
from sklearn.linear_model import LinearRegression
model = LinearRegression()

# 2. Import the submodule and access the class through it.
from sklearn import linear_model
model = linear_model.LinearRegression()

# 3. Use the full path. Import the submodule explicitly:
#    a bare `import sklearn` is not guaranteed to load `sklearn.linear_model`.
import sklearn.linear_model
model = sklearn.linear_model.LinearRegression()
```

```python
# 1. Import several names from one module on a single line.
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split, cross_val_score
from sklearn.svm import SVC

# 2. Import each name on its own line.
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

# 3. Import the submodules and use full paths.
#    train_test_split and cross_val_score need arguments, so they are called with data here.
import sklearn.datasets
import sklearn.model_selection
import sklearn.svm

dataset = sklearn.datasets.load_breast_cancer()
X_train, X_test, y_train, y_test = sklearn.model_selection.train_test_split(
    dataset.data, dataset.target
)
model = sklearn.svm.SVC()
scores = sklearn.model_selection.cross_val_score(model, X_train, y_train)
```
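
The imports above are enough to sketch most of the steps listed under "Steps of Machine learning". The example below is a minimal sketch under stated assumptions, not a recommended setup: it uses `StandardScaler` (standardization) as a stand-in for the normalization step, and the 80/20 split, the `random_state` values, and the candidate `C` values are arbitrary choices made for this example.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Data collection: a dataset bundled with scikit-learn.
X, y = load_breast_cancer(return_X_y=True)

# Split into a training set and a testing set (80/20 is an assumption).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y
)

# Data preparation + choice of model: scale the features, then fit an SVC.
# Wrapping both in a pipeline means the scaler is fitted on training data only.
pipeline = make_pipeline(StandardScaler(), SVC())

# Parameter tuning: try a few values of C with 5-fold cross-validation.
# make_pipeline names the SVC step "svc", hence the "svc__C" key.
param_grid = {"svc__C": [0.1, 1, 10]}
search = GridSearchCV(pipeline, param_grid, cv=5)

# Training: fits every candidate, then refits the best one on all training data.
search.fit(X_train, y_train)

# Evaluation: accuracy on data the model has never seen.
test_accuracy = search.score(X_test, y_test)

# Prediction: class labels for the first five test samples.
predictions = search.predict(X_test[:5])

print("Best parameters:", search.best_params_)
print("Test accuracy:", round(test_accuracy, 3))
print("Predictions:", predictions)
```

The testing set is only used once, at the end, so the reported accuracy is not influenced by the tuning step.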