###### tags: `Linear Algebra` `LA01`

# L05 Machine Learning and Knn

---

## From last week

- Norm and distance of vectors
- Triangle inequality
- Intro to the nearest neighbor problem

---

## This week

- Formal introduction to Machine Learning
- Implement the Knn algorithm in full detail
- Evaluate model performance

---

### 1. Machine Learning

![](https://drive.google.com/uc?export=view&id=1YfgiPkBkFb02ZdrR5S7lHn0T2nakwZsX)

----

### 1.1 Definition of Machine Learning from Wiki

> Machine learning (ML) is a field of inquiry devoted to understanding and building methods that 'learn', that is, methods that leverage data to improve performance on some set of tasks.
> -- Wikipedia

---

### 2. Our 1st Supervised Learning Algorithm, Knn

![](https://drive.google.com/uc?export=view&id=13ohyXSV_ci0me-aJloueODWWC3eX-nTO)

----

### 2.1 A more formal introduction to Knn:

> k-NN, an object is classified by a plurality vote of its neighbors, with the object being assigned to the class most common among its k nearest neighbors. If k = 1, then the object is simply assigned to the class of that single nearest neighbor.
> -- Wikipedia

---

### 3. How does Knn work?

We already know most of it from last week!

- Point-wise distance calculation in Python (NumPy)
- From the distance array, pick out the k smallest values together with their indices

We will go through the full implementation in detail this week.

----

### 3.1 A more formal introduction to Knn:

* Choose a set of data points as the training set (reference set).
* For every new data point (those to be classified), find its k nearest neighbors.
* Let those neighbors vote on the classification of each new data point.

----

### 3.2 Example with our iris data set:

* We have 150 points in total; we randomly choose 100 of them as our reference set.
* The remaining 50 data points form the testing set, i.e. we pretend that we do not know their classification and run the algorithm on them.
* After getting the results, we can compare them with the actual classifications we have on hand.

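The steps above (distances, k smallest, vote, random split) can be sketched as follows. This is a minimal illustration in plain Python using only the standard library, not the implementation used in class, which relies on NumPy for the distance step. The function names `knn_predict` and `split_reference_test`, the default `k=3`, the `seed` parameter, and the toy data are all assumptions made for this example.

```python
import math
import random
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Classify one query point by a plurality vote of its k nearest neighbors."""
    # Euclidean distance from the query to every reference point.
    distances = [math.dist(p, query) for p in train_points]
    # Indices of the k smallest distances.
    nearest = sorted(range(len(distances)), key=lambda i: distances[i])[:k]
    # Plurality vote among the labels of those neighbors.
    # On a tie, Counter returns the label it encountered first.
    votes = Counter(train_labels[i] for i in nearest)
    return votes.most_common(1)[0][0]


def split_reference_test(points, labels, n_reference, seed=0):
    """Randomly split the data into a reference set and a testing set."""
    indices = list(range(len(points)))
    random.Random(seed).shuffle(indices)
    ref, test = indices[:n_reference], indices[n_reference:]
    return (
        [points[i] for i in ref],
        [labels[i] for i in ref],
        [points[i] for i in test],
        [labels[i] for i in test],
    )


# Made-up toy data standing in for the iris measurements (assumption).
points = [(1.0, 1.1), (1.2, 0.9), (0.8, 1.0), (5.0, 5.1), (5.2, 4.9), (4.8, 5.0)]
labels = ["a", "a", "a", "b", "b", "b"]

prediction = knn_predict(points, labels, query=(1.1, 1.0), k=3)
print(prediction)
```

For the iris example, `split_reference_test(points, labels, n_reference=100)` would give the 100/50 split described above, and `knn_predict` would then be called once for each of the 50 testing points.
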
We need something to evaluate the performance of our algorithm.

---

### 4. Performance evaluation

There are many metrics in data science and statistics for evaluating the performance of a predictive model; they will be introduced in another module. In the current module, we focus on the most straightforward metric, accuracy.

$$\text{Accuracy} = \frac{\text{Number of correct predictions}}{\text{Total number of predictions}}$$

---

### Thank you!

:snake: Python Time!
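
As a warm-up for the coding session, the accuracy formula from section 4 can be sketched in a few lines of plain Python. The function name `accuracy` and the example labels are assumptions made for illustration; they are not real results from the iris data.

```python
def accuracy(predicted, actual):
    """Fraction of predictions that match the actual labels."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        raise ValueError("cannot compute accuracy of zero predictions")
    # Count positions where the predicted label equals the actual label.
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)


# Made-up labels for illustration only (assumption).
actual = ["setosa", "versicolor", "virginica", "virginica"]
predicted = ["setosa", "versicolor", "versicolor", "virginica"]

score = accuracy(predicted, actual)  # 3 of 4 predictions are correct
print(score)
```

In the iris setting, `predicted` would hold the Knn output for the 50 testing points and `actual` their known classifications.
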