# CS410 Homework 4: Model Selection

**Due Date: 10/9/2024 at 12pm**

**Need help?** Remember to check out [Edstem](https://edstem.org/us/courses/61309) and our website for TA assistance.

### Python Notebooks

A **Python notebook** is an application that allows users to combine text, code, and visualizations, much like a traditional scientific lab notebook.

This assignment is written in a Python notebook. In the file, we describe the tasks, and ask you to insert the code and run it to generate the requisite visualizations. Your handin should include the modified Python notebook file, as well as your README, as always.

### Download

Please click [here](https://classroom.github.com/a/BadpyE8v) to download the assignment.

### Handin

Your handin should contain:

- all modified files, including comments describing the logic of your algorithmic modifications, and your tests
- a README, containing a brief overview of your implementation, and the outcomes of all tests

### Gradescope

Submit your assignment via Gradescope.

To submit through GitHub, follow these commands:

1. `git add -A`
2. `git commit -m "commit message"`
3. `git push`

Now, you are ready to upload your repo to Gradescope.

*Tip*: If you are having difficulties submitting through GitHub, you may submit by zipping up your hw folder.

### Rubric

| Component | Points | Notes |
|-------------------|----|--------------------------------|
| Figure comparing performance on training and test set | 15 | Points awarded for producing a figure that matches the specifications and shows the tradeoff between performance on the training set and test set. |
| Cross Validation | 20 | Points awarded for correct implementation of cross validation. |
| Grid Search | 25 | Points awarded for correct implementation of grid search, which, for any number of parameters, finds the best parameter setting. |
| README Questions | 30 | Points awarded for responses to questions in your notebook file. |
| README | 10 | Points awarded for an easy-to-read and well-organized README file. |

:::spoiler README Questions
In the notebook, there are a few conceptual questions, repeated here for your convenience. Make sure you answer these in your README.

1. Explain why we might need a metric other than accuracy to measure the performance of a classifier. Provide an example of a problem or dataset where neither provides a useful assessment. (**Hint:** Consider medical tests for rare diseases.)
2. Describe the parameters you use (i.e., their names and what they do), and report the best parameter settings and score found using your grid search. By how much does grid search improve your model's performance?
:::

:::success
Congrats on submitting your homework; Steve is proud of you!!

![image](https://hackmd.io/_uploads/H1gKUsJ20.png)
![image](https://hackmd.io/_uploads/S1OQ2aCwA.png)
:::
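
To make the hint in README Question 1 concrete, here is a minimal sketch of the arithmetic behind it. The numbers are hypothetical (1,000 patients, 10 of whom have the disease) and are not taken from the assignment data; the helper functions `accuracy` and `recall` are illustrative names written for this example, not part of the notebook.

```python
# Hypothetical screening scenario (assumed numbers, not assignment data):
# 1,000 patients, 10 of whom actually have the disease (label 1).
y_true = [1] * 10 + [0] * 990

# A degenerate "classifier" that predicts healthy (label 0) for everyone.
y_pred = [0] * 1000


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def recall(y_true, y_pred):
    """Fraction of true positives that the classifier actually finds."""
    true_pos = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    false_neg = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    positives = true_pos + false_neg
    return true_pos / positives if positives else 0.0


acc = accuracy(y_true, y_pred)
rec = recall(y_true, y_pred)
print(f"accuracy = {acc:.2f}, recall = {rec:.2f}")
```

Under these assumed numbers, the classifier is right for 990 of 1,000 patients yet detects none of the 10 sick ones, which is the kind of gap between metrics the question asks you to think about.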