# CS410 Homework 5: Model Selection
**Due Date: Tuesday, 3/4/2025 at 11:59pm**
**Need help?** Remember to check out [Edstem](https://edstem.org/us/courses/74300) and our website for TA assistance.
### Python Notebooks
A **Python notebook** is an application that allows users to combine text, code, and visualizations, much like a traditional scientific lab notebook.
This assignment is written in a Python notebook. In the file, we describe the tasks, and ask you to insert the code and run it to generate the requisite visualizations.
Your handin should include the modified Python notebook file as well as your README, as always.
---
### Download ###
Please click [here](https://classroom.github.com/a/tB2_kzAT) to download the assignment.
---
### Instructions for Downloading and Setting Up [Jupyter](https://docs.jupyter.org/en/latest/install/notebook-classic.html) Notebook
To complete this assignment, you’ll need to have **Jupyter Notebook** installed on your computer. Follow one of the two options below.
#### Option 1: Jupyter via VSCode ####
1. In VSCode, activate your CS410 environment:
```bash
source cs410_env/bin/activate
```
2. Install Jupyter Notebook by typing:
```bash
pip install notebook ipykernel
```
3. Install the Python and Jupyter extensions:
- Open VS Code.
- Go to the Extensions view (Ctrl+Shift+X / Cmd+Shift+X).
- Search for "Jupyter" and install the extension by Microsoft. Make sure the "Python" extension is also installed.
4. Open the folder/repo containing the stencil code. Be sure to select your CS410 environment as both the Python interpreter and the notebook kernel.
#### Option 2: Jupyter Notebook in the Browser ####
1. Open the **Terminal** app (macOS) or **Command Prompt** (Windows) and activate your environment (same command as above).
2. Install Jupyter Notebook by typing:
```bash
pip install notebook
```
3. Launch Jupyter Notebook. Your browser will open with the Jupyter Notebook interface:
```bash
jupyter notebook
```
4. Once Jupyter Notebook launches, navigate to the folder where you cloned the GitHub repository. Click the notebook file to open it and start modifying the code.
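
Whichever option you choose, it can help to confirm that the notebook is really running inside your CS410 environment. The cell below is a minimal sketch of such a check, not part of the stencil: `check_environment` is a hypothetical helper name, and the package list simply mirrors the `pip install` commands above.

```python
import importlib.util
import sys


def check_environment(packages=("notebook", "ipykernel")):
    """Return a dict mapping each package name to True if it is importable.

    Hypothetical helper for illustration; the default package names are the
    ones installed in the setup steps above.
    """
    return {name: importlib.util.find_spec(name) is not None for name in packages}


# Path of the interpreter running this cell. If the environment was selected
# correctly, this path should point inside your cs410_env folder.
print("Interpreter:", sys.executable)

for name, found in check_environment().items():
    print(f"{name}: {'installed' if found else 'MISSING'}")
```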
---
### Handin
Your handin should contain:
- all modified files, including comments describing the logic of your algorithmic modifications
- a README, containing a brief overview of your implementation, and the outcomes of all tests
### Gradescope
Submit your assignment via Gradescope.
To submit through GitHub, first push your work by running these commands:
1. `git add -A`
2. `git commit -m "commit message"`
3. `git push`
Now, you are ready to upload your repo to Gradescope.
*Tip*: If you are having difficulties submitting through GitHub, you may instead zip up your homework folder and submit the zip file.
### Rubric
| Component | Points | Notes |
|-------------------|----|--------------------------------|
| Figure comparing performance on training and test set | 20 | Points awarded for producing a figure that matches the specifications and shows the tradeoff between performance on the training set and test set. |
| Cross Validation | 25 | Points awarded for correct implementation of cross validation. |
| Grid Search | 10 | Points awarded for correct usage of scikit-learn's grid-search function. |
| README Questions| 35 | Points awarded for responses to questions in your notebook file. |
| README | 10 | Points awarded for an easy-to-read and well-organized README file. |
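
As a reminder of what the cross validation component involves, k-fold cross validation splits the data into k folds and uses each fold once for validation while training on the rest. The snippet below is a rough sketch of the index bookkeeping only, assuming nothing about the notebook's stencil: `k_fold_indices`, its parameters, and the shuffling seed are all hypothetical, and the interface required by the notebook takes precedence.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Yield (train_idx, val_idx) index lists for k-fold cross validation.

    Hypothetical helper for illustration; not the stencil's interface.
    """
    indices = list(range(n_samples))
    # Shuffle once so that each fold is a random subset of the data.
    random.Random(seed).shuffle(indices)

    # Spread the remainder so that fold sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]

    start = 0
    for size in fold_sizes:
        val_idx = indices[start:start + size]
        # Every index outside the current fold is used for training.
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, val_idx
        start += size


for fold, (train_idx, val_idx) in enumerate(k_fold_indices(n_samples=10, k=3)):
    print(f"fold {fold}: {len(train_idx)} train, {len(val_idx)} validation")
```

Every sample lands in exactly one validation fold, so averaging the k validation scores uses all of the data without ever scoring a model on points it was trained on.
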
:::spoiler README Questions
In the notebook, there are a few conceptual questions, repeated here for your convenience. Make sure you answer these in your README.
1. Explain why we might need a metric other than accuracy to measure the performance of a classifier. Provide an example of a problem or dataset where accuracy alone does not provide a useful assessment. (**Hint:** Consider medical tests for rare diseases.)
2. Please describe your results from running k-fold cross validation on the training set. Include details and explanations for trends observed in the graphs plotted.
3. Describe the parameters you use (i.e., their names and what they do), and report the best parameter settings and score found using your grid search. By how much does grid search improve your model's performance?
4. Report the best parameters found by your grid search. How much better does your model perform after grid search?
:::
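
To make the hint in question 1 concrete, here is a small sketch of how accuracy behaves on a heavily imbalanced dataset. The data are made up for illustration (1,000 patients, 10 of whom have the disease), and the `accuracy` and `recall` helpers are written by hand here rather than taken from the notebook.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def recall(y_true, y_pred):
    """Fraction of actual positives (label 1) that were predicted positive."""
    true_pos = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    actual_pos = sum(t == 1 for t in y_true)
    return true_pos / actual_pos if actual_pos else 0.0


# Made-up data: 10 sick patients (label 1) and 990 healthy ones (label 0).
y_true = [1] * 10 + [0] * 990
# A "classifier" that ignores its input and always predicts healthy.
y_pred = [0] * 1000

print("accuracy:", accuracy(y_true, y_pred))  # 0.99
print("recall:", recall(y_true, y_pred))      # 0.0
```

The always-healthy classifier scores 99% accuracy while detecting none of the sick patients, which is the kind of gap your README answer should explain.
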
:::success
Congrats on submitting your homework! We are proud of you!

:::