---
tags: hw4, programming
---
# HW4 Programming: CNNs
:::info
Assignment due **Friday, March 6, 2026** on Gradescope
:::
## Assignment Overview
In this assignment, you will be building a **Multi-Layer Perceptron** and **Convolutional Neural Network** with pooling layers using the CIFAR-10 dataset to learn to distinguish cats and dogs (among other things). *Please read this handout in its entirety before beginning the assignment.*
## Bruno's Deep Dilemma

Stranded in a world of bubbling oil and golden-brown breading, Bruno has tunneled into the heart of the Great Deep Fryer. In this sizzling underworld, he has stumbled upon two greasy, golden species: the **Crispy Crumb Felines** (*Felis Tempura*) and the **Battered Barking Canines** (*Canis Panko*).
The cats want to help Bruno find the cooling rack and safety, but the dogs are eager to dip his escape pod into a vat of spicy remoulade for extra crunch. Unfortunately, to Bruno's grease-fogged eyes, these golden-fried creatures look identical through the steam.
**The Mission**
Train a model to distinguish cats from dogs with at least 70% accuracy before the timer dings. You will build three components:
1. A Convolutional Neural Network (CNN): This mimics how eyes detect patterns like the curve of a tail or the point of an ear through the thick haze of frying oil.
2. A Multi-Layer Perceptron (MLP): A baseline classifier to see if raw pixel intensity alone can spot the difference between a hushpuppy and a kitten.
3. A Custom Convolution Function: You'll write your own "filtering" logic to understand exactly how kernels extract features from a greasy input image.
---
### Stencil
Please use the GitHub Classroom [link](https://classroom.github.com/a/4_9ACShD) (also available via the banner below) to get the stencil code. Reference this [guide](https://hackmd.io/gGOpcqoeTx-BOvLXQWRgQg) for more information about GitHub and GitHub Classroom.
{%preview https://classroom.github.com/a/4_9ACShD %}
The stencil should have the following structure:
```
code/
├── assignment.py
├── preprocess.py
├── local_test.py
├── visualize.py
├── models/
│   ├── __init__.py
│   ├── base_model.py
│   ├── cnn.py
│   └── mlp.py
└── convolution/
    ├── __init__.py
    └── manual_convolution.py
```
:::danger
**Do not change the stencil except where specified**. While you are welcome to write your own helper functions, changing the stencil's method signatures or removing pre-defined functions could result in incompatibility with the autograder and result in a low grade.
:::
---
### Environment
You will need to use the virtual environment that you made in Homework 1. You can activate the environment by using the command `conda activate csci1470`. If you have any issues running the stencil code, be sure that your conda environment contains at least the following packages:
- `python==3.11`
- `numpy`
- `torch`
- `tqdm`
- `pytest`
In a Windows conda prompt or macOS terminal, you can check whether a package is installed with:
```bash
conda list -n csci1470 <package_name>
```
On Unix systems to check to see if a package is installed you can use:
```bash
conda list -n csci1470 | grep <package_name>
```
:::danger
Be sure to read this handout in its **entirety** before moving on to implementing **any** part of the assignment!
:::
## Assignment Overview
Your task is a multi-class classification problem on the CIFAR-10 dataset which you can read about [here](https://www.cs.toronto.edu/~kriz/cifar.html). While the CIFAR-10 dataset has 10 possible classes (airplane, automobile, bird, cat, deer, dog, frog, horse, ship, and truck), you will build a CNN to take in an image and correctly predict a subset of these classes. **You'll be graded on your model's predictions for the cat-dog subset.**
This assignment uses **PyTorch**. All models inherit from `torch.nn.Module` and use a `forward(inputs)` method for the forward pass.
:::info
**Task 0.1 Download the data:** The first step to this project will be getting the data you are working with. As mentioned, you will be working with the CIFAR-10 dataset which you obtain through either of the following methods:
1. In your repository, you will find a `download.sh` file. This script automatically downloads the dataset from the course server and unzips the data into a `data/` directory inside `code/`. You can run the command as:
```bash=
./download.sh
# If you are running into permission errors, run the following:
chmod +x ./download.sh
./download.sh
```
2. Click [here](https://cs.brown.edu/courses/csci1470/hw_data/hw2.zip) to download the data. When you unzip you'll find 2 files, `data/train` and `data/test`. Place the `data/` folder inside the `code/` directory.
After this step, you should have one `data/` directory that contains your `train` and `test` files that you will preprocess.
:::
:::danger
**You should not submit the data to the autograder**. We keep a copy of the data on the autograder, so you don't need to upload it.
To ensure you do not accidentally include the `data/` directory inside your git commit, ensure your `.gitignore` includes `data` to exclude the folder from your commits.
:::
The assignment has two parts:
1. **Model:** Build the MLP and CNN models. Our stencil provides model classes with several methods and hyperparameters you need to use for your network.
2. **Convolution Function:** Fill out a `ManualConv2d` class that performs the convolution operator.
:::info
You should include a brief README with your model's accuracy and any known bugs!
:::
If completed correctly, the model should train and test within 15 minutes on a department machine. It takes about 5 minutes on our TAs' laptops. While you will mainly be using PyTorch functions, the second part of the assignment requires you to write your own convolution function, which can be very computationally expensive. To counter this, we only require that you print the accuracy across the test set using your manual convolution. __You should train your model using the PyTorch built-ins__. On a department machine, training should take about 3 minutes and testing using your own convolution should take about 2 minutes.
# Roadmap
Below is a brief outline of some things you should do. We expect you to fill in some of the missing gaps (review lecture slides to understand the pipeline).
## Step 1. Preprocessing Data
:::danger
**⚠️WARNING⚠️:** __Please do not shuffle the data here__. You'll shuffle the data before training and testing. You should maintain the order of examples as they are loaded in or you will fail Autograder test 1.4.
:::
:::info
__Task 1.1 [preprocess.get_data pt. 1]:__ Start filling in the get_data function in `preprocess.py`.
* We have provided you with a function `unpickle(file)` in the `preprocess.py` file stencil, which unpickles an object and returns a dictionary. Do not edit it. We have also already extracted the inputs and labels from the dictionary in `get_data` so you have no need to deal with the pickled file or the dictionary.
- You will want to limit the inputs and labels returned by `get_data` to those specified by the `classes` parameter. You will be expected to train and test for the **cat (label index 3), dog (5)** subset of the test set, so it's a good default to have in mind. For every image and its corresponding label, if the label is not in `classes`, remove both from your arrays. You might find [`numpy.nonzero`](https://numpy.org/doc/stable/reference/generated/numpy.nonzero.html) useful for finding only the indices of matching labels.
:::
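As a quick illustration of the filtering in Task 1.1, here is a minimal sketch on toy arrays (the real CIFAR-10 arrays are much larger, and the variable names are ours, not the stencil's):

```python
import numpy as np

# Toy stand-ins for the unpickled CIFAR arrays (real shapes: (50000, 3072)).
inputs = np.arange(5 * 4).reshape(5, 4)
labels = np.array([3, 0, 5, 3, 9])
classes = [3, 5]

# Keep only the rows whose label appears in `classes`.
keep = np.nonzero(np.isin(labels, classes))[0]
inputs, labels = inputs[keep], labels[keep]
# labels is now [3, 5, 3]; inputs has the three matching rows
```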
:::info
__Task 1.2 [preprocess.get_data pt. 2]:__ Continue filling in the get_data function in `preprocess.py`.
- At this point, your inputs are still 2-dimensional (shape: `(num_examples, 3072)`). CIFAR-10 images are stored flat in row-major order: the first 1024 values are the red channel, the next 1024 green, and the final 1024 blue. **PyTorch uses NCHW format** (num_examples, channels, height, width), so you'll need to reshape your data to `(num_examples, 3, 32, 32)`. You can use `numpy.reshape` and `numpy.transpose` to get there.
- Normalize the pixel values so they range from 0 to 1 by dividing by 255.
:::
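Because CIFAR-10 stores each image channel-major (all red values, then green, then blue), a direct reshape already lands in NCHW order. A small sketch of the reshape and normalization from Task 1.2, on fake data:

```python
import numpy as np

# Two fake flat CIFAR rows: 3072 = 1024 red + 1024 green + 1024 blue values.
flat = np.arange(2 * 3072, dtype=np.uint8).reshape(2, 3072)

# Channel-major storage means a direct reshape already yields NCHW.
images = flat.reshape(-1, 3, 32, 32).astype(np.float32) / 255.0
# images.shape == (2, 3, 32, 32), values in [0, 1]
```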
:::info
__Task 1.3 [preprocess.get_data pt. 3]:__ Finish the get_data function in `preprocess.py`.
- Re-number the labels so that the smallest class index becomes 0, the next becomes 1, and so on. For example, if `classes=[3, 5]` (cat and dog), relabel cats as 0 and dogs as 1. You might find [`numpy.where`](https://numpy.org/doc/stable/reference/generated/numpy.where.html) or indexing tricks useful here.
- **PyTorch's `CrossEntropyLoss` expects integer class indices, not one-hot vectors**. Hence, you should return your labels as a 1D `torch.int64` tensor of shape `(num_examples,)`.
- Return your inputs as a `torch.float32` tensor of shape `(num_examples, 3, 32, 32)`.
:::
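The relabeling in Task 1.3 can be sketched like this (a simple loop over the class list; the variable names are illustrative, not from the stencil):

```python
import numpy as np
import torch

labels = np.array([3, 5, 5, 3])
classes = sorted([3, 5])

# Map the smallest class index to 0, the next to 1, and so on.
remapped = labels.copy()
for new_idx, old_idx in enumerate(classes):
    remapped[labels == old_idx] = new_idx

# CrossEntropyLoss wants a 1-D int64 tensor of class indices.
label_tensor = torch.tensor(remapped, dtype=torch.int64)
# label_tensor is tensor([0, 1, 1, 0])
```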
:::danger
**⚠️WARNING⚠️:** In `assignment.py`, we give you `AUTOGRADER_TRAIN_FILE` and `AUTOGRADER_TEST_FILE` variables which are the file paths that must be used with the autograder. You might need separate local filepaths when running on your machine. When you submit to Gradescope, you **MUST** call `get_data` using the autograder filepaths we have provided.
:::
:::success
**Note:** If you download the dataset from online, the training data is actually divided into batches. We have done the job of repickling all of the batches into one single train file for your ease.
:::
:::success
**Note:** You're going to be calling `get_data` on both the training and testing data files in `assignment.py`. The testing and training data files to be read in are in the following format:
- `train`: A pickled object of `50,000` train images and labels. This includes images and labels of all 10 classes. After unpickling the file, the dictionary will have the following elements:
- `data` -- a `50000x3072` numpy array of uint8s. Each row of the array stores a `32x32` color image. The first 1024 entries contain the red channel values, the next 1024 the green, and the final 1024 the blue. The image is stored in row-major order, so that the first 32 entries of the array are the red channel values of the first row of the image.
- `labels` -- a list of `50000` numbers in the range `0-9`. The number at index `i` indicates the label of the `i`-th image in the array data.
- `test`: A pickled object of 10,000 test images and labels. This includes images and labels of all 10 classes. Unpickling the file gives a dictionary with the same key values as above.
:::
:::info
__Task 1.4 [assignment.main pt 1]:__ Load in both your training and testing data using `get_data`. Print out the shapes, values, etc. and once you are happy feel free to submit what you have so far to the autograder to check your score for the preprocessing tests.
:::
Throughout this assignment we recommend building `assignment.py` as you go so that you can test your implementations as you write them, not all at once. Now is a great time to start filling out `assignment.main` while testing your `get_data` at the same time.
## Step 2. Create your MLP model
Time to start modeling with PyTorch! Go to the `models/mlp.py` file and take a look at the stencil. The `MLP` class inherits from `CifarModel` (found in `models/base_model.py`), which itself inherits from `torch.nn.Module`. This means you get some nice features like `self.training` or `model.eval()`, as well as the loss and accuracy functions from the parent class.
You'll notice an `__init__` method and a `forward` method. In `__init__`, you should define all trainable layers as instance attributes (e.g., `nn.Linear`). In `forward`, you should define how the model computes its output from an input tensor.
:::info
__Task 2.1 [mlp.MLP.__init__]:__ Finish filling out `MLP.__init__`
You'll be working with PyTorch's `torch.nn` module. Some useful layers:
- [`nn.Linear`](https://pytorch.org/docs/stable/generated/torch.nn.Linear.html)
- [`torch.reshape`](https://pytorch.org/docs/stable/generated/torch.reshape.html) or [`torch.flatten`](https://pytorch.org/docs/stable/generated/torch.flatten.html)
- [`nn.Dropout`](https://pytorch.org/docs/stable/generated/torch.nn.Dropout.html)
- Initialize all hyperparameters within the constructor. We've given you default values.
- Create instances of your model's `nn.Linear` layers here. Keep in mind what output dimensions you need to produce valid class predictions.
- We recommend starting with just one hidden layer and then adding more once that is working.
:::
:::info
__Task 2.2 [mlp.MLP.forward]:__ Fill out `MLP.forward`
- First, flatten your input images. You should end up with `num_inputs` vectors, one per image.
- Call your dense layers!
- We expect the MLP to output **raw logits** (not a probability distribution) of shape `(batch_size, num_classes)`. PyTorch's `CrossEntropyLoss` applies softmax internally, so **do not apply softmax in `forward`**.
:::
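To make the flatten-then-logits shape flow concrete, here is a minimal toy MLP (not the stencil's `MLP` class, which inherits from `CifarModel`; this is just a sketch of the forward pass):

```python
import torch
import torch.nn as nn

class TinyMLP(nn.Module):
    def __init__(self, num_classes=2):
        super().__init__()
        self.hidden = nn.Linear(3 * 32 * 32, 64)
        self.out = nn.Linear(64, num_classes)

    def forward(self, x):
        x = torch.flatten(x, start_dim=1)  # (batch, 3072): one vector per image
        x = torch.relu(self.hidden(x))
        return self.out(x)                 # raw logits — no softmax here

logits = TinyMLP()(torch.randn(8, 3, 32, 32))
# logits.shape == (8, 2)
```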
:::danger
**Warning:** Make sure your MLP **flattens the images inside `forward`** since images in NCHW format (`(batch_size, 3, 32, 32)`) are passed in directly. Flattening in `forward` (not in preprocessing) is a design choice that makes using both models with the same data easier.
:::
:::info
__Task 2.3 [base_model.CifarModel.loss]:__ Given the logits and labels, compute and return the mean loss in `CifarModel.loss` inside `models/base_model.py`.
Note that both `MLP` and `CNN` inherit from `CifarModel`, so you only need to implement this once.
- Use the mean cross-entropy loss. We suggest [`nn.CrossEntropyLoss`](https://pytorch.org/docs/stable/generated/torch.nn.CrossEntropyLoss.html).
- Don't forget to initialize `self.loss_func` in the `__init__` of `CifarModel`.
- `CrossEntropyLoss` expects raw logits, not softmax outputs, and integer class indices as labels, not one-hot vectors.
:::
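A minimal sketch of what the loss computation looks like (the `loss_func` attribute name matches what the task describes; adapt to your stencil):

```python
import torch
import torch.nn as nn

loss_func = nn.CrossEntropyLoss()                 # would live in CifarModel.__init__
logits = torch.tensor([[2.0, -1.0], [0.5, 1.5]])  # raw logits, not softmaxed
labels = torch.tensor([0, 1])                     # integer class indices, not one-hot
loss = loss_func(logits, labels)                  # scalar mean loss over the batch
```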
:::success
**Note:** `nn.CrossEntropyLoss` combines `LogSoftmax` and `NLLLoss` in one step. This is why you should NOT apply softmax in your `forward` method as the loss function handles it for you.
:::
:::info
__Task 2.4 [base_model.CifarModel.accuracy]:__ Given the logits and labels, compute then return the accuracy in `CifarModel.accuracy`
- To find your accuracy, first find the predicted (most likely) class for each input image. You might find [`torch.argmax`](https://pytorch.org/docs/stable/generated/torch.argmax.html) helpful. Then, compute the fraction of predictions that match the labels; an elementwise comparison (`preds == labels`) followed by `torch.mean` works well here. Note that `torch.mean` does not operate on boolean tensors, so convert the comparison result to `torch.float32` first (e.g., with `.float()` or `.to(torch.float32)`).
:::
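The accuracy computation described above can be sketched in a few lines:

```python
import torch

logits = torch.tensor([[2.0, 0.1], [0.2, 1.5], [3.0, 0.0]])
labels = torch.tensor([0, 1, 1])

preds = torch.argmax(logits, dim=1)          # most likely class per image
accuracy = (preds == labels).float().mean()  # cast bools to float before averaging
# preds is [0, 1, 0], so accuracy is 2/3
```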
Now, all that's left to do with your MLP is run it!
:::info
__Task 2.5 [assignment.main pt 2]:__ Initialize your MLP model in the main function of `assignment.py` to ensure nothing breaks. If you'd like, you can further sanity check your MLP by running a batch of data through the forward pass and confirming the output shape is what you expect.
You should also initialize your optimizer here. We recommend using an Adam Optimizer with a learning rate of $10^{-4}$, but feel free to experiment with other optimizers.
:::
## Step 3. Train and test
In the `main` function, you will want to get your train and test data, initialize your model, and train it for many epochs. We suggest training for 10 epochs. We have provided for you a train and test method to fill out. The train method will take in the model and do the forward and backward pass for a SINGLE epoch. Iterate until either your test accuracy is sufficiently large or you have reached the max number of epochs (you can set this to whatever you'd like with a hard cap at 25). For reference, we are able to reach good accuracy after no more than 10 epochs.
:::info
__Task 3.1 [train]:__ Go ahead and write the train function in `assignment.py`.
- You should shuffle your inputs and labels before each epoch. You might find it helpful to use `torch.randperm` to shuffle your indices, then apply the permutation to both your labels and inputs.
- To improve accuracy, you may optionally apply `torchvision.transforms.functional.hflip`, which randomly flips your images horizontally, to training batches. Think about why this might help! **Do not flip during testing.**
- You should batch your inputs. The stencil provides `get_next_batch` to slice a batch by index.
- We are kind people, so the stencil includes a `tqdm` progress bar already set up for you; you just need to fill in the loop body. If you want to display live training stats in the bar, uncomment and update the `pbar.set_postfix(...)` line.
- Return the average training accuracy across all batches.
:::
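The shuffle-then-batch pattern from Task 3.1 can be sketched as follows (toy data; in the stencil, batch slicing is what `get_next_batch` does for you):

```python
import torch

# Toy data standing in for the preprocessed training tensors.
inputs = torch.randn(10, 3, 32, 32)
labels = torch.randint(0, 2, (10,))

# Shuffle inputs and labels with the SAME permutation each epoch.
perm = torch.randperm(inputs.shape[0])
inputs, labels = inputs[perm], labels[perm]

# Slice fixed-size batches from the shuffled tensors.
batch_size, num_batches = 4, 0
for start in range(0, inputs.shape[0], batch_size):
    batch_x = inputs[start:start + batch_size]
    batch_y = labels[start:start + batch_size]
    num_batches += 1
# 10 examples with batch_size 4 yields 3 batches (the last has 2 examples)
```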
:::success
**Fun Goodies:** The stencil provides a nice `tqdm` to show a live training progress bar. It displays the current batch index, elapsed time, and any custom stats you add via `pbar.set_postfix(...)`. You don't need to add any additional printing, just fill in the loop logic. Here is an example output:
```
train: |████████████████| 195/195 [00:12<00:00, 15.2batch/s, loss=0.6821, acc=0.672]
```
:::
If you'd like, you can calculate the train accuracy to check that your model does not overfit the training set. If you get upwards of 80% accuracy on the training set but only 65% accuracy on the testing set, you might be overfitting.
:::info
__Task 3.2 [test]:__ Write the test function in `assignment.py`.
- The test function takes in the trained model and returns the accuracy on the test data *and* all predictions for each test image.
- Very similar to `train`, but no shuffling, no gradient computation, and no optimizer step. You might find `torch.no_grad()` helpful.
- You should return a tuple `(test_accuracy, test_preds)` where `test_preds` contains all the model's output logits concatenated across batches.
:::
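A sketch of the test-time loop, using a bare `nn.Linear` as a stand-in for your trained model (your real `test` function takes the model and data as arguments):

```python
import torch
import torch.nn as nn

model = nn.Linear(3 * 32 * 32, 2)       # stand-in for your trained model
inputs = torch.randn(10, 3 * 32 * 32)
labels = torch.randint(0, 2, (10,))

model.eval()
batch_size, logits_list = 4, []
with torch.no_grad():                   # no gradients needed at test time
    for start in range(0, inputs.shape[0], batch_size):
        logits_list.append(model(inputs[start:start + batch_size]))

test_preds = torch.cat(logits_list, dim=0)   # all logits, concatenated across batches
test_accuracy = (test_preds.argmax(dim=1) == labels).float().mean().item()
```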
:::danger
**⚠️WARNING⚠️:**
When testing **you should NOT randomly flip images or do any extra data augmentation.**
:::
:::info
__Task 3.3 [assignment.main pt 3]:__ Now try training your MLP Model!
:::
:::info
__Task 3.4 [assignment.main pt 4]:__ Once you have confirmed that training the model doesn't break, add in testing so you can see how the model does when it counts! We are looking for > 60% accuracy with an MLP model, which you should be able to reach without much trouble and a relatively small model.
The autograder checks your MLP accuracy in one of two ways:
1. **Predictions file:** Save your MLP's predictions to `predictions_mlp.npy` using `numpy.save`. This file should contain the argmax class index for each test image. You should submit this file once you are ready; the autograder will grade it directly without re-training.
2. **Final file fallback:** If no `predictions_mlp.npy` is found, the autograder looks for any file in your submission whose name contains both `final` and `mlp` (e.g., `final_mlp.txt`). If found, it will train your model from scratch and verify it reaches the accuracy threshold. **Only use this path as a fallback since submitting a predictions file is much faster and more reliable.**
:::
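Saving the predictions file boils down to one `argmax` and one `numpy.save` (toy logits below stand in for your model's real test outputs):

```python
import numpy as np
import torch

test_preds = torch.randn(10, 2)   # stand-in for your model's test logits
pred_classes = torch.argmax(test_preds, dim=1).numpy()
np.save("predictions_mlp.npy", pred_classes)   # 1-D array of class indices
```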
### Improving your MLP
You might notice that your MLP doesn't perform that well. Two things to try are activation functions and dropout.
In PyTorch, you can add activations like ReLU directly in your `forward` method using `torch.nn.functional.relu(x)` or by adding `nn.ReLU()` as a layer in `__init__`. Think about some other activation functions you might try.
Dropout can help prevent overfitting. Use `nn.Dropout(p=...) ` in `__init__` and apply it in `forward` during training. Dropout sets random entries in its input to 0. This way, the model is forced to make a prediction without certain input features. Therefore, if the model was overfitting on these individual features, then dropout would work to prevent this. PyTorch's dropout automatically disables itself at evaluation time when you call `model.eval()`.
## Step 4. Create your CNN model
Time for your second model! Go to `models/cnn.py`. The `CNN` class has the same structure as `MLP`: an `__init__` to define layers and a `forward` to run them.
Useful PyTorch layer references:
- [`nn.Conv2d`](https://pytorch.org/docs/stable/generated/torch.nn.Conv2d.html)
- [`nn.MaxPool2d`](https://pytorch.org/docs/stable/generated/torch.nn.MaxPool2d.html)
- [`nn.BatchNorm2d`](https://pytorch.org/docs/stable/generated/torch.nn.BatchNorm2d.html)
:::info
**Task 4.1 [CNN.__init__]:**
Go fill out the `__init__` function for the `CNN` class.
- Initialize all hyperparameters in the constructor.
- Create all of your `nn.Conv2d`, pooling, normalization, and `nn.Linear` layers here.
- You may use any permutation and number of convolution, pooling, and dense layers, as long as you use at least one convolution layer with a stride of 1, one pooling layer, and one fully connected layer.
Note: PyTorch's `nn.Conv2d` takes `(in_channels, out_channels, kernel_size, stride, padding)` and operates on NCHW-format tensors.
:::
:::success
If you're having trouble getting started with model architecture, here's an example structure:
- 1st Convolution Layer: `nn.Conv2d` → `nn.BatchNorm2d` → `nn.ReLU()` → `nn.MaxPool2d`
- 2nd Convolution Layer + Bias, Batch Normalization, ReLU, Max Pooling
- 3rd Convolution Layer + Bias, Batch Normalization, ReLU
- Remember to reshape the output of the final convolutional layer to make it compatible with the dense layers.
- 1st Linear Layer + Bias, Dropout (`nn.Dropout`)
- 2nd Linear Layer + Bias, Dropout
- Final Linear Layer
:::
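To illustrate the conv → batchnorm → ReLU → pool → flatten → linear flow (and the shape bookkeeping before the dense layer), here is a deliberately tiny sketch, not a model expected to hit the accuracy target:

```python
import torch
import torch.nn as nn

class TinyCNN(nn.Module):
    def __init__(self, num_classes=2):
        super().__init__()
        self.conv1 = nn.Conv2d(3, 16, kernel_size=3, stride=1, padding=1)
        self.bn1 = nn.BatchNorm2d(16)
        self.pool = nn.MaxPool2d(2)
        # 32x32 input, padding keeps 32x32, pooling halves it to 16x16.
        self.fc = nn.Linear(16 * 16 * 16, num_classes)

    def forward(self, x):
        x = self.pool(torch.relu(self.bn1(self.conv1(x))))  # (B, 16, 16, 16)
        x = torch.flatten(x, start_dim=1)   # flatten before the dense layer
        return self.fc(x)                   # raw logits

out = TinyCNN()(torch.randn(4, 3, 32, 32))
# out.shape == (4, 2)
```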
:::info
__Task 4.2 [CNN.forward]:__ Fill out the `forward` function. Your `forward` should return raw logits (NO softmax).
:::
:::success
**The `self.training` flag:** Every `nn.Module` has a built-in boolean attribute `self.training` that is automatically set to `True` when you call `model.train()` and `False` when you call `model.eval()`. You don't need to set this yourself as PyTorch handles it.
One common use of this flag is when you want different behavior during training and testing. You'll use `self.training` again in Step 5 when swapping in your `ManualConv2d` during testing.
```python
if self.training:
    x = self.conv(x)
else:
    x = self.manual_conv(x)
```
:::
:::info
__Task 4.3 [Train and Test CNN]:__ Go ahead and train and test your CNN model as you did with your MLP model.
The autograder checks your CNN accuracy in one of two ways:
1. **Predictions file:** Save your CNN's predictions to `predictions_cnn.npy` using `numpy.save`. This file should contain the argmax class index for each test image. Submit it alongside your code — the autograder will grade it directly without re-training.
2. **Final file fallback:** If no `predictions_cnn.npy` is found, the autograder looks for any file in your submission whose name contains `final` but **not** `mlp` (e.g., `final_cnn.txt` or `FINAL.txt`). If found, it will train your model from scratch and verify it reaches the 70% accuracy threshold. **Only use this path as a fallback since submitting a predictions file is much faster and more reliable.**
:::
## Step 5. Creating your own `ManualConv2d`
:::warning
Before starting this part of the assignment, you should ensure that you have an accuracy of **at least 70%** on the test set using `nn.Conv2d` for the cat-dog classification problem.
:::
:::success
You will be implementing your very own convolution layer!
`ManualConv2d` is located in `convolution/manual_convolution.py`. It is a `torch.nn.Module` subclass, just like `nn.Conv2d`. Its `__init__` already handles weight initialization for you so you only need to implement `forward`.
The class supports `padding` and an optional bias. **For the sake of simple math (less is more, no?), you should always use a stride of 1.** This is because the calculation for padding size changes as a result of the stride.
:::
:::danger
Do **NOT** change the `__init__` signature or the parameter names of `ManualConv2d`. The autograder will construct it with specific arguments.
:::
:::info
**Task [ManualConv2d.forward]:** Implement the `forward` method of `ManualConv2d`. Here are some specifics and hints:
- **[Inputs]** Inputs are in **NCHW format**: `(num_examples, in_channels, in_height, in_width)`.
- **[Filters]** Your filters, which are your model's weights, are stored in `self.filters` with shape `(out_channels, in_channels, kernel_height, kernel_width)`.
- **[Padding]** We are kind and have made padding very simple for you to implement. You can assume even padding on all sides. Look through numpy and torch documentation to find a simple way to implement padding.
- __[Outputs]__ Your output dimension height is equal to `(in_height + total_padY - filter_height) / strideY + 1` and your output dimension width is equal to `(in_width + total_padX - filter_width) / strideX + 1`. Again, `strideX` and `strideY` will always be 1 for this assignment. Refer to the CNN slides if you'd like to understand this derivation.
- __[Algorithm Hints]__ After padding (if needed), you will want to go through the entire batch of images and perform the convolution operator on each image. There are two ways of going about this - you can continuously append multidimensional NumPy arrays to an output array or you can create a NumPy array with the correct output dimensions, and just update each element in the output as you perform the convolution operator. We suggest doing the latter - it's conceptually easier to keep track of things this way.
- __[Algorithm Hints]__ You will want to iterate over the entire height and width including padding, stopping when you cannot fit a filter over the rest of the padded input. For convolution with many input channels, you will want to perform the convolution per input channel and sum those dot products together.
- **[Return]** Return a `torch.Tensor` of shape `(num_examples, out_channels, out_height, out_width)`.
:::
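To build intuition for the sliding-window loop before tackling the full NCHW version, here is a toy single-channel, stride-1, no-padding convolution (this is NOT a drop-in `forward`; your real implementation must handle the batch dimension, multiple channels, padding, and bias):

```python
import numpy as np

def naive_conv2d_single(image, kernel):
    """Stride-1 'valid' convolution on one 2-D channel — a toy illustration."""
    in_h, in_w = image.shape
    k_h, k_w = kernel.shape
    # Output dims follow the handout formula with zero padding and stride 1.
    out = np.zeros((in_h - k_h + 1, in_w - k_w + 1))
    for i in range(out.shape[0]):        # slide the filter over every
        for j in range(out.shape[1]):    # position where it fully fits
            out[i, j] = np.sum(image[i:i + k_h, j:j + k_w] * kernel)
    return out

result = naive_conv2d_single(np.arange(16, dtype=float).reshape(4, 4),
                             np.ones((2, 2)))
# result.shape == (3, 3); result[0, 0] == 0 + 1 + 4 + 5 == 10
```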
:::danger
Writing `ManualConv2d` has given students some grief in the past. We recommend thinking through the algorithm carefully on paper before coding. Jumping right in is likely to confuse you.
:::
:::success
Hopefully Helpful Hints:
1. In the past, many students have found success thinking about how to use 4 for-loops to write ManualConv2d, then with some small tweaks you can easily get it down to 2 for-loops.
2. Don't be scared to use broadcasting by expanding the dimensions of the input or filters so that your computation is easier.
:::
## Step 6. Testing your `ManualConv2d`
:::info
**Task [local_test.py]:** We have provided several tests in `local_test.py` that compare your `ManualConv2d` output against PyTorch's `nn.Conv2d` with identical weights. Run the tests by executing:
```bash
python local_test.py
```
Each test prints a message if it passes (e.g., `"Sample test passed!"`). These tests cover basic correctness, different padding modes, non-square inputs, and bias behavior. If you pass these locally you should be passing the corresponding autograder tests.
The tests available are:
- `sample_test` — basic forward pass sanity check
- `base_case_test` — 1×1 kernel
- `padding_test_same` — SAME-style padding (output same size as input)
- `padding_test_valid` — VALID padding (no padding, output shrinks)
- `weird_shapes_1_same / _valid` — tall and narrow input
- `weird_shapes_2_same / _valid` — short and wide input
- `bias_test` — checks bias is correctly applied
:::
---
## Step 7: Using your `ManualConv2d`
:::info
Your `ManualConv2d` is slow, sorry :(
PyTorch's engineers have optimized `nn.Conv2d` to be super fast. Rather than training with your manual version, which would take forever and break our autograder, we expect you to do the following:
1. **Train** using a regular `nn.Conv2d` layer.
2. **At test time**, copy the trained weights from your `nn.Conv2d` into a `ManualConv2d` with the same shape, and run the manual version instead.

You should utilize `self.training` and `ManualConv2d.set_weights` to copy the weights from your `nn.Conv2d` into your `ManualConv2d` at test time.
:::
:::danger
You only need to do this weight-copying swap for **one** convolutional layer. You do **not** need to replace every `nn.Conv2d` in your model.
:::
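The weight-copying swap can be sanity-checked like this. A second `nn.Conv2d` stands in for your `ManualConv2d` here (in the stencil you would call `set_weights` instead of copying tensors directly); after the copy, both layers should produce identical outputs:

```python
import torch
import torch.nn as nn

trained = nn.Conv2d(3, 8, kernel_size=3, padding=1)  # the layer you trained
# Stand-in for ManualConv2d — with your class, use set_weights instead.
manual = nn.Conv2d(3, 8, kernel_size=3, padding=1)
with torch.no_grad():
    manual.weight.copy_(trained.weight)
    manual.bias.copy_(trained.bias)

x = torch.randn(2, 3, 32, 32)
same = torch.allclose(trained(x), manual(x))  # True: identical weights, identical output
```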
## Visualizing Results
We have written two methods for you to visualize your results. The created visuals will not be graded and are entirely for your benefit. You can use them to check out your doggos and kittens.
- We've provided the `visualize_results(image_inputs, logits, image_labels, [first_label, second_label])` method for you to visualize your predictions against the true labels using matplotlib, a useful Python library for plotting graphs. This method is currently written with the image_labels having a shape of (num_images, num_classes). **DO NOT EDIT THIS FUNCTION**. You should call this function after training and testing, passing into `visualize_results` an input of 50 images, 50 probabilities, 50 labels, the first label name, and second label name.
- Unlike the first assignment, you will need to pass in the strings of the first and second classes. A `visualize_results` method call might look like: `visualize_results(image_inputs, logits, image_labels, ["cat", "dog"])`.
- This should result in two visuals, one for correct predictions, and one for incorrect predictions. You should do this after you are sure you have met the benchmark for test accuracy.
- We have also provided the `visualize_loss(losses)` method for you to visualize your loss per batch over time. Your model or your training function should have a list `loss_list` to which you can append batch losses to during training. You should call this function after training and testing, passing in `loss_list`.
# Submission
## Requirements
:::success
**Important:** The autograder grades accuracy using the predictions files you submit. You should submit the following two files: `predictions_cnn.npy` and `predictions_mlp.npy`.
Save predictions with `numpy.save("predictions_cnn.npy", preds)` where `preds` is a 1-D array of predicted class indices (0 or 1, or the original indices 3 and 5 — both are accepted).
If a predictions file is not found, the autograder will look for a final file (see Tasks 3.4 and 4.3) and train your model from scratch, but as noted above, this is slower and less reliable than submitting a predictions file.
:::
Our autograder will import your model and your preprocessing functions. We will feed the result of your `get_data` function, called on a path to our data, to your train method in order to produce a fully trained model. After this, we will feed your trained model, alongside the TA pre-processed data, to our custom test function. This will batch the testing data using YOUR batch size and run it through your model's `forward` function. However, we will test that your model works with any batch size, meaning that you should not hardcode `self.batch_size` in your `forward` function. The returned logits will then be fed through an accuracy function. Additionally, we will test your `ManualConv2d`. To ensure you don't lose points, make sure that you:
A) correctly return training inputs and labels from `get_data`,
B) ensure that your model's `forward` function returns logits from the inputs specified and does not break on different batch sizes when testing, and
C) do not rely on any packages outside of PyTorch, NumPy, Matplotlib, or the Python standard library.
In addition, remember to include a brief README with your model's accuracy and any known bugs.
## Grading
**Code:** You will be primarily graded on functionality.
- **CNN** *at least* 70% accuracy on the cat-dog test subset
- **MLP** *at least* 60% accuracy on the cat-dog test subset
- **ManualConv2d** *correctness verified against* `nn.Conv2d` with the same weights
## Handing In
You should submit the assignment via Gradescope under the corresponding project, by dropping your files into Gradescope or through GitHub. To submit through GitHub, commit and push all changes to your repository. You can do this by running:
1. `git add -A`
2. `git commit -m "commit message"`
3. `git push`
After pushing to your repo, upload it to Gradescope. If you're testing on multiple branches, you have the option to pick whichever one you want.
:::danger
Make sure any `data/` folders are **not** being uploaded as they may be too large for the autograder.
:::
# Conclusion
Congrats on finishing your CNN homework! Bruno was able to successfully reach out to the Space Canines who provided a path to Bruno's home world! Bruno is finally on track to make it back home, so long as nothing else goes wrong...