---
tags: hw2, handout
---

# HW2 Programming: Beras Pt. 2

:::info
Conceptual questions due **Friday, February 23, 2024 at 6:00 PM EST**

Programming assignment due **Monday, February 26, 2024 at 6:00 PM EST**
:::

## Theme

![](https://images.unsplash.com/photo-1510584587352-e0d0f4d2b853?q=80&w=3270&auto=format&fit=crop&ixlib=rb-4.0.3&ixid=M3wxMjA3fDB8MHxwaG90by1wYWdlfHx8fGVufDB8fHx8fA%3D%3D)

*A few software engineers visit the school of fish and begin to tell them about artificial intelligence on the surface, and how amazing `Keras` is. The fish are intrigued, and need your help improving `Beras`!*

*Deep under the sea, they are learning deep learning deeply!*

## Assignment Overview

In this assignment you will expand on Beras by adding support for MLPs, activation functions, and much, much more!

### Assignment Goals

1. Implement a multi-layer perceptron (MLP) with a structure resembling the Keras API.
    - Implement commonly used **activation functions**
    - Implement a more advanced **Gradient Tape** capable of **backpropagating through multiple** dense layers
    - Implement basic **accuracy metrics**
2. Apply this model to classify handwritten digits from the MNIST dataset.

> **Note:** The code is shared between HW1 and HW2. Therefore, it is highly recommended to complete HW1 in its entirety in a timely fashion to prevent any delays in completing HW2.

## Getting Started

### Stencil

Please click [here](https://classroom.github.com/a/cgIqItrq) to get the stencil code.

> **Warning:** **Do not change the stencil except where specified.** You are welcome to write your own helper functions; however, changing the stencil's method signatures or removing pre-defined functions may make your code incompatible with our autograder and result in a low overall grade.

At this point you've already done a lot of hard work to get Beras working for a single dense layer, and we want to continue building on top of that! To do so, we ask that you copy over some code from the following files in your HW1 repository into this repository:

- `beras/layers.py`: copy your file from HW1 into this file in the HW2 repo, replacing the _entire_ file currently in this repo.
- `beras/losses.py`: copy the code for the `call` and `get_input_gradients` functions for `MeanSquaredError` from HW1 into this file in HW2. Make sure **not** to replace the entire file: the stencil for this file in HW2 contains more tasks than you had in HW1.
- `beras/optimizers.py`: copy the `BasicOptimizer` from your HW1 code into this file in HW2. Make sure **not** to replace the entire file: the stencil for this file in HW2 contains more tasks than you had in HW1.

### Environment

You will need to use the virtual environment that you made in Homework 0. You can activate the environment with the command `conda activate csci1470`. If you have any issues running the stencil code, be sure that your conda environment contains at least the following packages:

- `python==3.10`
- `numpy`
- `tensorflow`
- `scikit-learn` (imported as `sklearn` in actual Python programs!)
- `pytest`

On a Windows conda prompt or Mac terminal, you can check whether a package is installed with:

```bash
conda list -n csci1470 <package_name>
```

On Unix systems, you can check whether a package is installed with:

```bash
conda list -n csci1470 | grep <package_name>
```

> **Note:** Be sure to read this handout in its **entirety** before moving on to implementing **any** part of the assignment!
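If `conda list` shows the right packages but you still hit errors, a quick way to confirm that everything imports inside the activated environment is a tiny script like the one below. This is just a sanity-check sketch: the exact version numbers printed will vary, and only the four packages listed above are assumed.

```python
# Run inside the activated csci1470 environment.
# Note that scikit-learn is imported under the name `sklearn`.
import numpy
import tensorflow
import sklearn
import pytest

print("numpy:", numpy.__version__)
print("tensorflow:", tensorflow.__version__)
print("scikit-learn:", sklearn.__version__)
print("pytest:", pytest.__version__)
```

If any of these imports fails, reinstall the missing package into the `csci1470` environment before continuing.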
## Roadmap

### Preprocessing Data

For this assignment, you will be working with the MNIST dataset, a classic machine learning dataset consisting of the handwritten digits 0-9. Take a glance at [the documentation for the MNIST loading function we use](https://www.tensorflow.org/api_docs/python/tf/keras/datasets/mnist/load_data).

Before we jump back into developing Beras, we'll begin by preprocessing the data into a more workable form.

* First, take the input data and flatten each $28 \times 28$ image into a flat vector of length $784$.
* Then, normalize all the pixel values in the images to float values in the range $0.0$ to $1.0$. (Hint: there is a specific value you should use to normalize...)
* Finally, convert everything into Beras Tensors and return everything.

**Task 1:** Fill out the `load_and_preprocess_data` function in `preprocess.py`!

### Picking up Where We Left Off

**Task 2:** Fill out `layers.py` and `MeanSquaredError` in `losses.py` with your code from HW1.

### Activation Functions

Since you've already implemented Dense layers in Homework 1, all that's left to make an MLP is the activation layers that sit between the dense layers! For this assignment you'll be implementing the LeakyReLU, Sigmoid, and Softmax activation functions.

**Note:** By implementing LeakyReLU, you also get ReLU for free!

**Task 3:** Implement the activation functions in `beras/activations.py`, along with their gradients!

### Loss Functions

Unlike the regression task in Homework 1, we are working with categorical data for a classification task. We could still use the same MSE loss we used for our regression model; however, there are other losses that are better suited to this task!

**Task 4:** Implement the Categorical Cross Entropy loss in `beras/losses.py`.

### One-Hot Encoding

Since we are working with categorical data for our classification task, we also want our labels to be one-hot encoded (see the short NumPy sketch at the end of this roadmap for the idea).

**Task 5:** Implement `fit`, `call`, and `inverse` in `beras/onehot.py`.

### Optimizers

We touched on different optimization methods in Lab 2. Let's see if they can work their magic here to help your neural network classify some digits!

**Task 6:** Implement the `RMSProp` and `Adam` optimizers in `beras/optimizers.py`.

### Accuracy Metrics

When training and testing your model, it's good to know how well it's doing. We could use loss to indicate this; however, it is often hard to develop an intuition for what it means for a model to be at a particular loss value. Therefore, we rely on other accuracy metrics that give a better sense of how the model is doing.

**Task 7:** Implement the categorical accuracy metric in `beras/metrics.py`.

### Gradient Tape

**Note:** In the past this has been the hardest section of the assignment for most people. Our advice is to take it slow, and try to draw out your thought process before hammering your way through the code. If you get stuck, do not hesitate to reach out on EdStem or stop by TA hours!

This is arguably the most important part of the assignment. We'll be implementing backpropagation through multiple layers rather than doing the backpropagation manually as we did in Homework 1. Before starting, make sure you understand the gradient tape function from Homework 1.

**Task 8:** Implement a gradient tape in `beras/gradient_tape.py`.

**Hint:** Notice our `queue` variable! How can we use the queue to backpropagate all the way through the computation graph?
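For intuition about what a gradient tape does, here is how the real Keras API (which Beras mirrors) is used. This is purely a reference for the behavior you are building, not a piece of your solution — remember that your graded code may not use `tensorflow` (see the Grading warning below).

```python
import tensorflow as tf

# A tiny two-step "network": a dense-like matmul followed by a squaring op.
x = tf.constant([[1.0, 2.0]])
w = tf.Variable([[0.5], [0.5]])

with tf.GradientTape() as tape:
    h = x @ w      # "layer 1": forward ops are recorded on the tape
    y = h * h      # "layer 2": the tape now knows the full computation graph

# Walking the recorded graph backwards applies the chain rule
# through *both* ops to reach w — this is what your tape must do.
grad = tape.gradient(y, w)
```

Your Beras tape should produce the same effect: given the loss at the end of the graph, it propagates gradients backwards through every recorded layer until it reaches the trainable weights.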
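Finally, here is the promised sketch for one-hot encoding (Task 5). This is a minimal NumPy illustration of the idea only: it does not follow the stencil's `fit`/`call`/`inverse` interface, and the label set of digits 0-9 is just the MNIST case.

```python
import numpy as np

labels = np.array([3, 0, 4, 1])       # raw class labels
num_classes = 10                      # MNIST digits 0-9

# One-hot encode: row i is all zeros except a 1 at column labels[i].
one_hot = np.eye(num_classes)[labels]     # shape (4, 10)

# "Inverse": recover the original labels from the one-hot rows.
recovered = np.argmax(one_hot, axis=1)    # array([3, 0, 4, 1])
```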
### Model Evaluation

**Task 9:** Implement `Model.evaluate` in `beras/model.py`.

**Hint:** Refer to `Model.fit`. What should be different between `fit` and `evaluate`?

### Putting it all Together

All the individual pieces have been made! It's now time to put them all together. We'll be revisiting the ideas behind the `SingleLayerModel` class you saw in Homework 1, except this time we'll be filling the model with more components.

**Task 10.1:** Fill out the `call` function of the `SequentialModel`.

A `SequentialModel` is a type of model consisting of a list of layers that are called in order (or "sequentially") on the input in the `call` function. Think back to `SingleLayerModel` – we know how to call a single layer; how can we expand that to call multiple layers, one after another, on the input? We might, for example, initialize a model which looks like:

```python
model = Sequential(
    [
        layers.Dense(...),  # layer 1
        layers.Dense(...),  # layer 2
        ...
    ]
)
```

We can assume that the layers are stored in the order we want to call them! (In the example above, layer 1 would get called before layer 2.)

**Task 10.2:** Write a call function that passes an input batch of flattened images through your model to get the predicted outputs, and fill in `get_simple_model_components` and `get_advanced_model_components` in `assignment.py`. Keep in mind what the start and end dimensions should be.

Additional details about filling in `get_simple_model_components` and `get_advanced_model_components`:

* Simple Model: two dense layers, with a Leaky ReLU and a Sigmoid activation function. Use mean squared error as your loss.
* Advanced Model: two dense layers, with a Leaky ReLU and a Softmax activation function. Use categorical cross entropy as your loss.

Note that we also provide you with `get_simplest_model_components`. This should look very similar to what you implemented in HW1, and it will NOT be graded in this homework. It is mainly for you to test the basic functionality of your call function and Beras classes.

Some additional comments:

* On Mac devices, you can manually terminate training with `control+c`.
* You can play around with the size of the intermediate layers in your model.
* We have provided testing files (`test_assignment.py`, `test_beras.py`, `test_preprocess.py`) with some sanity checks that you can run to help identify where an issue might be occurring. While these sanity checks don't _guarantee_ that the code being tested is completely correct, they are useful for catching failures on basic tests (in which case you know your code isn't quite right).
* To run the tests:
    1. Make sure you have the virtual environment activated.
    2. `cd` into the `code` directory.
    3. Uncomment the tests you want to run from the `main` method of the test file to be run.
    4. Run `python3 <test-filename>` (e.g., to run some of the tests in `test_beras.py`, run `python3 test_beras.py`). If a test fails, you will see an `AssertionError`.

## Submission

### Requirements

- Implement a deep learning framework, and use that framework to build a model that classifies MNIST images.
- Run and pass all sanity tests provided.
- Include a brief `README.md` file containing your **model's accuracy** and any **known bugs** (🐛).

### Grading

Your code will be graded primarily on functionality, as determined by the Gradescope autograder. Your model should have a test accuracy **greater than $95\%$**.

> **Warning:** You will not receive any credit for functions that use `tensorflow`, `keras`, `torch`, or `scikit-learn` functions within them. You must implement all functions manually using either vanilla Python or NumPy.

### Handing In

You should submit the assignment via Gradescope under the corresponding project assignment **through GitHub**. To submit via GitHub, commit and push all of your changes to your repository. You can do this by running the following commands:

```bash
git commit -am "commit message"
git push
```

> **Note:** We highly recommend committing your files to git and syncing with your GitHub repository **often** throughout the course of the assignment to ensure none of your hard work is **lost**!