HW3 Conceptual: CNNs

Conceptual questions due Monday, March 4th, 2024 at 6:00 PM EST
Programming assignment due Friday, March 8th, 2024 at 6:00 PM EST

Answer the following questions, showing your work where necessary. Please explain your answers and work.

We now require the use of LaTeX to typeset your answers, as it makes things easier for both you and us.

We provide a general LaTeX template that you may use for conceptual assignments.

Do NOT include your name anywhere within this submission. Points will be deducted if you do so.

Theme


Muffin or chihuahua? Or a chihuahua named "Muffin"? The age-old question no CNN can answer.

Conceptual Questions

1. Consider the following three 23×23 images of the digit 3.

[Image: three 23×23 images of the digit 3]

  • a. Which neural net is better suited to identify the digit in each image: a convolutional neural net or a multilayer perceptron (a neural network with multiple fully-connected layers and nonlinear layers)? Explain your reasoning. (2-3 sentences)
  • b. Will a convolutional layer with standard max-pooling (e.g., 2×2 pooling) produce the same or different outputs for all of the images? Why/why not? How does this relate to translational invariance/equivariance? (Hint: remember that the image is 23×23.) (3-4 sentences)
  • c. Let’s say you built a convolutional neural network to classify these images with two layers: a convolution layer and a fully connected (linear) layer. What are their roles in the network respectively? (2-3 sentences)

2. Consider the image of a polar bear shown below.

[Image: a polar bear]

  • a. What are some examples of features that earlier convolutional layers will extract from this image? What about later layers?
  • b. The input image is converted into a 13×13 matrix and convolved with a 3×3 filter using a stride of 2. Assume that we are using VALID padding. Determine the size of the convolved matrix.
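As a reminder for part b, the output size of a VALID (no padding) convolution along one axis is commonly computed as floor((n - k) / s) + 1, where n is the input size, k the filter size, and s the stride. The sketch below is only an illustration of that formula; the helper name `valid_conv_output_size` is hypothetical, and the example deliberately uses different numbers from the question.

```python
def valid_conv_output_size(input_size: int, kernel_size: int, stride: int = 1) -> int:
    """Output length along one axis for a convolution with VALID (no) padding."""
    if kernel_size > input_size:
        raise ValueError("kernel must fit inside the input")
    # The filter can start at positions 0, stride, 2*stride, ... as long as it
    # still fits entirely inside the input.
    return (input_size - kernel_size) // stride + 1


# Example with different numbers from the question: 7x7 input, 3x3 filter, stride 2.
side = valid_conv_output_size(7, 3, stride=2)
print(f"{side}x{side}")
```

Because the input and filter are square here, the same value applies to both height and width.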

3. The following questions refer to CNNs in different dimensions.

  • a. So far in this class, we’ve only explored 2D CNNs for image recognition and classification. However, 1D CNNs are also popular in many fields, with the network convolving linearly in only one direction. Give a scenario where a 1D CNN could be useful, and explain how the CNN can extract relevant features in a 1D setting. We’re looking for specific examples! (3-4 sentences)
  • b. Suppose you want your computer to read Twitter (or, as it is now called, X) data. Explain how you could leverage 1D CNNs to classify different emotions from input tweets. How would you train your model? What would your CNN kernel convolve over? How would you take into account variable tweet sizes? (3-4 sentences)
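To make "convolving in only one direction" concrete, here is a minimal NumPy sketch of a single 1D filter sliding along a signal. The signal and the hand-picked difference kernel are made-up illustrations, not part of the assignment; a trained 1D CNN would learn its kernels rather than have them fixed by hand, and your answers should describe your own scenario.

```python
import numpy as np


def conv1d_valid(signal: np.ndarray, kernel: np.ndarray) -> np.ndarray:
    """Slide `kernel` along `signal` without padding.

    This is cross-correlation, which is what most deep learning
    libraries compute under the name "convolution".
    """
    return np.correlate(signal, kernel, mode="valid")


# Hypothetical 1D signal: flat, then a jump, then flat again.
signal = np.array([0.0, 0.0, 0.0, 1.0, 1.0, 1.0])
# Hand-picked difference kernel (an assumption for illustration only).
kernel = np.array([-1.0, 1.0])

response = conv1d_valid(signal, kernel)
print(response)  # the response is largest where the signal jumps
```

The same filter weights are reused at every position, so the jump is detected wherever it occurs along the axis.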

4. (Optional) Have feedback for this assignment? Found something confusing? We’d love to hear from you!

2740-Only Questions

1. Given an image I and convolutional kernel K, prove (for the discrete case) that convolution is equivariant under translation. It’s fine to do this just for 1D convolution.

Hint: Refer to the "Are CNNs Translation Invariant?" slide from the lecture.
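Before writing the proof, it can help to check the claim numerically. The sketch below is a sanity check under stated assumptions, not a proof: it uses random 1D arrays for I and K, a hypothetical helper `shift_right` that translates by zero-padding on the left, and full (unpadded-output) convolution so that nothing is cut off at the boundary.

```python
import numpy as np


def shift_right(x: np.ndarray, s: int) -> np.ndarray:
    """Translate x by s positions to the right, padding with zeros on the left."""
    return np.concatenate([np.zeros(s), x])


rng = np.random.default_rng(0)
I = rng.normal(size=8)  # hypothetical 1D "image"
K = rng.normal(size=3)  # hypothetical kernel
s = 2                   # translation amount

# Translate first, then convolve ...
shift_then_conv = np.convolve(shift_right(I, s), K, mode="full")
# ... versus convolve first, then translate the result.
conv_then_shift = shift_right(np.convolve(I, K, mode="full"), s)

print(np.allclose(shift_then_conv, conv_then_shift))
```

Agreement on one random example only suggests the property holds; the question asks you to show it for every I, K, and shift.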

2. Suppose you have a CNN that begins by taking an input image of size 28×28×3 and passing it through a convolution layer that convolves the image using 3 filters of dimensions 2×2×3 with valid padding.

  • a. How many learnable parameters does this convolution layer have?
  • b. Suppose that you instead decided to use a fully connected layer to replicate the behavior of this convolutional layer. How many parameters would that fully connected layer have?
  • c. Read about cutout
    • i. What is cutout? Why is it useful?
    • ii. What are some similar methods? What makes them similar?
    • iii. What were the cutout sizes for CIFAR-10 and CIFAR-100? How did the researchers decide on their cutout size? Why do you think the cutout size differed for CIFAR-10 vs CIFAR-100?
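For orientation while reading about cutout in part c, here is a rough sketch of the basic idea as it is commonly described: during training, a square patch of the input image is masked out at a random location. Everything specific below is an assumption for illustration, including the function name, masking with zeros, clipping the patch at the image border, and the arbitrary patch size; consult the reading for the authors' actual procedure and sizes.

```python
import numpy as np


def cutout(image: np.ndarray, size: int, rng: np.random.Generator) -> np.ndarray:
    """Return a copy of `image` (H x W x C) with one square patch set to zero.

    The patch is centered at a random pixel and clipped at the image border,
    so it can be smaller than size x size near the edges.
    """
    h, w = image.shape[:2]
    cy, cx = int(rng.integers(0, h)), int(rng.integers(0, w))
    y0, y1 = max(cy - size // 2, 0), min(cy + size // 2, h)
    x0, x1 = max(cx - size // 2, 0), min(cx + size // 2, w)
    out = image.copy()  # leave the caller's array untouched
    out[y0:y1, x0:x1] = 0
    return out


rng = np.random.default_rng(0)
image = np.ones((28, 28, 3))             # stand-in for a real training image
masked = cutout(image, size=6, rng=rng)  # patch size chosen arbitrarily
print(int((masked == 0).sum()), "values masked")
```

In practice a transform like this would be applied independently to each training image, with a fresh random location every time.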