---
tags: machine-learning
---
# Convolutional Neural Network with Numpy (Fast)
<div style="text-align: center">
<img src="https://raw.githubusercontent.com/valoxe/image-storage-1/master/blog-deep-learning/cnnumpy-fast/0.png" width="80%">
</div>
> In the [previous post](https://hackmd.io/@machine-learning/blog-post-cnnumpy-slow), we have seen a naive implementation of Convolutional Neural network using Numpy.
>
> Here, we are going to implement a faster CNN using Numpy with the im2col/col2im method.
>
> To see the full implementation, please refer to my [repository](https://github.com/3outeille/CNNumpy).
>
> Also, if you want to read some of my blog posts, feel free to check them at my [blog](https://3outeille.github.io/deep-learning/).
# I) Forward propagation
:::info
- The main objective here is to ensure that the reader develops a strong intuition about how im2col/col2im works so that he can write them by himself.
- Thus, I decided to be less **rigorous** in my explanation to make things **clearer**.
- We will refer to the term "**level**" as a whole horizontal kernel slide from left to right.
- We will only discuss **average pooling** layer here (even though same logic can be applied on max pooling).
:::
## 1) Convolutional layer
- In the naive implementation, we used a lot of nested "For loops" which makes our code very slow.
- An approach could be to trade some memory for more speedup.
- Here are the following steps:
- **A.** ==Transform our input image into a matrix (im2col).==
- **B.** ==Reshape our kernel (flatten).==
- **C.** ==Perform matrix multiplication between reshaped input image and kernel.==
- We are going to see how it works intuitively and then how to implement it using Numpy.
## ☰ Intuition
- As an example, we will perform a convolution between an (1,3,4,4) input image and kernels of shape (2,3,2,2).
### A) <ins>Transform our input image into a matrix (im2col)</ins>
- Here is how it works:
<div style="text-align:center">
<img src="https://raw.githubusercontent.com/valoxe/image-storage-1/master/blog-deep-learning/cnnumpy-fast/1.gif">
<figcaption>Figure 1: Input image transformed into matrix</figcaption>
</div>
<br>
- How do we do that in Numpy? An efficient way to do so is by the help of [multi-dimensional arrays indexing](https://docs.scipy.org/doc/numpy/user/basics.indexing.html). For example,
```python
>>> y = np.arange(35).reshape(5,7)
>>> y
array([[ 0, 1, 2, 3, 4, 5, 6],
[ 7, 8, 9, 10, 11, 12, 13],
[14, 15, 16, 17, 18, 19, 20],
[21, 22, 23, 24, 25, 26, 27],
[28, 29, 30, 31, 32, 33, 34]])
>>> y[np.array([0,2,4]), np.array([0,1,2])]
array([0, 15, 30]) # 0 = (0,0) / 15 = (2,1) / 30 = (4,2)
```
- Thus, the idea is to use **the multi-dimensional arrays indexing** property to transform our input image into a matrix.
- Indeed, we can notice few things:
- **Firstly**, the indices for each input image channel is the same. Thus, we can focus ourselves only on the first channel since the result will be the same for the others.
<div style="text-align:center">
<img src="https://raw.githubusercontent.com/valoxe/image-storage-1/master/blog-deep-learning/cnnumpy-fast/2.png" width="90%">
</div>
- **Secondly**, we can observe a certain pattern in the indices (<a style="color:red">i</a>, <a style="color:#FF1493">j</a>) when we slide our kernel.
<br>
- <a style="color:red"><ins>Index i</ins></a>:
- At level 1, we have:
![](https://raw.githubusercontent.com/valoxe/image-storage-1/master/blog-deep-learning/cnnumpy-fast/3.png)
- At level 2, we have:
![](https://raw.githubusercontent.com/valoxe/image-storage-1/master/blog-deep-learning/cnnumpy-fast/4.png)
- At level 3, we have:
![](https://raw.githubusercontent.com/valoxe/image-storage-1/master/blog-deep-learning/cnnumpy-fast/5.png)
- <a style="color:red"><ins>Conclusion:</ins></a>
- We start with a [0, 0, 1, 1] vector at level 1.
- **At each level, we increase the vector by 1.**
<br>
- <a style="color:#FF1493"><ins>Index j</ins></a>:
- At level 1, we have:
![](https://raw.githubusercontent.com/valoxe/image-storage-1/master/blog-deep-learning/cnnumpy-fast/6.png)
- At level 2, we have:
![](https://raw.githubusercontent.com/valoxe/image-storage-1/master/blog-deep-learning/cnnumpy-fast/7.png)
- At level 3, we have:
![](https://raw.githubusercontent.com/valoxe/image-storage-1/master/blog-deep-learning/cnnumpy-fast/8.png)
- <a style="color:#FF1493"><ins>Conclusion:</ins></a>
- At the level 1, there is a total of 3 slides.
- For slide 1, we have a [0,1,0,1] vector.
- For slide 2, we have a [1,2,1,2] vector.
- For slide 3, we have a [2,3,2,3] vector.
- **We can notice an increase of 1 at each slide.**
- **At each level, we keep the same pattern.**
---
- Thus, even if it's not rigorous, we can intuitively think of a general formula for an **(n,n) image convolve to $X$ filters of shape (k, k)**.
<br>
- <a style="color:red"><ins>For index i</ins></a>:
<br>
- We start with at level 1 with the following vector:
$$[\underbrace{0,0,...0}_{k},\underbrace{1,1,...1}_{k},...,\underbrace{k-1,k-1,...,k-1}_{k}]$$
- ==**At each level, we increase this vector by 1**==
<br>
- <a style="color:#FF1493"><ins>For index j</a></ins>:
<br>
- At level 1, there is a total of n-k slides.
<br>
- For slide 1, we have a $[\underbrace{0,1,...,k-1,...,0,1,...,k-1}_{k}]$ vector.
- For slide 2, we have a $[\underbrace{1,2,...,k,...,1,2,...,k}_{k}]$ vector.
- ...
- For slide n-k, we have a $[\underbrace{n-k,n-k+1,...,n-1,...,n-k,n-k+1,...,n-1}_{k}]$ vector.
- ==**At each level, we keep the same pattern.**==
<br>
- The numbers of filters $X$ do not have any effect in this part. We will see that it's quite simple to deal with it during the kernel reshaping step.
- If you have $M$ images ($M > 1$), you will have the same matrix but stack horizontally $M$ times.
<div style="text-align:center">
<img src="https://raw.githubusercontent.com/valoxe/image-storage-1/master/blog-deep-learning/cnnumpy-fast/9.gif">
</div>
### B) <ins>Reshape our kernel (flatten)</ins>
- Here is how it works:
<div style="text-align:center">
<img src="https://raw.githubusercontent.com/valoxe/image-storage-1/master/blog-deep-learning/cnnumpy-fast/10.png" width="80%">
<figcaption>Figure 2: Reshaped version of the 2 kernels</figcaption>
</div>
<br>
- As you can see, each filter is flattened and then stacked together. Thus, for $X$ filter, we will flatten and stack $X$ filters together.
### C) <ins>Matrix multiplication between reshaped input and kernel</ins>
- Now, we only need to perform a matrix multiplication.
<div style="text-align:center">
<img src="https://raw.githubusercontent.com/valoxe/image-storage-1/master/blog-deep-learning/cnnumpy-fast/11.gif">
<figcaption>Figure 3: Matrix multiplication</figcaption>
</div>
<br>
- At the end, we need to reshape our matrix back to an image.
- ==Be aware the `np.reshape()` method doesn't return the expected result here (elements in wrong order). A little bit of numpy gymnastic solves the problem.==
## ☰ Implementation
- The most difficult part of im2col is to **transform our input image into a matrix.**
- To do so, we will use `np.tile()` and `np.repeat()` methods from Numpy. Here is how they work:
```python
>>> y = np.arange(5)
>>> y
array([0, 1, 2, 3, 4])
>>> np.tile(y, 2)
array([0, 1, 2, 3, 4, 0, 1, 2, 3, 4])
>>> np.repeat(y, 2)
array([0, 0, 1, 1, 2, 2, 3, 3, 4, 4])
```
- Now, let's explain the process visually.
:::info
**Reminder:**
- We want to perform a convolution between an (1,3,4,4) input image kernels of shape (2,3,2,2).
:::
- For index i:
<div style="text-align:center">
<img src="https://raw.githubusercontent.com/valoxe/image-storage-1/master/blog-deep-learning/cnnumpy-fast/12-a.png">
</div>
<br>
<div style="text-align:center">
<img src="https://raw.githubusercontent.com/valoxe/image-storage-1/master/blog-deep-learning/cnnumpy-fast/12-b.png">
</div>
<br>
<div style="text-align:center">
<img src="https://raw.githubusercontent.com/valoxe/image-storage-1/master/blog-deep-learning/cnnumpy-fast/12-c.gif">
</div>
<br>
- For index j:
<div style="text-align:center">
<img src="https://raw.githubusercontent.com/valoxe/image-storage-1/master/blog-deep-learning/cnnumpy-fast/13-a.png">
</div>
<br>
<div style="text-align:center">
<img src="https://raw.githubusercontent.com/valoxe/image-storage-1/master/blog-deep-learning/cnnumpy-fast/13-b.png">
</div>
<br>
<div style="text-align:center">
<img src="https://raw.githubusercontent.com/valoxe/image-storage-1/master/blog-deep-learning/cnnumpy-fast/13-c.gif">
</div>
<br>
- Now, we can transform our input image into a matrix.
<div style="text-align:center">
<img src="https://raw.githubusercontent.com/valoxe/image-storage-1/master/blog-deep-learning/cnnumpy-fast/14.gif">
</div>
<br>
---
Here is the code to implement **im2col**:
```python=
def get_indices(X_shape, HF, WF, stride, pad):
"""
Returns index matrices in order to transform our input image into a matrix.
Parameters:
-X_shape: Input image shape.
-HF: filter height.
-WF: filter width.
-stride: stride value.
-pad: padding value.
Returns:
-i: matrix of index i.
-j: matrix of index j.
-d: matrix of index d.
(Use to mark delimitation for each channel
during multi-dimensional arrays indexing).
"""
# get input size
m, n_C, n_H, n_W = X_shape
# get output size
out_h = int((n_H + 2 * pad - HF) / stride) + 1
out_w = int((n_W + 2 * pad - WF) / stride) + 1
# ----Compute matrix of index i----
# Level 1 vector.
level1 = np.repeat(np.arange(HF), WF)
# Duplicate for the other channels.
level1 = np.tile(level1, n_C)
# Create a vector with an increase by 1 at each level.
everyLevels = stride * np.repeat(np.arange(out_h), out_w)
# Create matrix of index i at every levels for each channel.
i = level1.reshape(-1, 1) + everyLevels.reshape(1, -1)
# ----Compute matrix of index j----
# Slide 1 vector.
slide1 = np.tile(np.arange(WF), HF)
# Duplicate for the other channels.
slide1 = np.tile(slide1, n_C)
# Create a vector with an increase by 1 at each slide.
everySlides = stride * np.tile(np.arange(out_w), out_h)
# Create matrix of index j at every slides for each channel.
j = slide1.reshape(-1, 1) + everySlides.reshape(1, -1)
# ----Compute matrix of index d----
# This is to mark delimitation for each channel
# during multi-dimensional arrays indexing.
d = np.repeat(np.arange(n_C), HF * WF).reshape(-1, 1)
return i, j, d
def im2col(X, HF, WF, stride, pad):
"""
Transforms our input image into a matrix.
Parameters:
- X: input image.
- HF: filter height.
- WF: filter width.
- stride: stride value.
- pad: padding value.
Returns:
-cols: output matrix.
"""
# Padding
X_padded = np.pad(X, ((0,0), (0,0), (pad, pad), (pad, pad)), mode='constant')
i, j, d = get_indices(X.shape, HF, WF, stride, pad)
# Multi-dimensional arrays indexing.
cols = X_padded[:, d, i, j]
cols = np.concatenate(cols, axis=-1)
return cols
def forward(self, X):
"""
Performs a forward convolution.
Parameters:
- X : Last conv layer of shape (m, n_C_prev, n_H_prev, n_W_prev).
Returns:
- out: previous layer convolved.
"""
m, n_C_prev, n_H_prev, n_W_prev = X.shape
n_C = self.n_F
n_H = int((n_H_prev + 2 * self.p - self.f)/ self.s) + 1
n_W = int((n_W_prev + 2 * self.p - self.f)/ self.s) + 1
X_col = im2col(X, self.f, self.f, self.s, self.p)
w_col = self.W['val'].reshape((self.n_F, -1))
b_col = self.b['val'].reshape(-1, 1)
# Perform matrix multiplication.
out = w_col @ X_col + b_col
# Reshape back matrix to image.
out = np.array(np.hsplit(out, m)).reshape((m, n_C, n_H, n_W))
self.cache = X, X_col, w_col
return out
```
## 2) Pooling layer
- We can make the average pooling operation faster by using **im2col** method.
- ==Be aware that the `np.reshape()` method doesn't return the expected result here (elements in wrong order). A little bit of numpy gymnastic solves the problem.==
<div style="text-align:center">
<img src="https://raw.githubusercontent.com/valoxe/image-storage-1/master/blog-deep-learning/cnnumpy-fast/1-pool.png">
</div>
---
Here is the implementation code:
```python=
def forward(self, X):
"""
Apply average pooling.
Parameters:
- X: Output of activation function.
Returns:
- A_pool: X after average pooling layer.
"""
self.cache = X
m, n_C_prev, n_H_prev, n_W_prev = X.shape
n_C = n_C_prev
n_H = int((n_H_prev + 2 * self.p - self.f)/ self.s) + 1
n_W = int((n_W_prev + 2 * self.p - self.f)/ self.s) + 1
X_col = im2col(X, self.f, self.f, self.s, self.p)
X_col = X_col.reshape(n_C, X_col.shape[0]//n_C, -1)
A_pool = np.mean(X_col, axis=1)
# Reshape A_pool properly.
A_pool = np.array(np.hsplit(A_pool, m))
A_pool = A_pool.reshape(m, n_C, n_H, n_W)
return A_pool
```
# II) Backward propagation
:::warning
- This part will be tougher than the previous one.
- However, if you have read the <a href="https://hackmd.io/@bouteille/ByusmjZc8" style="color:red">previous post</a> about the naive implementation of Convolutional Neural network using Numpy, it should be fine.
- Along the way, you will often encounter a **"Be aware"** sentence about reshaping. I strongly advise you to run and play with `unit_tests.py` to understand why these numpy gymnastics were required.
:::
## 1) Convolutional layer
:::info
**Reminder:**
- We performed a convolution between (1,3,4,4) input image and kernels of shape (2,3,2,2) which output an (2,3,3) image.
- During the backward pass, the (2,3,3) image contains the error/gradient (**"dout"**) which needs to be back-propagated to the:
- (1,3,4,4) input image (layer).
- (2,3,2,2) kernels.
:::
## ⋆ Layer gradient: Intuition
- The formula to compute the layer gradient is:
$$\boxed{\frac{\partial L}{\partial I} = Conv(K, \frac{\partial L}{\partial O})}$$
- $\frac{\partial L}{\partial I}$: Input gradient.
- $K$: Kernels.
- $\frac{\partial L}{\partial O}$: Output gradient.
- $Conv$: Convolution operation.
- To do so, we will proceed as follow:
- **A.** ==Reshape dout ($\frac{\partial L}{\partial O}$).==
- **B.** ==Reshape kernels `w` into single matrix `w_col`.==
- **C.** ==Perform matrix multiplication between reshaped `dout` and kernel.==
- **D.** ==Reshape back to image (**col2im**).==
- We are going to see how it works intuitively and then how to implement it using Numpy.
### A) <ins>Reshape `dout`</ins>
- During backward propagation, the output of the forward convolution contains the error that needs to be back-propagated.
<div style="text-align:center">
<img src="https://raw.githubusercontent.com/valoxe/image-storage-1/master/blog-deep-learning/cnnumpy-fast/15.png" width="80%">
</div>
### B) <ins>Reshape `w` into `w_col`</ins>
<div style="text-align:center">
<img src="https://raw.githubusercontent.com/valoxe/image-storage-1/master/blog-deep-learning/cnnumpy-fast/16.png">
</div>
### C) <ins>Perform matrix multiplication between reshaped `dout` and `w_col`</ins>
- In order to perform to perform the matrix multiplication, we need to transpose `w_col`.
- We will denoted the output as `dX_col`.
<div style="text-align:center">
<img src="https://raw.githubusercontent.com/valoxe/image-storage-1/master/blog-deep-learning/cnnumpy-fast/17.gif">
</div>
<br>
- ==Notice that we are in fact, broadcasting the error in `dout` to each kernel as we did in the naive implementation.==
<div style="text-align:center">
<img src="https://raw.githubusercontent.com/valoxe/image-storage-1/master/blog-deep-learning/cnnumpy-fast/18.gif">
</div>
<br>
### D) <ins>Reshape back to image (col2im)</ins>
- ==Here, **col2im** is more than a simple backward operation of im2col. Indeed, we have to take care of cases where errors will overlap with others.==
- As we can see in the previous gif, the (14,14,6) image has overlapping window. We need to reproduce the same effect !
<div style="text-align:center">
<img src="https://raw.githubusercontent.com/valoxe/image-storage-1/master/blog-deep-learning/cnnumpy-fast/19.gif">
</div>
<br>
## ⋆ Layer gradient: Implementation
- The most difficult part of **col2im** is to reshape our matrix back to an image because it requires us to take care of the overlapping gradient.
- An efficient and elegant way to do so is to use the [np.add.at](https://numpy.org/doc/stable/reference/generated/numpy.ufunc.at.html) method from Numpy. Here is a short example of how it works:
```python=
>>> indices = [
[0,4,1], # rows.
[3,2,4] # columns.
]
>>> X = np.zeros((5,6))
>>> np.add.at(X, indices, 1)
>>> X
array([[0, 0, 0, 1, 0, 0],
[0, 0, 0, 0, 1, 0],
[0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0],
[0, 0, 1, 0, 0, 0]])
```
- We will proceed as follow:
- Create a matrix filled with 0 of the same shape as input image (add padding if needed).
- `X_padded`: (1,3,4,4) with pad=0.
- Use **get_indices()** which returns index matrices, necessary to transform our input image into a matrix.
- `i`: (12,9)
- `j`: (12,9)
- `d`: (12,1)
- Retrieve `dX_col` batch dimension by splitting it `N` (number of images) times. For example, if you have `N` images, then:
- `dX_col`: (12, 9) => (N, 12, 9)
- ==Be aware that the `np.reshape()` method doesn't return the expected result here (elements in wrong order). A little bit of numpy gymnastic solves the problem.==
- Use `i,j,d` matrices as argument in `np.add.at` to reshape our matrix back to input image.
- Refer to step ==[D) Reshape back to image (col2im)](#D-Reshape-back-to-image-col2im)== for `np.add.at` method visualization.
- Remove padding from new image if needed.
## ○ Kernel gradient: Intuition
- The formula to compute the kernel gradient is:
$$\boxed{\frac{\partial L}{\partial K} = Conv(I, \frac{\partial L}{\partial O})}$$
- $\frac{\partial L}{\partial K}$: Kernels gradient.
- $I$: Input image.
- $\frac{\partial L}{\partial O}$: Output gradient.
- $Conv$: Convolution operation.
- To do so, we will:
- **A.** ==Reshape dout ($\frac{\partial L}{\partial O}$).==
- **B.** ==Apply **im2col** on `X` to get `X_col`.==
- **C.** ==Perform matrix multiplication between reshaped `dout` and `X_col` to get `dw_col`.==
- **D.** ==Reshape `dw_col` back to `dw`.==
- We are going to see how it works intuitively and then how to implement it using Numpy.
### A) <ins>Reshape `dout`</ins>
- ==Be aware that the `np.reshape()` method doesn't return the expect result here (elements in wrong order). A little bit of numpy gymnastic solves the problem.==
<div style="text-align:center">
<img src="https://raw.githubusercontent.com/valoxe/image-storage-1/master/blog-deep-learning/cnnumpy-fast/20.png" width="80%">
</div>
### B) <ins>Apply im2col on `X` to get `X_col`</ins>
<div style="text-align:center">
<img src="https://raw.githubusercontent.com/valoxe/image-storage-1/master/blog-deep-learning/cnnumpy-fast/21.gif">
</div>
### C) <ins>Perform matrix multiplication between reshaped dout and `X_col` to get `dw_col`</ins>
- In order to perform to perform the matrix multiplication, we need to transpose `X_col`.
<div style="text-align:center">
<img src="https://raw.githubusercontent.com/valoxe/image-storage-1/master/blog-deep-learning/cnnumpy-fast/22.png">
</div>
<br>
- ==Notice that we are in fact, broadcasting the error in `dout` to to each “slide” we did during the naive implementation forward propagation over the input==.
<div style="text-align:center">
<img src="https://raw.githubusercontent.com/valoxe/image-storage-1/master/blog-deep-learning/cnnumpy-fast/23.gif">
</div>
### D) <ins>Reshape `dw_col` back to `dw`</ins>
- We simply need to reshape `dw_col` back to its original kernel shape.
<div style="text-align:center">
<img src="https://raw.githubusercontent.com/valoxe/image-storage-1/master/blog-deep-learning/cnnumpy-fast/24.png" width="80%">
</div>
## ○ Kernel gradient: Implementation
- Nothing fancy here.
---
Here is the code to implement the layer and kernel gradient.
```python=
def col2im(dX_col, X_shape, HF, WF, stride, pad):
"""
Transform our matrix back to the input image.
Parameters:
- dX_col: matrix with error.
- X_shape: input image shape.
- HF: filter height.
- WF: filter width.
- stride: stride value.
- pad: padding value.
Returns:
-x_padded: input image with error.
"""
# Get input size
N, D, H, W = X_shape
# Add padding if needed.
H_padded, W_padded = H + 2 * pad, W + 2 * pad
X_padded = np.zeros((N, D, H_padded, W_padded))
# Index matrices, necessary to transform our input image into a matrix.
i, j, d = get_indices(X_shape, HF, WF, stride, pad)
# Retrieve batch dimension by spliting dX_col N times: (X, Y) => (N, X, Y)
dX_col_reshaped = np.array(np.hsplit(dX_col, N))
# Reshape our matrix back to image.
# slice(None) is used to produce the [::] effect which means "for every elements".
np.add.at(X_padded, (slice(None), d, i, j), dX_col_reshaped)
# Remove padding from new image if needed.
if pad == 0:
return X_padded
elif type(pad) is int:
return X_padded[pad:-pad, pad:-pad, :, :]
def backward(self, dout):
"""
Distributes error from previous layer to convolutional layer and
compute error for the current convolutional layer.
Parameters:
- dout: error from previous layer.
Returns:
- dX: error of the current convolutional layer.
- self.W['grad']: weights gradient.
- self.b['grad']: bias gradient.
"""
X, X_col, w_col = self.cache
m, _, _, _ = X.shape
# Compute bias gradient.
self.b['grad'] = np.sum(dout, axis=(0,2,3))
# Reshape dout properly.
dout = dout.reshape(dout.shape[0] * dout.shape[1], dout.shape[2] * dout.shape[3])
dout = np.array(np.vsplit(dout, m))
dout = np.concatenate(dout, axis=-1)
# Perform matrix multiplication between reshaped dout and w_col to get dX_col.
dX_col = w_col.T @ dout
# Perform matrix multiplication between reshaped dout and X_col to get dW_col.
dw_col = dout @ X_col.T
# Reshape back to image (col2im).
dX = col2im(dX_col, X.shape, self.f, self.f, self.s, self.p)
# Reshape dw_col into dw.
self.W['grad'] = dw_col.reshape((dw_col.shape[0], self.n_C, self.f, self.f))
return dX, self.W['grad'], self.b['grad']
```
## 2) Pooling layer
- We first have to reshape our filters and divide by the filter size.
<div style="text-align:center">
<img src="https://raw.githubusercontent.com/valoxe/image-storage-1/master/blog-deep-learning/cnnumpy-fast/25.png" width="80%">
</div>
<br/>
- We then repeat each element "filter size" time.
<div style="text-align:center">
<img src="https://raw.githubusercontent.com/valoxe/image-storage-1/master/blog-deep-learning/cnnumpy-fast/26.png" width="80%">
</div>
<br/>
- Finally, we apply **col2im**.
- ==Be aware that the `np.reshape()` method doesn't return the expected result here (elements in wrong order). A little bit of numpy gymnastic solves the problem.==
<div style="text-align:center">
<img src="https://raw.githubusercontent.com/valoxe/image-storage-1/master/blog-deep-learning/cnnumpy-fast/27.png" width="80%">
</div>
---
Here is the implementation code:
```python=
def backward(self, dout):
"""
Distributes error through pooling layer.
Parameters:
- dout: Previous layer with the error.
Returns:
- dX: Conv layer updated with error.
"""
X = self.cache
m, n_C_prev, n_H_prev, n_W_prev = X.shape
n_C = n_C_prev
n_H = int((n_H_prev + 2 * self.p - self.f)/ self.s) + 1
n_W = int((n_W_prev + 2 * self.p - self.f)/ self.s) + 1
dout_flatten = dout.reshape(n_C, -1) / (self.f * self.f)
dX_col = np.repeat(dout_flatten, self.f*self.f, axis=0)
dX = col2im(dX_col, X.shape, self.f, self.f, self.s, self.p)
# Reshape dX properly.
dX = dX.reshape(m, -1)
dX = np.array(np.hsplit(dX, n_C_prev))
dX = dX.reshape(m, n_C_prev, n_H_prev, n_W_prev)
return dX
```
# III) Performance of fast implementation
- The [naive implementation](https://github.com/3outeille/CNNumpy/tree/master/src/slow) takes around **4 hours for 1 epoch** where the [fast implementation](https://github.com/3outeille/CNNumpy/tree/master/src/fast) takes only **6 min for 1 epoch**.
- For your information, **with the same architecture using Pytorch**, it will take around **1 min for 1 epoch**.

In this post, we are going to see how to implement a Convolutional Neural Network using only Numpy. The main goal here is not only to give a boilerplate code but rather to have an in-depth explanation of the underlying mechanisms through illustrations, especially during the backward propagation where things get trickier. However, some knowledge about Convolutional Neural Networks building blocs are required. To see the full implementation, please refer to my repository. For the more advanced, here is another post where we implement a faster CNN using im2col/col2im methods.

9/30/2022This post is divided into 2 sections: Summary and Implementation. We are going to have an in-depth review of ImageNet Classification with Deep ConvolutionalNeural Networks paper which introduces the AlexNet architecture. The implementation uses Keras as framework. To see full implementation, please refer to this repository. Also, if you want to read other "Summary and Implementation", feel free to check them at my blog.

1/28/2022This post is divided into 2 sections: Summary and Implementation. We are going to have an in-depth review of Visualizing and Understanding Convolutional Networks paper which introduces the ZFNet and DeconvNet architecture. The implementation uses Pytorch as framework. To see full implementation, please refer to this repository. Also, if you want to read other "Summary and Implementation", feel free to check them at my blog.

1/28/2022This post is divided into 2 sections: Summary and Implementation. We are going to have an in-depth review of Very Deep Convolutional Networks for Large-Scale Image Recognition paper which introduces the VggNet architecture. The implementation uses Pytorch as framework. To see full implementation, please refer to this repository. Also, if you want to read other "Summary and Implementation", feel free to check them at my blog.

1/28/2022
Published on ** HackMD**

or

By clicking below, you agree to our terms of service.

Sign in via Facebook
Sign in via Twitter
Sign in via GitHub
Sign in via Dropbox
Sign in with Wallet

Wallet
(
)

Connect another wallet
New to HackMD? Sign up