VGGNet: Summary and Implementation
This post is divided into two sections: Summary and Implementation.
We are going to do an in-depth review of the paper Very Deep Convolutional Networks for Large-Scale Image Recognition, which introduces the VGGNet architecture.
The implementation uses PyTorch as its framework. To see the full implementation,
please refer to this repository.
Also, if you want to read other "Summary and Implementation" posts, feel free to
check them out at my blog.
I) Summary
- The paper Very Deep Convolutional Networks for Large-Scale Image Recognition introduces a family of ConvNets called VGGNet.
- At ILSVRC-2014, the authors took 2nd place in the classification task (top-5 test error = 7.32%).
- They demonstrated that depth is beneficial for classification accuracy.
- In spite of its large depth, the number of weights is not greater than in a shallower net with larger conv. layer widths and receptive fields.
- VGGNet uses a smaller receptive field (3x3, stride 1), in contrast to AlexNet (11x11, stride 4) and ZFNet (7x7, stride 2).
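The link between small filters and the weight count can be checked with a little arithmetic. The sketch below, in plain Python, shows that a stack of three 3x3 stride-1 conv layers covers the same 7x7 window as a single 7x7 layer while using fewer weights. The channel width `C = 256` is an assumption chosen only for illustration, and biases are ignored.

```python
def stacked_receptive_field(num_layers, kernel_size=3):
    """Effective receptive field of `num_layers` stacked stride-1 convs."""
    # Each stride-1 layer widens the field by (kernel_size - 1) pixels.
    return 1 + num_layers * (kernel_size - 1)


def conv_weights(kernel_size, channels, num_layers=1):
    """Weight count (biases ignored) for convs with `channels` in and out."""
    return num_layers * kernel_size * kernel_size * channels * channels


C = 256  # hypothetical channel width, for illustration only

print(stacked_receptive_field(2))        # two 3x3 layers cover a 5x5 window
print(stacked_receptive_field(3))        # three 3x3 layers cover a 7x7 window
print(conv_weights(3, C, num_layers=3))  # 27 * C^2 weights
print(conv_weights(7, C))                # 49 * C^2 weights
```

The stacked version also applies three non-linearities instead of one, which is part of the argument the paper makes for going deeper with small filters.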
VGG architecture:
- Input size: 224x224x3 (RGB image).
- Preprocessing: subtract the training-set mean RGB value from each pixel.
- Filter size: 3x3.
- Convolutional layers:
  - stride 1.
  - padding 1 (3x3 conv layers).
  - ReLU activation; one configuration also uses Local Response Normalization (LRN).
  - 5 max-pooling layers follow some of the conv layers (not all of them).
- Fully-connected layers:
  - 1st: 4096 (ReLU).
  - 2nd: 4096 (ReLU).
  - 3rd: 1000 (Softmax).
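The settings above determine the spatial size of the feature maps. A minimal sketch, assuming 2x2 max pooling with stride 2 and 512 channels in the last block (both taken from the implementation section of this post), traces a 224x224 input through the five pooling stages:

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Output side length of a conv or pooling layer on a square input."""
    return (size + 2 * padding - kernel) // stride + 1


size = 224
sizes = [size]
for _ in range(5):
    # A 3x3 conv with stride 1 and padding 1 keeps the spatial size.
    size = conv_out(size, kernel=3, stride=1, padding=1)
    # A 2x2 max pool with stride 2 halves it.
    size = conv_out(size, kernel=2, stride=2)
    sizes.append(size)

print(sizes)  # [224, 112, 56, 28, 14, 7]

# Flattened input to the first fully-connected layer.
flat_features = 512 * size * size
print(flat_features)  # 25088
```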
VGGNet configurations:
- VGG-11
- VGG-11 (LRN)
- VGG-13
- VGG-16 (with 1x1 conv layers)
- VGG-16
- VGG-19
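The number in each name counts the layers that carry weights. As a quick check, the sketch below sums the conv layers per block (as listed in the paper's configuration table) and adds the three fully-connected layers. The dictionary name and layout are illustrative choices, not part of the original implementation.

```python
# Conv layers in each of the five blocks, per configuration.
# The LRN variant of VGG-11 and the 1x1 variant of VGG-16 have the same
# counts as their base configuration, so they are not listed separately.
CONV_LAYERS_PER_BLOCK = {
    "VGG-11": [1, 1, 2, 2, 2],
    "VGG-13": [2, 2, 2, 2, 2],
    "VGG-16": [2, 2, 3, 3, 3],
    "VGG-19": [2, 2, 4, 4, 4],
}
NUM_FC_LAYERS = 3


def weight_layers(name):
    """Total number of conv and fully-connected layers in a configuration."""
    return sum(CONV_LAYERS_PER_BLOCK[name]) + NUM_FC_LAYERS


for name in CONV_LAYERS_PER_BLOCK:
    print(name, weight_layers(name))
```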
We are going to focus on VGG-16. Here is its architecture:
[Figure: VGG-16 architecture]
II) Implementation
We are going to implement VGG-16.
1) Architecture build
from collections import OrderedDict

import torch.nn as nn


class Vgg16(nn.Module):
    def __init__(self):
        super().__init__()
        # Five conv blocks (3x3 convs, stride 1, padding 1), each followed
        # by a 2x2 max pool that halves the spatial size.
        self.features = nn.Sequential(OrderedDict([
            ('block1-conv1', nn.Conv2d(3, 64, kernel_size=3, stride=1, padding=1)),
            ('block1-act1', nn.ReLU()),
            ('block1-conv2', nn.Conv2d(64, 64, kernel_size=3, stride=1, padding=1)),
            ('block1-act2', nn.ReLU()),
            ('pool1', nn.MaxPool2d(kernel_size=2, stride=2)),
            ('block2-conv1', nn.Conv2d(64, 128, kernel_size=3, stride=1, padding=1)),
            ('block2-act1', nn.ReLU()),
            ('block2-conv2', nn.Conv2d(128, 128, kernel_size=3, stride=1, padding=1)),
            ('block2-act2', nn.ReLU()),
            ('pool2', nn.MaxPool2d(kernel_size=2, stride=2)),
            ('block3-conv1', nn.Conv2d(128, 256, kernel_size=3, stride=1, padding=1)),
            ('block3-act1', nn.ReLU()),
            ('block3-conv2', nn.Conv2d(256, 256, kernel_size=3, stride=1, padding=1)),
            ('block3-act2', nn.ReLU()),
            ('block3-conv3', nn.Conv2d(256, 256, kernel_size=3, stride=1, padding=1)),
            ('block3-act3', nn.ReLU()),
            ('pool3', nn.MaxPool2d(kernel_size=2, stride=2)),
            ('block4-conv1', nn.Conv2d(256, 512, kernel_size=3, stride=1, padding=1)),
            ('block4-act1', nn.ReLU()),
            ('block4-conv2', nn.Conv2d(512, 512, kernel_size=3, stride=1, padding=1)),
            ('block4-act2', nn.ReLU()),
            ('block4-conv3', nn.Conv2d(512, 512, kernel_size=3, stride=1, padding=1)),
            ('block4-act3', nn.ReLU()),
            ('pool4', nn.MaxPool2d(kernel_size=2, stride=2)),
            ('block5-conv1', nn.Conv2d(512, 512, kernel_size=3, stride=1, padding=1)),
            ('block5-act1', nn.ReLU()),
            ('block5-conv2', nn.Conv2d(512, 512, kernel_size=3, stride=1, padding=1)),
            ('block5-act2', nn.ReLU()),
            ('block5-conv3', nn.Conv2d(512, 512, kernel_size=3, stride=1, padding=1)),
            ('block5-act3', nn.ReLU()),
            ('pool5', nn.MaxPool2d(kernel_size=2, stride=2))
        ]))
        # Three fully-connected layers; the last one outputs raw class scores
        # (logits) for the 1000 ImageNet classes.
        self.classifier = nn.Sequential(OrderedDict([
            ('fc6', nn.Linear(512 * 7 * 7, 4096)),
            ('act6', nn.ReLU()),
            ('fc7', nn.Linear(4096, 4096)),
            ('act7', nn.ReLU()),
            ('fc8', nn.Linear(4096, 1000))
        ]))

    def forward(self, x):
        x = self.features(x)
        # Flatten (N, 512, 7, 7) to (N, 25088) before the classifier.
        x = x.view(x.size(0), -1)
        x = self.classifier(x)
        return x
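As a sanity check on the layer definitions, the number of learnable parameters can be computed by hand. The sketch below is plain Python and does not depend on PyTorch. The channel pairs are copied from the `Vgg16` class above, and the helper names are illustrative.

```python
# (in_channels, out_channels) of the 13 conv layers, in order.
CONV_CHANNELS = [
    (3, 64), (64, 64),
    (64, 128), (128, 128),
    (128, 256), (256, 256), (256, 256),
    (256, 512), (512, 512), (512, 512),
    (512, 512), (512, 512), (512, 512),
]
# (in_features, out_features) of the 3 fully-connected layers.
FC_SIZES = [(512 * 7 * 7, 4096), (4096, 4096), (4096, 1000)]


def conv_params(c_in, c_out, kernel=3):
    """Weights plus one bias per output channel."""
    return c_in * c_out * kernel * kernel + c_out


def fc_params(n_in, n_out):
    """Weights plus one bias per output unit."""
    return n_in * n_out + n_out


conv_total = sum(conv_params(c_in, c_out) for c_in, c_out in CONV_CHANNELS)
fc_total = sum(fc_params(n_in, n_out) for n_in, n_out in FC_SIZES)

print(conv_total)             # 14714688
print(fc_total)               # 123642856
print(conv_total + fc_total)  # 138357544
```

Most of the parameters sit in the fully-connected layers, with the first one alone holding over 100 million of them.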
2) Evaluating
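ImageNet results such as the top-5 error quoted in the summary count a prediction as correct when the true label is among the k highest-scoring classes. A minimal sketch of that metric in plain Python follows. The scores and labels are made-up toy values with 4 classes, not results from the model, and `top_k_error` is an illustrative helper.

```python
def top_k_error(scores, labels, k=5):
    """Fraction of samples whose true label is not among the k top scores."""
    misses = 0
    for sample_scores, label in zip(scores, labels):
        # Class indices sorted from highest to lowest score.
        ranked = sorted(range(len(sample_scores)),
                        key=lambda c: sample_scores[c], reverse=True)
        if label not in ranked[:k]:
            misses += 1
    return misses / len(labels)


# Toy data: 3 samples, 4 classes (hypothetical values).
scores = [
    [0.10, 0.60, 0.30, 0.00],
    [0.70, 0.10, 0.15, 0.05],
    [0.05, 0.15, 0.20, 0.60],
]
labels = [2, 0, 0]

print(top_k_error(scores, labels, k=1))  # 2 of 3 samples missed
print(top_k_error(scores, labels, k=2))  # 1 of 3 samples missed
```

With a real model, `scores` would be the outputs of the final `fc8` layer for each validation image.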