# VGGNet on Flowers Recognition
### 1. Introduction
In this problem we implement the 16-layer VGGNet (VGG16) to classify five species of flowers, in three variants:
- **[VGGNet][VGG from Scratch]**: construct the network from scratch.
- **[VGGNet (Pretrained=False)][VGG (pretrained = False)]**: import `vgg16` from `torchvision.models` with `pretrained=False`.
- **[VGGNet (Pretrained=True)][VGG (pretrained = True)]**: import `vgg16` from `torchvision.models` with `pretrained=True`.
[VGG from Scratch]: https://www.kaggle.com/hoangnguyen111/flowers-recognition-with-vggnet?scriptVersionId=43485908
[VGG (pretrained = False)]: https://www.kaggle.com/johnlee111/flowers-recognition-with-vggnet-pretrained-false?scriptVersionId=43485856
[VGG (pretrained = True)]: https://www.kaggle.com/hoangnguyen111/flowers-reconigtion-with-vggnet-pretrained-true?scriptVersionId=43551761
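For the two torchvision variants, a minimal setup sketch (the exact code is in the linked notebooks; `vgg16_bn` is torchvision's batch-normalized variant, relevant to Section 4):

```python
import torch.nn as nn
import torchvision.models as models

# The two torchvision variants differ only in whether ImageNet weights are loaded.
vgg = models.vgg16(pretrained=True)  # or pretrained=False for random initialization
# (models.vgg16_bn would give the batch-normalized variant; see Section 4.)

# Replace the final 4096 -> 1000 ImageNet classifier layer with a 5-way head.
vgg.classifier[6] = nn.Linear(4096, 5)
```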
### 2. Dataset
- The [dataset][data] contains **4242 images** of flowers collected from Flickr, Google Images, and Yandex Images.
- The images are divided into **five classes**: chamomile, tulip, rose, sunflower, and dandelion, with about 800 photos per class. The photos are low resolution (about 320x240 pixels) and are not reduced to a single size, so they have different proportions.
[data]: https://www.kaggle.com/alxmamaev/flowers-recognition#

*Some examples in the dataset*
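A minimal loading sketch, assuming the usual Kaggle layout with one subdirectory per class (the path is illustrative):

```python
from torchvision import datasets, transforms

# ImageFolder infers the five class labels from the subdirectory names.
dataset = datasets.ImageFolder(
    root="../input/flowers-recognition/flowers",  # illustrative Kaggle path
    transform=transforms.ToTensor(),
)
print(dataset.classes)  # the five flower classes, taken from the folder names
```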
### 3. Data Preprocessing
- The images are resized to 256x256 pixels, and each pixel value is scaled from the range [0, 255] to [0, 1].
- Data augmentation (collected into a list so the full pipeline sketch at the end of this section can reuse it):

```python
import PIL
from torchvision import transforms

augmentations = [
    transforms.RandomHorizontalFlip(),
    transforms.RandomVerticalFlip(),
    transforms.RandomRotation((30, 60), resample=PIL.Image.BILINEAR),
    transforms.RandomRotation((-60, -30), resample=PIL.Image.BILINEAR),
    transforms.RandomAffine(degrees=0, translate=(0.1, 0.1), scale=None,
                            shear=None, resample=False, fillcolor=0),
    transforms.RandomPerspective(distortion_scale=0.2, p=0.5, interpolation=3, fill=0),
]
```

*Some examples after augmentation*
- Finally, we normalize the images with the ImageNet statistics used by the original model:

```python
transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
```
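Putting the steps together, a sketch of the full training transform (assuming the `augmentations` list defined above; note that `ToTensor` performs the [0, 255] to [0, 1] scaling):

```python
from torchvision import transforms

train_transform = transforms.Compose([
    transforms.Resize((256, 256)),  # fixed input size from this section
    *augmentations,                 # the augmentation list above
    transforms.ToTensor(),          # scales pixels from [0, 255] to [0, 1]
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])
```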
### 4. Architecture
- All networks include batch normalization after each convolutional layer.
- In addition, we insert an average pooling layer between the convolutional stack and the fully connected layers.

*The architecture of VGG16-Net*
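A minimal sketch of these two modifications in PyTorch (channel counts follow VGG16; the exact from-scratch definition is in the linked notebook):

```python
import torch.nn as nn

def conv_block(in_ch, out_ch):
    # VGG-style 3x3 convolution, followed by batch normalization and ReLU.
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1),
        nn.BatchNorm2d(out_ch),
        nn.ReLU(inplace=True),
    )

# Average pooling between the conv stack and the FC head, so the classifier
# sees a fixed-size feature map regardless of the input resolution.
avgpool = nn.AdaptiveAvgPool2d((7, 7))
```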
### 5. Parameters
Training/validation parameters:

```python
batch_size = 32
valid_split = 0.3       # fraction of the data held out for validation
shuffle_dataset = True
random_seed = 42
```
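A sketch of how these parameters drive the train/validation split, using `SubsetRandomSampler` (a common pattern; the notebooks may differ in detail):

```python
import numpy as np
from torch.utils.data import DataLoader, SubsetRandomSampler

indices = np.arange(len(dataset))
if shuffle_dataset:
    np.random.seed(random_seed)
    np.random.shuffle(indices)

split = int(valid_split * len(dataset))  # 30% held out for validation
train_idx = indices[split:].tolist()
valid_idx = indices[:split].tolist()

train_loader = DataLoader(dataset, batch_size=batch_size,
                          sampler=SubsetRandomSampler(train_idx))
valid_loader = DataLoader(dataset, batch_size=batch_size,
                          sampler=SubsetRandomSampler(valid_idx))
```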
Loss function and optimizer (note that `weight_decay` must be passed by keyword; positionally it would bind to Adam's `betas` argument):

```python
import torch.nn as nn
from torch.optim import Adam

criterion = nn.CrossEntropyLoss()
optimizer = Adam(vgg.parameters(), lr=lr, weight_decay=weight_decay)
```
| Hyperparameter | VGGNet | VGGNet (Pretrained=False) | VGGNet (Pretrained=True) |
|:--------------:|:------:|:-------------------------:|:------------------------:|
| lr             | 3e-5   | 3e-5                      | 1e-5                     |
| weight_decay   | 1e-3   | 1e-3                      | 1e-5                     |
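For reference, one training epoch with the criterion and optimizer above (a generic PyTorch loop, not copied from the notebooks):

```python
import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
vgg = vgg.to(device)

vgg.train()
for images, labels in train_loader:
    images, labels = images.to(device), labels.to(device)
    optimizer.zero_grad()
    loss = criterion(vgg(images), labels)  # cross-entropy on the 5-way logits
    loss.backward()
    optimizer.step()
```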
### 6. Results
- Maximum accuracy on the validation set for the three models:
| VGGNet | VGGNet (Pretrained=False) | VGGNet (Pretrained=True) |
|:------:|:-------------------------:|:------------------------:|
| 0.862 | 0.848 | 0.909 |
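The validation accuracy above can be computed with a standard evaluation pass, sketched here under the same assumptions as the training loop above:

```python
import torch

vgg.eval()
correct = total = 0
with torch.no_grad():
    for images, labels in valid_loader:
        images, labels = images.to(device), labels.to(device)
        preds = vgg(images).argmax(dim=1)
        correct += (preds == labels).sum().item()
        total += labels.size(0)
print(f"Validation accuracy: {correct / total:.3f}")
```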
- Training and validation results over 150 epochs for the three models:


