
VGGNet
===

> Assuming you already have some familiarity with neural networks: VGG uses a 3x3 filter size throughout and makes the network deeper. The paper describes configurations with 11 to 19 layers, of which VGG-16 and VGG-19 perform best. VGGNet took second place in the classification task of ILSVRC 2014, and many computer vision frameworks have since used VGG as their backbone.

---

## Architecture

![](https://i.imgur.com/DaAZrgC.png)

![](https://i.imgur.com/ufWrLtD.png)

As the figures above show, the VGG configurations are highly uniform and differ only in depth. The whole network uses the same filter size (3x3) in place of the larger filters used previously, such as 5x5 and 7x7. Two stacked 3x3 convolutions cover the same receptive field as one 5x5 and can replace it, and three stacked 3x3 convolutions can replace one 7x7, as shown in the figure below.

![](https://i.imgur.com/cCF6KbL.png)

* ### Advantages:
    1. The VGGNet structure is very simple: the whole network uses the same filter size (3x3) and the same max pooling size (2x2).
    2. Using a small filter size reduces the number of parameters.
    3. It showed that performance can be improved by making the network progressively deeper.
* ### Disadvantages:
    1. VGG has a very large number of parameters, because of the fully connected layers at the end.

---

## Code Implementation

```python
from keras.models import Model
from keras.layers import (
    Conv2D,
    Dense,
    Flatten,
    GlobalAveragePooling2D,
    GlobalMaxPooling2D,
    Input,
    MaxPooling2D,
)


def VGG16(include_top=True, input_tensor=None, input_shape=(224, 224, 1),
          pooling='max', classes=1000):
    # Note: the default input_shape is single-channel; pass (224, 224, 3)
    # for RGB images. input_tensor is accepted but unused in this
    # simplified version.
    img_input = Input(shape=input_shape)

    # Block 1
    x = Conv2D(64, (3, 3), activation='relu', padding='same', name='block1_conv1')(img_input)
    x = Conv2D(64, (3, 3), activation='relu', padding='same', name='block1_conv2')(x)
    x = MaxPooling2D((2, 2), strides=(2, 2), name='block1_pool')(x)

    # Block 2
    x = Conv2D(128, (3, 3), activation='relu', padding='same', name='block2_conv1')(x)
    x = Conv2D(128, (3, 3), activation='relu', padding='same', name='block2_conv2')(x)
    x = MaxPooling2D((2, 2), strides=(2, 2), name='block2_pool')(x)

    # Block 3 (VGG-16 uses three convolutions in blocks 3 to 5)
    x = Conv2D(256, (3, 3), activation='relu', padding='same', name='block3_conv1')(x)
    x = Conv2D(256, (3, 3), activation='relu', padding='same', name='block3_conv2')(x)
    x = Conv2D(256, (3, 3), activation='relu', padding='same', name='block3_conv3')(x)
    x = MaxPooling2D((2, 2), strides=(2, 2), name='block3_pool')(x)

    # Block 4
    x = Conv2D(512, (3, 3), activation='relu', padding='same', name='block4_conv1')(x)
    x = Conv2D(512, (3, 3), activation='relu', padding='same', name='block4_conv2')(x)
    x = Conv2D(512, (3, 3), activation='relu', padding='same', name='block4_conv3')(x)
    x = MaxPooling2D((2, 2), strides=(2, 2), name='block4_pool')(x)

    # Block 5
    x = Conv2D(512, (3, 3), activation='relu', padding='same', name='block5_conv1')(x)
    x = Conv2D(512, (3, 3), activation='relu', padding='same', name='block5_conv2')(x)
    x = Conv2D(512, (3, 3), activation='relu', padding='same', name='block5_conv3')(x)
    x = MaxPooling2D((2, 2), strides=(2, 2), name='block5_pool')(x)

    if include_top:
        # Classification block
        x = Flatten(name='flatten')(x)
        x = Dense(4096, activation='relu', name='fc1')(x)
        x = Dense(4096, activation='relu', name='fc2')(x)
        x = Dense(classes, activation='softmax', name='predictions')(x)
    else:
        # Without the classifier, optionally pool the final feature map
        if pooling == 'avg':
            x = GlobalAveragePooling2D()(x)
        elif pooling == 'max':
            x = GlobalMaxPooling2D()(x)

    inputs = img_input
    # Create model.
    model = Model(inputs, x, name='vgg16')
    return model
```

```python
model = VGG16(include_top=False)
model.summary()
```

![](https://i.imgur.com/zJIYgWH.png)

###### tags: `ML`
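
The architecture section says that stacked 3x3 filters can stand in for a single 5x5 or 7x7 filter while using fewer parameters. The arithmetic behind that can be sketched as below. The helper names `conv_weights` and `receptive_field` and the channel count `C = 64` are illustrative assumptions, not taken from the paper or from Keras. Biases are ignored, and every layer is assumed to have stride 1 with the same number of input and output channels.

```python
def conv_weights(kernel_size, channels, num_layers=1):
    """Weight count (no biases) for a stack of square convolutions
    that each map `channels` inputs to `channels` outputs."""
    return num_layers * kernel_size * kernel_size * channels * channels


def receptive_field(kernel_size, num_layers):
    """Receptive field of `num_layers` stacked stride-1 convolutions."""
    return 1 + num_layers * (kernel_size - 1)


C = 64  # hypothetical channel count, chosen only for illustration

# Two 3x3 layers see a 5x5 window; three 3x3 layers see a 7x7 window.
print(receptive_field(3, 2), receptive_field(3, 3))

# Stacked 3x3 weights vs. a single large kernel with the same receptive field.
print(conv_weights(3, C, num_layers=2), "vs", conv_weights(5, C))
print(conv_weights(3, C, num_layers=3), "vs", conv_weights(7, C))
```

In general the comparison is 18C² against 25C² weights for the 5x5 case and 27C² against 49C² for the 7x7 case, so the saving grows with the size of the kernel being replaced. The stacked version also places a nonlinearity between each pair of layers.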
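
The disadvantage listed earlier, that the fully connected layers account for most of VGG's parameters, can be checked with a rough count. This sketch assumes the standard VGG-16 layout (13 convolutional layers with 3x3 kernels, a 224x224 RGB input, 1000 classes), which may differ from the input shape used in the code in this note. The helper names are hypothetical.

```python
def conv_params(kernel_size, c_in, c_out):
    """Weights plus biases of one square convolution layer."""
    return kernel_size * kernel_size * c_in * c_out + c_out


def dense_params(n_in, n_out):
    """Weights plus biases of one fully connected layer."""
    return n_in * n_out + n_out


# Output channels of the 13 conv layers in the assumed standard VGG-16.
conv_channels = [64, 64, 128, 128, 256, 256, 256,
                 512, 512, 512, 512, 512, 512]

conv_total = 0
c_in = 3  # assumed RGB input
for c_out in conv_channels:
    conv_total += conv_params(3, c_in, c_out)
    c_in = c_out

# Five 2x2 max-pooling layers halve 224 five times: 224 / 2**5 = 7.
side = 224 // 2 ** 5
flat = side * side * 512

dense_total = (dense_params(flat, 4096)
               + dense_params(4096, 4096)
               + dense_params(4096, 1000))

total = conv_total + dense_total
print(f"conv layers:  {conv_total:,}")
print(f"dense layers: {dense_total:,}")
print(f"dense share:  {dense_total / total:.1%}")
```

Under these assumptions the three dense layers hold roughly nine tenths of all parameters, most of them in the first dense layer that follows the flattened 7x7x512 feature map.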