Learning Deep Features for Discriminative Localization

Bolei Zhou, Aditya Khosla, Agata Lapedriza, Aude Oliva, Antonio Torralba
Computer Science and Artificial Intelligence Laboratory
MIT

Abstract

Revisits the GAP layer
- GAP: global average pooling
Achieves 37.1% top-5 error for object localization on ILSVRC 2014
- ILSVRC: ImageNet Large Scale Visual Recognition Challenge

Introduction

Convolutional layers can localize objects, but this ability is lost when fully-connected layers are used for classification.
- Network in Network and GoogLeNet avoid fully-connected layers.
  - This minimizes the number of parameters.
GAP is usually treated as a structural regularizer, but it does more than regularize: it lets the network keep its localization ability.
This approach transfers easily to other recognition datasets for generic classification, localization, and concept discovery.
- It achieves 37.1% top-5 test error for weakly supervised localization, close to the 34.2% of the fully supervised AlexNet.

Class Activation Mapping
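
A minimal sketch of how a class activation map can be computed, assuming the paper's definition: the map for class c is the weighted sum of the last convolutional layer's feature maps, using the weights that connect the GAP output to that class. The function and argument names (`class_activation_map`, `feature_maps`, `class_weights`) and the random inputs are illustrative, not taken from the authors' code.

```python
import numpy as np


def class_activation_map(feature_maps, class_weights):
    """Weighted sum of feature maps for one class.

    feature_maps:  array of shape (K, H, W), activations of the last conv layer
    class_weights: array of shape (K,), weights from the GAP output to one class
    returns:       array of shape (H, W), the class activation map
    """
    # Contract the K axis: M_c(x, y) = sum_k w_k^c * f_k(x, y)
    return np.tensordot(class_weights, feature_maps, axes=1)


# Illustrative inputs (hypothetical sizes: K=4 units, 3x3 spatial grid).
rng = np.random.default_rng(0)
feature_maps = rng.random((4, 3, 3))
class_weights = rng.standard_normal(4)

cam = class_activation_map(feature_maps, class_weights)

# With pooling written as a spatial sum and no bias term, the class score
# equals the sum of the map over all positions.
class_score = class_weights @ feature_maps.sum(axis=(1, 2))
```

Because the class score is just the map summed over positions, each location's value can be read as its contribution to the prediction for that class.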

Weakly-supervised Object Localization

Setup

Use AlexNet, VGGnet, and GoogLeNet to build the corresponding *-GAP networks:
1. Remove some layers (the fully-connected layers and the softmax).
2. Add new layers:
  - a convolutional layer: 3 × 3, stride 1, padding 1, with 1024 units
  - a GAP layer
  - a softmax layer
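
The added layers can be sketched as follows. This is a simplified illustration, not the authors' implementation: it assumes the 3 × 3 convolution has already produced the feature maps, checks only that stride 1 with padding 1 keeps the spatial size, and then applies GAP and a softmax classifier. All names (`conv_output_size`, `gap_head`, `weights`) are hypothetical.

```python
import numpy as np


def conv_output_size(size, kernel=3, stride=1, padding=1):
    """Spatial output size of a convolution along one dimension."""
    return (size + 2 * padding - kernel) // stride + 1


def gap_head(feature_maps, weights):
    """GAP followed by a linear softmax classifier.

    feature_maps: (K, H, W) output of the added conv layer (K = 1024 in the notes)
    weights:      (C, K) classifier weights, one row per class
    returns:      (C,) class probabilities
    """
    pooled = feature_maps.mean(axis=(1, 2))   # global average pooling -> (K,)
    logits = weights @ pooled                 # one score per class -> (C,)
    exp = np.exp(logits - logits.max())       # numerically stable softmax
    return exp / exp.sum()


# A 3x3 conv with stride 1 and padding 1 keeps a 14x14 map at 14x14.
same_size = conv_output_size(14)
```

The head has only C × K weights, which is the parameter saving the notes mention for networks that avoid fully-connected layers.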

Results

Classification

The *-GAP networks perform similarly to the original networks.

Localization

Weak supervision still has a long way to go compared with full supervision.
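
To evaluate localization, the paper turns a CAM into a bounding box by keeping the region above 20% of the map's maximum value and taking the box that covers the largest connected component. The sketch below is one possible reading of that step; the 4-connectivity, the function name `cam_to_bbox`, and the `(x_min, y_min, x_max, y_max)` return format are assumptions.

```python
from collections import deque

import numpy as np


def cam_to_bbox(cam, threshold_ratio=0.2):
    """Bounding box (x_min, y_min, x_max, y_max) of the largest region
    whose CAM value exceeds threshold_ratio * cam.max(), or None."""
    mask = cam > threshold_ratio * cam.max()
    visited = np.zeros(mask.shape, dtype=bool)
    height, width = mask.shape
    best = []  # pixels of the largest component found so far

    for r in range(height):
        for c in range(width):
            if not mask[r, c] or visited[r, c]:
                continue
            # Breadth-first search over one 4-connected component.
            component = []
            queue = deque([(r, c)])
            visited[r, c] = True
            while queue:
                y, x = queue.popleft()
                component.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < height and 0 <= nx < width
                            and mask[ny, nx] and not visited[ny, nx]):
                        visited[ny, nx] = True
                        queue.append((ny, nx))
            if len(component) > len(best):
                best = component

    if not best:
        return None
    ys, xs = zip(*best)
    return min(xs), min(ys), max(xs), max(ys)
```

In practice the CAM would first be upsampled to the input image size so that the box is expressed in image coordinates.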

Generic Localization

Test feature quality: compare the features of the original network with those of the GAP network by training a linear SVM on each.
- The results are similar.
Test localization under weak supervision.
- The network can still find the position of the object.

Deep Features for Generic Localization

Fine-grained Recognition

with bounding box

Pattern Discovery

Given a set of images containing a common concept, test whether the network can find the important regions in these images.
1. How to identify the important regions:
  1. Use the GoogLeNet-GAP network, trained with image-level labels only. Use the SVM weights and the GAP-layer features to construct the CAM, which identifies the important regions.
2. Experiment
  1. Discovering informative objects in the scenes
  2. Concept localization in weakly labeled images
  3. Weakly supervised text detector
    - positive set: images with text
    - negative set: images without text
  4. Interpreting visual question answering (???)

Visualizing Class-Specific Units

Conclusion

CAM for CNNs with global average pooling.
- It makes it possible to visualize a heatmap over a given image.
Weak supervision (image-level labels only) is enough to localize the object.
