Short-Answer Questions

What is softmax used for?

  • It converts a vector of raw scores into a probability distribution (non-negative values that sum to 1), typically at the output layer of a classifier.

What are three usages of supervised CNN models?

  • Image classification, object detection, and segmentation.

What are the three major issues to consider when studying a neural network model?

  • Network Architecture
  • Activation function
  • Learning rule

The basic CNN architecture can be divided into two stages. What are they, and what is the function of each?

  • Convolutional layers + pooling layers
    • Feature extraction
  • Fully connected layers
    • Mapping of feature maps to target labels (classification)
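
As a concrete sketch of these two stages, a minimal Keras model (hypothetical layer sizes, not from the course notes):

```python
import tensorflow as tf
from tensorflow.keras import layers, models

model = models.Sequential([
    # Stage 1: convolutional + pooling layers -> feature extraction
    layers.Conv2D(16, 3, activation="relu", input_shape=(28, 28, 1)),
    layers.MaxPooling2D(2),
    layers.Conv2D(32, 3, activation="relu"),
    layers.MaxPooling2D(2),
    # Stage 2: fully connected layers -> map feature maps to target labels
    layers.Flatten(),
    layers.Dense(64, activation="relu"),
    layers.Dense(10, activation="softmax"),  # class probabilities
])
```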

Calculation Problems

TLU


Original formulation

  • Activation: $a = w_1 x_1 + w_2 x_2 + w_3 x_3 + \dots$
  • $y = 1$ if $a \ge \theta$, else $y = 0$
  • Weight update: $w = w + \alpha (t - y) V$
    • $\alpha$ is the learning rate; $V$ is the input vector, e.g. $x_1 = 1, x_2 = 0$
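
A minimal Python sketch of this rule (assuming the standard threshold condition $a \ge \theta$; names are illustrative):

```python
import numpy as np

def tlu_train(X, t, w, theta, alpha=0.5, epochs=10):
    """Threshold logic unit trained with w = w + alpha * (t - y) * x."""
    for _ in range(epochs):
        for x, target in zip(X, t):
            a = np.dot(w, x)                  # activation a = w1*x1 + w2*x2 + ...
            y = 1 if a >= theta else 0        # fire when a >= theta
            w = w + alpha * (target - y) * x  # no change when y == target
    return w

# 2-input OR gate with the initial weights from the example below
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
t = np.array([0, 1, 1, 1])
print(tlu_train(X, t, w=np.array([1.0, 2.0]), theta=2))  # converges to [2., 2.]
```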

TLU example

  • 2-input OR gate (output is 1 if at least one input is 1)
  • Initial weights: $w_1 = 1, w_2 = 2, \theta = 2$
  • Time = 1, input $(0,0)$:
    • $0 \times 1 + 0 \times 2 = 0$
    • $a < \theta \Rightarrow y = 0$
    • $t - y = 0$, so no change
  • Time = 2, input $(0,1)$:
    • $0 \times 1 + 1 \times 2 = 2 \ge \theta \Rightarrow y = 1 = t$, no change
  • Time = 3, input $(1,0)$:
    • $1 \times 1 + 0 \times 2 = 1$
    • $a < \theta \Rightarrow y = 0$
    • $w_1 = 1 + 0.5(1-0) \times 1 = 1.5$
    • $w_2 = 2 + 0.5(1-0) \times 0 = 2$
  • Continue until the weights satisfy all input/target pairs

Answer

  • No, because the activation condition is reversed
  • Change the learning rule to $w = w - \alpha (t - y) x$

Perceptron


a

  • A 2-layer perceptron with 2 inputs and 2 outputs

b

  • No, the classes are not linearly separable by a single line

c

  • $\Sigma s_1 = 0.2 \cdot 1 + (-0.5) \cdot 1 = -0.3$
  • $\Sigma s_2 = (-0.7) \cdot 1 + 0.5 \cdot 1 = -0.2$
  • $w_{1,1} = w_{1,1} + 0.5(0-0) \cdot 1 = 0.2$
  • $w_{1,2} = w_{1,2} + 0.5(0-0) \cdot 1 = -0.5$
  • $w_{2,1} = w_{2,1} + 0.5(1-0) \cdot 1 = -0.2$
  • $w_{2,2} = w_{2,2} + 0.5(1-0) \cdot 1 = 1$
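
A quick NumPy check of this step (assuming a step activation that fires at 0, with the weights and target $(0,1)$ read off the worked example):

```python
import numpy as np

W = np.array([[0.2, -0.5],        # weights into output node 1
              [-0.7, 0.5]])       # weights into output node 2
x = np.array([1, 1])
t = np.array([0, 1])              # target
eta = 0.5

s = W @ x                         # sums: [-0.3, -0.2]
y = (s >= 0).astype(int)          # step activation -> [0, 0]
W = W + eta * np.outer(t - y, x)  # only node 2's weights change
print(W)                          # [[0.2, -0.5], [-0.2, 1.0]]
```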


* Neural Networks and Learning Machines, Simon Haykin, 3rd ed., Pearson, 2009, p. 85 (PDF)
* $y(n)$: activation

Backpropagation


a

  • Input layer: 2 nodes; hidden layer: 2 nodes; output layer: 2 nodes

b


  • Note
    • $w$ is the weight
    • $a$ is the result of the activation at the node: $a = f\left(\sum_i w_{ji} a_i\right)$
    • $\eta$ is the learning rate
    • $\delta_j = (d_j - a_j) f'(s_j) = (d_j - a_j)\, a_j (1 - a_j)$
    • $\Delta w_{ji} = \eta\, \delta_j a_i$
| Weight | Initial value |
| --- | --- |
| $w_{31}$ | 0.1 |
| $w_{32}$ | 0.2 |
| $w_{41}$ | 0.3 |
| $w_{42}$ | 0.4 |
| $w_{53}$ | 0.5 |
| $w_{54}$ | 0.6 |
| $w_{63}$ | 0.7 |
| $w_{64}$ | 0.8 |

  • Input = $(1, 1)$

Forward

  • $a_3 = f(1 \cdot 0.1 + 1 \cdot 0.2) = 0.57$
  • $a_4 = f(1 \cdot 0.3 + 1 \cdot 0.4) = 0.67$
  • $a_5 = f(0.57 \cdot 0.5 + 0.67 \cdot 0.6) = 0.67$
  • $a_6 = f(0.57 \cdot 0.7 + 0.67 \cdot 0.8) = 0.72$

$\delta$s

  • Target = $(0, 1)$
  • $\delta_6 = (d_6 - a_6)\, a_6 (1 - a_6) = (1 - 0.72) \cdot 0.72 \cdot (1 - 0.72) = 0.0564$
  • $\delta_5 = (0 - 0.67) \cdot 0.67 \cdot (1 - 0.67) = -0.1481$
  • $\delta_4 = (\delta_5 w_{54} + \delta_6 w_{64}) f'(s_4) = (-0.1481 \cdot 0.6 + 0.0564 \cdot 0.8) \cdot 0.67 \cdot (1 - 0.67) = -0.0097$
  • $\delta_3 = (\delta_5 w_{53} + \delta_6 w_{63}) f'(s_3) = (-0.1481 \cdot 0.5 + 0.0564 \cdot 0.7) \cdot 0.57 \cdot (1 - 0.57) = -0.0085$

$\Delta w$

  • $\Delta w_{64} = \eta\, \delta_6 a_4 = 0.5 \cdot 0.0564 \cdot 0.67 = 0.0189$
  • $\Delta w_{63} = 0.5 \cdot 0.0564 \cdot 0.57 = 0.0161$
  • $\Delta w_{54} = 0.5 \cdot (-0.1481) \cdot 0.67 = -0.0496$
  • $\Delta w_{53} = 0.5 \cdot (-0.1481) \cdot 0.57 = -0.0422$
  • $\Delta w_{42} = 0.5 \cdot (-0.0097) \cdot 1 = -0.0049$
  • $\Delta w_{41} = 0.5 \cdot (-0.0097) \cdot 1 = -0.0049$
  • $\Delta w_{32} = 0.5 \cdot (-0.0085) \cdot 1 = -0.0043$
  • $\Delta w_{31} = 0.5 \cdot (-0.0085) \cdot 1 = -0.0043$

Update weights: $w = w + \Delta w$

| Weight | Updated value |
| --- | --- |
| $w_{31}$ | 0.0957 |
| $w_{32}$ | 0.1957 |
| $w_{41}$ | 0.2951 |
| $w_{42}$ | 0.3951 |
| $w_{53}$ | 0.4578 |
| $w_{54}$ | 0.5504 |
| $w_{63}$ | 0.7161 |
| $w_{64}$ | 0.8189 |

Verification with TensorFlow

https://colab.research.google.com/drive/1uWPPby020fEdusBBwIwjyLRyArkqoEMv?usp=sharing
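
The same check can also be done offline with a short NumPy sketch (sigmoid activations and the initial weights from the table above; the array layout is my own):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Rows are destination nodes, columns are source nodes
Wh = np.array([[0.1, 0.2],      # w31, w32
               [0.3, 0.4]])     # w41, w42
Wo = np.array([[0.5, 0.6],      # w53, w54
               [0.7, 0.8]])     # w63, w64
x = np.array([1.0, 1.0])        # input (1, 1)
d = np.array([0.0, 1.0])        # target (0, 1)
eta = 0.5

ah = sigmoid(Wh @ x)            # a3, a4 ~ [0.57, 0.67]
ao = sigmoid(Wo @ ah)           # a5, a6 ~ [0.67, 0.72]

do = (d - ao) * ao * (1 - ao)           # delta5, delta6
dh = (Wo.T @ do) * ah * (1 - ah)        # delta3, delta4

Wo += eta * np.outer(do, ah)    # ~ [[0.458, 0.550], [0.716, 0.819]]
Wh += eta * np.outer(dh, x)     # ~ [[0.096, 0.196], [0.295, 0.395]]
```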

CNN


Architecture


A: forward pass

CNN


Pooling

Linear

  • $a_5 = f(0.1 \cdot 80 + (-0.05) \cdot 90 + 0.05 \cdot 20 + (-0.02) \cdot 60) = 0.9644$
  • $a_6 = f(0.05 \cdot 80 + (-0.02) \cdot 90 + 0.03 \cdot 20 + (-0.07) \cdot 60) = 0.1978$
  • $a_7 = f((-0.4) \cdot 0.9644 + (-1) \cdot 0.1978) = 0.3581$
  • $a_8 = f(0.5 \cdot 0.9644 + (-0.5) \cdot 0.1978) = 0.5947$

Softmax

  • $s_1 = 0.4411$
  • $s_2 = 0.5589$

Backward

* Because there is a softmax layer, the output-layer deltas are computed as:
* $\delta_j = (t_j - y_j)(y_j - y_j^2)\, a_j (1 - a_j)$
  * $t_j$ is the target, $y_j$ is the softmax output
* Everything else is unchanged
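
A small NumPy check of the softmax outputs and this output-layer delta rule (using $a_7, a_8$ from the forward pass):

```python
import numpy as np

a = np.array([0.3581, 0.5947])              # a7, a8 (sigmoid outputs)
y = np.exp(a) / np.exp(a).sum()             # softmax -> [0.4411, 0.5589]
t = np.array([1.0, 0.0])                    # target (1, 0)
delta = (t - y) * (y - y**2) * a * (1 - a)  # output-layer deltas
print(delta)                                # ~ [0.0317, -0.0332]  (delta7, delta8)
```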

Sigmoid $\delta$s

  • Target = $(1, 0)$
  • $\delta_8 = (0 - 0.5589)(0.5589 - 0.5589^2) \cdot 0.5947(1 - 0.5947) = -0.0332$
  • $\delta_7 = (1 - 0.4411)(0.4411 - 0.4411^2) \cdot 0.3581(1 - 0.3581) = 0.0317$
  • $\delta_6 = (\delta_8 w_{86} + \delta_7 w_{76}) f'(s_6) = ((-0.0332)(-0.5) + 0.0317 \cdot (-1)) \cdot 0.1978(1 - 0.1978) = -0.0024$
  • $\delta_5 = (\delta_8 w_{85} + \delta_7 w_{75}) f'(s_5) = ((-0.0332) \cdot 0.5 + 0.0317 \cdot (-0.4)) \cdot 0.9644(1 - 0.9644) = -0.001$
  • $\delta_4 = (\delta_6 w_{64} + \delta_5 w_{54}) f'(s_4) = ((-0.0024)(-0.07) + (-0.001)(-0.02)) \cdot 1 = 0.000188$
  • $\delta_3 = (\delta_6 w_{63} + \delta_5 w_{53}) f'(s_3) = ((-0.0024) \cdot 0.03 + (-0.001) \cdot 0.05) \cdot 1 = -0.000122$
  • $\delta_2 = (\delta_6 w_{62} + \delta_5 w_{52}) f'(s_2) = ((-0.0024)(-0.02) + (-0.001)(-0.05)) \cdot 1 = 0.0001$
  • $\delta_1 = (\delta_6 w_{61} + \delta_5 w_{51}) f'(s_1) = ((-0.0024) \cdot 0.05 + (-0.001) \cdot 0.1) \cdot 1 = -0.00022$

Sigmoid $\Delta w$

  • Learning rate = 0.5
  • $w_{8,6} = -0.5 + 0.5 \cdot (-0.0332) \cdot 0.1978 = -0.5033$
  • $w_{8,5} = 0.5 + 0.5 \cdot (-0.0332) \cdot 0.9644 = 0.4839$
  • $w_{7,6} = -1 + 0.5 \cdot 0.0317 \cdot 0.1978 = -0.9968$
  • $w_{7,5} = -0.4 + 0.5 \cdot 0.0317 \cdot 0.9644 = -0.3847$
  • $w_{6,4} = -0.07 + 0.5 \cdot (-0.0024) \cdot 60 = -0.142$
  • $w_{6,3} = 0.03 + 0.5 \cdot (-0.0024) \cdot 20 = 0.006$
  • $w_{6,2} = -0.02 + 0.5 \cdot (-0.0024) \cdot 90 = -0.128$
  • $w_{6,1} = 0.05 + 0.5 \cdot (-0.0024) \cdot 80 = -0.046$
  • $w_{5,4} = -0.02 + 0.5 \cdot (-0.001) \cdot 60 = -0.05$
  • $w_{5,3} = 0.05 + 0.5 \cdot (-0.001) \cdot 20 = 0.04$
  • $w_{5,2} = -0.05 + 0.5 \cdot (-0.001) \cdot 90 = -0.095$
  • $w_{5,1} = 0.1 + 0.5 \cdot (-0.001) \cdot 80 = 0.06$
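
The whole fully connected head can be verified the same way; a NumPy sketch using the weights with the signs as reconstructed above ($w_{7,5} = -0.4$, $w_{7,6} = -1$, $w_{8,5} = 0.5$, $w_{8,6} = -0.5$):

```python
import numpy as np

sig = lambda x: 1.0 / (1.0 + np.exp(-x))

p = np.array([80.0, 90.0, 20.0, 60.0])       # pooled inputs a1..a4
W1 = np.array([[0.10, -0.05, 0.05, -0.02],   # w5,1 .. w5,4
               [0.05, -0.02, 0.03, -0.07]])  # w6,1 .. w6,4
W2 = np.array([[-0.4, -1.0],                 # w7,5, w7,6
               [ 0.5, -0.5]])                # w8,5, w8,6

h = sig(W1 @ p)                    # a5, a6 ~ [0.9644, 0.1978]
o = sig(W2 @ h)                    # a7, a8 ~ [0.3581, 0.5947]
y = np.exp(o) / np.exp(o).sum()    # softmax ~ [0.4411, 0.5589]

t = np.array([1.0, 0.0])
d_out = (t - y) * (y - y**2) * o * (1 - o)   # delta7, delta8
d_hid = (W2.T @ d_out) * h * (1 - h)         # delta5, delta6

W2 += 0.5 * np.outer(d_out, h)     # matches the w7,* and w8,* updates above
W1 += 0.5 * np.outer(d_hid, p)     # matches the w5,* and w6,* updates above
```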

Upsampling

  • Reverse max-pooling (unpooling: values are placed back at the recorded max locations)
  • Reverse ReLU (values pass only where the activation was positive)


IoU calculation
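
A minimal sketch of the standard IoU computation for axis-aligned boxes given as $(x_1, y_1, x_2, y_2)$ corners (the helper name is my own):

```python
def iou(a, b):
    """Intersection over Union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])   # intersection top-left
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])   # intersection bottom-right
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

print(iou((0, 0, 2, 2), (1, 1, 3, 3)))  # 1 / 7 ~ 0.143
```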

Chapter 6: Object detection

Difference between one-stage and two-stage detectors

R-CNN

  • Slow, because a feature map is recomputed for every region proposal
  • Hard to optimize
  • The stages have to be trained separately

YOLO family

  • Single-shot detector

YOLOv1

  • Backbone: based on GoogLeNet
  • Unified detection:
    • Non-Maximum Suppression (NMS) is used to pick the bounding box that best encloses an object, i.e., the one with the best Intersection over Union (IoU); see the sketch after this list
  • Advantages: fast, relatively easy to train
  • Disadvantages:
    • Each grid cell can predict only one class (with up to 2 boxes), so detection of crowded scenes and small objects is poor
    • Bounding boxes handle a relatively fixed range of object aspect ratios
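
A greedy NMS sketch (reuses the iou() helper from the IoU section above; the 0.5 threshold is an illustrative default):

```python
def nms(boxes, scores, thresh=0.5):
    """Keep the highest-scoring box, drop boxes that overlap it too much, repeat."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)
        keep.append(best)
        # discard remaining boxes whose IoU with the kept box exceeds the threshold
        order = [i for i in order if iou(boxes[best], boxes[i]) < thresh]
    return keep
```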

YOLOv2

  • Backbone changed to Darknet-19
  • Fully connected layers removed in favor of average pooling
  • More bounding boxes (5) in each grid cell (better for small and occluded objects)
  • Multi-scale training: the input scale is varied across batches

YOLOv3

  • Darknet-53
  • Residual learning (the block input is added to the block output)

YOLOv4

  • Backbone: CSPDarknet53
  • Neck: SPP (Spatial Pyramid Pooling) and PANet (Path Aggregation Network)
  • Head: YOLO layer
  • Bounding box regression loss
    • Four kinds: CIoU, GIoU, DIoU, MSE
    • IoU Loss
    • CIoU (Complete-IoU) Loss
  • Regularization
  • Data augmentation
    • CutMix: paste part of another image into the image
    • Mosaic data augmentation: combine four images into one

YOLOv5

  • More mosaic data augmentation
  • GIoU (Generalized-IoU) Loss
  • PANet only
  • Implemented in PyTorch

YOLOX

  • Anchor-free
    • Faster training/inference speed
    • No need to determine anchor parameters
  • Decoupled head

Others

  • YOLO F/R/S/P

Chapter 7 Instance segmentation

R-CNN

Fast R-CNN

  • Still uses selective search

Faster R-CNN

  • Sped up with a Region Proposal Network (RPN)

Mask R-CNN

  • Extends Faster R-CNN with a segmentation (mask) branch

SOLO family

SOLO

SOLOv2

Differences from v1

  • Object mask generation is decoupled into mask kernel prediction and mask feature learning, which are responsible for generating the convolution kernels and the feature maps to be convolved with, respectively.
  • Predicts high-resolution object masks
  • SOLOv2 significantly reduces inference overhead with a matrix non-maximum suppression (NMS) technique.
  • Dynamic convolutions
    • More flexible
    • 2D offsets are added to the regular grid sampling locations of the standard convolution, enabling free-form deformation of the sampling grid

Outdated (unsupervised topics, not covered)

  • Variational Autoencoders
  • Pixel RNN/CNN
  • Generative adversarial network (GAN)

What is the main difference between supervised learning and unsupervised learning networks?

  • Supervised learning
    • Data: Data and label
    • Goal: Map input data to label
  • Unsupervised learning
    • Data: Data, no labels
    • Goal: Learn some underlying structure of the data

Competitive network

SOFM

Cross entropy and Gradient Descent Method

tags: Artificial Neural Networks and Deep Learning, CSnote