Link to the HackMD note

A convolutional neural network

The goal is to extract features

Convolution formulas

Continuous 1D:

(f * g) (x) = \int_{- \infty}^{+ \infty} f (x - t) g (t) d t = \int_{- \infty}^{+ \infty} f (t) g (x - t) d t

Discrete 2D:

(f * ω) (x, y) = \sum_{d x = - a}^{a} \sum_{d y = - b}^{b} ω (a + d x, b + d y) f (x + d x, y + d y)

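$f$ is the image and $ω$ the kernel, whose support is $[- a, a] \times [- b, b]$ (the kernel is stored as an array indexed from $0$, hence the offsets $a + d x$ and $b + d y$).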
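As a minimal sketch of the discrete 2D formula, here it is in plain Python. The function name `conv2d_valid` and the choice to skip border pixels (where the kernel does not fit) are assumptions made for illustration, not part of the notes.

```python
def conv2d_valid(f, w):
    """Apply the discrete 2D formula at every pixel where the kernel fits.

    f: image as a list of rows; w: kernel of size (2a+1) x (2b+1).
    The first index plays the role of x, the second the role of y.
    """
    a, b = len(w) // 2, len(w[0]) // 2
    out = []
    for x in range(a, len(f) - a):
        row = []
        for y in range(b, len(f[0]) - b):
            s = 0
            for dx in range(-a, a + 1):
                for dy in range(-b, b + 1):
                    s += w[a + dx][b + dy] * f[x + dx][y + dy]
            row.append(s)
        out.append(row)
    return out


image = [[1, 2, 3, 4],
         [5, 6, 7, 8],
         [9, 10, 11, 12],
         [13, 14, 15, 16]]
box = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]  # unnormalized 3x3 sum kernel
result = conv2d_valid(image, box)        # [[54, 63], [90, 99]]
```

A 4x4 image and a 3x3 kernel give a 2x2 result: one pixel is lost on each side.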

Examples of kernels $ω$: see the Wikipedia article Noyau_(traitement_d'image).
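To make this concrete, here are a few classic 3x3 kernels with their usual textbook values. The constant names and the helper `apply_at_center` are hypothetical; the helper evaluates the sum of products of the discrete formula at a single pixel.

```python
IDENTITY = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]
BOX_BLUR = [[1 / 9] * 3 for _ in range(3)]
SHARPEN = [[0, -1, 0], [-1, 5, -1], [0, -1, 0]]
EDGE = [[-1, -1, -1], [-1, 8, -1], [-1, -1, -1]]


def apply_at_center(patch, kernel):
    """Sum of element-wise products between a 3x3 patch and a 3x3 kernel."""
    return sum(kernel[i][j] * patch[i][j] for i in range(3) for j in range(3))


flat = [[10, 10, 10], [10, 10, 10], [10, 10, 10]]  # uniform region
spot = [[10, 10, 10], [10, 50, 10], [10, 10, 10]]  # bright center pixel

flat_edge = apply_at_center(flat, EDGE)          # 0: no edge in a uniform region
spot_identity = apply_at_center(spot, IDENTITY)  # 50: pixel unchanged
spot_blur = apply_at_center(spot, BOX_BLUR)      # average of the 9 pixels
spot_sharpen = apply_at_center(spot, SHARPEN)    # 210: contrast with neighbors amplified
```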

Conv2D

x = kl.Conv2D(filters = 4, kernel_size=(5, 5))(x)

The input image has 3 channels, so each filter has $5 \times 5 \times 3 + 1 = 76$ weights (the $+ 1$ is the bias).
The output image has 4 channels; it loses 4 pixels in each direction (2 on each side), since no padding is applied by default.
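The two counts above can be checked with a small sketch. The helper names and the 32-pixel input size are assumptions chosen for illustration.

```python
def conv2d_param_count(kernel_h, kernel_w, in_channels, filters):
    """Per filter: one kernel slice per input channel, plus one bias."""
    return (kernel_h * kernel_w * in_channels + 1) * filters


def valid_output_size(size, kernel):
    """Output length along one axis with no padding and stride 1."""
    return size - kernel + 1


per_filter = conv2d_param_count(5, 5, 3, 1)  # 5*5*3 + 1 = 76
total = conv2d_param_count(5, 5, 3, 4)       # 4 filters -> 304
out = valid_output_size(32, 5)               # 32 -> 28: 4 pixels lost
```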

In detail: stride and padding = 'same'

kl.Conv2D(filters=2, kernel_size=(3, 3), strides=(2, 2), padding='same')

Stride: a way to reduce the size of an image.
padding = 'same': the output has the same size as the input when the stride is 1; with a stride of $s$, the size is divided by $s$ (rounded up).
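A sketch of how the output size follows from stride and padding, using the usual 'valid'/'same' conventions. The function name `conv_output_size` and the 6-pixel input are hypothetical.

```python
import math


def conv_output_size(size, kernel, stride=1, padding="valid"):
    """Output length along one axis of a strided convolution."""
    if padding == "same":
        # The input is padded so that only the stride shrinks the output.
        return math.ceil(size / stride)
    if padding == "valid":
        # No padding: the kernel must fit entirely inside the input.
        return (size - kernel) // stride + 1
    raise ValueError("padding must be 'valid' or 'same'")


same_stride_1 = conv_output_size(6, 3, stride=1, padding="same")    # 6
same_stride_2 = conv_output_size(6, 3, stride=2, padding="same")    # 3
valid_stride_2 = conv_output_size(6, 3, stride=2, padding="valid")  # 2
```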

Dilated convolution

Also called atrous convolution, from the French "à trous" (with holes)


kl.Conv2D(32, kernel_size=3, dilation_rate=(2, 2))

Covers the same area as a $5 \times 5$ kernel, or as $2$ consecutive $3 \times 3$ convolutions, but at a lower cost (in weights).

Does not reduce the image size (padding = 'same')
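The area and cost comparison can be sketched as follows. Both helper names are hypothetical; the weight counts are per input/output channel pair and ignore biases.

```python
def effective_kernel_size(kernel, dilation):
    """Side length covered by a kernel whose taps are `dilation` pixels apart."""
    return kernel + (kernel - 1) * (dilation - 1)


def stacked_receptive_field(kernel, layers):
    """Receptive field of `layers` consecutive stride-1 convolutions."""
    return 1 + layers * (kernel - 1)


covered = effective_kernel_size(3, 2)    # 5: same area as a 5x5 kernel
stacked = stacked_receptive_field(3, 2)  # 5: two 3x3 convolutions in a row
weights = {"dilated 3x3": 3 * 3, "two 3x3": 2 * 3 * 3, "dense 5x5": 5 * 5}
```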

Separable convolution

Spatial: one 2D convolution $\to$ two 1D convolutions (one along each axis)
Depthwise: $N$ 2D convolutions over $M$ channels $\to$ $M$ 2D convolutions (one per channel), then $N$ 1D convolutions (across channels)

Spatially separable convolution

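As an illustration of spatial separability, the sketch below uses the Sobel kernel, which is the outer product of a smoothing column and a difference row. The choice of Sobel and all function names are assumptions, not taken from the notes. Applying the two 1D kernels in sequence gives the same result as the 2D kernel, with 3 + 3 multiplications per pixel instead of 9.

```python
def outer(col, row):
    """Build the 2D kernel whose entry (i, j) is col[i] * row[j]."""
    return [[c * r for r in row] for c in col]


def conv1d_rows(img, k):
    """Slide the 1D kernel k along each row (no padding)."""
    n = len(k)
    return [[sum(k[j] * row[y + j] for j in range(n))
             for y in range(len(row) - n + 1)] for row in img]


def conv1d_cols(img, k):
    """Slide the 1D kernel k along each column (no padding)."""
    n = len(k)
    return [[sum(k[i] * img[x + i][y] for i in range(n))
             for y in range(len(img[0]))] for x in range(len(img) - n + 1)]


def conv2d(img, w):
    """Direct 2D sliding-window sum of products (no padding)."""
    h, v = len(w), len(w[0])
    return [[sum(w[i][j] * img[x + i][y + j] for i in range(h) for j in range(v))
             for y in range(len(img[0]) - v + 1)] for x in range(len(img) - h + 1)]


smooth, diff = [1, 2, 1], [-1, 0, 1]
sobel = outer(smooth, diff)  # [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]

img = [[1, 2, 4, 7], [0, 3, 5, 9], [2, 2, 6, 8], [1, 4, 4, 9]]
direct = conv2d(img, sobel)                             # one 2D pass
two_pass = conv1d_cols(conv1d_rows(img, diff), smooth)  # two 1D passes
```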

In practice we do: [figure not available]

Depthwise separable convolution


kl.SeparableConv2D

Here, with 3 input channels:

$3$ 2D convolutions + $4$ 1D convolutions
$4$ output channels

Large savings in computation
Some loss of representational power

$\to$ used in MobileNet
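The saving can be estimated by counting kernel weights. This sketch assumes a $3 \times 3$ kernel (the notes do not state the size) and ignores biases; the function names are hypothetical.

```python
def standard_conv_weights(k, in_ch, out_ch):
    """Kernel weights of a standard k x k convolution."""
    return k * k * in_ch * out_ch


def separable_conv_weights(k, in_ch, out_ch):
    """One k x k kernel per input channel, then out_ch pointwise channel mixes."""
    depthwise = k * k * in_ch
    pointwise = in_ch * out_ch
    return depthwise + pointwise


# 3 input channels and 4 output channels, as in the example above.
standard = standard_conv_weights(3, 3, 4)    # 108
separable = separable_conv_weights(3, 3, 4)  # 27 + 12 = 39
```

The gap widens quickly as the number of channels grows, which is where most of the savings come from.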

Transposed convolution (or deconvolution)

Convolution: concentrates a block of pixels (times a kernel) into one pixel

Transposed convolution: distributes one pixel (times a kernel) over a block of pixels

Mathematically both are convolutions, but the transposed convolution is meant to mimic the inverse operation of the convolution

\begin{aligned} \text{forward pass of transposed conv.} & \leftrightarrow \text{backward pass of conv.} \\ \text{backward pass of transposed conv.} & \leftrightarrow \text{forward pass of conv.} \end{aligned}
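A 1D sketch of the gather/scatter picture, with stride 1 and no padding (all names are hypothetical). The final check is the transpose property $\langle \text{conv}(x), y \rangle = \langle x, \text{conv}^{T}(y) \rangle$, which is why the forward pass of one operation matches the backward pass of the other.

```python
def conv1d(x, w):
    """Each output gathers a block of len(w) inputs, weighted by the kernel."""
    k = len(w)
    return [sum(w[j] * x[i + j] for j in range(k)) for i in range(len(x) - k + 1)]


def conv1d_transpose(y, w):
    """Each input value is spread, times the kernel, over a block of len(w) outputs."""
    k = len(w)
    out = [0] * (len(y) + k - 1)
    for i, value in enumerate(y):
        for j in range(k):
            out[i + j] += value * w[j]
    return out


def dot(u, v):
    """Inner product of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))


w = [1, 2, 3]
x = [1, 0, 2, -1, 3]  # length 5
y = [2, -1, 4]        # length 3, the shape of conv1d(x, w)

lhs = dot(conv1d(x, w), y)            # <conv(x), y>
rhs = dot(x, conv1d_transpose(y, w))  # <x, conv_transpose(y)>
```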

Architecture tricks

Pooling


kl.MaxPooling2D(pool_size=(2, 2))

If we want to increase the number of channels, we have to reduce the image size, otherwise BOOM (memory and compute explode)
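To make the size reduction concrete, here is a plain-Python sketch of 2x2 max pooling. The function name is hypothetical and the sketch assumes an even height and width.

```python
def max_pool_2x2(img):
    """Keep the maximum of each non-overlapping 2x2 block."""
    return [[max(img[x][y], img[x][y + 1], img[x + 1][y], img[x + 1][y + 1])
             for y in range(0, len(img[0]), 2)]
            for x in range(0, len(img), 2)]


img = [[1, 3, 2, 0],
       [4, 2, 1, 5],
       [7, 0, 3, 3],
       [1, 2, 9, 4]]
pooled = max_pool_2x2(img)  # [[4, 5], [7, 9]]
```

Each pooling step divides the number of pixels by 4, which leaves room to multiply the number of channels.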

If we want a multi-scale view, we have to reduce the image size and add bridges.

The inverse of pooling is kl

Bridges (skip connections)

The big trick behind ResNet, which let its authors win everything (the ILSVRC and COCO competitions in 2015)

Forward and backward propagation (green $\to$: forward, red $\leftarrow$: backward)

During back-propagation, the error travels through both the bridge and the convolutions $\to$ the first layers get corrected
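A toy scalar sketch of why the bridge helps. Each block is assumed to be a function with a small local slope $s$: without a bridge the block computes $y = f(x)$, with a bridge $y = x + f(x)$, so its slope becomes $1 + s$. The function name and the values are hypothetical.

```python
def chain_gradient(local_slopes, skip):
    """Derivative of the last output with respect to the first input.

    By the chain rule, the slopes of successive blocks multiply.
    """
    grad = 1.0
    for s in local_slopes:
        grad *= (1.0 + s) if skip else s
    return grad


slopes = [0.1] * 10                           # ten blocks with a small local slope
plain = chain_gradient(slopes, skip=False)    # 0.1 ** 10: the gradient vanishes
residual = chain_gradient(slopes, skip=True)  # 1.1 ** 10: the gradient survives
```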

Dropout or BatchNormalization

Dropout is usually unnecessary when BatchNormalization is used

Dropout: prevents large weights from blocking the others

BatchNormalization:

After the convolution
Before the activation function
Reduces the need to normalize the data
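A minimal sketch of what batch normalization computes during training for one channel, placed between the convolution output and the activation. The function name, the sample values and the constant `eps` are assumptions; in a real layer `gamma` and `beta` are learned.

```python
import math


def batch_norm(values, gamma=1.0, beta=0.0, eps=1e-3):
    """Normalize a batch of activations to zero mean and unit variance, then scale and shift."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return [gamma * (v - mean) / math.sqrt(var + eps) + beta for v in values]


pre_activations = [10.0, 12.0, 14.0, 20.0]     # convolution outputs for one channel
normalized = batch_norm(pre_activations)       # centered and rescaled
activated = [max(0.0, v) for v in normalized]  # ReLU applied after normalization
```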

Types of vision problems

Semantic segmentation

Classification
Classification + localization
Object detection
Instance segmentation
- The one we are going to do

U-net (2015)

Semantic segmentation of medical images

Multi-scale

Our project today!

Copy and crop: these are BRIDGES

They are used to build the concatenations
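A sketch of a copy-and-crop bridge under simplifying assumptions: feature maps are stored as lists of 2D channels (channels first), and the encoder map is larger than the decoder map because unpadded convolutions shrink the image. The function names are hypothetical.

```python
def center_crop(channel, height, width):
    """Crop a 2D map (list of rows) to height x width around its center."""
    top = (len(channel) - height) // 2
    left = (len(channel[0]) - width) // 2
    return [row[left:left + width] for row in channel[top:top + height]]


def copy_and_crop(encoder_maps, decoder_maps):
    """Crop every encoder channel to the decoder size, then concatenate the channels."""
    h, w = len(decoder_maps[0]), len(decoder_maps[0][0])
    return [center_crop(c, h, w) for c in encoder_maps] + decoder_maps


encoder = [[[r * 4 + c for c in range(4)] for r in range(4)]]  # 1 channel, 4x4
decoder = [[[0, 0], [0, 0]], [[1, 1], [1, 1]]]                 # 2 channels, 2x2
merged = copy_and_crop(encoder, decoder)                       # 3 channels, 2x2
```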

Loss functions for segmentation

If each output image represents the pixels belonging to class $k$, then we can finish with a softmax:

y_{k} = e^{z_{k}} / \sum_{i} e^{z_{i}}
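A plain-Python sketch of the softmax for the scores of one pixel. Subtracting the maximum before exponentiating is a standard numerical precaution that does not change the result; the function name and the scores are hypothetical.

```python
import math


def softmax(z):
    """Turn raw scores z into probabilities y_k = exp(z_k) / sum_i exp(z_i)."""
    m = max(z)  # shifting by the max avoids overflow in exp
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


y = softmax([2.0, 1.0, 0.1])  # scores of one pixel for 3 classes
```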

Mean squared error (mse): gentle slope, no information about mutual exclusion between classes
Cross-entropy:
$E = - \sum_{k} \left[ t_{k} \log y_{k} + (1 - t_{k}) \log (1 - y_{k}) \right]$
- binary_crossentropy with $t_{k} = 0$ or $1$
- categorical_crossentropy with targets given in the form $[0, 0, \dots, 1, \dots, 0]$ to indicate class $k$
- sparse_categorical_crossentropy with classes given as integers





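from tensorflow import keras

# 2x2 image with 3 categories: one integer class label per pixel
y_true = [[1, 2], [0, 2]]
# predicted probability of each category, for each pixel: shape (2, 2, 3)
y_pred = [[[0.05, 0.95, 0.0], [0.1, 0.1, 0.8]],
          [[0.7, 0.2, 0.1], [0.2, 0.2, 0.6]]]
loss = keras.losses.SparseCategoricalCrossentropy()
loss(y_true, y_pred).numpy()  # cross-entropy averaged over the 4 pixels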
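For intuition, the same quantity can be computed by hand: for each pixel, take minus the log of the probability predicted for the true class, then average. This is a plain-Python sketch with a hypothetical function of its own, not the Keras implementation, which also clips probabilities.

```python
import math


def sparse_categorical_crossentropy(y_true, y_pred):
    """Mean over pixels of -log(probability predicted for the true class)."""
    losses = [-math.log(probs[label]) for label, probs in zip(y_true, y_pred)]
    return sum(losses) / len(losses)


# The 2x2 example, flattened to 4 pixels; each row of probabilities sums to 1.
labels = [1, 2, 0, 2]
probs = [[0.05, 0.95, 0.0], [0.1, 0.1, 0.8], [0.7, 0.2, 0.1], [0.2, 0.2, 0.6]]
loss_value = sparse_categorical_crossentropy(labels, probs)  # about 0.285
```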

Focal loss

E_{F L} = - \sum_{k} \left[ t_{k} (1 - y_{k})^{γ} \log y_{k} + (1 - t_{k}) y_{k}^{γ} \log (1 - y_{k}) \right]

Because the slope of the log is steep, cross-entropy favors the cases that are easy to detect. Multiplying by $(1 - y_{k})^{γ}$ flattens the curve, which helps find the hard cases.
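A sketch of the binary focal loss in its usual form, $-(1 - y)^{γ} \log y$ for a positive target and $-y^{γ} \log (1 - y)$ for a negative one. The default $γ = 2$ is a common choice assumed here, and the function names are hypothetical. Setting $γ = 0$ gives back the cross-entropy.

```python
import math


def focal_loss(t, y, gamma=2.0):
    """Binary focal loss for one output: target t in {0, 1}, prediction y in (0, 1)."""
    return -(t * (1 - y) ** gamma * math.log(y)
             + (1 - t) * y ** gamma * math.log(1 - y))


def cross_entropy(t, y):
    """Plain binary cross-entropy, i.e. the focal loss with gamma = 0."""
    return focal_loss(t, y, gamma=0.0)


easy, hard = 0.9, 0.1  # predicted probability of the true class (t = 1)
ratio_easy = focal_loss(1, easy) / cross_entropy(1, easy)  # (1 - 0.9)^2 = 0.01
ratio_hard = focal_loss(1, hard) / cross_entropy(1, hard)  # (1 - 0.1)^2 = 0.81
```

The easy case is down-weighted by a factor of 100, the hard case barely at all, so hard cases dominate the total loss.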

Data augmentation

It is often very useful, especially when data is scarce.

Sometimes it makes the task harder and does not work.


ImageDataGenerator
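A plain-Python sketch of the idea, independent of the `ImageDataGenerator` API: random flips and quarter-turn rotations. For segmentation, the same transform has to be applied to the image and to its mask so that pixels and labels stay aligned. All function names are hypothetical.

```python
import random


def hflip(img):
    """Mirror a 2D image (list of rows) left-right."""
    return [row[::-1] for row in img]


def rot90(img):
    """Rotate a 2D image by 90 degrees clockwise."""
    return [list(col) for col in zip(*img[::-1])]


def augment_pair(image, mask, rng):
    """Apply the same random flip and rotation to an image and its mask."""
    if rng.random() < 0.5:
        image, mask = hflip(image), hflip(mask)
    for _ in range(rng.randrange(4)):
        image, mask = rot90(image), rot90(mask)
    return image, mask


image = [[1, 2], [3, 4]]
mask = [[0, 1], [1, 0]]
new_image, new_mask = augment_pair(image, mask, random.Random(0))
```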