Scalable video recording

Scalability refers to the ability to recover physically meaningful image or video information by decoding only part of the compressed bitstream.

  • Quality scalability:
    • coarser to finer quantization
  • Spatial scalability:
    • different spatial resolutions (e.g. Laplacian pyramid)
  • Temporal scalability:
    • we can skip frames and add the missing ones progressively
  • Frequency scalability:
    • lower frequencies to higher frequencies
  • Combination of basic schemes
  • Granularity: coarse vs. fine

Object-based scalability: different resolutions for different objects

2D motion vs optical flow

Consider a sphere rotating with no change in illumination: nothing changes in the video stream.

Now consider a sphere whose light source moves: the visual information changes.

The observed or apparent 2D motion is called optical flow.

Optical flow equation and ambiguity in motion estimation

  • Consider a video sequence $\psi(x, y, t)$
  • Imagine a point $(x, y)$ that has moved to $(x + d_x, y + d_y)$ at time $t + d_t$

Under the constant intensity assumption, the images of the same object point at different times have the same luminance value

$$\psi(x + d_x, y + d_y, t + d_t) = \psi(x, y, t)$$

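Taking a first-order Taylor expansion:

$$\psi(x + d_x, y + d_y, t + d_t) \approx \psi(x, y, t) + \frac{\partial \psi}{\partial x} d_x + \frac{\partial \psi}{\partial y} d_y + \frac{\partial \psi}{\partial t} d_t$$

Combined with the constant intensity assumption, we get:

$$\frac{\partial \psi}{\partial x} d_x + \frac{\partial \psi}{\partial y} d_y + \frac{\partial \psi}{\partial t} d_t = 0$$

Let us define $v_x = \frac{d_x}{d_t}$, $v_y = \frac{d_y}{d_t}$ (and $v_t = \frac{d_t}{d_t} = 1$). Dividing by $d_t$:

$$\frac{\partial \psi}{\partial x} v_x + \frac{\partial \psi}{\partial y} v_y + \frac{\partial \psi}{\partial t} = 0$$

Which can be written:

$$\nabla \psi^T \mathbf{v} + \frac{\partial \psi}{\partial t} = 0$$

where $\nabla \psi$ is the spatial gradient and $\mathbf{v} = [v_x, v_y]^T$.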

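The flow vector $\mathbf{v}$ at any point $\mathbf{x}$ can be decomposed into 2 orthogonal components:

$$\mathbf{v} = v_n \mathbf{e}_n + v_t \mathbf{e}_t$$

where $\mathbf{e}_n$ is the direction of the spatial gradient (normal to the edge) and $\mathbf{e}_t$ is the tangent direction.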

As we can observe, when a straight edge moves in the plane, we can only detect the normal component $v_n$ of its motion vector!

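Because $\nabla \psi = \|\nabla \psi\| \mathbf{e}_n$, the optical flow equation can be rewritten as:

$$v_n \|\nabla \psi\| + \frac{\partial \psi}{\partial t} = 0$$

where $\|\nabla \psi\|$ is the magnitude of the gradient vector.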

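The consequences of these equations are:

  1. At each pixel $\mathbf{x}$, there is only one equation for two unknowns ($v_x$ and $v_y$), so $\mathbf{v}$ cannot be determined uniquely: the tangential component $v_t$ is undetermined
  2. We can compute the normal component:

$$v_n = -\frac{\partial \psi / \partial t}{\|\nabla \psi\|}$$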

  • This ambiguity in estimating the motion vector is known as the aperture problem
  • The motion can be estimated uniquely only if the aperture contains at least 2 different gradient directions
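
The aperture problem can be illustrated numerically. The sketch below is not from the lecture: the function names and the gradient values are hypothetical, and the temporal derivative is generated from the constant intensity assumption. It shows two different true motions of the same straight edge producing the same normal flow.

```python
import math

def normal_flow(psi_x, psi_y, psi_t):
    """Return (v_n, e_n) from the optical flow equation, or None if the gradient is zero."""
    magnitude = math.hypot(psi_x, psi_y)
    if magnitude == 0.0:
        return None  # flat region: no component of the motion is observable
    v_n = -psi_t / magnitude
    e_n = (psi_x / magnitude, psi_y / magnitude)
    return v_n, e_n

def temporal_derivative(psi_x, psi_y, v):
    # Constant intensity assumption: psi_t = -(psi_x * v_x + psi_y * v_y)
    return -(psi_x * v[0] + psi_y * v[1])

# A straight edge with spatial gradient (3, 4), observed under two different true motions.
gradient = (3.0, 4.0)
for true_v in [(2.0, 1.0), (-2.0, 4.0)]:
    psi_t = temporal_derivative(gradient[0], gradient[1], true_v)
    v_n, e_n = normal_flow(gradient[0], gradient[1], psi_t)
    print(true_v, "-> v_n =", v_n)
```

Both motions have the same projection onto the gradient direction, so a single gradient direction cannot tell them apart.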

General methodologies

  • We consider motion estimation (ME) between 2 given frames, $\psi(x, y, t_1)$ and $\psi(x, y, t_2)$

When $t_2 > t_1$, the problem is referred to as forward motion estimation.

Notation

How should the motion vectors be encoded?
They vary across space, so they have to be encoded parametrically.

Mapping function (new position):

$$\mathbf{w}(\mathbf{x}, \mathbf{a}) = \mathbf{x} + \mathbf{d}(\mathbf{x}, \mathbf{a})$$

where the parameter vector $\mathbf{a}$ encodes the motion, which gives the new position.

$$\mathbf{a} = [a_1, a_2, \dots, a_n]^T$$
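
As a minimal sketch of the mapping function, here are two possible parametrizations of $\mathbf{d}(\mathbf{x}, \mathbf{a})$: a pure translation (2 parameters) and an affine model (6 parameters, a first-order polynomial). The function names and the parameter ordering are assumptions made for this example, not notation from the lecture.

```python
def displacement_translation(x, a):
    # a = [a1, a2]: the same displacement for every point
    return (a[0], a[1])

def displacement_affine(x, a):
    # a = [a1, ..., a6]: the displacement varies linearly with the position
    px, py = x
    return (a[0] + a[1] * px + a[2] * py,
            a[3] + a[4] * px + a[5] * py)

def mapping(x, a, displacement):
    """w(x, a) = x + d(x, a): new position of point x."""
    dx, dy = displacement(x, a)
    return (x[0] + dx, x[1] + dy)

print(mapping((10, 20), [3, -1], displacement_translation))  # (13, 19)
print(mapping((10, 20), [1, 0.1, 0, 0, 0, 0.1], displacement_affine))
```

The estimation problem then reduces to finding the vector `a`, whatever the chosen model.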

Motion representation

[Figure: different motion representations]

Different motion representations.

Image b: pixel-based

  • One motion vector for each pixel of the image

Image c: block-based (we will implement it in the lab session)

  • We assume the image is partitioned into blocks
  • One motion vector per block

How do we parametrize the vector field?

Translations
Polynomial motions
Rotations

If we treat the image as a set of pixels and estimate the motion pixel-wise,

that gives about 2 million unknowns to find.

We add regularity constraints.

In general, the image is partitioned into regions.

Which do we estimate first: the motion or the regions?

Block-based approach

The image is split into blocks (e.g. a $100 \times 100$ image into $33 \times 33$ blocks).

Some blocks will overlap because the motion is not uniform.

And we do not care!

Some block corners have moved as well.

We have to use gradient descent.

  • The simplest versions one can imagine use translations
  • Blocks are a good trade-off between accuracy and complexity

It can induce warping effects

Motion estimation criteria

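Displaced Frame Difference (DFD):

$$E_{DFD}(\mathbf{a}) = \sum_{\mathbf{x} \in \Lambda} \left| \psi_2(\mathbf{w}(\mathbf{x}, \mathbf{a})) - \psi_1(\mathbf{x}) \right|^p$$

where $\Lambda$ is the domain of all pixels in $\psi_1$ and $p$ is a positive number.

  • When $p = 1$, the above error is called the mean absolute difference (MAD), and when $p = 2$, the mean squared error (MSE)
  • The error image $e(\mathbf{x}, \mathbf{a}) = \psi_2(\mathbf{w}(\mathbf{x}, \mathbf{a})) - \psi_1(\mathbf{x})$ is usually called the displaced frame difference (DFD) image
  • When $\mathbf{a}$ is optimal ($p = 2$):

$$\frac{\partial E_{DFD}}{\partial \mathbf{a}} = 2 \sum_{\mathbf{x} \in \Lambda} \left( \psi_2(\mathbf{w}(\mathbf{x}, \mathbf{a})) - \psi_1(\mathbf{x}) \right) \frac{\partial \mathbf{w}(\mathbf{x}, \mathbf{a})}{\partial \mathbf{a}} \nabla \psi_2(\mathbf{w}(\mathbf{x}, \mathbf{a})) = 0$$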
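
The DFD error can be sketched for the simplest motion model, a single integer translation applied to the whole frame. The function name `dfd_error`, the list-of-rows frame layout and the choice to skip pixels that leave the frame are assumptions of this example.

```python
def dfd_error(frame1, frame2, d, p=2):
    """Sum over pixels of |frame2(x + d) - frame1(x)|^p for an integer translation d = (dx, dy).

    Pixels whose displaced position falls outside frame2 are skipped
    (a simplification made for this sketch).
    """
    dx, dy = d
    height, width = len(frame1), len(frame1[0])
    total = 0
    for y in range(height):
        for x in range(width):
            xs, ys = x + dx, y + dy
            if 0 <= xs < width and 0 <= ys < height:
                total += abs(frame2[ys][xs] - frame1[y][x]) ** p
    return total

# A single bright pixel that moves one pixel to the right between the two frames.
frame1 = [[0, 0, 0, 0],
          [0, 9, 0, 0],
          [0, 0, 0, 0]]
frame2 = [[0, 0, 0, 0],
          [0, 0, 9, 0],
          [0, 0, 0, 0]]
print(dfd_error(frame1, frame2, (0, 0), p=1))  # 18
print(dfd_error(frame1, frame2, (1, 0), p=1))  # 0
```

The error drops to zero for the true displacement `(1, 0)`, which is what the minimization is looking for.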

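Consider a simpler case:

$$\frac{\partial \psi}{\partial t} d_t = \psi_2(\mathbf{x}) - \psi_1(\mathbf{x})$$

It is equivalent to minimize:

$$E_{flow} = \sum_{\mathbf{x} \in \Lambda} \left| \nabla \psi_1(\mathbf{x})^T \mathbf{d}(\mathbf{x}, \mathbf{a}) + \psi_2(\mathbf{x}) - \psi_1(\mathbf{x}) \right|^p$$

When $p = 2$, the solution satisfies:

$$\frac{\partial E_{flow}}{\partial \mathbf{a}} = 2 \sum_{\mathbf{x} \in \Lambda} \left( \nabla \psi_1(\mathbf{x})^T \mathbf{d}(\mathbf{x}, \mathbf{a}) + \psi_2(\mathbf{x}) - \psi_1(\mathbf{x}) \right) \frac{\partial \mathbf{d}(\mathbf{x}, \mathbf{a})}{\partial \mathbf{a}} \nabla \psi_1(\mathbf{x}) = 0$$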

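We can add a penalty term to the criterion to enforce the smoothness of the vector field (i.e. it must vary smoothly):

$$E_s = \sum_{\mathbf{x} \in \Lambda} \sum_{\mathbf{y} \in N_{\mathbf{x}}} \left\| \mathbf{d}(\mathbf{x}, \mathbf{a}) - \mathbf{d}(\mathbf{y}, \mathbf{a}) \right\|^2$$

where $N_{\mathbf{x}}$ is the neighbourhood of pixel $\mathbf{x}$.

We want to minimize:

$$E_{total} = E_{DFD} + w_s E_s$$

with $w_s$ the weighting coefficient.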

  • We have to regularize but not too much (to avoid over-blurring)
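
A minimal sketch of the smoothness term, assuming a 4-connected neighbourhood and a dense field stored as `field[y][x] = (dx, dy)`. Both the neighbourhood choice and the function names are assumptions of this example.

```python
def smoothness_energy(field):
    """E_s: sum over pixels x and their 4-neighbours y of ||d(x) - d(y)||^2.

    Each neighbouring pair is counted twice, as in the double sum.
    """
    height, width = len(field), len(field[0])
    total = 0.0
    for y in range(height):
        for x in range(width):
            for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                if 0 <= ny < height and 0 <= nx < width:
                    ddx = field[y][x][0] - field[ny][nx][0]
                    ddy = field[y][x][1] - field[ny][nx][1]
                    total += ddx * ddx + ddy * ddy
    return total

def total_energy(e_dfd, field, w_s):
    """E_total = E_DFD + w_s * E_s."""
    return e_dfd + w_s * smoothness_energy(field)

uniform = [[(1, 0), (1, 0)], [(1, 0), (1, 0)]]
outlier = [[(1, 0), (1, 0)], [(1, 0), (3, 0)]]
print(smoothness_energy(uniform))  # 0.0
print(smoothness_energy(outlier))  # 16.0
```

A uniform field costs nothing, while a single outlier vector is penalized. Raising `w_s` pushes the solution towards the uniform field, hence the risk of over-smoothing.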

Minimization methods

We will mainly look at the exhaustive search method

  • The gradient method
  • The Newton-Raphson method

With gradient descent and the dimensionality of the problem, we often end up in local minima rather than the global one

  • One important search strategy is to use a multi-resolution representation of the motion field and conduct the search in a hierarchical manner
  • The basic idea is to first search the motion parameters in a coarse resolution, propagate this solution into a finer resolution, and then refine the solution in the finer resolution
  • It can combat the slowness of exhaustive search methods
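
The coarse-to-fine idea can be sketched for a single global translation. Everything here is an assumption made for illustration: the 2x2 averaging pyramid, the mean absolute error criterion, the search radius of 1 at each level and all function names.

```python
def downsample(frame):
    """Halve the resolution by averaging 2x2 neighbourhoods (even dimensions assumed)."""
    return [[(frame[2 * y][2 * x] + frame[2 * y][2 * x + 1]
              + frame[2 * y + 1][2 * x] + frame[2 * y + 1][2 * x + 1]) / 4
             for x in range(len(frame[0]) // 2)]
            for y in range(len(frame) // 2)]

def mean_abs_error(frame1, frame2, d):
    """Mean of |frame2(x + d) - frame1(x)| over the pixels that stay inside the frame."""
    dx, dy = d
    height, width = len(frame1), len(frame1[0])
    total, count = 0.0, 0
    for y in range(height):
        for x in range(width):
            if 0 <= x + dx < width and 0 <= y + dy < height:
                total += abs(frame2[y + dy][x + dx] - frame1[y][x])
                count += 1
    return total / count if count else float("inf")

def local_search(frame1, frame2, center, radius):
    """Exhaustive search in a (2 * radius + 1)^2 window around `center`."""
    candidates = [(center[0] + ddx, center[1] + ddy)
                  for ddy in range(-radius, radius + 1)
                  for ddx in range(-radius, radius + 1)]
    return min(candidates, key=lambda d: mean_abs_error(frame1, frame2, d))

def hierarchical_search(frame1, frame2, levels=2, radius=1):
    pyramid = [(frame1, frame2)]
    for _ in range(levels - 1):
        coarse1, coarse2 = pyramid[-1]
        pyramid.append((downsample(coarse1), downsample(coarse2)))
    d = (0, 0)
    for level, (f1, f2) in enumerate(reversed(pyramid)):
        if level > 0:
            d = (2 * d[0], 2 * d[1])  # propagate the coarse estimate to the finer grid
        d = local_search(f1, f2, d, radius)  # refine at the current resolution
    return d

def make_frame(left):
    """8x8 black frame with a bright 2x2 square whose top-left corner is (left, 2)."""
    return [[9 if 2 <= y < 4 and left <= x < left + 2 else 0 for x in range(8)]
            for y in range(8)]

frame1, frame2 = make_frame(2), make_frame(4)
print(hierarchical_search(frame1, frame2))  # (2, 0)
```

The true shift of 2 pixels lies outside a radius-1 window at full resolution. It is found here because the coarse level first locates a shift of 1 coarse pixel, which is doubled and then refined.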

Regularization

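$$E = \sum_{\mathbf{x} \in \Lambda} \left( \frac{\partial \psi}{\partial x} v_x + \frac{\partial \psi}{\partial y} v_y + \frac{\partial \psi}{\partial t} \right)^2 + w_s \left( \|\nabla v_x\|^2 + \|\nabla v_y\|^2 \right)$$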

Block matching algorithm (BMA)

  • Blocks can have any polygonal shape
    • In practice, we use squares
  • We assume a translational motion

The Exhaustive Search Block Matching Algorithm (EBMA)

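Under the block-wise translation model:

$$\mathbf{w}(\mathbf{x}, \mathbf{a}) = \mathbf{x} + \mathbf{d}_m, \quad \mathbf{x} \in B_m$$

Then the error can be written:

$$E(\mathbf{d}_m, \forall m \in M) = \sum_{m \in M} \sum_{\mathbf{x} \in B_m} \left| \psi_2(\mathbf{x} + \mathbf{d}_m) - \psi_1(\mathbf{x}) \right|^p$$

We can estimate the MV for each block individually:

$$E_m(\mathbf{d}_m) = \sum_{\mathbf{x} \in B_m} \left| \psi_2(\mathbf{x} + \mathbf{d}_m) - \psi_1(\mathbf{x}) \right|^p$$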
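
A minimal EBMA sketch with integer-pel accuracy, using plain Python lists. The function names, the non-overlapping square blocks, the rejection of candidates that leave the frame and the tie-breaking in favour of the zero vector are all assumptions of this example.

```python
def block_error(frame1, frame2, top, left, size, d, p=1):
    """E_m(d_m): sum over the block of |frame2(x + d_m) - frame1(x)|^p.

    Returns None when the displaced block leaves frame2 (candidate rejected).
    """
    dx, dy = d
    height, width = len(frame2), len(frame2[0])
    if not (0 <= top + dy and top + dy + size <= height
            and 0 <= left + dx and left + dx + size <= width):
        return None
    total = 0
    for y in range(top, top + size):
        for x in range(left, left + size):
            total += abs(frame2[y + dy][x + dx] - frame1[y][x]) ** p
    return total

def ebma(frame1, frame2, block_size, search_range, p=1):
    """Return {(top, left): (dx, dy)} with the best integer MV of each block."""
    height, width = len(frame1), len(frame1[0])
    motion = {}
    for top in range(0, height - block_size + 1, block_size):
        for left in range(0, width - block_size + 1, block_size):
            # Start from the zero vector, which is always a valid candidate.
            best_d = (0, 0)
            best_err = block_error(frame1, frame2, top, left, block_size, best_d, p)
            for dy in range(-search_range, search_range + 1):
                for dx in range(-search_range, search_range + 1):
                    err = block_error(frame1, frame2, top, left, block_size, (dx, dy), p)
                    if err is not None and err < best_err:
                        best_d, best_err = (dx, dy), err
            motion[(top, left)] = best_d
    return motion

# The textured 2x2 patch in the top-left block moves one pixel to the right.
frame1 = [[1, 2, 0, 0],
          [3, 4, 0, 0],
          [0, 0, 0, 0],
          [0, 0, 0, 0]]
frame2 = [[0, 1, 2, 0],
          [0, 3, 4, 0],
          [0, 0, 0, 0],
          [0, 0, 0, 0]]
motion = ebma(frame1, frame2, block_size=2, search_range=1)
print(motion[(0, 0)])  # (1, 0)
```

The cost grows with the square of the search range, since every candidate in the window is evaluated for every block.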

Deformable block matching algorithm

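$$\mathbf{d}_m(\mathbf{x}) = \sum_{k=1}^{K} \phi_{m,k}(\mathbf{x}) \mathbf{d}_{m,k}, \quad \mathbf{x} \in B_m$$

The displacement at point $\mathbf{x}$ of block $m$ is a weighted sum of the displacements at the 4 corners.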
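
One common choice for the weights $\phi_{m,k}$ on a square block is bilinear interpolation. The sketch below assumes that choice, which the notes do not specify, and uses hypothetical function names and corner ordering.

```python
def bilinear_weights(u, v):
    """Interpolation kernels phi_k for the 4 corners of a unit square, (u, v) in [0, 1]^2.

    Corner order: top-left, top-right, bottom-left, bottom-right.
    """
    return [(1 - u) * (1 - v), u * (1 - v), (1 - u) * v, u * v]

def block_displacement(x, y, left, top, size, corner_mvs):
    """d_m(x) = sum_k phi_{m,k}(x) d_{m,k} for a square block of side `size`."""
    u = (x - left) / size
    v = (y - top) / size
    weights = bilinear_weights(u, v)
    dx = sum(w * mv[0] for w, mv in zip(weights, corner_mvs))
    dy = sum(w * mv[1] for w, mv in zip(weights, corner_mvs))
    return (dx, dy)

# The two right-hand corners move 4 pixels to the right, the left-hand ones stay put.
corners = [(0, 0), (4, 0), (0, 0), (4, 0)]
print(block_displacement(0, 0, 0, 0, 8, corners))  # (0.0, 0.0)
print(block_displacement(4, 4, 0, 0, 8, corners))  # (2.0, 0.0)
```

The weights sum to 1 everywhere in the block, and each corner recovers exactly its own nodal displacement.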

Node-based motion representation

  • Nodal MVs vs Polynomial coefficients
    • Nodal
      • Stability

Motion estimation using node-based model

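$$\mathbf{a} = [\mathbf{d}_k; k \in K]$$

$$E(\mathbf{a}) = \sum_{\mathbf{x} \in B} \left( \psi_2(\mathbf{w}(\mathbf{x}, \mathbf{a})) - \psi_1(\mathbf{x}) \right)^2$$

where:

$$\mathbf{w}(\mathbf{x}, \mathbf{a}) = \mathbf{x} + \sum_{k \in K} \phi_k(\mathbf{x}) \mathbf{d}_k$$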

Mesh-based motion estimation

  • Block-based case: blocks are estimated independently and deformed
  • Mesh: a mesh is laid over the image and its nodes are allowed to move together
    • Everything is correlated

Constraint to keep in mind: we do not want 2 squares of the mesh to flip over each other

  • There are often discontinuities at the edges
  • The more nodes we add, the more accurate the estimation
    • But the computational cost explodes

Global motion estimation

Several methods exist

Are we in the case where only the camera moves?

In football and tennis footage, a large part of the background is static

Region-based motion estimation

Do we first segment the image into regions, or first estimate the motion?

3 possible approaches

Multi-resolution motion estimation

  • Various ME approaches can be reduced to solving an error minimization problem
  • Major difficulties
    • Many local minima in the gradient-descent case
    • Not easy to reach the global minimum
    • High computational cost

Laplacian pyramid: the image is decomposed into frequency bands