Scalable video recording
Scalability refers to the capability of recovering physically meaningful image or video information by decoding only part of a compressed bitstream
- Quality scalability:
- coarser to finer quantization
- Spatial scalability:
- different spatial resolutions (Laplacian, Pyramid, …)
- Temporal scalability:
- we can skip frames and add the missing ones progressively
- Frequency scalability:
- lower frequencies to higher frequencies
- Combination of basic schemes
- Granularity: coarse vs fine ones
Object-based scalability: different resolutions for different objects
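As an illustration of quality scalability, here is a minimal sketch of a two-layer quantizer: a base layer with a coarse step and an enhancement layer that refines the residual with a finer step. The function names and the step sizes (`coarse_step`, `fine_step`) are assumptions made for this example and do not come from any codec.

```python
import numpy as np

def encode_layers(signal, coarse_step=16.0, fine_step=4.0):
    """Split a signal into a coarse base layer and a refinement layer."""
    base = np.round(signal / coarse_step)          # base-layer indices
    residual = signal - base * coarse_step         # what the base layer misses
    enh = np.round(residual / fine_step)           # enhancement-layer indices
    return base, enh

def decode(base, enh=None, coarse_step=16.0, fine_step=4.0):
    """Decode the base layer alone, or base + enhancement if available."""
    rec = base * coarse_step
    if enh is not None:
        rec = rec + enh * fine_step
    return rec

signal = np.linspace(0.0, 255.0, 50)
base, enh = encode_layers(signal)
err_base = np.max(np.abs(decode(base) - signal))       # bounded by coarse_step / 2
err_full = np.max(np.abs(decode(base, enh) - signal))  # bounded by fine_step / 2
```

Decoding only `base` still yields a meaningful (coarser) signal, and adding `enh` reduces the error, which is the behaviour a scalable bitstream is meant to provide.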
2D motion vs optical flow
Consider a sphere rotating under unchanging illumination: successive frames of the video stream show no difference.
Now consider a sphere whose light source moves: the visual information changes.
The observed or apparent 2D motion is called optical flow
Optical flow equation and ambiguity in motion estimation
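- Consider a video sequence $\psi(x, y, t)$
- Consider a point $(x, y)$ at time $t$ that moves to $(x + d_x, y + d_y)$ at time $t + d_t$
Under the constant intensity assumption, the images of the same object point at different times have the same luminance value:
$$\psi(x + d_x, y + d_y, t + d_t) = \psi(x, y, t)$$
A first-order Taylor expansion gives:
$$\psi(x + d_x, y + d_y, t + d_t) \approx \psi(x, y, t) + \frac{\partial \psi}{\partial x} d_x + \frac{\partial \psi}{\partial y} d_y + \frac{\partial \psi}{\partial t} d_t$$
We obtain:
$$\frac{\partial \psi}{\partial x} d_x + \frac{\partial \psi}{\partial y} d_y + \frac{\partial \psi}{\partial t} d_t = 0$$
Define $v_x = d_x / d_t$, $v_y = d_y / d_t$, $\mathbf{v} = [v_x, v_y]^T$. Dividing by $d_t$:
$$\frac{\partial \psi}{\partial x} v_x + \frac{\partial \psi}{\partial y} v_y + \frac{\partial \psi}{\partial t} = 0$$
Which can be written:
$$\nabla \psi^T \mathbf{v} + \frac{\partial \psi}{\partial t} = 0$$
With $\nabla \psi = \left[\frac{\partial \psi}{\partial x}, \frac{\partial \psi}{\partial y}\right]^T$ the spatial gradient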
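The flow vector $\mathbf{v}$ at any point can be decomposed into 2 orthogonal components:
$$\mathbf{v} = v_n \mathbf{e}_n + v_t \mathbf{e}_t$$
where $\mathbf{e}_n$ is the unit vector along the spatial gradient $\nabla \psi$ and $\mathbf{e}_t$ is the unit vector orthogonal to it (tangent to the edge).
As we can observe, when a straight edge moves in the plane, we can only detect the component of its motion vector normal to the edge.
This is because the optical flow equation can be rewritten as:
$$v_n \|\nabla \psi\| + \frac{\partial \psi}{\partial t} = 0$$
with $\|\nabla \psi\|$ the magnitude of the gradient vector.
The consequences of these equations are:
- At each pixel, there is a single equation for two unknowns, so $\mathbf{v}$ cannot be determined uniquely
- We can compute only the normal component $v_n$; the tangential component $v_t$ is left undetermined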
- This ambiguity in estimating the motion vector is known as the aperture problem
- The motion can be estimated uniquely only if the aperture contains at least 2 different gradient directions
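The two points above can be sketched numerically. The snippet below computes the normal flow from finite-difference gradients and then tries to solve the optical flow equation in the least-squares sense over an aperture, refusing when only one gradient direction is present. The helper names (`gradients`, `normal_flow`, `window_flow`) and the least-squares formulation are assumptions chosen for illustration, not a method prescribed by these notes.

```python
import numpy as np

def gradients(psi1, psi2):
    """Finite-difference spatial gradients of psi1 and temporal difference."""
    gy, gx = np.gradient(psi1.astype(np.float64))  # axis 0 is y, axis 1 is x
    gt = psi2.astype(np.float64) - psi1.astype(np.float64)
    return gx, gy, gt

def normal_flow(gx, gy, gt, eps=1e-6):
    """v_n = -psi_t / ||grad psi||, set to 0 where the gradient vanishes."""
    mag = np.hypot(gx, gy)
    return np.where(mag > eps, -gt / np.maximum(mag, eps), 0.0)

def window_flow(gx, gy, gt, tol=1e-6):
    """Least-squares flow over an aperture; None if the problem is ambiguous."""
    A = np.stack([gx.ravel(), gy.ravel()], axis=1)
    b = -gt.ravel()
    if np.linalg.matrix_rank(A, tol=tol) < 2:
        return None  # a single gradient direction: aperture problem
    v, *_ = np.linalg.lstsq(A, b, rcond=None)
    return v

# A horizontal intensity ramp translating one pixel to the right.
frame1 = np.tile(np.arange(16, dtype=np.float64), (8, 1))
frame2 = frame1 - 1.0  # psi(x - 1, y)
gx, gy, gt = gradients(frame1, frame2)
vn = normal_flow(gx, gy, gt)          # normal component of the motion
ambiguous = window_flow(gx, gy, gt)   # None: every gradient points along x
```

On the ramp, every gradient has the same direction, so only the normal component is recoverable; `window_flow` returns a vector only when the aperture contains at least two gradient directions.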
General methodologies
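- We consider motion estimation (ME) between 2 given frames, $\psi_1(\mathbf{x}) = \psi(\mathbf{x}, t_1)$ and $\psi_2(\mathbf{x}) = \psi(\mathbf{x}, t_2)$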
The problem is referred to as forward motion estimation
Notation
How do we encode the motion vectors?
They vary across space, so they must be encoded parametrically.
Mapping function: $\mathbf{w}(\mathbf{x}; \mathbf{a}) = \mathbf{x} + \mathbf{d}(\mathbf{x}; \mathbf{a})$, the new position of $\mathbf{x}$
With $\mathbf{a}$ the parameter that encodes the motion, this gives us the new position.
Motion representation
Different motion representations.
Image b: pixel-based
- There is one vector for each pixel of the image
Image c: block-based (we will implement it in the lab session)
- We assume the image is partitioned into blocks
- We use one motion vector per block
How do we parametrize the vector field?
Translations
Polynomial motions
Rotations
…
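A minimal sketch of such parametrizations, written as mapping functions that take pixel positions and a parameter and return the new positions. The function names and the parameter ordering of the affine (first-order polynomial) model are assumptions made for this example.

```python
import numpy as np

def translate(points, a):
    """Translation: w(x; a) = x + a, with a = (dx, dy)."""
    return points + np.asarray(a, dtype=np.float64)

def rotate(points, theta, center=(0.0, 0.0)):
    """Rotation by angle theta (radians) around a center."""
    c = np.asarray(center, dtype=np.float64)
    R = np.array([[np.cos(theta), -np.sin(theta)],
                  [np.sin(theta),  np.cos(theta)]])
    return (points - c) @ R.T + c

def affine(points, a):
    """First-order polynomial motion with 6 parameters:
    x' = a0 + a1*x + a2*y,  y' = a3 + a4*x + a5*y."""
    a = np.asarray(a, dtype=np.float64)
    x, y = points[:, 0], points[:, 1]
    return np.stack([a[0] + a[1] * x + a[2] * y,
                     a[3] + a[4] * x + a[5] * y], axis=1)

pts = np.array([[1.0, 0.0], [0.0, 1.0]])   # (x, y) positions
moved_t = translate(pts, (2.0, 3.0))
moved_r = rotate(pts, np.pi / 2)
moved_a = affine(pts, (0.0, 1.0, 0.0, 0.0, 0.0, 1.0))  # identity mapping
```

A translation needs 2 parameters and the affine model 6, however many pixels are mapped, which is what makes a parametric encoding compact.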
We consider that the image is made of pixels and estimate the motion pixel-wise
That gives about 2 million unknowns to find
We add a regularity (smoothness) constraint.
In general, we partition the image into regions.
Do we estimate the motion first, or the regions?
Block-based approach
We split the image into blocks (e.g. an image of a given size into fixed-size blocks)
Some blocks will overlap because the motion is not uniform
This is not a concern.
Some corners have also moved.
We have to use gradient descent
- The simplest versions one can imagine use translations only
- Blocks are a good trade-off between accuracy and complexity
It can induce warping effects
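Motion estimation criteria
Displaced Frame Difference (DFD):
$$E_{\text{DFD}}(\mathbf{a}) = \sum_{\mathbf{x} \in \Lambda} \left| \psi_2(\mathbf{w}(\mathbf{x}; \mathbf{a})) - \psi_1(\mathbf{x}) \right|^p$$
where $\psi_1$ and $\psi_2$ are the two frames, $\mathbf{w}(\mathbf{x}; \mathbf{a})$ is the mapping function with motion parameter $\mathbf{a}$, $\Lambda$ is the domain of all pixels in $\psi_1$ and $p$ a positive number
- When $p = 1$, the above error is called mean absolute difference (MAD) and when $p = 2$, mean squared error (MSE), up to normalization by the number of pixels
- The error image $e(\mathbf{x}; \mathbf{a}) = \psi_2(\mathbf{w}(\mathbf{x}; \mathbf{a})) - \psi_1(\mathbf{x})$ is usually called the displaced frame difference (DFD) image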
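- When the motion parameter $\mathbf{a}$ is optimal, the gradient of the error with respect to $\mathbf{a}$ vanishes ($\partial E_{\text{DFD}} / \partial \mathbf{a} = 0$)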
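The DFD criterion can be sketched for the simplest mapping, a single integer translation of the whole frame. The function name `dfd_error`, the `(dy, dx)` ordering of the displacement and the choice to average only over pixels whose displaced position stays inside the frame are assumptions made for this example.

```python
import numpy as np

def dfd_error(psi1, psi2, d, p=2):
    """Mean of |psi2(x + d) - psi1(x)|^p over pixels where x + d is in frame.

    d = (dy, dx) is an integer displacement; p=1 gives MAD, p=2 gives MSE.
    """
    dy, dx = d
    h, w = psi1.shape
    y0, y1 = max(0, -dy), min(h, h - dy)
    x0, x1 = max(0, -dx), min(w, w - dx)
    anchor = psi1[y0:y1, x0:x1].astype(np.float64)
    target = psi2[y0 + dy:y1 + dy, x0 + dx:x1 + dx].astype(np.float64)
    return float(np.mean(np.abs(target - anchor) ** p))

rng = np.random.default_rng(0)
psi1 = rng.random((12, 12))
psi2 = np.roll(psi1, shift=(1, 2), axis=(0, 1))  # content moved by (dy, dx) = (1, 2)

mad_true = dfd_error(psi1, psi2, (1, 2), p=1)   # error at the true displacement
mse_zero = dfd_error(psi1, psi2, (0, 0), p=2)   # error when assuming no motion
```

At the true displacement the DFD image is zero on the compared region, while the zero-motion hypothesis leaves a positive error.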
Consider a simpler case
It is equivalent to minimize:
This solution verifies when
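We can add a penalty term to the criterion to enforce the smoothness of the vector field (i.e. the motion vectors must vary smoothly from one pixel to the next)
We want to minimize:
$$E_{\text{DFD}} + w_s E_s$$
with $E_s$ the smoothness penalty and $w_s$ the weighting coefficient.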
- We have to regularize but not too much (to avoid over-blurring)
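One possible form of the penalty, shown here as a sketch: the sum of squared differences between the motion vectors of horizontally and vertically adjacent pixels. This particular penalty, and the names `smoothness_penalty`, `regularized_cost` and `w_s`, are assumptions made for illustration; the notes do not specify which penalty is used.

```python
import numpy as np

def smoothness_penalty(d):
    """Sum of squared differences between neighbouring motion vectors.

    d has shape (H, W, 2): one (dy, dx) vector per pixel.
    """
    dh = d[:, 1:] - d[:, :-1]   # horizontal neighbours
    dv = d[1:, :] - d[:-1, :]   # vertical neighbours
    return float(np.sum(dh ** 2) + np.sum(dv ** 2))

def regularized_cost(e_dfd, d, w_s):
    """Data term plus weighted smoothness term."""
    return e_dfd + w_s * smoothness_penalty(d)

constant_field = np.ones((3, 4, 2))      # same vector everywhere
ramp_field = np.zeros((3, 4, 2))
ramp_field[..., 0] = np.arange(4)        # first component grows by 1 per column

pen_constant = smoothness_penalty(constant_field)
pen_ramp = smoothness_penalty(ramp_field)
cost = regularized_cost(1.0, ramp_field, w_s=0.5)
```

A constant field costs nothing, and the weight `w_s` sets how strongly variations are penalized: too large a value flattens true motion discontinuities.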
Minimization methods
We will mainly look at the exhaustive search method
- The gradient method
- The Newton-Raphson method
With gradient descent and the high dimensionality of the problem, we often end up in local minima rather than the global one

- One important search strategy is to use a multi-resolution representation of the motion field and conduct the search in a hierarchical manner
- The basic idea is to first search the motion parameters in a coarse resolution, propagate this solution into a finer resolution, and then refine the solution in the finer resolution
- It can combat the slowness of exhaustive search methods
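The coarse-to-fine idea can be sketched on a deliberately simplified problem: a single integer translation for the whole frame, a pyramid built by 2x2 averaging, and a small exhaustive search at each level. All names (`downsample`, `mad`, `search`, `hierarchical_translation`) and the choice of MAD are assumptions made for this example.

```python
import numpy as np

def downsample(img):
    """Halve the resolution by averaging 2x2 neighbourhoods."""
    h, w = img.shape
    img = img[:h - h % 2, :w - w % 2]
    return 0.25 * (img[0::2, 0::2] + img[1::2, 0::2]
                   + img[0::2, 1::2] + img[1::2, 1::2])

def mad(psi1, psi2, dy, dx):
    """Mean absolute difference between psi2(x + d) and psi1(x)."""
    h, w = psi1.shape
    y0, y1 = max(0, -dy), min(h, h - dy)
    x0, x1 = max(0, -dx), min(w, w - dx)
    return np.mean(np.abs(psi2[y0 + dy:y1 + dy, x0 + dx:x1 + dx]
                          - psi1[y0:y1, x0:x1]))

def search(psi1, psi2, center, radius):
    """Exhaustive search in a (2*radius+1)^2 window around center."""
    best, best_err = center, np.inf
    for dy in range(center[0] - radius, center[0] + radius + 1):
        for dx in range(center[1] - radius, center[1] + radius + 1):
            err = mad(psi1, psi2, dy, dx)
            if err < best_err:
                best, best_err = (dy, dx), err
    return best

def hierarchical_translation(psi1, psi2, levels=3, radius=2):
    """Search at the coarsest level, then propagate and refine."""
    pyr1, pyr2 = [psi1], [psi2]
    for _ in range(levels - 1):
        pyr1.append(downsample(pyr1[-1]))
        pyr2.append(downsample(pyr2[-1]))
    d = (0, 0)
    for level in reversed(range(levels)):
        d = search(pyr1[level], pyr2[level], d, radius)
        if level > 0:
            d = (2 * d[0], 2 * d[1])   # a coarse pixel spans 2 finer pixels
    return d

yy, xx = np.mgrid[0:64, 0:64]

def blob(cy, cx, sigma=6.0):
    return np.exp(-((yy - cy) ** 2 + (xx - cx) ** 2) / (2 * sigma ** 2))

psi1 = blob(30, 36)
psi2 = blob(34, 28)   # same blob displaced by (dy, dx) = (4, -8)
d_hat = hierarchical_translation(psi1, psi2, levels=3, radius=2)
```

With 3 levels and a radius of 2, the search evaluates 25 candidates per level, yet it reaches a displacement of 8 pixels that a single-level search would need a radius of 8 (289 candidates) to cover.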
Regularization
Block matching algorithm (BMA)
- Blocks can have any polygonal shape
- In practice we use squares
- We assume the motion is a translation
The Exhaustive Search Block Matching Algorithm (EBMA)
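Under the block-wise translation model, every pixel of a block $B_m$ shares the same motion vector $\mathbf{d}_m$:
$$\mathbf{w}(\mathbf{x}; \mathbf{a}) = \mathbf{x} + \mathbf{d}_m, \quad \mathbf{x} \in B_m$$
Then the error can be written:
$$E = \sum_m \sum_{\mathbf{x} \in B_m} \left| \psi_2(\mathbf{x} + \mathbf{d}_m) - \psi_1(\mathbf{x}) \right|^p$$
Each inner sum depends on a single $\mathbf{d}_m$, so we can estimate the MV for each block individually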
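A minimal sketch of exhaustive-search block matching under these assumptions: square blocks, integer displacements, and candidates that would leave the frame are skipped. The function name `ebma` and its parameters (`block`, `radius`, `p`) are chosen for this example.

```python
import numpy as np

def ebma(psi1, psi2, block=8, radius=4, p=1):
    """Return one integer motion vector (dy, dx) per block of psi1.

    For each block of the anchor frame psi1, every displacement in
    [-radius, radius]^2 is tried and the one minimizing
    sum |psi2(x + d) - psi1(x)|^p over the block is kept.
    """
    psi1 = psi1.astype(np.float64)
    psi2 = psi2.astype(np.float64)
    h, w = psi1.shape
    mvs = np.zeros((h // block, w // block, 2), dtype=int)
    for bi in range(h // block):
        for bj in range(w // block):
            y, x = bi * block, bj * block
            ref = psi1[y:y + block, x:x + block]
            best_err = np.inf
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    yy, xx = y + dy, x + dx
                    # Skip candidates that fall outside the target frame.
                    if yy < 0 or xx < 0 or yy + block > h or xx + block > w:
                        continue
                    cand = psi2[yy:yy + block, xx:xx + block]
                    err = np.sum(np.abs(cand - ref) ** p)
                    if err < best_err:
                        best_err = err
                        mvs[bi, bj] = (dy, dx)
    return mvs

rng = np.random.default_rng(0)
psi1 = rng.random((32, 32))
psi2 = np.roll(psi1, shift=(2, -3), axis=(0, 1))  # content moved by (2, -3)
mvs = ebma(psi1, psi2, block=8, radius=4)
```

The cost grows with the number of blocks times `(2 * radius + 1) ** 2` candidates times the block area, which is why faster or hierarchical searches are often preferred. Blocks on the frame border may not recover the true vector because the displaced block leaves the frame.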

The displacement at a point inside a block is a weighted sum of the displacements at its 4 corners
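A sketch of this weighted sum using bilinear weights. The choice of bilinear interpolation, the function name `bilinear_displacement`, the normalized coordinates `(u, v)` and the corner ordering are assumptions made for this example; the notes do not state which weights are used.

```python
import numpy as np

def bilinear_displacement(u, v, corner_mvs):
    """Interpolate the displacement inside a block from its 4 corner MVs.

    u, v are coordinates in [0, 1] inside the block (u horizontal,
    v vertical). corner_mvs has shape (4, 2), ordered
    top-left, top-right, bottom-left, bottom-right.
    """
    weights = np.array([(1 - u) * (1 - v),   # top-left
                        u * (1 - v),         # top-right
                        (1 - u) * v,         # bottom-left
                        u * v])              # bottom-right
    return weights @ np.asarray(corner_mvs, dtype=np.float64)

corners = [[0.0, 0.0], [2.0, 0.0], [0.0, 4.0], [2.0, 4.0]]
d_corner = bilinear_displacement(1.0, 0.0, corners)   # exactly the top-right MV
d_center = bilinear_displacement(0.5, 0.5, corners)   # average of the 4 MVs
```

The weights are non-negative and sum to 1, so the interpolated displacement equals the nodal MV at each corner and varies continuously in between.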
Node-based motion representation
- Nodal MVs vs Polynomial coefficients
Motion estimation using node-based model
where:
Mesh-based motion estimation
- With blocks: each block is estimated independently and deformed on its own
- Mesh: we lay a mesh over the image and let its elements move together

Constraint to keep in mind: we do not want two mesh elements (squares) to flip over each other
- There are often discontinuities at edges
- The more nodes we add, the more accurate the estimate
- But the computational cost explodes
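One way to test the no-flip constraint, given here as a sketch: after displacing the nodes, check that each quadrilateral element is still convex with the same orientation, using the sign of the cross product at each corner. The function name `quad_is_valid` and the counter-clockwise vertex ordering (in an x-right, y-up frame) are assumptions made for this example.

```python
import numpy as np

def quad_is_valid(nodes):
    """True if the quadrilateral is convex and counter-clockwise.

    nodes: (4, 2) array of (x, y) vertices listed in order around
    the element.
    """
    p = np.asarray(nodes, dtype=np.float64)
    edges = np.roll(p, -1, axis=0) - p       # edge i goes from p[i] to p[i+1]
    nxt = np.roll(edges, -1, axis=0)         # the edge that follows edge i
    cross = edges[:, 0] * nxt[:, 1] - edges[:, 1] * nxt[:, 0]
    return bool(np.all(cross > 0))

square = np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 1.0], [0.0, 1.0]])

small_move = square.copy()
small_move[2] = [1.2, 0.9]      # a node moved slightly: element still valid

flipped = square.copy()
flipped[2] = [-0.5, -0.5]       # a node pushed across the opposite corner

ok_before = quad_is_valid(square)
ok_small = quad_is_valid(small_move)
ok_flipped = quad_is_valid(flipped)
```

In a mesh-based search, a candidate nodal displacement that makes any adjacent element fail this test would simply be rejected.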
Global motion estimation
Several methods exist
Are we in the case where only the camera moves?
In football and tennis broadcasts, a large part of the scene is static
Region-based motion estimation
Do we segment into regions first, or estimate the motion first?
3 possible approaches
Multi-resolution motion estimation
- Various ME approaches can be reduced to solving an error minimization problem
- Major difficulties
- Many local minima in the gradient-descent case
- Not easy to reach the global minimum
- High computational cost

Laplacian pyramid: the image is decomposed into frequency bands
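A simplified sketch of this decomposition. It uses 2x2 averaging and pixel replication in place of the Gaussian filtering of the usual Laplacian pyramid, and it assumes image sides divisible by `2 ** (levels - 1)`; the function names are chosen for this example.

```python
import numpy as np

def downsample(img):
    """Halve the resolution by averaging 2x2 neighbourhoods."""
    return 0.25 * (img[0::2, 0::2] + img[1::2, 0::2]
                   + img[0::2, 1::2] + img[1::2, 1::2])

def upsample(img):
    """Double the resolution by pixel replication."""
    return np.repeat(np.repeat(img, 2, axis=0), 2, axis=1)

def laplacian_pyramid(img, levels=3):
    """Return [detail_0, ..., detail_{levels-2}, low_pass], finest first."""
    bands = []
    current = img.astype(np.float64)
    for _ in range(levels - 1):
        low = downsample(current)
        bands.append(current - upsample(low))  # what the coarser level misses
        current = low
    bands.append(current)                      # coarsest low-pass image
    return bands

def reconstruct(bands):
    """Invert the decomposition by adding the detail bands back."""
    current = bands[-1]
    for band in reversed(bands[:-1]):
        current = upsample(current) + band
    return current

rng = np.random.default_rng(0)
img = rng.random((16, 16))
bands = laplacian_pyramid(img, levels=3)
rec = reconstruct(bands)
```

Keeping only the last band gives a low-resolution image, and each detail band added back restores a higher frequency range, which ties in with the spatial scalability described at the start of these notes.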