Introduction
Single-view Geometry
Camera Introduction
Camera Calibration
Camera Models (TODO)
Image Transformation
Single-view Metrology
Perspective-n-Points / PnP Problem
Jacky Yeh changed a year ago
Image rectification transforms a stereo image pair so that corresponding epipolar lines are perfectly aligned horizontally across the two images.
Recall that in the epipolar geometry post, we mentioned two goals of epipolar geometry: solving the correspondence problem and triangulation. In fact, both problems become easier once the image planes are rectified (aligned).
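To make the "easier triangulation" point concrete, here is a minimal sketch. It assumes an ideal rectified pair with focal length `focal_px` (in pixels) and baseline `baseline_m` (in meters); these names and the sample numbers are hypothetical. Under that assumption, depth follows from the horizontal disparity alone via the standard relation $Z = fB/d$.

```python
def depth_from_disparity(focal_px: float, baseline_m: float, disparity_px: float) -> float:
    """Depth of a point seen by an ideal rectified stereo pair: Z = f * B / d."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a point in front of both cameras")
    return focal_px * baseline_m / disparity_px


# After rectification a matched pixel pair lies on the same image row,
# so only the horizontal (column) coordinates differ.
u_left, u_right = 420.0, 400.0  # hypothetical pixel columns of one match
depth = depth_from_disparity(focal_px=700.0, baseline_m=0.1,
                             disparity_px=u_left - u_right)
print(f"depth = {depth:.2f} m")
```

The correspondence search also shrinks from a 2D search to a 1D scan along a single row, which is why rectification is usually done before stereo matching.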
Easier to reconstruct depth
Top-view stereo vision
Hey there! I'm currently a master's student in Robotics at UIUC.
The motivation for this book came out of my desire for a gentle introduction to 3D vision.
While learning 3D vision, I found it hard to understand and connect some of the essential components. Even though abundant resources are available on the Internet, I felt overwhelmed and frustrated when I was new to this field. This led me to ponder whether there is a better way to organize the material and give beginners a smoother learning path.
This book is still a work in progress. Please feel free to reach out to me with any questions, and you are welcome to point out mistakes in the articles. My email:
jackyyeh511@gmail.com
Types of Reconstruction Problem
:::success
The homography matrix describes the mapping of points lying on the same ==plane== between different images.
:::
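As a small illustration of that definition, the sketch below maps a few planar points from one image to another through a 3x3 homography. The matrix `H` and the points are made up for the example, and `apply_homography` is a hypothetical helper, not part of any library.

```python
import numpy as np


def apply_homography(H: np.ndarray, pts: np.ndarray) -> np.ndarray:
    """Map Nx2 image points through a 3x3 homography H."""
    pts_h = np.hstack([pts, np.ones((pts.shape[0], 1))])  # lift to homogeneous (x, y, 1)
    mapped = pts_h @ H.T                                   # apply H to every point
    return mapped[:, :2] / mapped[:, 2:3]                  # divide by w to return to 2D


# Hypothetical homography: scale by 2, then shift by (10, 5).
H = np.array([[2.0, 0.0, 10.0],
              [0.0, 2.0, 5.0],
              [0.0, 0.0, 1.0]])
corners = np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 1.0]])
warped = apply_homography(H, corners)
print(warped)
```

Because of the final division by $w$, multiplying `H` by any nonzero scalar gives the same mapping, which is why a homography has 8 degrees of freedom rather than 9.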
Application
Before delving in, let's first look at some cool applications of the homography transformation!
Application 1: align pictures
Transformation in 2D
Let us start with 2D space and extend the concept to 3D space later. Note that we use homogeneous coordinates in this post, so a matrix $M_{3 \times 3}$ is actually a 2D -> 2D transformation.
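As a worked example of why a $3 \times 3$ matrix acts on 2D points, consider a pure translation by $(t_x, t_y)$ (symbols chosen here for illustration). In homogeneous coordinates the shift becomes a single matrix product:

$$
\begin{bmatrix} x' \\ y' \\ 1 \end{bmatrix}
=
\begin{bmatrix} 1 & 0 & t_x \\ 0 & 1 & t_y \\ 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} x \\ y \\ 1 \end{bmatrix}
=
\begin{bmatrix} x + t_x \\ y + t_y \\ 1 \end{bmatrix}
$$

Without the extra coordinate, translation could not be written as a matrix multiplication, so it could not be chained with rotations and scalings by simply multiplying matrices.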
| Name | Matrix | #DoF | Preserves | Shape |
| --- | --- | --- | --- | --- |
| translation | | | | |
Problem Definition
Perspective-n-Point is the problem of estimating the pose of a calibrated camera given a set of n 3D points in the world and their corresponding 2D points in the image.
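To make the problem statement concrete, the sketch below shows the forward model that PnP inverts: given a pose, project the 3D points and compare them with the observed 2D points. It is not a PnP solver. The intrinsics, pose, and points are hypothetical, and the convention $x \sim K(RX + t)$ is an assumption of this example.

```python
import numpy as np


def project(K, R, t, X):
    """Project Nx3 world points to Nx2 pixels using x ~ K (R X + t)."""
    Xc = X @ R.T + t                 # world -> camera coordinates
    uvw = Xc @ K.T                   # camera -> homogeneous pixel coordinates
    return uvw[:, :2] / uvw[:, 2:3]  # perspective division


def reprojection_error(K, R, t, X, x_obs):
    """Mean pixel distance between observed and reprojected points."""
    return float(np.mean(np.linalg.norm(project(K, R, t, X) - x_obs, axis=1)))


# Hypothetical calibrated camera and ground-truth pose.
K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])
R_true = np.eye(3)
t_true = np.array([0.0, 0.0, 5.0])
X = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [1.0, 1.0, 1.0]])
x_obs = project(K, R_true, t_true, X)  # synthetic, noise-free observations

err_true = reprojection_error(K, R_true, t_true, X, x_obs)
err_wrong = reprojection_error(K, R_true, t_true + np.array([0.5, 0.0, 0.0]), X, x_obs)
print(err_true, err_wrong)
```

A PnP solver searches for the `R` and `t` that drive this error toward zero; the correct pose gives zero error on noise-free data, while a perturbed pose does not.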
:::success
Given:
Intrinsic matrix $K$
:::info
Find a camera's intrinsic and extrinsic parameters, which describe the projection from the 3D world coordinate system to the 2D image coordinate system.
:::
image
Perspective Projection
Converting from camera coordinates $(x_{c}, y_{c}, z_{c})$ to image coordinates $(u, v)$ involves three tasks:
Length projection using the pinhole camera formula.
Previously, we explored the computation of a camera's intrinsic and extrinsic parameters through one or more views using standard camera calibration or single-view metrology procedures. These methods allowed us to extract information about the 3D world from a single image. However, ==it is generally impractical to reconstruct the complete structure of the 3D world solely from a single image==. This limitation arises from the intrinsic ambiguity inherent in mapping 3D to 2D: certain details are inevitably lost.
For example, the following picture of a man who appears to be holding up the Leaning Tower of Pisa shows how a single image can be ambiguous.
image
The ambiguity can be resolved by observing the same 3D scene from multiple views. When there are two views, the resulting geometry is called epipolar geometry.
What is epipolar geometry?
Definition & Terms
:::success
We now go beyond the geometry of two cameras to multiple cameras. With points observed from multiple views, we can simultaneously determine both the 3D structure of the scene and the camera motion. This problem is known as structure from motion.
Problem Definition
:::success
Given:
$m$ images of $n$ fixed 3D points
$mn$ 3D -> 2D correspondences
$$
x_{ij} = M_{i}X_{j} ~~~~~~~~~~~~~~~ i=1 \ldots m; ~~ j = 1 \ldots n
Homogeneous Systems of Linear Equations
In the homogeneous system of linear equations, the constant term in every equation is equal to 0.
The solution can be divided into the following steps:
A homogeneous system has the form:
$$
Ax = 0
$$
We need to impose a constraint to avoid the trivial solution ($x=0$). Possible constraints include:
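For instance, assuming the unit-norm constraint $\|x\| = 1$ (one common choice, picked here as an assumption), the minimizer of $\|Ax\|$ is the right singular vector of $A$ associated with its smallest singular value. A minimal sketch with a made-up matrix `A`:

```python
import numpy as np


def solve_homogeneous(A: np.ndarray) -> np.ndarray:
    """Return the unit vector x that minimizes ||A x|| subject to ||x|| = 1."""
    _, _, Vt = np.linalg.svd(A)  # rows of Vt are right singular vectors
    return Vt[-1]                # the one paired with the smallest singular value


# Hypothetical system: x1 = x2 and x2 = x3, so solutions are multiples of (1, 1, 1).
A = np.array([[1.0, -1.0, 0.0],
              [0.0, 1.0, -1.0]])
x = solve_homogeneous(A)
print(x, np.linalg.norm(A @ x))
```

The sign of `x` is arbitrary, since both `x` and `-x` satisfy the constraint. With noisy data `A @ x` will not be exactly zero, and the same call returns the least-squares solution.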
Pinhole Camera
Let's design a camera. The simplest way is to carve a hole in a curtain. Light rays coming from the object pass through this pinhole and project onto the image plane.
The relationship between the projected height and the real object's height is written as follows:
$$
y = f \frac{Y}{Z}
$$
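A quick numeric check of this formula, as a sketch with made-up values (a 50 mm focal length and a 1.8 m tall object); `project_height` is a hypothetical helper name:

```python
def project_height(f: float, Y: float, Z: float) -> float:
    """Projected height y = f * Y / Z for an object of height Y at distance Z."""
    if Z <= 0:
        raise ValueError("the object must be in front of the pinhole (Z > 0)")
    return f * Y / Z


f = 0.05                                # focal length in meters (hypothetical)
y_near = project_height(f, Y=1.8, Z=10.0)
y_far = project_height(f, Y=1.8, Z=20.0)
print(f"{y_near * 1000:.1f} mm at 10 m, {y_far * 1000:.1f} mm at 20 m")
```

Doubling the distance halves the projected height, which is the familiar perspective effect of far objects appearing smaller.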
The term "Visual Odometry" was originally coined by analogy with wheel odometry, a method that calculates a vehicle's motion by integrating the rotations of its wheels over time. Similarly, visual odometry incrementally estimates the pose of a vehicle by analyzing how its motion changes the images captured by its onboard cameras.
When I first encountered the term "visual odometry", I was quite confused, especially when trying to understand how it differs from VSLAM and SFM. So let us first clarify the relationship between these confusing terms:
Finally, from Prof. Davide's great slides, I realized that visual odometry is another way of saying ==sequential SFM==. The structure of this post is also mainly based on those slides.
The overall VO pipeline is shown below. This post mainly focuses on untangling the different types of motion estimation algorithms (2D-2D, 3D-3D, 3D-2D), leaving the remaining components to other posts.
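Whichever motion estimation algorithm is used, the "incremental" part of VO amounts to chaining relative motions into a trajectory. The sketch below assumes each relative motion is a 4x4 rigid transform expressing the new camera frame in the previous camera frame. That convention and the helper names `se3` and `integrate` are assumptions of this example, not taken from the slides.

```python
import numpy as np


def se3(R: np.ndarray, t) -> np.ndarray:
    """Build a 4x4 rigid transform from a 3x3 rotation and a translation."""
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = t
    return T


def integrate(relative_motions):
    """Chain frame-to-frame motions into camera-to-world poses."""
    pose = np.eye(4)                 # the first camera frame defines the world frame
    trajectory = [pose.copy()]
    for T_rel in relative_motions:
        pose = pose @ T_rel          # accumulate the newest relative motion
        trajectory.append(pose.copy())
    return trajectory


# Hypothetical motion: 1 m forward (+z), turn 90 degrees about y, 1 m forward again.
R_y90 = np.array([[0.0, 0.0, 1.0],
                  [0.0, 1.0, 0.0],
                  [-1.0, 0.0, 0.0]])
motions = [se3(np.eye(3), [0.0, 0.0, 1.0]),
           se3(R_y90, [0.0, 0.0, 0.0]),
           se3(np.eye(3), [0.0, 0.0, 1.0])]
trajectory = integrate(motions)
print(trajectory[-1][:3, 3])         # final camera position in the world frame
```

Because each pose is built on the previous one, any error in a single relative motion propagates to all later poses, which is the drift that wheel odometry also suffers from.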