# Group[5]_HW[1]_report

## Files in the zip

## Introduction

This work hand-crafts the transformation between a chessboard (the 3D world) and the camera system in order to implement camera calibration. The procedure can be roughly divided into three stages, which are explored further in the following section.

1. Find the homography matrix $H_i$ for each image
    - Calculate $H_i$ from $x \sim H_i X$ using the DLT algorithm
2. Find the intrinsic matrix $K$
    - Calculate $B$ by solving $Vb=0$
    - Calculate $K$ from $B=K^{-T}K^{-1}$ (the code uses the closed-form expressions for the entries of $K$ rather than an explicit Cholesky factorization)
3. Find the extrinsic matrix $[R|t]$ for each image
    - Calculate $[R|t]$ from $H_i \sim K[r_1\ r_2\ t]$

## Implementation Procedure

### 1. Use the points in each image to find Hi

```python
import numpy as np


def find_homographies_dlt(objpoints, imgpoints):
    """
    Calculates the homography matrix Hi of each image using the
    Direct Linear Transform (DLT) algorithm.

    Args:
        objpoints: A list of numpy arrays, one per image, each containing
            the 3D object points in the world coordinate system.
        imgpoints: A list of numpy arrays, one per image, each containing
            the 2D image points in pixel coordinates.

    Returns:
        A list of numpy arrays, where each array is a (3, 3) homography
        matrix Hi for one image.
    """
    homographies = []

    for i in range(len(objpoints)):
        objpoints_i = np.asarray(objpoints[i], dtype=float).reshape(-1, 3)
        # Image points may arrive with shape (N, 1, 2); flatten to (N, 2)
        imgpoints_i = np.asarray(imgpoints[i], dtype=float).reshape(-1, 2)
        N = objpoints_i.shape[0]

        # Construct the A matrix (without normalization)
        A = np.zeros((2 * N, 9))
        for j in range(N):
            x, y, _ = objpoints_i[j]  # z is unused: the board is planar
            u, v = imgpoints_i[j]
            A[2 * j] = [-x, -y, -1, 0, 0, 0, u * x, u * y, u]
            A[2 * j + 1] = [0, 0, 0, -x, -y, -1, v * x, v * y, v]

        # Solve Ah = 0 with SVD: np.linalg.svd returns V transposed, so the
        # solution is its last row (smallest singular value)
        _, _, Vt = np.linalg.svd(A)
        H = Vt[-1].reshape(3, 3)

        # Normalize the homography matrix (with a check for zero)
        if H[2, 2] != 0:
            H = H / H[2, 2]
        else:
            print(f"Warning: H[2, 2] is zero for image {i+1}. Skipping normalization.")

        homographies.append(H)

    return homographies
```

### 2. Use Hi to find the intrinsic matrix K

```python
import math

import numpy as np


def calculate_v(H, i, j):
    """
    Builds the vector vij from columns i and j of H (0-based indices).

    hij denotes the j-th element of the i-th column of H. NumPy indexing
    puts the row index first and the column index second, so with the
    1-based indices of the formula: hij = H[j-1][i-1].
    """
    value = [
        H[0, i] * H[0, j],
        H[0, i] * H[1, j] + H[1, i] * H[0, j],
        H[1, i] * H[1, j],
        H[2, i] * H[0, j] + H[0, i] * H[2, j],
        H[2, i] * H[1, j] + H[1, i] * H[2, j],
        H[2, i] * H[2, j],
    ]
    return np.array(value)


def find_intrinsic(homographies):
    """
    Step 1: Get B from H.
        B is symmetric, so it can be reduced to
        b = [B11, B12, B22, B13, B23, B33]^T, where

            B = [ B11 B12 B13
                  B12 B22 B23
                  B13 B23 B33 ]

        vij = [ hi1*hj1, hi1*hj2+hi2*hj1, hi2*hj2,
                hi3*hj1+hi1*hj3, hi3*hj2+hi2*hj3, hi3*hj3 ]

        Note that both indices (column and row) of H are reduced by one
        in the code.
        Since Vb = 0, solving for b gives B.

    Step 2: Get K from B.
        B = (K^-T)(K^-1), up to the scale factor lam
    """
    V = []
    for H in homographies:
        v12 = calculate_v(H, 0, 1)
        v11 = calculate_v(H, 0, 0)
        v22 = calculate_v(H, 1, 1)
        V.append(v12)          # constraint h1^T B h2 = 0
        V.append(v11 - v22)    # constraint h1^T B h1 = h2^T B h2
    V = np.array(V)

    # b is the right singular vector of the smallest singular value
    _, _, vh = np.linalg.svd(V)
    b = vh[-1]
    print("b-norm:")
    print(np.linalg.norm(b))  # should be 1

    B = [
        [b[0], b[1], b[3]],
        [b[1], b[2], b[4]],
        [b[3], b[4], b[5]],
    ]
    print("B")
    print(B)

    # Closed-form recovery of the intrinsic parameters from b
    y0 = (b[1] * b[3] - b[0] * b[4]) / (b[0] * b[2] - b[1] * b[1])
    lam = b[5] - (b[3] * b[3] + y0 * (b[1] * b[3] - b[0] * b[4])) / b[0]
    alpha_x = math.sqrt(lam / b[0])
    alpha_y = math.sqrt(lam * b[0] / (b[0] * b[2] - b[1] * b[1]))
    s = -(b[1] * alpha_x * alpha_x * alpha_y) / lam
    x0 = (s * y0 / alpha_y) - (b[3] * alpha_x * alpha_x / lam)

    # Note: the skew s is used to compute x0, but K[0][1] is set to 0
    K = [
        [alpha_x, 0, x0],
        [0, alpha_y, y0],
        [0, 0, 1],
    ]
    print("K")
    print(K)
    return np.array(K)
```

### 3. Find the extrinsic matrix of each image

```python
import numpy as np


def find_extrinsic(homographies, intrinsic):
    '''
    lambda * H = lambda * [h1 h2 h3] = K * [r1 r2 t]
    r3 = r1 x r2
    '''
    intrinsic_inv = np.linalg.inv(intrinsic)
    extrinsics = []

    for h in homographies:
        h1 = h[:, 0]
        h2 = h[:, 1]
        h3 = h[:, 2]

        # Scale factor that makes r1 a unit vector
        _lambda = 1 / (np.linalg.norm(np.dot(intrinsic_inv, h1)))

        r1 = _lambda * np.dot(intrinsic_inv, h1)
        r2 = _lambda * np.dot(intrinsic_inv, h2)
        r3 = np.cross(r1, r2)
        t = _lambda * np.dot(intrinsic_inv, h3)

        # Stack the columns into a (3, 4) matrix [r1 r2 r3 t]
        extrinsic = np.array([r1, r2, r3, t]).transpose()
        extrinsics.append(extrinsic)

    extrinsics = np.array(extrinsics)
    return extrinsics
```

## Experimental Result

- We ran two experiments on two sets of data. One is provided by the TAs, and the other was captured by us. Ours consists of 10 images, which can be found in the attached folder named "my_data".
- The figures in the subsections below visualize the camera layout for each dataset, showing the position and angle of each camera. The red frame marks where the chessboard is located.
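Besides the visual layout, the recovered parameters could also be checked numerically with a reprojection error. The sketch below is not part of our submitted code: the helper names `project_planar` and `reprojection_rmse`, the values of `K`, `R`, and `t`, and the 7x7 board are all hypothetical and chosen only for illustration. It projects planar board points with $x \sim K[r_1\ r_2\ t]X$ and reports the RMS pixel distance to the observed corners.

```python
import numpy as np


def project_planar(K, extrinsic, board_xy):
    """Project board points (Z = 0) to pixels using x ~ K [r1 r2 t] X."""
    H = K @ extrinsic[:, [0, 1, 3]]  # r3 drops out because Z = 0
    pts_h = np.column_stack([board_xy, np.ones(len(board_xy))])
    proj = (H @ pts_h.T).T
    return proj[:, :2] / proj[:, 2:3]  # divide by the homogeneous coordinate


def reprojection_rmse(K, extrinsic, board_xy, observed_uv):
    """RMS pixel distance between projected and observed corners."""
    diff = project_planar(K, extrinsic, board_xy) - observed_uv
    return float(np.sqrt(np.mean(np.sum(diff ** 2, axis=1))))


# Hypothetical camera and pose (assumed values, not our calibration results)
K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])
theta = np.deg2rad(20.0)
R = np.array([[np.cos(theta), 0.0, np.sin(theta)],
              [0.0, 1.0, 0.0],
              [-np.sin(theta), 0.0, np.cos(theta)]])
t = np.array([-3.0, -2.0, 15.0])
extrinsic = np.column_stack([R, t])  # (3, 4), same layout as find_extrinsic

# Hypothetical 7x7 grid of board corners with unit spacing
xs, ys = np.meshgrid(np.arange(7), np.arange(7))
board_xy = np.column_stack([xs.ravel(), ys.ravel()]).astype(float)

# Synthetic "observations": exact projections, then with pixel noise added
observed_uv = project_planar(K, extrinsic, board_xy)
rmse_clean = reprojection_rmse(K, extrinsic, board_xy, observed_uv)

rng = np.random.default_rng(0)
noisy_uv = observed_uv + rng.normal(scale=0.5, size=observed_uv.shape)
rmse_noisy = reprojection_rmse(K, extrinsic, board_xy, noisy_uv)

print("RMSE without noise:", rmse_clean)
print("RMSE with 0.5 px noise:", rmse_noisy)
```

With real data, `K` would come from `find_intrinsic`, each `extrinsic` from `find_extrinsic`, and `observed_uv` from the detected corners of the matching image. A small error in pixels would suggest that the three stages are consistent with each other.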
### (1) Data from the TAs

![Ta](https://hackmd.io/_uploads/ryXbmU6CA.png)

### (2) Our data

- Note that since the chessboard sizes of the two datasets are different, two variables need to be changed as below.

> corner_x = 7
> corner_y = 10

- Also, change the folder path to read the dataset.

> images = glob.glob('my_data/*.jpg')

![test](https://hackmd.io/_uploads/HJObmUaRA.png)

## Discussion

Camera calibration is essential for many computer vision tasks. This homework implements camera calibration in two ways: with the OpenCV library and with an algorithm built from scratch. Below we compare the two approaches at each stage.

**1. Finding Homographies (Hi)**

* **OpenCV**
  OpenCV provides `findChessboardCorners` to detect corners in a checkerboard pattern and `findHomography` to estimate the homography matrix from those corners. These functions are optimized, and `findHomography` also offers robust estimators such as RANSAC for noisy correspondences.
* **From Scratch**
  A fully from-scratch version would also detect the corners manually (e.g., with a Harris corner detector). Our `find_homographies_dlt` takes the detected corner correspondences as input and applies the Direct Linear Transform (DLT) to estimate each homography. It builds the matrix $A$ without normalizing the point coordinates, so it requires a deeper understanding of the underlying mathematics and is more sensitive to noise.

**2. Estimating the Intrinsic Matrix (K)**

* **OpenCV**
  OpenCV's `calibrateCamera` function estimates the intrinsic matrix together with the distortion coefficients. It is based on Zhang's method: a closed-form initial estimate followed by Levenberg-Marquardt refinement that minimizes the reprojection error.
* **From Scratch**
  Our implementation uses the homographies and the constraints on the image of the absolute conic to set up $Vb=0$, then solves for the intrinsic parameters in closed form. It stops at this closed-form estimate and does not model lens distortion, so it is simpler than OpenCV's pipeline but can be less accurate.

**3. Determining Extrinsic Matrices**

* **OpenCV**
  The `calibrateCamera` function also returns the extrinsic parameters (a rotation vector and a translation vector) for each image as part of its output.
* **From Scratch**
  Once the intrinsic matrix is known, our implementation obtains $r_1$, $r_2$, and $t$ by scaling $K^{-1}h_1$, $K^{-1}h_2$, and $K^{-1}h_3$, and sets $r_3 = r_1 \times r_2$. Because of noise, the resulting $[r_1\ r_2\ r_3]$ is generally not an exact rotation matrix. It can be projected onto the nearest rotation with an SVD, which our code does not do.

## Conclusion

- We successfully implemented the camera calibration process from scratch. By deriving the homography for each image, we were able to compute both the intrinsic and extrinsic camera parameters.
- We used images taken with our cellphone's camera to further validate the results. Additionally, by visualizing the extrinsic parameters, we can see each camera's position and its relation to the chessboard.
- Through this homework, we gained a deeper understanding of the steps in the camera calibration process and the underlying mathematical computations.

## Work Assignment Plan Between Team Members

- 313551011 李佾: coding (homography matrix), composing report
- 313554015 周禹彤: coding (intrinsic matrix), composing report
- 313560007 蕭皓隆: coding (extrinsic matrix), composing report