Camera Matrix and RQ Decomposition

The camera matrix (or projection matrix) $P$ is a $3 \times 4$ matrix that directly maps 3D world points to 2D pixel coordinates. Estimating $P$ via DLT and decomposing it into intrinsics and extrinsics via RQ decomposition provides an alternative initialization path.

Camera Matrix DLT

Problem Statement

Given: $N \geq 6$ correspondences between 3D world points $\{P_{i}\}$ and 2D pixel points $\{p_{i}\}$.

Find: $3 \times 4$ projection matrix $P$ such that $p_{i} \sim P [P_{i}, 1]^{T}$.
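To make the relation $p_{i} \sim P [P_{i}, 1]^{T}$ concrete, the following sketch applies a $3 \times 4$ matrix to a world point and divides by the third homogeneous component. The function name `project` and the plain-array types are illustrative assumptions, not part of the library API.

```rust
/// Project a 3D world point through a 3x4 camera matrix `p` (row-major).
/// Hypothetical helper for illustration only.
/// Returns `None` when the homogeneous scale is (near) zero, in which
/// case the point has no finite pixel position.
pub fn project(p: &[[f64; 4]; 3], x: [f64; 3]) -> Option<[f64; 2]> {
    // Homogeneous world point [X, Y, Z, 1].
    let xh = [x[0], x[1], x[2], 1.0];
    let mut q = [0.0_f64; 3];
    for (row, out) in p.iter().zip(q.iter_mut()) {
        *out = row.iter().zip(xh.iter()).map(|(a, b)| a * b).sum();
    }
    if q[2].abs() < 1e-12 {
        return None;
    }
    // The "~" in the text means equality up to scale: divide it out.
    Some([q[0] / q[2], q[1] / q[2]])
}
```

Multiplying `p` by any non-zero scalar leaves the returned pixel unchanged, which is why $P$ is only defined up to scale.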

Derivation

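The projection $p \sim P P_{h}$, with $P_{h} = [P_{i}, 1]^{T}$ the world point in homogeneous coordinates and $p = (u, v)$ the pixel, gives two equations after eliminating the unknown scale by cross-multiplication:

$u (p_{3}^{T} P_{h}) - p_{1}^{T} P_{h} = 0$

$v (p_{3}^{T} P_{h}) - p_{2}^{T} P_{h} = 0$

where $p_{j}^{T}$ is the $j$-th row of $P$ (not to be confused with the pixel points $p_{i}$).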
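Both equations are linear in the 12 entries of $P$, so each correspondence contributes two rows of a coefficient matrix. A minimal sketch of those rows, assuming the entries of $P$ are stacked row by row (the name `dlt_rows` and the array layout are assumptions made for this example):

```rust
/// Build the two DLT rows for one 3D-2D correspondence.
/// Unknown ordering (assumed): [p1 (4 entries), p2 (4 entries), p3 (4 entries)],
/// i.e. the rows of P stacked one after another.
pub fn dlt_rows(world: [f64; 3], pixel: [f64; 2]) -> [[f64; 12]; 2] {
    let xh = [world[0], world[1], world[2], 1.0];
    let (u, v) = (pixel[0], pixel[1]);
    let mut rows = [[0.0_f64; 12]; 2];
    for j in 0..4 {
        // u * (p3 . Ph) - (p1 . Ph) = 0
        rows[0][j] = -xh[j];
        rows[0][8 + j] = u * xh[j];
        // v * (p3 . Ph) - (p2 . Ph) = 0
        rows[1][4 + j] = -xh[j];
        rows[1][8 + j] = v * xh[j];
    }
    rows
}
```

For an exact correspondence, the dot product of each row with the stacked entries of the true $P$ is zero.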

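Stacking both equations for all $N$ correspondences gives a $2 N \times 12$ homogeneous system $A p = 0$, where $p$ now denotes the 12 entries of $P$ stacked row by row. $P$ has 11 degrees of freedom (12 entries, defined up to scale) and each correspondence contributes two equations, hence the requirement $N \geq 6$. The system is solved via SVD, taking the right singular vector associated with the smallest singular value, with Hartley normalization of the 3D and 2D points applied beforehand to improve conditioning.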
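Hartley normalization is usually described as shifting the points so their centroid is at the origin and scaling them so the mean distance from the origin is $\sqrt{2}$ in 2D (and $\sqrt{3}$ in 3D). The sketch below shows the 2D case under that assumption; the exact targets used by the library are not stated here, and `normalize_2d` is a hypothetical name.

```rust
/// Hartley-style normalization of 2D points (illustrative sketch).
/// Returns the 3x3 similarity transform T and the normalized points,
/// or `None` if the input is empty or all points coincide.
pub fn normalize_2d(pts: &[[f64; 2]]) -> Option<([[f64; 3]; 3], Vec<[f64; 2]>)> {
    if pts.is_empty() {
        return None;
    }
    let n = pts.len() as f64;
    let cx = pts.iter().map(|p| p[0]).sum::<f64>() / n;
    let cy = pts.iter().map(|p| p[1]).sum::<f64>() / n;
    // Mean Euclidean distance from the centroid.
    let mean_dist = pts
        .iter()
        .map(|p| ((p[0] - cx).powi(2) + (p[1] - cy).powi(2)).sqrt())
        .sum::<f64>()
        / n;
    if mean_dist < 1e-12 {
        return None;
    }
    // Assumed target: mean distance sqrt(2) after scaling.
    let s = 2.0_f64.sqrt() / mean_dist;
    let t = [[s, 0.0, -s * cx], [0.0, s, -s * cy], [0.0, 0.0, 1.0]];
    let out = pts.iter().map(|p| [s * (p[0] - cx), s * (p[1] - cy)]).collect();
    Some((t, out))
}
```

A matrix estimated from normalized points has to be mapped back with the inverse 2D transform and the 3D transform before it can be applied to the original coordinates.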

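RQ Decomposition

Problem Statement

Write $P = [M \mid p_{4}]$, where $M$ is the left $3 \times 3$ block of $P$ and $p_{4}$ its fourth column. Since $P = K [R \mid t]$, we need a factorization $M = K R$ with $K$ upper-triangular and $R$ orthogonal.

Algorithm

Let $J$ be the exchange matrix (ones on the anti-diagonal), a permutation matrix with $J = J^{T} = J^{- 1}$.

- Compute the QR decomposition of $(J M)^{T}$: $(J M)^{T} = Q \hat{R}$
- Then $J M = \hat{R}^{T} Q^{T}$, where $\hat{R}^{T}$ is lower-triangular and $Q^{T}$ is orthogonal
- Insert $J J = I$ to flip the lower-triangular factor to upper-triangular form, $M = (J \hat{R}^{T} J) (J Q^{T})$:
  - $K = J \hat{R}^{T} J$ (upper-triangular)
  - $R = J Q^{T}$ (orthogonal)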
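An equivalent way to see the factorization $M = K R$ is to run Gram-Schmidt on the rows of $M$ from bottom to top: the last row of $M$ is a multiple of the last row of $R$, the second row mixes the last two rows of $R$, and so on. The sketch below uses that row-wise formulation instead of a library QR routine. It is not necessarily how `rq_decompose` is implemented, and `rq_rows` and `dot3` are hypothetical names.

```rust
fn dot3(a: &[f64; 3], b: &[f64; 3]) -> f64 {
    a[0] * b[0] + a[1] * b[1] + a[2] * b[2]
}

/// RQ decomposition of a 3x3 matrix via bottom-up Gram-Schmidt on its rows.
/// Returns (K, R) with K upper-triangular, R orthogonal and M = K R,
/// or `None` if M is (numerically) singular.
pub fn rq_rows(m: &[[f64; 3]; 3]) -> Option<([[f64; 3]; 3], [[f64; 3]; 3])> {
    let mut k = [[0.0_f64; 3]; 3];
    let mut r = [[0.0_f64; 3]; 3];
    for i in (0..3).rev() {
        // Remove the components along the rows of R already found (j > i).
        let mut v = m[i];
        for j in (i + 1)..3 {
            let c = dot3(&m[i], &r[j]);
            k[i][j] = c;
            for a in 0..3 {
                v[a] -= c * r[j][a];
            }
        }
        // What remains defines row i of R and the diagonal entry of K.
        let norm = dot3(&v, &v).sqrt();
        if norm < 1e-12 {
            return None;
        }
        k[i][i] = norm;
        for a in 0..3 {
            r[i][a] = v[a] / norm;
        }
    }
    Some((k, r))
}
```

Because each diagonal entry is a vector norm, this variant produces a $K$ with positive diagonal directly. The determinant of $R$ can still be $-1$ when $\det(M) < 0$.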

Sign Conventions

After decomposition, ensure:

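- $K$ has positive diagonal entries: if $K_{ii} < 0$, negate column $i$ of $K$ and row $i$ of $R$. This leaves the product $K R$ unchanged.
- $\det(R) = + 1$: if $\det(R) = - 1$, negate $R$ as a whole. This replaces $M$ by $- M$, which is allowed because $P$ is defined only up to scale; $p_{4}$ must be negated along with it so that the whole of $P$ changes sign.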
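The diagonal rule amounts to inserting a diagonal matrix of $\pm 1$ entries between the two factors, which cancels in the product. A small sketch of that step, together with a determinant helper for checking the second condition (`fix_diagonal_signs` and `det3` are hypothetical helpers, not library functions):

```rust
/// Make the diagonal of K positive without changing the product K R:
/// for every negative K[i][i], negate column i of K and row i of R.
pub fn fix_diagonal_signs(k: &mut [[f64; 3]; 3], r: &mut [[f64; 3]; 3]) {
    for i in 0..3 {
        if k[i][i] < 0.0 {
            // Negate column i of K.
            for row in k.iter_mut() {
                row[i] = -row[i];
            }
            // Negate row i of R.
            for x in r[i].iter_mut() {
                *x = -*x;
            }
        }
    }
}

/// Determinant of a 3x3 matrix (cofactor expansion along the first row).
pub fn det3(m: &[[f64; 3]; 3]) -> f64 {
    m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
        - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
        + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0])
}
```

After the diagonal fix, `det3(&r)` should be close to $+1$ or $-1$; any other value suggests that $R$ is not orthogonal.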

Translation Extraction

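$t = K^{- 1} p_{4}$

where $p_{4}$ is the fourth column of $P$. This follows from $P = K [R \mid t]$, whose last column is $K t$.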
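Because $K$ is upper-triangular, $K t = p_{4}$ can be solved by back substitution without forming $K^{- 1}$ explicitly. A minimal sketch, with `solve_upper` as a hypothetical helper name:

```rust
/// Solve K t = b for an upper-triangular 3x3 matrix K by back substitution.
/// Returns `None` if a diagonal entry is (near) zero.
pub fn solve_upper(k: &[[f64; 3]; 3], b: [f64; 3]) -> Option<[f64; 3]> {
    let mut t = [0.0_f64; 3];
    for i in (0..3).rev() {
        if k[i][i].abs() < 1e-12 {
            return None;
        }
        // Subtract the contributions of the components already solved.
        let mut acc = b[i];
        for j in (i + 1)..3 {
            acc -= k[i][j] * t[j];
        }
        t[i] = acc / k[i][i];
    }
    Some(t)
}
```

For example, with focal length 800, principal point (320, 240) and $p_{4} = (1760, 2320, 3)^{T}$, the result is $t = (1, 2, 3)^{T}$.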

Full Decomposition

The CameraMatrixDecomposition struct:

```rust
pub struct CameraMatrixDecomposition {
    pub k: Mat3,   // Upper-triangular intrinsics
    pub r: Mat3,   // Rotation matrix (orthonormal, det = +1)
    pub t: Vec3,   // Translation vector
}
```
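A useful sanity check on a decomposition is to recompose $P = K [R \mid t]$ and compare it with the estimated matrix up to scale. The sketch below uses a stand-in struct with plain arrays instead of the library's `Mat3` and `Vec3` types; `Decomposition` and `compose` are hypothetical names.

```rust
/// Stand-in for the decomposition result, using plain arrays (assumption).
pub struct Decomposition {
    pub k: [[f64; 3]; 3],
    pub r: [[f64; 3]; 3],
    pub t: [f64; 3],
}

/// Recompose the 3x4 camera matrix P = K [R | t].
pub fn compose(d: &Decomposition) -> [[f64; 4]; 3] {
    let mut p = [[0.0_f64; 4]; 3];
    for i in 0..3 {
        for j in 0..3 {
            // Left 3x3 block: M = K R.
            p[i][j] = (0..3).map(|a| d.k[i][a] * d.r[a][j]).sum();
        }
        // Fourth column: p_4 = K t.
        p[i][3] = (0..3).map(|a| d.k[i][a] * d.t[a]).sum();
    }
    p
}
```

Since $P$ is only defined up to scale, a comparison with the DLT estimate should first normalize both matrices, for instance by their Frobenius norm.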

API

```rust
// Estimate the full 3×4 camera matrix
let P = dlt_camera_matrix(&world_pts, &image_pts)?;

// Decompose into K, R, t
let decomp = decompose_camera_matrix(&P)?;
println!("Intrinsics: {:?}", decomp.k);
println!("Rotation: {:?}", decomp.r);
println!("Translation: {:?}", decomp.t);

// Or just RQ decompose any 3×3 matrix
let (K, R) = rq_decompose(&M);
```

When to Use

Camera matrix DLT is useful when:

- You have non-coplanar 3D-2D correspondences and want to estimate both intrinsics and pose simultaneously
- You need a quick estimate of $K$ from a single view (without multiple homographies)

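For calibration with a planar board, Zhang's method is preferred: the linear system above becomes rank-deficient when all 3D points are coplanar, whereas Zhang's method exploits the planar constraint and combines the homographies from several views.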

