Camera Matrix and RQ Decomposition

The camera matrix (or projection matrix) $P$ is a $3 \times 4$ matrix that directly maps 3D world points to 2D pixel coordinates. Estimating $P$ via DLT and decomposing it into intrinsics and extrinsics via RQ decomposition provides an alternative initialization path.

Camera Matrix DLT

Problem Statement

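Given: $n \geq 6$ correspondences between 3D world points $\mathbf{X}_i$ and 2D pixel points $\mathbf{x}_i = (u_i, v_i)$. Six is the minimum because $P$ has 11 degrees of freedom and each correspondence contributes two equations.

Find: projection matrix $P \in \mathbb{R}^{3 \times 4}$ such that $\mathbf{x}_i \sim P \mathbf{X}_i$, with equality up to scale in homogeneous coordinates.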

Derivation

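The projection $\lambda_i (u_i, v_i, 1)^T = P \tilde{\mathbf{X}}_i$ (with $\tilde{\mathbf{X}}_i = (X_i, Y_i, Z_i, 1)^T$ in homogeneous coordinates) gives, after cross-multiplication:

$$
\begin{aligned}
\mathbf{p}_1^T \tilde{\mathbf{X}}_i - u_i \, \mathbf{p}_3^T \tilde{\mathbf{X}}_i &= 0 \\
\mathbf{p}_2^T \tilde{\mathbf{X}}_i - v_i \, \mathbf{p}_3^T \tilde{\mathbf{X}}_i &= 0
\end{aligned}
$$

where $\mathbf{p}_k^T$ is the $k$-th row of $P$.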

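This gives a $2n \times 12$ homogeneous system $A\mathbf{p} = \mathbf{0}$ for $n$ correspondences, where $\mathbf{p} \in \mathbb{R}^{12}$ stacks the rows of $P$. It is solved via SVD (the solution is the right singular vector of the smallest singular value), with Hartley normalization of the 3D and 2D points.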
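
As a minimal sketch of how the two cross-multiplied equations per correspondence become rows of $A$, the hypothetical helper `dlt_rows` below fills both rows for one 3D-2D pair, assuming $\mathbf{p}$ stacks the rows of $P$ in row-major order. The names `dlt_rows` and `project` are illustrative and not part of the library; normalization and the SVD solve are omitted.

```rust
/// Build the two rows of the DLT design matrix A contributed by one
/// 3D-2D correspondence. The unknown 12-vector stacks the rows of P.
fn dlt_rows(world: [f64; 3], pixel: [f64; 2]) -> [[f64; 12]; 2] {
    // Homogeneous 3D point (X, Y, Z, 1).
    let xh = [world[0], world[1], world[2], 1.0];
    let (u, v) = (pixel[0], pixel[1]);
    let mut rows = [[0.0; 12]; 2];
    for j in 0..4 {
        // Row 0: p1^T X - u * p3^T X = 0
        rows[0][j] = xh[j];
        rows[0][8 + j] = -u * xh[j];
        // Row 1: p2^T X - v * p3^T X = 0
        rows[1][4 + j] = xh[j];
        rows[1][8 + j] = -v * xh[j];
    }
    rows
}

/// Project a 3D point with a 3x4 matrix and divide out the scale.
fn project(p: &[[f64; 4]; 3], world: [f64; 3]) -> [f64; 2] {
    let xh = [world[0], world[1], world[2], 1.0];
    let mut h = [0.0; 3];
    for i in 0..3 {
        for j in 0..4 {
            h[i] += p[i][j] * xh[j];
        }
    }
    [h[0] / h[2], h[1] / h[2]]
}
```

For a noise-free correspondence, the dot product of each returned row with the flattened true $P$ is zero, which is a convenient sanity check before stacking the rows for all points.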

Post-Processing

After denormalization, $P$ is $3 \times 4$ but not guaranteed to decompose cleanly into $K[R \mid \mathbf{t}]$ due to noise. The RQ decomposition extracts the components.

RQ Decomposition

Problem Statement

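Given: The left $3 \times 3$ submatrix $M$ of $P$ (where $P = [M \mid \mathbf{b}]$ and $\mathbf{b}$ is the fourth column).

Find: Upper-triangular $K$ and orthogonal $R$ such that $M = KR$.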

Algorithm

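RQ decomposition is computed from a QR decomposition of a row-reversed, transposed matrix. Let $J$ be the $3 \times 3$ exchange matrix (ones on the anti-diagonal), so that $J^2 = I$:

  1. Compute QR decomposition of $(JM)^T$: $(JM)^T = \tilde{Q}\tilde{R}$
  2. Then $JM = \tilde{R}^T \tilde{Q}^T$, where $\tilde{R}^T$ is lower-triangular and $\tilde{Q}^T$ is orthogonal
  3. Apply the permutation matrix $J$ to flip the matrix to upper-triangular form:
    • $K = J \tilde{R}^T J$ (upper-triangular)
    • $R = J \tilde{Q}^T$ (orthogonal)

The product is then $KR = J \tilde{R}^T J J \tilde{Q}^T = J (JM) = M$. The row reversal in step 1 matters: applying the same flips to a QR decomposition of $M^T$ would factor $JM$ rather than $M$.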
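
The following is a dependency-free sketch of RQ via QR for a $3 \times 3$ matrix, using arrays instead of the library's matrix types. The names `M3`, `qr_gram_schmidt` and `rq_via_qr` are hypothetical, and the QR step uses modified Gram-Schmidt purely for brevity (a Householder-based QR from a linear algebra crate would be the more robust choice). It assumes the input is nonsingular, reverses the row order with the exchange matrix before the QR step, and flips back afterwards.

```rust
type M3 = [[f64; 3]; 3];

fn transpose(a: &M3) -> M3 {
    let mut t = [[0.0; 3]; 3];
    for i in 0..3 {
        for j in 0..3 {
            t[j][i] = a[i][j];
        }
    }
    t
}

fn matmul(a: &M3, b: &M3) -> M3 {
    let mut c = [[0.0; 3]; 3];
    for i in 0..3 {
        for j in 0..3 {
            for k in 0..3 {
                c[i][j] += a[i][k] * b[k][j];
            }
        }
    }
    c
}

/// QR by modified Gram-Schmidt on the columns: a = q * r.
/// Assumes `a` is nonsingular; the diagonal of r comes out positive.
fn qr_gram_schmidt(a: &M3) -> (M3, M3) {
    let mut q = [[0.0; 3]; 3];
    let mut r = [[0.0; 3]; 3];
    for j in 0..3 {
        let mut v = [a[0][j], a[1][j], a[2][j]];
        for i in 0..j {
            // Remove the component along the already-computed column q_i.
            let dot = q[0][i] * v[0] + q[1][i] * v[1] + q[2][i] * v[2];
            r[i][j] = dot;
            for k in 0..3 {
                v[k] -= dot * q[k][i];
            }
        }
        let norm = (v[0] * v[0] + v[1] * v[1] + v[2] * v[2]).sqrt();
        r[j][j] = norm;
        for k in 0..3 {
            q[k][j] = v[k] / norm;
        }
    }
    (q, r)
}

/// RQ decomposition m = k * rot, with k upper-triangular and rot orthogonal.
fn rq_via_qr(m: &M3) -> (M3, M3) {
    // Exchange matrix: ones on the anti-diagonal, flip * flip = I.
    let flip: M3 = [[0.0, 0.0, 1.0], [0.0, 1.0, 0.0], [1.0, 0.0, 0.0]];
    let flipped = matmul(&flip, m);
    // (flip * m)^T = q * r
    let (q, r) = qr_gram_schmidt(&transpose(&flipped));
    // k = flip * r^T * flip is upper-triangular; rot = flip * q^T is orthogonal.
    let k = matmul(&matmul(&flip, &transpose(&r)), &flip);
    let rot = matmul(&flip, &transpose(&q));
    (k, rot)
}
```

Because Gram-Schmidt produces a positive diagonal, the returned `k` already has positive diagonal entries; the determinant of `rot` can still be $-1$ and has to be handled separately.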

Sign Conventions

After decomposition, ensure:

  • $K$ has positive diagonal entries: if $K_{ii} < 0$, negate column $i$ of $K$ and row $i$ of $R$ (the product $KR$ is unchanged)
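  • $\det(R) = +1$: if $\det(R) < 0$, negate all of $R$. Because $P$ is defined only up to scale, this amounts to replacing $P$ by $-P$ (so the fourth column of $P$ changes sign too) and leaves $K$ with its positive diagonal. Negating a single column of $R$ on its own would change the product $KR$.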
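
A minimal sketch of one way to enforce both conventions is shown here, under the assumption that the overall sign of $P$ is free because $P$ is defined only up to scale. The helper names `det3` and `normalize_signs` are hypothetical. The function returns a sign flag so the caller can apply the same flip to the fourth column of $P$ before extracting the translation.

```rust
type M3 = [[f64; 3]; 3];

/// Determinant of a 3x3 matrix by cofactor expansion along the first row.
fn det3(a: &M3) -> f64 {
    a[0][0] * (a[1][1] * a[2][2] - a[1][2] * a[2][1])
        - a[0][1] * (a[1][0] * a[2][2] - a[1][2] * a[2][0])
        + a[0][2] * (a[1][0] * a[2][1] - a[1][1] * a[2][0])
}

/// Returns (k, r, sign) with k[i][i] > 0 and det(r) > 0 such that
/// k * r == sign * (input k * input r). `sign` is -1.0 when the overall
/// sign had to be flipped, +1.0 otherwise.
fn normalize_signs(mut k: M3, mut r: M3) -> (M3, M3, f64) {
    for i in 0..3 {
        if k[i][i] < 0.0 {
            // Negate column i of K and row i of R: the product is unchanged.
            for row in 0..3 {
                k[row][i] = -k[row][i];
            }
            for col in 0..3 {
                r[i][col] = -r[i][col];
            }
        }
    }
    let mut sign = 1.0;
    if det3(&r) < 0.0 {
        // Flip the overall sign: R -> -R, which corresponds to P -> -P.
        for row in 0..3 {
            for col in 0..3 {
                r[row][col] = -r[row][col];
            }
        }
        sign = -1.0;
    }
    (k, r, sign)
}
```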

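Translation Extraction

Writing $P = [M \mid \mathbf{b}]$ with $M = KR$, the relation $P = K[R \mid \mathbf{t}]$ gives $\mathbf{b} = K\mathbf{t}$ for the fourth column $\mathbf{b}$ of $P$, so

$$
\mathbf{t} = K^{-1} \mathbf{b}
$$

Since $K$ is upper-triangular, this can be solved by back-substitution without forming $K^{-1}$. If the sign of $M$ was flipped to enforce $\det(R) = +1$, $\mathbf{b}$ must be flipped as well.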
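
As a sketch under the assumption that $P = K[R \mid \mathbf{t}]$, the fourth column $\mathbf{b}$ of $P$ equals $K\mathbf{t}$, so the translation follows from solving an upper-triangular system. The helper name `solve_upper_triangular` is hypothetical and the singularity check is deliberately crude.

```rust
/// Solve K t = b for an upper-triangular 3x3 matrix K by back-substitution.
/// Returns None if a diagonal entry is (numerically) zero.
fn solve_upper_triangular(k: &[[f64; 3]; 3], b: [f64; 3]) -> Option<[f64; 3]> {
    let mut t = [0.0; 3];
    // Start from the last row, which involves only t[2].
    for i in (0..3).rev() {
        if k[i][i].abs() < f64::EPSILON {
            return None;
        }
        let mut acc = b[i];
        for j in (i + 1)..3 {
            acc -= k[i][j] * t[j];
        }
        t[i] = acc / k[i][i];
    }
    Some(t)
}
```

The result does not depend on a positive rescaling of $P$, because $K$ and $\mathbf{b}$ scale together.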

Full Decomposition

The CameraMatrixDecomposition struct:

pub struct CameraMatrixDecomposition {
    pub k: Mat3,   // Upper-triangular intrinsics
    pub r: Mat3,   // Rotation matrix (orthonormal, det = +1)
    pub t: Vec3,   // Translation vector
}

API

// Estimate the full 3×4 camera matrix
let P = dlt_camera_matrix(&world_pts, &image_pts)?;

// Decompose into K, R, t
let decomp = decompose_camera_matrix(&P)?;
println!("Intrinsics: {:?}", decomp.k);
println!("Rotation: {:?}", decomp.r);
println!("Translation: {:?}", decomp.t);

// Or just RQ decompose any 3×3 matrix
let (K, R) = rq_decompose(&M);

When to Use

Camera matrix DLT is useful when:

  • You have non-coplanar 3D-2D correspondences and want to estimate both intrinsics and pose simultaneously
  • You need a quick estimate of $K$ from a single view (without multiple homographies)
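
One way to guard the non-coplanarity requirement before running the DLT is sketched below. It is an assumption for illustration, not something the library is known to do: the hypothetical helper `is_nearly_coplanar` builds the $3 \times 3$ scatter matrix of the centered 3D points, which is singular exactly when all points lie in one plane, and compares its determinant against the cube of its mean diagonal entry so the test is independent of the scene scale.

```rust
/// Heuristic coplanarity test for 3D points.
/// `rel_tol` is a hypothetical threshold in [0, 1]; smaller is stricter.
fn is_nearly_coplanar(points: &[[f64; 3]], rel_tol: f64) -> bool {
    if points.len() < 4 {
        return true; // up to three points always lie in a common plane
    }
    let n = points.len() as f64;
    // Centroid of the point set.
    let mut c = [0.0; 3];
    for p in points {
        for a in 0..3 {
            c[a] += p[a] / n;
        }
    }
    // Scatter matrix of the centered points (symmetric, positive semidefinite).
    let mut s = [[0.0; 3]; 3];
    for p in points {
        for a in 0..3 {
            for b in 0..3 {
                s[a][b] += (p[a] - c[a]) * (p[b] - c[b]);
            }
        }
    }
    let det = s[0][0] * (s[1][1] * s[2][2] - s[1][2] * s[2][1])
        - s[0][1] * (s[1][0] * s[2][2] - s[1][2] * s[2][0])
        + s[0][2] * (s[1][0] * s[2][1] - s[1][1] * s[2][0]);
    let mean_diag = (s[0][0] + s[1][1] + s[2][2]) / 3.0;
    // For a positive semidefinite matrix, det <= mean_diag^3, so the ratio
    // lies in [0, 1] and is 0 exactly for coplanar points.
    det <= rel_tol * mean_diag.powi(3)
}
```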

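For calibration with a planar board, Zhang's method is preferred: coplanar 3D points make the camera matrix DLT degenerate, whereas Zhang's method exploits the planar structure, with each view's homography contributing two constraints on the intrinsics.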