Pose from Homography

Given camera intrinsics $K$ and a homography $H$ from a planar calibration board to the image, we can decompose $H$ to recover the camera pose (rotation $R$ and translation $t$) relative to the board.

Problem Statement

Given: Intrinsics $K$ and homography $H$ such that $p \sim H \cdot [P_{x y}, 1]^{T}$ for board points at $Z = 0$.

Find: Rigid transform $T_{C, B} = [R ∣ t]$ (board to camera).

Assumptions:

The board lies at $Z = 0$ in world coordinates
$K$ is known (or has been estimated)
The homography $H$ was computed from correct correspondences

Derivation

Extracting Rotation and Translation

Recall from the Homography chapter:

$H \sim K [r_{1} r_{2} t]$

where $r_{1}, r_{2}$ are the first two columns of the rotation matrix and $t$ is the translation.

Removing the intrinsics:

$K^{- 1} H = \frac{1}{λ} [r_{1} r_{2} t]$

Let $[a_{1} a_{2} a_{3}] = K^{- 1} H$. The scale factor $λ$ is recovered from the constraint that $r_{1}$ and $r_{2}$ have unit norm. With a noisy $H$ the norms of $a_{1}$ and $a_{2}$ differ slightly, so their mean is used:

$λ = \frac{2}{∥ a _{1} ∥ + ∥ a _{2} ∥}$

Then:

$r_{1} = λ a_{1}, r_{2} = λ a_{2}, t = λ a_{3}$

The third rotation column is:

$r_{3} = r_{1} \times r_{2}$
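The extraction steps above can be sketched in plain Rust without external crates. This is a minimal illustration, not the calibration-rs implementation: the names `Mat3`, `Vec3`, `invert_intrinsics` and `decompose_homography` are hypothetical, matrices are stored row-major as nested arrays, and $K$ is assumed to be upper triangular with $K_{33} = 1$ so that its inverse has a closed form.

```rust
type Mat3 = [[f64; 3]; 3];
type Vec3 = [f64; 3];

/// Closed-form inverse of K = [[fx, s, cx], [0, fy, cy], [0, 0, 1]].
/// Assumption: K is upper triangular with K[2][2] == 1.
fn invert_intrinsics(k: &Mat3) -> Mat3 {
    let (fx, s, cx) = (k[0][0], k[0][1], k[0][2]);
    let (fy, cy) = (k[1][1], k[1][2]);
    [
        [1.0 / fx, -s / (fx * fy), (s * cy - cx * fy) / (fx * fy)],
        [0.0, 1.0 / fy, -cy / fy],
        [0.0, 0.0, 1.0],
    ]
}

fn mat_mul(a: &Mat3, b: &Mat3) -> Mat3 {
    let mut c = [[0.0_f64; 3]; 3];
    for i in 0..3 {
        for j in 0..3 {
            for n in 0..3 {
                c[i][j] += a[i][n] * b[n][j];
            }
        }
    }
    c
}

fn column(m: &Mat3, j: usize) -> Vec3 {
    [m[0][j], m[1][j], m[2][j]]
}

fn norm(v: &Vec3) -> f64 {
    (v[0] * v[0] + v[1] * v[1] + v[2] * v[2]).sqrt()
}

fn scale(v: &Vec3, s: f64) -> Vec3 {
    [v[0] * s, v[1] * s, v[2] * s]
}

fn cross(a: &Vec3, b: &Vec3) -> Vec3 {
    [
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    ]
}

/// Returns the columns [r1, r2, r3] and t, before any SO(3) projection
/// or sign correction.
fn decompose_homography(k: &Mat3, h: &Mat3) -> ([Vec3; 3], Vec3) {
    // [a1 a2 a3] = K^{-1} H
    let a = mat_mul(&invert_intrinsics(k), h);
    let (a1, a2, a3) = (column(&a, 0), column(&a, 1), column(&a, 2));
    // lambda = 2 / (|a1| + |a2|)
    let lambda = 2.0 / (norm(&a1) + norm(&a2));
    let r1 = scale(&a1, lambda);
    let r2 = scale(&a2, lambda);
    let t = scale(&a3, lambda);
    // r3 = r1 x r2
    let r3 = cross(&r1, &r2);
    ([r1, r2, r3], t)
}
```

Because $H$ is only defined up to scale, passing `H` or any positive multiple of it gives the same columns and translation; the normalisation by $λ$ absorbs the difference.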

Projecting onto SO(3)

Due to noise, the matrix $R_{approx} = [r_{1} r_{2} r_{3}]$ is not exactly orthonormal. We project it onto $SO (3)$ using SVD:

$R_{approx} = U Σ V^{T}$

$R = U V^{T}$

If $det (R) = - 1$, flip the sign of the third column of $U$ and recompute $R = U V^{T}$.
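The Rust standard library has no SVD, so the sketch below swaps in a different technique that reaches the same result: the Newton iteration for the polar decomposition, $X \leftarrow \frac{1}{2} (X + X^{- T})$, which converges to the orthogonal factor $U V^{T}$ for a nonsingular input. It assumes $det (R_{approx}) > 0$, which holds whenever $r_{3} = r_{1} \times r_{2}$ and $r_{1}$, $r_{2}$ are linearly independent, so the determinant fix-up is not needed here. The function names are illustrative; a real implementation would more likely call an SVD routine from a linear algebra crate.

```rust
type Mat3 = [[f64; 3]; 3];

fn det(m: &Mat3) -> f64 {
    m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
        - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
        + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0])
}

/// Inverse transpose via cofactors: M^{-T} = cofactor(M) / det(M).
fn inverse_transpose(m: &Mat3) -> Mat3 {
    let d = det(m);
    let mut c = [[0.0_f64; 3]; 3];
    for i in 0..3 {
        for j in 0..3 {
            // Cyclic indexing yields the signed cofactor directly.
            let (i1, i2) = ((i + 1) % 3, (i + 2) % 3);
            let (j1, j2) = ((j + 1) % 3, (j + 2) % 3);
            c[i][j] = (m[i1][j1] * m[i2][j2] - m[i1][j2] * m[i2][j1]) / d;
        }
    }
    c
}

/// Nearest rotation to `r_approx`, assuming det(r_approx) > 0.
/// Uses the Newton polar iteration instead of an explicit SVD.
fn project_to_so3(r_approx: &Mat3) -> Mat3 {
    let mut x = *r_approx;
    for _ in 0..50 {
        let y = inverse_transpose(&x);
        let mut next = [[0.0_f64; 3]; 3];
        let mut delta = 0.0_f64;
        for i in 0..3 {
            for j in 0..3 {
                next[i][j] = 0.5 * (x[i][j] + y[i][j]);
                delta = delta.max((next[i][j] - x[i][j]).abs());
            }
        }
        x = next;
        // Stop once the iterate no longer changes.
        if delta < 1e-14 {
            break;
        }
    }
    x
}
```

For inputs that are already close to orthonormal, as is the case after the normalisation by $λ$, the iteration typically settles in a handful of steps.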

Ensuring Forward-Facing

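If $t_{z} < 0$, the board is behind the camera. In this case, flip the sign of $λ$, which negates $r_{1}$, $r_{2}$ and $t$ but leaves $r_{3} = r_{1} \times r_{2}$ unchanged (negating all three columns of $R$ would give $det (R) = - 1$, which is not a rotation):

$r_{1} \leftarrow - r_{1}, r_{2} \leftarrow - r_{2}, t \leftarrow - t$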

This resolves the sign ambiguity inherent in the scale factor $λ$.
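Read as a sign change of $λ$, the correction negates $r_{1}$, $r_{2}$ and $t$, while $r_{3} = r_{1} \times r_{2}$ is unaffected because the two minus signs cancel in the cross product. The sketch below applies the flip in that form so that the determinant of the rotation stays at $+ 1$. The column-array layout and the names `make_forward_facing` and `det_cols` are assumptions for illustration, not calibration-rs API.

```rust
type Vec3 = [f64; 3];

fn neg(v: &Vec3) -> Vec3 {
    [-v[0], -v[1], -v[2]]
}

/// `cols` holds the rotation columns [r1, r2, r3].
/// If the board lies behind the camera (t_z < 0), negate r1, r2 and t.
/// r3 is kept, since (-r1) x (-r2) = r1 x r2.
fn make_forward_facing(cols: &[Vec3; 3], t: &Vec3) -> ([Vec3; 3], Vec3) {
    if t[2] < 0.0 {
        ([neg(&cols[0]), neg(&cols[1]), cols[2]], neg(t))
    } else {
        (*cols, *t)
    }
}

/// Determinant of the matrix whose columns are `c`: r1 . (r2 x r3).
fn det_cols(c: &[Vec3; 3]) -> f64 {
    let cross = [
        c[1][1] * c[2][2] - c[1][2] * c[2][1],
        c[1][2] * c[2][0] - c[1][0] * c[2][2],
        c[1][0] * c[2][1] - c[1][1] * c[2][0],
    ];
    c[0][0] * cross[0] + c[0][1] * cross[1] + c[0][2] * cross[2]
}
```

Checking `det_cols` before and after the call is a cheap way to confirm that the corrected matrix is still a proper rotation.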

Accuracy

The pose from homography is an approximate estimate because:

The homography itself is noisy: DLT minimizes an algebraic error, not the geometric reprojection error
The SVD projection onto SO(3) restores orthonormality, but it returns the nearest rotation in the Frobenius norm, not the rotation that minimizes reprojection error
Distortion (if not corrected) biases the homography

Typical rotation error: 1-5°. Typical translation direction error: 5-15%. These estimates are refined in non-linear optimization.
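The text does not state how these errors are measured. One common convention for the rotation part, assumed here, is the geodesic angle between the estimated and the reference rotation, $θ = \arccos ((tr (R_{est}^{T} R_{ref}) - 1) / 2)$. A minimal sketch, with the hypothetical name `rotation_error_deg` and row-major nested arrays:

```rust
type Mat3 = [[f64; 3]; 3];

/// Angle in degrees of the relative rotation R_est^T * R_ref.
/// Assumes both inputs are (close to) proper rotation matrices.
fn rotation_error_deg(r_est: &Mat3, r_ref: &Mat3) -> f64 {
    // tr(A^T B) equals the sum of element-wise products.
    let mut trace = 0.0_f64;
    for i in 0..3 {
        for j in 0..3 {
            trace += r_est[i][j] * r_ref[i][j];
        }
    }
    // Clamp guards against values slightly outside [-1, 1] from rounding.
    let cos_angle = ((trace - 1.0) / 2.0).clamp(-1.0, 1.0);
    cos_angle.acos().to_degrees()
}
```

Comparing the initial pose against the pose after non-linear refinement with such a metric gives a rough sense of how far the linear estimate was off for a given view.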

OpenCV equivalence: cv::decomposeHomographyMat provides a similar decomposition, returning up to 4 pose candidates. calibration-rs returns a single pose by resolving ambiguities via the forward-facing constraint.

API

let pose = estimate_planar_pose_from_h(&K, &H)?;
// pose: Iso3 (T_C_B: board to camera transform)

Usage in Calibration Pipeline

In the planar intrinsics calibration pipeline, pose estimation is applied to every view after $K$ has been estimated:

Compute homography $H_{k}$ per view
Estimate $K$ from all homographies (Zhang's method)
Decompose each $H_{k}$ to get pose $T_{C, B}^{(k)}$
Use poses as initial values for non-linear optimization
