Pose from Homography

Given camera intrinsics and a homography from a planar calibration board to the image, we can decompose to recover the camera pose (rotation and translation ) relative to the board.

Problem Statement

Given: Intrinsics and homography such that for board points at .

Find: Rigid transform (board to camera).

Assumptions:

  • The board lies at in world coordinates
  • is known (or has been estimated)
  • The homography was computed from correct correspondences

Derivation

Extracting Rotation and Translation

Recall from the Homography chapter:

where are the first two columns of the rotation matrix and is the translation.

Removing the intrinsics:

Let . The scale factor is recovered from the constraint that and have unit norm:

Then:

The third rotation column is:

Projecting onto SO(3)

Due to noise, the matrix is not exactly orthonormal. We project it onto using SVD:

If , flip the sign of the third column of and recompute.

Ensuring Forward-Facing

If , the board is behind the camera. In this case, flip the sign of both and :

This resolves the sign ambiguity inherent in the scale factor .

Accuracy

The pose from homography is an approximate estimate because:

  1. The homography itself is subject to noise (DLT algebraic error minimization, not geometric)
  2. The SVD projection onto SO(3) corrects non-orthogonality but introduces additional error
  3. Distortion (if not corrected) biases the homography

Typical rotation error: 1-5°. Typical translation direction error: 5-15%. These estimates are refined in non-linear optimization.

OpenCV equivalence: cv::decomposeHomographyMat provides a similar decomposition, returning up to 4 pose candidates. calibration-rs returns a single pose by resolving ambiguities via the forward-facing constraint.

API

#![allow(unused)]
fn main() {
let pose = estimate_planar_pose_from_h(&K, &H)?;
// pose: Iso3 (T_C_B: board to camera transform)
}

Usage in Calibration Pipeline

In the planar intrinsics calibration pipeline, pose estimation is applied to every view after has been estimated:

  1. Compute homography per view
  2. Estimate from all homographies (Zhang's method)
  3. Decompose each to get pose
  4. Use poses as initial values for non-linear optimization