# Robust Loss Functions
Standard least squares minimizes the sum of squared residuals $\sum_i r_i^2$. This objective is highly sensitive to outliers: a single point with a large residual can dominate the entire cost function and corrupt the solution. Robust loss functions (M-estimators) reduce the influence of large residuals, making optimization tolerant to outliers in the data.
## Problem Setup
In non-linear least squares, we minimize:

$$
\min_x \sum_i \rho\big(r_i(x)\big)
$$

where $\rho$ is the loss function applied to each residual $r_i$. The standard (non-robust) case uses $\rho(r) = r^2$.
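As a concrete illustration, an objective of this shape can be sketched in a few lines of Rust. The names `robust_cost` and `rho` below are illustrative, not calibration-rs API; with the identity applied to the squared residual, this is plain least squares:

```rust
// Sum rho over the residuals. With rho(r) = r^2 this reduces to the
// standard (non-robust) least-squares cost.
fn robust_cost(residuals: &[f64], rho: impl Fn(f64) -> f64) -> f64 {
    residuals.iter().map(|r| rho(*r)).sum()
}

fn main() {
    let residuals = [1.0, 2.0, 10.0]; // 10.0 plays the role of an outlier
    let sq = robust_cost(&residuals, |r| r * r);
    // The single outlier contributes 100 of the 105 total cost,
    // illustrating how squared loss lets one point dominate.
    assert!((sq - 105.0).abs() < 1e-12);
    println!("{sq}");
}
```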
## Available Loss Functions
calibration-rs provides three robust loss functions, each parameterized by a scale $s$ that controls the transition from quadratic (inlier) to robust (outlier) behavior.
### Huber Loss

$$
\rho(r) =
\begin{cases}
r^2 & |r| \le s \\
2s|r| - s^2 & |r| > s
\end{cases}
$$

Properties:
- Quadratic for small residuals, linear for large residuals
- Continuous first derivative
- Influence function $\psi(r) = \rho'(r)$: bounded at $\pm 2s$ — outliers contribute a constant gradient, not a growing one
When to use: The default robust loss. Good general-purpose choice when you expect a moderate number of outliers.
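A minimal sketch of the piecewise form above (the function `huber` and scale `k` are illustrative names, not the crate's internals):

```rust
// Huber loss: quadratic inside the scale k, linear outside,
// with the two branches meeting smoothly at |r| = k.
fn huber(r: f64, k: f64) -> f64 {
    let a = r.abs();
    if a <= k { r * r } else { 2.0 * k * a - k * k }
}

fn main() {
    let k = 1.0;
    // Inlier: behaves exactly like the squared loss.
    assert!((huber(0.5, k) - 0.25).abs() < 1e-12);
    // The two branches agree at the transition point |r| = k.
    assert!((huber(1.0, k) - 1.0).abs() < 1e-12);
    // Outlier: grows linearly — cost 19 instead of 100 for r = 10.
    assert!((huber(10.0, k) - 19.0).abs() < 1e-12);
}
```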
### Cauchy Loss

$$
\rho(r) = s^2 \log\!\left(1 + \frac{r^2}{s^2}\right)
$$

Properties:
- Grows logarithmically for large residuals (slower than linear)
- Smooth everywhere
- Influence function: $\psi(r) = \dfrac{2r}{1 + r^2/s^2}$ — decreases to zero for large $|r|$, effectively down-weighting far outliers
When to use: When outliers are far from the bulk of the data and should have near-zero influence.
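The logarithmic growth and redescending influence can be checked numerically; `cauchy` and `cauchy_influence` below are illustrative sketches of the formulas above, not crate API:

```rust
// Cauchy loss rho(r) = k^2 * ln(1 + (r/k)^2).
fn cauchy(r: f64, k: f64) -> f64 {
    k * k * (1.0 + (r / k).powi(2)).ln()
}

// Influence function psi(r) = d rho / d r = 2r / (1 + (r/k)^2).
fn cauchy_influence(r: f64, k: f64) -> f64 {
    2.0 * r / (1.0 + (r / k).powi(2))
}

fn main() {
    let k = 1.0;
    // Influence peaks near r = k, then decays toward zero.
    assert!(cauchy_influence(1.0, k) > cauchy_influence(10.0, k));
    assert!(cauchy_influence(100.0, k) < 0.03);
    // Logarithmic growth: a residual of 100 adds less cost than ln(10001).
    assert!(cauchy(100.0, k) < 10.0);
}
```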
### Arctan Loss

$$
\rho(r) = s \arctan\!\left(\frac{r^2}{s}\right)
$$

Properties:
- Bounded: $\rho(r) \to \frac{\pi}{2} s$ as $|r| \to \infty$
- Influence function approaches zero for large residuals (redescending)
When to use: When very strong outlier rejection is needed. More aggressive than Cauchy but can make convergence harder.
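The saturation behavior can be demonstrated directly; `arctan_loss` is an illustrative sketch of the form above, not the crate's internals:

```rust
// Arctan loss rho(r) = k * atan(r^2 / k): bounded above by k * pi/2.
fn arctan_loss(r: f64, k: f64) -> f64 {
    k * (r * r / k).atan()
}

fn main() {
    let k = 1.0;
    let bound = k * std::f64::consts::FRAC_PI_2;
    // Cost saturates: even a huge residual cannot exceed k * pi/2...
    assert!(arctan_loss(1e6, k) < bound);
    assert!(arctan_loss(1e6, k) > 0.99 * bound);
    // ...while small residuals still behave roughly quadratically.
    assert!((arctan_loss(0.01, k) - 1e-4).abs() < 1e-8);
}
```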
## Comparison
| Loss | Large-$r$ growth | Outlier influence | Convergence |
|---|---|---|---|
| Quadratic ($\rho(r) = r^2$) | Quadratic | Unbounded | Best |
| Huber | Linear | Bounded (constant) | Good |
| Cauchy | Logarithmic | Decreasing | Moderate |
| Arctan | Bounded | Approaching zero | Can be tricky |
## Choosing the Scale Parameter
The scale $s$ sets the boundary between "inlier" and "outlier" behavior:
- Too small: Treats good data as outliers, reducing effective sample size
- Too large: Outliers still dominate (approaches standard least squares)
- Rule of thumb: Set $s$ to the expected residual magnitude for good data points. For reprojection residuals, a scale on the order of the expected pixel noise is typical.
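One way to see these trade-offs is through the down-weighting factor a robust loss assigns to each residual, $w(r) = \psi(r)/r$. The sketch below uses the Huber weight (names are illustrative, not calibration-rs API) to show how the scale decides which residuals get down-weighted:

```rust
// Huber down-weighting factor w(r) = psi(r)/r (up to a constant):
// full weight inside the scale k, shrinking like k/|r| outside it.
fn huber_weight(r: f64, k: f64) -> f64 {
    if r.abs() <= k { 1.0 } else { k / r.abs() }
}

fn main() {
    // With k = 2 (pixels, say), a 1 px residual keeps full weight...
    assert!((huber_weight(1.0, 2.0) - 1.0).abs() < 1e-12);
    // ...while a 20 px outlier is down-weighted to 0.1.
    assert!((huber_weight(20.0, 2.0) - 0.1).abs() < 1e-12);
    // Too large a scale: the outlier keeps full weight,
    // and the loss degenerates toward standard least squares.
    assert!((huber_weight(20.0, 100.0) - 1.0).abs() < 1e-12);
}
```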
## Usage in calibration-rs

Robust losses are specified per-residual block in the optimization IR:

```rust
pub enum RobustLoss {
    None,
    Huber { scale: f64 },
    Cauchy { scale: f64 },
    Arctan { scale: f64 },
}
```
Each problem type exposes the loss function as a configuration option:
```rust
session.update_config(|c| {
    c.robust_loss = RobustLoss::Huber { scale: 2.0 };
})?;
```
The backend applies the loss function during residual evaluation, modifying both the cost and the Jacobian.
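One common way a backend folds the loss into evaluation is the square-root-weight trick: scaling a residual and its Jacobian row by $\sqrt{w}$ lets an ordinary least-squares solver minimize the weighted cost. The sketch below assumes that approach with illustrative names; it is not the calibration-rs implementation:

```rust
// Scale a residual and its Jacobian row by sqrt(w), so that the plain
// squared norm of the scaled residual equals w * r^2.
fn apply_weight(residual: f64, jacobian_row: &mut [f64], w: f64) -> f64 {
    let s = w.sqrt();
    for j in jacobian_row.iter_mut() {
        *j *= s;
    }
    s * residual
}

fn main() {
    let mut jrow = [2.0, -1.0];
    let r = apply_weight(4.0, &mut jrow, 0.25);
    // (sqrt(0.25) * 4)^2 = 4 = 0.25 * 4^2: the weighted cost is preserved.
    assert!((r - 2.0).abs() < 1e-12);
    // The Jacobian row is scaled consistently with the residual.
    assert!((jrow[0] - 1.0).abs() < 1e-12);
}
```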
## Iteratively Reweighted Least Squares (IRLS)

Under the hood, robust loss functions are typically implemented via IRLS: each residual is weighted by $w_i = \psi(r_i)/r_i = \rho'(r_i)/r_i$, and the weighted least-squares problem is solved iteratively. The Levenberg-Marquardt backend handles this automatically.
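A toy IRLS loop makes this concrete: estimating a scalar location from observations containing one gross outlier, re-solving the weighted problem (here just a weighted mean) each iteration. All names are illustrative, not the crate's backend:

```rust
// Huber down-weighting factor w(r) = psi(r)/r (up to a constant).
fn huber_weight(r: f64, k: f64) -> f64 {
    if r.abs() <= k { 1.0 } else { k / r.abs() }
}

// IRLS for a constant model y_i ~ x: each iteration recomputes the
// weights at the current estimate and solves the weighted least-squares
// problem, which for this model is the weighted mean.
fn irls_mean(y: &[f64], k: f64, iters: usize) -> f64 {
    let mut x = y.iter().sum::<f64>() / y.len() as f64; // LS initialization
    for _ in 0..iters {
        let w: Vec<f64> = y.iter().map(|yi| huber_weight(yi - x, k)).collect();
        x = y.iter().zip(&w).map(|(yi, wi)| wi * yi).sum::<f64>()
            / w.iter().sum::<f64>();
    }
    x
}

fn main() {
    // Four inliers near 1.0 plus one gross outlier at 100.0.
    let y = [0.9, 1.0, 1.1, 1.0, 100.0];
    let plain_mean = y.iter().sum::<f64>() / 5.0;
    // The plain mean is ruined by the outlier...
    assert!((plain_mean - 20.8).abs() < 1e-9);
    // ...while the IRLS estimate stays near the inlier consensus.
    let robust = irls_mean(&y, 1.0, 20);
    assert!(robust > 0.9 && robust < 2.0);
}
```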
## Interaction with RANSAC
RANSAC and robust losses address outliers at different stages:
- RANSAC (linear initialization): Binary inlier/outlier classification. Used during model fitting to reject gross outliers before any optimization.
- Robust losses (non-linear refinement): Soft down-weighting. Used during optimization to reduce the influence of moderate outliers that passed RANSAC.
The two approaches are complementary: RANSAC handles gross outliers during initialization, while robust losses handle smaller outliers during refinement.