Optimal RANSAC

Background

Optimal RANSAC is a repeatable variant of the RANSAC algorithm introduced by Hast et al. (2013). Standard RANSAC is inherently non-deterministic: because it selects random minimal samples, two runs on the same data can find different inlier sets, which is problematic when the algorithm is embedded in a larger pipeline that must be reproducible. Optimal RANSAC addresses this by augmenting the standard hypothesis-and-verify loop with three iterative refinement steps that steer the search toward the globally optimal inlier set, making repeated runs far more likely to converge to the same result than standard RANSAC.

The algorithm

Optimal RANSAC shares its outer structure with standard RANSAC but, crucially, replaces the single scoring step with three tightly coupled sub-procedures whenever a candidate hypothesis yields more than a minimum number of tentative inliers.

1. Resample (Algorithm 2)

A random subset of up to a quarter of the current tentative-inlier set is drawn and a new model is fitted to that subset. The new model is then scored against the full dataset using an (optionally wider) search tolerance $t_{\mathrm{search}}$. If this produces a larger inlier set the resampling loop restarts from the expanded set; otherwise it tries up to 8 different subsets before returning. This is directly inspired by the Local-Optimisation step of LO-RANSAC (Chum et al., 2003; Chum et al., 2004) but is triggered more aggressively: whenever a promising hypothesis is found rather than only after a global best is updated.
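The resampling loop can be sketched as follows. This is a minimal illustrative sketch, not the package's implementation; the name resample_sketch is hypothetical, while fittingfn and distfn follow the signatures documented in the API section below.

```julia
using Random

# Sketch of the resample step (Algorithm 2): repeatedly fit to a random
# quarter-subset of the tentative inliers and rescore against all data,
# restarting whenever the inlier set grows.
function resample_sketch(rng, x, fittingfn, distfn, t_search, inliers; max_subsets = 8)
    best, tries = inliers, 0
    while tries < max_subsets
        tries += 1
        k = max(1, length(best) ÷ 4)                  # up to a quarter of the set
        subset = best[randperm(rng, length(best))[1:k]]
        M = fittingfn(x[:, subset])
        isempty(M) && continue                        # degenerate subset; try another
        candidates, M = distfn(M, x, t_search)        # score against the full dataset
        if length(candidates) > length(best)
            best, tries = candidates, 0               # expanded set: restart the loop
        end
    end
    return best
end
```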

2. Rescore (Algorithm 3)

Starting from the inlier set produced by resampling, the model is iteratively re-estimated from all current inliers and scored against all data until the inlier set stops changing or after at most 20 iterations. Using all inliers for fitting (rather than only the minimal sample $s$) exploits the fact that many problems admit a least-squares fit from more than $s$ points, producing a model that is better aligned to the true structure.
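In the same hypothetical style, the rescoring loop might look like the following sketch, assuming fittingfn accepts over-determined input:

```julia
# Sketch of the rescore step (Algorithm 3): refit from ALL current inliers
# (an over-determined, least-squares fit) and rescore against all data until
# the inlier set stabilises or the iteration budget is exhausted.
function rescore_sketch(x, fittingfn, distfn, t_search, inliers; max_iters = 20)
    M = fittingfn(x[:, inliers])
    for _ in 1:max_iters
        isempty(M) && break
        newinliers, M = distfn(M, x, t_search)
        newinliers == inliers && break                # inlier set unchanged: converged
        inliers = newinliers
        M = fittingfn(x[:, inliers])                  # re-estimate from the new set
    end
    return M, inliers
end
```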

3. Pruneset (Algorithm 4) — optional

When a wider search tolerance $t_{\mathrm{search}} > t$ is used alongside a user-supplied residual function, a final pruning pass removes any inlier whose residual exceeds the tight tolerance $t$. Inliers are removed one at a time, largest residual first, and the model is re-estimated after each removal, so that the final model is always consistent with the retained inliers.
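A sketch of this pruning pass (again hypothetical, with pruneset_sketch an illustrative name and residualfn as documented in the API section below):

```julia
# Sketch of the pruneset step (Algorithm 4): repeatedly drop the inlier with
# the largest residual and refit, until every retained point lies within the
# tight tolerance t.
function pruneset_sketch(x, fittingfn, residualfn, t, inliers)
    inliers = copy(inliers)                    # do not mutate the caller's set
    M = fittingfn(x[:, inliers])
    while !isempty(inliers)
        r = residualfn(M, x)[inliers]          # residuals of the current inliers only
        worst, idx = findmax(r)
        worst <= t && break                    # all within the tight tolerance: done
        deleteat!(inliers, idx)                # remove the most extreme point...
        M = fittingfn(x[:, inliers])           # ...and re-estimate the model
        isempty(M) && break                    # too few points left to fit
    end
    return M, inliers
end
```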

Stopping criterion

Instead of the adaptive formula for the number of iterations $N$ used by standard RANSAC, Optimal RANSAC terminates when an inlier set of the same size is re-discovered a specified number of times in succession (min_consensus, default 2). The refinement steps make it very unlikely that the same size is found by chance unless it corresponds to the true optimal set, so this simple criterion works surprisingly well in practice.
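The convergence test itself is trivially small. A sketch (with a hypothetical helper name), where sizes holds the inlier-set size found at each outer iteration:

```julia
# Converged when the last min_consensus inlier-set sizes are all equal.
function converged(sizes, min_consensus = 2)
    length(sizes) < min_consensus && return false
    all(==(sizes[end]), sizes[end-min_consensus+1:end])
end
```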

When to use Optimal RANSAC

Optimal RANSAC is preferable to standard RANSAC when

  • repeatability is important regardless of RNG state (standard RANSAC is repeatable given the same seeded RNG, but generally not otherwise, while Optimal RANSAC is designed to be repeatable regardless of RNG state),
  • the inlier fraction is very low (well below 5%), making RANSAC's standard adaptive stopping criterion unreliable, or
  • a high-precision final inlier set is needed (using the two-tolerance mode with t_search > t and residualfn).

It is less appropriate when

  • the model fitting step (fittingfn) does not support over-determined input (required for the rescore step), or
  • the data contains multiple competing structures of comparable size, in which case the algorithm may converge to a locally optimal set.

API

ConsensusFitting.optimalransac — Function
optimalransac(x, fittingfn, distfn, s, t;
              rng = Random.default_rng(),
              t_search = t,
              residualfn = nothing,
              degenfn = _ -> false,
              verbose = false,
              verbose_io = stdout,
              max_data_trials = 100,
              max_trials = 1000,
              min_inliers = 5,
              min_consensus = 2)

Robustly fit a model to data using the Optimal RANSAC algorithm of Hast et al. (2013), which extends standard RANSAC with three refinement steps (resample, rescore, and optionally pruneset) to produce a repeatable result.

Arguments

  • x: Data array of size [...] × N (arbitrary dimensionality per data point is supported, but the last dimension must correspond to the number of data points). Commonly this will be d × N.
  • fittingfn: Function that fits a model to a sample of data points. Must have the signature M = fittingfn(x). Unlike ransac, this function is called with subsets of varying size — from the minimal s points (outer sampling) up to the full current inlier set (rescore step). It must therefore implement a least-squares or otherwise over-determined fit when given more than s points. The function should return an empty collection when it cannot produce a valid model (e.g., degenerate input). It may also return a collection of multiple candidate models; in that case distfn is responsible for selecting the best one.
  • distfn: Function that scores a model against all data points and returns the inlier set. Must have the signature (inliers, M) = distfn(M, x, t), where inliers is a vector of last-dimension indices into x for which the residual is below threshold t. When M holds multiple candidate models this function should select and return the one with the most inliers.
  • s: Minimum number of data points required by fittingfn to fit a model.
  • t: Primary inlier distance threshold (used for the outer sampling step and, when pruning is enabled, as the final tight tolerance).
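To make the required signatures concrete, here is a hypothetical fittingfn/distfn pair for fitting a 2-D line (s = 2) to a 2 × N point array. The names fitline and linedist are illustrative only and not part of the package.

```julia
using Statistics, LinearAlgebra

# fittingfn: total-least-squares line through two or more points.
function fitline(pts)
    size(pts, 2) < 2 && return []        # cannot fit: signal failure with an empty model
    c = mean(pts, dims = 2)
    dir = svd(pts .- c).U[:, 1]          # principal direction of the centred points
    (point = vec(c), dir = dir)
end

# distfn: indices of points whose perpendicular distance to the line is below t.
function linedist(M, x, t)
    d = x .- M.point
    r = abs.(d[1, :] .* M.dir[2] .- d[2, :] .* M.dir[1])   # 2-D cross product
    (findall(r .< t), M)
end
```

Such a pair would then be passed as, e.g., optimalransac(x, fitline, linedist, 2, 0.05).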

Keyword Arguments

  • rng::Random.AbstractRNG: Random number generator. Defaults to Random.default_rng().
  • t_search::Real: Inlier threshold used during the resampling and rescoring steps. Must satisfy t_search ≥ t. When t_search > t and residualfn is supplied, a pruning pass is applied after resampling to trim the result back to the tight tolerance t, yielding the highest-precision final inlier set. Defaults to t (no separate search tolerance; pruning is skipped).
  • residualfn: Function residuals = residualfn(M, x) that returns a non-negative vector of per-point residuals (one entry per last-dimension slice of x). Required for the pruning step; ignored when t_search == t. Defaults to nothing.
  • degenfn: Function that tests whether a minimal sample would produce a degenerate model. Must have the signature r = degenfn(x) and return true when the sample is degenerate. Defaults to _ -> false.
  • verbose::Bool: When true, prints per-iteration diagnostics. Defaults to false.
  • verbose_io::IO: IO stream for verbose output. Defaults to stdout.
  • max_data_trials::Integer: Maximum attempts to draw a non-degenerate minimal sample in the outer loop before emitting a warning. Defaults to 100.
  • max_trials::Integer: Hard upper bound on outer-loop iterations. Acts as a safety limit; the algorithm's primary stopping criterion is the min_consensus convergence condition. Defaults to 1000.
  • min_inliers::Integer: Minimum tentative inlier count required to trigger the resampling/rescoring optimization. Defaults to 5.
  • min_consensus::Integer: Number of times an inlier set of the same size must be found before the algorithm declares convergence. Defaults to 2, such that the algorithm must find the same inlier count twice in a row. For small inlier sets (fewer than ≈ 30 expected inliers) the paper recommends increasing this value to avoid premature termination on a sub-optimal set.

Returns

  • M: The model with the largest (or, with pruning, the highest-quality) inlier set found by the algorithm.
  • inliers: Vector of last-dimension indices of x that are inliers to M.

Extended Help

Optimal RANSAC augments the standard hypothesis-then-verify loop with three additional refinement steps taken whenever a hypothesis yields more than min_inliers tentative inliers.

Resample (Algorithm 2 of Hast et al. (2013)) draws subsets of up to a quarter of the current tentative-inlier set, fits new candidate models, and rescores them against the full dataset. If a larger inlier set is found the loop restarts from that larger set; the procedure repeats up to 8 times per outer iteration.

Rescore (Algorithm 3) repeatedly re-estimates the model from the full current inlier set and rescores against all data until the inlier set stops changing or 20 iterations are exhausted.

Pruneset (Algorithm 4) is an optional final step enabled when t_search > t and residualfn is provided. It iteratively removes the point with the largest residual from the working set and re-estimates the model until every remaining point lies within the tight threshold t.

The algorithm terminates when the same inlier-set size is found min_consensus times in succession, indicating convergence to the optimal set. Unlike standard RANSAC, the stopping criterion does not depend on a statistical threshold over the inlier fraction; this makes the algorithm applicable to very highly contaminated sets (inlier ratios well below 5%).

Repeatability

The strong local-refinement steps (resample + rescore) reliably guide the search toward the global maximum from almost any starting hypothesis, making Optimal RANSAC near-deterministic across runs even with different random seeds, provided the data has a single dominant structure. Pass the same seeded rng to obtain bit-for-bit identical results.

Limitations

The algorithm is only appropriate when the model can be fitted to more than the minimal s points — the fittingfn must accept over-determined inputs. For problems with multiple competing structures of similar quality the algorithm may converge to a locally optimal set rather than the global optimum.

References

This page cites the following references:

  • Chum, O.; Matas, J. and Kittler, J. (2003). Locally Optimized RANSAC. In: Pattern Recognition, edited by Michaelis, B. and Krell, G. (Springer Berlin Heidelberg, Berlin, Heidelberg); pp. 236–243.
  • Chum, O.; Matas, J. and Obdržálek, Š. (2004). Enhancing RANSAC by Generalized Model Optimization. In: Proceedings of the Asian Conference on Computer Vision (ACCV).
  • Hast, A.; Nysjö, J. and Marchetti, A. (2013). Optimal RANSAC – Towards a Repeatable Algorithm for Finding the Optimal Set. Journal of WSCG 21, 21–30.