Corrected Ground-Truth Smoother

A corrected ground-truth-like product across all 26 INSANE flights at the PX4-IMU scoring point, vetted against the maintainer ground truth.

The corrected ground-truth smoother is a robust GTSAM batch factor graph that produces a ground-truth-like product across all 26 INSANE flights at the PX4-IMU scoring point. It exports raw formal per-pose covariance, accompanied by a standalone, empirically NEES-calibrated scoring-time position-covariance artifact (code). It preserves the shipped maintainer truth where dual-fixed RTK already makes it strong, repairs the structural attitude defect documented on Ground-Truth Attitude Quality, and bridges the outdoor-to-indoor transition gap described on The Outdoor-to-Indoor Transition Handoff. Three dependency studies feed it.

The graph carries, per regime:

The dual-RTK position factors are the only outdoor heading source, so attitude is validated against held-out evidence rather than self-consistency.

Outdoor and Mars Result

On the 20 outdoor/Mars runs (19 Mars missions plus outdoor_1), the smoother reproduces the trustworthy position where dual-fixed RTK supports it and repairs the attitude where the shipped solution is gravity-inconsistent.

Quantity Median Worst
Position RMS vs maintainer truth (m) 0.062 0.351 (mars_14)
Static-gravity tilt, recovered (deg) 1.16
Static-gravity tilt improvement vs shipped (deg) 7.13 53.7 (outdoor_1, 16.9 → 0.28)
Held-out-baseline heading residual p95 (deg) 3.11 6.30

Heading is validated by holding out the dual-antenna baseline-heading factor over a window placed in a dual-fixed-rich region, re-solving, and scoring recovered keyframe yaw against the withheld measured baseline heading versus gap length. A gyro-rate-integral consistency witness bounds the yaw-rate increment independently, and a mars_5/6/7 shipped-attitude cross-check is applied where those runs’ attitude is trustworthy. Across 2 s, 5 s, 10 s, and 20 s held-out gaps the median recovered-yaw error p95 stays at 0.4-1.6 deg; the gyro-integral witness confirms the shape datum independently.

Covariance Calibration

The formal pose marginals are optimistic. Calibration is empirical and split by regime, fitting a position-variance inflation scalar against held-out NEES computed in the world frame (the body-frame marginal translation block rotated by R Cov R^T; the world-over-body NEES ratio is 0.96, so the choice of frame barely moves the position NEES):

Regime Keyframes Raw NEES (median) Variance inflation Calibrated in-χ² fraction
Dual-fixed RTK 1218 77.5 32.8× 0.67
Float RTK 558 93.2 39.4× 0.75
Indoor 420 0.5 1.0× 0.75

The raw RTK marginals are roughly 33-39× too tight in variance on held-out segments. The indoor regime is the opposite: its formal covariance is rigid-body↔︎PX4-IMU-calibration-dominated and already conservative (raw NEES median 0.5, below the χ²₃ median 2.37), so under an inflate-only rule it is left at the formal marginal — a held-out segment removes truth the exported product carries, so a sub-unit scalar would publish an over-sharp ground-truth covariance. The transition in-gap pure-visual segment carries no co-temporal holdout truth, so it is excluded from the campaign-wide calibration; instead each transition export carries a separate residual-based cov_pose6_honest_ingap field alongside the raw cov_pose6_formal, inflating the formal marginal (inflate-only) to per-tag-residual NEES consistency. The campaign-wide calibration itself is a standalone per-regime artifact applied to the exported position block at scoring time; the estimation graph and trajectory are untouched.

Transition Result

The transition runs add the indoor mocap flank to the outdoor RTK flank through a jointly-solved ENU↔︎mocap transform T_enu_mocap and a rigid-body↔︎PX4-IMU calibration T_body_pximu, stereo visual-odometry between-factors mapped into the IMU body frame, and per-tag in-gap fiducial factors from the pooled board-fixed map. No transition run has co-temporal ground_truth_8hz and mocap_vehicle overlap, and the raw dual-RTK rows that overlap mocap_vehicle arrive at degraded RTK status, so closure comes from the visual and inertial chain across the gap rather than a direct truth-lane overlap.

transition_1 and transition_2 converge cleanly (backbone under Powell’s Dogleg over multifrontal QR in 12 and 16 iterations; final fiducial passes at 4 iterations each). transition_3 is the exception: its two backbone passes reach the Levenberg-Marquardt iteration cap of 100 (cost still reducing at the cap; only its final fiducial pass converges, at 31 iterations). The plain-LM backbone is kept on transition_3 on purpose — the Dogleg+QR that converges the other two settles it into a basin that opens the fiducial gate and pushes the displacement out of the witness cluster — so transition_3 is witness-supported but optimizer-limited, a lm_capped state flagged explicitly. Numerical closure of that backbone is tracked as open work.

Run Gap (s) Recovered net gap displacement (m) ENU↔︎mocap yaw p95 (deg) Fiducial yaw p95 (deg) Stereo yaw-rate p95 (deg/s)
transition_1 4.97 3.20 0.23 2.09 4.31
transition_2 5.50 4.07 0.37 2.06 3.10
transition_3 11.11 3.21 0.63 0.75 3.96

Each recovered net gap displacement lands inside its per-run acceptance band — the independent-witness cluster set by rs_odom onboard VIO, the marker-bridge reconstruction, and the independent stereo VO (transition_3 [2.75, 3.41] m, transition_2 [4.00, 4.07] m, transition_1 [3.14, 3.20] m) — while the shipped ground_truth_80hz in-gap fill on transition_3 (~8.0 m) is a 2.48× outlier and is excluded as a scoring lane. Transition heading is validated by three physically independent through-gap consistency checks rather than self-consistency: one rigid ENU↔︎mocap yaw that reconciles the outdoor dual-RTK heading with the indoor mocap heading, stereo yaw-rate continuity against the stereo-VO between-factors, and per-tag fiducial orientation through the board-fixed map. The gap has no fully independent absolute in-gap yaw witness, so these guard against common-mode yaw error rather than supplying a second external truth.

Dependency Studies

The Computer-Vision Feasibility Probe established that cameras can carry the transition gap and gates the vision front-end. The Visual-Inertial Bundle Adjustment provides the pooled board-fixed fiducial map and the stereo relative-pose front-end. The Camera-LRF Calibration Tests resolve the laser-range-finder beam direction with an explicit uncertainty band.

Limitations and Open Work

  • The crossing carries two physically distinct error processes, modeled separately. Near the building (the outdoor flank, exposed by transition_2) the limiting error is RTK bias carrying false confidence: the reported RTK covariance saturates at its ~1.4 cm horizontal floor while the true position σ rises to ~0.6–0.7 m at the Dronehall entrance. A distance-ramped spatial RTK σ-floor inside a 4 m door band — max-combined with the reported σ so it only ever widens, keyed to the near-door dual-fixed signature rather than a run name — cut the transition_2 outdoor-flank truth RMS from 0.39 m to 0.16 m by no longer over-trusting the near-door fixes. The fiducial consistency gate has a narrow usable window (near ~1.5 m; at ≥3 m it admits mis-registered frames), so the residual fragility is a fiducial-system property — the gate and the absolute board→ENU registration — not a missing in-gap witness.
  • Inside the gap (exposed by the deep transition_3 gap) the limiting error is the absolute registration of the pooled fiducial field into the world frame. A corner-level fiducial bundle adjustment was evaluated for map-tightening and ruled out: the pooled board map is already ~1 cm metric and each in-gap frame’s multi-tag PnP agrees on the camera pose to ~3 cm, so map quality was never the bottleneck — the residual is IMU/VO drift plus board→ENU registration drift across the unpinned gap, which a tighter map cannot move. (Whether a per-frame multi-tag PnP versus single-square IPPE choice moves the transition_3 in-gap solve is a distinct, still-open question, not a map-tightening one.)
  • A level-then-yaw decomposition of the ENU↔︎mocap registration was tested as an A/B diagnostic and is benign, not a negative result: on transition_1/transition_2 it matches the plain SO(3)-mean registration to ~1e-5, and on a transition_3 rerun that isolates the seed effect the two seeds differ by 0.2 mm in gap displacement (2.75250 vs 2.75270 m) and 0.003° in the T_enu_mocap yaw, with both converging. It is retained as a diagnostic of the solved registration; the plain SO(3)-mean registration ships.
  • The three ground UWB anchors (mounted on the door fiducial boards) and the airframe UWB-tag lever arm — absent from the shipped calibration — are recovered, closing the previously-unobservable UWB geometry. As an in-gap factor it is dominated, though: the pooled fiducial map already pins the in-gap position to ~1 cm while the recovered anchors carry ~0.3 m range σ, and the door anchors go non-line-of-sight inside the gap once the vehicle crosses the doorway. The board-fixed fiducial map at millimetre-to-centimetre accuracy dominates both non-vision in-gap witnesses: the LRF nadir-range factor is likewise dominated (the in-gap vertical is the weak axis but fiducial-pinned to 16 mm against the LRF’s ~50-137 mm range prediction), and the nav↔︎stereo cameras agree to 7-13 cm in the board frame on the approach but see disjoint boards at disjoint times. UWB, LRF, and the cross-camera tie therefore stay validation cross-checks, not factors.
  • The indoor-only product is complete: indoor_1/2/3 run the same batch graph with mocap through the solved registration, exported at the PX4-IMU scoring point in a gravity-leveled ENU (z-axis gravity-up; yaw and origin the conventional gauge, since there is no GNSS indoors), reproducing the mocap truth to 5.8–7.2 mm RMS and bridging held-out 2.5 s mocap windows from inertial support to ~19 mm RMS. All 26 flights now carry a product with raw formal per-pose marginals plus the standalone per-regime scoring-time position-covariance calibration; the open covariance question is whether a sharper-but-honest covariance is recoverable for the transition in-gap pure-visual segment, which carries no co-temporal truth.