Corrected Ground-Truth Smoother
The corrected ground-truth smoother is a robust GTSAM batch factor graph that produces a ground-truth-like product across all 26 INSANE flights at the PX4-IMU scoring point. It exports raw formal per-pose covariance, accompanied by a standalone, empirically NEES-calibrated scoring-time position-covariance artifact (code). It preserves the shipped maintainer truth where dual-fixed RTK already makes it strong, repairs the structural attitude defect documented on Ground-Truth Attitude Quality, and bridges the outdoor-to-indoor transition gap described on The Outdoor-to-Indoor Transition Handoff. Three dependency studies feed it.
The graph carries, per regime:
- IMU preintegration as a GTSAM
CombinedImuFactoroff the Kalibr noise model, supplying inertial dynamics, gravity observability, and bias evolution. - Raw dual-RTK position factors at the
rtk_gps1/rtk_gps2antenna lever arms, each weighted by its per-row covariance with the centimetre RTK floors, and a dual-antenna baseline-heading factor. - A gravity-leveling static-attitude factor at the static epoch that repairs the shipped Wahba GPS+mag attitude’s structural gravity inconsistency, plus a static zero-velocity (ZUPT) factor.
- A barometer vertical factor fused at the actual
px4_baroevent times, with a per-keyframe pressure-bias random walk that respects the sensor’s recurring dropouts.
The dual-RTK position factors are the only outdoor heading source, so attitude is validated against held-out evidence rather than self-consistency.
Outdoor and Mars Result
On the 20 outdoor/Mars runs (19 Mars missions plus outdoor_1), the smoother reproduces the trustworthy position where dual-fixed RTK supports it and repairs the attitude where the shipped solution is gravity-inconsistent.
| Quantity | Median | Worst |
|---|---|---|
| Position RMS vs maintainer truth (m) | 0.062 | 0.351 (mars_14) |
| Static-gravity tilt, recovered (deg) | 1.16 | — |
| Static-gravity tilt improvement vs shipped (deg) | 7.13 | 53.7 (outdoor_1, 16.9 → 0.28) |
| Held-out-baseline heading residual p95 (deg) | 3.11 | 6.30 |
Heading is validated by holding out the dual-antenna baseline-heading factor over a window placed in a dual-fixed-rich region, re-solving, and scoring recovered keyframe yaw against the withheld measured baseline heading versus gap length. A gyro-rate-integral consistency witness bounds the yaw-rate increment independently, and a mars_5/6/7 shipped-attitude cross-check is applied where those runs’ attitude is trustworthy. Across 2 s, 5 s, 10 s, and 20 s held-out gaps the median recovered-yaw error p95 stays at 0.4-1.6 deg; the gyro-integral witness confirms the shape datum independently.
Covariance Calibration
The formal pose marginals are optimistic. Calibration is empirical and split by regime, fitting a position-variance inflation scalar against held-out NEES computed in the world frame (the body-frame marginal translation block rotated by R Cov R^T; the world-over-body NEES ratio is 0.96, so the choice of frame barely moves the position NEES):
| Regime | Keyframes | Raw NEES (median) | Variance inflation | Calibrated in-χ² fraction |
|---|---|---|---|---|
| Dual-fixed RTK | 1218 | 77.5 | 32.8× | 0.67 |
| Float RTK | 558 | 93.2 | 39.4× | 0.75 |
| Indoor | 420 | 0.5 | 1.0× | 0.75 |
The raw RTK marginals are roughly 33-39× too tight in variance on held-out segments. The indoor regime is the opposite: its formal covariance is rigid-body↔︎PX4-IMU-calibration-dominated and already conservative (raw NEES median 0.5, below the χ²₃ median 2.37), so under an inflate-only rule it is left at the formal marginal — a held-out segment removes truth the exported product carries, so a sub-unit scalar would publish an over-sharp ground-truth covariance. The transition in-gap pure-visual segment carries no co-temporal holdout truth, so it is excluded from the campaign-wide calibration; instead each transition export carries a separate residual-based cov_pose6_honest_ingap field alongside the raw cov_pose6_formal, inflating the formal marginal (inflate-only) to per-tag-residual NEES consistency. The campaign-wide calibration itself is a standalone per-regime artifact applied to the exported position block at scoring time; the estimation graph and trajectory are untouched.
Transition Result
The transition runs add the indoor mocap flank to the outdoor RTK flank through a jointly-solved ENU↔︎mocap transform T_enu_mocap and a rigid-body↔︎PX4-IMU calibration T_body_pximu, stereo visual-odometry between-factors mapped into the IMU body frame, and per-tag in-gap fiducial factors from the pooled board-fixed map. No transition run has co-temporal ground_truth_8hz and mocap_vehicle overlap, and the raw dual-RTK rows that overlap mocap_vehicle arrive at degraded RTK status, so closure comes from the visual and inertial chain across the gap rather than a direct truth-lane overlap.
transition_1 and transition_2 converge cleanly (backbone under Powell’s Dogleg over multifrontal QR in 12 and 16 iterations; final fiducial passes at 4 iterations each). transition_3 is the exception: its two backbone passes reach the Levenberg-Marquardt iteration cap of 100 (cost still reducing at the cap; only its final fiducial pass converges, at 31 iterations). The plain-LM backbone is kept on transition_3 on purpose — the Dogleg+QR that converges the other two settles it into a basin that opens the fiducial gate and pushes the displacement out of the witness cluster — so transition_3 is witness-supported but optimizer-limited, a lm_capped state flagged explicitly. Numerical closure of that backbone is tracked as open work.
| Run | Gap (s) | Recovered net gap displacement (m) | ENU↔︎mocap yaw p95 (deg) | Fiducial yaw p95 (deg) | Stereo yaw-rate p95 (deg/s) |
|---|---|---|---|---|---|
transition_1 |
4.97 | 3.20 | 0.23 | 2.09 | 4.31 |
transition_2 |
5.50 | 4.07 | 0.37 | 2.06 | 3.10 |
transition_3 |
11.11 | 3.21 | 0.63 | 0.75 | 3.96 |
Each recovered net gap displacement lands inside its per-run acceptance band — the independent-witness cluster set by rs_odom onboard VIO, the marker-bridge reconstruction, and the independent stereo VO (transition_3 [2.75, 3.41] m, transition_2 [4.00, 4.07] m, transition_1 [3.14, 3.20] m) — while the shipped ground_truth_80hz in-gap fill on transition_3 (~8.0 m) is a 2.48× outlier and is excluded as a scoring lane. Transition heading is validated by three physically independent through-gap consistency checks rather than self-consistency: one rigid ENU↔︎mocap yaw that reconciles the outdoor dual-RTK heading with the indoor mocap heading, stereo yaw-rate continuity against the stereo-VO between-factors, and per-tag fiducial orientation through the board-fixed map. The gap has no fully independent absolute in-gap yaw witness, so these guard against common-mode yaw error rather than supplying a second external truth.
Dependency Studies
The Computer-Vision Feasibility Probe established that cameras can carry the transition gap and gates the vision front-end. The Visual-Inertial Bundle Adjustment provides the pooled board-fixed fiducial map and the stereo relative-pose front-end. The Camera-LRF Calibration Tests resolve the laser-range-finder beam direction with an explicit uncertainty band.
Limitations and Open Work
- The crossing carries two physically distinct error processes, modeled separately. Near the building (the outdoor flank, exposed by
transition_2) the limiting error is RTK bias carrying false confidence: the reported RTK covariance saturates at its ~1.4 cm horizontal floor while the true position σ rises to ~0.6–0.7 m at the Dronehall entrance. A distance-ramped spatial RTK σ-floor inside a 4 m door band — max-combined with the reported σ so it only ever widens, keyed to the near-door dual-fixed signature rather than a run name — cut thetransition_2outdoor-flank truth RMS from 0.39 m to 0.16 m by no longer over-trusting the near-door fixes. The fiducial consistency gate has a narrow usable window (near ~1.5 m; at ≥3 m it admits mis-registered frames), so the residual fragility is a fiducial-system property — the gate and the absolute board→ENU registration — not a missing in-gap witness. - Inside the gap (exposed by the deep
transition_3gap) the limiting error is the absolute registration of the pooled fiducial field into the world frame. A corner-level fiducial bundle adjustment was evaluated for map-tightening and ruled out: the pooled board map is already ~1 cm metric and each in-gap frame’s multi-tag PnP agrees on the camera pose to ~3 cm, so map quality was never the bottleneck — the residual is IMU/VO drift plus board→ENU registration drift across the unpinned gap, which a tighter map cannot move. (Whether a per-frame multi-tag PnP versus single-square IPPE choice moves thetransition_3in-gap solve is a distinct, still-open question, not a map-tightening one.) - A level-then-yaw decomposition of the ENU↔︎mocap registration was tested as an A/B diagnostic and is benign, not a negative result: on
transition_1/transition_2it matches the plain SO(3)-mean registration to ~1e-5, and on atransition_3rerun that isolates the seed effect the two seeds differ by 0.2 mm in gap displacement (2.75250 vs 2.75270 m) and 0.003° in theT_enu_mocapyaw, with both converging. It is retained as a diagnostic of the solved registration; the plain SO(3)-mean registration ships. - The three ground UWB anchors (mounted on the door fiducial boards) and the airframe UWB-tag lever arm — absent from the shipped calibration — are recovered, closing the previously-unobservable UWB geometry. As an in-gap factor it is dominated, though: the pooled fiducial map already pins the in-gap position to ~1 cm while the recovered anchors carry ~0.3 m range σ, and the door anchors go non-line-of-sight inside the gap once the vehicle crosses the doorway. The board-fixed fiducial map at millimetre-to-centimetre accuracy dominates both non-vision in-gap witnesses: the LRF nadir-range factor is likewise dominated (the in-gap vertical is the weak axis but fiducial-pinned to 16 mm against the LRF’s ~50-137 mm range prediction), and the nav↔︎stereo cameras agree to 7-13 cm in the board frame on the approach but see disjoint boards at disjoint times. UWB, LRF, and the cross-camera tie therefore stay validation cross-checks, not factors.
- The indoor-only product is complete:
indoor_1/2/3run the same batch graph with mocap through the solved registration, exported at the PX4-IMU scoring point in a gravity-leveled ENU (z-axis gravity-up; yaw and origin the conventional gauge, since there is no GNSS indoors), reproducing the mocap truth to 5.8–7.2 mm RMS and bridging held-out 2.5 s mocap windows from inertial support to ~19 mm RMS. All 26 flights now carry a product with raw formal per-pose marginals plus the standalone per-regime scoring-time position-covariance calibration; the open covariance question is whether a sharper-but-honest covariance is recoverable for the transition in-gap pure-visual segment, which carries no co-temporal truth.