Visual-Inertial Bundle Adjustment

The visual-inertial bundle adjustment provides the fiducial front-end and the stereo relative-pose constraints the smoother consumes across the transition (code). It feeds the Corrected Ground-Truth Smoother and builds on the marker chain described under Maintainer-Generated Components. dir-5 ingests it on the transition cohort to span the outdoor-to-indoor handover, where ground_truth_8hz and mocap_vehicle have no co-temporal overlap.

Result

The transition fiducial field is one fixed physical rig. The mocap_tag_board body pose agrees to 0.61 mm and 0.055 deg across the three transition runs (worst pairwise), so the field does not move between runs. The maintainers’ three per-run tag_calibration.mat estimates of that same field nevertheless disagree board-to-board at 13.92 cm and 3.25 deg, and against the pooled map at 44.9 mm RMS / 79.4 mm p95. Because the rig itself is fixed, that disagreement is estimator scatter, not scene motion.
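A worst-pairwise agreement figure of this kind can be computed as in the sketch below. It assumes each run's board pose is available as a 4x4 homogeneous transform in a common world frame; the helper names (make_pose, pose_disagreement, worst_pairwise) and the demo values are hypothetical and are not taken from the dataset or its tooling.

```python
import itertools
import numpy as np

def make_pose(yaw_deg, translation):
    """Build a 4x4 pose from a yaw angle (degrees) and a translation (metres)."""
    a = np.radians(yaw_deg)
    T = np.eye(4)
    T[:3, :3] = [[np.cos(a), -np.sin(a), 0.0],
                 [np.sin(a),  np.cos(a), 0.0],
                 [0.0,        0.0,       1.0]]
    T[:3, 3] = translation
    return T

def pose_disagreement(T_a, T_b):
    """Return (translation distance, rotation angle in degrees) between two poses."""
    trans = np.linalg.norm(T_b[:3, 3] - T_a[:3, 3])
    R_rel = T_a[:3, :3].T @ T_b[:3, :3]
    # Rotation angle from the trace; clip guards against round-off outside [-1, 1].
    cos_angle = np.clip((np.trace(R_rel) - 1.0) / 2.0, -1.0, 1.0)
    return trans, np.degrees(np.arccos(cos_angle))

def worst_pairwise(poses):
    """Largest translation and rotation disagreement over all pose pairs."""
    pairs = [pose_disagreement(a, b) for a, b in itertools.combinations(poses, 2)]
    return max(p[0] for p in pairs), max(p[1] for p in pairs)

# Illustrative poses only (not dataset values).
runs = [make_pose(0.0, [0.0, 0.0, 0.0]),
        make_pose(2.0, [0.003, 0.0, 0.0]),
        make_pose(5.0, [0.0, 0.004, 0.0])]
worst_t, worst_r = worst_pairwise(runs)
print(f"worst pairwise: {worst_t * 1000:.2f} mm, {worst_r:.3f} deg")
```

The translation and rotation maxima are taken independently, so they may come from different run pairs.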

That fixed-rig premise licenses one shared, board-fixed fiducial map. The map is built once in the mocap_tag_board frame, pooled across all three transitions. Inter-tag relative poses measured within a single frame give the rig’s board-fixed metric structure without any reference to the vehicle, while the per-run vehicle-overlap detections anchor the map to the board frame. Both fuse in a single SE(3) pose graph (one node per tag, a robust between-factor per pooled inter-tag edge, a weak absolute prior per anchored tag), so a tag seen strongly on one run carries onto the runs that see it weakly.
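The graph structure can be illustrated with a deliberately simplified sketch. It is translation-only and uses a plain quadratic loss solved by linear least squares, so it is not the SE(3) graph with robust between-factors described above; it only shows how between-factors plus a weak prior on an anchored tag determine the position of tags that carry no anchor themselves. The function name, weights, and demo values are assumptions for illustration.

```python
import numpy as np

def solve_translation_graph(n_tags, edges, priors, prior_weight=0.1):
    """Least-squares tag positions from relative offsets and weak absolute priors.

    edges:  iterable of (i, j, offset, weight), meaning p_j - p_i ~= offset
    priors: iterable of (i, position) for anchored tags
    """
    rows, rhs = [], []
    for i, j, offset, w in edges:
        for axis in range(3):
            row = np.zeros(3 * n_tags)
            row[3 * j + axis] = w
            row[3 * i + axis] = -w
            rows.append(row)
            rhs.append(w * offset[axis])
    for i, pos in priors:
        for axis in range(3):
            row = np.zeros(3 * n_tags)
            row[3 * i + axis] = prior_weight
            rows.append(row)
            rhs.append(prior_weight * pos[axis])
    sol, *_ = np.linalg.lstsq(np.array(rows), np.array(rhs), rcond=None)
    return sol.reshape(n_tags, 3)

# Three tags in a chain; only tag 0 is anchored (illustrative values).
edges = [(0, 1, [0.5, 0.0, 0.0], 1.0),
         (1, 2, [0.0, 0.5, 0.0], 1.0)]
priors = [(0, [1.0, 0.0, 0.0])]
positions = solve_translation_graph(3, edges, priors)
print(positions)
```

Tags 1 and 2 have no prior of their own; their positions follow from the anchored tag through the inter-tag edges, which is the mechanism that lets a well-observed tag support weakly observed ones.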

| Map property | Value |
| --- | --- |
| Tags mapped (anchored) | 78 (28) |
| Per-tag pooled support, min / median | 132 / 3096 |
| Board-8 metric self-fit RMS vs the dir-9 layout | 1.0 cm |
| VR-7 leave-tag-out held-out RMS | 2.3 cm |
| VR-7 leave-board-out held-out RMS | 6.5 cm |

Before any run trusts the map, it is gated as metric by fitting board 8 (tags 97-107) against the known board layout; a small residual (1.0 cm RMS) means the pooled sub-map reproduces the known inter-tag spacing. The map is then validated out-of-sample against the vehicle mocap_tag_board measurement: withholding one tag’s absolute anchor and relocalizing it from the rest gives 2.3 cm RMS, and withholding board 8 entirely gives 6.5 cm RMS on a board the absolute anchoring never used. At least four tags are detected on every gap frame (99 / 110 / 222), with a median of 11-17 tags per frame at 0.14 px reprojection error; the stereo front-end yields roughly 1300 metric points per frame.
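One way to realize a metric self-fit of this kind is a rigid (rotation plus translation, no scale) alignment of the estimated tag centers to the known layout, followed by an RMS residual. The sketch below uses the standard Kabsch/SVD alignment as an assumed stand-in; the source does not state which fitting method is used, and the function name and demo points are hypothetical. Because no scale is estimated, any scale error in the map shows up directly in the residual.

```python
import numpy as np

def rigid_fit_rms(estimated, layout):
    """RMS residual after the best rigid alignment of `estimated` onto `layout`.

    Both inputs are (N, 3) arrays of corresponding points, in metres.
    """
    est = np.asarray(estimated, dtype=float)
    ref = np.asarray(layout, dtype=float)
    est_c = est - est.mean(axis=0)
    ref_c = ref - ref.mean(axis=0)
    # Kabsch: SVD of the cross-covariance between the centered point sets.
    U, _, Vt = np.linalg.svd(est_c.T @ ref_c)
    d = np.sign(np.linalg.det(Vt.T @ U.T))  # guard against a reflection
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    aligned = est_c @ R.T + ref.mean(axis=0)
    return float(np.sqrt(np.mean(np.sum((aligned - ref) ** 2, axis=1))))

# Illustrative planar layout of four tag centers (not the real board layout).
layout = np.array([[0.0, 0.0, 0.0], [0.2, 0.0, 0.0],
                   [0.0, 0.2, 0.0], [0.2, 0.2, 0.0]])
a = np.radians(30.0)
Rz = np.array([[np.cos(a), -np.sin(a), 0.0],
               [np.sin(a),  np.cos(a), 0.0],
               [0.0,        0.0,       1.0]])
rms_rigid = rigid_fit_rms(layout @ Rz.T + [1.0, -2.0, 0.5], layout)
rms_scaled = rigid_fit_rms(1.1 * layout, layout)
print(rms_rigid, rms_scaled)
```

A rigidly moved copy of the layout fits with essentially zero residual, while a copy with a 10 % scale error does not, which is the property a metric gate relies on.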

No tag id is shared across the immediate handover flanks, and transition_1 frequently stays planar-PnP degenerate despite continuous tag visibility. The fiducials therefore feed dir-5 as per-tag, absolutely anchored in-gap localization rather than as a direct tag-only outdoor-to-indoor bridge. The stereo front-end maps each between-factor into the IMU body frame. Where a referenced transition stereo pair is absent on disk (8 / 6 / 4 missing pairs across the three runs), or a frame is corrupt (the known transition_3 frame), it drops the affected pair and continues.
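Mapping a camera-frame relative pose into the body frame is a conjugation by the camera-to-body extrinsic. The sketch below assumes 4x4 homogeneous transforms and the convention that T_body_cam is the pose of the camera expressed in the IMU body frame; the function names and all numeric values are hypothetical, not the dataset's calibration.

```python
import numpy as np

def make_pose(axis, angle_deg, translation):
    """4x4 pose from an axis-angle rotation (Rodrigues formula) and a translation."""
    k = np.asarray(axis, dtype=float)
    k = k / np.linalg.norm(k)
    K = np.array([[0.0, -k[2], k[1]],
                  [k[2], 0.0, -k[0]],
                  [-k[1], k[0], 0.0]])
    a = np.radians(angle_deg)
    T = np.eye(4)
    T[:3, :3] = np.eye(3) + np.sin(a) * K + (1.0 - np.cos(a)) * (K @ K)
    T[:3, 3] = translation
    return T

def camera_to_body_relative(T_body_cam, T_cam_rel):
    """Express a camera-frame relative pose as the equivalent body-frame one.

    With T_w_cam = T_w_body @ T_body_cam, the relative motions satisfy
    T_body_rel = T_body_cam @ T_cam_rel @ inv(T_body_cam).
    """
    return T_body_cam @ T_cam_rel @ np.linalg.inv(T_body_cam)

# Hypothetical extrinsic and two body poses, used only to check the identity.
T_body_cam = make_pose([1, 0, 0], -90.0, [0.10, 0.0, 0.05])
T_w_b1 = make_pose([0, 0, 1], 10.0, [1.0, 2.0, 0.0])
T_w_b2 = make_pose([0, 0, 1], 25.0, [1.5, 2.2, 0.0])

# Relative pose as the camera would measure it between the two instants.
T_cam_rel = np.linalg.inv(T_w_b1 @ T_body_cam) @ (T_w_b2 @ T_body_cam)
T_body_rel = camera_to_body_relative(T_body_cam, T_cam_rel)
T_body_rel_expected = np.linalg.inv(T_w_b1) @ T_w_b2
print(np.allclose(T_body_rel, T_body_rel_expected))
```

If the extrinsic is stored with the opposite convention (body expressed in the camera frame), the transform and its inverse swap places in the conjugation.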

Available But Not Yet Exploited

Each board carries small edge markers intended for higher-accuracy board-to-board calibration, and the three UWB ground anchors are mounted on boards 1, 2, and 7 at the door. Both are observable in the nav camera and, side-on, in the forward stereo camera, whose field of view overlaps the nav camera’s above roughly 2 m. These are available structure the present map does not yet use; the planned corner-level fiducial bundle adjustment that tightens the absolute board→ENU registration on transition_3 is the natural place to fold them in.