ActLoc: Learning to Localize on the Move via Active Viewpoint Selection

CoRL 2025

* equal contribution
¹ETH Zürich, ²Sapienza University of Rome, ³University of Bonn, ⁴Microsoft

Abstract

Reliable localization is critical for robot navigation, yet most existing systems implicitly assume that all viewing directions at a location are equally informative. In practice, localization becomes unreliable when the robot observes unmapped, ambiguous, or uninformative regions. To address this, we present ActLoc, an active, viewpoint-aware planning framework that improves localization accuracy in general robot navigation tasks. At its core, ActLoc employs an attention-based model, trained at large scale, for viewpoint selection. The model encodes a metric map together with the camera poses used during map construction, and predicts localization accuracy across yaw and pitch directions at arbitrary 3D locations. These per-point accuracy distributions are incorporated into a path planner, enabling the robot to actively select camera orientations that maximize localization robustness while respecting task and motion constraints. ActLoc achieves state-of-the-art results on single-viewpoint selection and generalizes effectively to full-trajectory planning. Its modular design makes it readily applicable to diverse robot navigation and inspection tasks.

Teaser

Method

ActLoc takes as input the 3D landmarks and reconstructed camera poses produced by Structure-from-Motion. At each waypoint, both inputs are transformed into an egocentric frame and locally cropped to focus on nearby geometry. The two modalities (landmarks and camera poses) are encoded independently with self-attention and then fused through bidirectional cross-attention. The network outputs a LocMap, a yaw-pitch grid of localization performance distributions. During planning, the LocMap is combined with a smoothness term to form a mixed cost map. The planner then selects viewpoints along the waypoint sequence that maximize localization reliability while keeping the camera motion continuous.
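To make the planning step concrete, here is a minimal sketch of how a per-waypoint LocMap and a smoothness term could be combined to choose one viewpoint per waypoint. It rests on several assumptions that the text above does not state: each LocMap is collapsed to a single score in [0, 1] per yaw-pitch bin, the mixed cost is `(1 - score)` plus a weighted angular change between consecutive viewpoints, and the sequence is solved with Viterbi-style dynamic programming. The names `plan_viewpoints` and `smooth_weight` are hypothetical, and the actual ActLoc planner may differ.

```python
import numpy as np


def plan_viewpoints(loc_maps, yaw_deg, pitch_deg, smooth_weight=0.01):
    """Pick one (yaw_idx, pitch_idx) per waypoint by dynamic programming.

    loc_maps:      (T, Y, P) array, assumed localization score per bin
                   (higher is better, values in [0, 1]).
    yaw_deg:       (Y,) yaw bin centers in degrees (wraps at 360).
    pitch_deg:     (P,) pitch bin centers in degrees.
    smooth_weight: assumed trade-off between localization and smoothness.
    """
    loc_maps = np.asarray(loc_maps, dtype=float)
    T, Y, P = loc_maps.shape
    K = Y * P

    # Unary cost: poorly localizable bins are expensive.
    unary = (1.0 - loc_maps).reshape(T, K)

    # Pairwise cost: angular change between every pair of bins.
    yy, pp = np.meshgrid(np.asarray(yaw_deg, dtype=float),
                         np.asarray(pitch_deg, dtype=float),
                         indexing="ij")
    yy, pp = yy.ravel(), pp.ravel()
    dyaw = np.abs(yy[:, None] - yy[None, :])
    dyaw = np.minimum(dyaw, 360.0 - dyaw)  # yaw is circular
    dpitch = np.abs(pp[:, None] - pp[None, :])
    pairwise = smooth_weight * (dyaw + dpitch)  # shape (K, K)

    # Forward pass: cost[j] is the best total cost of ending in bin j.
    cost = unary[0].copy()
    back = np.zeros((T, K), dtype=int)
    for t in range(1, T):
        total = cost[:, None] + pairwise  # total[i, j]: from bin i to bin j
        back[t] = np.argmin(total, axis=0)
        cost = total[back[t], np.arange(K)] + unary[t]

    # Backward pass: recover the best sequence of bins.
    k = int(np.argmin(cost))
    path = [k]
    for t in range(T - 1, 0, -1):
        k = int(back[t, k])
        path.append(k)
    path.reverse()

    # Flat index k corresponds to (yaw_idx, pitch_idx) = divmod(k, P).
    return [divmod(k, P) for k in path]
```

With `smooth_weight=0` the sketch reduces to picking the best-scoring bin at each waypoint independently. Larger values discourage large camera rotations between consecutive waypoints, even when a distant bin scores slightly higher.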

System Overview

BibTeX

@misc{li2025actloclearninglocalizeactive,
      title={ActLoc: Learning to Localize on the Move via Active Viewpoint Selection}, 
      author={Jiajie Li and Boyang Sun and Luca Di Giammarino and Hermann Blum and Marc Pollefeys},
      year={2025},
      eprint={2508.20981},
      archivePrefix={arXiv},
      primaryClass={cs.RO},
      url={https://arxiv.org/abs/2508.20981}, 
}