Scale-aware Co-visible Region Detection for Image Matching

1 State Key Laboratory of Information Engineering in Surveying, Mapping and Remote Sensing (LIESMARS), Wuhan University
2 The VITA Lab, École Polytechnique Fédérale de Lausanne (EPFL)
ISPRS Journal of Photogrammetry and Remote Sensing
*Corresponding Author
Figure 1: Comparison between direct point matching and hierarchical image matching under drastic scale variation. We define the scale ratio as the ratio of the diagonal lengths of the co-visible regions in an image pair. (a) Traditional methods directly establish point-level correspondences from feature points. (b) Our method introduces an intermediate region-level correspondence via co-visible region detection, forming a hierarchical matching strategy.

Figure 2: The SCoDe architecture. Features are fed into a transformer equipped with Scale Head Attention, and the outputs pass through a decoder to produce co-visible region predictions.

Abstract

Matching images with significant scale differences remains a persistent challenge in photogrammetry and remote sensing: the scale discrepancy degrades appearance consistency and introduces uncertainty into keypoint localization. Existing methods mitigate scale variation through scale pyramids or scale-aware training, yet matching under drastic scale differences remains unsolved. We instead address the problem by detecting co-visible regions between image pairs and propose SCoDe (Scale-aware Co-visible region Detector), which both identifies co-visible regions and aligns their scales, enabling highly robust, hierarchical point matching. Specifically, SCoDe employs a novel Scale Head Attention mechanism that maps and correlates features across multiple scale subspaces, and uses a learnable query to aggregate scale-aware information from both images for co-visible region detection. Correspondences can thus be established in a coarse-to-fine hierarchy, mitigating both semantic and localization uncertainty. Extensive experiments on three challenging datasets demonstrate that SCoDe outperforms state-of-the-art methods, improving the precision of a modern local feature matcher by 8.41%. Notably, SCoDe shows a clear advantage on images with drastic scale variations.
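The two ideas named in the abstract, per-head projection into scale subspaces and a learnable query that pools cross-image evidence, can be sketched as follows. This is a minimal NumPy illustration under stated assumptions, not the paper's implementation: the actual projection, value, and decoder parameterizations are not given here, and all function and variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def scale_head_attention(fa, fb, projections, query):
    """Hypothetical sketch of a Scale-Head-Attention-style block.

    fa: (Na, D) features of image A; fb: (Nb, D) features of image B.
    projections: one (D, d) matrix per scale head, defining its scale subspace.
    query: (d,) learnable vector that pools scale-aware evidence.
    Returns a (H * d,) descriptor that a decoder could map to a
    co-visible region prediction.
    """
    pooled = []
    for P in projections:
        qa, kb = fa @ P, fb @ P                # map both images into this head's subspace
        attn = softmax(qa @ kb.T / np.sqrt(P.shape[1]))  # cross-image correlation
        ctx = attn @ kb                        # A locations enriched with B evidence
        w = softmax(ctx @ query)               # learnable query scores each location
        pooled.append(w @ ctx)                 # (d,) summary for this scale head
    return np.concatenate(pooled)

# Toy usage: 4 scale heads over random features.
D, d, H = 32, 8, 4
fa = rng.standard_normal((100, D))
fb = rng.standard_normal((80, D))
projections = [rng.standard_normal((D, d)) for _ in range(H)]
query = rng.standard_normal(d)
desc = scale_head_attention(fa, fb, projections, query)
print(desc.shape)  # (32,) = H * d
```

The design point the sketch illustrates is that each head attends in its own projected subspace, so correlation is computed per scale rather than in one shared feature space, and the query vector turns the variable-length attention output into a fixed-size, scale-aware summary.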

BibTeX

@article{pan2025scale,
  title={Scale-aware co-visible region detection for image matching},
  author={Pan, Xu and Xia, Zimin and Zheng, Xianwei},
  journal={ISPRS Journal of Photogrammetry and Remote Sensing},
  volume={229},
  pages={122--137},
  year={2025},
  publisher={Elsevier}
}