Progressive Keypoint Localization and Refinement in Image Matching
Luca Morelli; Fabio Remondino
2024-01-01
Abstract
Image matching is the core of many computer vision applications for cultural heritage. The standard image matching pipeline detects keypoints at the start and keeps them fixed until bundle adjustment, where they are allowed to move in order to improve the overall scene estimation. Recent deep image matching approaches do not follow this scheme, which was historically imposed by computational limits, and instead progressively refine the localization of the matches in a coarse-to-fine manner. This paper investigates the use of traditional computer vision approaches based on template matching to update the keypoint positions throughout the whole matching pipeline. To improve the accuracy of template matching, coarse-to-fine refinement is explored and a novel normalization strategy for the local keypoint patches is designed. Specifically, the proposed patch normalization assumes a local piecewise planar approximation of the scene and warps the corresponding patches according to a “middle homography”, so that, after normalization, patch distortion is roughly equally distributed between the two original patches. The experimental comparison of the considered approaches, mainly focused on cultural heritage scenes but straightforwardly generalizable to other common scenarios, shows the strengths and limitations of each evaluated method. This analysis indicates promising results for the investigated approaches, which can be effectively deployed to design better image matching solutions.
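The abstract does not say how the “middle homography” is built, so the sketch below is only one plausible reading and not necessarily the authors' construction. It assumes that a local homography `H` mapping patch 1 onto patch 2 is already known, and splits it into two equal halves through a matrix square root, so that warping patch 1 by `H_half` and patch 2 by its inverse brings both patches into a common intermediate frame. The function names `split_homography` and `apply_h` are hypothetical, and the square root is computed with the Denman-Beavers iteration, a standard numerical method chosen here for illustration.

```python
import numpy as np


def split_homography(H, iters=50, tol=1e-12):
    """Return (H_half, H_half_inv) such that H_half @ H_half equals H up to scale.

    Assumption: H is an invertible 3x3 homography with no eigenvalues on the
    negative real axis, which is the usual case for a moderate viewpoint change.
    """
    H = np.asarray(H, dtype=float)
    det = np.linalg.det(H)
    if abs(det) < 1e-12:
        raise ValueError("homography must be invertible")

    # A homography is defined up to scale: rescale it so that det = 1.
    Y = H / np.cbrt(det)
    Z = np.eye(3)

    # Denman-Beavers iteration: Y converges to the square root of the
    # normalized H, Z converges to the inverse of that square root.
    for _ in range(iters):
        Y_next = 0.5 * (Y + np.linalg.inv(Z))
        Z_next = 0.5 * (Z + np.linalg.inv(Y))
        converged = np.linalg.norm(Y_next - Y) < tol
        Y, Z = Y_next, Z_next
        if converged:
            break
    return Y, Z


def apply_h(H, pts):
    """Apply a 3x3 homography to an (N, 2) array of points."""
    pts = np.asarray(pts, dtype=float)
    homogeneous = np.hstack([pts, np.ones((len(pts), 1))]) @ H.T
    return homogeneous[:, :2] / homogeneous[:, 2:3]


if __name__ == "__main__":
    # Hypothetical local homography between two corresponding patches.
    H = np.array([[1.2, 0.1, 5.0],
                  [-0.05, 0.9, -3.0],
                  [1e-4, 2e-4, 1.0]])
    H_half, H_half_inv = split_homography(H)

    p1 = np.array([[0.0, 0.0], [10.0, -4.0]])  # points in patch 1
    p2 = apply_h(H, p1)                        # corresponding points in patch 2

    # Both sets of points land on the same location in the middle frame.
    print(apply_h(H_half, p1))
    print(apply_h(H_half_inv, p2))
```

Under this reading, each patch undergoes only "half" of the full transformation instead of one patch absorbing all of it, which is the balancing effect the abstract describes. Template matching would then be run between the two normalized patches, and the refined keypoint positions mapped back to the original images through the inverse warps.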