

Exploring modern end-to-end AI-based multi-view 3D reconstruction

June Moh Goo;Zichao Zeng;Luca Morelli;Fabio Remondino;Jan Boehm

2025-01-01

Abstract

Deriving accurate 3D geometry from multi-view 2D imagery remains a fundamental problem in photogrammetry and computer vision. Conventional pipelines, comprising feature extraction, image matching, bundle adjustment and dense reconstruction, are grounded in well-established geometric principles but remain sensitive to challenging conditions such as strong illumination changes, lack of texture and large variations in viewing angle. Recent developments in deep learning have triggered a paradigm shift, reformulating multi-view 3D reconstruction as a data-driven, end-to-end optimization problem. Neural architectures now jointly learn feature representations, correspondence estimation and geometric reasoning, supported by large-scale training datasets, high-performance GPU computation, transformer networks and differentiable rendering frameworks. This study systematically examines the transition from traditional photogrammetric approaches to end-to-end AI-based reconstruction pipelines. Using benchmark geomatics datasets, we quantitatively evaluate two recent and representative end-to-end deep learning methods against classical photogrammetry. The results highlight the performance of AI-driven approaches in 3D reconstruction and their limits in large-scale, metric-oriented mapping and modeling applications.


Year

2025

Appears in categories:

4.1 Contribution in conference proceedings


Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11582/369888
