High/very-high-resolution (HR/VHR) multitemporal images are important in remote sensing to monitor the dynamics of the Earth’s surface. Unsupervised object-based image analysis provides an effective solution to analyze such images. Image semantic segmentation assigns pixel labels from meaningful object groups and has been extensively studied in the context of single-image analysis, however not explored for multitemporal one. In this article, we propose to extend supervised semantic segmentation to the unsupervised joint semantic segmentation of multitemporal images. We propose a novel method that processes multitemporal images by separately feeding to a deep network comprising of trainable convolutional layers. The training process does not involve any external label, and segmentation labels are obtained from the argmax classification of the final layer. A novel loss function is used to detect object segments from individual images as well as establish a correspondence between distinct multitemporal segments. Multitemporal semantic labels and weights of the trainable layers are jointly optimized in iterations. We tested the method on three different HR/VHR data sets from Munich, Paris, and Trento, which shows the method to be effective. We further extended the proposed joint segmentation method for change detection (CD) and tested on a VHR multisensor data set from Trento.
Unsupervised Deep Joint Segmentation of Multitemporal High-Resolution Images
Saha, Sudipan;Bovolo, Francesca;
2020-01-01
Abstract
High/very-high-resolution (HR/VHR) multitemporal images are important in remote sensing to monitor the dynamics of the Earth’s surface. Unsupervised object-based image analysis provides an effective solution to analyze such images. Image semantic segmentation assigns pixel labels from meaningful object groups and has been extensively studied in the context of single-image analysis, however not explored for multitemporal one. In this article, we propose to extend supervised semantic segmentation to the unsupervised joint semantic segmentation of multitemporal images. We propose a novel method that processes multitemporal images by separately feeding to a deep network comprising of trainable convolutional layers. The training process does not involve any external label, and segmentation labels are obtained from the argmax classification of the final layer. A novel loss function is used to detect object segments from individual images as well as establish a correspondence between distinct multitemporal segments. Multitemporal semantic labels and weights of the trainable layers are jointly optimized in iterations. We tested the method on three different HR/VHR data sets from Munich, Paris, and Trento, which shows the method to be effective. We further extended the proposed joint segmentation method for change detection (CD) and tested on a VHR multisensor data set from Trento.I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.