An FFT-based CNN-Transformer Encoder for Semantic Segmentation of Radar Sounder Signal

Ghosh, Raktim;Bovolo, Francesca
2022-01-01

Abstract

Radar Sounders (RSs) are sensors that operate in a nadir-looking geometry in the HF or VHF bands, transmitting modulated electromagnetic (EM) pulses and receiving the backscattered response from different subsurface targets. Recently, convolutional neural network (CNN) architectures have been established for characterizing RS signals under the semantic segmentation framework. In this paper, we design a Fast Fourier Transform (FFT) based CNN-Transformer encoder to effectively capture the long-range contexts in the radargram. In our hybrid architecture, the CNN models the high-dimensional local spatial contexts, and the Transformer establishes the global spatial contexts between the local ones. To avoid the Transformer's complex self-attention layers and reduce the number of learnable parameters, we replace the self-attention mechanism of the Transformer with unparameterized FFT modules, as in the FNet architecture for Natural Language Processing (NLP). The experimental results on the MCoRDS dataset indicate that the CNN-Transformer encoder with unparameterized FFT modules can characterize the radargram at a limited cost in accuracy while reducing time consumption. A comparative analysis is carried out with the state-of-the-art Transformer-based architecture.
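The token-mixing idea borrowed from FNet can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the feature-map shape, the function names `fourier_mixing`, `layer_norm` and `fnet_style_block`, and the omission of the feed-forward sublayer are all assumptions made for illustration. The only part taken from FNet is the mixing step itself, which keeps the real part of a 2D discrete Fourier transform over the sequence and hidden dimensions and has no learnable parameters.

```python
import numpy as np


def fourier_mixing(tokens):
    """Parameter-free token mixing in the style of FNet.

    tokens: real array of shape (seq_len, hidden).
    Applies a DFT along the hidden and sequence axes and keeps the real part.
    """
    return np.fft.fft2(tokens, axes=(-2, -1)).real


def layer_norm(x, eps=1e-6):
    """Normalise each token over its hidden dimension (no learnable scale or bias)."""
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)


def fnet_style_block(tokens):
    """Residual connection and normalisation around the Fourier mixing step.

    The feed-forward sublayer of a full encoder block is omitted for brevity.
    """
    return layer_norm(tokens + fourier_mixing(tokens))


# Hypothetical CNN feature map: (channels, height, width). Shapes are assumptions.
rng = np.random.default_rng(0)
feature_map = rng.standard_normal((8, 4, 16))

# Flatten the spatial grid into a token sequence of shape (height * width, channels).
tokens = feature_map.reshape(8, -1).T
mixed = fnet_style_block(tokens)
```

Because every output element of the 2D DFT depends on every input element, each token in `mixed` carries information from all spatial positions of the feature map, which is the role self-attention would otherwise play.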
2022
9781510655379
9781510655386

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11582/335989