The steady spread of the 2019 Coronavirus disease has brought about human and economic losses, imposing a new lifestyle across the world. On this point, medical imaging tests such as computed tomography (CT) and X-ray have demonstrated a sound screening potential. Deep learning methodologies have evidenced superior image analysis capabilities with respect to prior handcrafted counterparts. In this paper, we propose a novel deep learning framework for Coronavirus detection using CT and X-ray images. In particular, a Vision Transformer architecture is adopted as a backbone in the proposed network, in which a Siamese encoder is utilized. The latter is composed of two branches: one for processing the original image and another for processing an augmented view of the original image. The input images are divided into patches and fed through the encoder. The proposed framework is evaluated on public CT and X-ray datasets. The proposed system confirms its superiority over state-of-the-art methods on CT and X-ray data in terms of accuracy, precision, recall, specificity, and F1 score. Furthermore, the proposed system also exhibits good robustness when a small portion of training data is allocated.

COVID-19 Detection in CT/X-ray Imagery Using Vision Transformers

Mohamed Lamine Mekhalfi;
2022-01-01

Abstract

The steady spread of the 2019 Coronavirus disease has brought about human and economic losses, imposing a new lifestyle across the world. On this point, medical imaging tests such as computed tomography (CT) and X-ray have demonstrated a sound screening potential. Deep learning methodologies have evidenced superior image analysis capabilities with respect to prior handcrafted counterparts. In this paper, we propose a novel deep learning framework for Coronavirus detection using CT and X-ray images. In particular, a Vision Transformer architecture is adopted as a backbone in the proposed network, in which a Siamese encoder is utilized. The latter is composed of two branches: one for processing the original image and another for processing an augmented view of the original image. The input images are divided into patches and fed through the encoder. The proposed framework is evaluated on public CT and X-ray datasets. The proposed system confirms its superiority over state-of-the-art methods on CT and X-ray data in terms of accuracy, precision, recall, specificity, and F1 score. Furthermore, the proposed system also exhibits good robustness when a small portion of training data is allocated.
File in questo prodotto:
Non ci sono file associati a questo prodotto.

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11582/331550
Citazioni
  • ???jsp.display-item.citation.pmc??? ND
social impact