Semantic Image Interpretation - Integration of Numerical Data and Logical Knowledge for Cognitive Vision

Ivan Donadello
2018-01-01

Abstract

Semantic Image Interpretation (SII) is the process of generating a structured description of the content of an input image. This description is encoded as a labelled directed graph in which nodes correspond to objects in the image and edges to semantic relations between objects. Such a detailed structure allows more accurate search and retrieval of images. In this thesis, we propose two well-founded methods for SII. Both methods exploit background knowledge about the domain of the images, expressed as the logical constraints of a knowledge base. The first method formalizes SII as the extraction of a partial model of a knowledge base. Partial models are built with a clustering and reasoning algorithm that considers both low-level and semantic features of images. The second method uses the Logic Tensor Networks framework to build the labelled directed graph of an image. This framework is able to learn from data in the presence of the logical constraints of the knowledge base. The graph is therefore constructed by predicting the labels of the nodes and of the relations according to the logical constraints and the features of the objects in the image. These methods advance the state of the art by integrating low-level and semantic features of images with logical knowledge. Other methods, in contrast, either do not deal with low-level features or use only statistical knowledge coming from training sets or corpora. Moreover, the second method outperforms the state of the art on the standard task of visual relationship detection.
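As a rough illustration of the output structure described in the abstract, the following Python sketch represents a labelled directed graph whose nodes are objects and whose edges are semantic relations, and checks it against one toy constraint. It is not the thesis's implementation: the class `SceneGraph`, the function `violations`, the table `ALLOWED_WHOLES`, and all object and relation labels are hypothetical names chosen for this example only.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set, Tuple


@dataclass
class SceneGraph:
    """Labelled directed graph: nodes are objects, edges are relations."""
    node_labels: Dict[int, str] = field(default_factory=dict)
    edges: Set[Tuple[int, str, int]] = field(default_factory=set)

    def add_object(self, node_id: int, label: str) -> None:
        self.node_labels[node_id] = label

    def add_relation(self, subj: int, relation: str, obj: int) -> None:
        # Both endpoints must already exist as labelled objects.
        if subj not in self.node_labels or obj not in self.node_labels:
            raise KeyError("both endpoints must be added as objects first")
        self.edges.add((subj, relation, obj))

    def triples(self) -> List[Tuple[str, str, str]]:
        # Readable (subject label, relation, object label) view, sorted.
        return sorted(
            (self.node_labels[s], r, self.node_labels[o])
            for s, r, o in self.edges
        )


# Hypothetical background knowledge: which wholes a part may belong to.
ALLOWED_WHOLES = {"wheel": {"bicycle", "car"}, "tail": {"horse", "cat"}}


def violations(graph: SceneGraph) -> List[Tuple[str, str, str]]:
    """Return partOf edges that contradict the toy constraint table."""
    bad = []
    for s, r, o in sorted(graph.edges):
        if r != "partOf":
            continue
        part, whole = graph.node_labels[s], graph.node_labels[o]
        if part in ALLOWED_WHOLES and whole not in ALLOWED_WHOLES[part]:
            bad.append((part, r, whole))
    return bad


graph = SceneGraph()
graph.add_object(0, "person")
graph.add_object(1, "bicycle")
graph.add_object(2, "wheel")
graph.add_relation(0, "rides", 1)
graph.add_relation(2, "partOf", 1)
print(graph.triples())
print(violations(graph))
```

In this sketch the constraint is only checked after the graph is built. The methods summarized in the abstract instead use the logical constraints while the graph is being constructed.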

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11582/317682