From Mention to Ontology: A Pilot Study

Popescu, Octavian; Magnini, Bernardo; Pianta, Emanuele; Serafini, Luciano; Speranza, Manuela; Tamilin, Andrei
2006-01-01

Abstract

In this paper we present a pilot study aimed at an in-depth understanding of the phenomena underlying Ontology Population from Text. The study was carried out on a collection of Italian news articles that were manually annotated at several semantic levels. More specifically, we annotated all the textual expressions (i.e. mentions) referring to Persons; each mention was in turn decomposed into a number of attribute/value pairs; co-reference relations among mentions were established, resulting in the identification of entities, which were finally used to populate an ontology. The study has two significant results. First, we empirically identified a number of factors that determine the difficulty of Ontology Population from Text and that can now be taken into account when designing automatic systems. Second, the resulting dataset is a valuable resource for training and testing individual components of Ontology Population systems.

Use this identifier to cite or link to this document: https://hdl.handle.net/11582/4036