From Text to Knowledge for the Semantic Web: the ONTOTEXT Project

Magnini, Bernardo;Negri, Matteo;Pianta, Emanuele;Romano, Lorenza;Speranza, Manuela;Serafini, Luciano;Girardi, Christian;Sprugnoli, Rachele
2005-01-01

Abstract

This paper presents the general objectives of the ONTOTEXT project (From Text to Knowledge for the Semantic Web) and the activities carried out during the first year of its development cycle. First, the task of annotating huge amounts of textual data (e.g. those available on the Web or in local document collections) is introduced, focusing on its importance for enhancing the interoperability of such data through ontology-based reasoning. Then, the main issues related to the annotation task are discussed. These include the choice of an adequate formalism to capture and describe the different types of relevant information contained in a text, and the adaptation of existing language-specific markup formalisms to a new language (Italian in our case). Finally, we report the results of our experience in annotating information about people and temporal expressions for the Italian Content Annotation Bank (I-CAB), which is being developed at ITC-irst and CELCT.

Use this identifier to cite or link to this document: https://hdl.handle.net/11582/2991