Making explicit the hidden semantics of hierarchical classifications

Magnini, Bernardo;Serafini, Luciano;Speranza, Manuela
2003-01-01

Abstract

Hierarchical classifications are concept hierarchies used to organize large collections of documents. File systems, product taxonomies for the marketplace, and the directories provided by Web portals are common examples. As semi-structured knowledge sources, hierarchical classifications have distinctive features: they differ from plain text because they are based on a taxonomy of concepts, and from structured data sources (such as databases and formal ontologies) because many of their semantic relations are implicit. We propose a methodology for building a semantic interpretation of hierarchical classifications from an analysis of their taxonomic relations and the linguistic material they contain. We provide a formal semantics for hierarchical classifications and then use that framework to interpret the implicit knowledge they represent, exploring a number of crucial linguistic issues. The phenomena addressed include the disambiguation of polysemous words, the semantics of multiwords, and the interpretation of coordinations. The Web directories of Google and Yahoo! were chosen as the evaluation set. We show that there is a considerable amount of implicit information to be made explicit, and we discuss the performance of an implementation of our analysis.

Use this identifier to cite or link to this document: https://hdl.handle.net/11582/923