Building Quality-based Views of the Web

Triolo, Enrico; Polettini, Nicola; Sona, Diego; Avesani, Paolo
2007-01-01

Abstract

The rapid growth of the information available on the Web makes retrieving relevant content increasingly hard. The difficulty lies both in the semantics of the content and in the filtering of sources by quality. A recent strategy for coping with the overwhelming amount of information is to focus the search on a snapshot of the Internet, namely a Web view. In this paper, we present a system that supports the creation of a quality-based view of the Web. We give a brief overview of the software and of its functional architecture. We place more emphasis on the role of AI in supporting the organization of Web resources into a hierarchical structure of categories. We survey our recent work on document classifiers that address a twofold challenge. On the one hand, the task is to recommend classifications of Web resources when the taxonomy provides no examples of classification, which usually happens when taxonomies are built from scratch. On the other hand, even when taxonomies are populated, classifiers are trained with few examples, since the organization policy usually suggests refining the taxonomy once a category reaches a certain number of Web resources. The paper includes a short description of a couple of case studies in which the system has been deployed in real-world applications.

Use this identifier to cite or link to this document: https://hdl.handle.net/11582/3283