Using Keyphrases as Features for Text Categorization

D'Avanzo, Ernesto;Lavelli, Alberto;Magnini, Bernardo;Zanoli, Roberto
2003-01-01

Abstract

Keyphrases provide semantic metadata that summarizes and characterizes documents. Because keyphrases summarize documents very concisely, they can serve as a low-cost measure of similarity between documents, making it possible to cluster documents into groups by measuring the overlap between the keyphrases assigned to them. In this work we investigate the usefulness of keyphrases in text categorization (TC). We do so by applying a keyphrase extraction algorithm to the Reuters collection and then using the extracted keyphrases as features for the TC algorithm. We report the results of our experiments using various numbers of keyphrases per document, and compare performance with the bag-of-words (BOW) representation.
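
The two uses of keyphrases mentioned in the abstract, measuring overlap between documents and using keyphrases as classification features, can be illustrated with a minimal sketch. It is not the authors' implementation: the Jaccard coefficient as the overlap measure, the binary feature encoding, the function names `keyphrase_overlap` and `keyphrase_features`, and the sample keyphrases are all assumptions made for illustration.

```python
def _normalize(keyphrases):
    """Lowercase and trim keyphrases so that matching is case-insensitive."""
    return {k.strip().lower() for k in keyphrases}


def keyphrase_overlap(doc_a, doc_b):
    """Jaccard overlap between two keyphrase lists (an assumed measure)."""
    set_a, set_b = _normalize(doc_a), _normalize(doc_b)
    if not set_a and not set_b:
        return 0.0  # two documents with no keyphrases share nothing
    return len(set_a & set_b) / len(set_a | set_b)


def keyphrase_features(keyphrases, vocabulary):
    """Binary feature vector: 1 if a vocabulary keyphrase is assigned to the document."""
    present = _normalize(keyphrases)
    return [1 if term in present else 0 for term in vocabulary]


# Hypothetical keyphrases, not taken from the Reuters collection.
doc_a = ["crude oil", "OPEC", "production quota"]
doc_b = ["opec", "crude oil", "oil prices"]
vocabulary = ["crude oil", "opec", "wheat"]

similarity = keyphrase_overlap(doc_a, doc_b)
features_a = keyphrase_features(doc_a, vocabulary)
```

In this sketch each document is reduced to a handful of keyphrases, so the feature vector is much shorter than a bag-of-words vector over the full vocabulary; whether that helps or hurts categorization is the question the paper investigates.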
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11582/2456