User modelling for news web sites with word sense based techniques
Magnini, Bernardo;Strapparava, Carlo
2004-01-01
Abstract
SiteIF is a personal agent for a bilingual news web site that learns a user's interests from the pages the user requests. In this paper we propose using a word-sense-based document representation as the starting point for building a model of the user's interests. The documents a user visits are processed, and the relevant senses (disambiguated over WordNet) are extracted and combined to form a semantic network. A filtering procedure then dynamically predicts new documents on the basis of this semantic network. A sense-based approach has two main advantages: first, the model's predictions, being based on senses rather than words, are more accurate; second, the model is language independent, which allows navigation in multilingual sites. We report the results of a comparative experiment carried out to give a quantitative estimate of these improvements.
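
The pipeline outlined in the abstract (extract senses from visited documents, combine them into a semantic network, score new documents against that network) can be illustrated with a minimal Python sketch. This is not the SiteIF implementation: the class name `SenseNetwork`, the co-occurrence weighting, the scoring formula, and the sense identifiers such as `football#1` are all assumptions made for illustration, and word sense disambiguation is assumed to have already happened upstream.

```python
from collections import Counter
from itertools import combinations


class SenseNetwork:
    """Toy user model (hypothetical): nodes are sense IDs, edges are co-occurrences."""

    def __init__(self):
        self.nodes = Counter()  # sense ID -> number of visited documents containing it
        self.edges = Counter()  # frozenset({a, b}) -> number of documents containing both

    def update(self, senses):
        """Add the (already disambiguated) senses of one visited document."""
        unique = sorted(set(senses))
        self.nodes.update(unique)
        for a, b in combinations(unique, 2):
            self.edges[frozenset((a, b))] += 1

    def score(self, senses):
        """Relevance of a new document: matched node and edge weights,
        divided by the number of distinct senses in the document."""
        unique = sorted(set(senses))
        if not unique:
            return 0.0
        total = sum(self.nodes[s] for s in unique)
        total += sum(self.edges[frozenset((a, b))]
                     for a, b in combinations(unique, 2))
        return total / len(unique)


# Hypothetical sense IDs; a real system would use WordNet synset identifiers.
model = SenseNetwork()
model.update(["football#1", "match#2", "goal#1"])
model.update(["football#1", "goal#1", "coach#1"])

# An Italian article maps to the same sense IDs as an English one,
# so it is scored by the same model without any translation step.
italian_sports_article = ["football#1", "goal#1"]
finance_article = ["stock#1", "market#2"]

print(model.score(italian_sports_article))
print(model.score(finance_article))
```

Because the model stores sense identifiers rather than surface words, documents in either language of the site are compared in the same space, which is the language-independence property the abstract describes.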