Statistical Models for Monolingual and Bilingual Information Retrieval
Bertoldi, Nicola;Federico, Marcello
2004-01-01
Abstract
This work reviews the information retrieval systems developed at ITC-irst and evaluated in several CLEF tracks over the last three years. The presentation follows the progress made over time in developing new statistical models, first for monolingual and then for cross-language information retrieval. In addition to describing the underlying theory, it reports the performance of the monolingual and bilingual models on the Italian monolingual and Italian-English bilingual tracks of CLEF, respectively. The monolingual systems by ITC-irst performed consistently well in all the official evaluations, while the bilingual system ranked in CLEF 2002 just behind competitors using commercial machine translation engines. However, an experimental comparison of our statistical topic translation model against a state-of-the-art commercial system, run on a larger set of queries, showed no statistically significant difference in retrieval performance.