Combining WordNet and Dewey Decimal Classification for Building Lexical Resources for Information Extraction from Text

Cavaglià, G.; Ciravegna, F.

One factor limiting the spread of applications of Information Extraction from text is the cost of developing new applications. Defining the lexicon is one of the main bottlenecks. Generic resources such as lexical databases are promising sources of information for reducing the cost of defining application-specific lexica, but they introduce lexical ambiguity that is difficult to control. In this paper we show how application-specific lexica for information extraction from text can be built using WordNet. Lexical ambiguity is kept under control by marking WordNet synsets with field labels taken from the Dewey Decimal Classification.
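
The idea of controlling ambiguity through field labels can be illustrated with a minimal sketch. It is not the authors' implementation: the sense inventory, the synset identifiers, the field names, and the function `build_domain_lexicon` are all hypothetical and chosen only for illustration. The sketch assumes each sense of a word carries one or more field labels, and that an application-specific lexicon keeps only the senses whose labels match the fields of the target domain.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sense:
    synset_id: str     # hypothetical identifier, not a real WordNet offset
    gloss: str         # short description of the sense
    fields: frozenset  # hypothetical field labels in the style of subject names


# Toy sense inventory (illustrative data only, not taken from WordNet or the DDC).
TOY_SENSES = {
    "bank": [
        Sense("bank.n.a", "financial institution", frozenset({"Economics"})),
        Sense("bank.n.b", "sloping land beside a river", frozenset({"Geography"})),
    ],
    "interest": [
        Sense("interest.n.a", "charge for borrowing money", frozenset({"Economics"})),
        Sense("interest.n.b", "curiosity or concern", frozenset({"Psychology"})),
    ],
}


def build_domain_lexicon(senses_by_word, target_fields):
    """Keep, for each word, only the senses labelled with a target field."""
    target = frozenset(target_fields)
    lexicon = {}
    for word, senses in senses_by_word.items():
        kept = [s for s in senses if s.fields & target]
        if kept:  # words with no sense in the domain are left out
            lexicon[word] = kept
    return lexicon


# Build a lexicon for a hypothetical financial application.
finance_lexicon = build_domain_lexicon(TOY_SENSES, {"Economics"})
for word, senses in finance_lexicon.items():
    print(word, [s.synset_id for s in senses])
```

In this toy setting each word keeps a single sense once the domain is fixed, which is the effect the field labels are meant to achieve. With real data a word may still retain several senses within one field.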