The KnowledgeStore: an Entity-Based Storage System
Cattoni, Roldano;Corcoglioniti, Francesco;Girardi, Christian;Magnini, Bernardo;Serafini, Luciano;Zanoli, Roberto
2012-01-01
Abstract
This paper describes the KNOWLEDGESTORE, a large-scale infrastructure for the combined storage and interlinking of multimedia resources and ontological knowledge. Information in the KNOWLEDGESTORE is organized around entities, such as persons, organizations and locations. The system supports (i) importing background knowledge about entities, in the form of annotated RDF triples; (ii) associating resources with entities by automatically recognizing, coreferring and linking mentions of named entities; and (iii) deriving new entities from knowledge extracted from mentions. The KNOWLEDGESTORE builds on state-of-the-art technologies for language processing, including document tagging, named entity extraction and cross-document coreference. Its design tightly integrates linguistic and semantic features, and eases further processing of information by explicitly representing the contexts in which knowledge and mentions are valid or relevant. We describe the system and report on the creation of a large-scale KNOWLEDGESTORE instance for storing and integrating multimedia content and background knowledge relevant to the Italian Trentino region.
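As a rough illustration of the entity-centric organization described in the abstract, the following Python sketch models the three functions (i) to (iii) with a toy in-memory store. It is not the KNOWLEDGESTORE implementation or API: the class names, the `name` predicate, the `ctx:` and `ent:` identifiers, and the case-insensitive string match used for linking are all assumptions made for this example. The actual system relies on named entity extraction and cross-document coreference, which this sketch replaces with a simple lookup.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Triple:
    """A statement about an entity, annotated with a context (hypothetical model)."""
    subject: str
    predicate: str
    obj: str
    context: str  # assumed label for where the statement is valid or relevant


@dataclass
class Mention:
    """A named-entity mention found in a resource (document, video, ...)."""
    resource_id: str
    text: str
    entity_id: Optional[str] = None


@dataclass
class EntityStore:
    """Toy in-memory stand-in for an entity-based store."""
    triples: List[Triple] = field(default_factory=list)
    mentions: List[Mention] = field(default_factory=list)
    names: Dict[str, str] = field(default_factory=dict)  # lowercased name -> entity id

    def import_triple(self, triple: Triple) -> None:
        # (i) import background knowledge; index entity names for later linking
        self.triples.append(triple)
        if triple.predicate == "name":
            self.names[triple.obj.lower()] = triple.subject

    def link(self, mention: Mention) -> str:
        # (ii) link the mention to a known entity by name, or
        # (iii) derive a new entity when no known entity matches
        key = mention.text.lower()
        if key not in self.names:
            self.names[key] = f"new:{len(self.names)}"
        mention.entity_id = self.names[key]
        self.mentions.append(mention)
        return mention.entity_id

    def resources_for(self, entity_id: str) -> List[str]:
        # all resources that mention the given entity
        return sorted({m.resource_id for m in self.mentions if m.entity_id == entity_id})


store = EntityStore()
store.import_triple(Triple("ent:trento", "name", "Trento", "ctx:background"))
known = store.link(Mention("doc1", "Trento"))        # matches the imported entity
derived = store.link(Mention("doc2", "Rovereto"))    # no match, so a new entity is derived
store.link(Mention("video3", "rovereto"))            # corefers with the derived entity
```

In this sketch, querying `store.resources_for(derived)` collects every resource whose mentions were linked to the derived entity, which is the kind of entity-to-resource navigation the abstract describes.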