Anchoring Background Knowledge to Rich Multimedia Contexts in the KnowledgeStore
Cattoni, Roldano; Corcoglioniti, Francesco; Girardi, Christian; Magnini, Bernardo; Serafini, Luciano; Zanoli, Roberto
2013-01-01
Abstract
Recent achievements in Natural Language Processing in terms of scalability and performance, together with the wide availability of background knowledge within the Semantic Web and the Linked Open Data initiative, encourage researchers to take a further step towards the creation of machines capable of understanding multimedia documents by exploiting background knowledge. Pursuing this direction requires maintaining a clear link between knowledge and the documents that contain it. This is achieved in the KnowledgeStore, a scalable content management system that supports the tight integration and storage of multimedia resources together with background and extracted knowledge. Integration is done by (i) identifying mentions of named entities in multimedia resources, (ii) establishing mention coreference, and then either (iii) linking mentions to entities in the background knowledge or (iv) extending that knowledge with new entities. We present the KnowledgeStore and describe its use in creating a large-scale repository of knowledge and multimedia resources for the Italian Trentino region, whose interlinking allows us to explore advanced tasks such as entity-based search and semantic enrichment.
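
The four integration steps can be illustrated with a minimal sketch. It is not the KnowledgeStore implementation or its API: the names `Mention`, `KnowledgeBase`, `link_or_extend` and `integrate` are hypothetical, mention detection and coreference (steps i and ii) are assumed to have already run, and linking is reduced to an exact match on a normalized name purely for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Mention:
    resource_id: str  # resource (e.g. a news article) containing the mention
    text: str         # surface form of the named entity, from step (i)
    cluster: int      # coreference cluster assigned in step (ii)


@dataclass
class KnowledgeBase:
    # Hypothetical background knowledge: normalized name -> entity identifier.
    entities: dict = field(default_factory=dict)

    def link_or_extend(self, name):
        """Step (iii): return a known entity; step (iv): otherwise add one."""
        key = name.strip().lower()
        if key in self.entities:
            return self.entities[key], False
        entity_id = f"new:{len(self.entities)}"
        self.entities[key] = entity_id
        return entity_id, True


def integrate(mentions, kb):
    """Return {entity_id: [resource_id, ...]}, anchoring entities to resources."""
    clusters = {}
    for m in mentions:
        clusters.setdefault(m.cluster, []).append(m)

    anchors = {}
    for members in clusters.values():
        # Assumption: the longest surface form represents the whole cluster.
        name = max(members, key=lambda m: len(m.text)).text
        entity_id, _created = kb.link_or_extend(name)
        for m in members:
            anchors.setdefault(entity_id, []).append(m.resource_id)
    return anchors


# Illustrative data only; identifiers such as "bk:Trento" are made up.
kb = KnowledgeBase({"trento": "bk:Trento"})
mentions = [
    Mention("doc1", "Trento", 0),
    Mention("doc2", "Trento", 0),
    Mention("doc2", "Acme Srl", 1),
]
anchors = integrate(mentions, kb)
```

The resulting mapping keeps every entity tied to the resources that mention it, and an entity-based search would rely on exactly this kind of link. A real system would replace the exact-match lookup with proper entity disambiguation.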