This paper explores the use of bilingual parallel corpora as a source of lexical knowledge for cross-lingual textual entailment. We claim that, in spite of the inherent difficulties of the task, phrase tables extracted from parallel data allow to capture both lexical relations between single words, and contextual information useful for inference. We experiment with a phrasal matching method in order to: i) build a system portable across languages, and ii) evaluate the contribution of lexical knowledge in isolation, without interaction with other inference mechanisms. Results achieved on an English-Spanish corpus obtained from the RTE3 dataset support our claim, with an overall accuracy above average scores reported by RTE participants on monolingual data. Finally, we show that using parallel corpora to extract paraphrase tables reveals their potential also in the monolingual setting, improving the results achieved with other sources of lexical knowledge.

Using Bilingual Parallel Corpora for Cross-Lingual Textual Entailment

Mehdad, Yashar;Negri, Matteo;Federico, Marcello
2011-01-01

Abstract

This paper explores the use of bilingual parallel corpora as a source of lexical knowledge for cross-lingual textual entailment. We claim that, in spite of the inherent difficulties of the task, phrase tables extracted from parallel data allow to capture both lexical relations between single words, and contextual information useful for inference. We experiment with a phrasal matching method in order to: i) build a system portable across languages, and ii) evaluate the contribution of lexical knowledge in isolation, without interaction with other inference mechanisms. Results achieved on an English-Spanish corpus obtained from the RTE3 dataset support our claim, with an overall accuracy above average scores reported by RTE participants on monolingual data. Finally, we show that using parallel corpora to extract paraphrase tables reveals their potential also in the monolingual setting, improving the results achieved with other sources of lexical knowledge.
2011
9781932432879
File in questo prodotto:
File Dimensione Formato  
P11-1134.pdf

accesso aperto

Tipologia: Documento in Post-print
Licenza: Creative commons
Dimensione 247.16 kB
Formato Adobe PDF
247.16 kB Adobe PDF Visualizza/Apri

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11582/38580
Citazioni
  • ???jsp.display-item.citation.pmc??? ND
social impact