Translation as Annotation

Pianta, Emanuele; Bentivogli, Luisa
2003-01-01

Abstract

In this paper we illustrate an approach to the creation of high-quality linguistically annotated resources based on the exploitation of aligned parallel corpora. This approach rests on the key notion that translating a text can be seen as a linguistic annotation task which is easier than manual annotation with formal schemes. After translation, formal annotation can be automatically derived from the aligned translated texts. We will show that translations can be exploited in various ways to speed up and automate the linguistic annotation of texts. If none of the texts is already annotated, information from the aligned texts can be exploited to carry out the annotation from scratch. Conversely, if the texts in one language have been annotated and the others have not, annotations can be transferred from one language to the other. The transfer-based method allows existing (mostly English) annotated resources to be exploited to bootstrap the creation of annotated corpora in new languages with greatly reduced human effort.
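
The transfer step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `project_annotations`, the sense labels, the example sentence pair, and the hand-written word alignment are all assumptions made for illustration. The sketch only shows the general idea of copying an annotation from an annotated source token to the target token it is aligned with.

```python
from typing import List, Optional, Tuple


def project_annotations(
    source_tags: List[Optional[str]],
    alignment: List[Tuple[int, int]],
    target_length: int,
) -> List[Optional[str]]:
    """Copy each source token's tag to the target token aligned with it.

    `alignment` holds (source_index, target_index) pairs. Target tokens
    with no aligned, annotated source token stay None. If several source
    tokens align to one target token, the first tag encountered is kept.
    """
    projected: List[Optional[str]] = [None] * target_length
    for src_idx, tgt_idx in alignment:
        tag = source_tags[src_idx]
        if tag is not None and projected[tgt_idx] is None:
            projected[tgt_idx] = tag
    return projected


# Hypothetical example: an annotated English sentence and its Italian translation.
english = ["The", "bank", "approved", "the", "loan"]
italian = ["La", "banca", "ha", "approvato", "il", "prestito"]

# Illustrative sense labels (not taken from any real sense inventory).
english_senses = [None, "bank/institution", "approve/sanction", None, "loan/money"]

# Hand-written word alignment, assumed here to be given by an aligner.
alignment = [(0, 0), (1, 1), (2, 3), (3, 4), (4, 5)]

italian_senses = project_annotations(english_senses, alignment, len(italian))

for word, sense in zip(italian, italian_senses):
    print(word, sense)
```

In this sketch the auxiliary "ha" receives no label because nothing is aligned to it; a realistic pipeline would need a policy for unaligned and many-to-one tokens, which the abstract does not specify.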

Use this identifier to cite or link to this document: https://hdl.handle.net/11582/1644