Modelling Pronominal Anaphora in Statistical Machine Translation
Hardmeier, Christian;Federico, Marcello
2010-01-01
Abstract
Current Statistical Machine Translation (SMT) systems translate texts sentence by sentence without considering any cross-sentential context. Assuming independence between sentences makes it difficult to take certain translation decisions when the necessary information cannot be determined locally. We argue for the necessity to include cross-sentence dependencies in SMT. As a case in point, we study the problem of pronominal anaphora translation by manually evaluating German-English SMT output. We then present a word dependency model for SMT, which can represent links between word pairs in the same or in different sentences. We use this model to integrate the output of a coreference resolution system into English-German SMT with a view to improving the translation of anaphoric pronouns.