We evaluate the hypothesis that automatic evaluation metrics developed for Machine Translation (MT) systems have a significant impact on predicting semantic similarity scores in the Semantic Textual Similarity (STS) task, in light of their use for paraphrase identification. We show that different metrics may behave differently and carry different significance along the semantic scale [0-5] of the STS task. In addition, we compare several classification algorithms that use a combination of different MT metrics to build an STS system; we show that although this approach obtains state-of-the-art results in the paraphrase identification task, it is insufficient to achieve the same result in STS.
Learning the Impact of Machine Translation Evaluation Metrics for Semantic Textual Similarity
Magnolini, Simone;Ngoc Phuoc An, Vo;Popescu, Octavian
2015-01-01
File: paper.pdf (open access; post-print; Creative Commons license; Adobe PDF, 12.47 MB)