FBK HLT-MT at SemEval-2016 Task 1: Cross-lingual Semantic Similarity Measurement Using Quality Estimation Features and Compositional Bilingual Word Embeddings
Ataman, Duygu;Camargo de Souza, José Guilherme;Turchi, Marco;Negri, Matteo
2016-01-01
Abstract
This paper describes the system by FBK HLT-MT for cross-lingual semantic textual similarity measurement. Our approach is based on supervised regression with an ensemble decision tree. To assign a semantic similarity score to an input sentence pair, the model combines features collected by state-of-the-art methods in machine translation quality estimation with distance metrics between cross-lingual embeddings of the two sentences. In our analysis, we compare different techniques for composing sentence vectors, several distance features, and ways to produce training data. The proposed system achieves a mean Pearson’s correlation of 0.39533, ranking 7th among all participants in the cross-lingual STS task organized within the SemEval 2016 evaluation campaign.

File | Size | Format
---|---|---
SemEval88.pdf (open access; type: post-print document; license: Creative Commons) | 136.5 kB | Adobe PDF
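The distance features and the evaluation metric named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the toy embedding table, the choice of averaging as the composition technique, the two distance features, and all function names are assumptions made for illustration. In the described system, features of this kind would be combined with quality estimation features and passed to a tree-ensemble regressor, which is not shown here.

```python
import math

# Toy embeddings in a shared bilingual space (made-up values, illustration only).
EMBEDDINGS = {
    "cat": [1.0, 0.0],
    "gato": [1.0, 0.0],
    "dog": [0.0, 1.0],
    "perro": [0.0, 1.0],
}


def sentence_vector(tokens):
    """Compose a sentence vector by averaging the vectors of known tokens.

    Averaging is an assumed composition technique; the abstract only says
    that several techniques were compared.
    """
    vectors = [EMBEDDINGS[t] for t in tokens if t in EMBEDDINGS]
    if not vectors:
        raise ValueError("no known tokens in sentence")
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def cosine_similarity(u, v):
    """Cosine of the angle between u and v (0.0 if either vector is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def euclidean_distance(u, v):
    """Straight-line distance between u and v."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def distance_features(src_tokens, tgt_tokens):
    """Return [cosine similarity, Euclidean distance] for a sentence pair."""
    u = sentence_vector(src_tokens)
    v = sentence_vector(tgt_tokens)
    return [cosine_similarity(u, v), euclidean_distance(u, v)]


def pearson(xs, ys):
    """Pearson's correlation between two equal-length, non-constant sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    std_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    std_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (std_x * std_y)
```

Calling `distance_features(["cat"], ["gato"])` yields one feature vector per sentence pair, and `pearson` compares predicted similarity scores against gold scores, which is how the reported mean correlation would be computed per test set.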
Documents in IRIS are protected by copyright, and all rights are reserved unless otherwise indicated.