IRIS Institutional Research Information System

EuroVerdict: A Multilingual Dataset for Verdict Generation Against Misinformation

Russo, Daniel;Sadeghi, Fariba;Menini, Stefano;Guerini, Marco

2025-01-01

Abstract

Misinformation is a global issue that shapes public discourse, influencing opinions and decision-making across various domains. While automated fact-checking (AFC) has become essential in combating misinformation, most work in multilingual settings has focused on claim verification rather than on generating explanatory verdicts (i.e., short texts discussing the veracity of the claim), leaving a gap in AFC resources beyond English. To this end, we introduce EuroVerdict, a multilingual dataset designed for verdict generation, covering eight European languages. Developed in collaboration with professional fact-checkers, the dataset comprises claims, manually written verdicts, and supporting evidence, including fact-checking articles and additional secondary sources. We evaluate EuroVerdict with Llama-3.1-8B-Instruct on verdict generation under different settings, varying the prompt language, input article language, and training approach. Our results show that fine-tuning consistently improves performance, with models fine-tuned on original-language articles achieving the highest scores in both automatic and human evaluations. Using articles in a different language from the claim slightly lowers performance; however, pairing them with language-specific prompts improves results. Zero-shot and Chain-of-Thought setups perform worse, reinforcing the benefits of fine-tuning for multilingual verdict generation.

Year

2025

Appears in types:

4.1 Conference proceedings contribution

Files in this item:

File	Access	License	Size	Format
2025.findings-acl.853.pdf	Open access	Publisher's copyright	6.43 MB	Adobe PDF

Documents in IRIS are protected by copyright, and all rights are reserved unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11582/369668
