This paper presents a new resource, called Content Types Dataset, to promote the analysis of texts as a composition of units with specific semantic and functional roles. By developing this dataset, we also introduce a new NLP task for the automatic classification of Content Types. The annotation scheme and the dataset are described together with two sets of classification experiments.

The Content Types Dataset: a New Resource to Explore Semantic and Functional Characteristics of Texts

Sprugnoli, Rachele;Tonelli, Sara;Moretti, Giovanni
2017-01-01

Abstract

This paper presents a new resource, called Content Types Dataset, to promote the analysis of texts as a composition of units with specific semantic and functional roles. By developing this dataset, we also introduce a new NLP task for the automatic classification of Content Types. The annotation scheme and the dataset are described together with two sets of classification experiments.
File in questo prodotto:
File Dimensione Formato  
E17-2042.pdf

solo utenti autorizzati

Tipologia: Documento in Post-print
Licenza: DRM non definito
Dimensione 115.74 kB
Formato Adobe PDF
115.74 kB Adobe PDF   Visualizza/Apri   Richiedi una copia

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11582/309699
Citazioni
  • ???jsp.display-item.citation.pmc??? ND
social impact