The Content Types Dataset: a New Resource to Explore Semantic and Functional Characteristics of Texts
Sprugnoli, Rachele; Tonelli, Sara; Moretti, Giovanni
2017-01-01
Abstract
This paper presents a new resource, called Content Types Dataset, to promote the analysis of texts as a composition of units with specific semantic and functional roles. By developing this dataset, we also introduce a new NLP task for the automatic classification of Content Types. The annotation scheme and the dataset are described together with two sets of classification experiments.
Files in this item:
File | Type | License | Access | Size | Format
---|---|---|---|---|---
E17-2042.pdf | Post-print document | DRM not defined | Authorized users only | 115.74 kB | Adobe PDF
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.