Benchmarking Historical Phase Recognition from Text and Events

Celli, Fabio; Rovera, Marco

This paper presents preliminary studies on a benchmark for the Historical Phase Recognition task, which explores the application of computational linguistics to the study of long-term historical dynamics. We compare the utility of Event Tagging and BERT embeddings for classifying the phases of secular cycles defined by the Structural-Demographic Theory. We explore this task both as five-class classification (crisis, growth, population immiseration, elite overproduction, State stress) and as binary classification (rise, decline), on the basis of human- and LLM-annotated labels. Our findings reveal that Event Tagging, when aligned with human annotations, yields good performance in multi-class classification, but not in binary classification. Conversely, using BERT to extract features directly from text yields better performance with LLM-generated labels, in particular on the binary classification task. We also report higher inter-annotator agreement among LLMs than among humans when labeling historical phases.
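As an illustration of how agreement between two annotators on phase labels could be quantified, the following minimal sketch computes Cohen's kappa in plain Python. The choice of Cohen's kappa is an assumption made here for illustration, since the abstract does not name the agreement measure used; the function name `cohen_kappa` and the example label sequences are hypothetical and are not data from the study.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Chance-corrected agreement between two annotators (Cohen's kappa).

    Assumption: both annotators labeled the same items in the same order.
    """
    if len(labels_a) != len(labels_b) or len(labels_a) == 0:
        raise ValueError("label sequences must be non-empty and equally long")

    n = len(labels_a)

    # Observed agreement: share of items on which both annotators agree.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Expected agreement under independence, from each annotator's
    # marginal label frequencies.
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)

    # Degenerate case: both annotators used one and the same label throughout.
    if expected == 1.0:
        return 1.0

    return (observed - expected) / (1.0 - expected)


# Hypothetical binary labels (rise / decline) for four historical periods.
annotator_1 = ["rise", "decline", "rise", "decline"]
annotator_2 = ["rise", "decline", "decline", "decline"]
kappa = cohen_kappa(annotator_1, annotator_2)
```

The same function applies unchanged to the five-class setting, since it only compares label identity and marginal frequencies. With more than two annotators, a multi-rater measure such as Fleiss' kappa or Krippendorff's alpha would be the usual alternative.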