PET: An Annotated Dataset for Process Extraction from Natural Language Text Tasks

Patrizio Bellan;Mauro Dragoni;Chiara Ghidini;Simone Paolo Ponzetto
2023-01-01

Abstract

Process extraction from text is an important task of process discovery, for which various approaches have been developed in recent years. However, unlike other information extraction tasks, it lacks gold-standard corpora of business process descriptions that are carefully annotated with all the entities and relationships of interest. This paper presents the PET dataset, a first corpus of business process descriptions annotated with activities, gateways, actors, and flow information. We present our new resource, together with a variety of baselines that benchmark the difficulty and challenges of business process extraction from text. The PET dataset, annotation guidelines, and inception schema are freely available via huggingface.co/datasets/patriziobellan/PET.
Use this identifier to cite or link to this document: https://hdl.handle.net/11582/337527