
A Qualitative Analysis of the State of the Art in Process Extraction from Text

Patrizio Bellan;Mauro Dragoni;Chiara Ghidini
2020-01-01

Abstract

Within a company, processes are typically documented in the form of unstructured text. To exploit the techniques of Business Process Management and Process Mining, processes need to be expressed in a formal (or semi-formal) representation, the process model diagram. However, manually deriving an initial process model from a process description document is a time-consuming and cost-intensive operation. Some initial solutions to the challenge of process extraction from text have been proposed in the literature. Yet an analysis of state-of-the-art contributions reveals that this line of research has not yet reached maturity, and that process extraction from text remains an unresolved problem at an early stage of development. Indeed, these contributions mainly adopt ad hoc solutions based on rules, word lists, and heuristics. In this paper, we use qualitative analysis of state-of-the-art approaches and tools to shed light on the current limitations of the field. In addition to analyzing the main reference papers, we test reference tools on text samples taken from real documents describing Standard Operating Procedures, which exhibit greater complexity than the publicly available procedural descriptions used so far as reference texts by the process-extraction-from-text community. The analysis reveals the inability of those approaches to perform well in real scenarios. The discussion of the results illustrates open points, fundamental challenges to solve, and gaps to fill. It also suggests new ideas for tackling some of the identified limitations, which we intend to pursue in the future.


Use this identifier to cite or link to this document: https://hdl.handle.net/11582/325926