From knowledge-driven to data-driven inter-case feature encoding in predictive process monitoring

Di Francescomarino, Chiara
2019-01-01

Abstract

Predictive process monitoring (PPM) is a research area that focuses on predicting measures of interest (e.g., the completion time) for running cases based on event logs. State-of-the-art PPM techniques only consider intra-case information that comes from the case whose measures of interest one wishes to predict. However, in many systems, the outcome of a running case depends on the interplay of all cases that are being executed concurrently, or can be derived from the characteristics of cases that are executed in the same period of time. For example, in many situations, running cases compete over scarce resources, and the completion time of a running case can be derived from the number of similar cases running concurrently. In this work, we present a general framework for feature encoding that relies on a bi-dimensional state space representation. The first dimension corresponds to intra-case dependencies and utilizes existing feature encoding techniques. The second dimension encodes inter-case features using two approaches: (1) a knowledge-driven encoding (KDE), which assumes prior knowledge on case types, and (2) a data-driven encoding (DDE), which automatically identifies case types from data using case proximity metrics. Both approaches partition the event log into sets of cases that share common characteristics, and derive features according to these commonalities. We demonstrate the usefulness of the proposed framework with an empirical evaluation carried out against two real-life datasets coming from an outpatient hospital process and a manufacturing process.
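
The example in the abstract, where a case's completion time depends on how many similar cases run concurrently, can be illustrated with a minimal sketch. It is not the encoding used in the paper: the `Case` record, the `case_type` field, and the feature names `running_total` and `running_same_type` are hypothetical, and the sketch assumes the knowledge-driven setting in which case types are known in advance. A data-driven variant would instead have to derive the groups from the log with a case proximity metric.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class Case:
    case_id: str
    case_type: str             # assumed known up front (knowledge-driven setting)
    start: datetime
    end: Optional[datetime]    # None while the case is still running


def is_running(case: Case, at: datetime) -> bool:
    """True if the case has started and not yet completed at time `at`."""
    return case.start <= at and (case.end is None or case.end > at)


def inter_case_features(log: list, target: Case, at: datetime) -> dict:
    """Count the cases running concurrently with `target` at time `at`."""
    others = [c for c in log if c.case_id != target.case_id and is_running(c, at)]
    same_type = sum(1 for c in others if c.case_type == target.case_type)
    return {"running_total": len(others), "running_same_type": same_type}


# Toy log: four cases on one day, two case types (illustrative data only).
DAY = (2019, 1, 1)
LOG = [
    Case("c1", "A", datetime(*DAY, 9, 0), None),
    Case("c2", "A", datetime(*DAY, 9, 30), datetime(*DAY, 10, 30)),
    Case("c3", "B", datetime(*DAY, 9, 45), None),
    Case("c4", "A", datetime(*DAY, 11, 0), None),
]

features = inter_case_features(LOG, LOG[0], datetime(*DAY, 10, 0))
print(features)  # {'running_total': 2, 'running_same_type': 1}
```

In the bi-dimensional representation described above, such counts would form the second (inter-case) dimension and would be appended to whatever intra-case feature vector an existing encoding produces for the same running case.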

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11582/320721