Unstructured Data in Predictive Process Monitoring: Lexicographic and Semantic Mapping to ICD-9-CM Codes for the Home Hospitalization Service

Massimiliano Ronzani; Chiara Di Francescomarino; Mauro Dragoni; Chiara Ghidini
2022-01-01

Abstract

The wide availability of hospital administrative and clinical data has encouraged the application of Process Mining techniques to the healthcare domain. Predictive Process Monitoring techniques can learn from data on past executions and predict the future of incomplete cases. However, some of these data, possibly the most informative, are often available as natural language text, whereas structured information extracted from them would be more beneficial for training predictive models. In this paper we focus on the Home Hospitalization Service scenario, supporting the team in deciding on the home hospitalization of a patient by predicting whether a new patient is likely to undergo home hospitalization successfully. We investigate whether, in this scenario, mapping the unstructured textual diagnoses reported by the doctor in the Emergency Department to structured information, namely standardized ICD-9-CM disease codes, yields more accurate predictions. To this end, we devise two approaches, based respectively on lexicographic and semantic distance, for mapping textual diagnoses to ICD-9-CM codes, and we leverage the resulting structured information for making predictions.
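
The lexicographic mapping mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the similarity measure (`difflib.SequenceMatcher.ratio` from the standard library) is a stand-in for whichever lexicographic distance the paper uses, the three-entry catalogue is a tiny illustrative subset of ICD-9-CM, and the names `ICD9_SUBSET`, `normalize`, `lexicographic_similarity` and `map_diagnosis` are hypothetical.

```python
from difflib import SequenceMatcher

# Tiny illustrative subset of ICD-9-CM codes and descriptions.
# Assumption: a real system would load the full catalogue instead.
ICD9_SUBSET = {
    "486": "pneumonia, organism unspecified",
    "428.0": "congestive heart failure, unspecified",
    "599.0": "urinary tract infection, site not specified",
}


def normalize(text: str) -> str:
    """Lowercase the text and collapse runs of whitespace."""
    return " ".join(text.lower().split())


def lexicographic_similarity(a: str, b: str) -> float:
    """Character-level similarity in [0, 1]; 1.0 means identical strings.

    Assumption: SequenceMatcher's ratio stands in for the paper's
    lexicographic distance, which the abstract does not specify.
    """
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def map_diagnosis(text: str, catalogue: dict = ICD9_SUBSET) -> tuple:
    """Return (code, score) for the catalogue description closest to `text`."""
    best_code = max(
        catalogue,
        key=lambda code: lexicographic_similarity(text, catalogue[code]),
    )
    return best_code, lexicographic_similarity(text, catalogue[best_code])


if __name__ == "__main__":
    # Free-text diagnosis as it might be typed in the Emergency Department.
    code, score = map_diagnosis("Urinary tract  infection")
    print(code, round(score, 2))
```

A purely character-based measure like this cannot relate synonyms or abbreviations to their canonical descriptions, which is presumably where a semantic distance, the second approach named in the abstract, becomes useful.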
2022
9783031084201
9783031084218

Use this identifier to cite or link to this document: https://hdl.handle.net/11582/336753