Objectives: Predicting clinical patients’ vital signs remains a critical issue in intensive
care unit (ICU) related studies. However, studies on electronic health record (EHR) data
have mostly analyzed numerical data and have rarely used semi-structured textual data.
Methods: Our study used structured and semi-structured data (i.e., patients’ diagnosis
data and inspection reports) collected from the MIMIC-III database. First, we used
the Latent Dirichlet Allocation (LDA) model, a topic model used in natural language
processing, to process the semi-structured data. Then, we used machine learning methods
to predict clinical outcomes in 38,597 adult ICU patients.
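The two-step approach described in Methods can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the records are synthetic (not MIMIC-III), the two numeric columns and the word lists are made up, scikit-learn is used as the toolkit, and logistic regression stands in for the unspecified machine learning methods. It shows LDA topic proportions being appended to structured features before a classifier is trained and scored by AUROC.

```python
import numpy as np
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Synthetic stand-ins for report vocabulary (hypothetical, not from MIMIC-III).
severe_terms = ["sepsis", "ventilator", "shock", "renal", "failure"]
mild_terms = ["stable", "observation", "fracture", "recovery", "routine"]

n_patients = 60
died = rng.integers(0, 2, size=n_patients)  # hypothetical mortality labels

# Build one short pseudo-report per patient: 5 words from the vocabulary
# matching the label plus 2 words of noise from the other vocabulary.
reports = []
for label in died:
    main, other = (severe_terms, mild_terms) if label == 1 else (mild_terms, severe_terms)
    words = list(rng.choice(main, size=5)) + list(rng.choice(other, size=2))
    reports.append(" ".join(words))

# Two hypothetical structured features (e.g. age and heart rate).
structured = np.column_stack([
    rng.normal(65, 10, n_patients) + 5 * died,
    rng.normal(90, 15, n_patients) + 10 * died,
])

# Split first so the text model is fitted on training patients only.
idx_train, idx_test = train_test_split(
    np.arange(n_patients), test_size=0.3, stratify=died, random_state=0
)

# Step 1: turn the semi-structured text into per-patient topic proportions.
vectorizer = CountVectorizer()
lda = LatentDirichletAllocation(n_components=2, random_state=0)
topics_train = lda.fit_transform(
    vectorizer.fit_transform([reports[i] for i in idx_train])
)
topics_test = lda.transform(
    vectorizer.transform([reports[i] for i in idx_test])
)

# Step 2: concatenate structured and topic features, then train a classifier.
X_train = np.hstack([structured[idx_train], topics_train])
X_test = np.hstack([structured[idx_test], topics_test])
clf = LogisticRegression(max_iter=1000).fit(X_train, died[idx_train])

# Evaluate with AUROC on the held-out patients.
auroc = roc_auc_score(died[idx_test], clf.predict_proba(X_test)[:, 1])
print(f"AUROC on synthetic data: {auroc:.3f}")
```

The AUROC printed here reflects only the synthetic data and says nothing about the values reported in Results. Dropping `topics_train` and `topics_test` from the feature matrices gives a structured-only baseline for comparison.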
Results: Combining the structured and semi-structured data of ICU patients improved
the accuracy of ICU patient mortality prediction. The machine learning models
generated favorable mortality predictions: the highest AUROC was 0.871 for long-term
mortality and 0.922 for short-term mortality.
Conclusions: The constructed model successfully identified crucial variables for
predicting patient mortality. Thus, when providing medical services to patients, health care
personnel may consider the critical variables associated with the patients’ hospitalization
durations to ensure that the patients receive optimal medical services.