Development of mortality prediction model for critically ill patients based on multidimensional and dynamic clinical characteristics (基于多维动态特征的重症患者死亡风险预测模型构建)
Cited by: 1
Abstract
Objective To develop a mortality risk prediction model for critically ill patients with a random forest algorithm, using multidimensional and dynamic clinical features collected by the hospital information system (HIS), and to compare its predictive performance with that of the acute physiology and chronic health evaluation II (APACHE II) model.
Methods Medical records of 10925 critically ill inpatients aged over 14 years who were admitted to the Third Xiangya Hospital of Central South University from January 2014 to June 2020 were extracted from the HIS, together with all available APACHE II score records. Expected mortality was calculated with the death-risk formula of the APACHE II scoring system. The 689 samples with APACHE II score records formed the test set. The other 10236 samples were used to build the random forest model, with 10% (n=1024) randomly selected as the validation set and 90% (n=9212) as the training set. Clinical features, including general patient information, vital signs, biochemical test results and intravenous drug doses, were selected as time series covering the 3 days before the end of the critical-illness period and used to build the random forest model. With the APACHE II model as the reference, receiver operating characteristic (ROC) curves were drawn and predictive performance was evaluated by the area under the ROC curve (AUROC). Precision-recall (PR) curves were drawn from precision and recall, and classification accuracy was evaluated by the area under the PR curve (AUPRC). Calibration curves were drawn, and the agreement between predicted and observed event probabilities was evaluated by the Brier score.
Results All 10925 patients were included in the analysis: 7797 males (71.4%) and 3128 females (28.6%), aged (58.9±16.3) years. The median length of hospital stay was 12 (7, 20) days. A total of 8538 patients (78.2%) were admitted to the intensive care unit (ICU), and the median length of ICU stay was 66 (13, 151) hours. In-hospital mortality was 19.0% (2077/10925). Compared with the survival group (n=8848), patients in the death group (n=2077) were older (years: 60.1±16.5 vs. 58.5±16.4, P<0.01), had a higher rate of ICU admission [82.8% (1719/2077) vs. 77.1% (6819/8848), P<0.01], and had higher proportions of hypertension, diabetes and stroke history [44.7% (928/2077) vs. 36.3% (3212/8848), 20.0% (415/2077) vs. 16.9% (1495/8848), 15.5% (322/2077) vs. 10.0% (885/8848), all P<0.01]. On the test set, the random forest model predicted in-hospital death better than the APACHE II model: its AUROC and AUPRC were both higher [AUROC: 0.856 (95% CI 0.812-0.896) vs. 0.783 (95% CI 0.737-0.826); AUPRC: 0.650 (95% CI 0.604-0.762) vs. 0.524 (95% CI 0.439-0.609)], and its Brier score was lower [0.104 (95% CI 0.085-0.113) vs. 0.124 (95% CI 0.107-0.141)].
Conclusion The random forest model based on multidimensional dynamic features has considerable application value in predicting in-hospital mortality risk for critically ill patients and is superior to the traditional APACHE II scoring system.
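The three evaluation metrics named in the abstract (AUROC, AUPRC and Brier score) can be illustrated with a minimal pure-Python sketch. The abstract does not state which software the authors used, so the function names below are hypothetical, the toy labels and probabilities are invented for illustration and are not study data, and average precision is assumed here as one common estimator of AUPRC (the authors may have integrated the PR curve differently).

```python
from typing import Sequence


def brier_score(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Mean squared difference between predicted probability and the 0/1 outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def auroc(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Probability that a random positive is scored above a random negative.

    Ties between a positive and a negative count as half a win.
    """
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in pos for b in neg)
    return wins / (len(pos) * len(neg))


def average_precision(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Step-wise area under the PR curve: mean precision at each positive's rank.

    Assumption: no special handling of tied scores (the toy data has none).
    """
    ranked = sorted(zip(y_prob, y_true), key=lambda pair: -pair[0])
    hits, total = 0, 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            total += hits / rank
    return total / hits


# Invented toy example: 1 = died in hospital, 0 = survived.
y_true = [1, 0, 1, 0, 0, 1, 0, 0]
y_prob = [0.9, 0.2, 0.7, 0.4, 0.1, 0.35, 0.6, 0.05]

print("AUROC:", auroc(y_true, y_prob))
print("AUPRC (average precision):", average_precision(y_true, y_prob))
print("Brier score:", brier_score(y_true, y_prob))
```

Higher AUROC and AUPRC indicate better ranking of deaths above survivors, while a lower Brier score indicates predicted probabilities closer to the observed outcomes, which is the direction of the comparison reported above.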
Authors Zhao Shangping (赵尚平); Tang Guanxiu (汤观秀); Liu Pan (刘盼); Guo Yanming (郭延明); Yang Mingshi (杨明施); Li Guohui (李国辉) (Laboratory for Big Data and Decision, National University of Defense Technology, Changsha 410003, Hunan, China; Department of Intensive Care Unit, the Third Xiangya Hospital, Central South University, Changsha 410013, Hunan, China; Department of Nursing, the Third Xiangya Hospital, Central South University, Changsha 410013, Hunan, China)
Source Chinese Critical Care Medicine (《中华危重病急救医学》), 2023, No. 4, pp. 415-420 (6 pages). Indexed in CAS, CSCD and the Peking University Core Journals list (北大核心).
Funding National Key Research and Development Program of China (2018YFC2001800).
Keywords Critical illness; Death risk; Prediction; Random forest; Electronic medical record