Abstract
Objective To compare 3 machine learning algorithms (support vector machine [SVM], random forest, and extreme gradient boosting [XGBoost]) with logistic regression for predicting 30-day mortality in patients with severe ischemic stroke. Methods Data from 2 358 patients with severe ischemic stroke who met the inclusion criteria were drawn from the Medical Information Mart for Intensive Care Ⅳ (MIMIC-Ⅳ) database, covering 2008 to 2019. Early mortality prediction models were built with SVM, random forest, XGBoost, and logistic regression, each combined with the synthetic minority oversampling technique (SMOTE). Model performance was evaluated by the area under the receiver operating characteristic curve (AUC), accuracy, F1-score, and Brier score. Results On the original imbalanced dataset, the AUC values of the SVM, random forest, XGBoost, and logistic regression models were 0.78, 0.81, 0.84, and 0.83, respectively. On the SMOTE-synthesized dataset, the corresponding AUC values were 0.72, 0.84, 0.83, and 0.83. Apart from SVM, the machine learning models (random forest and XGBoost) had predictive ability similar to that of logistic regression, but their accuracy and Brier scores were better, giving better overall classification performance. Conclusion Machine learning algorithms perform better than traditional logistic regression in predicting early mortality in patients with ischemic stroke.
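The abstract names SMOTE and three of its evaluation metrics (AUC, F1-score, Brier score) without defining them. The following is a minimal sketch of how each can be computed, written with the Python standard library only. It is not the authors' code: the function names (`brier_score`, `f1_score`, `roc_auc`, `smote`), the neighbour count `k`, and the toy data are all illustrative assumptions, and a real analysis would more likely rely on established libraries.

```python
import math
import random


def brier_score(y_true, y_prob):
    """Mean squared gap between predicted probability and 0/1 outcome (lower is better)."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall for the positive class (label 1)."""
    tp = sum(1 for y, yh in zip(y_true, y_pred) if y == 1 and yh == 1)
    fp = sum(1 for y, yh in zip(y_true, y_pred) if y == 0 and yh == 1)
    fn = sum(1 for y, yh in zip(y_true, y_pred) if y == 1 and yh == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def roc_auc(y_true, y_score):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def smote(minority, n_new, k=5, seed=0):
    """Basic SMOTE idea: interpolate between a minority sample and one of its
    k nearest minority-class neighbours (Euclidean distance)."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        others = [x for x in minority if x is not base]
        neighbours = sorted(others, key=lambda x: math.dist(base, x))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from base to neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Toy, made-up data purely for illustration (not from MIMIC-IV).
minority = [[1.0, 2.0], [1.5, 1.8], [2.0, 2.2], [1.2, 2.5]]
new_points = smote(minority, n_new=6, k=2)
```

Because each synthetic point lies on a line segment between two existing minority samples, oversampling of this kind changes the class balance seen during training but adds no new outcome information, which is one reason to report calibration-sensitive metrics such as the Brier score alongside AUC.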
Authors
罗枭
程义
何倩
涂博祥
吴骋
贺佳
LUO Xiao; CHENG Yi; HE Qian; TU Bo-xiang; WU Cheng; HE Jia (Department of Military Health Statistics, Faculty of Health Services, Naval Medical University (Second Military Medical University), Shanghai 200433, China)
Source
Academic Journal of Naval Medical University (《海军军医大学学报》)
CAS
CSCD
PKU Core (Peking University Core Journals)
2022, Issue 12, pp. 1365-1371 (7 pages)
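Funding
Military Dual-Key Discipline Construction Project (03)
Shanghai Three-Year Action Plan for Public Health System Construction, Academic Leader Program (GWV-10.2-XD05)
Shanghai Three-Year Action Plan for Public Health System Construction, Discipline Construction Project (GWV-10.1-XK05)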
Keywords
severe ischemic stroke
early mortality prediction
machine learning
synthetic minority oversampling technique