Abstract
Parkinson's disease is a chronic neurological disorder that seriously affects the quality of life of middle-aged and elderly people, and its motor function score (motor-UPDRS) is crucial for assessing disease severity and treatment effect. Using the remote voice dataset of Parkinson's disease from UCI, this paper first applies lasso variable selection to screen out 15 important features that affect the motor function score, and then builds four machine learning models: Gaussian process regression, support vector regression, random forest, and XGBoost. After training, Bayesian hyperparameter tuning, and performance evaluation, the XGBoost model performs best in predicting the motor function score of patients with Parkinson's disease, with an RMSE, MAE, and R² of 1.18, 0.63, and 0.98, respectively. Finally, the feature importance of the XGBoost model is analyzed to identify the key features affecting the prediction, providing theoretical and technical support for the diagnosis and treatment of Parkinson's disease.
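The three evaluation metrics named in the abstract (RMSE, MAE, and R²) follow standard definitions, which the minimal sketch below computes in plain Python. It is an illustration only, not the authors' code: the function name `regression_metrics` and the sample values are hypothetical and do not come from the paper or its dataset.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return (RMSE, MAE, R2) for two equal-length sequences of numbers."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]

    # Residual sum of squares, shared by RMSE and R2.
    ss_res = sum(e * e for e in errors)
    rmse = math.sqrt(ss_res / n)
    mae = sum(abs(e) for e in errors) / n

    # Total sum of squares around the mean of the observed values.
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R2 is undefined when y_true is constant")
    r2 = 1.0 - ss_res / ss_tot
    return rmse, mae, r2


# Hypothetical motor-UPDRS scores, for illustration only (not from the paper).
y_true = [10.0, 20.0, 30.0, 40.0]
y_pred = [12.0, 18.0, 33.0, 39.0]
rmse, mae, r2 = regression_metrics(y_true, y_pred)
print(f"RMSE={rmse:.3f}  MAE={mae:.3f}  R2={r2:.3f}")
```

Lower RMSE and MAE and an R² closer to 1 indicate a better fit, which is the sense in which the abstract ranks XGBoost first among the four models.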
Source
《统计学与应用》
2024, No. 5, pp. 1663-1676 (14 pages)
Statistics and Application