Abstract
Variable selection is the first step in multiple regression, and its outcome directly affects both the quality of parameter estimation and the prediction accuracy of the model. This paper uses Monte Carlo simulation to compare the variable selection results of the Lasso, Elastic Net, and SCAD methods for Logistic regression models, and finds that the SCAD method can fairly accurately compress the regression coefficients of unimportant variables to zero. The model is also applied to real data, where it achieves results consistent with the simulation and performs well both in handling multicollinearity among the independent variables and in improving prediction accuracy.
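To make concrete why SCAD can set small coefficients exactly to zero while leaving large ones essentially unshrunk, here is a minimal sketch of the three penalties named in the abstract, together with the univariate thresholding rules for the Lasso and SCAD. It is not the authors' code. The function names, the default `a = 3.7`, and the mixing parameter `alpha` are illustrative assumptions, and the closed-form thresholding rules apply to the simplified orthonormal least-squares setting rather than to the Logistic regression model studied in the paper.

```python
import math


def lasso_penalty(theta, lam):
    """L1 penalty: grows linearly in |theta| without bound."""
    return lam * abs(theta)


def elastic_net_penalty(theta, lam, alpha=0.5):
    """Mix of L1 and L2 penalties; alpha=1 recovers the Lasso penalty."""
    return lam * (alpha * abs(theta) + (1 - alpha) * theta ** 2 / 2)


def scad_penalty(theta, lam, a=3.7):
    """SCAD penalty: L1 near zero, quadratic transition, then constant."""
    t = abs(theta)
    if t <= lam:
        return lam * t
    if t <= a * lam:
        return (2 * a * lam * t - t ** 2 - lam ** 2) / (2 * (a - 1))
    return lam ** 2 * (a + 1) / 2


def soft_threshold(z, lam):
    """Lasso rule (orthonormal design assumed): shrink every value by lam."""
    return math.copysign(max(abs(z) - lam, 0.0), z)


def scad_threshold(z, lam, a=3.7):
    """SCAD rule (orthonormal design assumed, a > 2).

    Small inputs are soft-thresholded, mid-range inputs are shrunk less,
    and inputs beyond a * lam are returned unchanged.
    """
    t = abs(z)
    if t <= 2 * lam:
        return soft_threshold(z, lam)
    if t <= a * lam:
        return ((a - 1) * z - math.copysign(a * lam, z)) / (a - 2)
    return z


if __name__ == "__main__":
    lam = 1.0
    # Compare how each rule treats small, medium, and large inputs.
    for z in (0.5, 1.5, 3.0, 5.0):
        print(z, soft_threshold(z, lam), scad_threshold(z, lam))
```

In this simplified setting both rules map inputs below `lam` to zero, but the Lasso rule keeps subtracting `lam` from large inputs while the SCAD rule leaves anything beyond `a * lam` untouched. A penalized Logistic regression fit would typically be obtained iteratively rather than from these closed forms.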
Authors
Wang Qian (王倩)
Li Fengjun (李风军)
School of Mathematics and Statistics, Ningxia University, Yinchuan 750021, China
Source
Statistics & Decision (《统计与决策》)
CSSCI
Peking University Core Journal (北大核心)
2021, Issue 16, pp. 48-51 (4 pages)
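Funding
National Natural Science Foundation of China (12061055)
Natural Science Foundation of Ningxia (2020AAC03030, 2021AAC03175)
Ningxia University Graduate Innovation Project (GIP2020-31)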
Keywords
variable selection
Logistic regression
coefficient compression method
generalized linear model