Abstract: The linear regression model with a doubly sparse structure is a statistical model in which the explanatory variables are sparse both between and within groups, and the Sparse Group Lasso is commonly used for variable selection in this model. In many applications, however, the explanatory variables are difficult to measure precisely, so the effect of measurement error must be taken into account when applying the Sparse Group Lasso. To address this problem, this paper proposes a Sparse Group Lasso variable selection method for linear measurement error regression models with a doubly sparse structure (MESGL). The method first corrects the error in the observed data with a positive semidefinite projection operator, then recovers the corrected data with the ADMM algorithm, and finally applies the Sparse Group Lasso for variable selection and parameter estimation. Under some regularity conditions, we establish non-asymptotic oracle inequalities for the parameter estimator, and simulation studies confirm that MESGL performs well in both variable selection and parameter estimation.
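The following is a minimal sketch of the corrected-then-penalized pipeline described above, assuming additive measurement error W = X + U with a known error covariance Sigma_u. The surrogate Gram matrix W'W/n - Sigma_u may be indefinite, so it is projected onto the positive semidefinite cone before the fit; the sparse group lasso step is solved here by proximal gradient descent rather than the paper's ADMM recovery, and all data settings and tuning parameters are illustrative assumptions.

```python
import numpy as np

def psd_project(S):
    """Project a symmetric matrix onto the PSD cone by clipping negative eigenvalues."""
    w, V = np.linalg.eigh((S + S.T) / 2)
    return (V * np.clip(w, 0, None)) @ V.T

def sparse_group_lasso(Sigma, rho, groups, lam1, lam2, n_iter=500):
    """Minimize 0.5*b'Sigma b - rho'b + lam1*||b||_1 + lam2*sum_g ||b_g||_2
    by proximal gradient descent, where Sigma is the corrected Gram matrix
    and rho = W'y/n."""
    p = len(rho)
    step = 1.0 / max(np.linalg.eigvalsh(Sigma).max(), 1e-8)
    b = np.zeros(p)
    for _ in range(n_iter):
        z = b - step * (Sigma @ b - rho)                          # gradient step
        z = np.sign(z) * np.maximum(np.abs(z) - step * lam1, 0)   # L1 prox
        for idx in groups:                                        # group-wise L2 prox
            nrm = np.linalg.norm(z[idx])
            z[idx] = 0 if nrm == 0 else max(0, 1 - step * lam2 / nrm) * z[idx]
        b = z
    return b

# Usage on simulated data with measurement error (hypothetical settings):
rng = np.random.default_rng(0)
n, p = 200, 12
groups = [list(range(i, i + 3)) for i in range(0, p, 3)]          # 4 groups of size 3
X = rng.standard_normal((n, p))
beta = np.zeros(p); beta[:3] = [1.5, -2.0, 1.0]                   # sparsity within the first group
y = X @ beta + 0.5 * rng.standard_normal(n)
Sigma_u = 0.1 * np.eye(p)
W = X + rng.multivariate_normal(np.zeros(p), Sigma_u, n)          # covariates observed with error

Sigma_hat = psd_project(W.T @ W / n - Sigma_u)                    # error-corrected, PSD-projected Gram
rho_hat = W.T @ y / n
beta_hat = sparse_group_lasso(Sigma_hat, rho_hat, groups, lam1=0.05, lam2=0.05)
print(np.round(beta_hat, 2))
```

The joint proximal step is exact for the sparse group lasso penalty: soft-thresholding followed by group-wise shrinkage equals the proximal operator of the combined L1 plus group-L2 penalty.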
Abstract: We study the properties of the Lasso in the high-dimensional partially linear model, where the number of variables in the linear part can be greater than the sample size. We use a truncated series expansion based on polynomial splines to approximate the nonparametric component of this model. Under a sparsity assumption on the regression coefficients of the linear component and some regularity conditions, we derive oracle inequalities for the prediction risk and the estimation error. We also provide sufficient conditions under which the Lasso estimator is selection consistent for the variables in the linear part of the model. In addition, we derive the rate of convergence of the estimator of the nonparametric function. We conduct simulation studies to evaluate the finite-sample performance of variable selection and nonparametric function estimation.
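Below is a minimal sketch of this type of estimator, assuming the model y = X*beta + f(t) + eps with f approximated by a truncated power spline basis. The spline coefficients are left unpenalized by profiling them out (projecting onto the orthogonal complement of the basis columns), and the Lasso is applied to the linear part only; the basis choice, knot placement, and penalty level are illustrative assumptions rather than the paper's exact settings.

```python
import numpy as np
from sklearn.linear_model import Lasso

def spline_basis(t, knots, degree=3):
    """Truncated power basis: 1, t, ..., t^d and (t - k_j)_+^d for interior knots."""
    cols = [t ** j for j in range(degree + 1)]
    cols += [np.clip(t - k, 0, None) ** degree for k in knots]
    return np.column_stack(cols)

rng = np.random.default_rng(1)
n, p = 200, 300                                    # p > n: high-dimensional linear part
X = rng.standard_normal((n, p))
t = rng.uniform(0, 1, n)
beta = np.zeros(p); beta[:4] = [2.0, -1.5, 1.0, 3.0]   # sparse linear coefficients
f = np.sin(2 * np.pi * t)                          # nonparametric component
y = X @ beta + f + 0.3 * rng.standard_normal(n)

Z = spline_basis(t, knots=np.quantile(t, [0.25, 0.5, 0.75]))
Q, _ = np.linalg.qr(Z)
P = np.eye(n) - Q @ Q.T                            # projection removing the spline column space

# Lasso on the profiled (projected) data estimates the linear coefficients.
lasso = Lasso(alpha=0.05, fit_intercept=False).fit(P @ X, P @ y)
beta_hat = lasso.coef_

# Recover the spline coefficients by least squares on the partial residuals.
gamma_hat, *_ = np.linalg.lstsq(Z, y - X @ beta_hat, rcond=None)
f_hat = Z @ gamma_hat
print("selected variables:", np.flatnonzero(beta_hat != 0)[:10])
```

Profiling out the unpenalized spline coefficients in this way is equivalent to jointly minimizing the least-squares criterion over both coefficient blocks with an L1 penalty on the linear block only.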