Funding: Supported by the National Natural Science Foundation of China (No. 71462002), the Project for Teaching Reform of Guangxi (GXZZJG2017B084), and the Project for Fostering Distinguished Youth Scholars of Guangxi (2020KY50012).
Abstract: Variable selection plays an important role in high-dimensional data analysis. However, high-dimensional data often contain strongly correlated variables, which must be handled properly. In this paper, we propose an Elastic Net procedure for partially linear models and prove the group effect of its estimate. A simulation study shows that the Elastic Net procedure handles strongly correlated variables better than the Lasso, the ALasso and the Ridge do. A real-world data study indicates that the Elastic Net procedure is particularly useful when the number of predictors p is much larger than the sample size n.
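The group effect mentioned in the abstract above can be illustrated on the linear part alone. The following is a minimal sketch, assuming the nonparametric component has already been removed and assuming the objective (1/2n)||y - Xb||^2 + lam1*||b||_1 + (lam2/2)*||b||^2; the function names, the penalty parametrization and the toy data are illustrative assumptions, not the paper's procedure.

```python
def soft_threshold(z, t):
    """Soft-thresholding operator used by L1-penalized coordinate descent."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def elastic_net_cd(X, y, lam1, lam2, n_iter=200):
    """Cyclic coordinate descent for the assumed elastic net objective.

    X is a list of rows, y a list of responses. Assumes lam2 > 0 or that
    no column of X is identically zero (otherwise the update divides by 0).
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    resid = list(y)  # residual y - X @ beta, kept up to date
    for _ in range(n_iter):
        for j in range(p):
            col = [X[i][j] for i in range(n)]
            z = sum(c * c for c in col) / n
            # correlation of column j with the partial residual
            rho = sum(c * (r + c * beta[j]) for c, r in zip(col, resid)) / n
            new = soft_threshold(rho, lam1) / (z + lam2)
            delta = new - beta[j]
            resid = [r - c * delta for c, r in zip(col, resid)]
            beta[j] = new
    return beta


# Toy data: columns 0 and 1 are identical, column 2 is orthogonal noise.
x = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0, 1.0, -1.0]
w = [1.0, 1.0, -1.0, -1.0, 1.0, 1.0, -1.0, -1.0]
X_demo = [[xi, xi, wi] for xi, wi in zip(x, w)]
y_demo = [3.0 * xi for xi in x]
beta_demo = elastic_net_cd(X_demo, y_demo, lam1=0.1, lam2=0.5)
```

In this toy case the ridge term makes the objective strictly convex, so the two identical predictors receive the same coefficient, which is the behaviour the group effect describes; with `lam2 = 0` (a pure Lasso) the split between the two columns is not unique.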
Funding: Supported by the National Natural Science Foundation of China under Grant Nos. 11471060 and 11871124, and the Key Project of Statistical Science of China under Grant No. 2017LZ27.
Abstract: This paper proposes an empirical likelihood based diagnostic technique for heteroscedasticity in semiparametric varying-coefficient partially linear models with missing responses. First, the authors impute the missing responses by a regression method. Then, the empirical likelihood method is introduced to study the heteroscedasticity of the semiparametric varying-coefficient partially linear models with complete-case data. Finally, the authors examine the finite-sample properties by numerical simulation.
Funding: Supported by the University of Chinese Academy of Sciences under Grant No. Y95401TXX2, the Beijing Natural Science Foundation under Grant No. Z190004, and the Key Program of Joint Funds of the National Natural Science Foundation of China under Grant No. U19B2040.
Abstract: This paper considers tests for regression coefficients in high-dimensional partially linear models. The authors first use the B-spline method to estimate the unknown smooth function so that it can be expressed linearly. Then, the authors propose an empirical likelihood method to test the regression coefficients. The authors derive the asymptotic chi-squared distribution with two degrees of freedom of the proposed test statistic under the null hypothesis. In addition, the method is extended to testing with nuisance parameters. Simulations show that the proposed method performs well in terms of type-I error control and power. The proposed method is also applied to a Skin Cutaneous Melanoma (SKCM) data set.
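The B-spline step described above replaces the unknown smooth function by a linear combination of basis functions, so that the model becomes linear in the spline coefficients. A minimal sketch of evaluating such a basis with the Cox-de Boor recursion is given below; the clamped knot vector, the cubic degree and the name `bspline_basis` are illustrative assumptions, not details taken from the paper.

```python
def bspline_basis(x, knots, degree):
    """Evaluate all B-spline basis functions of the given degree at x.

    Uses the Cox-de Boor recursion with half-open knot intervals
    [knots[i], knots[i+1]), so x must be strictly below the last knot.
    Returns a list of len(knots) - degree - 1 values.
    """
    # Degree 0: indicator functions of the knot intervals.
    basis = [
        1.0 if knots[i] <= x < knots[i + 1] else 0.0
        for i in range(len(knots) - 1)
    ]
    for d in range(1, degree + 1):
        new_basis = []
        for i in range(len(knots) - d - 1):
            left = 0.0
            den = knots[i + d] - knots[i]
            if den > 0:  # skip terms over repeated knots (0/0 := 0)
                left = (x - knots[i]) / den * basis[i]
            right = 0.0
            den = knots[i + d + 1] - knots[i + 1]
            if den > 0:
                right = (knots[i + d + 1] - x) / den * basis[i + 1]
            new_basis.append(left + right)
        basis = new_basis
    return basis


# Assumed example: cubic splines on [0, 1) with three interior knots.
demo_knots = [0, 0, 0, 0, 0.25, 0.5, 0.75, 1, 1, 1, 1]
demo_row = bspline_basis(0.33, demo_knots, degree=3)
```

Stacking one such row per observation gives a design matrix for the nonparametric part; the rows sum to one, which is a convenient sanity check on the basis.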
Funding: Supported by the NSF of China (Nos. 11971208, 11601197), the NSSF of China (Grant No. 21&ZD152), the China Postdoctoral Science Foundation (Nos. 2016M600511, 2017T100475), the NSF of Jiangxi Province (Nos. 2018ACB21002, 20171ACB21030), and the Postgraduate Innovation Project of Jiangxi Province (No. YC2021CB124).
Abstract: In this paper, we consider statistical inference in a partially linear model when the model error follows an autoregressive process. A two-step procedure is proposed for estimating the unknown parameters by taking into account the special structure of the error. Since the asymptotic matrix of the estimator for the parametric part has a complex structure, an empirical likelihood function is also developed. We derive the asymptotic properties of the related statistics under mild conditions. Simulations and a real data example are used to illustrate the finite-sample performance.
Funding: Gaorong Li's research was supported in part by the National Natural Science Foundation of China [number 11471029]. Tiejun Tong's research was supported in part by the National Natural Science Foundation of China [number 11671338] and the Hong Kong Baptist University grants [grant number FRG2/15-16/019] and [grant number FRG1/16-17/018].
Abstract: In this paper, we study ultra-high-dimensional partially linear models when the dimension of the linear predictors grows exponentially with the sample size. For variable screening, we propose a sequential profile Lasso method (SPLasso) and show that it possesses the screening property. SPLasso can also detect all relevant predictors with probability tending to one, no matter whether the ultra-high models involve both parametric and nonparametric parts. To select the best subset among the models generated by SPLasso, we propose an extended Bayesian information criterion (EBIC) for choosing the final model. We also conduct simulation studies and analyze a real data example to assess the performance of the proposed method and compare it with the existing method.
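The final selection step above can be sketched with a commonly used Gaussian-regression form of the EBIC, n*log(RSS/n) + k*log(n) + 2*gamma*log C(p, k). This form, the tuning constant `gamma`, and the function names are assumptions made for illustration; the criterion actually used with SPLasso may differ in its details.

```python
import math


def ebic(rss, n, k, p, gamma=1.0):
    """Assumed EBIC for a Gaussian model with k of p predictors selected.

    rss is the residual sum of squares of the candidate model and n the
    sample size; gamma = 0 reduces the criterion to the ordinary BIC.
    """
    # log of the binomial coefficient C(p, k), computed stably via lgamma
    log_binom = math.lgamma(p + 1) - math.lgamma(k + 1) - math.lgamma(p - k + 1)
    return n * math.log(rss / n) + k * math.log(n) + 2.0 * gamma * log_binom


def select_by_ebic(candidates, n, p, gamma=1.0):
    """Pick the (k, rss) pair with the smallest EBIC from a nested path."""
    return min(candidates, key=lambda c: ebic(c[1], n, c[0], p, gamma))


# Hypothetical path of nested models: (number of predictors, RSS).
demo_path = [(1, 80.0), (2, 50.0), (3, 49.5)]
demo_choice = select_by_ebic(demo_path, n=100, p=1000)
```

In this made-up path the third predictor reduces the RSS only slightly, so the extra penalty for searching among p = 1000 candidates favours the two-predictor model.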
Abstract: In this article, we study variable selection for the partially linear single-index model (PLSIM). Based on minimum average variance estimation, variable selection for the PLSIM is carried out by minimizing the average variance with an adaptive l1 penalty. An implementation algorithm is given. Under some regularity conditions, we demonstrate the oracle properties of the aLASSO procedure for the PLSIM. Simulations are used to investigate the effectiveness of the proposed method for variable selection in the PLSIM.
Abstract: In this paper, an efficient shrinkage estimation procedure for the partially linear varying coefficient model (PLVC) with random effects is considered. By selecting the significant variables and estimating the nonzero coefficients, the model structure is specified through a novel penalized estimating equation. Under some mild conditions, the asymptotic properties of the proposed model selection and estimation results, such as sparsity and the oracle property, are established. Numerical simulation studies and a real data analysis are presented to examine the finite-sample performance of the procedure.
Funding: Supported by the National Natural Science Foundation of China under Grant Nos. 11971421, 71925007, 72091212, and 12288201, the Yunling Scholar Research Fund of Yunnan Province under Grant No. YNWR-YLXZ-2018-020, the CAS Project for Young Scientists in Basic Research under Grant No. YSBR-008, and the Start-Up Grant from Kunming University of Science and Technology under Grant No. KKZ3202207024.
Abstract: Prediction plays an important role in data analysis. Model averaging generally provides better prediction than any single one of its component models. Even though model averaging has been extensively investigated under independent errors, few authors have considered model averaging for semiparametric models with correlated errors. In this paper, the authors offer an optimal model averaging method to improve prediction in the partially linear model for longitudinal data. The model averaging weights are obtained by minimizing a criterion that is an unbiased estimator of the expected in-sample squared error loss plus a constant. Asymptotic properties, including asymptotic optimality and consistency of the averaging weights, are established under two scenarios: (i) all candidate models are misspecified; (ii) correct models are available in the candidate set. Simulation studies and an empirical example show the promise of the proposed procedure over other competitive methods.
Funding: Supported by the National Natural Science Foundation of China (Grant No. 12071348) and the Fundamental Research Funds for Central Universities, China (Grant No. 2023-3-2D-04).
Abstract: In this paper, we focus on partially linear varying-coefficient quantile regression with missing observations under ultra-high dimension, where either the responses, the covariates, or both the responses and part of the covariates are missing at random, and ultra-high dimension means that the dimension of the parameter is much larger than the sample size. Based on the B-spline method for the varying coefficient functions, we study the consistency of the oracle estimator, which is obtained using only the active covariates whose coefficients are nonzero. We also discuss the asymptotic normality of the oracle estimator for the linear parameter. Since the active covariates are unknown in practice, a non-convex penalized estimator is investigated for simultaneous variable selection and estimation, and its oracle property is also established. The finite-sample behavior of the proposed methods is investigated via simulations and real data analysis.
Funding: Supported by the Natural Science Foundation of Shaanxi Province under Grant No. 2021JM349 and the Natural Science Foundation of China under Grant Nos. 11972273 and 52170172.
Abstract: In many application fields of regression analysis, prior information about how the explanatory variables affect the response variable of interest is often available and can be formulated as constraints on the regression coefficients. In this paper, the authors consider statistical inference for the partially linear spatial autoregressive model under constraint conditions. By combining the series approximation method, the two-stage least squares method and the Lagrange multiplier method, the authors obtain constrained estimators of the parameters and the function in the partially linear spatial autoregressive model and investigate their asymptotic properties. Furthermore, the authors propose a testing method to check whether the parameters in the parametric component of the model satisfy linear constraint conditions, and derive the asymptotic distributions of the resulting test statistic under both the null and alternative hypotheses. Simulation results show that the proposed constrained estimators have better finite-sample performance than the unconstrained estimators and that the proposed testing method performs well in finite samples. A real example is also provided to illustrate the application of the proposed estimation and testing methods.
Funding: The National Natural Science Foundation of China under Grant Nos. 11971171, 11971300, 11901286, 12071267 and 12171310, the Shanghai Natural Science Foundation under Grant No. 20ZR1421800, the Open Research Fund of Key Laboratory of Advanced Theory and Application in Statistics and Data Science (East China Normal University), the General Research Fund (HKBU12303421, HKBU12303918), and the Initiation Grant for Faculty Niche Research Areas (RC-FNRA-IG/20-21/SCI/03) of Hong Kong Baptist University.
Abstract: The partially linear single-index model (PLSIM) is a flexible and powerful model for analyzing the relationship between the response and multivariate covariates. This paper considers the PLSIM with measurement error possibly in all the variables. The authors propose a new efficient estimation procedure based on local linear smoothing and the simulation-extrapolation method, and further establish the asymptotic normality of the proposed estimators for both the index parameter and the nonparametric link function. The authors also carry out extensive Monte Carlo simulation studies to evaluate the finite-sample performance of the new method, and apply it to the osteoporosis prevention data.
Funding: Supported by the National Natural Science Foundation of China (Grant No. 71420107025) and by the National Natural Science Foundation of China under Grant Nos. 11071022, 11028103, and 11231010.
Abstract: The authors study the empirical likelihood method for the partially linear errors-in-variables model with covariate data missing at random. Empirical likelihood ratios for the regression coefficients and the baseline function are investigated, and the corresponding empirical log-likelihood ratios are proved to be asymptotically standard chi-squared, which can be used to construct confidence regions. The finite-sample behavior of the proposed methods is evaluated by a simulation study, which indicates that the proposed methods are comparable in terms of coverage probabilities and average lengths of confidence intervals. Finally, the Earthquake Magnitude dataset is used to illustrate the proposed method.
Funding: Supported by the National Natural Science Foundation of China under Grant No. 11771250, the Natural Science Foundation of Shandong Province under Grant No. ZR2019MA002, and the Program for Scientific Research Innovation of Graduate Dissertation under Grant No. LWCXB201803.
Abstract: This paper considers partially linear additive models with a diverging number of parameters when some linear constraints on the parametric part are available. The paper proposes a constrained profile least-squares estimator for the parametric components, with the nonparametric functions estimated by basis function approximations. The consistency and asymptotic normality of the restricted estimator are established under certain conditions. The authors construct a profile likelihood ratio test statistic to test the validity of the linear constraints on the parametric components, and demonstrate that it asymptotically follows chi-squared distributions under the null and alternative hypotheses. The finite-sample performance of the proposed method is illustrated by simulation studies and a data analysis.
Funding: The research work was supported by the natural science fund for colleges and universities in Jiangsu Province (Nos. 15KJB460016 and 14KJB460029), the major industrial technology project in Xuzhou city (No. KC16GZ015), the major industrial technology project in Jiangsu Province (No. BE2016047), and the natural science foundation of China (No. 71561016). The author would also like to gratefully acknowledge Professor Fugee Tsung and the other colleagues at Hong Kong University of Science and Technology for their valuable comments.
Abstract: Dimensional variation analysis in multistation manufacturing processes (MMPs) is a challenging research topic with great practical significance. Researchers have focused on constructing various mathematical models to identify the correlations among the huge amounts of collected production data. However, current models provide insufficient insight into the variation correlation laws because of the complexity of the data's mutual relations. In this study, a data-driven modeling method is developed for deep data mining and dimensional variation analysis. The proposed initial mathematical expression originates from practical engineering knowledge. Through a mathematical treatment, the expression is transformed into a first-order autoregressive AR(1) model format, which contains the interstation and temporal correlation information of multiple dimensional variations. To obtain this information, the estimation of the proposed model is discussed in detail. A simulation case involving two key product characteristics of a grinding process is used to demonstrate the effectiveness and accuracy of the proposed method for dimensional variation analysis in MMPs.
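To make the AR(1) format concrete, the sketch below fits a scalar, zero-mean AR(1) model e_t = rho * e_{t-1} + noise by least squares. The model described above carries several dimensional variations and interstation terms, so this scalar version, together with the function names, is a deliberately simplified assumption rather than the estimation method discussed in the paper.

```python
def fit_ar1(series):
    """Least-squares estimate of rho in e_t = rho * e_{t-1} + noise.

    Assumes a zero-mean series with at least two points and a nonzero
    sum of squares over the lagged values.
    """
    lagged = series[:-1]
    current = series[1:]
    num = sum(c * l for c, l in zip(current, lagged))
    den = sum(l * l for l in lagged)
    return num / den


def predict_next(series, rho):
    """One-step-ahead prediction under the fitted scalar AR(1) model."""
    return rho * series[-1]


# Noise-free toy series that decays geometrically with ratio 0.5.
demo_series = [0.5 ** k for k in range(20)]
demo_rho = fit_ar1(demo_series)
```

On the noise-free toy series the estimate recovers the decay ratio; with real measurements the same ratio-of-sums formula applies after centering the series.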
Funding: Supported by the National Social Science Foundation of China (Grant No. 22BTJ059).
Abstract: In this paper, we consider the partial linear regression model $y_i = x_i\beta^* + g(t_i) + \varepsilon_i$, $i = 1, 2, \ldots, n$, where $(x_i, t_i)$ are known fixed design points, $g(\cdot)$ is an unknown function, $\beta^*$ is an unknown parameter to be estimated, and the random errors $\varepsilon_i$ are $(\alpha, \beta)$-mixing random variables. The $p$-th ($p > 1$) mean consistency, strong consistency and complete consistency of the least squares estimators of $\beta^*$ and $g(\cdot)$ are investigated under some mild conditions. In addition, a numerical simulation is carried out to study the finite-sample performance of the theoretical results. Finally, a real data analysis is provided to further illustrate the effectiveness of the model.
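One common way to compute least squares estimators in a partial linear regression model of this form is to smooth out the dependence on t first and then regress residuals on residuals. The sketch below follows that idea for a scalar covariate; the Gaussian-kernel Nadaraya-Watson smoother, the bandwidth `h`, the independent simulated errors and all function names are assumptions for illustration and are not the weights or the mixing errors studied in the paper.

```python
import math
import random


def nw_weights(t0, ts, h):
    """Nadaraya-Watson weights at t0 with a Gaussian kernel (assumed)."""
    k = [math.exp(-0.5 * ((t0 - t) / h) ** 2) for t in ts]
    total = sum(k)
    return [v / total for v in k]


def smooth(values, ts, h):
    """Kernel-smoothed version of `values` evaluated at every design point."""
    return [
        sum(w * v for w, v in zip(nw_weights(t0, ts, h), values))
        for t0 in ts
    ]


def profile_least_squares(x, t, y, h):
    """Estimate beta and g in y_i = x_i * beta + g(t_i) + eps_i."""
    x_res = [xi - si for xi, si in zip(x, smooth(x, t, h))]
    y_res = [yi - si for yi, si in zip(y, smooth(y, t, h))]
    beta = sum(a * b for a, b in zip(x_res, y_res)) / sum(a * a for a in x_res)
    # Smooth what is left after removing the fitted linear part.
    g_hat = smooth([yi - beta * xi for xi, yi in zip(x, y)], t, h)
    return beta, g_hat


def simulate(n=200, beta=2.0, seed=0):
    """Toy data with g(t) = sin(2*pi*t) and independent Gaussian errors."""
    rng = random.Random(seed)
    t = [(i + 0.5) / n for i in range(n)]
    x = [rng.gauss(0.0, 1.0) for _ in range(n)]
    y = [
        beta * xi + math.sin(2 * math.pi * ti) + rng.gauss(0.0, 0.1)
        for xi, ti in zip(x, t)
    ]
    return x, t, y


x_demo, t_demo, y_demo = simulate()
beta_hat, g_hat = profile_least_squares(x_demo, t_demo, y_demo, h=0.05)
```

The smoother costs O(n^2) operations, which is adequate for a demonstration; the bandwidth trades bias in the estimate of g against variance and would normally be chosen by cross-validation.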
Funding: The National Natural Science Foundation of China (No. 10431010).
Abstract: The partially linear varying coefficient model is a generalization of the partially linear model and the varying coefficient model and is frequently used in statistical modeling. In this paper, we construct estimators of the parametric and nonparametric components by a profile least-squares procedure based on local linear smoothing. The resulting estimators are shown to be asymptotically normal under heteroscedastic errors.