Funding: Supported by the National Natural Science Foundation of China (71071077), the Ministry of Education Key Project of National Educational Science Planning (DFA090215), the China Postdoctoral Science Foundation (20100481137), and the Funding of Jiangsu Innovation Program for Graduate Education (CXZZ11-0226).
Abstract: The construction of the background value in the original multi-variable grey model (MGM(1,m)) is improved by addressing the source of its construction errors. The MGM(1,m) with optimized background value is used to eliminate random fluctuations or errors in the observational data of all variables, and it is then combined with multiple linear regression to improve the simulation and prediction accuracy of the combined model. Finally, a combined model of the MGM(1,2) with optimized background value and binary linear regression is constructed for an example. The results show that the model performs well in both simulation and prediction.
Funding: Thank you for your valuable comments and suggestions. This research was supported by the Yunnan Applied Basic Research Project (No. 2017FD150) and the Chuxiong Normal University General Research Project (No. XJYB2001).
Abstract: This paper selects seven indicators of financial revenue and housing sales price in China over the most recent 19 years and uses SPSS and Excel to carry out descriptive statistics, independent-samples t-tests, correlation analysis, and regression analysis in order to study the correlation between financial revenue and housing sales price in China. The fitted relationship between the two indicates that when the average selling price of commercial housing increases by one unit, fiscal revenue increases by 27.855 points.
Funding: The National Natural Science Foundation of China under contract No. 11174235; the Science and Technology Development Project of Shaanxi Province of China under contract No. 2010KJXX-02; the Program for New Century Excellent Talents in University of China under contract No. NCET-08-0455; the Science and Technology Innovation Foundation of Northwestern Polytechnical University of China; the Doctorate Foundation of Northwestern Polytechnical University of China under contract No. CX201226.
Abstract: The multiple linear regression (MLR) method was applied to quantify the effects of the net heat flux (NHF), the net freshwater flux (NFF) and the wind stress on the mixed layer depth (MLD) of the South China Sea (SCS), based on the simple ocean data assimilation (SODA) dataset. The spatio-temporal distributions of the MLD, the buoyancy flux (combining the NHF and the NFF) and the wind stress of the SCS were presented. Then, using an oceanic vertical mixing model, the MLD after a certain time was simulated under the same initial conditions but various combinations of boundary conditions (the three factors). Applying the MLR method to the results, regression equations that model the relationship between the simulated MLD and the three factors were calculated. The equations indicate that when the NHF was negative, it was the primary driver of mixed layer deepening; when the NHF was positive, the wind stress played a more important role than the NHF, while the NFF had the least effect. When the NHF was positive, the relative quantitative effects of the wind stress, the NHF, and the NFF were about 10, 6 and 2. These conclusions were applied to explain the spatio-temporal distributions of the MLD in the SCS and were thus shown to be valid.
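One common way to compare the relative effect of several drivers in an MLR is to fit the regression on standardized (z-scored) variables and compare the magnitudes of the coefficients. The sketch below illustrates that idea only; it is not the authors' procedure. The data are synthetic, the column meanings are hypothetical, and the weights 10, 6 and 2 are chosen merely to echo the ratio quoted in the abstract.

```python
import numpy as np

def standardized_mlr(X, y):
    """Least-squares fit of y on X after z-scoring every variable.

    Returns the standardized coefficients, whose magnitudes can be
    compared across predictors measured in different units.
    """
    Xz = (X - X.mean(axis=0)) / X.std(axis=0)
    yz = (y - y.mean()) / y.std()
    coef, *_ = np.linalg.lstsq(Xz, yz, rcond=None)
    return coef

# Synthetic stand-in data (assumption): columns play the role of
# wind stress, NHF and NFF; the response plays the role of the MLD.
rng = np.random.default_rng(0)
n = 500
X = rng.normal(size=(n, 3))
y = 10 * X[:, 0] + 6 * X[:, 1] + 2 * X[:, 2] + rng.normal(scale=0.5, size=n)

beta = standardized_mlr(X, y)
# Share of each predictor in the summed absolute standardized effect.
relative = np.abs(beta) / np.abs(beta).sum()
print("standardized coefficients:", beta)
print("relative effects:", relative)
```

Standardizing first matters because raw coefficients depend on the units of each predictor, so their magnitudes are not directly comparable.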
Abstract: Rivers are important systems that provide water to fulfill human needs. However, excessive human use over the years has led to deterioration in river water quality, causing health problems from contaminated water. This study focuses on the application of statistical techniques, namely the Multiple Linear Regression model and MANOVA, to assess health impacts due to pollution in the Cauvery river stretch in Srirangapatna. Using Multiple Linear Regression, it is found that the health impact level is 60.8% dependent on the water quality parameters BOD, COD, TDS, TC and FC. The t-statistics and their associated two-tailed p-values indicate that COD and TDS produce health impacts, compared with BOD, TC and FC, when their effects are combined across all six sampling stations in Srirangapatna. Further, the Pearson correlation matrix shows highly significant positive correlations among parameters across all stations, indicating the possibility of common sources of origin that might be anthropogenic. Graphs plotted for individual parameters across all stations also reveal that COD and TDS values are significant across all sampling stations, though their values are higher in impact stations, causing health impacts.
Abstract: In this paper we consider a linear regression model with fixed design. A new rule for the selection of a relevant submodel is introduced on the basis of parameter tests. One particular feature of the rule is that subjective grading of the model complexity can be incorporated. We provide bounds for the mis-selection error. Simulations show that by using the proposed selection rule, the mis-selection error can be controlled uniformly.
Abstract: Medical research data are often skewed and heteroscedastic. It has therefore become common practice to log-transform data in regression analysis in order to stabilize the variance. Regression analysis on log-transformed data estimates the relative effect, whereas it is often the absolute effect of a predictor that is of interest. We propose a maximum likelihood (ML)-based approach to estimate a linear regression model on log-normal, heteroscedastic data. The new method was evaluated with a large simulation study. Log-normal observations were generated according to the simulation models, and parameters were estimated using the new ML method, ordinary least-squares regression (LS) and weighted least-squares regression (WLS). All three methods produced unbiased estimates of parameters and expected response, and ML and WLS yielded smaller standard errors than LS. The approximate normality of the Wald statistic, used for tests of the ML estimates, produced the correct type I error risk in most situations. Only ML and WLS produced correct confidence intervals for the estimated expected value. ML had the highest power for tests regarding β1.
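The LS versus WLS part of such a simulation can be sketched in a few lines. The example below is a minimal illustration under stated assumptions, not a reproduction of the study: it omits the proposed ML estimator entirely, the linear mean 2 + 3x and the coefficient of variation 0.3 are hypothetical, and the WLS weights use the true mean, which is known only because the data are simulated.

```python
import numpy as np

def ols(X, y):
    """Ordinary least squares via a numerically stable solver."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

def wls(X, y, w):
    """Weighted least squares: rescale rows by sqrt(weight), then OLS."""
    sw = np.sqrt(w)
    return np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)[0]

def simulate(n_rep=300, n=100, seed=0):
    """Repeatedly draw log-normal responses with a linear mean and fit both ways."""
    rng = np.random.default_rng(seed)
    x = np.linspace(1.0, 10.0, n)
    X = np.column_stack([np.ones(n), x])
    mean = 2.0 + 3.0 * x               # hypothetical true expected response
    cv = 0.3                           # hypothetical constant coefficient of variation
    sigma2 = np.log(1.0 + cv ** 2)     # variance on the log scale
    mu_log = np.log(mean) - sigma2 / 2 # ensures E[Y] equals `mean`
    est_ols, est_wls = [], []
    for _ in range(n_rep):
        y = rng.lognormal(mean=mu_log, sigma=np.sqrt(sigma2))
        est_ols.append(ols(X, y))
        # Var(Y) is proportional to mean**2, so weight by 1 / mean**2.
        est_wls.append(wls(X, y, 1.0 / mean ** 2))
    return np.array(est_ols), np.array(est_wls)

est_ols, est_wls = simulate()
print("mean OLS estimate:", est_ols.mean(axis=0))
print("mean WLS estimate:", est_wls.mean(axis=0))
print("empirical SE, OLS:", est_ols.std(axis=0))
print("empirical SE, WLS:", est_wls.std(axis=0))
```

In practice the weights would have to be estimated, for example from a preliminary fit, which is one reason a likelihood-based approach can be attractive.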
Funding: Funded by Asia-Pacific Forests Net (APFNET/2010/FPF/001), the National Natural Science Foundation of China (Grant No. 31400552), and Forestry industry research special funds for public welfare projects (201404402).
Abstract: The purpose of this study was to determine a suitable model for investigating the effects of climate factors on the area burned by forest fire in the Tahe forest region, Daxing'an Mountains, in northeast China. The response variables were the area burned by lightning-caused fire, the area burned by human-caused fire, and the total burned area. The predictor variables were nine climate variables collected from the local weather station. Three regression models were used: multiple linear regression, a log-linear model (log-transformation of both response and predictor variables), and a gamma generalized linear model. The goodness-of-fit of the models was compared based on model fitting statistics such as R², AIC, and RMSE. The results revealed that the gamma generalized linear model was generally superior to both the multiple linear regression model and the log-linear model for fitting the fire data. Further, the best models were selected based on the criterion that the climate variables were statistically significant at α = 0.05. The best gamma models indicated that maximum wind speed, precipitation, and the number of days with rainfall greater than 0.1 mm had significant impacts on the area burned by lightning-caused fire, while mean temperature and minimum relative humidity were the main drivers of the area burned by human-caused fire. Overall, the total area burned by forest fire was significantly influenced by the number of days with rainfall greater than 0.1 mm and by minimum relative humidity, indicating that the moisture condition of forest stands determines the area burned by forest fire.
Abstract: This paper considers approaches and methods for reducing the influence of multicollinearity. Great attention is paid to the question of using shrinkage estimators for this purpose. Two classes of regression models are investigated: the first corresponds to systems with negative feedback, while the second represents systems without feedback. In the first case the use of shrinkage estimators, especially the Principal Component estimator, is inappropriate, but it is possible in the second case with the right choice of the regularization parameter or of the number of principal components included in the regression model. This fact is substantiated by the study of the distribution of the random variable , where b is the LS estimate and β is the true coefficient, since the form of this distribution is the basic characteristic of the specified classes. For this study, a regression approximation of the distribution of the event based on the Edgeworth series was developed. Alternative approaches to resolving the multicollinearity issue are also examined, including an application of the known Inequality Constrained Least Squares method and the Dual estimator method proposed by the author. It is shown that with a priori information the Euclidean distance between the estimates and the true coefficients can be significantly reduced.
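To make the notion of a shrinkage estimator concrete, the sketch below compares the LS estimate with a ridge estimate on two nearly collinear predictors. It is an illustration only: ridge regression is used here as one familiar shrinkage estimator, not as the Dual estimator or the Inequality Constrained Least Squares method named in the abstract, and the data, the true coefficients and the regularization parameter `lam = 1.0` are all hypothetical.

```python
import numpy as np

def ridge(X, y, lam):
    """Ridge estimate (X'X + lam*I)^(-1) X'y; lam = 0 gives the LS estimate."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)

# Synthetic, nearly collinear predictors (assumption): x2 is x1 plus tiny noise.
rng = np.random.default_rng(1)
n = 200
x1 = rng.normal(size=n)
x2 = x1 + 0.01 * rng.normal(size=n)
X = np.column_stack([x1, x2])           # no intercept; predictors are roughly centered
beta_true = np.array([1.0, 1.0])        # hypothetical true coefficients
y = X @ beta_true + rng.normal(scale=0.5, size=n)

b_ls = ridge(X, y, 0.0)
b_ridge = ridge(X, y, 1.0)

print("LS estimate:   ", b_ls)
print("ridge estimate:", b_ridge)
print("distance to truth, LS:   ", np.linalg.norm(b_ls - beta_true))
print("distance to truth, ridge:", np.linalg.norm(b_ridge - beta_true))
```

For any positive `lam` the ridge coefficient vector has a smaller norm than the LS one. Whether it also lies closer to the true coefficients depends on the data-generating system, which is the distinction the abstract draws between its two classes of models.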
Abstract: Glacier response patterns at the catchment scale are highly heterogeneous and defined by a complex interplay of various dynamics and surface factors. Previous studies have explained heterogeneous responses qualitatively, but quantitative assessment is still lacking where an intrazone homogeneous climate assumption can be valid. Hence, in the current study, the reason for heterogeneous mass balance is explained quantitatively using a multiple linear regression model in the Sikkim Himalayan region. First, the topographical parameters are selected from previously published studies; then the most significant topographical and geomorphological parameters are selected with backward stepwise subset selection. Finally, the contributions of the selected parameters are calculated by the least squares method. The results show that the mass balance lies between -0.003 ± 0.24 and -1.029 ± 0.24 m w.e. a^(-1) between 2000 and 2020 in the Sikkim Himalaya region. The study also shows that, out of the terminus type, glacier area, debris cover, ice-mixed debris, slope, aspect, mean elevation, and snout elevation of the glaciers, only the terminus type and mean elevation significantly alter the glacier mass balance in the Sikkim Himalayan region. Quantitatively, the mass loss is approximately 0.40 m w.e. a^(-1) higher in lake-terminating glaciers than in land-terminating glaciers in the same elevation zone. On the other hand, a 1000 m drop in mean elevation is associated with 0.179 m w.e. a^(-1) of mass loss regardless of the terminus type. In the current study, the model using the terminus type and the mean elevation of the glaciers explains 76% of the variation in mass balance in the Sikkim Himalayan region.
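A regression with one categorical predictor (terminus type) and one continuous predictor (mean elevation) can be written with a 0/1 dummy variable. The sketch below shows that structure on synthetic data only. The generating coefficients are set to the magnitudes quoted in the abstract (0.40 and 0.179) purely for illustration, while the intercept, noise level, sample size and elevation range are hypothetical, so the output says nothing about the actual glacier dataset.

```python
import numpy as np

# Synthetic glacier sample (assumption): none of these values are observed data.
rng = np.random.default_rng(42)
n = 60
lake = rng.integers(0, 2, size=n).astype(float)  # 1 = lake-terminating, 0 = land-terminating
elev_km = rng.uniform(4.5, 6.0, size=n)          # mean elevation in km
mass_balance = (-1.4 - 0.40 * lake + 0.179 * elev_km
                + rng.normal(scale=0.05, size=n))  # m w.e. per year

# Design matrix: intercept, terminus-type dummy, mean elevation.
X = np.column_stack([np.ones(n), lake, elev_km])
coef, *_ = np.linalg.lstsq(X, mass_balance, rcond=None)

# Coefficient of determination: share of variance explained by the model.
resid = mass_balance - X @ coef
r2 = 1.0 - (resid @ resid) / np.sum((mass_balance - mass_balance.mean()) ** 2)

print("intercept, lake effect, elevation effect per km:", coef)
print("R^2:", r2)
```

The coefficient on the dummy is the mass-balance difference between the two terminus types at equal mean elevation, which is how a statement such as "0.40 higher mass loss in the same elevation zone" is read off the fitted model.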
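Abstract: Objective: To scientifically evaluate the nutritional quality of Furong plum (芙蓉李) fruit during ripening and to establish the relationship between the apparent characteristics given by chromaticity values and nutritional quality. Methods: Taking Furong plum, the main cultivar grown in Fujian Province, as the research object, 13 quality indicators during ripening (fructose, glucose, sucrose, malic acid, quinic acid, succinic acid, citric acid, fumaric acid, cyanidin-3-rutinoside, cyanidin-3-glucoside, polyphenols, flavonoids, and carotenoids) were analyzed and comprehensively evaluated. Results: During ripening, the contents of the quality indicators changed significantly (P<0.05). Correlation analysis, factor analysis, and absolute principal component scores-multiple linear regression (APCS-MLR) analysis were used together to screen the main indicators reflecting the comprehensive quality of Furong plum. Factor analysis extracted 3 main factors with contribution rates of 52.677%, 23.468%, and 11.649%, giving a cumulative contribution rate of 87.794%. Based on APCS-MLR and other statistical analyses, main factor 1 contributed mostly to fructose, cyanidin-3-rutinoside, and cyanidin-3-glucoside, with contribution rates of 53.00%, 73.85%, and 55.54%, respectively; main factor 2 contributed mostly to sucrose, fumaric acid, fructose, and citric acid, with contribution rates of 28.26%, 18.70%, 16.14%, and 15.59%, respectively; main factor 3 contributed mostly to polyphenols (29.13%) and flavonoids (28.28%). Fructose, glucose, cyanidin-3-rutinoside, and cyanidin-3-glucoside, for which the total contribution rate of the 3 main factors exceeded 60%, were selected as the main indicators for comprehensive quality evaluation. Stepwise multiple linear regression was then performed between each of the 4 selected indicators and the chromaticity values to build apparent prediction models. All models fitted well, and the root mean square errors between predicted and measured values were small. Further validation showed that predicting the 4 indicators from chromaticity values is highly reliable and accurate. Conclusion: The main indicators and prediction models screened in this study allow the comprehensive quality of Furong plum fruit during ripening to be evaluated more simply and conveniently.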