Abstract: Count data that exhibit overdispersion (the variance of the counts is larger than their mean) are commonly analyzed using discrete distributions such as the negative binomial, the Poisson inverse Gaussian and other models. The Poisson distribution is characterized by the equality of mean and variance, whereas the negative binomial and the Poisson inverse Gaussian have variance larger than the mean and are therefore more appropriate for modeling overdispersed count data. As an alternative to these two models, we use the generalized Poisson distribution for group comparisons in the presence of multiple covariates. This problem is known as analysis of covariance (ANCOVA) and is solved for continuous data. Our objectives were to develop ANCOVA using the generalized Poisson distribution and to compare its goodness of fit with that of nonparametric generalized additive models. Using real-life data, we show that the model performs quite satisfactorily when compared with nonparametric generalized additive models.
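The abstract does not state which parameterization of the generalized Poisson distribution is used. As a minimal sketch, assuming the common Consul form with rate `theta > 0` and dispersion `0 <= lam < 1` (both names are illustrative, not taken from the paper), the following checks numerically that the mean is `theta / (1 - lam)` and the variance is `theta / (1 - lam)**3`, so the variance exceeds the mean whenever `lam > 0`:

```python
import math

def generalized_poisson_pmf(x, theta, lam):
    """Consul-type generalized Poisson pmf (assumed form), theta > 0, 0 <= lam < 1."""
    # Work on the log scale to avoid overflow for large x.
    log_p = (math.log(theta) + (x - 1) * math.log(theta + lam * x)
             - theta - lam * x - math.lgamma(x + 1))
    return math.exp(log_p)

def numerical_moments(theta, lam, upper=2000):
    """Total probability, mean and variance by direct summation over 0..upper-1."""
    probs = [generalized_poisson_pmf(x, theta, lam) for x in range(upper)]
    total = sum(probs)
    mean = sum(x * p for x, p in enumerate(probs))
    var = sum((x - mean) ** 2 * p for x, p in enumerate(probs))
    return total, mean, var

# Hypothetical parameter values, chosen only for illustration.
theta, lam = 2.0, 0.3
total, mean, var = numerical_moments(theta, lam)
print(total, mean, var)
```

Setting `lam = 0` recovers the ordinary Poisson distribution, for which the mean and variance coincide.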
Abstract: Background: In this paper, a regression model for predicting the spatial distribution of forest cockchafer larvae in the Hessian Ried region (Germany) is presented. The forest cockchafer, a native biotic pest, is a major cause of damage in forests in this region, particularly during the regeneration phase. The model developed in this study is based on a systematic sample inventory of forest cockchafer larvae by excavation across the Hessian Ried. These forest cockchafer larvae data were characterized by excess zeros and overdispersion. Methods: Using specific generalized additive regression models, different discrete distributions, including the Poisson, negative binomial and zero-inflated Poisson distributions, were compared. The methodology employed allowed the simultaneous estimation of non-linear model effects of causal covariates and, to account for spatial autocorrelation, of a 2-dimensional spatial trend function. In the validation of the models, both the Akaike information criterion (AIC) and more detailed graphical procedures based on randomized quantile residuals were used. Results: The negative binomial distribution was superior to the Poisson and the zero-inflated Poisson distributions, providing a near perfect fit to the data, which was confirmed in an extensive validation process. The causal predictors found to affect the density of larvae significantly were distance to the water table and percentage of pure clay layer in the soil to a depth of 1 m. Model predictions showed that larva density increased with an increase in distance to the water table up to almost 4 m, after which it remained constant, and with a reduction in the percentage of pure clay layer. However, this latter correlation was weak and requires further investigation. The 2-dimensional trend function indicated a strong spatial effect, and thus explained by far the highest proportion of variation in larva density.
Conclusions: As such, the model can be used to support forest practitioners in their decision making for regeneration and forest protection planning in the Hessian Ried. However, the application of the model for predicting future spatial patterns of the larva density is still somewhat limited because the causal effects are comparatively weak.
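To make "excess zeros" concrete, here is a minimal sketch of the zero-inflated Poisson (ZIP) probability mass function, one of the candidate distributions named in the Methods. The parameter names `lam` (Poisson mean) and `pi` (extra-zero probability) and the numeric values are illustrative assumptions, not quantities reported in the study:

```python
import math

def poisson_pmf(k, lam):
    """Ordinary Poisson pmf, computed on the log scale."""
    return math.exp(k * math.log(lam) - lam - math.lgamma(k + 1))

def zip_pmf(k, lam, pi):
    """Zero-inflated Poisson: with probability pi the count is a structural zero,
    otherwise it is drawn from Poisson(lam)."""
    base = (1.0 - pi) * poisson_pmf(k, lam)
    return pi + base if k == 0 else base

# Hypothetical values: mean larva count 2.5 per pit, 30% structural zeros.
lam, pi = 2.5, 0.3
print("P(0) Poisson:", poisson_pmf(0, lam))
print("P(0) ZIP:    ", zip_pmf(0, lam, pi))
```

Under these assumed values the ZIP model puts markedly more mass on zero than a Poisson model with the same `lam`, which is the feature that motivates comparing it with the Poisson and negative binomial alternatives.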
Abstract: In this paper, the relationship between the frequency of earthquake occurrence and magnitude is modeled with generalized linear models for a set of earthquake data from Nepal. Goodness of fit is assessed for the generalized linear models and, on the basis of the model selection criteria, the Akaike information criterion and the Bayesian information criterion, the generalized Poisson regression model is selected as a suitable model for the study. The objective of this study is to determine the parameters (a and b values) and to estimate the probability of an earthquake occurrence and its return period using a Poisson regression model, and to compare these with the Gutenberg-Richter model. The study suggests that the probabilities of earthquake occurrence and the return periods estimated by the two models are relatively close to each other. The return periods from the generalized Poisson regression model are comparatively smaller than those from the Gutenberg-Richter model.
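The link between the a and b values, the return period and the occurrence probability can be sketched as follows. This assumes the standard Gutenberg-Richter relation log10 N = a - b M, with N read as the annual number of events of magnitude at least M, and a Poisson occurrence model for the probability of at least one event in a time window. The a and b values below are hypothetical and are not the Nepal estimates from the paper:

```python
import math

def annual_rate(a, b, magnitude):
    """Annual number of events with magnitude >= `magnitude` (Gutenberg-Richter)."""
    return 10 ** (a - b * magnitude)

def return_period(a, b, magnitude):
    """Mean recurrence interval in years: the reciprocal of the annual rate."""
    return 1.0 / annual_rate(a, b, magnitude)

def prob_at_least_one(a, b, magnitude, years):
    """Poisson probability of at least one such event within `years`."""
    return 1.0 - math.exp(-annual_rate(a, b, magnitude) * years)

# Hypothetical parameters for illustration only.
a, b = 4.0, 1.0
print(return_period(a, b, 6.0))
print(prob_at_least_one(a, b, 6.0, years=50))
```

A regression-based estimate of a and b would simply replace the hypothetical values above; the return period and probability calculations stay the same.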
Funding: The National Key R&D Program of China under contract No. 2017YFE0104400; the National Natural Science Foundation of China under contract No. 31772852; the Marine S&T Fund of Shandong Province for Pilot National Laboratory for Marine Science and Technology (Qingdao) under contract No. 2018SDKJ0501-2.
Abstract: Habitat suitability index (HSI) models have been widely used to analyze the relationship between species abundance and environmental factors, and ultimately inform management of marine species. The response of species abundance to each environmental variable is different, and habitat requirements may change over life history stages and seasons. Therefore, it is necessary to determine the optimal combination of environmental variables in HSI modelling. In this study, generalized additive models (GAMs) were used to determine which environmental variables should be included in the HSI models. Significant variables were retained and weighted in the HSI model according to their relative contribution (%) to the total deviation explained by the boosted regression tree (BRT). The HSI models were applied to evaluate the habitat suitability of mantis shrimp Oratosquilla oratoria in Haizhou Bay and adjacent areas in 2011 and 2013–2017. Ontogenetic and seasonal variations in HSI models of mantis shrimp were also examined. Among the four models (non-optimized model, BRT informed HSI model, GAM informed HSI model, and both BRT and GAM informed HSI model), the both BRT and GAM informed HSI model showed the best performance. Four environmental variables (bottom temperature, depth, distance offshore and sediment type) were selected in the HSI models for four groups (spring-juvenile, spring-adult, fall-juvenile and fall-adult) of mantis shrimp. The distribution of habitat suitability showed similar patterns between juveniles and adults, but obvious seasonal variations were observed. This study suggests that the process of optimizing environmental variables in HSI models improves the performance of HSI models, and this optimization strategy could be extended to other marine organisms to enhance the understanding of the habitat suitability of target species.
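The weighting step can be illustrated with a small sketch. It assumes the HSI is a weighted arithmetic mean of per-variable suitability indices, with weights proportional to each variable's BRT relative contribution; the abstract does not state the exact aggregation formula, and all numbers below are hypothetical:

```python
def normalize_weights(contributions):
    """Rescale relative contributions (e.g. percentages) so the weights sum to 1."""
    total = sum(contributions.values())
    return {name: value / total for name, value in contributions.items()}

def weighted_hsi(suitability, contributions):
    """Weighted arithmetic mean of suitability indices (each assumed to lie in [0, 1])."""
    weights = normalize_weights(contributions)
    return sum(weights[name] * suitability[name] for name in weights)

# Hypothetical BRT relative contributions (%) for the four retained variables.
contributions = {
    "bottom_temperature": 40.0,
    "depth": 30.0,
    "distance_offshore": 20.0,
    "sediment_type": 10.0,
}
# Hypothetical suitability indices at a single station.
suitability = {
    "bottom_temperature": 0.8,
    "depth": 0.5,
    "distance_offshore": 1.0,
    "sediment_type": 0.2,
}
print(weighted_hsi(suitability, contributions))
```

A non-optimized model in this sketch would correspond to equal contributions for every variable.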
Abstract: The purpose of this article is to investigate approaches for modeling individual patient count/rate data over time, accounting for temporal correlation and non-constant dispersions while requiring reasonable amounts of time to search over alternative models for those data. This research addresses formulations for two approaches for extending generalized estimating equations (GEE) modeling. These approaches use a likelihood-like function based on the multivariate normal density. The first approach augments standard GEE equations to include equations for estimation of dispersion parameters. The second approach is based on estimating equations determined by partial derivatives of the likelihood-like function with respect to all model parameters and so extends linear mixed modeling. Three correlation structures are considered: independent, exchangeable, and spatial autoregressive of order 1 correlations. The likelihood-like function is used to formulate a likelihood-like cross-validation (LCV) score for use in evaluating models. Example analyses are presented using these two modeling approaches applied to three data sets of counts/rates over time for individual cancer patients: pain flares per day, as-needed pain medications taken per day, and around-the-clock pain medications taken per day per dose. Means and dispersions are modeled as possibly nonlinear functions of time using adaptive regression modeling methods to search through alternative models compared using LCV scores.
The results of these analyses demonstrate that extended linear mixed modeling is preferable for modeling individual patient count/rate data over time, because in the example analyses it either generates better LCV scores or more parsimonious models and requires substantially less time.
Abstract: Malaria is a major cause of morbidity and mortality in Apac district, Northern Uganda. Hence, the study aimed to model malaria incidence with respect to climate variables for the period 2007 to 2016 in Apac district. Data on monthly malaria incidence in Apac district for the period January 2007 to December 2016 were obtained from the Ministry of Health, Uganda, whereas climate data were obtained from the Uganda National Meteorological Authority. Generalized linear models, namely Poisson and negative binomial regression models, were employed to analyze the data. These models were used to fit monthly malaria incidence as a function of monthly rainfall and average temperature. The negative binomial model provided a better fit than the Poisson regression model, as indicated by the residual plots and residual deviances. The Pearson correlation test indicated a strong positive association between rainfall and malaria incidence. High malaria incidence was observed in the months of August, September and November. This study showed a significant association between monthly malaria incidence and the climate variables, namely rainfall and temperature. The study provides useful information for predicting malaria incidence and developing a future warning system, an important tool for policy makers to put effective malaria control measures in place early enough.
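One simple way to see why a negative binomial model may fit such data better than a Poisson model is to compare the sample variance with the sample mean. The sketch below computes that dispersion index and the negative binomial variance function in the common `mu + mu**2 / k` (NB2) form. The monthly counts are made-up values for illustration, not the Apac district data, and `k` is a hypothetical shape parameter:

```python
import statistics

def dispersion_index(counts):
    """Sample variance divided by sample mean; values well above 1 suggest overdispersion."""
    mean = statistics.mean(counts)
    variance = statistics.variance(counts)  # unbiased sample variance (n - 1 denominator)
    return variance / mean

def nb_variance(mu, k):
    """Negative binomial (NB2) variance for mean mu and shape k; Poisson is the limit k -> infinity."""
    return mu + mu ** 2 / k

# Hypothetical monthly case counts for one year.
monthly_cases = [12, 30, 7, 55, 21, 90, 14, 40, 63, 9, 18, 35]
print(dispersion_index(monthly_cases))
print(nb_variance(10.0, 2.0))  # 10 + 100 / 2 = 60.0
```

A dispersion index near 1 would be consistent with a Poisson model; a much larger value is the usual motivation for the negative binomial alternative.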
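Abstract: Analyzing the nonlinear and spatially non-stationary relationships between the distribution of Antarctic krill and environmental factors is important for the efficient harvesting and management of Antarctic krill. Based on fishing logbook data from the vessel "Long Teng" operating around the South Shetland Islands in 2015 and 2016, this study applied a generalized additive model (GAM) and a geographically weighted regression model (GWR) to explore the nonlinear and spatially non-stationary relationships between the distribution of Antarctic krill (Euphausia superba) fishing grounds and environmental factors, and compared the performance of the two models, in order to provide baseline data for fishing ground forecasting, stock assessment and fishery management of Antarctic krill. The GAM results showed that catch per unit effort (CPUE) was significantly negatively correlated with fishing depth in both 2015 and 2016 (P<0.01), indicating that, within the range of fishing depths, Antarctic krill aggregated at higher densities in shallower waters. CPUE was significantly positively correlated with sea surface temperature in 2015 (P<0.01) but significantly negatively correlated with it in 2016 (P<0.01), presumably because the fishing locations differed between the two years. The relationship between CPUE and distance offshore was not significant (P≥0.05). The GWR results showed that the effect of fishing depth on CPUE did not vary significantly in space (P>0.05), whereas the effects of sea surface temperature and distance offshore on CPUE varied significantly in space (P<0.01), indicating that the influence of these two factors on the distribution of Antarctic krill fishing grounds is spatially discontinuous and exhibits significant spatial non-stationarity. The GAM can be used to study the general patterns linking resource distribution to its driving factors. The GWR model, as an effective complement to global regression models, can be used to explore particular areas where the general patterns do not apply and helps to identify "hot spot" areas of resource distribution; it has broad prospects for application in future research on the distribution of marine living resources.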