In recent years, functional data have been widely used in finance, medicine, biology and other fields. Existing cluster analysis methods handle problems in finite-dimensional spaces but are difficult to apply directly to functional data. In this paper, we propose a new unsupervised clustering algorithm based on adaptive weights. Without requiring initialization parameters, we use entropy-type penalty terms and a fuzzy partition matrix to find the optimal number of clusters. At the same time, we introduce a measure based on adaptive weights to reflect the difference in information content between different clustering metrics. Simulation experiments show that the proposed algorithm achieves higher purity than some existing algorithms.
Human life would be impossible without adequate air quality. Steady advances in practically every aspect of contemporary life have degraded air quality. Everyday industrial, transportation, and household activities release dangerous contaminants into our surroundings. This study investigated two years' worth of air quality data from two Indian cities for outlier detection. Studies on air pollution have used many methodologies, typically treating the gases as a vector whose components are the gas concentration values for each observation performed. In our technique, we use curves to represent the monthly average of daily gas emissions. The approach, which is based on functional depth, was used to find outliers in the gas emissions of Delhi and Kolkata, and the outcomes were compared with those from the traditional method. In the evaluation and comparison of these models' performances, the functional approach performed well.
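The abstract does not say which functional depth was used. As a minimal sketch, assuming the modified band depth with bands formed by pairs of curves, and curves sampled on a common grid, depth values could be computed as below. The function names and the rule of flagging a fixed number of lowest-depth curves are illustrative assumptions, not the study's procedure.

```python
from itertools import combinations


def modified_band_depth(curves):
    """Modified band depth (bands built from pairs of curves).

    curves: list of equal-length lists, one list of values per curve.
    Returns one depth per curve; low depth means the curve is outlying.
    """
    n_points = len(curves[0])
    pairs = list(combinations(range(len(curves)), 2))
    depths = []
    for x in curves:
        total = 0.0
        for i, j in pairs:
            # Fraction of grid points where x lies inside the band
            # delimited by curves i and j.
            inside = sum(
                1
                for t in range(n_points)
                if min(curves[i][t], curves[j][t])
                <= x[t]
                <= max(curves[i][t], curves[j][t])
            )
            total += inside / n_points
        depths.append(total / len(pairs))
    return depths


def flag_shallowest(curves, n_flag=1):
    """Hypothetical rule: return indices of the n_flag lowest-depth curves."""
    depths = modified_band_depth(curves)
    order = sorted(range(len(curves)), key=lambda k: depths[k])
    return order[:n_flag]
```

Because the depth is rank-based, it reacts to curves whose shape differs from the bulk of the sample, which a pointwise threshold on individual concentration values can miss.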
In this paper, we consider the clustering of bivariate functional data where each random surface consists of a set of curves recorded repeatedly for each subject. The k-centres surface clustering method based on marginal functional principal component analysis is proposed for the bivariate functional data, and a novel clustering criterion is presented where both the random surface and its partial derivative function in two directions are considered. In addition, we also consider two other clustering methods: k-centres surface clustering based on product functional principal component analysis and on double functional principal component analysis. Simulation results indicate that the proposed methods perform well in terms of both the correct classification rate and the adjusted Rand index. The approaches are further illustrated through an empirical analysis of human mortality data.
We propose a methodology for testing two-sample means in high-dimensional functional data that requires no decaying pattern on the eigenvalues of the functional data. To the best of our knowledge, we are the first to consider and address such a problem. To be specific, we devise a confidence region for the mean curve difference between two samples, which directly establishes a rigorous inferential procedure based on the multiplier bootstrap. In addition, the proposed test permits the functional observations in each sample to have mutually different distributions and arbitrary correlation structures, which is regarded as the desired distribution/correlation-free property and leads to a more challenging scenario for theoretical development. Other desired properties include the allowance for highly unequal sample sizes, a data dimension growing exponentially in the sample sizes, and consistent power behavior under fairly general alternatives. The proposed test is shown to converge uniformly to the prescribed significance level, and its finite-sample performance is evaluated via a simulation study and an application to electroencephalography data.
We propose a two-sample test for the mean functions of functional data when the number of bases is much larger than the sample size. The novel test is based on U-statistics, which avoids having to estimate the covariance operator accurately in the high-dimensional situation. We further prove the asymptotic normality of our test statistic under both the null hypothesis and a local alternative hypothesis. An extensive simulation study is presented which shows that the proposed test works well in comparison with several other methods in the high-dimensional situation. An application to a data set of egg-laying trajectories of Mediterranean fruit flies demonstrates the applicability of the method.
Chlorophyll-a (Chl-a) concentration is a primary indicator for marine environmental monitoring. The spatio-temporal variations of sea surface Chl-a concentration in the Yellow Sea (YS) and the East China Sea (ECS) in 2001-2020 were investigated by reconstructing the MODIS Level 3 products with the data interpolation empirical orthogonal function (DINEOF) method. The results reconstructed by interpolating the combined MODIS daily + 8-day datasets were found to be better than those obtained by interpolating daily or 8-day data alone. Chl-a concentration in the YS and the ECS reached its maximum in spring, when blooms occurred, decreased in summer and autumn, and increased in late autumn and early winter. By performing empirical orthogonal function (EOF) decomposition of the reconstructed data fields and correlation analysis with several potential environmental factors, we found that the sea surface temperature (SST) plays a significant role in the seasonal variation of Chl-a, especially during spring and summer. The increase of SST in spring and the upper-layer nutrients mixed up during the preceding winter might favor the occurrence of spring blooms. The high SST throughout the summer would strengthen the vertical stratification and prevent nutrient supply from deep water, resulting in low surface Chl-a concentrations. The sea surface Chl-a concentration in the YS was found to have decreased significantly from 2012 to 2020, which was possibly related to the Pacific Decadal Oscillation (PDO).
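To make the EOF decomposition step concrete, here is a minimal sketch, assuming a complete (gap-free) field stored as a list of time steps over a small set of grid points. It extracts only the leading mode by power iteration on the spatial covariance matrix. It does not implement DINEOF gap filling, and the function name `leading_eof` is an illustrative assumption.

```python
import math


def leading_eof(field, n_iter=200):
    """Leading EOF of a space-time field by power iteration.

    field: list of time steps, each a list of values at the grid points.
    Returns (spatial pattern, principal-component series,
    fraction of total variance explained by the mode).
    """
    n_time, n_space = len(field), len(field[0])
    # Remove the temporal mean at each grid point to get anomalies.
    means = [sum(row[s] for row in field) / n_time for s in range(n_space)]
    anomalies = [[row[s] - means[s] for s in range(n_space)] for row in field]
    # Spatial covariance matrix of the anomalies.
    cov = [
        [sum(a[i] * a[j] for a in anomalies) / (n_time - 1) for j in range(n_space)]
        for i in range(n_space)
    ]
    # Power iteration; assumes the start vector is not orthogonal
    # to the leading eigenvector.
    v = [1.0 / math.sqrt(n_space)] * n_space
    for _ in range(n_iter):
        w = [sum(cov[i][j] * v[j] for j in range(n_space)) for i in range(n_space)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    eigenvalue = sum(
        v[i] * cov[i][j] * v[j] for i in range(n_space) for j in range(n_space)
    )
    total_variance = sum(cov[i][i] for i in range(n_space))
    # Project the anomalies onto the pattern to get the time series.
    pcs = [sum(a[s] * v[s] for s in range(n_space)) for a in anomalies]
    return v, pcs, eigenvalue / total_variance
```

The principal-component series returned here is the kind of time series that could then be correlated with an environmental factor such as SST.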
We consider semiparametric partially linear regression models with mean function X^Tβ + g(z), where X and z are functional data. New estimators of β and g(z) are presented and some asymptotic results are given. The strong convergence rates of the proposed estimators are obtained. In our estimation, the number of observations for each subject is completely flexible. A simulation study is conducted to investigate the finite-sample performance of the proposed estimators.
We propose a new functional single-index model, called the dynamic single-index model for functional data, or DSIM, to efficiently model non-linear and dynamic relationships between a functional predictor and a functional response. The proposed model naturally allows for some curvature not captured by the ordinary functional linear model. Using the proposed two-step estimating algorithm, we develop estimates for both the link function and the regression coefficient function, and then provide predictions of new response trajectories. Besides the asymptotic properties of the estimates of the unknown functions, we also establish the consistency of the predictions of new response trajectories under mild conditions. Finally, we show through extensive simulation studies and a real data example that the proposed DSIM can substantially outperform existing functional regression methods in most settings.
This paper deals with the conditional density estimator of a real response variable given a functional random variable (i.e., one taking values in an infinite-dimensional space). Specifically, we focus on the functional index model, an approach that represents a good compromise between nonparametric and parametric models. Under general conditions and when the variables are independent, we then give the quadratic error and asymptotic normality of the estimator obtained by the local linear method, based on the single-index structure. Finally, we complete these theoretical advances with simulation studies showing both the practical results of the local linear method and the good finite-sample behaviour of the estimator and of the Monte Carlo methods used to create functional pseudo-confidence areas.
Background: The accurate estimation of temporal patterns of influenza may help in utilizing hospital resources and guiding influenza surveillance. This paper proposes functional data analysis (FDA) to improve the prediction of temporal patterns of influenza. Methods: We illustrate FDA methods using the weekly Influenza-like Illness (ILI) activity level data from the U.S. We propose to use the Fourier basis function for transforming discrete weekly data to smoothed functional ILI activities. Functional analysis of variance (FANOVA) is used to examine the regional differences in temporal patterns and the impact of states' political orientation. Results: The ILI activity has a very distinct peak at the beginning and end of the year. There are significant differences in the average level of ILI activities among geographic regions. However, the temporal patterns in terms of the peak and flat time are quite consistent across regions. The geographic and temporal patterns of ILI activities also depend on the political make-up of states. The states affiliated with Republicans had higher ILI activities than those affiliated with Democrats across the whole year. The influence of political party affiliation on the temporal pattern is quite different among geographic regions. Conclusions: Functional data analysis can help us to reveal the temporal variability in average ILI levels, the rate of change in ILI levels, and the effect of geographical regions. Consideration should be given to wider application of FDA to generate more accurate estimates in public health and biomedical research.
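The Fourier-basis smoothing step can be sketched as follows, under the assumption of one full year of equally spaced weekly values with no gaps. In that setting the least-squares coefficients reduce to discrete Fourier sums. The number of harmonics and the function names are illustrative choices, not values taken from the paper.

```python
import math


def fourier_coefficients(y, n_harmonics):
    """Least-squares Fourier coefficients for one period of equally spaced data.

    y: observations at t = 0, 1, ..., n - 1 covering exactly one period.
    Returns (a0, [(a_1, b_1), ..., (a_K, b_K)]).
    """
    n = len(y)
    if 2 * n_harmonics >= n:
        raise ValueError("n_harmonics must be smaller than len(y) / 2")
    a0 = sum(y) / n
    coefs = []
    for k in range(1, n_harmonics + 1):
        # With equally spaced points the basis functions are orthogonal,
        # so each coefficient is a simple projection.
        a_k = 2.0 / n * sum(v * math.cos(2 * math.pi * k * t / n) for t, v in enumerate(y))
        b_k = 2.0 / n * sum(v * math.sin(2 * math.pi * k * t / n) for t, v in enumerate(y))
        coefs.append((a_k, b_k))
    return a0, coefs


def evaluate(a0, coefs, t, period):
    """Evaluate the smoothed curve at any (possibly fractional) time t."""
    value = a0
    for k, (a_k, b_k) in enumerate(coefs, start=1):
        angle = 2 * math.pi * k * t / period
        value += a_k * math.cos(angle) + b_k * math.sin(angle)
    return value
```

A Fourier basis is a natural fit for data that are periodic over the year. Keeping only a few harmonics smooths out week-to-week noise while preserving the seasonal peak.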
The classification of functional data has drawn much attention in recent years. The main challenge is representing infinite-dimensional functional data by finite-dimensional features while utilizing those features to achieve better classification accuracy. In this paper, we propose a mean-variance-based (MV) feature weighting method for classifying functional data or functional curves. In the feature extraction stage, each sample curve is approximated by B-splines to transfer features to the coefficients of the spline basis. After that, a feature weighting approach based on statistical principles is introduced by comprehensively considering the between-class differences and within-class variations of the coefficients. We also introduce a scaling parameter to adjust the gap between the weights of features. The new feature weighting approach can adaptively enhance noteworthy local features while mitigating the impact of confusing features. Algorithms for feature-weighted K-nearest neighbor and support vector machine classifiers are both provided. Moreover, the new approach can be well integrated into existing functional data classifiers, such as the generalized functional linear model and functional linear discriminant analysis, resulting in more accurate classification. The performance of the mean-variance-based classifiers is evaluated by simulation studies and real data. The results show that the new feature weighting approach significantly improves the classification accuracy for complex functional data.
It is well known that the nonparametric estimation of the regression function is highly sensitive to the presence of even a small proportion of outliers in the data. To solve the problem of atypical observations when the covariates of the nonparametric component are functional, robust estimates for the regression parameter and regression operator are introduced. The main purpose of the paper is to consider data-driven methods of selecting the number of neighbors in order to make the proposed processes fully automatic. We use the k Nearest Neighbors procedure (kNN) to construct the kernel estimator of the proposed robust model. Under some regularity conditions, we state consistency results for kNN functional estimators, which are uniform in the number of neighbors (UINN). Furthermore, a simulation study and an empirical application to a real data analysis of octane gasoline predictions are carried out to illustrate the higher predictive performance and the usefulness of the kNN approach.
Fuzzy clustering theory is widely used in data mining for full-face tunnel boring machines (TBMs). However, the traditional fuzzy clustering algorithm based on an objective function has difficulty clustering functional data effectively. We propose a new fuzzy clustering algorithm, namely the FCM-ANN algorithm. The algorithm replaces the clustering prototype of the FCM algorithm with the predicted value of an artificial neural network. As a result, the algorithm not only satisfies clustering based on the traditional similarity criterion but can also cluster functional data effectively. In this paper, we first use the t-test as an evaluation index and apply the FCM-ANN algorithm to synthetic datasets for validity testing. Then the algorithm is applied to TBM operation data and combined with cross-validation to predict the tunneling speed. The predicted results are evaluated by RMSE and R^2. From the experimental results on the synthetic datasets, we obtain the relationship among the membership threshold, the number of samples, the number of attributes and the noise. Accordingly, the datasets can be effectively adjusted. Applying the FCM-ANN algorithm to the TBM operation data can accurately predict the tunneling speed. The FCM-ANN algorithm improves on the traditional fuzzy clustering algorithm and can be used not only for predicting the tunneling speed of TBMs but also for clustering or prediction of other functional data.
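For reference, the baseline that FCM-ANN modifies is standard fuzzy c-means (FCM), which alternates a membership update and a prototype update. The sketch below implements plain FCM only. The step in which prototypes are replaced by neural-network predictions is not reproduced, and the function names and default fuzzifier m = 2 are illustrative assumptions.

```python
def _sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def memberships_for(point, centers, m):
    """FCM membership of one point in every cluster (values sum to 1)."""
    d2 = [_sq_dist(point, c) for c in centers]
    zeros = [i for i, d in enumerate(d2) if d == 0.0]
    if zeros:
        # The point coincides with one or more prototypes.
        return [1.0 / len(zeros) if i in zeros else 0.0 for i in range(len(centers))]
    power = 1.0 / (m - 1.0)
    inv = [(1.0 / d) ** power for d in d2]
    total = sum(inv)
    return [v / total for v in inv]


def update_centers(points, memberships, m):
    """Prototype update: mean of the points weighted by membership ** m."""
    dim = len(points[0])
    centers = []
    for i in range(len(memberships[0])):
        weights = [row[i] ** m for row in memberships]
        total = sum(weights)
        centers.append(
            [sum(w * p[d] for w, p in zip(weights, points)) / total for d in range(dim)]
        )
    return centers


def fcm(points, centers, m=2.0, n_iter=50):
    """Run a fixed number of FCM iterations from the given initial centers."""
    for _ in range(n_iter):
        memberships = [memberships_for(p, centers, m) for p in points]
        centers = update_centers(points, memberships, m)
    memberships = [memberships_for(p, centers, m) for p in points]
    return centers, memberships
```

In this plain version the prototype is a weighted mean in the same space as the data. The abstract's modification swaps that prototype for a model prediction, which is what lets the similarity criterion apply to functional data.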
The study of estimation of conditional extreme quantiles in incomplete data frameworks is of growing interest. In particular, the estimation of the extreme value index in a censorship framework has been the purpose of many investigations when finite-dimensional covariate information has been considered. In this paper, the estimation of the conditional extreme quantile of a heavy-tailed distribution is discussed when some functional random covariate (i.e., valued in some infinite-dimensional space) information is available and the scalar response variable is right-censored. A Weissman-type estimator of conditional extreme quantiles is proposed and its asymptotic normality is established under mild assumptions. A simulation study is conducted to assess the finite-sample behavior of the proposed estimator and a comparison with two simple estimation strategies is provided.
For Hermite-Birkhoff interpolation of scattered multidimensional data by a radial basis function φ, existence and characterization theorems and a variational principle are proved. Examples include φ(r)=r^b, Duchon's thin-plate splines, Hardy's multiquadrics, and inverse multiquadrics.
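As a minimal sketch of the simplest special case, the code below interpolates function values only (no derivative data) at scattered points with an inverse multiquadric. That kernel is positive definite, so the linear system is solvable for distinct points. The shape parameter `c`, the function names, and the plain Gaussian-elimination solver are illustrative assumptions. The general Hermite-Birkhoff setting of the paper, which also matches derivative information, is not covered.

```python
import math


def inverse_multiquadric(r, c=1.0):
    """Inverse multiquadric radial function with shape parameter c."""
    return 1.0 / math.sqrt(r * r + c * c)


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for k in range(col, n + 1):
                M[r][k] -= factor * M[col][k]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def fit_rbf(points, values, phi=inverse_multiquadric):
    """Weights w such that sum_j w_j * phi(|x_i - x_j|) equals values[i]."""
    A = [[phi(math.dist(p, q)) for q in points] for p in points]
    return solve(A, values)


def eval_rbf(points, weights, x, phi=inverse_multiquadric):
    """Evaluate the interpolant at a new location x."""
    return sum(w * phi(math.dist(x, p)) for w, p in zip(weights, points))
```

For conditionally positive definite kernels such as thin-plate splines, the system would normally be augmented with a low-degree polynomial term, which this sketch omits.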
This paper studies the problem of robust H∞ control of piecewise-linear chaotic systems with random data loss. The communication links between the plant and the controller are assumed to be imperfect (that is, data loss occurs intermittently, as is typical in a network environment). The data loss is modelled as a random process that obeys a Bernoulli distribution. In the face of random data loss, a piecewise controller is designed to robustly stabilize the networked system in the mean-square sense and to achieve a prescribed H∞ disturbance attenuation performance based on a piecewise-quadratic Lyapunov function. The required H∞ controllers can be designed by solving a set of linear matrix inequalities (LMIs). Chua's system is provided to illustrate the usefulness and applicability of the developed theoretical results.
A variety of factors affect air quality, making it a difficult issue. Air quality refers to the level of clean air in a certain area. It is challenging for conventional approaches to correctly discover aberrant values or outliers because of the significant fluctuation of this sort of data, which is influenced by climate change and the environment. With accelerating industrial expansion and rising population density in Kolkata City, air pollution is continuously rising. This study involves two phases: first, imputation of missing values; second, detection of outliers using Statistical Process Control (SPC) and Functional Data Analysis (FDA). The efficacy of the proposed outlier identification methodology is studied for working days and non-working days using the variables NO₂, SO₂, and O₃, recorded over a full year in Kolkata, India. The results show how the functional data approach outshines traditional outlier detection methods. The outcomes show that functional data analysis vibrates more than the other two approaches after imputation, and the suggested outlier detector is entirely appropriate for the precise detection of outliers in highly variable data.
This paper is focused on the goodness-of-fit test of the functional linear composite quantile regression model. A nonparametric test is proposed by using the orthogonality of the residual and its conditional expectation under the null model. The proposed test statistic has an asymptotic standard normal distribution under the null hypothesis, and tends to infinity in probability under the alternative hypothesis, which implies the consistency of the test. Furthermore, it is proved that the test statistic converges to a normal distribution with nonzero mean under a local alternative hypothesis. Extensive simulations are reported, and the results show that the proposed test has proper sizes and is sensitive to the considered model discrepancies. The proposed methods are also applied to two real datasets.
To better describe and understand the time dynamics in functional data analysis, it is often desirable to recover the partial derivatives of the random surface. A novel approach is proposed based on marginal functional principal component analysis to derive the representation for partial derivatives. To obtain the Karhunen-Loève expansion of the partial derivatives, an adaptive estimation is explored. Asymptotic results for the proposed estimates are established. Simulation studies show that the proposed methods perform well in finite samples. Application to human mortality data reveals informative time dynamics in mortality rates.
Motivated by a medical study that attempts to analyze the relationship between growth curves and other variables and to measure the association among multiple growth curves, the authors develop a functional multiple-outcome model to decompose the total variation of multiple functional outcomes into variation explained by independent variables with time-varying coefficient functions, by latent factors and by noise. The latent factors are the hidden common factors that influence the multiple outcomes and are found through the combined functional principal component analysis approach. Through the coefficients of the latent factors one may further explore the association of the multiple outcomes. This method is applied to the multivariate growth data of infants in a real medical study in Shanghai and produces interpretable results. Convergence rates for the proposed estimates of the varying coefficient and covariance functions of the model are derived under mild conditions.
Funding (k-centres surface clustering of bivariate functional data): Supported by the National Natural Science Foundation of China (Grant No. 12261007) and the Natural Science Foundation of Guangxi Province (Grant No. 2020GXNSFAA297225).
Funding (two-sample mean test for high-dimensional functional data based on the multiplier bootstrap): Supported by the National Natural Science Foundation of China (Grant No. 11901313), the Fundamental Research Funds for the Central Universities, the Key Laboratory for Medical Data Analysis and Statistical Research of Tianjin, and the Key Laboratory of Pure Mathematics and Combinatorics.
Funding (U-statistic two-sample test for mean functions): Supported by the National Natural Science Foundation of China (Grant Nos. 11671268 and 12271370), the Guangdong Basic and Applied Basic Research Foundation (Grant No. 2020A1515010821), the Fundamental Research Funds for the Central Universities (Grant No. 12619624), and the Research Start-up Fund for New Young Teachers of Capital University of Economics and Business (Grant No. 00592254417068).
Funding (Chl-a variations in the Yellow Sea and the East China Sea): Supported by the Fundamental Research Funds for the Central Universities (Nos. 202341017, 202313024).
Funding (dynamic single-index model for functional data, DSIM): Supported by the National Natural Science Foundation of China (Grant No. 11271080).
文摘We propose a new functional single index model, which called dynamic single-index model for functional data, or DSIM, to efficiently perform non-linear and dynamic relationships between functional predictor and functional response. The proposed model naturally allows for some curvature not captured by the ordinary functional linear model. By using the proposed two-step estimating algorithm, we develop the estimates for both the link function and the regression coefficient function, and then provide predictions of new response trajectories. Besides the asymptotic properties for the estimates of the unknown functions, we also establish the consistency of the predictions of new response trajectories under mild conditions. Finally, we show through extensive simulation studies and a real data example that the proposed DSIM can highly outperform existed functional regression methods in most settings.
Abstract: This paper deals with the conditional density estimator of a real response variable given a functional random variable (i.e., one taking values in an infinite-dimensional space). Specifically, we focus on the functional index model, an approach that represents a good compromise between nonparametric and parametric models. Under general conditions and for independent variables, we then give the quadratic error and asymptotic normality of the estimator obtained by the local linear method, based on the single-index structure. Finally, we complete these theoretical advances with some simulation studies showing both the practical result of the local linear method and the good behaviour for finite sample sizes of the estimator and of the Monte Carlo methods used to create a functional pseudo-confidence area.
Funding: The authors acknowledge the Canadian Institute for Health Research (CIHR), the Children's Hospital Research Institute of Manitoba (CHRIM) Foundation, and the Visual and Automated Disease Analytics (VADA) graduate training program of the Natural Sciences and Engineering Research Council of Canada (NSERC) for providing the funding opportunities to conduct this research.
Abstract: Background: The accurate estimation of temporal patterns of influenza may help in utilizing hospital resources and guiding influenza surveillance. This paper proposes functional data analysis (FDA) to improve the prediction of temporal patterns of influenza. Methods: We illustrate FDA methods using the weekly Influenza-like Illness (ILI) activity level data from the U.S. We propose to use the Fourier basis function for transforming discrete weekly data to the smoothed functional ILI activities. Functional analysis of variance (FANOVA) is used to examine the regional differences in temporal patterns and the impact of states' political orientation. Results: The ILI activity has a very distinct peak at the beginning and end of the year. There are significant differences in the average level of ILI activities among geographic regions. However, the temporal patterns in terms of the peak and flat time are quite consistent across regions. The geographic and temporal patterns of ILI activities also depend on the political make-up of states. The states affiliated with Republicans had higher ILI activities than those affiliated with Democrats across the whole year. The influence of political party affiliation on the temporal pattern is quite different among geographic regions. Conclusions: Functional data analysis can help us to reveal the temporal variability in average ILI levels, the rate of change in ILI levels, and the effect of geographical regions. Consideration should be given to wider application of FDA to generate more accurate estimates in public health and biomedical research.
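The Fourier-basis smoothing step mentioned in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes equally spaced weekly values covering exactly one period, in which case the least-squares Fourier coefficients reduce to simple projections. The function name `fourier_smooth`, the parameter `n_harmonics`, and the synthetic `weekly` series are hypothetical and stand in for real ILI data.

```python
import math

def fourier_smooth(y, n_harmonics):
    """Fit a truncated Fourier series to equally spaced data spanning one period.

    For equally spaced points and n_harmonics < len(y) / 2 the discrete
    sine/cosine basis is orthogonal, so these projections are the
    least-squares coefficients. Returns a function of continuous time t
    (in units of the sampling interval, e.g. weeks).
    """
    n = len(y)
    if not 0 <= n_harmonics < n / 2:
        raise ValueError("n_harmonics must satisfy 0 <= n_harmonics < len(y) / 2")
    a0 = sum(y) / n  # constant term (mean level)
    coefs = []
    for k in range(1, n_harmonics + 1):
        w = 2.0 * math.pi * k / n  # angular frequency of the k-th harmonic
        a_k = 2.0 / n * sum(v * math.cos(w * i) for i, v in enumerate(y))
        b_k = 2.0 / n * sum(v * math.sin(w * i) for i, v in enumerate(y))
        coefs.append((w, a_k, b_k))

    def curve(t):
        return a0 + sum(a * math.cos(w * t) + b * math.sin(w * t)
                        for w, a, b in coefs)

    return curve

# Synthetic stand-in for 52 weekly activity levels (assumption, not real data).
weekly = [3.0 + 2.0 * math.cos(2.0 * math.pi * i / 52) for i in range(52)]
ili_curve = fourier_smooth(weekly, n_harmonics=3)
```

The returned `ili_curve` can be evaluated at any real-valued week, which is what makes the discrete series usable as a functional observation in later steps such as FANOVA.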
Funding: The National Social Science Foundation of China (Grant No. 22BTJ035).
Abstract: The classification of functional data has drawn much attention in recent years. The main challenge is representing infinite-dimensional functional data by finite-dimensional features while utilizing those features to achieve better classification accuracy. In this paper, we propose a mean-variance-based (MV) feature weighting method for classifying functional data or functional curves. In the feature extraction stage, each sample curve is approximated by B-splines to transfer features to the coefficients of the spline basis. After that, a feature weighting approach based on statistical principles is introduced by comprehensively considering the between-class differences and within-class variations of the coefficients. We also introduce a scaling parameter to adjust the gap between the weights of features. The new feature weighting approach can adaptively enhance noteworthy local features while mitigating the impact of confusing features. The algorithms for feature-weighted K-nearest neighbor and support vector machine classifiers are both provided. Moreover, the new approach can be well integrated into existing functional data classifiers, such as the generalized functional linear model and functional linear discriminant analysis, resulting in a more accurate classification. The performance of the mean-variance-based classifiers is evaluated by simulation studies and real data. The results show that the new feature weighting approach significantly improves the classification accuracy for complex functional data.
Abstract: It is well known that the nonparametric estimation of the regression function is highly sensitive to the presence of even a small proportion of outliers in the data. To solve the problem of atypical observations when the covariates of the nonparametric component are functional, robust estimates for the regression parameter and regression operator are introduced. The main purpose of the paper is to consider data-driven methods of selecting the number of neighbors in order to make the proposed processes fully automatic. We use the k-Nearest Neighbors (kNN) procedure to construct the kernel estimator of the proposed robust model. Under some regularity conditions, we state consistency results for kNN functional estimators, which are uniform in the number of neighbors (UINN). Furthermore, a simulation study and an empirical application to a real data analysis of octane gasoline predictions are carried out to illustrate the higher predictive performance and the usefulness of the kNN approach.
Funding: Supported by the National Key R&D Program of China (Grant Nos. 2018YFB1700704 and 2018YFB1702502) and by the Study on the Key Management and Privacy Preservation in VANET, the Innovation Foundation of Science and Technology of Dalian (2018J12GX045).
Abstract: Fuzzy clustering theory is widely used in data mining for full-face tunnel boring machines (TBM). However, the traditional fuzzy clustering algorithm based on an objective function has difficulty clustering functional data effectively. We propose a new fuzzy clustering algorithm, namely the FCM-ANN algorithm. The algorithm replaces the clustering prototype of the FCM algorithm with the predicted value of an artificial neural network. As a result, the algorithm not only satisfies clustering based on the traditional similarity criterion but can also effectively cluster functional data. In this paper, we first use the t-test as an evaluation index and apply the FCM-ANN algorithm to synthetic datasets for validity testing. Then the algorithm is applied to TBM operation data and combined with the cross-validation method to predict the tunneling speed. The predicted results are evaluated by RMSE and R^(2). According to the experimental results on the synthetic datasets, we obtain the relationship among the membership threshold, the number of samples, the number of attributes and the noise. Accordingly, the datasets can be effectively adjusted. Applying the FCM-ANN algorithm to the TBM operation data can accurately predict the tunneling speed. The FCM-ANN algorithm improves on the traditional fuzzy clustering algorithm and can be used not only for predicting the tunneling speed of a TBM but also for clustering or prediction of other functional data.
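For orientation, the baseline that this abstract modifies, the standard fuzzy c-means (FCM) iteration, can be sketched as below. This is the textbook algorithm with weighted-mean prototypes, not the FCM-ANN variant, whose neural-network prototype is not specified in the abstract. The function names, the fuzzifier default `m=2.0`, and the toy data are assumptions made for illustration.

```python
def fcm_memberships(points, centers, m=2.0):
    """Standard FCM membership update: u_ij = 1 / sum_k (d_ij / d_ik)^(2/(m-1))."""
    memberships = []
    for x in points:
        # Euclidean distances to every prototype, clamped to avoid division by zero.
        d = [max(sum((a - b) ** 2 for a, b in zip(x, c)) ** 0.5, 1e-12)
             for c in centers]
        row = [1.0 / sum((d[j] / d[k]) ** (2.0 / (m - 1.0)) for k in range(len(d)))
               for j in range(len(d))]
        memberships.append(row)
    return memberships

def fcm(points, centers, m=2.0, n_iter=50):
    """Alternate membership and prototype updates for a fixed number of iterations."""
    dims = range(len(points[0]))
    for _ in range(n_iter):
        u = fcm_memberships(points, centers, m)
        # Prototype update: mean of the points weighted by u_ij^m.
        centers = [
            tuple(sum(u[i][j] ** m * points[i][dim] for i in range(len(points)))
                  / sum(u[i][j] ** m for i in range(len(points)))
                  for dim in dims)
            for j in range(len(centers))
        ]
    return fcm_memberships(points, centers, m), centers

# Toy one-dimensional data with two well-separated groups (assumption).
toy = [(0.0,), (0.1,), (0.2,), (5.0,), (5.1,), (5.2,)]
u, prototypes = fcm(toy, centers=[(1.0,), (4.0,)])
```

In the variant described in the abstract, the weighted-mean prototype step is the part that would be replaced by a neural-network prediction; the membership update is the part shared with the traditional algorithm.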
Abstract: The estimation of conditional extreme quantiles in incomplete data frameworks is of growing interest. In particular, the estimation of the extreme value index in a censorship framework has been the subject of many investigations when finite-dimensional covariate information is considered. In this paper, the estimation of the conditional extreme quantile of a heavy-tailed distribution is discussed when some functional random covariate (i.e., valued in some infinite-dimensional space) information is available and the scalar response variable is right-censored. A Weissman-type estimator of conditional extreme quantiles is proposed and its asymptotic normality is established under mild assumptions. A simulation study is conducted to assess the finite-sample behavior of the proposed estimator and a comparison with two simple estimation strategies is provided.
Abstract: For Hermite-Birkhoff interpolation of scattered multidimensional data by radial basis function (?), existence and characterization theorems and a variational principle are proved. Examples include (?)(r)=r^b, Duchon's thin-plate splines, Hardy's multiquadrics, and inverse multiquadrics.
Funding: Project partially supported by the Young Scientists Fund of the National Natural Science Foundation of China (Grant No. 60904004) and the Key Youth Science and Technology Foundation of University of Electronic Science and Technology of China (Grant No. L08010201JX0720).
Abstract: This paper studies the problem of robust H∞ control of piecewise-linear chaotic systems with random data loss. The communication links between the plant and the controller are assumed to be imperfect (that is, data loss occurs intermittently, as is typical in a network environment). The data loss is modelled as a random process which obeys a Bernoulli distribution. In the face of random data loss, a piecewise controller is designed to robustly stabilize the networked system in the mean-square sense and also achieve a prescribed H∞ disturbance attenuation performance based on a piecewise-quadratic Lyapunov function. The required H∞ controllers can be designed by solving a set of linear matrix inequalities (LMIs). Chua's system is provided to illustrate the usefulness and applicability of the developed theoretical results.
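The Bernoulli data-loss model and the notion of mean-square stability can be illustrated on a toy problem. The sketch below is not the paper's piecewise-linear design or its LMI conditions: it assumes a scalar plant x_{k+1} = a x_k + b u_k with state feedback u_k = gain · x_k that reaches the plant only when an i.i.d. Bernoulli packet arrives. For that toy case E[x_{k+1}^2] = (p (a + b·gain)^2 + (1 − p) a^2) E[x_k^2], so the loop is mean-square stable exactly when that factor is below 1. All names and numbers are hypothetical.

```python
import random

def mean_square_factor(a, b, gain, p_arrive):
    """Per-step growth factor of E[x_k^2] for the scalar loop with Bernoulli arrivals."""
    return p_arrive * (a + b * gain) ** 2 + (1.0 - p_arrive) * a ** 2

def simulate(a, b, gain, p_arrive, x0=1.0, steps=200, seed=0):
    """Simulate one sample path; the control input is dropped when a packet is lost."""
    rng = random.Random(seed)
    x = x0
    for _ in range(steps):
        received = rng.random() < p_arrive  # Bernoulli(p_arrive) arrival indicator
        u = gain * x if received else 0.0   # lost packet -> zero input (assumption)
        x = a * x + b * u
    return x

# Unstable open loop (a = 1.2) with a stabilizing gain when packets arrive.
factor_ok = mean_square_factor(1.2, 1.0, -1.0, p_arrive=0.5)   # below 1
factor_bad = mean_square_factor(1.2, 1.0, -1.0, p_arrive=0.2)  # above 1
x_final = simulate(1.2, 1.0, -1.0, p_arrive=0.5)
```

The same gain that stabilizes the loop at a 50% arrival rate fails at 20%, which is why designs of this kind must account for the loss probability explicitly.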
Abstract: A variety of factors affect air quality, making it a difficult issue. The level of clean air in a certain area is referred to as air quality. It is challenging for conventional approaches to correctly discover aberrant values or outliers due to the significant fluctuation of this sort of data, which is influenced by climate change and the environment. With accelerating industrial expansion and rising population density in Kolkata City, air pollution is continuously rising. This study involves two phases: first, imputation of missing values; second, detection of outliers using Statistical Process Control (SPC) and Functional Data Analysis (FDA). The efficacy of the proposed outlier identification methodology is assessed on working-day and non-working-day values of the variables NO₂, SO₂, and O₃, recorded over one continuous year in Kolkata, India. The results show how the functional data approach outperforms traditional outlier detection methods. The outcomes show that functional data analysis vibrates more than the other two approaches after imputation, and the suggested outlier detector is appropriate for the precise detection of outliers in highly variable data.
Funding: Supported by the Natural Science Foundation of China under Grant Nos. 11271014 and 11971045.
Abstract: This paper focuses on the goodness-of-fit test of the functional linear composite quantile regression model. A nonparametric test is proposed by using the orthogonality of the residual and its conditional expectation under the null model. The proposed test statistic has an asymptotic standard normal distribution under the null hypothesis, and tends to infinity in probability under the alternative hypothesis, which implies the consistency of the test. Furthermore, it is proved that the test statistic converges to a normal distribution with nonzero mean under a local alternative hypothesis. Extensive simulations are reported, and the results show that the proposed test has proper sizes and is sensitive to the considered model discrepancies. The proposed methods are also applied to two real datasets.
Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 11861014, 11561006 and 11971404), the Natural Science Foundation of Guangxi Province (Grant No. 2018GXNSFAA281145), the Humanity and Social Science Youth Foundation of the Ministry of Education of China (Grant No. 19YJC910010), and the Intramural Research Program of the Eunice Kennedy Shriver National Institute of Child Health and Human Development, National Institutes of Health, USA.
Abstract: To better describe and understand the time dynamics in functional data analysis, it is often desirable to recover the partial derivatives of the random surface. A novel approach is proposed based on marginal functional principal component analysis to derive the representation for partial derivatives. To obtain the Karhunen-Loève expansion of the partial derivatives, an adaptive estimation is explored. Asymptotic results of the proposed estimates are established. Simulation studies show that the proposed methods perform well in finite samples. Application to the human mortality data reveals informative time dynamics in mortality rates.
Funding: Supported by the National Natural Science Foundation of China under Grant Nos. 11771146, 11831008, 81530086, 11771145 and 11871252, the 111 Project (B14019), and the Program of Shanghai Subject Chief Scientist under Grant No. 14XD1401600.
Abstract: Motivated by a medical study that attempts to analyze the relationship between growth curves and other variables and to measure the association among multiple growth curves, the authors develop a functional multiple-outcome model to decompose the total variation of multiple functional outcomes into variation explained by independent variables with time-varying coefficient functions, by latent factors and by noise. The latent factors are the hidden common factors that influence the multiple outcomes and are found through the combined functional principal component analysis approach. Through the coefficients of the latent factors one may further explore the association of the multiple outcomes. This method is applied to the multivariate growth data of infants in a real medical study in Shanghai and produces interpretable results. Convergence rates for the proposed estimates of the varying coefficient and covariance functions of the model are derived under mild conditions.