Abstract: Human life would be impossible without clean air. Steady advances in nearly every aspect of contemporary life have degraded air quality: everyday industrial, transportation, and household activities release dangerous contaminants into our surroundings. This study investigated two years of air quality data from two Indian cities for outlier detection. Studies on air pollution have used many kinds of methodologies, typically treating the gases as a vector whose components are the gas concentration values for each observation performed. In our technique, we use curves to represent the monthly average of daily gas emissions. The approach, which is based on functional depth, was used to find outliers in the gas emissions of Delhi and Kolkata, and the outcomes were compared with those from the traditional method. In the evaluation and comparison of these models' performances, the functional approach performed well.
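As a rough illustration of depth-based outlier detection on curves, the sketch below computes a modified band depth with bands formed by pairs of curves and treats the lowest-depth curves as outlier candidates. The abstract does not say which depth the authors used, so the choice of modified band depth, the function names, and the quantile cutoff are all assumptions made for illustration.

```python
import numpy as np
from itertools import combinations


def modified_band_depth(curves):
    """Modified band depth (bands from pairs of curves) for an (n, T) array.

    For every pair of curves, measure the fraction of grid points at which
    each curve lies inside the band spanned by that pair, then average
    over all pairs. Deep (central) curves score high, outlying curves low.
    """
    n = curves.shape[0]
    depth = np.zeros(n)
    pairs = list(combinations(range(n), 2))
    for j, k in pairs:
        lower = np.minimum(curves[j], curves[k])
        upper = np.maximum(curves[j], curves[k])
        inside = (curves >= lower) & (curves <= upper)  # shape (n, T)
        depth += inside.mean(axis=1)
    return depth / len(pairs)


def flag_low_depth(curves, quantile=0.1):
    """Flag curves whose depth falls at or below a chosen quantile.

    The cutoff rule is a hypothetical choice, not the one from the paper.
    """
    depth = modified_band_depth(curves)
    return depth, depth <= np.quantile(depth, quantile)
```

A curve that sits above or below all others at every grid point is only ever inside bands that it helps define, so it receives the minimum possible depth. This is what makes depth useful for magnitude outliers among monthly curves.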
Funding: The authors acknowledge the Canadian Institute for Health Research (CIHR), the Children's Hospital Research Institute of Manitoba (CHRIM) Foundation, and the Visual and Automated Disease Analytics (VADA) graduate training program of the Natural Sciences and Engineering Research Council of Canada (NSERC) for providing the funding opportunities to conduct this research.
Abstract: Background: The accurate estimation of temporal patterns of influenza may help in utilizing hospital resources and guiding influenza surveillance. This paper proposes functional data analysis (FDA) to improve the prediction of temporal patterns of influenza. Methods: We illustrate FDA methods using the weekly Influenza-like Illness (ILI) activity level data from the U.S. We propose to use the Fourier basis function for transforming discrete weekly data to smoothed functional ILI activities. Functional analysis of variance (FANOVA) is used to examine the regional differences in temporal patterns and the impact of states' political orientation. Results: The ILI activity has a very distinct peak at the beginning and end of the year. There are significant differences in the average level of ILI activity among geographic regions. However, the temporal patterns in terms of the peak and flat time are quite consistent across regions. The geographic and temporal patterns of ILI activities also depend on the political make-up of states. The states affiliated with Republicans had higher ILI activities than those affiliated with Democrats across the whole year. The influence of political party affiliation on temporal pattern is quite different among geographic regions. Conclusions: Functional data analysis can help us to reveal the temporal variability in average ILI levels, the rate of change in ILI levels, and the effect of geographical regions. Consideration should be given to wider application of FDA to generate more accurate estimates in public health and biomedical research.
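The Fourier-basis smoothing step described in the Methods can be sketched as an ordinary least-squares fit of weekly observations onto a few sine and cosine terms. This is a minimal sketch only: the 52-week period, the number of harmonics, and the function names are assumptions, and the paper may use a penalized fit instead.

```python
import numpy as np


def fourier_design(t, period, n_harmonics):
    """Design matrix with columns [1, sin(w1 t), cos(w1 t), sin(w2 t), ...]."""
    t = np.asarray(t, dtype=float)
    cols = [np.ones_like(t)]
    for k in range(1, n_harmonics + 1):
        w = 2.0 * np.pi * k * t / period
        cols.append(np.sin(w))
        cols.append(np.cos(w))
    return np.column_stack(cols)


def smooth_fourier(t, y, period=52.0, n_harmonics=3):
    """Least-squares Fourier fit; returns coefficients and smoothed values."""
    basis = fourier_design(t, period, n_harmonics)
    coef, *_ = np.linalg.lstsq(basis, y, rcond=None)
    return coef, basis @ coef
```

Fewer harmonics give a smoother curve. Because the fitted function is a finite trigonometric sum, its derivative (the rate of change in ILI level) is available in closed form from the same coefficients.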
Funding: The National Social Science Foundation of China (Grant No. 22BTJ035).
Abstract: The classification of functional data has drawn much attention in recent years. The main challenge is representing infinite-dimensional functional data by finite-dimensional features while utilizing those features to achieve better classification accuracy. In this paper, we propose a mean-variance-based (MV) feature weighting method for classifying functional data or functional curves. In the feature extraction stage, each sample curve is approximated by B-splines to transfer features to the coefficients of the spline basis. After that, a feature weighting approach based on statistical principles is introduced by comprehensively considering the between-class differences and within-class variations of the coefficients. We also introduce a scaling parameter to adjust the gap between the weights of features. The new feature weighting approach can adaptively enhance noteworthy local features while mitigating the impact of confusing features. Algorithms for feature-weighted K-nearest neighbor and support vector machine classifiers are both provided. Moreover, the new approach can be well integrated into existing functional data classifiers, such as the generalized functional linear model and functional linear discriminant analysis, resulting in more accurate classification. The performance of the mean-variance-based classifiers is evaluated by simulation studies and real data. The results show that the new feature weighting approach significantly improves the classification accuracy for complex functional data.
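The abstract does not give the weighting formula, so the sketch below substitutes a Fisher-type ratio of between-class to within-class variation for each basis coefficient, raised to a power `gamma` that plays the role of the scaling parameter. The ratio, the exponent form, and the name `mv_weights` are assumptions for illustration, not the authors' definition. The coefficients are assumed to have been extracted already (for example from a B-spline fit).

```python
import numpy as np


def mv_weights(coefs, labels, gamma=1.0, eps=1e-12):
    """Per-feature weights from between-class vs. within-class variation.

    coefs  : (n_samples, n_features) basis coefficients of the curves
    labels : (n_samples,) class labels
    gamma  : hypothetical scaling exponent; larger values widen the gap
             between large and small weights
    """
    coefs = np.asarray(coefs, dtype=float)
    labels = np.asarray(labels)
    overall = coefs.mean(axis=0)
    between = np.zeros(coefs.shape[1])
    within = np.zeros(coefs.shape[1])
    for c in np.unique(labels):
        group = coefs[labels == c]
        center = group.mean(axis=0)
        between += len(group) * (center - overall) ** 2
        within += ((group - center) ** 2).sum(axis=0)
    weights = (between / (within + eps)) ** gamma
    return weights / weights.sum()
```

The resulting weights could then rescale each coefficient before a distance-based classifier such as K-nearest neighbors, so that features separating the classes dominate the distance.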
Abstract: It is well known that the nonparametric estimation of the regression function is highly sensitive to the presence of even a small proportion of outliers in the data. To solve the problem of atypical observations when the covariates of the nonparametric component are functional, robust estimates for the regression parameter and regression operator are introduced. The main purpose of the paper is to consider data-driven methods of selecting the number of neighbors in order to make the proposed processes fully automatic. We use the k Nearest Neighbors (kNN) procedure to construct the kernel estimator of the proposed robust model. Under some regularity conditions, we state consistency results for kNN functional estimators, which are uniform in the number of neighbors (UINN). Furthermore, a simulation study and an empirical application to a real data analysis of gasoline octane predictions are carried out to illustrate the higher predictive performance and the usefulness of the kNN approach.
Funding: This work was funded through the Research Groups Program under Grant Number R.G.P.1/189/41. I.M.A. and M.K.A. received the grant.
Abstract: The problem of predicting continuous scalar outcomes from functional predictors has received high levels of interest in recent years in many fields, especially in the food industry. The k-nearest neighbor (k-NN) method of Near-Infrared Reflectance (NIR) analysis is practical, relatively easy to implement, and becoming one of the most popular methods for assessing food quality based on NIR data. The k-NN method is often called the k-nearest neighbor classifier when it is used for classifying categorical variables, and k-nearest neighbor regression when it is applied to predicting noncategorical variables. The objective of this paper is to use the functional NIR spectroscopy approach to predict some chemical components with modern statistical models based on the kernel and k-nearest neighbor procedures. Three NIR spectroscopy datasets are used as examples, namely the cookie dough, sugar, and Tecator data. Specifically, we propose three models for this kind of data, namely Functional Nonparametric Regression, Functional Robust Regression, and Functional Relative Error Regression, each with both kernel and k-NN approaches, and compare them. The predictive power of the k-NN method was compared with that of the kernel method on several real data sets. The experimental results show the higher efficiency of the k-NN predictor over the kernel predictor.
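The contrast between the kernel and k-NN predictors can be sketched for curves sampled on a common grid. This is a simplified illustration under stated assumptions: the L2 distance is approximated by a Riemann sum, the k-NN predictor uses an unweighted average of the neighbors' responses, the kernel is Gaussian, and all function names are hypothetical. The paper's estimators may use other semi-metrics and kernels.

```python
import numpy as np


def l2_distances(train_curves, new_curve, dt=1.0):
    """Approximate L2 distance between a new curve and each training curve."""
    return np.sqrt(((train_curves - new_curve) ** 2).sum(axis=1) * dt)


def knn_predict(train_curves, train_y, new_curve, k=3, dt=1.0):
    """Average the responses of the k curves closest to the new curve."""
    d = l2_distances(train_curves, new_curve, dt)
    nearest = np.argsort(d)[:k]
    return train_y[nearest].mean()


def kernel_predict(train_curves, train_y, new_curve, h, dt=1.0):
    """Nadaraya-Watson estimate with a Gaussian kernel and fixed bandwidth h."""
    d = l2_distances(train_curves, new_curve, dt)
    w = np.exp(-0.5 * (d / h) ** 2)
    return (w * train_y).sum() / w.sum()
```

The practical difference is that `h` is a single global bandwidth, while `k` adapts the neighborhood size to the local density of curves, which is one common explanation for k-NN doing better on unevenly spread spectra.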
Abstract: A variety of factors affect air quality, making it a difficult issue. Air quality refers to how clean the air is in a certain area. It is challenging for conventional approaches to correctly identify aberrant values or outliers because of the large fluctuations in this sort of data, which is influenced by climate change and the environment. With accelerating industrial expansion and rising population density in Kolkata City, air pollution is continuously rising. This study involves two phases: first, imputation of missing values, and second, detection of outliers using Statistical Process Control (SPC) and Functional Data Analysis (FDA). The efficacy of the proposed outlier identification methodology is assessed on working days and non-working days for the variables NO<sub>2</sub>, SO<sub>2</sub>, and O<sub>3</sub>, which were measured for a full consecutive year in Kolkata, India. The results show how the functional data approach outshines traditional outlier detection methods. The outcomes show that functional data analysis vibrates more than the other two approaches after imputation, and the suggested outlier detector is absolutely appropriate for the precise detection of outliers in highly variable data.
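For comparison with the functional approach, the SPC baseline can be sketched as a plain three-sigma control rule applied to a series of readings. The abstract does not specify which control chart was used, so the three-sigma rule with the sample standard deviation and the function names below are assumptions.

```python
import numpy as np


def shewhart_limits(values, n_sigma=3.0):
    """Lower limit, center line, and upper limit from the sample mean and SD."""
    values = np.asarray(values, dtype=float)
    center = values.mean()
    sigma = values.std(ddof=1)
    return center - n_sigma * sigma, center, center + n_sigma * sigma


def flag_out_of_control(values, n_sigma=3.0):
    """Boolean mask of readings falling outside the control limits."""
    values = np.asarray(values, dtype=float)
    lower, _, upper = shewhart_limits(values, n_sigma)
    return (values < lower) | (values > upper)
```

Note that the extreme readings themselves inflate the estimated standard deviation and widen the limits. This is one reason pointwise rules can struggle on highly variable pollution data.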
Funding: Supported by the National Natural Science Foundation of China (Grant No. 11271080).
Abstract: We propose a new functional single-index model, called the dynamic single-index model for functional data (DSIM), to efficiently model non-linear and dynamic relationships between a functional predictor and a functional response. The proposed model naturally allows for some curvature not captured by the ordinary functional linear model. Using the proposed two-step estimating algorithm, we develop estimates for both the link function and the regression coefficient function, and then provide predictions of new response trajectories. Besides the asymptotic properties of the estimates of the unknown functions, we also establish the consistency of the predictions of new response trajectories under mild conditions. Finally, we show through extensive simulation studies and a real data example that the proposed DSIM can substantially outperform existing functional regression methods in most settings.
Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 11171331, 11561006, 11331011); the Program for Creative Research Group of National Natural Science Foundation of China (Grant No. 61621003); a grant from the Key Lab of Random Complex Structure and Data Science, Chinese Academy of Sciences; the Natural Science Foundation of Shenzhen University; Research Projects of Colleges and Universities in Guangxi (Grant No. KY2015YB171); the Innovation Project of Guangxi Graduate Education (Grant No. JGY2015122); and a grant from the Key Base of Humanities and Social Sciences in Guangxi College.
Abstract: We propose a method that uses functional singular components to establish functional additive models. The proposed methodology reduces the curve regression problem to ordinary (i.e., scalar) additive regression problems on the singular components of the predictor process and the response process. Consistency of the estimators for the nonparametric function and for the prediction is proved. A simulation study is conducted to investigate the finite-sample performance of the proposed estimators.
Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 11671096, 11690013, 11731011 and 12071267) and the Natural Science Foundation of Shanxi Province, China (Grant No. 201901D111279).
Abstract: In this paper, we consider composite quantile regression for the partial functional linear regression model with polynomial spline approximation. Under some mild conditions, the convergence rates of the estimators and of the mean squared prediction error, and the asymptotic normality of the parameter vector, are obtained. Simulation studies demonstrate that the proposed estimation method is robust and works much better than the least-squares-based method when there are outliers in the dataset or the random error follows a heavy-tailed distribution. Finally, we apply the proposed methodology to a spectroscopic data set to illustrate its usefulness in practice.
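The objective behind composite quantile regression can be written down compactly: a single slope vector is shared across several quantile levels, each with its own intercept, and the check losses are summed. The sketch below only evaluates that objective and does not minimize it. The names are hypothetical, and in the partial functional setting the design matrix `X` would be assumed to hold both the scalar covariates and the spline-basis projections of the functional predictor.

```python
import numpy as np


def check_loss(u, tau):
    """Quantile check loss: tau * u for u >= 0 and (tau - 1) * u for u < 0."""
    u = np.asarray(u, dtype=float)
    return u * (tau - (u < 0))


def composite_quantile_objective(y, X, beta, intercepts, taus):
    """Sum of check losses over quantile levels with a shared slope vector."""
    fitted = X @ beta
    total = 0.0
    for b, tau in zip(intercepts, taus):
        total += check_loss(y - fitted - b, tau).sum()
    return total
```

Because the check loss grows linearly rather than quadratically in the residual, a single gross outlier shifts this objective far less than it shifts a least-squares criterion, which matches the robustness claim in the abstract.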
Funding: Supported by the National Natural Science Foundation of China under Grant Nos. 11671096, 11690013, 11731011.
Abstract: This paper presents a robust estimation procedure using modal regression for partial functional linear regression, which combines the common linear model with the functional linear regression model. The outstanding merit of the new method is that it is robust against outliers or heavy-tailed error distributions while performing no worse than the least-squares-based estimation method for normal error cases. The slope function is fitted by B-splines. Under suitable conditions, the authors obtain the convergence rates and asymptotic normality of the estimators. Finally, simulation studies and a real data example are conducted to examine the finite-sample performance of the proposed method. Both the simulation results and the real data analysis confirm that the newly proposed method works very well.
Funding: This research was financially supported by the Natural Science Foundation of China (Nos. 71420107025, 11701023).
Abstract: The increasing richness of data encourages a comprehensive understanding of economic and financial activities, where variables of interest may include not only scalar (point-like) indicators, but also functional (curve-like) and compositional (pie-like) ones. In many research topics, the variables are also chronologically collected across individuals, which falls into the paradigm of longitudinal analysis. The complicated nature of the data, however, increases the difficulty of modeling these variables under the classic longitudinal framework. In this study, we investigate the linear mixed-effects model (LMM) for such complex data. Different types of variables are first consistently represented using the corresponding basis expansions so that the classic LMM can then be applied to them, which generalizes the theoretical framework of LMM to complex data analysis. A number of simulation studies indicate the feasibility and effectiveness of the proposed model. We further illustrate its practical utility in a real data study on the Chinese stock market and show that the proposed method can enhance the performance and interpretability of the regression for complex data with diversified characteristics.
Funding: Supported by the National Natural Science Foundation of China under Grant No. 11771032.
Abstract: Working with partially observed functional data has attracted rapidly increasing attention, since there are many applications in which each functional curve may be observed only on a subset of a common domain, and the incompleteness makes most existing methods for functional data analysis ineffective. In this paper, motivated by the appealing characteristics of conditional quantile regression, the authors consider functional linear quantile regression, assuming the explanatory functions are observed partially on dense but discrete point grids of some random subintervals of the domain. A functional principal component analysis (FPCA) based estimator is proposed for the slope function, and the convergence rate of the estimator is investigated. In addition, the finite-sample performance of the proposed estimator is evaluated through simulation studies and a real data application.
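As background for the FPCA-based estimator, the sketch below runs FPCA on curves that are fully observed on a common equally spaced grid, using a singular value decomposition of the centered data matrix. This is a deliberate simplification: the paper's setting, with curves observed only on random subintervals, needs a covariance estimate built from the available pieces, which is not shown here. The function name and the grid spacing `dt` are assumptions.

```python
import numpy as np


def fpca(curves, n_components, dt=1.0):
    """FPCA for an (n, T) array of curves on an equally spaced grid.

    Returns the mean curve, eigenfunctions scaled to unit L2 norm on the
    grid, eigenvalues of the discretized covariance operator, and scores.
    """
    curves = np.asarray(curves, dtype=float)
    mean = curves.mean(axis=0)
    centered = curves - mean
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    # Rescale so that sum(phi ** 2) * dt == 1 for each eigenfunction.
    eigfuncs = vt[:n_components] / np.sqrt(dt)
    eigvals = s[:n_components] ** 2 * dt / (curves.shape[0] - 1)
    # Scores approximate the integral of (x - mean) * phi over the domain.
    scores = centered @ eigfuncs.T * dt
    return mean, eigfuncs, eigvals, scores
```

With the scores in hand, a functional linear model reduces to a regression on a handful of scalar covariates, and the slope function is recovered as a linear combination of the eigenfunctions.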