Error or drift is frequently produced in pose estimation by geometric "feature detection and tracking" monocular visual odometry (VO) when the camera moves faster than 1.5 m/s. Moreover, in most deep-learning-based VO methods the weight factors take fixed values, which easily leads to overfitting. A new measurement system for monocular visual odometry, named Deep Learning Visual Odometry (DLVO), is proposed based on neural networks. In this system, a Convolutional Neural Network (CNN) is used to extract and match features, and a Recurrent Neural Network (RNN) is used for sequence modeling to estimate the camera's 6-DoF poses. Instead of fixed CNN weight values, a Bayesian distribution over the weight factors is introduced to effectively alleviate network overfitting. The network is trained on 18,726 frames from the KITTI dataset. This system increases the generalization ability of the model at prediction time. Compared with the original Recurrent Convolutional Neural Network (RCNN), our method reduces the test loss by 5.33%, and it is more effective than traditional VO methods in improving the robustness of the estimated translation and rotation.
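As a rough illustration of the CNN-plus-RNN pose-regression pipeline described above, the sketch below stacks a small convolutional feature extractor on an LSTM that outputs a 6-DoF pose per frame. The layer sizes and the use of an LSTM are assumptions for illustration; the paper's exact DLVO architecture and its Bayesian treatment of the CNN weights are not reproduced here.

```python
import torch
import torch.nn as nn

class PoseNetSketch(nn.Module):
    """Minimal CNN + RNN pose regressor in the spirit of learning-based VO.

    Layer sizes are illustrative, not the DLVO configuration.
    """
    def __init__(self, hidden=256):
        super().__init__()
        # CNN backbone: extracts features from a pair of stacked RGB frames (6 channels).
        self.cnn = nn.Sequential(
            nn.Conv2d(6, 32, 7, stride=2, padding=3), nn.ReLU(),
            nn.Conv2d(32, 64, 5, stride=2, padding=2), nn.ReLU(),
            nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        # RNN models the temporal sequence of per-frame features.
        self.rnn = nn.LSTM(input_size=128, hidden_size=hidden, batch_first=True)
        # Regression head: 6-DoF pose (3 translation + 3 rotation parameters).
        self.head = nn.Linear(hidden, 6)

    def forward(self, frames):                    # frames: (batch, time, 6, H, W)
        b, t = frames.shape[:2]
        feats = self.cnn(frames.flatten(0, 1)).flatten(1)   # (b*t, 128)
        out, _ = self.rnn(feats.view(b, t, -1))
        return self.head(out)                     # (batch, time, 6) relative poses

poses = PoseNetSketch()(torch.randn(2, 5, 6, 64, 64))
print(poses.shape)  # torch.Size([2, 5, 6])
```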
A general uncertainty relation is revealed in the modeling of back-propagation (BP) neural networks between the change of the weight values, which represents learning ability, and the discrimination error on unlearned sample sets, which represents generalization ability. Numerical simulation tests on multiple types of complicated functions are carried out to determine the value range (1×10⁻⁵ to 5×10⁻⁴) of the overfitting parameter in the uncertainty relation. Based on this uncertainty relation, overfitting during the training of a given sample set with a BP neural network can be judged.
The recent interest in the deployment of Generative AI applications that use large language models (LLMs) has brought to the forefront significant privacy concerns, notably the leakage of Personally Identifiable Information (PII) and other confidential or protected information that may have been memorized during training, specifically during a fine-tuning or customization process. We describe different black-box attacks from potential adversaries and study their impact on the amount and type of information that may be recovered from commonly used and deployed LLMs. Our research investigates the relationship between PII leakage, memorization, and factors such as model size, architecture, and the nature of attacks employed. The study utilizes two broad categories of attacks: PII leakage-focused attacks (auto-completion and extraction attacks) and memorization-focused attacks (various membership inference attacks). The findings from these investigations are quantified using an array of evaluative metrics, providing a detailed understanding of LLM vulnerabilities and the effectiveness of different attacks.
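A minimal loss-based membership inference probe, one of the simplest attacks in the memorization-focused family mentioned above, can be sketched as follows: score candidate texts by the model's average per-token loss and flag unusually low-loss samples as likely training members. The model name, the candidate strings, and the threshold below are placeholders; real evaluations use the fine-tuned model under test, calibrated reference models, and proper metrics rather than a fixed cutoff.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder model; a real study would probe the deployed or fine-tuned LLM under test.
name = "gpt2"
tok = AutoTokenizer.from_pretrained(name)
model = AutoModelForCausalLM.from_pretrained(name).eval()

def avg_token_loss(text: str) -> float:
    """Average negative log-likelihood the model assigns to `text`."""
    ids = tok(text, return_tensors="pt").input_ids
    with torch.no_grad():
        out = model(ids, labels=ids)          # causal LM loss over the sequence
    return out.loss.item()

candidates = [
    "Alice's contact email address is alice@example.com.",   # fabricated example string
    "The weather tomorrow is expected to be mildly pleasant.",
]
for text in candidates:
    loss = avg_token_loss(text)
    # Low loss (high likelihood) is weak evidence of memorization/membership.
    verdict = "suspect member" if loss < 3.0 else "likely non-member"   # arbitrary threshold
    print(f"{loss:.3f}  {verdict}  {text[:45]}")
```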
Deep learning (DL) techniques, more specifically Convolutional Neural Networks (CNNs), have become increasingly popular in advancing the field of data science and have had great successes in a wide array of applications including computer vision, speech, and natural language processing. However, the training process of CNNs is computationally intensive, especially when the dataset is huge. To overcome these obstacles, this paper takes advantage of distributed frameworks and cloud computing to develop a parallel CNN algorithm. MapReduce is a scalable and fault-tolerant data processing tool that was developed to provide significant improvements in large-scale data-intensive applications in clusters. A MapReduce-based CNN (MCNN) is developed in this work to tackle the task of image classification. In addition, the proposed MCNN adds dropout layers to the networks to tackle the overfitting problem. The implementation of MCNN and how the proposed algorithm accelerates learning are examined in detail and demonstrated through experiments. Results reveal high classification accuracy and significant improvements in speedup, scaleup and sizeup compared to the standard algorithms.
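The dropout idea adopted by MCNN to curb overfitting can be illustrated independently of the MapReduce machinery. The sketch below inserts dropout layers into a small image classifier; the architecture, dropout rates, and assumed 32×32 RGB inputs are illustrative only, and the distribution of this computation across mappers and reducers is not shown.

```python
import torch.nn as nn

# CNN classifier with dropout layers inserted to reduce overfitting.
# Layer sizes and rates are illustrative, not the MCNN configuration; assumes 32x32 RGB inputs.
model = nn.Sequential(
    nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
    nn.MaxPool2d(2),
    nn.Dropout2d(0.25),            # randomly zeroes whole feature maps during training
    nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
    nn.MaxPool2d(2),
    nn.Dropout2d(0.25),
    nn.Flatten(),
    nn.Linear(64 * 8 * 8, 128), nn.ReLU(),
    nn.Dropout(0.5),               # heavier dropout before the classification head
    nn.Linear(128, 10),
)
```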
To address overfitting and improve the generalization capability (GC) of forecasting models built with artificial neural networks (ANNs), a new model-building method is proposed in which a low-dimensional ANN learning matrix is constructed through principal component analysis (PCA). The results show that PCA makes it possible to construct an ANN model without searching for an optimal structure with the appropriate number of hidden-layer nodes, and thus avoids overfitting by condensing the forecasting information, reducing dimensionality and removing noise; GC is greatly improved compared with traditional ANN and stepwise regression techniques for model establishment.
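A minimal sketch of this kind of pipeline, using scikit-learn, compresses the predictors with PCA before feeding them to a small neural network regressor; the number of retained components, the network size, and the synthetic data are assumptions for illustration.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 20))                     # 20 raw predictors (synthetic)
y = X[:, :3].sum(axis=1) + 0.1 * rng.normal(size=300)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# PCA condenses the predictors into a low-dimensional learning matrix,
# then a small ANN is trained on the components instead of the raw inputs.
model = make_pipeline(
    PCA(n_components=5),                           # retained components: illustrative choice
    MLPRegressor(hidden_layer_sizes=(8,), max_iter=2000, random_state=0),
)
model.fit(X_tr, y_tr)
print("held-out R^2:", round(model.score(X_te, y_te), 3))
```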
Melanoma is the most lethal malignant tumour, and its prevalence is increasing. Early detection and diagnosis of skin cancer can alert patients to take precautions and dramatically improve people's lives. Recently, deep learning has grown increasingly popular for extracting and categorizing skin cancer features for effective prediction. A deep learning model can learn and co-adapt representations and features from the training data to the point where it fails to perform well on test data; as a result, overfitting and poor performance occur. To deal with this issue, we propose a novel Consecutive Layer-wise weight Constraint MaxNorm model (CLCM-net) that constrains the norm of each weight vector, rescaling it whenever it exceeds a bounded limit. The method uses deep convolutional neural networks together with custom layer-wise weight constraints applied directly to the whole weight matrix to learn features efficiently. A detailed analysis of these weight norms is performed on two distinct datasets, the International Skin Imaging Collaboration (ISIC) 2018 and 2019 collections, which are challenging for convolutional networks to handle. According to the findings of this work, CLCM-net improves model performance by learning features efficiently within the weight-size limit given appropriate weight constraint settings. The proposed techniques achieved 94.42% accuracy on ISIC 2018, 91.73% accuracy on ISIC 2019, and 93% accuracy on the combined dataset.
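The max-norm idea of capping the norm of each weight vector can be sketched in PyTorch by renormalizing weights after every optimizer step; the norm limit and the per-layer application below are placeholders, not the exact CLCM-net constraint schedule.

```python
import torch

def apply_max_norm(module: torch.nn.Module, max_norm: float = 3.0) -> None:
    """Rescale each layer's weight rows so their L2 norm never exceeds `max_norm`."""
    with torch.no_grad():
        for name, param in module.named_parameters():
            if name.endswith("weight") and param.dim() > 1:
                # Flatten all but the output dimension and renormalize row-wise.
                flat = param.view(param.size(0), -1)
                norms = flat.norm(dim=1, keepdim=True).clamp(min=1e-12)
                factor = norms.clamp(max=max_norm) / norms
                param.mul_(factor.view(-1, *([1] * (param.dim() - 1))))

# Usage inside a training loop (sketch):
#   loss.backward(); optimizer.step(); apply_max_norm(model, max_norm=3.0)
```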
Oversampling is the most widely used approach for dealing with class-imbalanced datasets, as seen in the plethora of oversampling methods developed over the last two decades. In this editorial we discuss the issues with oversampling that stem from the possibility of overfitting and from generating synthetic cases that may not accurately represent the minority class. These limitations should be considered when using oversampling techniques. We also propose several alternative strategies for dealing with imbalanced data, as well as a perspective on future work.
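One of the simplest alternatives to oversampling is to reweight the classes in the learning objective rather than synthesize new minority cases. The sketch below shows cost-sensitive class weighting in scikit-learn; it is only an example of one such alternative, not a recommendation taken from the editorial itself, and the dataset is synthetic.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import balanced_accuracy_score
from sklearn.model_selection import train_test_split

# Synthetic dataset with roughly a 9:1 class imbalance.
X, y = make_classification(n_samples=2000, weights=[0.9, 0.1], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

# Cost-sensitive learning: reweight classes instead of generating synthetic minority samples.
clf = LogisticRegression(class_weight="balanced", max_iter=1000).fit(X_tr, y_tr)
print("balanced accuracy:", round(balanced_accuracy_score(y_te, clf.predict(X_te)), 3))
```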
Machine learning methods have been widely used in various geotechnical engineering risk analyses in recent years. However, the overfitting problem often occurs because only a small number of historical samples is available. This paper proposes a Fuzzy SVM (support vector machine) geotechnical engineering risk analysis method based on the Bayesian network. The proposed method uses fuzzy set theory to build a Bayesian network that reflects prior knowledge, and uses an SVM to build a Bayesian network that reflects the historical samples. A Bayesian network for evaluation is then built with the Bayesian estimation method by combining prior knowledge with the historical samples. Taking seismic damage evaluation of slopes as an example, the steps of the method are stated in detail. The proposed method is used to evaluate the seismic damage of 96 slopes along roads in the area affected by the Wenchuan earthquake. The evaluation results show that the method solves the overfitting problem that often occurs when machine learning methods are used to evaluate geotechnical engineering risk, and its performance is much better than that of previous machine learning methods. Moreover, the proposed method can effectively evaluate various geotechnical engineering risks even when some influencing factors are missing.
Overfitting frequently occurs in deep learning. In this paper, we propose a novel regularization method called drop-activation to reduce overfitting and improve generalization. The key idea is to drop nonlinear activation functions by randomly setting them to identity functions during training. During testing, we use a deterministic network with a new activation function that encodes the average effect of dropping activations randomly. Our theoretical analyses support the regularization effect of drop-activation as implicit parameter reduction and verify its capability to be used together with batch normalization (Ioffe and Szegedy, Batch normalization: accelerating deep network training by reducing internal covariate shift, arXiv:1502.03167, 2015). Experimental results on CIFAR-10, CIFAR-100, SVHN, EMNIST, and ImageNet show that drop-activation generally improves the performance of popular neural network architectures on the image classification task. Furthermore, as a regularizer, drop-activation can be used in harmony with standard training and regularization techniques such as batch normalization and AutoAugment (Cubuk et al., Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 113-123, 2019). The code is available at https://github.com/LeungSamWai/Drop-Activation.
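A rough PyTorch rendering of the drop-activation idea follows: during training each nonlinearity is kept with probability p and replaced by the identity otherwise, and at test time a deterministic blend of the two branches encodes the average effect. The retain probability and the blended test-time form are written from the description above as an assumption and should be checked against the paper before reuse.

```python
import torch
import torch.nn as nn

class DropActivation(nn.Module):
    """Randomly drop the ReLU nonlinearity (use identity) during training.

    At evaluation time, apply the deterministic average of the two branches:
    p * relu(x) + (1 - p) * x, where p is the probability of keeping the nonlinearity.
    """
    def __init__(self, p: float = 0.95):
        super().__init__()
        self.p = p

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        if self.training:
            # Per-element Bernoulli mask: 1 -> keep ReLU, 0 -> identity.
            keep = torch.bernoulli(torch.full_like(x, self.p))
            return keep * torch.relu(x) + (1.0 - keep) * x
        return self.p * torch.relu(x) + (1.0 - self.p) * x

# Drop-in replacement for nn.ReLU() inside an existing architecture:
layer = nn.Sequential(nn.Linear(16, 16), DropActivation(p=0.95))
```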
A method for determining the structures and parameters of radial basis function neural networks (RBFNNs) using improved genetic algorithms is proposed. Akaike's information criterion (AIC) with a generalization error term is used as the criterion for optimizing the structures and parameters of the networks. Simulation results show that the method not only improves the approximation and generalization capability of RBFNNs, but also obtains optimal or suboptimal network structures.
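As a reminder of how an AIC-style criterion trades fit against complexity when selecting a network structure, the sketch below scores candidate numbers of RBF centres with the classical form AIC = n·ln(RSS/n) + 2k. The generalization error term added by the paper and the genetic search are not reproduced; the RBF fit itself is a plain least-squares illustration with assumed widths and data.

```python
import numpy as np

def rbf_design(x, centres, width=0.5):
    """Gaussian RBF design matrix for 1-D inputs."""
    return np.exp(-((x[:, None] - centres[None, :]) ** 2) / (2 * width ** 2))

rng = np.random.default_rng(0)
x = np.linspace(-3, 3, 120)
y = np.sin(2 * x) + 0.1 * rng.normal(size=x.size)

for k in (3, 6, 10, 20, 40):                       # candidate numbers of hidden RBF nodes
    centres = np.linspace(x.min(), x.max(), k)
    Phi = rbf_design(x, centres)
    w, *_ = np.linalg.lstsq(Phi, y, rcond=None)    # output-layer weights
    rss = np.sum((y - Phi @ w) ** 2)
    aic = x.size * np.log(rss / x.size) + 2 * k    # classical AIC; the paper adds a generalization term
    print(f"k={k:>2}  AIC={aic:8.2f}")
```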
This paper proposes a novel nonlinear correlation filter for facial landmark localization. First, we show that an SVM classifier can also be used for localization. Then, a soft-constrained Minimum Average Correlation Energy filter (soft-constrained MACE) is proposed, which is more resistant to overfitting the training set than other variants of the correlation filter. To improve performance on multi-modal targets, a locally linear framework is introduced into our model, resulting in the Fourier Locally Linear Soft-Constraint MACE (FL²SC-MACE). Furthermore, we formulate a fast implementation and show that the time consumed at test time is independent of the number of training samples. The merits of our method include accurate localization, good generalization to object variations, fast testing speed, and insensitivity to parameter settings. We conduct cross-set eye localization experiments on the challenging FRGC, FERET and BioID datasets. Our method surpasses the state of the art, especially in pixel-wise accuracy.
Association rule learning is a machine learning method for finding underlying associations in large datasets. Whether present intentionally or unintentionally, noise in the training instances causes overfitting while building the classifier and negatively impacts classification accuracy. This paper applies instance reduction techniques to the datasets before mining the association rules and building the classifier. Instance reduction techniques were originally developed to reduce memory requirements in instance-based learning; here they are used to remove noise from the dataset before training the association rule classifier. Extensive experiments were conducted to assess the accuracy of association rules with different instance reduction techniques, namely Decremental Reduction Optimization Procedure (DROP) 3, DROP5, All k-Nearest Neighbors (ALLKNN), Edited Nearest Neighbor (ENN), and Repeated Edited Nearest Neighbor (RENN), at different noise ratios. Experiments show that instance reduction techniques substantially improved the average classification accuracy at three noise levels: 0%, 5%, and 10%. The RENN algorithm achieved the highest accuracy, with a significant improvement on seven of the eight datasets used from the University of California Irvine (UCI) machine learning repository. The improvements were more apparent in the 5% and 10% noise cases. When RENN was applied, the average classification accuracy over the eight datasets in the zero-noise test rose from 70.47% to 76.65% compared to the original test, from 66.08% to 77.47% in the 5%-noise case, and from 59.89% to 77.59% in the 10%-noise case. Higher confidence was also reported in building the association rules when RENN was used. These results indicate that RENN is a good solution for removing noise and avoiding overfitting during the construction of the association rule classifier, especially in noisy domains.
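Editing techniques such as ENN and RENN are available in the imbalanced-learn package, so the noise-removal step before training a classifier can be sketched as below. The example filters a standard UCI-style dataset and then trains a k-NN classifier as a stand-in; feeding the cleaned instances to an association-rule learner instead would follow the same pattern. Dataset and parameter choices are illustrative.

```python
from imblearn.under_sampling import RepeatedEditedNearestNeighbours
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = load_breast_cancer(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# RENN: repeatedly remove instances misclassified by their nearest neighbours,
# which filters noisy/mislabeled training examples before the classifier is built.
renn = RepeatedEditedNearestNeighbours(n_neighbors=3, sampling_strategy="all")
X_clean, y_clean = renn.fit_resample(X_tr, y_tr)
print(f"training set: {len(y_tr)} -> {len(y_clean)} instances after RENN")

clf = KNeighborsClassifier().fit(X_clean, y_clean)
print("test accuracy:", round(clf.score(X_te, y_te), 3))
```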
To attain the goal of carbon peaking and carbon neutralization, the inevitable choice is the open sharing of power data and the grid connection of high-penetration renewable energy. However, this approach is hindered by the lack of training data for predicting the power of newly grid-connected PV stations. To overcome this problem, this work uses open, shared power data as the input to a short-term PV-power-prediction model based on feature transfer learning, so that the prediction model generalizes to multiple PV power stations. The proposed model integrates a structure model, heat-dissipation conditions, and the loss coefficients of the PV modules. Clear-sky entropy, which characterizes seasonal and weather features, describes the main meteorological characteristics at a PV power station. Taking gated recurrent unit (GRU) neural networks as the framework, with the open and shared PV-power data as the source-domain training labels and a small quantity of power data from a newly grid-connected PV power station as the target-domain training labels, the hidden layers of the neural network are shared between the target domain and the source domain. A fully connected layer is established in the target domain, and a regularization constraint is introduced to fine-tune the network and suppress overfitting during feature transfer. The PV-power prediction is evaluated using the actual power data of PV power stations. The average normalized root mean square error (NRMSE), normalized mean absolute percentage error (NMAPE), and normalized maximum absolute percentage error (NLAE) of the model decrease by 15%, 12%, and 35%, respectively, reflecting much greater adaptability than is possible with other methods. These results show that the proposed method generalizes well to different types of PV devices and operating environments that offer insufficient training data.
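The feature-transfer step described above, sharing the recurrent hidden layers and fine-tuning only a new fully connected head under a regularization constraint, can be sketched in PyTorch as follows. The layer sizes, the use of weight decay as the regularizer, and the freezing scheme are assumptions for illustration, not the paper's exact configuration.

```python
import torch
import torch.nn as nn

class GRUForecaster(nn.Module):
    def __init__(self, n_features=8, hidden=64):
        super().__init__()
        self.gru = nn.GRU(n_features, hidden, num_layers=2, batch_first=True)
        self.head = nn.Linear(hidden, 1)            # predicted PV power for the next step

    def forward(self, x):                            # x: (batch, time, n_features)
        out, _ = self.gru(x)
        return self.head(out[:, -1])

# 1) Pretrain on open, shared PV-power data from existing stations (source domain).
source_model = GRUForecaster()
# ... source-domain training loop would go here ...

# 2) Transfer: share the GRU hidden layers, replace the fully connected head,
#    and fine-tune only the head on the small target-domain dataset.
target_model = GRUForecaster()
target_model.gru.load_state_dict(source_model.gru.state_dict())
for p in target_model.gru.parameters():
    p.requires_grad = False                          # shared hidden layers stay frozen
target_model.head = nn.Linear(64, 1)                 # new head for the new station

# Regularization constraint (here: weight decay) to suppress overfitting during fine-tuning.
optimizer = torch.optim.Adam(target_model.head.parameters(), lr=1e-3, weight_decay=1e-4)
```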
In recent years, automatic program repair approaches have developed rapidly in the field of software engineering. However, existing program repair techniques based on genetic programming require the verification of a large number of candidate patches, which consumes considerable computational resources. In this paper, we propose a random search and code similarity based automatic program repair method (RSCSRepair). First, to reduce the verification effort for candidate patches, we introduce test filtering to reduce the number of test cases and use test case prioritization techniques to reconstruct a new set of test cases. Second, we use a combination of code similarity and random search for patch generation. Finally, we use a patch overfitting detection method to improve patch quality. To verify the performance of our approach, we conducted experiments on the Defects4J benchmark. The experimental results show that RSCSRepair correctly repairs up to 54 bugs, an improvement of 14.3%, 8.5%, 14.3% and 10.3% over jKali, Nopol, CapGen and SimFix, respectively.
The forecasting of time-series data plays an important role in various domains, and improving the prediction accuracy of time-series data is of both theoretical and practical significance. As the study of time series has progressed, forecasting models have become more complicated, and consequently great attention has been paid to the techniques used in designing them; a modeling method that is easy for engineers to use and generates good results is urgently needed. In this paper, a gradient-boost AR ensemble learning algorithm (AREL) is put forward. The effectiveness of AREL is assessed by theoretical analyses, which demonstrate that the method can build a strong predictive model by assembling a set of AR models. To avoid fitting any single training example exactly, an insensitive loss function is introduced into the AREL algorithm, which reduces the influence of random noise. To further enhance the capability of AREL for non-stationary time series, improve its robustness, discourage overfitting, and reduce its sensitivity to parameter settings, a weighted kNN prediction method based on the AREL algorithm is presented. Numerical tests on real data demonstrate that the proposed modeling and prediction methods are effective.
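A loose sketch of boosting AR models under an insensitive loss is shown below: each stage fits an AR model to the current residuals, with residuals inside an ε-band zeroed out so that small errors are not chased. This is only an approximation written from the abstract, not AREL's actual gradient-boosting formulation, and the weighted kNN prediction step is omitted; the lag order, ε, learning rate, stage count, and synthetic series are placeholders.

```python
import numpy as np

def lagged(y, p):
    """Design matrix of p lags and aligned targets for an AR(p) model."""
    X = np.column_stack([y[i:len(y) - p + i] for i in range(p)])
    return X, y[p:]

def fit_ar(X, t):
    coef, *_ = np.linalg.lstsq(X, t, rcond=None)
    return coef

rng = np.random.default_rng(0)
y = np.sin(np.arange(400) * 0.1) + 0.1 * rng.normal(size=400)   # synthetic series

p, eps, lr, n_stages = 5, 0.05, 0.5, 10
X, t = lagged(y, p)
pred = np.zeros_like(t)
models = []
for _ in range(n_stages):
    r = t - pred
    r = np.where(np.abs(r) > eps, r, 0.0)    # epsilon-insensitive zone: ignore tiny residuals
    coef = fit_ar(X, r)                       # next AR model fits the (insensitive) residuals
    models.append(coef)
    pred += lr * (X @ coef)

print("final RMSE:", round(float(np.sqrt(np.mean((t - pred) ** 2))), 4))
```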
Fitting corneal topography data to analytical surfaces is necessary in many clinical and experimental applications, yet the absolute superiority of any fitting method remains unclear, and the overfitting risks of these methods have not been well studied. This study aimed to evaluate the accuracy and reliability of orthogonal polynomials as fitting routines for representing corneal topography. Four families of orthogonal polynomials, namely Zernike polynomials (ZPs), pseudo-Zernike polynomials (PZPs), Gaussian-Hermite polynomials (GHPs) and Orthogonal Fourier-Mellin polynomials (OFMPs), were employed to fit anterior and posterior corneal topographies collected from 200 healthy and 174 keratoconic eyes using a Pentacam topographer. The fitting performance of these polynomials was compared, and the potential overfitting risks were assessed through a prediction exercise. The results showed that, except at low orders, the fitting performance differed little among the polynomials at orders around 10 with respect to surface reconstruction (RMSEs ≈ 0.3 μm). Anterior surfaces of normal corneas were fitted most efficiently, followed by those of keratoconic corneas, and then posterior corneal surfaces. The results, however, revealed an alarming fact: all polynomials tended to overfit the data beyond certain orders. GHPs, closely followed by ZPs, were the most robust in predicting unmeasured surface locations, while PZPs and especially OFMPs overfitted the surfaces drastically. Order 10 appeared to be optimal for corneal surfaces with a 10-mm diameter, ensuring accurate reconstruction and avoiding overfitting; the optimal order, however, varied with topography diameter and data resolution. The study concluded that continuing to use ZPs as the fitting routine for most topography maps, or using GHPs instead, remains a good choice. Choosing polynomial orders close to the topography diameter (in millimeters) is generally suggested to ensure both reconstruction accuracy and prediction reliability and to avoid overfitting for both normal and complex (e.g., keratoconic) corneal surfaces.
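The prediction exercise used above to expose overfitting can be mimicked with a generic least-squares surface fit: increase the polynomial order, fit on measured locations, and evaluate at held-out locations. The sketch below uses plain 2-D monomials and a synthetic surface rather than Zernike or Gaussian-Hermite polynomials and real Pentacam data, so it only illustrates the order-selection logic, not the paper's fitting routines.

```python
import numpy as np

rng = np.random.default_rng(0)

def design(x, y, order):
    """2-D polynomial design matrix with all monomials x^i * y^j, i + j <= order."""
    cols = [x**i * y**j for i in range(order + 1) for j in range(order + 1 - i)]
    return np.column_stack(cols)

# Synthetic "surface": smooth shape plus measurement noise at scattered points.
x, y = rng.uniform(-1, 1, (2, 600))
z = 0.5 * (x**2 + y**2) - 0.1 * x * y**2 + 0.01 * rng.normal(size=x.size)

fit_idx, test_idx = np.arange(0, 600, 2), np.arange(1, 600, 2)   # held-out locations
for order in (2, 4, 6, 10, 14):
    A = design(x[fit_idx], y[fit_idx], order)
    c, *_ = np.linalg.lstsq(A, z[fit_idx], rcond=None)
    fit_rmse = np.sqrt(np.mean((A @ c - z[fit_idx]) ** 2))
    pred = design(x[test_idx], y[test_idx], order) @ c
    test_rmse = np.sqrt(np.mean((pred - z[test_idx]) ** 2))       # grows once the fit overfits
    print(f"order {order:>2}: fit RMSE {fit_rmse:.4f}  prediction RMSE {test_rmse:.4f}")
```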
This paper analyses the intrinsic relationship between the learning ability and the generalization ability of a BP network, together with other influencing factors, when overfitting occurs, and introduces the multiple correlation coefficient to describe the complexity of samples. Following the calculation uncertainty principle and the minimum principle of neural network structural design, and by analogy with the general uncertainty relation in the information transfer process, it establishes an uncertainty relation between the relative training error on the training sample set, which reflects the network's learning ability, and the relative test error on the test sample set, which represents the network's generalization ability. Simulated BP network overfitting experiments with different types of functions show that the overfit parameter q in this relation generally lies between 7×10⁻³ and 7×10⁻². The uncertainty relation then yields a formula for the number of hidden nodes of a network with good generalization ability, under the condition that the multiple correlation coefficient is used to describe sample complexity and the given approximation error requirement is satisfied; the rationality of this formula is verified. The paper also discusses the best point at which to stop training a BP network on a given sample set so as to improve its generalization ability.