The quick spread of the Coronavirus Disease (COVID-19) infection around the world is considered a real danger to global health. The biological structure and symptoms of COVID-19 are similar to those of other viral chest maladies, which makes it challenging to develop approaches for efficient identification of COVID-19. In this study, an automatic prediction approach is proposed to discriminate between healthy and COVID-19-infected subjects in X-ray images using two families of models: traditional machine learning methods (e.g., artificial neural network (ANN), support vector machine (SVM), linear kernel and radial basis function (RBF), k-nearest neighbor (k-NN), Decision Tree (DT), and CN2 rule inducer techniques) and deep learning models (e.g., MobileNets V2, ResNet50, GoogleNet, DarkNet, and Xception). A large X-ray dataset has been created and developed, namely COVID-19 vs. Normal (400 healthy cases and 400 COVID cases). To the best of our knowledge, it is currently the largest publicly accessible COVID-19 dataset with the largest number of X-ray images of confirmed COVID-19 infection cases. Based on the experimental results, it can be concluded that all the models performed well; the deep learning models achieved the best accuracy of 98.8% with the ResNet50 model. In comparison, among the traditional machine learning techniques, the SVM demonstrated the best result with an accuracy of 95%, and RBF reached an accuracy of 94%, for the prediction of coronavirus disease 2019.
Gait recognition is an active research area that uses walking patterns to identify a subject correctly. Human Gait Recognition (HGR) is performed without any cooperation from the individual. However, in practice, it remains a challenging task under diverse walking sequences due to covariant factors such as normal walking and walking while wearing a coat. Researchers, over the years, have worked on successfully identifying subjects using different techniques, but there is still room for improvement in accuracy due to these covariant factors. This paper proposes an automated model-free framework for human gait recognition. There are a few critical steps in the proposed method. Firstly, optical flow-based motion region estimation and dynamic coordinates-based cropping are performed. The second step involves training a fine-tuned pre-trained MobileNetV2 model on both original and optical flow cropped frames; the training is conducted using static hyperparameters. The third step applies a fusion technique known as normal distribution serially fusion. In the fourth step, a better optimization algorithm is applied to select the best features, which are then classified using a Bi-Layered neural network. Three publicly available datasets, CASIA A, CASIA B, and CASIA C, were used in the experimental process, yielding average accuracies of 99.6%, 91.6%, and 95.02%, respectively. The proposed framework achieves improved accuracy compared to the other methods.
This paper presents an end-to-end deep learning method to solve geometry problems via feature learning and contrastive learning of multimodal data. A key challenge in solving geometry problems using deep learning is to automatically adapt to the task of understanding single-modal and multimodal problems. Existing methods focus on either single-modal or multimodal problems and cannot be adapted to the other. A general geometry problem solver should be able to process problems of various modalities at the same time. In this paper, a shared feature-learning model of multimodal data is adopted to learn a unified feature representation of text and image, which can resolve the heterogeneity between multimodal geometry problems. A contrastive learning model of multimodal data enhances the semantic relevance between multimodal features and maps them into a unified semantic space, which can effectively adapt to both single-modal and multimodal downstream tasks. Based on the feature extraction and fusion of multimodal data, the proposed geometry problem solver uses relation extraction, theorem reasoning, and problem solving to present solutions in a readable way. Experimental results show the effectiveness of the method.
Time series anomaly detection is crucial in various industrial applications to identify unusual behaviors within time series data. Due to the challenges associated with annotating anomaly events, time series reconstruction has become a prevalent approach for unsupervised anomaly detection. However, effectively learning representations and achieving accurate detection results remain challenging due to the intricate temporal patterns and dependencies in real-world time series. In this paper, we propose a cross-dimension attentive feature fusion network for time series anomaly detection, referred to as CAFFN. Specifically, a series and feature mixing block is introduced to learn representations in 1D space. Additionally, a fast Fourier transform is employed to convert the time series into 2D space, providing the capability for 2D feature extraction. Finally, a cross-dimension attentive feature fusion mechanism is designed that adaptively integrates features across different dimensions for anomaly detection. Experimental results on real-world time series datasets demonstrate that CAFFN performs better than other competing methods in time series anomaly detection.
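The abstract above says only that a Fourier transform is used to map the series into 2D space; it does not give the procedure. One common way to do this, assumed here purely for illustration, is to estimate the dominant period from the spectrum and fold the series into rows of that length. The function names `dominant_period` and `fold_to_2d` are hypothetical, and a naive DFT stands in for a real FFT so the sketch needs only the standard library.

```python
import cmath
import math


def dominant_period(series):
    """Estimate the dominant period via a naive DFT (assumed stand-in for an FFT)."""
    n = len(series)
    mean = sum(series) / n
    best_k, best_amp = 1, -1.0
    # Skip k = 0 (the mean) and scan up to the Nyquist frequency.
    for k in range(1, n // 2 + 1):
        coeff = sum(
            (x - mean) * cmath.exp(-2j * math.pi * k * t / n)
            for t, x in enumerate(series)
        )
        if abs(coeff) > best_amp:
            best_k, best_amp = k, abs(coeff)
    return n // best_k


def fold_to_2d(series, period):
    """Stack consecutive windows of length `period` as rows of a 2D grid."""
    rows = len(series) // period
    return [series[r * period:(r + 1) * period] for r in range(rows)]


# A sine wave with period 8, sampled for 32 steps.
series = [math.sin(2 * math.pi * t / 8) for t in range(32)]
period = dominant_period(series)
grid = fold_to_2d(series, period)
```

In this layout each row holds one cycle, so variation within a cycle runs along columns and variation between cycles runs along rows, which is what makes 2D feature extraction applicable. Whether CAFFN folds the series this way is not stated in the abstract.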
Fine-grained image search is one of the most challenging tasks in computer vision; it aims to retrieve similar images at the fine-grained level for a given query image. The key objective is to learn discriminative fine-grained features by training deep models such that similar images are clustered and dissimilar images are separated in the low-dimensional embedding space. Previous works primarily focused on defining local structure loss functions such as triplet loss and pairwise loss. However, training with these approaches takes a long time and yields poor accuracy. Additionally, representations learned this way tend to tighten up in the embedding space and lose generalizability to unseen classes. This paper proposes a noise-assisted representation learning method for fine-grained image retrieval to mitigate these issues. In the proposed work, class manifold learning is performed in which positive pairs are created with a noise insertion operation instead of tightening class clusters, and other instances within the same cluster are treated as negatives. Then a loss function is defined to penalize cases where the distance between instances of the same class becomes too small relative to the noise pair of that class in the embedding space. The proposed approach is validated on the CARS-196 and CUB-200 datasets and achieves better retrieval results (85.38% recall@1 for CARS-196 and 70.13% recall@1 for CUB-200) compared to other existing methods.
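The recall@1 figures quoted above measure how often the single nearest retrieved image shares the query's class. A minimal sketch of that metric follows, assuming Euclidean distance and a separate gallery set; the function name `recall_at_1` and the toy embeddings are hypothetical, and the paper's exact evaluation protocol is not given in the abstract.

```python
import math


def recall_at_1(query_embs, query_labels, gallery_embs, gallery_labels):
    """Fraction of queries whose nearest gallery item has the same label."""
    hits = 0
    for q, q_label in zip(query_embs, query_labels):
        # Index of the gallery embedding closest to q (Euclidean distance).
        nearest = min(
            range(len(gallery_embs)),
            key=lambda i: math.dist(q, gallery_embs[i]),
        )
        if gallery_labels[nearest] == q_label:
            hits += 1
    return hits / len(query_embs)


# Toy 2D embeddings: the third query lies closer to the wrong class.
queries = [(0.0, 0.1), (1.0, 0.9), (0.4, 0.4)]
query_labels = ["car_a", "car_b", "car_b"]
gallery = [(0.0, 0.0), (1.0, 1.0)]
gallery_labels = ["car_a", "car_b"]
score = recall_at_1(queries, query_labels, gallery, gallery_labels)
```

Here two of the three queries retrieve a same-class neighbour, so `score` is 2/3.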
In the era of big data, learning discriminant feature representations from network traffic has been identified as an essential task for improving the detection ability of an intrusion detection system (IDS). Owing to the lack of accurately labeled network traffic data, many unsupervised feature representation learning models have been proposed with state-of-the-art performance. Yet, these models fail to consider the classification error while learning the feature representation. Intuitively, the learnt feature representation may therefore degrade the performance of the classification task. For the first time in the field of intrusion detection, this paper proposes an unsupervised IDS model leveraging the benefits of a deep autoencoder (DAE) for learning a robust feature representation and a one-class support vector machine (OCSVM) for finding a more compact decision hyperplane for intrusion detection. Specifically, the proposed model defines a new unified objective function to minimize the reconstruction and classification errors simultaneously. This contribution not only enables the model to support joint learning for feature representation and classifier training but also guides it to learn a robust feature representation that can improve the discrimination ability of the classifier for intrusion detection. Three sets of evaluation experiments are conducted to demonstrate the potential of the proposed model. First, the ablation evaluation on the benchmark dataset NSL-KDD validates the design decisions of the proposed model. Next, the performance evaluation on the recent intrusion dataset UNSW-NB15 indicates the stable performance of the proposed model. Finally, the comparative evaluation verifies the efficacy of the proposed model against recently published state-of-the-art methods.
Industrial Control Systems (ICS) or SCADA networks are increasingly targeted by cyber-attacks as their architectures have shifted from proprietary hardware, software and protocols to standard and open-source ones. Furthermore, these systems, which used to be isolated, are now interconnected with corporate networks and the Internet. Among the countermeasures to mitigate the threats, anomaly detection systems play an important role as they can help detect even unknown attacks. Deep learning, which has gained great attention in the last few years due to excellent results in image, video and natural language processing, is being used for anomaly detection in information security, particularly in SCADA networks. The salient features of the data from SCADA networks are learnt as hierarchical representations using deep architectures, and those learnt features are used to classify the data as normal or anomalous. This article is a review of various architectures such as Convolutional Neural Network (CNN), Recurrent Neural Network (RNN), Stacked Autoencoder (SAE), Long Short-Term Memory (LSTM), or a combination of those architectures, for anomaly detection in SCADA networks.
Visible-infrared Cross-modality Person Re-identification (VI-ReID) is a critical technology in smart public facilities such as cities, campuses and libraries. It aims to match pedestrians in visible light and infrared images for video surveillance, which poses a challenge in exploring cross-modal shared information accurately and efficiently. Therefore, multi-granularity feature learning methods have been applied in VI-ReID to extract potential multi-granularity semantic information related to pedestrian body structure attributes. However, existing research mainly uses traditional dual-stream fusion networks and overlooks the core of cross-modal learning networks, the fusion module. This paper introduces a novel network called the Augmented Deep Multi-Granularity Pose-Aware Feature Fusion Network (ADMPFF-Net), incorporating the Multi-Granularity Pose-Aware Feature Fusion (MPFF) module to generate discriminative representations. MPFF efficiently explores and learns global and local features with multi-level semantic information by inserting disentangling and duplicating blocks into the fusion module of the backbone network. ADMPFF-Net also provides a new perspective for designing multi-granularity learning networks. By incorporating the multi-granularity feature disentanglement (mGFD) and posture information segmentation (pIS) strategies, it extracts more representative features concerning body structure information. The Local Information Enhancement (LIE) module augments high-performance features in VI-ReID, and the multi-granularity joint loss supervises model training for objective feature learning. Experimental results on two public datasets show that ADMPFF-Net efficiently constructs pedestrian feature representations and enhances the accuracy of VI-ReID.
With the increasing complexity of industrial processes, high-dimensional industrial data exhibit strong nonlinearity, bringing considerable challenges to the fault diagnosis of industrial processes. To efficiently extract deep meaningful features that are crucial for fault diagnosis, a sparse Gaussian feature extractor (SGFE) is designed to learn a nonlinear mapping that projects the raw data into a feature space with the fault label dimension. The feature space is described by the one-hot encoding of the fault category label as an orthogonal basis. In this way, the deep sparse Gaussian features related to fault categories can be gradually learned from the raw data by SGFE. In the feature space, the sparse Gaussian (SG) loss function is designed to constrain the distribution of features to multiple sparse multivariate Gaussian distributions. The sparse Gaussian features are linearly separable in the feature space, which is conducive to improving the accuracy of the downstream fault classification task. The feasibility and practical utility of the proposed SGFE are verified on the MNIST handwritten digits benchmark and the Tennessee-Eastman (TE) benchmark process, respectively.
Dear Sir, I am Dr. Kavitha S, from the Department of Electronics and Communication Engineering, Nandha Engineering College, Erode, Tamil Nadu, India. I write to present the detection of glaucoma using extreme learning machine (ELM) and fractal feature analysis. Glaucoma is the second most frequent cause of permanent blindness in industrial
Intelligent diagnosis of mechanical faults driven by big data is an important means to ensure the safe operation of equipment. Among these methods, deep learning-based machinery fault diagnosis approaches have received increasing attention and achieved some results. However, using transfer learning alone may lead to insufficient performance, and domain bias may cause misclassification of target samples when deep models are built to learn domain-invariant features. To address these problems, a deep discriminative adversarial domain adaptation neural network (DDADAN) for bearing fault diagnosis is proposed. In this method, the raw vibration data are first converted into frequency domain data by Fast Fourier Transform, and an improved deep convolutional neural network with wide first-layer kernels is used as a feature extractor to extract deep fault features. Then, domain-invariant features are learned from the fault data with correlation alignment-based domain adversarial training. Furthermore, to enhance the discriminative property of the features, discriminative feature learning is embedded into the network to make the features compact within each class and separable between classes. Finally, the performance and anti-noise capability of the proposed method are evaluated using two sets of bearing fault datasets. The results demonstrate that the proposed method is capable of handling the domain offset caused by different working conditions and maintains more than 97.53% accuracy on various transfer tasks. Furthermore, the proposed method achieves high diagnostic accuracy under varying noise levels.
To gain a more comprehensive understanding of foam aluminum's performance and to evaluate it, researchers have introduced various characterization indicators. However, the current understanding of the significance of these indicators in analyzing foam aluminum's performance is limited. This study employs the Generalized Regression Neural Network (GRNN) method to establish a model that links foam aluminum's microstructure characterization data with its mechanical properties. Through the GRNN model, the researchers extracted the four most crucial features and their corresponding weight values from the 13 pore characteristics of foam aluminum. Subsequently, a new characterization formula, called “Wang equivalent porosity” (WEP), was developed by using residual weights assigned to the feature weights, and four parameter coefficients were obtained. This formula aims to represent the relationship between foam aluminum's microstructural features and its mechanical performance. Furthermore, the researchers conducted model verification using compression data from 11 sets of foam aluminum. The validation results showed that among these 11 foam aluminum datasets, the Gibson-Ashby formula yielded anomalous results in two cases, whereas WEP exhibited exceptional stability without any anomalies. In comparison to the Gibson-Ashby formula, WEP demonstrated an 18.18% improvement in evaluation accuracy.
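The abstract does not say how the 18.18% figure was computed. It is, however, numerically consistent with the two anomalous Gibson-Ashby cases out of 11 datasets, as the short check below shows. This reading is an assumption, not something the source confirms, and the variable names are illustrative.

```python
# Counts taken from the abstract above.
total_sets = 11
gibson_ashby_anomalies = 2
wep_anomalies = 0

# Assumed reading: improvement = share of datasets where only WEP avoids an anomaly.
improvement = (gibson_ashby_anomalies - wep_anomalies) / total_sets
improvement_percent = round(100 * improvement, 2)
```

Under that assumption `improvement_percent` equals 18.18, matching the reported value.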
The performance of traditional vibration-based fault diagnosis methods greatly depends on handcrafted features extracted using signal processing algorithms, which require significant amounts of domain knowledge and human labor and do not generalize well to new diagnosis domains. Recently, unsupervised representation learning has provided a promising alternative to feature extraction in traditional fault diagnosis due to its superior ability to learn from unlabeled data. Given that vibration signals usually contain multiple temporal structures, this paper proposes a multiscale representation learning (MSRL) framework to learn useful features directly from raw vibration signals, with the aim of capturing rich and complementary fault pattern information at different scales. In the proposed approach, a coarse-grained procedure is first employed to obtain multiple scale signals from an original vibration signal. Then, sparse filtering, a newly developed unsupervised learning algorithm, is applied to automatically learn useful features from each scale signal, and the learned features at each scale are concatenated one by one to obtain multiscale representations. Finally, the multiscale representations are fed into a supervised classifier to obtain diagnosis results. The proposed approach is evaluated using two different case studies: motor bearing and wind turbine gearbox fault diagnosis. Experimental results show that the proposed MSRL approach can take full advantage of the availability of unlabeled data to learn discriminative features and achieves better performance, with higher accuracy and stability, than the traditional approaches.
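The coarse-grained procedure mentioned above is not spelled out in the abstract. A common definition, assumed here, averages consecutive non-overlapping windows whose length equals the scale factor. The function names `coarse_grain` and `multiscale_signals` are hypothetical.

```python
def coarse_grain(signal, scale):
    """Average consecutive non-overlapping windows of length `scale`.

    Trailing samples that do not fill a whole window are dropped.
    """
    n_windows = len(signal) // scale
    return [
        sum(signal[i * scale:(i + 1) * scale]) / scale
        for i in range(n_windows)
    ]


def multiscale_signals(signal, max_scale):
    """Return one coarse-grained copy of the signal per scale 1..max_scale."""
    return {s: coarse_grain(signal, s) for s in range(1, max_scale + 1)}


vibration = [1, 2, 3, 4, 5, 6]
scales = multiscale_signals(vibration, 3)
# scales[2] averages pairs, scales[3] averages triples.
```

Each scale's signal would then be passed to the feature learner separately, and the per-scale features concatenated, as the abstract describes.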
As a new neural network model, the extreme learning machine (ELM) has a good learning rate and generalization ability. However, ELM with a single hidden layer structure often fails to achieve good results when faced with large-scale multi-featured problems. To resolve this problem, we propose a multi-layer framework for the ELM learning algorithm to improve the model’s generalization ability. Moreover, noise or abnormal points often exist in practical applications, making it impossible to obtain clean training data. The generalization ability of the original ELM decreases under such circumstances. To address this issue, we add model bias and variance to the loss function so that the model gains the ability to minimize model bias and model variance, thus reducing the influence of noise signals. A new robust multi-layer algorithm called ML-RELM is proposed to enhance outlier robustness in complex datasets. Simulation results show that the method has high generalization ability and strong robustness to noise.
Object detection in images has been identified as a critical area of research in computer vision and image processing. Researchers have developed several novel methods for determining an object’s location and category from an image. However, there is still room for improvement in terms of detection efficiency. This study aims to develop a technique for detecting objects in images. To enhance overall detection performance, we treat object detection as a two-fold problem comprising localization and classification. The proposed method generates class-independent, high-quality, and precise proposals using an agglomerative clustering technique. We then combine these proposals with the relevant input image to train our network on convolutional features. Next, a network refinement module reduces the number of generated proposals to produce fewer, high-quality candidate proposals. Finally, the revised candidate proposals are sent to the network’s detection stage to determine the object type. The algorithm’s performance is evaluated using the publicly available PASCAL Visual Object Classes Challenge 2007 (VOC2007), VOC2012, and Microsoft Common Objects in Context (MS-COCO) datasets. Using only 100 proposals per image at intersection over union (IoU) thresholds of 0.5 and 0.7, the proposed method attains Detection Recall (DR) rates of (93.17% and 79.35%) and (69.4% and 58.35%), and Mean Average Best Overlap (MABO) values of (79.25% and 62.65%), for the VOC2007 and MS-COCO datasets, respectively. In addition, it achieves a Mean Average Precision (mAP) of 84.7% and 81.5% on the two VOC datasets. The experimental findings reveal that our method exceeds previous approaches in terms of overall detection performance, proving its effectiveness.
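The recall figures above depend on the IoU threshold. The sketch below illustrates that dependence, assuming axis-aligned boxes given as (x_min, y_min, x_max, y_max) and counting a ground-truth box as detected when any proposal reaches the threshold. The function names `iou` and `detection_recall` are hypothetical, and the paper's exact matching rules are not given in the abstract.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x_min, y_min, x_max, y_max) boxes."""
    inter_w = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    inter_h = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    inter = inter_w * inter_h
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def detection_recall(ground_truth, proposals, threshold):
    """Fraction of ground-truth boxes matched by at least one proposal."""
    matched = sum(
        1 for gt in ground_truth
        if any(iou(gt, p) >= threshold for p in proposals)
    )
    return matched / len(ground_truth)


ground_truth = [(0, 0, 2, 2), (10, 10, 12, 12)]
proposals = [(0, 0, 2, 3), (20, 20, 21, 21)]
recall_05 = detection_recall(ground_truth, proposals, 0.5)
recall_07 = detection_recall(ground_truth, proposals, 0.7)
```

The first proposal overlaps its ground-truth box with IoU 4/6, so it counts at the 0.5 threshold but not at 0.7. This is why recall drops as the threshold tightens.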
Person re-identification is a prevalent technology deployed in intelligent surveillance. There have been remarkable achievements in person re-identification methods based on the assumption that all person images have a sufficiently high resolution, yet such models are not applicable to the open world. In the real world, the changing distance between pedestrians and the camera makes the resolution of captured pedestrian images inconsistent. When low-resolution (LR) images in the query set are matched with high-resolution (HR) images in the gallery set, the performance of the pedestrian matching task degrades because critical pedestrian information is absent from the LR images. To address these issues, we present a dual-stream coupling network with wavelet transform (DSCWT) for the cross-resolution person re-identification task. Firstly, we use the multi-resolution analysis principle of the wavelet transform to separately process the low-frequency and high-frequency regions of LR images, which serves to restore the detail information lost in LR images. Then, we devise a residual knowledge constrained loss function that transfers knowledge between the two streams of LR and HR images to obtain pedestrian-invariant features at various resolutions. Extensive qualitative and quantitative experiments across four benchmark datasets verify the superiority of the proposed approach.
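The abstract does not name the wavelet used or give its implementation. As a rough illustration of how a wavelet transform separates low-frequency and high-frequency content, here is a single-level 1D Haar decomposition. The Haar wavelet and the 1D setting are both assumptions; for images the same step is typically applied along rows and then columns. The function names are hypothetical.

```python
import math


def haar_dwt_1d(signal):
    """Single-level Haar transform: returns (low-frequency, high-frequency) halves."""
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail


def haar_idwt_1d(approx, detail):
    """Invert haar_dwt_1d, reconstructing the original samples."""
    s = math.sqrt(2.0)
    out = []
    for a, d in zip(approx, detail):
        out.append((a + d) / s)
        out.append((a - d) / s)
    return out


row = [4.0, 4.0, 2.0, 6.0]
approx, detail = haar_dwt_1d(row)
rebuilt = haar_idwt_1d(approx, detail)
```

The flat pair (4, 4) produces a zero detail coefficient, while the pair (2, 6) produces a non-zero one. The detail band therefore carries edge-like information, which is the kind of content a low-resolution image tends to lose.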
In some image classification tasks, the degree of similarity varies among categories, and samples are often misclassified into highly similar categories. To distinguish highly similar categories, more specific features are required so that the classifier can improve its classification performance. In this paper, we propose a novel two-level hierarchical feature learning framework based on the deep convolutional neural network (CNN), which is simple and effective. First, the deep feature extractors of different levels are trained using a transfer learning method that fine-tunes the pre-trained deep CNN model toward the new target dataset. Second, the general feature extracted from all the categories and the specific feature extracted from highly similar categories are fused into a feature vector. Then the final feature representation is fed into a linear classifier. Finally, experiments using the Caltech-256, Oxford Flower-102, and Tasmania Coral Point Count (CPC) datasets demonstrate that the deep features resulting from two-level hierarchical feature learning have strong expressive power. Our proposed method effectively increases the classification accuracy in comparison with flat multi-class classification methods.
Tremendous amounts of data are being generated and saved in many complex engineering and social systems every day. It is significant and feasible to utilize big data to make better decisions through machine learning techniques. In this paper, we focus on batch reinforcement learning (RL) algorithms for discounted Markov decision processes (MDPs) with large discrete or continuous state spaces, aiming to learn the best possible policy given a fixed amount of training data. Batch RL algorithms with handcrafted feature representations work well for low-dimensional MDPs. However, for many real-world RL tasks, which often involve high-dimensional state spaces, it is difficult and even infeasible to use feature engineering methods to design features for value function approximation. To cope with high-dimensional RL problems, the desire to obtain data-driven features has led to many works that incorporate feature selection and feature learning into traditional batch RL algorithms. In this paper, we provide a comprehensive survey on automatic feature selection and unsupervised feature learning for high-dimensional batch RL. Moreover, we present recent theoretical developments on applying statistical learning to establish finite-sample error bounds for batch RL algorithms based on weighted L_p norms. Finally, we derive some future directions in the research of RL algorithms, theories and applications.
Emotion-based features are critical for achieving high performance in a speech emotion recognition (SER) system. In general, it is difficult to develop these features due to the ambiguity of the ground truth. In this paper, we apply several unsupervised feature learning algorithms (including K-means clustering, the sparse auto-encoder, and sparse restricted Boltzmann machines), which have promise for learning task-related features by using unlabeled data, to speech emotion recognition. We then evaluate the performance of the proposed approach and present a detailed analysis of the effect of two important factors in the model setup, the content window size and the number of hidden layer nodes. Experimental results show that larger content windows and more hidden nodes contribute to higher performance. We also show that the two-layer network cannot explicitly improve performance compared to a single-layer network.
As a significant branch of intelligent vehicle networking technology, intelligent fatigue driving detection is introduced in this paper to recognize the fatigue state of the driver and avoid traffic accidents. The disadvantages of traditional fatigue driving detection methods are identified through a study of traditional eye tracking technology and traditional artificial neural networks. On the basis of image topological analysis, Haar-like features, and the extreme learning machine algorithm, a new intelligent fatigue driving detection method is proposed. The detailed algorithm and implementation scheme for intelligent fatigue driving detection are also presented. Finally, a comparison of simulation results verifies that the new method offers better robustness, efficiency and accuracy in monitoring and tracking driver fatigue using human eye tracking technology.
Funding: Supported by the “Human Resources Program in Energy Technology” of the Korea Institute of Energy Technology Evaluation and Planning (KETEP), granted financial resources from the Ministry of Trade, Industry & Energy, Republic of Korea (No. 20204010600090).
Abstract: Gait recognition is an active research area that uses a person's walking pattern to identify the subject correctly. Human Gait Recognition (HGR) is performed without any cooperation from the individual. However, in practice, it remains a challenging task under diverse walking sequences due to covariant factors such as normal walking and walking while wearing a coat. Researchers, over the years, have worked on successfully identifying subjects using different techniques, but there is still room for improvement in accuracy due to these covariant factors. This paper proposes an automated model-free framework for human gait recognition. The proposed method has four critical steps. First, optical flow-based motion region estimation and dynamic coordinates-based cropping are performed. The second step involves training a fine-tuned pre-trained MobileNetV2 model on both original and optical flow cropped frames; the training is conducted using static hyperparameters. The third step applies a fusion technique known as normal distribution serially fusion. In the fourth step, a better optimization algorithm is applied to select the best features, which are then classified using a Bi-Layered neural network. Three publicly available datasets, CASIA A, CASIA B, and CASIA C, were used in the experimental process, and average accuracies of 99.6%, 91.6%, and 95.02% were obtained, respectively. The proposed framework achieved improved accuracy compared to the other methods.
Funding: Supported by the National Natural Science Foundation of China (No. 62107014, Jian P.; No. 62177025, He B.), the Key R&D and Promotion Projects of Henan Province (No. 212102210147, Jian P.), and the Innovative Education Program for Graduate Students at North China University of Water Resources and Electric Power, China (No. YK-2021-99, Guo F.).
Abstract: This paper presents an end-to-end deep learning method to solve geometry problems via feature learning and contrastive learning of multimodal data. A key challenge in solving geometry problems using deep learning is to automatically adapt to the task of understanding single-modal and multimodal problems. Existing methods focus on either single-modal or multimodal problems, and they cannot be applied to both. A general geometry problem solver should be able to process problems of various modalities at the same time. In this paper, a shared feature-learning model of multimodal data is adopted to learn the unified feature representation of text and image, which can solve the heterogeneity issue between multimodal geometry problems. A contrastive learning model of multimodal data enhances the semantic relevance between multimodal features and maps them into a unified semantic space, which can effectively adapt to both single-modal and multimodal downstream tasks. Based on the feature extraction and fusion of multimodal data, the proposed geometry problem solver uses relation extraction, theorem reasoning, and problem solving to present solutions in a readable way. Experimental results show the effectiveness of the method.
Funding: Supported in part by the National Natural Science Foundation of China (Grants 62376172, 62006163, 62376043), in part by the National Postdoctoral Program for Innovative Talents (Grant BX20200226), and in part by the Sichuan Science and Technology Planning Project (Grants 2022YFSY0047, 2022YFQ0014, 2023ZYD0143, 2022YFH0021, 2023YFQ0020, 24QYCX0354, 24NSFTD0025).
Abstract: Time series anomaly detection is crucial in various industrial applications to identify unusual behaviors within time series data. Due to the challenges associated with annotating anomaly events, time series reconstruction has become a prevalent approach for unsupervised anomaly detection. However, effectively learning representations and achieving accurate detection results remain challenging due to the intricate temporal patterns and dependencies in real-world time series. In this paper, we propose a cross-dimension attentive feature fusion network for time series anomaly detection, referred to as CAFFN. Specifically, a series and feature mixing block is introduced to learn representations in 1D space. Additionally, a fast Fourier transform is employed to convert the time series into 2D space, providing the capability for 2D feature extraction. Finally, a cross-dimension attentive feature fusion mechanism is designed that adaptively integrates features across different dimensions for anomaly detection. Experimental results on real-world time series datasets demonstrate that CAFFN performs better than other competing methods in time series anomaly detection.
Abstract: Fine-grained image search is one of the most challenging tasks in computer vision; it aims to retrieve images that are similar to a given query image at the fine-grained level. The key objective is to learn discriminative fine-grained features by training deep models such that similar images are clustered and dissimilar images are separated in a low-dimensional embedding space. Previous works primarily focused on defining local structure loss functions such as triplet loss and pairwise loss. However, these approaches take a long time to train and have poor accuracy. Additionally, representations learned through them tend to tighten up in the embedding space and lose generalizability to unseen classes. This paper proposes a noise-assisted representation learning method for fine-grained image retrieval to mitigate these issues. In the proposed work, class manifold learning is performed in which positive pairs are created with a noise insertion operation instead of tightening class clusters, and other instances within the same cluster are treated as negatives. A loss function is then defined that penalizes the model when the distance between instances of the same class becomes too small relative to the noise pair of that class in the embedding space. The proposed approach is validated on the CARS-196 and CUB-200 datasets and achieves better retrieval results (85.38% recall@1 for CARS-196 and 70.13% recall@1 for CUB-200) than other existing methods.
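As an aside on the reported metric, the following is a minimal sketch of how recall@1 is commonly computed for retrieval: each item is used as a query, and the query counts as a hit if its nearest other item in the embedding space has the same class label. The function name, the Euclidean distance, and the toy data are assumptions for illustration; the abstract does not describe the authors' evaluation code.

```python
import math


def recall_at_1(embeddings, labels):
    """Fraction of queries whose nearest *other* item shares the query's label."""
    hits = 0
    for i, query in enumerate(embeddings):
        best_j, best_d = None, math.inf
        for j, candidate in enumerate(embeddings):
            if j == i:
                continue  # a query must not retrieve itself
            d = math.dist(query, candidate)  # Euclidean distance (assumed)
            if d < best_d:
                best_j, best_d = j, d
        hits += labels[best_j] == labels[i]
    return hits / len(embeddings)


# Toy 2-D embeddings (hypothetical data, not from the paper).
toy_embeddings = [(0.0, 0.0), (0.1, 0.0), (1.0, 1.0), (1.1, 1.0), (0.2, 0.1)]
toy_labels = ["a", "a", "b", "b", "b"]
print(recall_at_1(toy_embeddings, toy_labels))
```

In the toy data the last point carries label "b" but sits next to the "a" cluster, so it is the only query that misses.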
Funding: This work was supported by the Research Deanship of Prince Sattam Bin Abdulaziz University, Al-Kharj, Saudi Arabia (Grant No. 2020/01/17215). The author also thanks the Deanship of the College of Computer Engineering and Sciences for the technical support provided to complete the project successfully.
Abstract: In the era of big data, learning discriminant feature representations from network traffic has been identified as an essential task for improving the detection ability of an intrusion detection system (IDS). Owing to the lack of accurately labeled network traffic data, many unsupervised feature representation learning models have been proposed with state-of-the-art performance. Yet, these models fail to consider the classification error while learning the feature representation. Intuitively, the learnt feature representation may therefore degrade the performance of the classification task. For the first time in the field of intrusion detection, this paper proposes an unsupervised IDS model leveraging the benefits of a deep autoencoder (DAE) for learning a robust feature representation and a one-class support vector machine (OCSVM) for finding a more compact decision hyperplane for intrusion detection. Specifically, the proposed model defines a new unified objective function to minimize the reconstruction and classification errors simultaneously. This contribution not only enables the model to support joint learning of the feature representation and the classifier but also guides it to learn a robust feature representation that can improve the discrimination ability of the classifier for intrusion detection. Three sets of evaluation experiments are conducted to demonstrate the potential of the proposed model. First, the ablation evaluation on the benchmark dataset NSL-KDD validates the design decisions of the proposed model. Next, the performance evaluation on the recent intrusion dataset UNSW-NB15 demonstrates the stable performance of the proposed model. Finally, the comparative evaluation verifies the efficacy of the proposed model against recently published state-of-the-art methods.
Abstract: Industrial Control Systems (ICS) or SCADA networks are increasingly targeted by cyber-attacks as their architectures have shifted from proprietary hardware, software and protocols to standard and open-source ones. Furthermore, these systems, which used to be isolated, are now interconnected to corporate networks and to the Internet. Among the countermeasures to mitigate the threats, anomaly detection systems play an important role as they can help detect even unknown attacks. Deep learning, which has gained great attention in the last few years due to excellent results in image, video and natural language processing, is being used for anomaly detection in information security, particularly in SCADA networks. The salient features of the data from SCADA networks are learnt as hierarchical representations using deep architectures, and those learnt features are used to classify the data as normal or anomalous. This article is a review of various architectures such as Convolutional Neural Network (CNN), Recurrent Neural Network (RNN), Stacked Autoencoder (SAE), Long Short Term Memory (LSTM), or a combination of those architectures, for anomaly detection in SCADA networks.
Funding: Supported in part by the National Natural Science Foundation of China under Grants 62177029 and 62307025, in part by the Startup Foundation for Introducing Talent of Nanjing University of Posts and Communications under Grant NY221041, and in part by the General Project of The Natural Science Foundation of Jiangsu Higher Education Institution of China (22KJB520025, 23KJD580).
Abstract: Visible-infrared Cross-modality Person Re-identification (VI-ReID) is a critical technology in smart public facilities such as cities, campuses and libraries. It aims to match pedestrians in visible light and infrared images for video surveillance, which poses a challenge in exploring cross-modal shared information accurately and efficiently. Therefore, multi-granularity feature learning methods have been applied in VI-ReID to extract potential multi-granularity semantic information related to pedestrian body structure attributes. However, existing research mainly uses traditional dual-stream fusion networks and overlooks the core of cross-modal learning networks, the fusion module. This paper introduces a novel network called the Augmented Deep Multi-Granularity Pose-Aware Feature Fusion Network (ADMPFF-Net), incorporating the Multi-Granularity Pose-Aware Feature Fusion (MPFF) module to generate discriminative representations. MPFF efficiently explores and learns global and local features with multi-level semantic information by inserting disentangling and duplicating blocks into the fusion module of the backbone network. ADMPFF-Net also provides a new perspective for designing multi-granularity learning networks. By incorporating the multi-granularity feature disentanglement (mGFD) and posture information segmentation (pIS) strategies, it extracts more representative features concerning body structure information. The Local Information Enhancement (LIE) module augments high-performance features in VI-ReID, and the multi-granularity joint loss supervises model training for objective feature learning. Experimental results on two public datasets show that ADMPFF-Net efficiently constructs pedestrian feature representations and enhances the accuracy of VI-ReID.
Funding: Projects (62125306, 62133003) supported by the National Natural Science Foundation of China; Project (TPL2019C03) supported by the Open Fund of Science and Technology on Thermal Energy and Power Laboratory, China; Project supported by the Fundamental Research Funds for the Central Universities (Zhejiang University NGICS Platform), China.
Abstract: With the increasing complexity of industrial processes, high-dimensional industrial data exhibit strong nonlinearity, bringing considerable challenges to the fault diagnosis of industrial processes. To efficiently extract deep meaningful features that are crucial for fault diagnosis, a sparse Gaussian feature extractor (SGFE) is designed to learn a nonlinear mapping that projects the raw data into a feature space with the fault label dimension. The feature space is described by the one-hot encoding of the fault category label as an orthogonal basis. In this way, the deep sparse Gaussian features related to fault categories can be gradually learned from the raw data by SGFE. In the feature space, the sparse Gaussian (SG) loss function is designed to constrain the distribution of features to multiple sparse multivariate Gaussian distributions. The sparse Gaussian features are linearly separable in the feature space, which is conducive to improving the accuracy of the downstream fault classification task. The feasibility and practical utility of the proposed SGFE are verified on the MNIST handwritten digits benchmark and the Tennessee-Eastman (TE) benchmark process, respectively.
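To make the phrase "one-hot encoding of the fault category label as an orthogonal basis" concrete, here is a small sketch. It only illustrates that one-hot vectors are mutually orthogonal unit vectors and that a feature lying near one of them can be read off with an argmax. The helper names and the three-class setup are hypothetical, and the sketch does not implement the SG loss itself.

```python
def one_hot(label, num_classes):
    """Return the one-hot vector for an integer class label."""
    return [1.0 if k == label else 0.0 for k in range(num_classes)]


def dot(u, v):
    """Plain dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def nearest_class(feature):
    """Index of the basis vector the feature is most aligned with (argmax)."""
    return max(range(len(feature)), key=lambda k: feature[k])


num_classes = 3  # hypothetical number of fault categories
basis = [one_hot(k, num_classes) for k in range(num_classes)]

# Distinct one-hot vectors have dot product 0; each has unit norm.
for i in range(num_classes):
    for j in range(num_classes):
        print(i, j, dot(basis[i], basis[j]))

# A feature clustered around the basis vector of class 1 (toy values).
print(nearest_class([0.05, 0.90, 0.10]))
```

Because the basis vectors are orthogonal, clusters centred on different vectors can be separated by hyperplanes, which is consistent with the linear separability the abstract describes.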
Abstract: Dear Sir, I am Dr. Kavitha S, from the Department of Electronics and Communication Engineering, Nandha Engineering College, Erode, Tamil Nadu, India. I write to present the detection of glaucoma using extreme learning machine (ELM) and fractal feature analysis. Glaucoma is the second most frequent cause of permanent blindness in industrial
Funding: Supported by the Natural Science Foundation of Henan Province (232300420094) and the Science and Technology Research Project of Henan Province (222102220092).
Abstract: Intelligent diagnosis of mechanical faults driven by big data is an important means of ensuring the safe operation of equipment. Among these methods, deep learning-based machinery fault diagnosis approaches have received increasing attention and achieved some results. However, using transfer learning alone may lead to insufficient performance, and domain bias may cause misclassification of target samples when deep models are built to learn domain-invariant features. To address these problems, a deep discriminative adversarial domain adaptation neural network (DDADAN) is proposed for bearing fault diagnosis. In this method, the raw vibration data are first converted into frequency domain data by the Fast Fourier Transform, and an improved deep convolutional neural network with wide first-layer kernels is used as a feature extractor to extract deep fault features. Then, domain-invariant features are learned from the fault data with correlation alignment-based domain adversarial training. Furthermore, to enhance the discriminative property of the features, discriminative feature learning is embedded into this network to make the features compact within each class and separable between classes. Finally, the performance and anti-noise capability of the proposed method are evaluated using two sets of bearing fault datasets. The results demonstrate that the proposed method is capable of handling domain offset caused by different working conditions and maintains more than 97.53% accuracy on various transfer tasks. Furthermore, the proposed method achieves high diagnostic accuracy under varying noise levels.
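The first preprocessing step named above, converting raw vibration data to the frequency domain, can be sketched as follows. This is a hedged illustration only: it uses a naive discrete Fourier transform (which gives the same values an FFT would, just more slowly) on a synthetic sine wave, and the function name, normalization, and signal parameters are assumptions rather than details from the paper.

```python
import cmath
import math


def magnitude_spectrum(signal):
    """One-sided magnitude spectrum via a naive O(n^2) DFT, scaled by 1/n."""
    n = len(signal)
    spectrum = []
    for k in range(n // 2 + 1):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(signal)
        )
        spectrum.append(abs(coeff) / n)
    return spectrum


# Synthetic "vibration" signal: 5 full cycles over 64 samples (toy data).
n_samples = 64
signal = [math.sin(2 * math.pi * 5 * t / n_samples) for t in range(n_samples)]

spectrum = magnitude_spectrum(signal)
peak_bin = max(range(len(spectrum)), key=spectrum.__getitem__)
print(peak_bin)  # index of the dominant frequency bin
```

A network fed with `spectrum` instead of `signal` sees the energy concentrated at the dominant bin, which is one common motivation for working in the frequency domain.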
Funding: Sponsored by the Shanxi Provincial College Teaching Reform Innovation Funding Project (Grant No. 201901d111270) and the Natural Science Foundation of Shanxi Province (Grant No. 201701d11127).
Abstract: To gain a more comprehensive understanding of foam aluminum's performance and to evaluate it, researchers have introduced various characterization indicators. However, the current understanding of the significance of these indicators in analyzing foam aluminum's performance is limited. This study employs the Generalized Regression Neural Network (GRNN) method to establish a model that links foam aluminum's microstructure characterization data with its mechanical properties. Through the GRNN model, the researchers extracted the four most crucial features and their corresponding weight values from the 13 pore characteristics of foam aluminum. Subsequently, a new characterization formula, called “Wang equivalent porosity” (WEP), was developed by using residual weights assigned to the feature weights, and four parameter coefficients were obtained. This formula aims to represent the relationship between foam aluminum's microstructural features and its mechanical performance. Furthermore, the researchers conducted model verification using compression data from 11 sets of foam aluminum. The validation results showed that among these 11 foam aluminum datasets, the Gibson-Ashby formula yielded anomalous results in two cases, whereas WEP exhibited exceptional stability without any anomalies. In comparison to the Gibson-Ashby formula, WEP demonstrated an 18.18% improvement in evaluation accuracy.
Funding: Supported by Hebei Provincial Natural Science Foundation of China (Grant No. F2016203421).
Abstract: The performance of traditional vibration-based fault diagnosis methods greatly depends on hand-crafted features extracted using signal processing algorithms, which require significant amounts of domain knowledge and human labor and do not generalize well to new diagnosis domains. Recently, unsupervised representation learning has provided a promising alternative to feature extraction in traditional fault diagnosis due to its superior ability to learn from unlabeled data. Given that vibration signals usually contain multiple temporal structures, this paper proposes a multiscale representation learning (MSRL) framework to learn useful features directly from raw vibration signals, with the aim of capturing rich and complementary fault pattern information at different scales. In our proposed approach, a coarse-grained procedure is first employed to obtain multiple scale signals from an original vibration signal. Then, sparse filtering, a newly developed unsupervised learning algorithm, is applied to automatically learn useful features from each scale signal, and the learned features at each scale are concatenated one by one to obtain multiscale representations. Finally, the multiscale representations are fed into a supervised classifier to obtain diagnosis results. Our proposed approach is evaluated using two different case studies: motor bearing and wind turbine gearbox fault diagnosis. Experimental results show that the proposed MSRL approach can take full advantage of the availability of unlabeled data to learn discriminative features and achieves better performance, with higher accuracy and stability, than the traditional approaches.
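The coarse-grained procedure mentioned above is not spelled out in the abstract. A common definition, assumed here, averages the signal over non-overlapping windows of length equal to the scale factor, so scale 1 returns the original signal and larger scales give progressively smoother, shorter signals. The function names and the toy signal below are illustrative only.

```python
def coarse_grain(signal, scale):
    """Average non-overlapping windows of length `scale` (trailing samples dropped)."""
    if scale < 1:
        raise ValueError("scale must be a positive integer")
    n_windows = len(signal) // scale
    return [
        sum(signal[i * scale:(i + 1) * scale]) / scale for i in range(n_windows)
    ]


def multiscale_signals(signal, max_scale):
    """Map each scale 1..max_scale to its coarse-grained version of the signal."""
    return {s: coarse_grain(signal, s) for s in range(1, max_scale + 1)}


raw = [1, 2, 3, 4, 5, 6]  # toy vibration samples
for scale, values in multiscale_signals(raw, 3).items():
    print(scale, values)
```

In a pipeline of the kind described, each of these scale signals would be passed to the feature learner separately and the resulting feature vectors concatenated.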
Funding: Project (21878081) supported by the National Natural Science Foundation of China; Project (222201917006) supported by the Fundamental Research Funds for the Central Universities, China.
Abstract: As a new neural network model, the extreme learning machine (ELM) has a good learning rate and generalization ability. However, ELM with a single hidden layer structure often fails to achieve good results when faced with large-scale multi-featured problems. To resolve this problem, we propose a multi-layer framework for the ELM learning algorithm to improve the model’s generalization ability. Moreover, noise or abnormal points often exist in practical applications, making it impossible to obtain clean training data. The generalization ability of the original ELM decreases under such circumstances. To address this issue, we add model bias and variance to the loss function so that the model gains the ability to minimize model bias and model variance, thus reducing the influence of noise signals. A new robust multi-layer algorithm called ML-RELM is proposed to enhance outlier robustness in complex datasets. Simulation results show that the method has high generalization ability and strong robustness to noise.
Funding: Funded by Huanggang Normal University, China, Self-type Project of 2021 (No. 30120210103) and 2022 (No. 2042021008).
Abstract: Object detection in images has been identified as a critical area of research in computer vision and image processing. Researchers have developed several novel methods for determining an object’s location and category from an image. However, there is still room for improvement in terms of detection efficiency. This study aims to develop a technique for detecting objects in images. To enhance overall detection performance, we treat object detection as a two-fold problem comprising localization and classification. The proposed method generates class-independent, high-quality, and precise proposals using an agglomerative clustering technique. We then combine these proposals with the relevant input image to train our network on convolutional features. Next, a network refinement module reduces the number of generated proposals to produce fewer high-quality candidate proposals. Finally, the revised candidate proposals are sent into the network’s detection process to determine the object type. The algorithm’s performance is evaluated using the publicly available PASCAL Visual Object Classes Challenge 2007 (VOC2007), VOC2012, and Microsoft Common Objects in Context (MS-COCO) datasets. Using only 100 proposals per image at intersection over union (IoU) thresholds of 0.5 and 0.7, the proposed method attains Detection Recall (DR) rates of (93.17% and 79.35%) and (69.4% and 58.35%), and Mean Average Best Overlap (MABO) values of (79.25% and 62.65%), for the VOC2007 and MS-COCO datasets, respectively. Besides, it achieves a Mean Average Precision (mAP) of (84.7% and 81.5%) on both VOC datasets. The experimental findings reveal that our method exceeds previous approaches in terms of overall detection performance, proving its effectiveness.
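For readers unfamiliar with the metrics quoted above, the sketch below shows the standard intersection-over-union computation for axis-aligned boxes and a simple detection recall at a given IoU threshold (the share of ground-truth boxes matched by at least one proposal). The box format `(x1, y1, x2, y2)`, the function names, and the toy boxes are assumptions for illustration and are not taken from the paper.

```python
def area(box):
    """Area of an axis-aligned box given as (x1, y1, x2, y2)."""
    return max(0, box[2] - box[0]) * max(0, box[3] - box[1])


def iou(a, b):
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    union = area(a) + area(b) - inter
    return inter / union if union > 0 else 0.0


def detection_recall(ground_truth, proposals, threshold):
    """Fraction of ground-truth boxes covered by some proposal at IoU >= threshold."""
    covered = sum(
        1 for gt in ground_truth if any(iou(gt, p) >= threshold for p in proposals)
    )
    return covered / len(ground_truth)


# Toy boxes (hypothetical): the first proposal overlaps well, the second poorly.
gt_boxes = [(0, 0, 10, 10), (20, 20, 30, 30)]
proposal_boxes = [(0, 0, 10, 6), (22, 22, 32, 32)]

print(iou(gt_boxes[0], proposal_boxes[0]))
print(detection_recall(gt_boxes, proposal_boxes, 0.5))
print(detection_recall(gt_boxes, proposal_boxes, 0.7))
```

Raising the threshold from 0.5 to 0.7 can only lower the recall, which matches the pattern in the figures reported in the abstract.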
Funding: Supported by the National Natural Science Foundation of China (61471154, 61876057) and the Key Research and Development Program of Anhui Province-Special Project of Strengthening Science and Technology Police (202004D07020012).
Abstract: Person re-identification is a prevalent technology deployed in intelligent surveillance. Person re-identification methods have made remarkable achievements under the assumption that all person images have a sufficiently high resolution, yet such models are not applicable to the open world. In the real world, the changing distance between pedestrians and the camera makes the resolution of captured pedestrian images inconsistent. When low-resolution (LR) images in the query set are matched with high-resolution (HR) images in the gallery set, the performance of the pedestrian matching task degrades because critical pedestrian information is absent from the LR images. To address these issues, we present a dual-stream coupling network with wavelet transform (DSCWT) for the cross-resolution person re-identification task. First, we use the multi-resolution analysis principle of the wavelet transform to separately process the low-frequency and high-frequency regions of LR images, which is applied to restore the lost detail information of LR images. Then, we devise a residual knowledge constrained loss function that transfers knowledge between the two streams of LR images and HR images to obtain pedestrian-invariant features at various resolutions. Extensive qualitative and quantitative experiments across four benchmark datasets verify the superiority of the proposed approach.
Funding: Project supported by the National Natural Science Foundation of China (No. 61379074) and the Zhejiang Provincial Natural Science Foundation of China (Nos. LZ12F02003 and LY15F020035).
Abstract: In some image classification tasks, the degree of similarity varies between categories, and samples are usually misclassified into highly similar categories. To distinguish highly similar categories, more specific features are required so that the classifier can improve the classification performance. In this paper, we propose a novel two-level hierarchical feature learning framework based on the deep convolutional neural network (CNN), which is simple and effective. First, the deep feature extractors of different levels are trained using the transfer learning method that fine-tunes the pre-trained deep CNN model toward the new target dataset. Second, the general feature extracted from all the categories and the specific feature extracted from highly similar categories are fused into a feature vector. Then the final feature representation is fed into a linear classifier. Finally, experiments using the Caltech-256, Oxford Flower-102, and Tasmania Coral Point Count (CPC) datasets demonstrate that the expression ability of the deep features resulting from two-level hierarchical feature learning is powerful. Our proposed method effectively increases the classification accuracy in comparison with flat multiple classification methods.
Funding: Supported by the National Natural Science Foundation of China (Nos. 61034002, 61233001 and 61273140).
Abstract: Tremendous amounts of data are being generated and saved in many complex engineering and social systems every day. It is significant and feasible to utilize big data to make better decisions using machine learning techniques. In this paper, we focus on batch reinforcement learning (RL) algorithms for discounted Markov decision processes (MDPs) with large discrete or continuous state spaces, aiming to learn the best possible policy given a fixed amount of training data. Batch RL algorithms with handcrafted feature representations work well for low-dimensional MDPs. However, for many real-world RL tasks, which often involve high-dimensional state spaces, it is difficult and even infeasible to use feature engineering methods to design features for value function approximation. To cope with high-dimensional RL problems, the desire to obtain data-driven features has led to many works on incorporating feature selection and feature learning into traditional batch RL algorithms. In this paper, we provide a comprehensive survey on automatic feature selection and unsupervised feature learning for high-dimensional batch RL. Moreover, we present recent theoretical developments on applying statistical learning to establish finite-sample error bounds for batch RL algorithms based on weighted Lp norms. Finally, we derive some future directions in the research of RL algorithms, theories and applications.
Funding: Supported by the National Natural Science Foundation of China (Nos. 61272211 and 61170126) and the Six Talent Peaks Foundation of Jiangsu Province, China (No. DZXX027).
Abstract: Emotion-based features are critical for achieving high performance in a speech emotion recognition (SER) system. In general, it is difficult to develop these features due to the ambiguity of the ground truth. In this paper, we apply several unsupervised feature learning algorithms (including K-means clustering, the sparse auto-encoder, and sparse restricted Boltzmann machines), which have promise for learning task-related features from unlabeled data, to speech emotion recognition. We then evaluate the performance of the proposed approach and present a detailed analysis of the effect of two important factors in the model setup: the content window size and the number of hidden layer nodes. Experimental results show that larger content windows and more hidden nodes contribute to higher performance. We also show that the two-layer network cannot explicitly improve performance compared to a single-layer network.
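The abstract does not define "content window size". One plausible reading, assumed here, is the number of adjacent acoustic frames concatenated into a single input vector before feature learning. The sketch below stacks each frame with its neighbours, repeating the edge frames as padding; the function name, the padding choice, and the toy frames are all hypothetical.

```python
def stack_context(frames, window):
    """Concatenate each frame with its neighbours inside an odd-sized window.

    Edge frames are repeated as padding so the output has one stacked
    vector per input frame.
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    padded = [frames[0]] * half + list(frames) + [frames[-1]] * half
    return [
        [value for frame in padded[i:i + window] for value in frame]
        for i in range(len(frames))
    ]


# Three toy frames with one feature each (hypothetical data).
toy_frames = [[1], [2], [3]]
print(stack_context(toy_frames, 3))
```

Under this reading, a larger window gives the feature learner more temporal context per input at the cost of a proportionally larger input dimension.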
Funding: Supported by the National Natural Science Foundation of China (61272357, 61300074, 61572075).
Abstract: As a significant branch of intelligent vehicle networking technology, intelligent fatigue driving detection is introduced in this paper to recognize the fatigue state of the vehicle driver and avoid traffic accidents. The disadvantages of traditional fatigue driving detection methods are pointed out through a study of traditional eye tracking technology and traditional artificial neural networks. On the basis of image topological analysis, Haar-like features and the extreme learning machine algorithm, a new intelligent fatigue driving detection method is proposed. The detailed algorithm and realization scheme of the method are also put forward. Finally, by comparing the results of simulation experiments, the new method is verified to have better robustness, efficiency and accuracy in monitoring and tracking drivers' fatigue driving using human eye tracking technology.