Fault diagnosis is important for maintaining the safety and effectiveness of chemical processes. Considering the multivariate, nonlinear, and dynamic characteristics of chemical processes, many time-series-based data-driven fault diagnosis methods have been developed in recent years. However, the existing methods suffer from the long-term dependency problem and are difficult to train because they are trained sequentially. To overcome these problems, a novel fault diagnosis method based on time series and the hierarchical multihead self-attention (HMSAN) is proposed for chemical processes. First, a sliding window strategy is adopted to construct the normalized time-series dataset. Second, the HMSAN is developed to extract time-relevant features from the time-series process data. It improves the basic self-attention model in both width and depth. With the multihead structure, the HMSAN can attend to different aspects of the complicated chemical process and obtain global dynamic features. However, the multiple heads in parallel produce redundant information, which does not improve diagnosis performance. With the hierarchical structure, the redundant information is reduced and deep local time-related features are further extracted. In addition, a novel many-to-one training strategy is introduced for HMSAN to simplify the training procedure and capture long-term dependencies. Finally, the effectiveness of the proposed method is demonstrated on two chemical cases. The experimental results show that the proposed method performs well on time-series industrial data and outperforms state-of-the-art approaches.
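The sliding-window construction and the many-to-one labeling mentioned in this abstract can be sketched as follows. This is a minimal illustration, not the authors' code: the z-score normalization, the function names `normalize` and `sliding_windows`, the window width, and the choice to label each window with the label of its last sample are all assumptions made for the example.

```python
from statistics import mean, pstdev

def normalize(series):
    """Z-score each variable (column) of a list of samples (assumed normalization)."""
    columns = list(zip(*series))
    mus = [mean(c) for c in columns]
    # Fall back to 1.0 when a column is constant, to avoid dividing by zero.
    sigmas = [pstdev(c) or 1.0 for c in columns]
    return [[(x - m) / s for x, m, s in zip(row, mus, sigmas)] for row in series]

def sliding_windows(series, labels, width, stride=1):
    """Return (window, label) pairs; many-to-one: one label per whole window.

    Assumption: the window inherits the label of its final time step.
    """
    pairs = []
    for start in range(0, len(series) - width + 1, stride):
        end = start + width
        pairs.append((series[start:end], labels[end - 1]))
    return pairs

# Toy process data: 5 time steps, 2 variables, fault label 1 from step 3 onward.
raw = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0], [4.0, 40.0], [5.0, 50.0]]
labels = [0, 0, 0, 1, 1]
dataset = sliding_windows(normalize(raw), labels, width=3)
```

Each element of `dataset` is a window of three consecutive normalized samples paired with a single class label, which is the shape a many-to-one sequence classifier would consume.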
Visual object tracking plays a crucial role in computer vision. In recent years, researchers have proposed various methods to achieve high-performance object tracking. Among these, Transformer-based methods have become a research hotspot because of their ability to model information globally and in context. However, current Transformer-based object tracking methods still face challenges such as low tracking accuracy and redundant feature information. In this paper, we introduce the self-calibration multi-head self-attention Transformer (SMSTracker) as a solution to these challenges. It employs a hybrid tensor decomposition self-organizing multi-head self-attention Transformer mechanism, which not only compresses and accelerates Transformer operations but also significantly reduces redundant data, thereby enhancing the accuracy and efficiency of tracking. Additionally, we introduce a self-calibration attention fusion block to resolve the attention ambiguities and inconsistencies common in traditional tracking methods, ensuring stable and reliable tracking performance across various scenarios. Experimental results show that SMSTracker achieves competitive performance in visual object tracking, demonstrating its potential to provide more robust and efficient tracking solutions in real-world applications.
Satellite-terrestrial networks can transcend the geographical constraints inherent in traditional communication networks, enabling global coverage and offering users ubiquitous computing power support, which makes them an important development direction for future communications. In this paper, we consider a multi-scenario network model under the coverage of a low earth orbit (LEO) satellite, which can provide computing resources to users in remote areas to improve task processing efficiency. However, LEO satellites have limited computing and communication resources, and the channels are time-varying and complex, which makes extracting state information difficult. Therefore, we explore the dynamic resource management problem of joint computing and communication resource allocation and power control for multi-access edge computing (MEC). To tackle this problem, we formulate it as a Markov decision process (MDP) and propose the self-attention based dynamic resource management (SABDRM) algorithm, which effectively extracts state information features to enhance the training process. Simulation results show that the proposed algorithm effectively reduces the long-term average delay and energy consumption of the tasks.
In aerial target recognition, the recognition error produced by a single sensor measurement is relatively large because of noise. In addition, it is difficult to apply machine learning methods to improve recognition performance because few or no actual measurement samples are available. To address these problems, an aerial target recognition algorithm based on self-attention and the Long Short-Term Memory (LSTM) network is proposed. The LSTM can effectively extract temporal dependencies. The attention mechanism calculates a weight for each input element and applies the weight to the hidden state of the LSTM, thereby adjusting the LSTM's attention to the input. This combination retains the learning ability of the LSTM and adds the advantages of the attention mechanism, giving the model stronger feature extraction ability and adaptability when processing sequence data. In addition, based on prior information about the multidimensional characteristics of the target, the three-point estimation method is adopted to simulate an aerial target recognition dataset for training the recognition model. The experimental results show that the proposed algorithm achieves more than 91% recognition accuracy, a lower false alarm rate, and higher robustness than multi-attribute decision-making (MADM) based on fuzzy numbers.
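The step in which attention weights are computed per input element and applied to the LSTM hidden states can be sketched as below. This is an illustrative reading only: the dot-product scoring against a single query vector, the names `softmax` and `attention_pool`, and the toy hidden states are assumptions, since the abstract does not specify the scoring function.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_pool(hidden_states, query):
    """Weight each hidden state by its dot-product score with `query` (assumed scoring)."""
    scores = [sum(h_i * q_i for h_i, q_i in zip(h, query)) for h in hidden_states]
    weights = softmax(scores)
    dim = len(hidden_states[0])
    # Context vector: attention-weighted sum of the hidden states.
    context = [sum(w * h[d] for w, h in zip(weights, hidden_states)) for d in range(dim)]
    return weights, context

# Stand-in for LSTM hidden states at three time steps (hypothetical values).
hidden_states = [[0.1, 0.0], [0.5, 0.2], [0.9, 0.4]]
query = [1.0, 1.0]
weights, context = attention_pool(hidden_states, query)
```

The weights sum to one, so the context vector is a convex combination of the hidden states in which the highest-scoring time steps dominate.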
Frequent missing values in radar-derived time-series tracks of aerial targets (RTT-AT) pose significant challenges for subsequent data-driven tasks. However, most imputation research focuses on random missing (RM) patterns, which differ significantly from the missing patterns common in RTT-AT. Methods designed for RM may degrade or fail when applied to RTT-AT imputation. Conventional autoregressive deep learning methods are prone to error accumulation and loss of long-term dependencies. In this paper, a non-autoregressive imputation model is proposed that addresses missing value imputation for two common missing patterns in RTT-AT. The model consists of two probabilistic sparse diagonal masking self-attention (PSDMSA) units and a weight fusion unit. It learns missing values by combining the representations output by the two units, aiming to minimize the difference between the imputed values and their actual values. The PSDMSA units effectively capture temporal dependencies and attribute correlations between time steps, improving imputation quality. The weight fusion unit automatically updates the weights of the output representations from the two units to obtain a more accurate final representation. The experimental results indicate that, across varying missing rates in the two missing patterns, the model consistently outperforms other methods in imputation performance and exhibits a low frequency of deviations in estimates for specific missing entries. Compared with the state-of-the-art autoregressive deep learning imputation model Bidirectional Recurrent Imputation for Time Series (BRITS), the proposed model reduces mean absolute error (MAE) by 31% to 50%. Additionally, the model trains 4 to 8 times faster than both BRITS and a standard Transformer model on the same dataset. Finally, the ablation experiments demonstrate that the PSDMSA, the weight fusion unit, the cascade network design, and the imputation loss each enhance imputation performance, confirming the efficacy of the design.
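The diagonal-masking idea named in PSDMSA can be illustrated with plain scaled dot-product self-attention in which each time step is prevented from attending to itself, so its representation must be built from the other steps. This is a simplified sketch under stated assumptions: queries, keys, and values all equal the input (no learned projections), the probabilistic sparse component is omitted entirely, and the function name `diagonal_masked_attention` is made up for the example.

```python
import math

def diagonal_masked_attention(x):
    """Self-attention over rows of `x` with the diagonal masked out.

    Assumptions: Q = K = V = x, at least two time steps, no sparsification.
    """
    n, d = len(x), len(x[0])
    scale = math.sqrt(d)
    output, weights = [], []
    for i in range(n):
        # A score of -inf on the diagonal yields an attention weight of exactly 0.
        scores = [
            float("-inf") if i == j
            else sum(a * b for a, b in zip(x[i], x[j])) / scale
            for j in range(n)
        ]
        peak = max(scores)
        exps = [math.exp(s - peak) for s in scores]
        total = sum(exps)
        row = [e / total for e in exps]
        weights.append(row)
        output.append([sum(row[j] * x[j][k] for j in range(n)) for k in range(d)])
    return output, weights

# Three time steps with two attributes each (toy values).
track = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
imputed_repr, attn = diagonal_masked_attention(track)
```

Because step i contributes nothing to its own output, the representation at a missing entry cannot simply copy a placeholder value, which is the motivation usually given for this kind of mask in imputation models.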
A false data injection attack (FDIA) can corrupt the state estimation of the power grid by tampering with measured grid data, thereby disrupting the stable operation of the smart grid. Existing work usually trains a detection model by fusing data-driven features from diverse power data streams. Data-driven features, however, cannot effectively capture the differences between noisy data and attack samples. As a result, slight noise disturbances in the power grid may cause a large number of false FDIA detections. To address this problem, this paper designs a deep collaborative self-attention network for robust FDIA detection, in which the spatio-temporal features of cascaded FDIA attacks are fully integrated. First, a graph convolution module based on high-order Chebyshev polynomials is designed to effectively aggregate spatial information between grid nodes, and a spatial self-attention mechanism dynamically assigns attention weights to each node, guiding the network to focus on the node information most useful for FDIA detection. Furthermore, a bi-directional Long Short-Term Memory (LSTM) network is introduced for time series modeling and long-term dependence analysis of power grid data, and a temporal self-attention mechanism describes the time correlation of the data and assigns different weights to different time steps. The resulting deep collaborative network can effectively mine subtle perturbations from spatio-temporal feature information, distinguish power grid noise from FDIA attacks, and adapt to diverse attack intensities. Extensive experiments on actual load data from the New York Independent System Operator (NYISO) in the IEEE 14, IEEE 39, and IEEE 118 bus systems demonstrate that the method detects attacks effectively and outperforms state-of-the-art FDIA detection schemes in detection accuracy and robustness.
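A Chebyshev-polynomial graph filter of the kind this abstract refers to can be sketched with the standard recurrence T_0(L)x = x, T_1(L)x = Lx, T_k(L)x = 2L T_{k-1}(L)x - T_{k-2}(L)x. The sketch below is not the paper's module: it assumes the scaled Laplacian is supplied already rescaled to have eigenvalues in [-1, 1], the 3-node path graph and the coefficients are toy values, and the names `matvec` and `chebyshev_filter` are hypothetical.

```python
import math

def matvec(matrix, vector):
    """Dense matrix-vector product on nested lists."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]

def chebyshev_filter(scaled_laplacian, signal, thetas):
    """Compute sum_k thetas[k] * T_k(L~) @ signal using the Chebyshev recurrence."""
    t_prev = list(signal)                        # T_0(L~) x = x
    out = [thetas[0] * v for v in t_prev]
    if len(thetas) == 1:
        return out
    t_curr = matvec(scaled_laplacian, signal)    # T_1(L~) x = L~ x
    out = [o + thetas[1] * v for o, v in zip(out, t_curr)]
    for theta in thetas[2:]:
        lt = matvec(scaled_laplacian, t_curr)
        t_next = [2.0 * a - b for a, b in zip(lt, t_prev)]
        out = [o + theta * v for o, v in zip(out, t_next)]
        t_prev, t_curr = t_curr, t_next
    return out

# Toy 3-bus path graph 0 - 1 - 2. Its normalized Laplacian has largest
# eigenvalue 2, so the scaled Laplacian is L - I = -D^{-1/2} A D^{-1/2}.
w = -1.0 / math.sqrt(2.0)
scaled_laplacian = [[0.0, w, 0.0], [w, 0.0, w], [0.0, w, 0.0]]
measurements = [1.0, 0.0, 0.0]   # hypothetical per-node signal
filtered = chebyshev_filter(scaled_laplacian, measurements, thetas=[0.5, 0.3, 0.2])
```

With K coefficients, each node's output mixes information from nodes up to K-1 hops away, which is why a higher polynomial order aggregates a wider neighborhood of the grid.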
In air traffic control communications (ATCC), misunderstandings between pilots and controllers can result in fatal aviation accidents. Advanced automatic speech recognition technology has emerged as a promising means of preventing miscommunication and enhancing aviation safety. However, most existing speech recognition methods merely incorporate external language models on the decoder side, leading to insufficient semantic alignment between the speech and text modalities during the encoding phase. Furthermore, modeling acoustic context dependencies over long distances is challenging because speech sequences are longer than text, especially for the extended ATCC data. To address these issues, we propose a speech-text multimodal dual-tower architecture for speech recognition. It employs cross-modal interactions to achieve close semantic alignment during the encoding stage and to strengthen its ability to model long-distance acoustic context dependencies. In addition, a two-stage training strategy is devised to derive semantics-aware acoustic representations effectively. The first stage pre-trains the speech-text multimodal encoding module to enhance inter-modal semantic alignment and long-distance acoustic context modeling. The second stage fine-tunes the entire network to bridge the input modality gap between the training and inference phases and to boost generalization performance. Extensive experiments demonstrate the effectiveness of the proposed speech-text multimodal speech recognition method on the ATCC and AISHELL-1 datasets. It reduces the character error rate to 6.54% and 8.73%, respectively, gains of 28.76% and 23.82% over the best baseline model. The case studies indicate that the semantics-aware acoustic representations help to accurately recognize terms with similar pronunciations but distinct meanings. The research provides a novel modeling paradigm for semantics-aware speech recognition in air traffic control communications, which could contribute to intelligent and efficient aviation safety management.
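The character error rate quoted in this abstract is conventionally the character-level edit distance between the reference and the hypothesis, divided by the reference length. The sketch below shows that standard definition together with a relative-reduction helper; the function names and the example strings and numbers are illustrative assumptions and are not taken from the paper.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance (substitutions, insertions, deletions) between two strings."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]

def character_error_rate(ref, hyp):
    """CER = edit distance / number of reference characters."""
    if not ref:
        raise ValueError("reference must be non-empty")
    return edit_distance(ref, hyp) / len(ref)

def relative_reduction(baseline, proposed):
    """Fractional improvement of `proposed` over `baseline` (lower is better)."""
    return (baseline - proposed) / baseline

# Illustrative call signs only; one substituted character out of four.
cer = character_error_rate("CCA1", "CCB1")
gain = relative_reduction(baseline=10.0, proposed=7.5)  # made-up error rates
```

Reporting both the absolute rate and the relative reduction, as the abstract does, lets readers compare results across datasets with different baseline difficulty.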
To predict renewable energy sources such as solar power in microgrids more accurately, a hybrid power prediction method is presented in this paper. First, a self-attention mechanism is added to a bidirectional gated recurrent neural network (BiGRU) to explore the time-series characteristics of solar power output and to account for the influence of different time nodes on the prediction results. Subsequently, an improved quantum particle swarm optimization (QPSO) algorithm is proposed to optimize the hyperparameters of the combined prediction model. The resulting LQPSO-BiGRU-self-attention hybrid model predicts solar power more effectively. In addition, considering the coordinated utilization of various energy sources such as electricity, hydrogen, and renewable energy, a multi-objective optimization model that accounts for both economic and environmental costs is constructed. A two-stage adaptive multi-objective quantum particle swarm optimization algorithm aided by a Lévy flight, named MO-LQPSO, is proposed for the comprehensive optimal scheduling of a multi-energy microgrid system. This algorithm effectively balances global and local search capabilities and improves the solution of complex nonlinear problems. The effectiveness and superiority of the proposed scheme are verified through comparative simulations.
Early diagnosis of stroke is critical for effective treatment, and the electroencephalogram (EEG) offers a low-cost, non-invasive solution. However, the shortage of high-quality patient EEG data often hampers the accuracy of deep-learning-based diagnostic classification methods. To address this issue, our study designed a deep data amplification model named Progressive Conditional Generative Adversarial Network with Efficient Approximating Self Attention (PCGAN-EASA), which incrementally improves the quality of generated EEG features. This network can yield full-scale, fine-grained EEG features from low-scale, coarse ones. Specifically, to overcome the failure of traditional generative models to generate features tailored to individual patient characteristics, we developed an encoder with an efficient approximating self-attention mechanism. This encoder not only automatically extracts relevant features across different patients but also reduces computational resource consumption. Furthermore, the adversarial loss and reconstruction loss functions were redesigned to better align with the training characteristics of the network and the spatial correlations among electrodes. Extensive experimental results demonstrate that PCGAN-EASA provides the highest generation quality and the lowest computational resource usage among several existing approaches. Additionally, it significantly improves the accuracy of subsequent stroke classification tasks.
Aerial threat assessment is a crucial link in modern air combat, and its result weighs heavily in commanders' decisions. Because existing threat assessment methods have difficulty handling high-dimensional time-series target data, a threat assessment method based on a self-attention mechanism and gated recurrent unit (SAGRU) is proposed. First, a threat feature system covering air combat situation features and capability features is established. Next, a data augmentation process based on the fractional Fourier transform (FRFT) is applied to extract more valuable information from the time-series situation features. Then, to capture key characteristics of battlefield evolution, a bidirectional GRU and self-attention (SA) mechanisms are applied to the enhanced features. After the processed air combat situation features and capability features are concatenated, the target threat level is predicted by fully connected neural layers and a softmax classifier. Finally, to validate the model, an air combat dataset generated by a combat simulation system is used for training and testing. The comparison experiments show that the proposed model is structurally sound and performs threat assessment faster and more accurately than other existing deep-learning-based models.
Traditional models for semantic segmentation of point clouds primarily focus on small scales. However, in real-world applications, point clouds are often much larger, leading to heavy computational and memory requirements. The key to handling large-scale point clouds lies in random sampling, which offers higher computational efficiency and lower memory consumption than other sampling methods. Nevertheless, random sampling can discard crucial points during the encoding stage. To address these issues, this paper proposes the cross-fusion self-attention network (CFSA-Net), a lightweight and efficient network architecture designed to process large-scale point clouds directly. At the core of this network is random sampling combined with a local feature extraction module based on cross-fusion self-attention (CFSA). This module integrates long-range contextual dependencies between points by employing hierarchical position encoding (HPC). Furthermore, it enhances the interaction between each point's coordinates and feature information through cross-fusion self-attention pooling, enabling the acquisition of more comprehensive geometric information. Finally, a residual optimization (RO) structure is introduced to extend the receptive field of individual points by stacking hierarchical position encoding and cross-fusion self-attention pooling, thereby reducing the impact of information loss caused by random sampling. Experimental results on the Stanford Large-Scale 3D Indoor Spaces (S3DIS), Semantic3D, and SemanticKITTI datasets demonstrate the superiority of this algorithm over advanced approaches such as RandLA-Net and KPConv. These findings underscore the strong performance of CFSA-Net in large-scale 3D semantic segmentation.
Because existing deep-learning-based methods lack long-range association and spatial location information, they cannot always recover fine details and accurate boundaries in complex clothing images. This paper presents a convolutional structure with multi-scale fusion to optimize clothing feature extraction, together with a self-attention module to capture long-range association information. The structure enables the self-attention mechanism to participate directly in information exchange through the down-scaling projection operation of the multi-scale framework. In addition, the improved self-attention module extracts 2-dimensional relative position information to compensate for its limited ability to extract spatial position features from clothing images. The experimental results on the colorful fashion parsing dataset (CFPD) show that the proposed network structure achieves 53.68% mean intersection over union (mIoU) and performs better on the clothing parsing task.
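For readers unfamiliar with the mIoU metric reported here, it is the per-class intersection over union averaged over classes. The sketch below follows that common definition on flattened label lists; skipping classes absent from both prediction and ground truth is one common convention and an assumption here, and the labels are toy values rather than CFPD data.

```python
def mean_iou(y_true, y_pred, num_classes):
    """Mean intersection over union across classes, for flat lists of class ids.

    Assumption: classes that appear in neither list are skipped.
    """
    ious = []
    for c in range(num_classes):
        inter = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        union = sum(1 for t, p in zip(y_true, y_pred) if t == c or p == c)
        if union:
            ious.append(inter / union)
    if not ious:
        raise ValueError("no class present in either input")
    return sum(ious) / len(ious)

# Six pixels, three classes: per-class IoU is 1/3, 2/3, and 1/2.
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
score = mean_iou(y_true, y_pred, num_classes=3)
```

Because every class counts equally, mIoU penalizes poor segmentation of small garment classes that overall pixel accuracy would hide.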
On Twitter, people often use hashtags to mark the subject of a tweet. Hashtags give tweets specific themes or content, which makes them easier for people to manage. As the number of tweets grows, automatic hashtag recommendation has received wide attention. Previous hashtag recommendation methods convert the task into a multi-class classification problem. However, these methods can only recommend hashtags that have appeared in historical data and cannot recommend new ones. In this work, we extend the self-attention mechanism to turn hashtag recommendation into a sequence labeling task. To train and evaluate the proposed method, we used real tweet data collected from Twitter. Experimental results show that the proposed method significantly outperforms state-of-the-art methods, improving accuracy by 4%.
With the development of the short video industry, videos and bullet screens have become important channels for spreading public opinion. Emotional analysis of bullet screens can reveal public attitudes in a timely manner and ease the management of online public opinion. A convolutional neural network model based on multi-head attention is proposed to address the problem of modeling relations among words and identifying key words in emotion classification tasks where the text is short and lacks complete context. First, word positions are encoded so that the model can use the order information of input sequences. Second, a multi-head attention mechanism obtains semantic expressions in different subspaces, captures internal relevance, strengthens dependency relationships among words, and highlights the emotional weights of key emotional words. Then, a dilated convolution is used to enlarge the receptive field and extract more features. On this basis, the multi-head attention mechanism is combined with a convolutional neural network to model and analyze seven emotional categories of bullet screens. Experiments from both the model and the dataset perspectives validate the effectiveness of the approach. Finally, the emotions of bullet screens are visualized to provide data support for managing hot events and for other fields.
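The claim that a dilated convolution enlarges the receptive field can be made concrete with a one-dimensional example: a kernel of size k with dilation d spans (k - 1) * d + 1 input positions while still using only k weights. The sketch below is a generic illustration with no padding and made-up values, not the model described in the abstract; `dilated_conv1d` and `receptive_field` are hypothetical names.

```python
def receptive_field(kernel_size, dilation):
    """Number of input positions spanned by one output of a dilated 1-D kernel."""
    return (kernel_size - 1) * dilation + 1

def dilated_conv1d(sequence, kernel, dilation=1):
    """Valid (no padding) 1-D cross-correlation with gaps of `dilation` between taps."""
    span = receptive_field(len(kernel), dilation)
    return [
        sum(w * sequence[start + k * dilation] for k, w in enumerate(kernel))
        for start in range(len(sequence) - span + 1)
    ]

# Toy per-token features for a six-token comment.
tokens = [1, 2, 3, 4, 5, 6]
dense = dilated_conv1d(tokens, [1, 1, 1], dilation=1)   # each output sees 3 tokens
wide = dilated_conv1d(tokens, [1, 1, 1], dilation=2)    # each output sees 5 tokens
```

With the same three weights, the dilated version covers five tokens per output, which is how such layers gather wider context from short texts without adding parameters.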
Funding (fault diagnosis with hierarchical multihead self-attention): Supported by the National Natural Science Foundation of China (62073140, 62073141) and the Shanghai Rising-Star Program (21QA1401800).
Funding (SMSTracker visual object tracking): Supported by the National Natural Science Foundation of China under Grant 62177029 and the Postgraduate Research & Practice Innovation Program of Jiangsu Province (KYCX21_0740), China.
Funding (SABDRM resource management for satellite-terrestrial networks): Supported by the National Key Research and Development Plan (No. 2022YFB2902701) and the Key Natural Science Foundation of Shenzhen (No. JCYJ20220818102209020).
Funding (RTT-AT missing-value imputation): Supported by Graduate Funded Project (No. JY2022A017).
Abstract: The frequent missing values in radar-derived time-series tracks of aerial targets (RTT-AT) pose significant challenges for subsequent data-driven tasks. However, most imputation research focuses on random missing (RM), which differs significantly from the common missing patterns of RTT-AT. Methods designed for RM may suffer performance degradation or fail outright when applied to RTT-AT imputation. Conventional autoregressive deep learning methods are prone to error accumulation and loss of long-term dependencies. In this paper, a non-autoregressive imputation model is proposed that addresses missing value imputation for two common missing patterns in RTT-AT. Our model consists of two probabilistic sparse diagonal masking self-attention (PSDMSA) units and a weight fusion unit. It learns missing values by combining the representations output by the two units, aiming to minimize the difference between the imputed values and their actual values. The PSDMSA units effectively capture temporal dependencies and attribute correlations between time steps, improving imputation quality. The weight fusion unit automatically updates the weights of the output representations from the two units to obtain a more accurate final representation. The experimental results indicate that, despite varying missing rates in the two missing patterns, our model consistently outperforms other methods in imputation performance and exhibits a low frequency of deviations in estimates for specific missing entries. Compared with the state-of-the-art autoregressive deep learning imputation model Bidirectional Recurrent Imputation for Time Series (BRITS), our proposed model reduces mean absolute error (MAE) by 31% to 50%. Additionally, the model trains 4 to 8 times faster than both BRITS and a standard Transformer model on the same dataset. Finally, the ablation experiments demonstrate that the PSDMSA, the weight fusion unit, the cascade network design, and the imputation loss each enhance imputation performance, confirming the efficacy of our design.
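The abstract above does not spell out the PSDMSA computation. As a rough illustration of the diagonal-masking idea only, the following minimal NumPy sketch shows single-head self-attention in which each time step is prevented from attending to itself, so its value has to be estimated from the other steps. The function name, the shapes, and the random weights are assumptions made for this example; the probabilistic sparse component and the weight fusion unit are not modelled here.

```python
import numpy as np

def diagonal_masked_self_attention(x, w_q, w_k, w_v):
    """Single-head self-attention where step t cannot attend to itself.

    x: (T, d) sequence; w_q, w_k, w_v: (d, d) projections (hypothetical).
    Returns the attended output (T, d) and the attention weights (T, T).
    """
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    scores = q @ k.T / np.sqrt(k.shape[-1])
    np.fill_diagonal(scores, -np.inf)             # mask the diagonal
    scores -= scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)                      # exp(-inf) == 0 on the diagonal
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v, weights

rng = np.random.default_rng(0)
T, d = 6, 4
x = rng.normal(size=(T, d))
w_q, w_k, w_v = (rng.normal(size=(d, d)) for _ in range(3))
out, attn = diagonal_masked_self_attention(x, w_q, w_k, w_v)
```

Because the diagonal weights are exactly zero, the representation of a time step depends only on the remaining steps, which is the property an imputation model can exploit.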
Funding: Supported in part by the Research Fund of Guangxi Key Lab of Multi-Source Information Mining & Security (MIMS21-M-02).
Abstract: A false data injection attack (FDIA) can affect the state estimation of the power grid by tampering with measured grid data, thereby disrupting the stable operation of the smart grid. Existing work usually trains a detection model by fusing data-driven features from diverse power data streams. Data-driven features, however, cannot effectively capture the differences between noisy data and attack samples. As a result, slight noise disturbances in the power grid may cause a large number of false FDIA detections. To address this problem, this paper designs a deep collaborative self-attention network for robust FDIA detection, in which the spatio-temporal features of cascaded FDIA attacks are fully integrated. First, a high-order Chebyshev polynomial-based graph convolution module is designed to aggregate the spatial information between grid nodes, and a spatial self-attention mechanism dynamically assigns attention weights to each node, guiding the network to focus on the node information most useful for FDIA detection. Furthermore, a bi-directional Long Short-Term Memory (LSTM) network is introduced to perform time-series modeling and long-term dependence analysis of power grid data, and a temporal self-attention mechanism describes the time correlation of the data and assigns different weights to different time steps. The designed deep collaborative network can effectively mine subtle perturbations from spatio-temporal feature information, efficiently distinguish power grid noise from FDIA attacks, and adapt to diverse attack intensities. Extensive experiments on actual load data from the New York Independent System Operator (NYISO) in the IEEE 14, IEEE 39, and IEEE 118 bus systems demonstrate that our method achieves efficient detection performance and outperforms state-of-the-art FDIA detection schemes in terms of detection accuracy and robustness.
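The abstract names a Chebyshev polynomial-based graph convolution without giving its form. The sketch below assumes the commonly used formulation, in which the layer output is a sum of Chebyshev polynomials of a rescaled graph Laplacian applied to the node features, with the largest eigenvalue approximated as 2. The function name, the toy 4-node path graph, and the random weights are illustrative assumptions, not the paper's module.

```python
import numpy as np

def chebyshev_graph_conv(x, adj, weights):
    """Order-K Chebyshev graph convolution (assumed standard form).

    x: (N, F_in) node features; adj: (N, N) symmetric adjacency;
    weights: list of K matrices of shape (F_in, F_out), one per polynomial order.
    """
    n = adj.shape[0]
    deg = adj.sum(axis=1)
    d_inv_sqrt = np.where(deg > 0, 1.0 / np.sqrt(np.maximum(deg, 1e-12)), 0.0)
    lap = np.eye(n) - d_inv_sqrt[:, None] * adj * d_inv_sqrt[None, :]
    lap_scaled = lap - np.eye(n)  # 2L/lambda_max - I with lambda_max assumed to be 2
    t_prev, t_curr = x, lap_scaled @ x            # T_0(L)x and T_1(L)x
    out = t_prev @ weights[0]
    if len(weights) > 1:
        out = out + t_curr @ weights[1]
    for w in weights[2:]:
        # Chebyshev recurrence: T_k = 2 L T_{k-1} - T_{k-2}
        t_prev, t_curr = t_curr, 2.0 * lap_scaled @ t_curr - t_prev
        out = out + t_curr @ w
    return out

# Toy 4-node path graph standing in for a small bus system
adj = np.array([[0, 1, 0, 0],
                [1, 0, 1, 0],
                [0, 1, 0, 1],
                [0, 0, 1, 0]], dtype=float)
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 3))
weights = [rng.normal(size=(3, 2)) for _ in range(3)]
y = chebyshev_graph_conv(x, adj, weights)
```

With K weight matrices, each node aggregates information from neighbours up to K-1 hops away, which is what "high-order" refers to in this formulation.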
Funding: This research was funded by the Shenzhen Science and Technology Program (Grant No. RCBS20221008093121051), the General Higher Education Project of Guangdong Provincial Education Department (Grant No. 2020ZDZX3085), the China Postdoctoral Science Foundation (Grant No. 2021M703371), and the Post-Doctoral Foundation Project of Shenzhen Polytechnic (Grant No. 6021330002K).
Abstract: In air traffic control communications (ATCC), misunderstandings between pilots and controllers could result in fatal aviation accidents. Fortunately, advanced automatic speech recognition technology has emerged as a promising means of preventing miscommunications and enhancing aviation safety. However, most existing speech recognition methods merely incorporate external language models on the decoder side, leading to insufficient semantic alignment between the speech and text modalities during the encoding phase. Furthermore, it is challenging to model acoustic context dependencies over long distances because speech sequences are longer than text, especially for extended ATCC data. To address these issues, we propose a speech-text multimodal dual-tower architecture for speech recognition. It employs cross-modal interactions to achieve close semantic alignment during the encoding stage and to strengthen its ability to model long-distance acoustic context dependencies. In addition, a two-stage training strategy is devised to derive semantics-aware acoustic representations effectively. The first stage pre-trains the speech-text multimodal encoding module to enhance inter-modal semantic alignment and long-distance acoustic context dependencies. The second stage fine-tunes the entire network to bridge the input modality gap between the training and inference phases and to boost generalization performance. Extensive experiments demonstrate the effectiveness of the proposed speech-text multimodal speech recognition method on the ATCC and AISHELL-1 datasets. It reduces the character error rate to 6.54% and 8.73%, respectively, and exhibits substantial performance gains of 28.76% and 23.82% compared with the best baseline model. The case studies indicate that the obtained semantics-aware acoustic representations aid in accurately recognizing terms with similar pronunciations but distinct semantics. The research provides a novel modeling paradigm for semantics-aware speech recognition in air traffic control communications, which could contribute to the advancement of intelligent and efficient aviation safety management.
Funding: Supported by the National Natural Science Foundation of China under Grant 51977004 and the Beijing Natural Science Foundation under Grant 4212042.
Abstract: To predict renewable energy sources such as solar power in microgrids more accurately, a hybrid power prediction method is presented in this paper. First, a self-attention mechanism is added to a bidirectional gated recurrent neural network (BiGRU) to explore the time-series characteristics of solar power output and to account for the influence of different time nodes on the prediction results. Subsequently, an improved quantum particle swarm optimization (QPSO) algorithm is proposed to optimize the hyperparameters of the combined prediction model. The final proposed LQPSO-BiGRU-self-attention hybrid model can predict solar power more effectively. In addition, considering the coordinated utilization of various energy sources such as electricity, hydrogen, and renewable energy, a multi-objective optimization model that considers both economic and environmental costs is constructed. A two-stage adaptive multi-objective quantum particle swarm optimization algorithm aided by a Lévy flight, named MO-LQPSO, is proposed for the comprehensive optimal scheduling of a multi-energy microgrid system. This algorithm effectively balances global and local search capabilities and improves the solution of complex nonlinear problems. The effectiveness and superiority of the proposed scheme are verified through comparative simulations.
Funding: Supported by the General Program of the National Natural Science Foundation of China (NSFC) (No. 62171307) and the Basic Research Program of Shanxi Province, funded by the Department of Science and Technology of Shanxi Province (China) (No. 202103021224113).
Abstract: Early and timely diagnosis of stroke is critical for effective treatment, and the electroencephalogram (EEG) offers a low-cost, non-invasive solution. However, the shortage of high-quality patient EEG data often hampers the accuracy of diagnostic classification methods based on deep learning. To address this issue, our study designed a deep data amplification model named Progressive Conditional Generative Adversarial Network with Efficient Approximating Self Attention (PCGAN-EASA), which incrementally improves the quality of generated EEG features. This network can yield full-scale, fine-grained EEG features from low-scale, coarse ones. Specifically, to overcome the limitation of traditional generative models, which fail to generate features tailored to individual patient characteristics, we developed an encoder with an efficient approximating self-attention mechanism. This encoder not only automatically extracts relevant features across different patients but also reduces computational resource consumption. Furthermore, the adversarial loss and reconstruction loss functions were redesigned to better align with the training characteristics of the network and the spatial correlations among electrodes. Extensive experimental results demonstrate that PCGAN-EASA provides the highest generation quality and the lowest computational resource usage compared with several existing approaches. Additionally, it significantly improves the accuracy of subsequent stroke classification tasks.
Funding: Supported by the National Natural Science Foundation of China (62022015, 62088101), the Shanghai Municipal Science and Technology Major Project (2021SHZDZX0100), and the Shanghai Municipal Commission of Science and Technology Project (19511132101).
Abstract: Aerial threat assessment is a crucial link in modern air combat, and its result weighs heavily in commanders' decisions. Because existing threat assessment methods have difficulty handling high-dimensional time-series target data, a threat assessment method based on a self-attention mechanism and gated recurrent unit (SAGRU) is proposed. First, a threat feature system including air combat situation and capability features is established. Next, a data augmentation process based on the fractional Fourier transform (FRFT) is applied to extract more valuable information from the time-series situation features. Furthermore, to capture key characteristics of battlefield evolution, a bidirectional GRU and an SA mechanism are designed for the enhanced features. Subsequently, after the processed air combat situation and capability features are concatenated, the target threat level is predicted by fully connected neural layers and a softmax classifier. Finally, to validate the model, an air combat dataset generated by a combat simulation system is used for training and testing. The comparison experiments show that the proposed model is structurally sound and performs threat assessment faster and more accurately than other existing deep learning-based models.
Funding: Funded by the National Natural Science Foundation of China Youth Project (61603127).
Abstract: Traditional models for semantic segmentation of point clouds primarily focus on smaller scales. However, in real-world applications, point clouds are often much larger, leading to heavy computational and memory requirements. The key to handling large-scale point clouds lies in random sampling, which offers higher computational efficiency and lower memory consumption than other sampling methods. Nevertheless, random sampling can result in the loss of crucial points during the encoding stage. To address these issues, this paper proposes the cross-fusion self-attention network (CFSA-Net), a lightweight and efficient network architecture specifically designed for directly processing large-scale point clouds. At the core of this network is random sampling combined with a local feature extraction module based on cross-fusion self-attention (CFSA). This module integrates long-range contextual dependencies between points by employing hierarchical position encoding (HPC). Furthermore, it enhances the interaction between each point's coordinates and feature information through cross-fusion self-attention pooling, enabling the acquisition of more comprehensive geometric information. Finally, a residual optimization (RO) structure is introduced to extend the receptive field of individual points by stacking hierarchical position encoding and cross-fusion self-attention pooling, thereby reducing the impact of information loss caused by random sampling. Experimental results on the Stanford Large-Scale 3D Indoor Spaces (S3DIS), Semantic3D, and SemanticKITTI datasets demonstrate the superiority of this algorithm over advanced approaches such as RandLA-Net and KPConv. These findings underscore the excellent performance of CFSA-Net in large-scale 3D semantic segmentation.
Abstract: Due to the lack of long-range association and spatial location information, existing deep learning-based methods cannot always recover the fine details and accurate boundaries of complex clothing images. This paper presents a convolutional structure with multi-scale fusion to optimize clothing feature extraction, together with a self-attention module to capture long-range association information. The structure enables the self-attention mechanism to participate directly in information exchange through the down-scaling projection operation of the multi-scale framework. In addition, the improved self-attention module extracts 2-dimensional relative position information to compensate for its limited ability to extract spatial position features from clothing images. Experimental results on the colorful fashion parsing dataset (CFPD) show that the proposed network structure achieves 53.68% mean intersection over union (mIoU) and performs better on the clothing parsing task.
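Abstract: As mobile text messaging has become an important means of everyday communication, identifying spam messages is of real practical significance. To this end, a neural network model is proposed that combines TFIDF with a self-attention-based Bi-LSTM. The model first feeds the message text into the Bi-LSTM layer as word vectors; after feature extraction, it combines TFIDF with the information focusing of the self-attention layer to obtain the final feature vector; finally, the feature vector is passed through a Softmax classifier to obtain the message classification result. Experimental results show that, compared with traditional classification models, the self-attention-based Bi-LSTM model combined with TFIDF improves message recognition accuracy by 2.1%–4.6% and reduces running time by 0.6 s–10.2 s.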
Abstract: On Twitter, people often use hashtags to mark the subject of a tweet. Tagged tweets have specific themes or content and are easy for people to manage. With the increase in the number of tweets, how to automatically recommend hashtags for tweets has received wide attention. Previous hashtag recommendation methods convert the task into a multi-class classification problem. However, these methods can only recommend hashtags that appeared in historical information and cannot recommend new ones. In this work, we extend the self-attention mechanism to turn the hashtag recommendation task into a sequence labeling task. To train and evaluate the proposed method, we used real tweet data collected from Twitter. Experimental results show that the proposed method is significantly better than the most advanced method. Compared with the state-of-the-art methods, the accuracy of our method increases by 4%.
Funding: National Natural Science Foundation of China (No. 61562057); Gansu Science and Technology Plan Project (No. 18JR3RA104).
Abstract: With the development of the short video industry, videos and bullet screens have become important channels for spreading public opinion. Public attitudes can be obtained in a timely manner through emotion analysis of bullet screens, which also eases the management of online public opinion. A convolutional neural network model based on multi-head attention is proposed to address the problem of effectively modeling relations among words and identifying key words in emotion classification tasks where texts are short and lack complete context. First, word positions are encoded so that the model can use the order information of input sequences. Second, a multi-head attention mechanism is used to obtain semantic expressions in different subspaces, capture internal relevance, strengthen dependency relationships among words, and highlight the emotional weights of key emotional words. Then a dilated convolution is used to enlarge the receptive field and extract more features. On this basis, the multi-head attention mechanism is combined with a convolutional neural network to model and analyze seven emotional categories of bullet screens. Experiments from the perspectives of both model and dataset validate the effectiveness of our approach. Finally, the emotions of bullet screens are visualized to provide data support for hot-event control and other fields.
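The first two steps described above (position encoding followed by multi-head attention) can be sketched in NumPy as follows. The abstract does not say which position encoding is used; the sinusoidal encoding here is an assumption, as are the function names, dimensions, and random projection matrices. The dilated convolution and the classifier are not included.

```python
import numpy as np

def sinusoidal_positions(seq_len, d_model):
    """Sinusoidal position encoding (assumed; the abstract does not specify one)."""
    pos = np.arange(seq_len)[:, None]
    i = np.arange(d_model)[None, :]
    angles = pos / np.power(10000.0, (2 * (i // 2)) / d_model)
    return np.where(i % 2 == 0, np.sin(angles), np.cos(angles))

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def multi_head_self_attention(x, w_q, w_k, w_v, w_o, n_heads):
    """Scaled dot-product self-attention computed in n_heads subspaces."""
    seq_len, d_model = x.shape
    d_head = d_model // n_heads

    def split(m):
        # (seq, d_model) -> (heads, seq, d_head)
        return m.reshape(seq_len, n_heads, d_head).transpose(1, 0, 2)

    q, k, v = split(x @ w_q), split(x @ w_k), split(x @ w_v)
    weights = softmax(q @ k.transpose(0, 2, 1) / np.sqrt(d_head))  # (heads, seq, seq)
    heads = weights @ v                                            # (heads, seq, d_head)
    concat = heads.transpose(1, 0, 2).reshape(seq_len, d_model)
    return concat @ w_o, weights

rng = np.random.default_rng(0)
seq_len, d_model, n_heads = 5, 8, 2
embeddings = rng.normal(size=(seq_len, d_model))       # stand-in word vectors
pe = sinusoidal_positions(seq_len, d_model)
w_q, w_k, w_v, w_o = (rng.normal(size=(d_model, d_model)) for _ in range(4))
out, attn = multi_head_self_attention(embeddings + pe, w_q, w_k, w_v, w_o, n_heads)
```

Each head produces its own attention map over the words, so inspecting `attn[h]` row by row shows which words a given head weights most heavily.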