20 articles found
Clothing Parsing Based on Multi-Scale Fusion and Improved Self-Attention Mechanism
1
Authors: 陈诺, 王绍宇, 陆然, 李文萱, 覃志东, 石秀金 《Journal of Donghua University (English Edition)》 CAS 2023, No. 6, pp. 661-666 (6 pages)
Due to the lack of long-range association and spatial location information, fine details and accurate boundaries of complex clothing images cannot always be obtained by using the existing deep learning-based methods. This paper presents a convolutional structure with multi-scale fusion to optimize the step of clothing feature extraction and a self-attention module to capture long-range association information. The structure enables the self-attention mechanism to directly participate in the process of information exchange through the down-scaling projection operation of the multi-scale framework. In addition, the improved self-attention module introduces the extraction of 2-dimensional relative position information to make up for its lack of ability to extract spatial position features from clothing images. The experimental results based on the colorful fashion parsing dataset (CFPD) show that the proposed network structure achieves 53.68% mean intersection over union (mIoU) and has better performance on the clothing parsing task.
Keywords: clothing parsing; convolutional neural network; multi-scale fusion; self-attention mechanism; vision Transformer
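The mIoU figure quoted in the abstract above is a standard segmentation metric. As a rough illustration only, the sketch below computes it on flattened label lists. The function name `mean_iou` and the choice to skip classes absent from both prediction and ground truth are assumptions made for this example, not details taken from the paper.

```python
def mean_iou(pred, target, num_classes):
    """Mean intersection over union, averaged over classes that occur
    in the prediction or the ground truth (assumed convention)."""
    ious = []
    for c in range(num_classes):
        # Pixels where both prediction and ground truth are class c
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        # Pixels where either prediction or ground truth is class c
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union > 0:  # skip classes absent from both
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else 0.0


# Toy example: six "pixels", three classes
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 0]
score = mean_iou(pred, target, num_classes=3)
```

In the toy example the per-class IoUs are 1/3, 2/3 and 1/2, so the mean is 0.5.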
Keyphrase Generation Based on Self-Attention Mechanism
2
Authors: Kehua Yang, Yaodong Wang, Wei Zhang, Jiqing Yao, Yuquan Le 《Computers, Materials & Continua》 SCIE EI 2019, No. 8, pp. 569-581 (13 pages)
Keyphrases provide summarized and valuable information. This information can help us not only understand text semantics, but also organize and retrieve text content effectively. The task of automatically generating keyphrases has received considerable attention in recent decades. Previous studies offer many workable solutions for obtaining keyphrases. One method is to divide the content to be summarized into multiple blocks of text, then rank and select the most important content. The disadvantage of this method is that it cannot identify keyphrases that do not appear in the text, let alone capture the real semantic meaning hidden in the text. Another approach uses recurrent neural networks to generate keyphrases from the semantic aspects of the text, but their inherently sequential nature precludes parallelization within training examples, and distance limits the context dependencies they can capture. Previous works have demonstrated the benefits of the self-attention mechanism, which can learn global text dependency features and can be parallelized. Inspired by the above observation, we propose a keyphrase generation model based entirely on the self-attention mechanism. It is an encoder-decoder model that effectively addresses the above disadvantages. In addition, we consider the semantic similarity between keyphrases and add a semantic similarity processing module to the model. The proposed model, evaluated empirically on five datasets, achieves competitive performance compared to baseline methods.
Keywords: Keyphrase generation; self-attention mechanism; encoder-decoder framework
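The abstract above relies on self-attention letting every token attend to every other token in one parallelizable step. The following is a minimal sketch of scaled dot-product self-attention, written in plain Python for readability. It assumes queries, keys and values are all the raw input vectors (no learned projection matrices, no multiple heads), so it illustrates the mechanism in general rather than the model proposed in the paper.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(x):
    """Scaled dot-product self-attention with Q = K = V = x (assumption).

    x is a list of token vectors of equal length d; the result has the
    same shape, each row being a weighted average of all input rows.
    """
    d = len(x[0])
    out = []
    for q in x:
        # Similarity of this token to every token, scaled by sqrt(d)
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in x]
        weights = softmax(scores)
        # Weighted sum of value vectors
        out.append([sum(w * v[i] for w, v in zip(weights, x)) for i in range(d)])
    return out


tokens = [[1.0, 0.0], [0.0, 1.0]]
attended = self_attention(tokens)
```

Each output row is a convex combination of the input rows, so a token's own direction dominates its output while still mixing in global context.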
NFHP-RN: A Method of Few-Shot Network Attack Detection Based on the Network Flow Holographic Picture-ResNet
3
Authors: Tao Yi, Xingshu Chen, Mingdong Yang, Qindong Li, Yi Zhu 《Computer Modeling in Engineering & Sciences》 SCIE EI 2024, No. 7, pp. 929-955 (27 pages)
Due to the rapid evolution of Advanced Persistent Threat (APT) attacks, the emergence of new and rare attack samples, and even those never seen before, makes it challenging for traditional rule-based detection methods to extract universal rules for effective detection. With the progress in techniques such as transfer learning and meta-learning, few-shot network attack detection has progressed. However, challenges in few-shot network attack detection arise from the inability of time sequence flow features to adapt to the fixed-length input requirement of deep learning, difficulties in capturing rich information from original flow in the case of insufficient samples, and the challenge of high-level abstract representation. To address these challenges, a few-shot network attack detection method based on NFHP (Network Flow Holographic Picture)-RN (ResNet) is proposed. Specifically, leveraging inherent properties of images such as translation invariance, rotation invariance, scale invariance, and illumination invariance, network attack traffic features and contextual relationships are intuitively represented in NFHP. In addition, an improved RN network model is employed for high-level abstract feature extraction, ensuring that the extracted high-level abstract features maintain the detailed characteristics of the original traffic behavior, regardless of changes in background traffic. Finally, a meta-learning model based on the self-attention mechanism is constructed, achieving the detection of novel APT few-shot network attacks through the empirical generalization of high-level abstract feature representations of known-class network attack behaviors. Experimental results demonstrate that the proposed method can learn high-level abstract features of network attacks across different traffic detail granularities. Compared with state-of-the-art methods, it achieves favorable accuracy, precision, recall, and F1 scores for the identification of unknown-class network attacks through cross-validation on multiple datasets.
Keywords: APT attacks; spatial pyramid pooling; NFHP (network flow holographic picture); ResNet; self-attention mechanism; meta-learning
Intelligent Fault Diagnosis Method of Rolling Bearings Based on Transfer Residual Swin Transformer with Shifted Windows
4
Authors: Haomiao Wang, Jinxi Wang, Qingmei Sui, Faye Zhang, Yibin Li, Mingshun Jiang, Phanasindh Paitekul 《Structural Durability & Health Monitoring》 EI 2024, No. 2, pp. 91-110 (20 pages)
Due to its robust learning and expression ability for complex features, the deep learning (DL) model plays a vital role in bearing fault diagnosis. However, since there are fewer labeled samples in fault diagnosis, the depth of DL models in fault diagnosis is generally shallower than that of DL models in other fields, which limits the diagnostic performance. To solve this problem, a novel transfer residual Swin Transformer (RST) is proposed for rolling bearings in this paper. RST has 24 residual self-attention layers, which use the hierarchical design and the shifted window-based residual self-attention. Combined with transfer learning techniques, the transfer RST model uses pre-trained parameters from ImageNet. A new end-to-end method for fault diagnosis based on deep transfer RST is proposed. Firstly, the wavelet transform converts the vibration signal into a wavelet time-frequency diagram, so that the signal's time-domain and frequency-domain information is represented simultaneously. Secondly, the wavelet time-frequency diagram is fed into the RST model to obtain the fault type. Finally, our method is verified on public and self-built datasets. Experimental results show the superior performance of our method by comparing it with a shallow neural network.
Keywords: Rolling bearing; fault diagnosis; Transformer; self-attention mechanism
An Affective EEG Analysis Method Without Feature Engineering
5
Authors: Jian Zhang, Chunying Fang, Yanghao Wu, Mingjie Chang 《Journal of Electronic Research and Application》 2024, No. 1, pp. 36-45 (10 pages)
Emotional electroencephalography (EEG) signals are a primary means of recording emotional brain activity. Currently, the most effective methods for analyzing emotional EEG signals involve feature engineering and neural networks. However, neural networks possess a strong ability for automatic feature extraction. Is it possible to discard feature engineering and directly employ neural networks for end-to-end recognition? Based on the characteristics of EEG signals, this paper proposes an end-to-end feature extraction and classification method for a dynamic self-attention network (DySAT). The study reveals significant differences in brain activity patterns associated with different emotions across various experimenters and time periods. The results of this experiment can provide insights into the reasons behind these differences.
Keywords: Dynamic graph classification; self-attention mechanism; Dynamic self-attention network; SEED dataset
circ2CBA: prediction of circRNA-RBP binding sites combining deep learning and attention mechanism (cited by: 1)
6
Authors: Yajing GUO, Xiujuan LEI, Lian LIU, Yi PAN 《Frontiers of Computer Science》 SCIE EI CSCD 2023, No. 5, pp. 217-225 (9 pages)
Circular RNAs (circRNAs) are RNAs with a closed circular structure involved in many biological processes through key interactions with RNA binding proteins (RBPs). Existing methods for predicting these interactions have limitations in feature learning. In view of this, we propose a method named circ2CBA, which uses only sequence information of circRNAs to predict circRNA-RBP binding sites. We have constructed a data set which includes eight sub-datasets. First, circ2CBA encodes circRNA sequences using the one-hot method. Next, a two-layer convolutional neural network (CNN) is used to initially extract the features. After the CNN, circ2CBA uses a layer of bidirectional long short-term memory network (BiLSTM) and the self-attention mechanism to learn the features. The AUC value of circ2CBA reaches 0.8987. Comparison of circ2CBA with three other methods on our data set and an ablation experiment confirm that circ2CBA is an effective method to predict the binding sites between circRNAs and RBPs.
Keywords: circRNAs; RBPs; CNN; BiLSTM; self-attention mechanism
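The first step named in the abstract above, one-hot encoding of an RNA sequence, can be sketched in a few lines. The alphabet order `ACGU`, the function name `one_hot`, and the all-zero row for unknown symbols such as `N` are assumptions for this illustration; the abstract does not specify how circ2CBA handles them.

```python
ALPHABET = "ACGU"  # assumed base order; the paper's ordering is not stated


def one_hot(seq):
    """Encode an RNA sequence as a list of 4-element 0/1 rows.

    Unknown symbols (for example 'N') map to an all-zero row, which is
    an assumption made for this sketch.
    """
    index = {base: i for i, base in enumerate(ALPHABET)}
    rows = []
    for base in seq.upper():
        row = [0] * len(ALPHABET)
        if base in index:
            row[index[base]] = 1
        rows.append(row)
    return rows


encoded = one_hot("ACGU")
```

The resulting matrix has one row per nucleotide and one column per base, which is the shape a 1D convolution over the sequence would typically consume.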
3D Object Detection with Attention: Shell-Based Modeling
7
Authors: Xiaorui Zhang, Ziquan Zhao, Wei Sun, Qi Cui 《Computer Systems Science & Engineering》 SCIE EI 2023, No. 7, pp. 537-550 (14 pages)
LIDAR point cloud-based 3D object detection aims to sense the surrounding environment by anchoring objects with the Bounding Box (BBox). However, in the three-dimensional space of autonomous driving scenes, previous object detection methods pre-process the original LIDAR point cloud into voxels or pillars and therefore lose the coordinate information of the original point cloud, detect slowly, and position bounding boxes inaccurately. To address the issues above, this study proposes a new two-stage network structure that extracts point cloud features directly with PointNet++, which effectively preserves the original point cloud coordinate information. To improve the detection accuracy, a shell-based modeling method is proposed. It roughly determines which spherical shell the coordinates belong to. Then, the results are refined to ground truth, thereby narrowing the localization range and improving the detection accuracy. To improve the recall of 3D object detection with bounding boxes, this paper designs a self-attention module for 3D object detection with a skip connection structure. Some of these features are highlighted by weighting them on the feature dimensions. After training, the weights of features that are favorable for object detection become larger. Thus, the extracted features are better adapted to the object detection task. Extensive comparison experiments and ablation experiments conducted on the KITTI dataset verify the effectiveness of our proposed method in improving recall and precision.
Keywords: 3D object detection; autonomous driving; point cloud; shell-based modeling; self-attention mechanism
Research on Multi-Modal Time Series Data Prediction Method Based on Dual-Stage Attention Mechanism
8
Authors: Xinyu Liu, Yulong Meng, Fangwei Liu, Lingyu Chen, Xinfeng Zhang, Junyu Lin, Husheng Gou 《国际计算机前沿大会会议论文集》 EI 2023, No. 1, pp. 127-144 (18 pages)
The production data in the industrial field have the characteristics of multimodality, high dimensionality and large correlation differences between attributes. Existing data prediction methods cannot effectively capture time series and modal features, which leads to prediction hysteresis and poor prediction stability. Aiming at the above problems, this paper proposes a time-series and modal feature enhancement method based on a dual-stage self-attention mechanism (DATT), and a time series prediction method based on a gated feedforward recurrent unit (GFRU). On this basis, the DATT-GFRU neural network with a gated feedforward recurrent neural network and dual-stage self-attention mechanism is designed and implemented. Experiments show that the prediction effect of the neural network prediction model based on DATT is significantly improved. Compared with the traditional prediction model, the DATT-GFRU neural network has a smaller average error of model prediction results, stable prediction performance, and strong generalization ability on the three datasets with different numbers of attributes and different training sample sizes.
Keywords: Multi-modal time series data; Recurrent neural network; self-attention mechanism
Research on clothing patterns generation based on multi-scales self-attention improved generative adversarial network
9
Authors: Zi-yan Yu, Tian-jian Luo 《International Journal of Intelligent Computing and Cybernetics》 EI 2021, No. 4, pp. 647-663 (17 pages)
Purpose: Clothing patterns play a dominant role in costume design and have become an important link in the perception of costume art. Conventional clothing pattern design relies on experienced designers. Although the quality of conventionally designed clothing patterns is very high, the ratio of output to input time is relatively low. In order to break through the bottleneck of conventional clothing pattern design, this paper proposes a novel approach based on a generative adversarial network (GAN) model for automatic clothing pattern generation, which not only reduces the dependence on experienced designers but also improves the input-output ratio. Design/methodology/approach: In view of the fact that clothing patterns have high requirements for global artistic perception and local texture details, this paper improves the conventional GAN model in two respects: a multi-scales discriminators strategy is introduced to deal with the local texture details, and the self-attention mechanism is introduced to improve the global artistic perception. The improved GAN, called the multi-scales self-attention improved generative adversarial network (MS-SA-GAN) model, is used for high-resolution clothing pattern generation. Findings: To verify the feasibility and effectiveness of the proposed MS-SA-GAN model, a crawler is designed to acquire a standard clothing patterns dataset from Baidu pictures, and a comparative experiment is conducted on our designed clothing patterns dataset. In the experiments, we adjusted different parameters of the proposed MS-SA-GAN model and compared the global artistic perception and local texture details of the generated clothing patterns. Originality/value: Experimental results show that the clothing patterns generated by the proposed MS-SA-GAN model are superior to those of the conventional algorithms in some local texture detail indexes. In addition, a group of clothing design professionals was invited to evaluate the global artistic perception through a valence-arousal scale. The scale results show that the proposed MS-SA-GAN model achieves a better global art perception.
Keywords: Clothing-patterns; Generative adversarial network; Multi-scales discriminators; self-attention mechanism; Global artistic perception
Saliency guided self-attention network for pedestrian attribute recognition in surveillance scenarios
10
Authors: Li Na, Wu Yangyang, Liu Ying, Li Daxiang, Gao Jiale 《The Journal of China Universities of Posts and Telecommunications》 EI CSCD 2022, No. 5, pp. 21-29 (9 pages)
Pedestrian attribute recognition is often considered as a multi-label image classification task. In order to make full use of attribute-related location information, a saliency guided self-attention network (SGSA-Net) was proposed to weakly supervise attribute localization, without annotations of attribute-related regions. Saliency priors were integrated into the spatial attention module (SAM). Meanwhile, channel-wise attention and spatial attention were introduced into the network. Moreover, a weighted binary cross-entropy loss (WCEL) function was employed to handle the imbalance of training data. Extensive experiments on richly annotated pedestrian (RAP) and pedestrian attribute (PETA) datasets demonstrated that SGSA-Net outperformed other state-of-the-art methods.
Keywords: pedestrian attribute recognition; saliency detection; self-attention mechanism
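The abstract above mentions a weighted binary cross-entropy loss for imbalanced attribute labels but does not give its formula. The sketch below shows one common generic form, in which each attribute's positive term is scaled by a per-attribute weight. The function name `weighted_bce`, the `pos_weights` parameter and the clipping constant `eps` are all assumptions for illustration and may differ from the WCEL used in SGSA-Net.

```python
import math


def weighted_bce(probs, labels, pos_weights, eps=1e-7):
    """Mean binary cross-entropy over attributes, with the positive
    term of attribute i scaled by pos_weights[i] (assumed weighting)."""
    total = 0.0
    for p, y, w in zip(probs, labels, pos_weights):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total += -(w * y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(probs)


# Two attributes: the rare first one gets a larger positive weight
loss = weighted_bce(probs=[0.5, 0.5], labels=[1, 0], pos_weights=[2.0, 1.0])
```

With all weights set to 1 this reduces to ordinary binary cross-entropy, so raising the weight of a rare attribute only increases the penalty for missing its positive examples.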
Self-attention Based Multimodule Fusion Graph Convolution Network for Traffic Flow Prediction
11
Authors: Lijie Li, Hongyang Shao, Junhao Chen, Ye Wang 《国际计算机前沿大会会议论文集》 2022, No. 1, pp. 3-16 (14 pages)
With rapid economic development, the per capita ownership of automobiles in our country has begun to rise year by year. More researchers have paid attention to using scientific methods to solve traffic flow problems. Traffic flow prediction is not simply affected by the number of vehicles, but also involves various complex factors, such as time, road conditions, and people flow. However, the existing methods ignore the complexity of road conditions and the correlation between individual nodes, which leads to poor performance. In this study, a deep learning model, SAMGCN, is proposed to effectively capture the correlation between individual nodes to improve the performance of traffic flow prediction. First, the theory of spatiotemporal decoupling is used to divide each time of each node into finer particles. Second, multimodule fusion is used to mine the potential periodic relationships in the data. Finally, GRU is used to obtain the potential time relationship of the three modules. Extensive experiments were conducted on two traffic flow datasets, PeMS04 and PeMS08 in the Caltrans Performance Measurement System, to prove the validity of the proposed model.
Keywords: Flow prediction; Temporal-spatial correlation; Graph convolution network; self-attention mechanism
A novel LSTM-autoencoder and enhanced transformer-based detection method for shield machine cutterhead clogging (cited by: 2)
12
Authors: QIN ChengJin, WU RuiHong, HUANG GuoQiang, TAO JianFeng, LIU ChengLiang 《Science China (Technological Sciences)》 SCIE EI CAS CSCD 2023, No. 2, pp. 512-527 (16 pages)
Shield tunneling machines are paramount underground engineering equipment and play a key role in tunnel construction. During the shield construction process, the “mud cake” formed by the difficult-to-remove clay attached to the cutterhead severely affects the shield construction efficiency and is harmful to the healthy operation of a shield tunneling machine. In this study, we propose an enhanced transformer-based detection model for detecting the cutterhead clogging status of shield tunneling machines. First, the working state data of shield machines are selected from historical excavation data, and a long short-term memory-autoencoder neural network module is constructed to remove outliers. Next, variational mode decomposition and wavelet transform are employed to denoise the data. After the preprocessing, nonoverlapping rectangular windows are used to intercept the working state data to obtain the time slices used for analysis, and several time-domain features of these periods are extracted. Owing to the data imbalance in the original dataset, the k-means synthetic minority oversampling technique (SMOTE) algorithm is adopted to oversample the extracted time-domain features of the clogging data in the training set to balance the dataset and improve the model performance. Finally, an enhanced transformer-based neural network is constructed to extract essential implicit features and detect cutterhead clogging status. Data collected from actual tunnel construction projects are used to verify the proposed model. The results show that the proposed model achieves accurate detection of shield machine cutterhead clogging status, with 98.85% accuracy and a 0.9786 F1 score. Moreover, the proposed model significantly outperforms the comparison models.
Keywords: shield tunneling machine; cutterhead clogging; fault diagnosis; autoencoder; multihead self-attention mechanism; imbalanced data
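The windowing step described in the abstract above (nonoverlapping rectangular windows followed by time-domain feature extraction) can be illustrated as follows. The abstract does not list which features are used, so the mean, root mean square and peak computed here, along with the function name `window_features` and the decision to drop an incomplete trailing window, are assumptions for this sketch.

```python
import math


def window_features(signal, size):
    """Split a 1D signal into nonoverlapping windows of `size` samples
    and compute a few time-domain features per window.

    A trailing window shorter than `size` is dropped (assumption).
    """
    features = []
    for start in range(0, len(signal) - size + 1, size):
        window = signal[start:start + size]
        mean = sum(window) / size
        rms = math.sqrt(sum(v * v for v in window) / size)
        peak = max(abs(v) for v in window)
        features.append({"mean": mean, "rms": rms, "peak": peak})
    return features


# Nine samples, window size 4: two full windows, last sample dropped
feats = window_features([1, -1, 1, -1, 3, 3, 3, 3, 9], size=4)
```

Each dictionary in `feats` corresponds to one time slice, which is the kind of per-window feature vector that an oversampling step or a classifier could then take as input.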
Reduction of rain effect on wave height estimation from marine X-band radar images using unsupervised generative adversarial networks
13
Authors: Li Wang, Hui Mei, Weilun Luo, Yunfei Cheng 《International Journal of Digital Earth》 SCIE EI 2023, No. 1, pp. 2356-2373 (18 pages)
An intelligent single radar image de-raining method based on unsupervised self-attention generative adversarial networks is proposed to improve the accuracy of wave height parameter inversion results. The method builds a trainable end-to-end de-raining model with an unsupervised cycle-consistent adversarial network as an AI framework, which does not require pairs of rain-contaminated and corresponding ground-truth rain-free images for training. The model is trained by feeding rain-contaminated and clean radar images in an unpaired manner, and the atmospheric scattering model parameters are not required as a prior condition. Additionally, a self-attention mechanism is introduced into the model, allowing it to focus on rain clutter when processing radar images. This combines global and local rain clutter context information to output more accurate and clear de-raining radar images. The proposed method is validated by applying it to actual field test data, which shows that compared with the wave height derived from the original rain-contaminated data, the root-mean-square error is reduced by 0.11 m and the correlation coefficient of the wave height is increased by 14% using the de-raining method. These results demonstrate that the method effectively reduces the impact of rain on the accuracy of wave height parameter estimation from marine X-band radar images.
Keywords: Generative adversarial networks; self-attention mechanism; unsupervised model; marine X-band radar; wave height
Designs to Improve Capability of Neural Networks to Make Structural Predictions
14
Authors: Tian-Yao Wang, Jian-Feng Li, Hong-Dong Zhang, Jeff Z. Y. Chen 《Chinese Journal of Polymer Science》 SCIE EI CAS CSCD 2023, No. 9, pp. 1477-1485, I0009 (10 pages)
A deep neural network model generally consists of different modules that play essential roles in performing a task. The optimal design of a module for use in modeling a physical problem is directly related to the success of the model. In this work, the effectiveness of a number of special modules, the self-attention mechanism for recognizing the importance of molecular sequence information in a polymer, as well as the big-stride representation and conditional random field for enhancing the network's ability to produce desired local configurations, is numerically studied. Network models containing these modules are trained by using the well-documented data of the native structures of the HP model and assessed according to their capability in making structural predictions of unseen data. The specific network design of the self-attention mechanism adopted here is modified from a similar idea in natural language recognition. The big-stride representation module introduced in this work is shown to drastically improve the network's capability to model polymer segments of strong lattice position correlations.
Keywords: Deep neural network; self-attention mechanism; Big-stride representation; Conditional random methods
LDformer: a parallel neural network model for long-term power forecasting
15
Authors: Ran TIAN, Xinmei LI, Zhongyu MA, Yanxing LIU, Jingxia WANG, Chu WANG 《Frontiers of Information Technology & Electronic Engineering》 SCIE EI CSCD 2023, No. 9, pp. 1287-1301 (15 pages)
Accurate long-term power forecasting is important in the decision-making operation of the power grid and power consumption management of customers to ensure the power system’s reliable power supply and the grid economy’s reliable operation. However, most time-series forecasting models do not perform well in dealing with long-time-series prediction tasks with a large amount of data. To address this challenge, we propose a parallel time-series prediction model called LDformer. First, we combine Informer with long short-term memory (LSTM) to obtain deep representation abilities in the time series. Then, we propose a parallel encoder module to improve the robustness of the model and combine convolutional layers with an attention mechanism to avoid value redundancy in the attention mechanism. Finally, we propose a probabilistic sparse (ProbSparse) self-attention mechanism combined with UniDrop to reduce the computational overhead and mitigate the risk of losing some key connections in the sequence. Experimental results on five datasets show that LDformer outperforms the state-of-the-art methods in most cases when handling the different long-time-series prediction tasks.
Keywords: Long-term power forecasting; Long short-term memory (LSTM); UniDrop; self-attention mechanism
A transformer-based Siamese network and an open optical dataset for semantic change detection of remote sensing images (cited by: 2)
16
Authors: Panli Yuan, Qingzhan Zhao, Xingbiao Zhao, Xuewen Wang, Xuefeng Long, Yuchen Zheng 《International Journal of Digital Earth》 SCIE EI 2022, No. 1, pp. 1506-1525 (20 pages)
Recent change detection (CD) methods focus on the extraction of deep change semantic features. However, existing methods overlook fine-grained features and have poor ability to capture long-range space–time information, which leads to missed micro changes and over-smoothed edges of change types. In this paper, a potential transformer-based semantic change detection (SCD) model, Pyramid-SCDFormer, is proposed, which precisely recognizes the small changes and fine edge details of the changes. The SCD model selectively merges different semantic tokens in the multi-head self-attention block to obtain multiscale features, which is crucial for extracting information from remote sensing images (RSIs) with multiple changes at different scales. Moreover, we create a well-annotated SCD dataset, Landsat-SCD, with unprecedented time series and change types in complex scenarios. Compared with three Convolutional Neural Network-based, one attention-based, and two transformer-based networks, experimental results demonstrate that Pyramid-SCDFormer stably outperforms the existing state-of-the-art CD models and obtains an improvement in MIoU/F1 of 1.11/0.76%, 0.57/0.50%, and 8.75/8.59% on the LEVIR-CD, WHU_CD, and Landsat-SCD datasets respectively. For change classes with a proportion of less than 1%, the proposed model improves the MIoU by 7.17–19.53% on the Landsat-SCD dataset. The recognition performance for small-scale changes and fine edges of change types is greatly improved.
Keywords: Semantic change detection (SCD); change detection dataset; transformer; siamese network; self-attention mechanism; bitemporal remote sensing
TwinNet: Twin Structured Knowledge Transfer Network for Weakly Supervised Action Localization (cited by: 1)
17
作者 Xiao-Yu Zhang Hai-Chao Shi +1 位作者 Chang-Sheng Li Li-Xin Duan 《Machine Intelligence Research》 EI CSCD 2022年第3期227-246,共20页
Action recognition and localization in untrimmed videos is important for many applications and have attracted a lot of attention. Since full supervision with frame-level annotation places an overwhelming burden on man... Action recognition and localization in untrimmed videos is important for many applications and have attracted a lot of attention. Since full supervision with frame-level annotation places an overwhelming burden on manual labeling effort, learning with weak video-level supervision becomes a potential solution. In this paper, we propose a novel weakly supervised framework to recognize actions and locate the corresponding frames in untrimmed videos simultaneously. Considering that there are abundant trimmed videos publicly available and well-segmented with semantic descriptions, the instructive knowledge learned on trimmed videos can be fully leveraged to analyze untrimmed videos. We present an effective knowledge transfer strategy based on inter-class semantic relevance. We also take advantage of the self-attention mechanism to obtain a compact video representation, such that the influence of background frames can be effectively eliminated. A learning architecture is designed with twin networks for trimmed and untrimmed videos, to facilitate transferable self-attentive representation learning. Extensive experiments are conducted on three untrimmed benchmark datasets (i.e., THUMOS14, ActivityNet1.3, and MEXaction2), and the experimental results clearly corroborate the efficacy of our method. It is especially encouraging to see that the proposed weakly supervised method even achieves comparable results to some fully supervised methods. 展开更多
Keywords: Knowledge transfer; weakly supervised learning; self-attention mechanism; representation learning; action localization
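The abstract above describes using self-attention to condense a video into a compact representation in which background frames carry little weight. A minimal sketch of that general idea follows, assuming per-frame feature vectors and a single scoring vector; the function name `attention_pool`, the scoring vector `w`, and the dot-product scoring rule are illustrative assumptions, not the architecture reported in the paper.

```python
import numpy as np

def attention_pool(frames, w):
    """Pool per-frame features into one vector using attention weights.

    frames: array of shape (T, D), one D-dimensional feature per frame.
    w:      array of shape (D,), a hypothetical scoring vector (assumed
            to be learned elsewhere; it is not taken from the paper).
    Returns the pooled (D,) representation and the (T,) frame weights.
    """
    scores = frames @ w                  # one relevance score per frame
    scores = scores - scores.max()       # shift for numerical stability
    weights = np.exp(scores) / np.exp(scores).sum()  # softmax over frames
    pooled = weights @ frames            # weighted average of frame features
    return pooled, weights

# Usage: with a zero scoring vector every frame is weighted equally,
# so the pooled vector reduces to the plain mean over frames.
frames = np.arange(12.0).reshape(4, 3)
pooled, weights = attention_pool(frames, np.zeros(3))
```

Frames that receive low scores contribute little to the pooled vector, which is one simple way background frames could be down-weighted.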
Video summarization with a graph convolutional attention network (Cited by: 1)
18
Authors: Ping LI, Chao TANG, Xianghua XU. Frontiers of Information Technology & Electronic Engineering (SCIE, EI, CSCD), 2021, Issue 6, pp. 902-913 (12 pages)
Video summarization has established itself as a fundamental technique for generating compact and concise video, which eases the management and browsing of large-scale video data. Existing methods fail to fully consider the local and global relations among frames of video, leading to a deteriorated summarization performance. To address the above problem, we propose a graph convolutional attention network (GCAN) for video summarization. GCAN consists of two parts, embedding learning and context fusion, where embedding learning includes the temporal branch and graph branch. In particular, GCAN uses dilated temporal convolution to model local cues and temporal self-attention to exploit global cues for video frames. It learns graph embedding via a multi-layer graph convolutional network to reveal the intrinsic structure of frame samples. The context fusion part combines the output streams from the temporal branch and graph branch to create the context-aware representation of frames, on which the importance scores are evaluated for selecting representative frames to generate the video summary. Experiments are carried out on two benchmark databases, SumMe and TVSum, showing that the proposed GCAN approach enjoys superior performance compared to several state-of-the-art alternatives in three evaluation settings.
Keywords: Temporal learning; self-attention mechanism; Graph convolutional network; Context fusion; Video summarization
A Fault Diagnosis Model for Complex Industrial Process Based on Improved TCN and 1D CNN
19
Authors: WANG Mingsheng, HUANG Bo, HE Chuanpeng, LI Peipei, ZHANG Jiahao, CHEN Yu, TONG Jie. Wuhan University Journal of Natural Sciences (CAS, CSCD), 2022, Issue 6, pp. 453-464 (12 pages)
Fast and accurate fault diagnosis of strongly coupled, time-varying, multivariable complex industrial processes remains a challenging problem. We propose an industrial fault diagnosis model. This model is established on the basis of the temporal convolutional network (TCN) and the one-dimensional convolutional neural network (1D CNN). We add a batch normalization layer before the TCN layer, and the activation function of the TCN is changed from the original ReLU function to the LeakyReLU function. To extract local correlations of features, a 1D convolution layer is added after the TCN layer, followed by the multi-head self-attention mechanism before the fully connected layer to enhance the model’s diagnostic ability. The extended Tennessee Eastman Process (TEP) dataset is used as the benchmark to evaluate the performance of our model. The experimental results show the high fault recognition accuracy and better generalization performance of our model, which demonstrates its effectiveness. Additionally, the model’s application on the diesel engine failure dataset of our partner’s project validates its effectiveness in industrial scenarios.
Keywords: fault diagnosis; temporal convolutional network; self-attention mechanism; convolutional neural network
A hybrid spatial-temporal deep learning prediction model of industrial methanol-to-olefins process
20
Authors: Jibin Zhou, Xue Li, Duiping Liu, Feng Wang, Tao Zhang, Mao Ye, Zhongmin Liu. Frontiers of Chemical Science and Engineering (SCIE, EI), 2024, Issue 4, pp. 73-85 (13 pages)
Methanol-to-olefins, as a promising non-oil pathway for the synthesis of light olefins, has been successfully industrialized. The accurate prediction of process variables can yield significant benefits for advanced process control and optimization. The challenge of this task is underscored by the failure of traditional methods in capturing the complex characteristics of industrial processes, such as high nonlinearities, dynamics, and data distribution shift caused by diverse operating conditions. In this paper, we propose a novel hybrid spatial-temporal deep learning prediction model to address these issues. Firstly, a unique data normalization technique called reversible instance normalization is employed to solve the problem of different data distributions. Subsequently, a convolutional neural network integrated with the self-attention mechanism is utilized to extract the temporal patterns. Meanwhile, a multi-graph convolutional network is leveraged to model the spatial interactions. Afterward, the extracted temporal and spatial features are fused as input into a fully connected neural network to complete the prediction. Finally, the outputs are denormalized to obtain the ultimate results. The monitoring results of the dynamic trends of process variables in an actual industrial methanol-to-olefins process demonstrate that our model not only achieves superior prediction performance but also can reveal complex spatial-temporal relationships using the learned attention matrices and adjacency matrices, making the model more interpretable. Lastly, this model is deployed onto an end-to-end Industrial Internet Platform, which achieves effective practical results.
Keywords: methanol-to-olefins; process variables prediction; spatial-temporal; self-attention mechanism; graph convolutional network
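The abstract above normalizes each input with reversible instance normalization and denormalizes the model outputs at the end. A minimal sketch of that normalize/denormalize round trip follows, assuming a single input window of shape (time, variables); the function names and the `eps` value are illustrative assumptions, and published versions of the technique typically also include a learnable affine transform, which is omitted here.

```python
import numpy as np

def revin_normalize(x, eps=1e-5):
    """Normalize one window using its own per-variable statistics.

    x: array of shape (T, V), T time steps of V process variables.
    Returns the normalized window and the statistics needed to undo it.
    """
    mean = x.mean(axis=0, keepdims=True)               # per-variable mean
    std = np.sqrt(x.var(axis=0, keepdims=True) + eps)  # per-variable scale
    return (x - mean) / std, (mean, std)

def revin_denormalize(y, stats):
    """Map values in normalized space back to the original scale."""
    mean, std = stats
    return y * std + mean

# Usage: normalize an input window, run a predictor on it (not shown),
# then denormalize the predictor's output with the stored statistics.
window = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]])
normalized, stats = revin_normalize(window)
restored = revin_denormalize(normalized, stats)
```

Because the statistics are computed per window, inputs recorded under different operating conditions are brought to a common scale before the predictor sees them.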