265 articles found
Enhancing Image Description Generation through Deep Reinforcement Learning: Fusing Multiple Visual Features and Reward Mechanisms
1
Authors: Yan Li, Qiyuan Wang, Kaidi Jia. Journal: Computers, Materials & Continua (SCIE, EI), 2024, Issue 2, pp. 2469-2489, 21 pages
The image description task lies at the intersection of computer vision and natural language processing, and it has important prospects, including helping computers understand images and obtaining information for the visually impaired. This study presents an innovative approach employing deep reinforcement learning to enhance the accuracy of natural language descriptions of images. Our method focuses on refining the reward function in deep reinforcement learning, facilitating the generation of precise descriptions by aligning visual and textual features more closely. Our approach comprises three key architectures. Firstly, it utilizes Residual Network 101 (ResNet-101) and Faster Region-based Convolutional Neural Network (Faster R-CNN) to extract average and local image features, respectively, followed by the implementation of a dual attention mechanism for intricate feature fusion. Secondly, the Transformer model is engaged to derive contextual semantic features from textual data. Finally, the generation of descriptive text is executed through a two-layer long short-term memory network (LSTM), directed by the value and reward functions. Compared with the image description method that relies on deep learning, the Bilingual Evaluation Understudy (BLEU-1) score is 0.762, which is 1.6% higher, and the BLEU-4 score is 0.299. Consensus-based Image Description Evaluation (CIDEr) scored 0.998 and Recall-Oriented Understudy for Gisting Evaluation (ROUGE) scored 0.552, the latter an improvement of 0.36%. These results not only attest to the viability of our approach but also highlight its superiority in the realm of image description. Future research can explore the integration of our method with other artificial intelligence (AI) domains, such as emotional AI, to create more nuanced and context-aware systems.
Keywords: image description; deep reinforcement learning; attention mechanism
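The BLEU-1 score reported in the abstract above is built on clipped unigram precision. The following is a minimal sketch of that single ingredient only, assuming whitespace tokenization and one reference sentence; it omits the brevity penalty and higher-order n-grams, and the function name `clipped_unigram_precision` is hypothetical rather than taken from the paper.

```python
from collections import Counter


def clipped_unigram_precision(candidate: str, reference: str) -> float:
    """Fraction of candidate tokens that also occur in the reference.

    Each candidate token is credited at most as many times as it appears
    in the reference (the "clipping" used by BLEU). Illustrative only:
    no brevity penalty, single reference, whitespace tokenization.
    """
    candidate_tokens = candidate.split()
    if not candidate_tokens:
        return 0.0
    reference_counts = Counter(reference.split())
    candidate_counts = Counter(candidate_tokens)
    clipped_matches = sum(
        min(count, reference_counts[token])
        for token, count in candidate_counts.items()
    )
    return clipped_matches / len(candidate_tokens)
```

For instance, a candidate that repeats "the" four times against a reference containing "the" twice is credited with only two matches, which is why clipping prevents degenerate captions from scoring well.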
Liver Tumor Prediction with Advanced Attention Mechanisms Integrated into a Depth-Based Variant Search Algorithm
2
Authors: P. Kalaiselvi, S. Anusuya. Journal: Computers, Materials & Continua (SCIE, EI), 2023, Issue 10, pp. 1209-1226, 18 pages
In recent days, Deep Learning (DL) techniques have become an emerging transformation in the fields of machine learning, artificial intelligence, computer vision, and so on. Subsequently, researchers and industries have widely adopted them in the medical field for predicting and controlling diverse diseases at specific intervals. Liver tumor prediction is a vital task in analyzing and treating liver diseases. This paper proposes a novel approach for predicting liver tumors using Convolutional Neural Networks (CNN) and a depth-based variant search algorithm with advanced attention mechanisms (CNN-DS-AM). The proposed work aims to improve accuracy and robustness in diagnosing and treating liver diseases. The model is assessed on a Computed Tomography (CT) scan dataset containing both benign and malignant liver tumors. The proposed approach achieved high accuracy in predicting liver tumors, outperforming other state-of-the-art methods. Additionally, advanced attention mechanisms were incorporated into the CNN model to enable the identification and highlighting of regions of the CT scans most relevant to predicting liver tumors. The results suggest that incorporating attention mechanisms and a depth-based variant search algorithm into the CNN model is a promising approach for improving the accuracy and robustness of liver tumor prediction. It can assist radiologists in their diagnosis and treatment planning. The proposed system achieved a high accuracy of 95.5% in predicting liver tumors, outperforming other state-of-the-art methods.
Keywords: deep learning; convolution neural networks; liver tumors; CT scans; attention mechanism; classifier
An Efficient 3D CNN Framework with Attention Mechanisms for Alzheimer’s Disease Classification
3
Authors: Athena George, Bejoy Abraham, Neetha George, Linu Shine, Sivakumar Ramachandran. Journal: Computer Systems Science & Engineering (SCIE, EI), 2023, Issue 11, pp. 2097-2118, 22 pages
Neurodegeneration is the gradual deterioration and eventual death of brain cells, leading to progressive loss of structure and function of neurons in the brain and nervous system. Neurodegenerative disorders, such as Alzheimer’s, Huntington’s, Parkinson’s, amyotrophic lateral sclerosis, multiple system atrophy, and multiple sclerosis, are characterized by progressive deterioration of brain function, resulting in symptoms such as memory impairment, movement difficulties, and cognitive decline. Early diagnosis of these conditions is crucial to slowing down cell degeneration and reducing the severity of the diseases. Magnetic resonance imaging (MRI) is widely used by neurologists for diagnosing brain abnormalities. The majority of the research in this field focuses on processing the 2D images extracted from the 3D MRI volumetric scans for disease diagnosis. This might result in losing the volumetric information obtained from the whole-brain MRI. To address this problem, a novel 3D-CNN architecture with an attention mechanism is proposed to classify whole-brain MRI images for Alzheimer’s disease (AD) detection. The 3D-CNN model uses channel and spatial attention mechanisms to extract relevant features and improve accuracy in identifying brain dysfunctions by focusing on specific regions of the brain. The pipeline takes pre-processed MRI volumetric scans as input, and the 3D-CNN model leverages both channel and spatial attention mechanisms to extract precise feature representations of the input MRI volume for accurate classification. The present study utilizes the publicly available Alzheimer’s Disease Neuroimaging Initiative (ADNI) dataset, which has three image classes: Mild Cognitive Impairment (MCI), Cognitive Normal (CN), and AD affected. The proposed approach achieves an overall accuracy of 79% when classifying three classes and an average accuracy of 87% when identifying AD and the other two classes. The findings reveal that 3D-CNN models with an attention mechanism exhibit significantly higher classification performance compared to other models, highlighting the potential of deep learning algorithms to aid in the early detection and prediction of AD.
Keywords: 3D CNN; Alzheimer’s disease; attention mechanism; classification
Deep Neural Network Based Spam Email Classification Using Attention Mechanisms
4
Authors: Md. Tofael Ahmed, Mariam Akter, Md. Saifur Rahman, Maqsudur Rahman, Pintu Chandra Paul, Miss. Nargis Parvin, Almas Hossain Antar. Journal: Journal of Intelligent Learning Systems and Applications, 2023, Issue 4, pp. 144-164, 21 pages
Spam emails pose a threat to individuals. The daily proliferation of spam emails has rendered traditional machine learning and deep learning methods for screening them ineffective and inefficient. In our research, we employ deep neural networks such as RNN, LSTM, and GRU, incorporating attention mechanisms such as Bahdanau, scaled dot product (SDP), and Luong scaled dot product self-attention for spam email filtering. We evaluate our approach on various datasets, including Trec spam, Enron spam emails, SMS spam collections, and the Ling spam dataset, which constitutes a substantial custom dataset. All these datasets are publicly available. For the Enron dataset, we attain an accuracy of 99.97% using LSTM with SDP self-attention. Our custom dataset exhibits the highest accuracy of 99.01% when employing GRU with SDP self-attention. The SMS spam collection dataset yields a peak accuracy of 99.61% with LSTM and SDP attention. Using the GRU (Gated Recurrent Unit) alongside Luong and SDP attention mechanisms, we reach a peak accuracy of 99.89% on the Ling spam dataset. For the Trec spam dataset, the most accurate results are achieved using Luong attention LSTM, with an accuracy rate of 99.01%. Our performance analyses consistently indicate that employing the scaled dot product attention mechanism in conjunction with gated recurrent neural networks (GRU) delivers the most effective results. In summary, our research underscores the efficacy of employing advanced deep learning techniques and attention mechanisms for spam email filtering, with remarkable accuracy across multiple datasets. This approach presents a promising solution to the ever-growing problem of spam emails.
Keywords: Spam Email; Attention Mechanism; Deep Neural Network; Bahdanau Attention; Scale Dot Product
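The scaled dot product (SDP) attention named in the abstract above is usually written as softmax(QK^T / sqrt(d_k))V. Below is a minimal pure-Python sketch of that general formula, not the authors' network: the function names are hypothetical, inputs are plain lists of equal-length float vectors, and no learned projections, masking, or recurrent layers are included.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def scaled_dot_product_attention(queries, keys, values):
    """Return (outputs, weights) for softmax(Q K^T / sqrt(d_k)) V.

    queries: list of d_k-dimensional vectors
    keys:    list of d_k-dimensional vectors (one per value)
    values:  list of d_v-dimensional vectors
    """
    d_k = len(keys[0])
    d_v = len(values[0])
    outputs, weights = [], []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d_k).
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
            for k in keys
        ]
        w = softmax(scores)
        # Weighted sum of the value vectors.
        out = [sum(wj * v[c] for wj, v in zip(w, values)) for c in range(d_v)]
        weights.append(w)
        outputs.append(out)
    return outputs, weights
```

The scaling by sqrt(d_k) keeps the dot products from growing with the vector dimension, which would otherwise push the softmax toward near one-hot weights.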
Aspect-Level Sentiment Analysis Incorporating Semantic and Syntactic Information
5
Authors: Jiachen Yang, Yegang Li, Hao Zhang, Junpeng Hu, Rujiang Bai. Journal: Journal of Computer and Communications, 2024, Issue 1, pp. 191-207, 17 pages
Aiming at the problem that existing models in aspect-level sentiment analysis cannot fully and effectively utilize sentence semantic and syntactic structure information, this paper proposes a graph neural network-based aspect-level sentiment classification model. Self-attention, aspectual word multi-head attention, and dependent syntactic relations are fused, and the node representations are enhanced with graph convolutional networks to enable the model to fully learn the global semantic and syntactic structural information of sentences. Experimental results show that the model performs well on three public benchmark datasets, Rest14, Lap14, and Twitter, improving the accuracy of sentiment classification.
Keywords: Aspect-Level Sentiment Analysis; Attentional Mechanisms; Dependent Syntactic Trees; Graph Convolutional Neural Networks
A Novel Tensor Decomposition-Based Efficient Detector for Low-Altitude Aerial Objects With Knowledge Distillation Scheme
6
Authors: Nianyin Zeng, Xinyu Li, Peishu Wu, Han Li, Xin Luo. Journal: IEEE/CAA Journal of Automatica Sinica (SCIE, EI, CSCD), 2024, Issue 2, pp. 487-501, 15 pages
Unmanned aerial vehicles (UAVs) have gained significant attention in practical applications; in particular, low-altitude aerial (LAA) object detection imposes stringent requirements on recognition accuracy and computational resources. In this paper, the LAA images-oriented tensor decomposition and knowledge distillation-based network (TDKD-Net) is proposed, where the TT-format TD (tensor decomposition) and equal-weighted response-based KD (knowledge distillation) methods are designed to minimize redundant parameters while ensuring comparable performance. Moreover, some robust network structures are developed, including the small object detection head and the dual-domain attention mechanism, which enable the model to leverage the learned knowledge from small-scale targets and selectively focus on salient features. Considering the imbalance of bounding box regression samples and the inaccuracy of regression geometric factors, the focal and efficient IoU (intersection over union) loss with optimal transport assignment (F-EIoU-OTA) mechanism is proposed to improve the detection accuracy. The proposed TDKD-Net is comprehensively evaluated through extensive experiments, and the results demonstrate the effectiveness and superiority of the developed methods in comparison to other advanced detection algorithms, as well as high generalization and strong robustness. As a resource-efficient, precise network, TDKD-Net also handles the complex detection of small and occluded LAA objects well, which provides useful insights on handling imbalanced issues and realizing domain adaptation.
Keywords: attention mechanism; knowledge distillation (KD); object detection; tensor decomposition (TD); unmanned aerial vehicles (UAVs)
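The IoU (intersection over union) quantity that the loss in the abstract above builds on can be sketched as follows. This is only the plain IoU of two axis-aligned boxes, assuming an (x1, y1, x2, y2) corner format; it is not the F-EIoU-OTA mechanism itself, and the function name `box_iou` is hypothetical.

```python
def box_iou(box_a, box_b):
    """Plain IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap rectangle, clamped at zero
    # when the boxes do not intersect.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    intersection = inter_w * inter_h
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - intersection
    return intersection / union if union > 0 else 0.0
```

Two 2x2 boxes offset by one unit in each direction overlap in a 1x1 square, giving an IoU of 1 / (4 + 4 - 1) = 1/7.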
Real-Time Detection and Instance Segmentation of Strawberry in Unstructured Environment
7
Authors: Chengjun Wang, Fan Ding, Yiwen Wang, Renyuan Wu, Xingyu Yao, Chengjie Jiang, Liuyi Ling. Journal: Computers, Materials & Continua (SCIE, EI), 2024, Issue 1, pp. 1481-1501, 21 pages
The real-time detection and instance segmentation of strawberries constitute fundamental components in the development of strawberry harvesting robots. Real-time identification of strawberries in an unstructured environment is a challenging task. Current instance segmentation algorithms for strawberries suffer from issues such as poor real-time performance and low accuracy. To this end, the present study proposes an Efficient YOLACT (E-YOLACT) algorithm for strawberry detection and segmentation based on the YOLACT framework. The key enhancements of the E-YOLACT encompass the development of a lightweight attention mechanism, pyramid squeeze shuffle attention (PSSA), for efficient feature extraction. Additionally, an attention-guided context-feature pyramid network (AC-FPN) is employed instead of FPN to optimize the architecture’s performance. Furthermore, a feature-enhanced model (FEM) is introduced to enhance the prediction head’s capabilities, while efficient fast non-maximum suppression (EF-NMS) is devised to improve non-maximum suppression. The experimental results demonstrate that the E-YOLACT achieves a Box-mAP and Mask-mAP of 77.9 and 76.6, respectively, on the custom dataset. Moreover, it exhibits a category accuracy of 93.5%. Notably, the E-YOLACT also demonstrates real-time detection capability with a speed of 34.8 FPS. The method proposed in this article presents an efficient approach for the vision system of a strawberry-picking robot.
Keywords: YOLACT; real-time detection; instance segmentation; attention mechanism; strawberry
A Study on Enhancing Chip Detection Efficiency Using the Lightweight Van-YOLOv8 Network
8
Authors: Meng Huang, Honglei Wei, Xianyi Zhai. Journal: Computers, Materials & Continua (SCIE, EI), 2024, Issue 4, pp. 531-547, 17 pages
In pursuit of cost-effective manufacturing, enterprises are increasingly adopting the practice of utilizing recycled semiconductor chips. To ensure consistent chip orientation during packaging, a circular marker on the front side is employed for pin alignment following successful functional testing. However, recycled chips often exhibit substantial surface wear, and the identification of the relatively small marker proves challenging. Moreover, the complexity of generic target detection algorithms hampers seamless deployment. Addressing these issues, this paper introduces a lightweight YOLOv8s-based network tailored for detecting markings on recycled chips, termed Van-YOLOv8. Initially, to alleviate the influence of diminutive, low-resolution markings on the precision of deep learning models, we utilize an upscaling approach for enhanced resolution. This technique relies on the Super-Resolution Generative Adversarial Network with Extended Training (SRGANext) network, facilitating the reconstruction of high-fidelity images that align with input specifications. Subsequently, we replace the original YOLOv8s model’s backbone feature extraction network with the lightweight Vanilla Network (VanillaNet), simplifying the branch structure to reduce network parameters. Finally, a Hybrid Attention Mechanism (HAM) is implemented to capture essential details from input images, improving feature representation while concurrently expediting model inference speed. Experimental results demonstrate that the Van-YOLOv8 network outperforms the original YOLOv8s on a recycled chip dataset in various aspects. Significantly, it demonstrates superiority in parameter count, computational complexity, precision in identifying targets, and speed when compared to certain prevalent algorithms in the current landscape. The proposed approach proves promising for real-time detection of recycled chips in practical factory settings.
Keywords: lightweight neural networks; attention mechanisms; image super-resolution enhancement; feature extraction; small object detection
Self-supervised recalibration network for person re-identification
9
Authors: Shaoqi Hou, Zhiming Wang, Zhihua Dong, Ye Li, Zhiguo Wang, Guangqiang Yin, Xinzhong Wang. Journal: Defence Technology (SCIE, EI, CAS, CSCD), 2024, Issue 1, pp. 163-178, 16 pages
The attention mechanism can extract salient features in images, which has been proved to be effective in improving the performance of person re-identification (Re-ID). However, most of the existing attention modules have the following two shortcomings. On the one hand, they mostly use global average pooling to generate context descriptors, without highlighting the guiding role of salient information on descriptor generation, which limits the representational ability of the final attention mask. On the other hand, the design of most attention modules is complicated, which greatly increases the computational cost of the model. To solve these problems, this paper proposes an attention module called the self-supervised recalibration (SR) block, which introduces both global and local information through adaptive weighted fusion to generate a more refined attention mask. In particular, a special "Squeeze-Excitation" (SE) unit is designed in the SR block to further process the generated intermediate masks, both for nonlinearization of the features and for constraining the resulting computation by controlling the number of channels. Furthermore, we combine the SR block with the most commonly used ResNet-50 to construct an instantiation model and verify its effectiveness on multiple Re-ID datasets; in particular, the mean Average Precision (mAP) on the Occluded-Duke dataset exceeds the state-of-the-art (SOTA) algorithm by 4.49%.
Keywords: person re-identification; attention mechanism; global information; local information; adaptive weighted fusion
Scheme Based on Multi-Level Patch Attention and Lesion Localization for Diabetic Retinopathy Grading
10
Authors: Zhuoqun Xia, Hangyu Hu, Wenjing Li, Qisheng Jiang, Lan Pu, Yicong Shu, Arun Kumar Sangaiah. Journal: Computer Modeling in Engineering & Sciences (SCIE, EI), 2024, Issue 7, pp. 409-430, 22 pages
Early screening of diabetic retinopathy (DR) plays an important role in preventing irreversible blindness. Existing research has failed to fully explore effective DR lesion information in fundus maps. Besides, traditional attention schemes have not considered the impact of lesion type differences on grading, resulting in unreasonable extraction of important lesion features. Therefore, this paper proposes a DR diagnosis scheme that integrates a multi-level patch attention generator (MPAG) and a lesion localization module (LLM). Firstly, MPAG is used to predict patches of different sizes and generate a weighted attention map based on the prediction score and the types of lesions contained in the patches. This fully considers the impact of lesion type differences on grading and solves the problem that the attention maps of lesions cannot be further refined and then adapted to the final DR diagnosis task. Secondly, the LLM generates a global attention map based on localization. Finally, the weighted attention map and global attention map are weighted with the fundus map to fully explore effective DR lesion information and increase the attention of the classification network to lesion details. This paper demonstrates the effectiveness of the proposed method through extensive experiments on the public DDR dataset, obtaining an accuracy of 0.8064.
Keywords: DDR dataset; diabetic retinopathy; lesion localization; multi-level patch attention mechanism
Deep Learning for Financial Time Series Prediction: A State-of-the-Art Review of Standalone and Hybrid Models
11
Authors: Weisi Chen, Walayat Hussain, Francesco Cauteruccio, Xu Zhang. Journal: Computer Modeling in Engineering & Sciences (SCIE, EI), 2024, Issue 4, pp. 187-224, 38 pages
Financial time series prediction, whether for classification or regression, has been a heated research topic over the last decade. While traditional machine learning algorithms have achieved mediocre results, deep learning has largely contributed to the elevation of the prediction performance. Currently, an up-to-date review of advanced machine learning techniques for financial time series prediction is still lacking, making it challenging for finance domain experts and relevant practitioners to determine which model potentially performs better, what techniques and components are involved, and how the model can be designed and implemented. This review article provides an overview of techniques, components, and frameworks for financial time series prediction, with an emphasis on state-of-the-art deep learning models in the literature from 2015 to 2023, including standalone models like convolutional neural networks (CNN) that are capable of extracting spatial dependencies within data, and long short-term memory (LSTM) that is designed for handling temporal dependencies; and hybrid models integrating CNN, LSTM, attention mechanism (AM), and other techniques. For illustration and comparison purposes, models proposed in recent studies are mapped to relevant elements of a generalized framework comprised of input, output, feature extraction, prediction, and related processes. Among the state-of-the-art models, hybrid models like CNN-LSTM and CNN-LSTM-AM in general have been reported superior in performance to standalone models like the CNN-only model. Some remaining challenges are discussed, including non-friendliness for finance domain experts, delayed prediction, domain knowledge negligence, lack of standards, and inability to make real-time and high-frequency predictions. The principal contributions of this paper are to provide a one-stop guide for both academia and industry to review, compare, and summarize technologies and recent advances in this area, to facilitate smooth and informed implementation, and to highlight future research directions.
Keywords: financial time series prediction; convolutional neural network; long short-term memory; deep learning; attention mechanism; finance
Combo Packet: An Encryption Traffic Classification Method Based on Contextual Information
12
Authors: Yuancong Chai, Yuefei Zhu, Wei Lin, Ding Li. Journal: Computers, Materials & Continua (SCIE, EI), 2024, Issue 4, pp. 1223-1243, 21 pages
With the increasing proportion of encrypted traffic in cyberspace, the classification of encrypted traffic has become a core key technology in network supervision. In recent years, many different solutions have emerged in this field. Most methods identify and classify traffic by extracting spatiotemporal characteristics of data flows or byte-level features of packets. However, due to changes in data transmission mediums, such as fiber optics and satellites, temporal features can exhibit significant variations due to changes in communication links and transmission quality. Additionally, partial spatial features can change due to reasons like data reordering and retransmission. Faced with these challenges, identifying encrypted traffic solely based on packet byte-level features is significantly difficult. To address this, we propose a universal packet-level encrypted traffic identification method, Combo Packet. This method utilizes convolutional neural networks to extract deep features of the current packet and its contextual information and employs spatial and channel attention mechanisms to select and locate effective features. Experimental data shows that Combo Packet can effectively distinguish between encrypted traffic service categories (e.g., File Transfer Protocol, FTP, and Peer-to-Peer, P2P) and encrypted traffic application categories (e.g., BitTorrent and Skype). Validated on the ISCX VPN-nonVPN dataset, it achieves classification accuracies of 97.0% and 97.1% for service and application categories, respectively. It also provides shorter training times and higher recognition speeds. The performance and recognition capabilities of Combo Packet are significantly superior to the existing classification methods mentioned.
Keywords: encrypted traffic classification; packet-level; convolutional neural network; attention mechanisms
Mobile Crowdsourcing Task Allocation Based on Dynamic Self-Attention GANs
13
Authors: Kai Wei, Song Yu, Qingxian Pan. Journal: Computers, Materials & Continua (SCIE, EI), 2024, Issue 4, pp. 607-622, 16 pages
Crowdsourcing technology is widely recognized for its effectiveness in task scheduling and resource allocation. While traditional methods for task allocation can help reduce costs and improve efficiency, they may encounter challenges when dealing with abnormal data flow nodes, leading to decreased allocation accuracy and efficiency. To address these issues, this study proposes a novel two-part invalid detection task allocation framework. In the first step, an anomaly detection model is developed using a dynamic self-attentive GAN to identify anomalous data. Compared to the baseline method, the model achieves an approximately 4% increase in the F1 value on the public dataset. In the second step of the framework, task allocation modeling is performed using a two-part graph matching method. This phase introduces a P-queue KM algorithm that implements a more efficient optimization strategy. The allocation efficiency is improved by approximately 23.83% compared to the baseline method. Empirical results confirm the effectiveness of the proposed framework in detecting abnormal data nodes, enhancing allocation precision, and achieving efficient allocation.
Keywords: mobile crowdsourcing; task allocation; anomaly detection; GAN; attention mechanisms
YOLO-MFD: Remote Sensing Image Object Detection with Multi-Scale Fusion Dynamic Head
14
Authors: Zhongyuan Zhang, Wenqiu Zhu. Journal: Computers, Materials & Continua (SCIE, EI), 2024, Issue 5, pp. 2547-2563, 17 pages
Remote sensing imagery, due to its high altitude, presents inherent challenges characterized by multiple scales, limited target areas, and intricate backgrounds. These inherent traits often lead to increased missed and false detection rates when applying object recognition algorithms tailored for remote sensing imagery. Additionally, these complexities contribute to inaccuracies in target localization and hinder precise target categorization. This paper addresses these challenges by proposing a solution: the YOLO-MFD model (YOLO-MFD: Remote Sensing Image Object Detection with Multi-scale Fusion Dynamic Head). Before presenting our method, we delve into the prevalent issues faced in remote sensing imagery analysis. Specifically, we emphasize the struggles of existing object recognition algorithms in comprehensively capturing critical image features amidst varying scales and complex backgrounds. To resolve these issues, we introduce a novel approach. First, we propose the implementation of a lightweight multi-scale module called CEF. This module significantly improves the model’s ability to comprehensively capture important image features by merging multi-scale feature information. It effectively addresses the issues of missed detection and false alarms that are common in remote sensing imagery. Second, an additional layer of small target detection heads is added, and a residual link is established with the higher-level feature extraction module in the backbone section. This allows the model to incorporate shallower information, significantly improving the accuracy of target localization in remotely sensed images. Finally, a dynamic head attention mechanism is introduced. This allows the model to exhibit greater flexibility and accuracy in recognizing shapes and targets of different sizes. Consequently, the precision of object detection is significantly improved. The experimental results show that the YOLO-MFD model improves on the original YOLOv8 model by 6.3%, 3.5%, and 2.5% in Precision, mAP@0.5, and mAP@0.5:0.95, respectively. These results illustrate the clear advantages of the method.
Keywords: object detection; YOLOv8; multi-scale; attention mechanism; dynamic detection head
Predicting Traffic Flow Using Dynamic Spatial-Temporal Graph Convolution Networks
15
作者 Yunchang Liu Fei Wan Chengwu Liang 《Computers, Materials & Continua》 SCIE EI 2024年第3期4343-4361,共19页
Traffic flow prediction plays a key role in the construction of intelligent transportation system.However,due to its complex spatio-temporal dependence and its uncertainty,the research becomes very challenging.Most of... Traffic flow prediction plays a key role in the construction of intelligent transportation system.However,due to its complex spatio-temporal dependence and its uncertainty,the research becomes very challenging.Most of the existing studies are based on graph neural networks that model traffic flow graphs and try to use fixed graph structure to deal with the relationship between nodes.However,due to the time-varying spatial correlation of the traffic network,there is no fixed node relationship,and these methods cannot effectively integrate the temporal and spatial features.This paper proposes a novel temporal-spatial dynamic graph convolutional network(TSADGCN).The dynamic time warping algorithm(DTW)is introduced to calculate the similarity of traffic flow sequence among network nodes in the time dimension,and the spatiotemporal graph of traffic flow is constructed to capture the spatiotemporal characteristics and dependencies of traffic flow.By combining graph attention network and time attention network,a spatiotemporal convolution block is constructed to capture spatiotemporal characteristics of traffic data.Experiments on open data sets PEMSD4 and PEMSD8 show that TSADGCN has higher prediction accuracy than well-known traffic flow prediction algorithms. 展开更多
Keywords: Intelligent transportation; graph convolutional network; traffic flow; DTW algorithm; attention mechanism
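The DTW step mentioned in the abstract above can be illustrated with a minimal sketch. This is the textbook dynamic-programming form of DTW with an absolute-difference local cost, not the TSADGCN implementation. The function names `dtw_distance` and `dtw_adjacency`, and the thresholding rule used to turn distances into graph edges, are hypothetical choices made only for illustration.

```python
def dtw_distance(a, b):
    """Textbook DTW distance between two numeric sequences, O(len(a) * len(b))."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # a advances, b repeats
                cost[i][j - 1],      # b advances, a repeats
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


def dtw_adjacency(series, threshold):
    """Hypothetical rule: connect two nodes when their DTW distance <= threshold."""
    k = len(series)
    adj = [[0] * k for _ in range(k)]
    for i in range(k):
        for j in range(i + 1, k):
            if dtw_distance(series[i], series[j]) <= threshold:
                adj[i][j] = adj[j][i] = 1
    return adj
```

Under these assumptions, two sensors whose flow curves have the same shape but are shifted or stretched in time still receive a small distance, which is the property that makes DTW attractive for building a similarity graph over traffic nodes.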
A Cover-Independent Deep Image Hiding Method Based on Domain Attention Mechanism
16
Authors: Nannan Wu, Xianyi Chen, James Msughter Adeke, Junjie Zhao. Computers, Materials & Continua (SCIE, EI), 2024, Issue 3, pp. 3001-3019 (19 pages)
Recently, deep image-hiding techniques have attracted considerable attention in covert communication and high-capacity information hiding. However, these approaches have some limitations: the cover image may lack self-adaptability, information may leak, or concealment may be weak. To address these issues, this study proposes a universal and adaptable image-hiding method. First, a domain attention mechanism is designed by incorporating atrous convolution, which makes better use of the relationship between the secret image domain and the cover image domain. Second, to improve similarity as perceived by humans, a perceptual loss is incorporated into the training process. The experimental results are promising: the proposed method achieves an average pixel discrepancy (APD) of 1.83 and a peak signal-to-noise ratio (PSNR) of 40.72 dB between the cover and stego images, indicating high-quality output. Furthermore, the structural similarity index measure (SSIM) reaches 0.985, while the learned perceptual image patch similarity (LPIPS) registers at 0.0001. Moreover, self-testing and cross-experiments demonstrate the model's adaptability and generalization in unknown hidden spaces, making it suitable for diverse computer vision tasks.
Keywords: Deep image hiding; attention mechanism; privacy protection; data security; visual quality
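To make the two pixel-level metrics reported above concrete, here is a minimal sketch of how PSNR and an average pixel discrepancy could be computed for 8-bit images flattened into lists. The PSNR formula is the standard one. Treating APD as the mean absolute pixel difference is an assumption, since the abstract does not define it, and the function names are hypothetical.

```python
import math


def _check_same_length(cover, stego):
    if len(cover) != len(stego) or not cover:
        raise ValueError("images must be non-empty and the same size")


def psnr(cover, stego, max_value=255.0):
    """Standard PSNR in dB: 10 * log10(MAX^2 / MSE)."""
    _check_same_length(cover, stego)
    mse = sum((c - s) ** 2 for c, s in zip(cover, stego)) / len(cover)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


def average_pixel_discrepancy(cover, stego):
    """Assumed definition: mean absolute difference per pixel."""
    _check_same_length(cover, stego)
    return sum(abs(c - s) for c, s in zip(cover, stego)) / len(cover)
```

A higher PSNR and a lower APD both indicate that the stego image is closer to the cover image, which is the direction of the figures quoted in the abstract.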
Fake News Detection Based on Text-Modal Dominance and Fusing Multiple Multi-Model Clues
17
Authors: Lifang Fu, Huanxin Peng, Changjin Ma, Yuhan Liu. Computers, Materials & Continua (SCIE, EI), 2024, Issue 3, pp. 4399-4416 (18 pages)
In recent years, efficiently and accurately identifying multi-modal fake news has become more challenging. First, multi-modal data provides more evidence, but not all of it is equally important. Second, social structure information has proven effective in fake news detection, and combining it while reducing noise is critical. Unfortunately, existing approaches fail to handle these problems. This paper proposes a multi-modal fake news detection framework based on Text-modal Dominance and fusing Multiple Multi-modal Cues (TD-MMC), which utilizes three valuable multi-modal clues: text-modal importance, text-image complementarity, and text-image inconsistency. TD-MMC is dominated by textual content and assisted by image information, while using social network information to enhance the text representation. To reduce interference from irrelevant social structure information, we use a unidirectional cross-modal attention mechanism to selectively learn the social structure's features. A cross-modal attention mechanism is adopted to obtain text-image cross-modal features while retaining textual features, reducing the loss of important information. In addition, TD-MMC employs a new multi-modal loss to improve the model's generalization ability. Extensive experiments have been conducted on two public real-world English and Chinese datasets, and the results show that the proposed model outperforms state-of-the-art methods on classification evaluation metrics.
Keywords: Fake news detection; cross-modal attention mechanism; multi-modal fusion; social network; transfer learning
Multi-scale persistent spatiotemporal transformer for long-term urban traffic flow prediction
18
Authors: Jia-Jun Zhong, Yong Ma, Xin-Zheng Niu, Philippe Fournier-Viger, Bing Wang, Zu-kuan Wei. Journal of Electronic Science and Technology (EI, CAS, CSCD), 2024, Issue 1, pp. 53-69 (17 pages)
Long-term urban traffic flow prediction is an important task in the field of intelligent transportation, as it can help optimize traffic management and improve travel efficiency. To improve prediction accuracy, a crucial issue is how to model spatiotemporal dependency in urban traffic data. In recent years, many studies have adopted spatiotemporal neural networks to extract key information from traffic data. However, most models ignore the semantic spatial similarity between long-distance areas when mining spatial dependency. They also ignore the impact of predicted time steps on the next unpredicted time step when making long-term predictions. Moreover, these models lack a comprehensive data embedding process to represent complex spatiotemporal dependency. To address these issues, this paper proposes a multi-scale persistent spatiotemporal transformer (MSPSTT) model for accurate long-term traffic flow prediction in cities. MSPSTT adopts an encoder-decoder structure and incorporates temporal, periodic, and spatial features to fully embed urban traffic data. The model consists of a spatiotemporal encoder and a spatiotemporal decoder, which rely on temporal, geospatial, and semantic space multi-head attention modules to dynamically extract temporal, geospatial, and semantic characteristics. The spatiotemporal decoder combines the context information provided by the encoder, integrates the predicted time step information, and is iteratively updated to learn the correlation between different time steps over a broader time range, improving the model's accuracy for long-term prediction. Experiments on four public transportation datasets demonstrate that MSPSTT outperforms existing models by up to 9.5% on three common metrics.
Keywords: Graph neural network; multi-head attention mechanism; spatio-temporal dependency; traffic flow prediction
Multi-Perspective Data Fusion Framework Based on Hierarchical BERT: Provide Visual Predictions of Business Processes
19
Authors: Yongwang Yuan, Xiangwei Liu, Ke Lu. Computers, Materials & Continua (SCIE, EI), 2024, Issue 1, pp. 1227-1252 (26 pages)
Predictive Business Process Monitoring (PBPM) is a significant research area in Business Process Management (BPM) aimed at accurately forecasting future behavioral events. At present, deep learning methods are widely used in PBPM research, but no method has been effective in fusing data information into the control flow for multi-perspective process prediction. Therefore, this paper proposes a process prediction method based on hierarchical BERT and multi-perspective data fusion. First, the first-layer BERT network learns the correlations between different categories of attribute data. Then, the attribute data is integrated into a weighted event-level feature vector and input into the second-layer BERT network to learn the impact and priority relationship of each event on future predicted events. Next, the multi-head attention mechanism within the framework is visualized for analysis, helping to explain the decision-making logic of the framework and providing visual predictions. Finally, experimental results show that the predictive accuracy of the framework surpasses current state-of-the-art methods and significantly enhances the predictive performance of BPM.
Keywords: Business process prediction monitoring; deep learning; attention mechanism; BERT; multi-perspective
A Method for Detecting and Recognizing Yi Character Based on Deep Learning
20
Authors: Haipeng Sun, Xueyan Ding, Jian Sun, Hua Yu, Jianxin Zhang. Computers, Materials & Continua (SCIE, EI), 2024, Issue 2, pp. 2721-2739 (19 pages)
To address the absence of a labeled dataset for Yi characters and the complexity of Yi character detection and recognition, we present a deep learning-based approach for Yi character detection and recognition. In the detection stage, an improved Differentiable Binarization Network (DBNet) framework is introduced to detect Yi characters, in which Omni-dimensional Dynamic Convolution (ODConv) is combined with the ResNet-18 feature extraction module to obtain multi-dimensional complementary features, thereby improving the accuracy of Yi character detection. Then, the feature pyramid network fusion module is used to further extract Yi character image features, improving target recognition at different scales. Next, the resulting feature map is passed through a head network to produce two maps: a probability map and an adaptive threshold map, each the same size as the original image. These maps are then subjected to a differentiable binarization process, resulting in an approximate binarization map, which helps to identify the boundaries of the text boxes. Finally, the text detection boxes are generated in the post-processing stage. In the recognition stage, an improved lightweight MobileNetV3 framework is used to recognize the detected character regions, where the original Squeeze-and-Excitation (SE) block is replaced by the efficient Shuffle Attention (SA) block, which integrates spatial and channel attention, improving the accuracy of Yi character recognition. Meanwhile, depthwise separable convolution and the inverted residual structure reduce the model's parameter count and computation, so that the model can better capture contextual information and improve the accuracy of text recognition. The experimental results show that the proposed method achieves good results in detecting and recognizing Yi characters, with detection and recognition accuracy rates of 97.5% and 96.8%, respectively. We also compared the proposed detection and recognition algorithms with other typical algorithms; in these comparisons, the proposed model achieves better detection and recognition results with a certain degree of reliability.
Keywords: Yi characters; text detection; text recognition; attention mechanism; deep neural network
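The differentiable binarization step described in the abstract above can be sketched as follows. The formula is the one commonly associated with DBNet: a steep sigmoid of the difference between the probability map and the threshold map. This is an illustration under stated assumptions, not the authors' code. The function name and the default amplification factor `k = 50` are assumptions, and maps are represented as nested lists with values in [0, 1].

```python
import math


def differentiable_binarization(prob_map, thresh_map, k=50.0):
    """Approximate binary map: B = 1 / (1 + exp(-k * (P - T))), element-wise."""
    if len(prob_map) != len(thresh_map):
        raise ValueError("maps must have the same number of rows")
    result = []
    for prob_row, thresh_row in zip(prob_map, thresh_map):
        if len(prob_row) != len(thresh_row):
            raise ValueError("maps must have the same number of columns")
        result.append([
            1.0 / (1.0 + math.exp(-k * (p - t)))
            for p, t in zip(prob_row, thresh_row)
        ])
    return result
```

Because the sigmoid is smooth, gradients can flow through the thresholding step during training, while a large `k` keeps the output close to a hard 0/1 map wherever the probability is clearly above or below the learned threshold.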