59,358 articles found
Design methodology of a mini-missile considering flight performance and guidance precision
1
Authors: ZHANG Licong, GONG Chunlin, SU Hua, ANDREA Da Ronch. Journal of Systems Engineering and Electronics (SCIE, EI, CSCD), 2024, No. 1, pp. 195-210 (16 pages).
The design of mini-missiles (MMs) presents several novel challenges. The stringent mission requirement to reach a target with a certain precision imposes a high guidance precision. The miniaturization of MMs means that the design of the guidance, navigation, and control (GNC) system has a larger-than-before impact on the main-body design (shape, motor, and layout design) and its design objective, i.e., flight performance. In pursuing a trade-off between flight performance and guidance precision, all the relevant interactions have to be accounted for in the design of the main body and the GNC system. Herein, a multi-objective and multidisciplinary design optimization (MDO) is proposed. Disciplines pertinent to motor, aerodynamics, layout, trajectory, flight dynamics, control, and guidance are included in the proposed MDO framework. The optimization problem seeks to maximize the range and minimize the guidance error. The problem is solved by using the nondominated sorting genetic algorithm II. An optimum design that balances a longer range with a smaller guidance error is obtained. Finally, lessons learned about the design of the MM and insights into the trade-off between flight performance and guidance precision are given by comparing the optimum design to a design provided by the traditional approach.
Keywords: mini-missiles (MMs); guidance, navigation, and control (GNC) system; multi-objective optimization; multidisciplinary design optimization (MDO); flight performance; guidance precision
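The abstract above frames the design as a two-objective problem: maximize range and minimize guidance error. As a rough illustration of the non-dominated (Pareto) filtering at the core of NSGA-II-style algorithms, the sketch below keeps only the designs that no other design beats on both objectives. The function names and the sample design values are hypothetical and are not taken from the paper; this is not the authors' implementation.

```python
from typing import List, Tuple

# A design is summarized here as (range, guidance_error); both values are made up.
Design = Tuple[float, float]


def dominates(a: Design, b: Design) -> bool:
    """Return True if a is at least as good as b on both objectives and better on one.

    Range is maximized; guidance error is minimized.
    """
    no_worse = a[0] >= b[0] and a[1] <= b[1]
    strictly_better = a[0] > b[0] or a[1] < b[1]
    return no_worse and strictly_better


def pareto_front(designs: List[Design]) -> List[Design]:
    """Keep the designs that are not dominated by any other design."""
    return [d for d in designs if not any(dominates(o, d) for o in designs)]


designs = [(10.0, 2.0), (12.0, 3.0), (9.0, 4.0), (14.0, 5.0), (11.0, 3.5)]
front = pareto_front(designs)
print(front)
```

A full NSGA-II run would additionally rank the dominated designs into successive fronts and use crowding distance to keep the population diverse; only the first front is shown here.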
A semantic segmentation-based underwater acoustic image transmission framework for cooperative SLAM
2
Authors: Jiaxu Li, Guangyao Han, Shuai Chang, Xiaomei Fu. Defence Technology (防务技术) (SCIE, EI, CAS, CSCD), 2024, No. 3, pp. 339-351 (13 pages).
With the development of underwater sonar detection technology, the simultaneous localization and mapping (SLAM) approach has attracted much attention in the underwater navigation field in recent years. But the weak detection ability of a single vehicle limits the SLAM performance in wide areas. Thereby, cooperative SLAM using multiple vehicles has become an important research direction. The key factor of cooperative SLAM is timely and efficient sonar image transmission among underwater vehicles. However, the limited bandwidth of underwater acoustic channels conflicts with the large amount of sonar image data. It is essential to compress the images before transmission. Recently, deep neural networks have shown great value in image compression by virtue of their powerful learning ability, but the existing sonar image compression methods based on neural networks usually focus on pixel-level information without semantic-level information. In this paper, we propose a novel underwater acoustic transmission scheme called UAT-SSIC that includes a semantic segmentation-based sonar image compression (SSIC) framework and a joint source-channel codec, to improve the accuracy of the semantic information of the reconstructed sonar image at the receiver. The SSIC framework consists of an Auto-Encoder structure-based sonar image compression network, which is measured by a semantic segmentation network's residual. Considering that sonar images have blurred target edges, the semantic segmentation network uses a special dilated convolution neural network (DiCNN) to enhance segmentation accuracy by expanding the range of receptive fields. A joint source-channel codec with unequal error protection is proposed that adjusts the power level of the transmitted data to deal with sonar image transmission errors caused by the severe underwater acoustic channel. Experimental results demonstrate that our method preserves more semantic information, with advantages over existing methods at the same compression ratio. It also improves the error tolerance and packet loss resistance of transmission.
Keywords: semantic segmentation; sonar image transmission; learning-based compression
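The abstract above notes that dilated convolutions expand the receptive field. A small sketch of the standard receptive-field arithmetic for a stack of stride-1 convolutions makes this concrete: each layer with kernel size k and dilation d adds (k - 1) * d to the receptive field. The layer configurations below are illustrative assumptions, not the DiCNN architecture from the paper.

```python
from typing import Sequence


def receptive_field(kernel_size: int, dilations: Sequence[int]) -> int:
    """Receptive field (along one axis) of stacked stride-1 convolutions.

    Each layer widens the field by (kernel_size - 1) * dilation.
    """
    field = 1
    for dilation in dilations:
        field += (kernel_size - 1) * dilation
    return field


# Three plain 3x3 layers versus three 3x3 layers with dilations 1, 2, 4.
plain = receptive_field(3, [1, 1, 1])
dilated = receptive_field(3, [1, 2, 4])
print(plain, dilated)
```

With the same number of layers and weights, the dilated stack sees a window roughly twice as wide, which is the effect the abstract relies on for blurred target edges.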
A Video Captioning Method by Semantic Topic-Guided Generation
3
Authors: Ou Ye, Xinli Wei, Zhenhua Yu, Yan Fu, Ying Yang. Computers, Materials & Continua (SCIE, EI), 2024, No. 1, pp. 1071-1093 (23 pages).
In video captioning methods based on an encoder-decoder, limited visual features are extracted by an encoder, and a natural sentence describing the video content is generated using a decoder. However, this kind of method depends on a single video input source and few visual labels, and there is a problem with semantic alignment between video contents and generated natural sentences, which makes such methods unsuitable for accurately comprehending and describing the video contents. To address this issue, this paper proposes a video captioning method by semantic topic-guided generation. First, a 3D convolutional neural network is utilized to extract the spatiotemporal features of videos during the encoding. Then, the semantic topics of video data are extracted using the visual labels retrieved from similar video data. In the decoding, a decoder is constructed by combining a novel Enhance-TopK sampling algorithm with a Generative Pre-trained Transformer-2 deep neural network, which decreases the influence of “deviation” in the semantic mapping process between videos and texts by jointly decoding a baseline and semantic topics of video contents. During this process, the designed Enhance-TopK sampling algorithm can alleviate a long-tail problem by dynamically adjusting the probability distribution of the predicted words. Finally, experiments are conducted on two public datasets, Microsoft Research Video Description and Microsoft Research-Video to Text. The experimental results demonstrate that the proposed method outperforms several state-of-the-art approaches. Specifically, the performance indicators Bilingual Evaluation Understudy, Metric for Evaluation of Translation with Explicit Ordering, Recall Oriented Understudy for Gisting Evaluation-longest common subsequence, and Consensus-based Image Description Evaluation of the proposed method are improved by 1.2%, 0.1%, 0.3%, and 2.4% on the Microsoft Research Video Description dataset, and 0.1%, 1.0%, 0.1%, and 2.8% on the Microsoft Research-Video to Text dataset, respectively, compared with the existing video captioning methods. As a result, the proposed method can generate video captions that are more closely aligned with human natural language expression habits.
Keywords: video captioning; encoder-decoder; semantic topic; jointly decoding; Enhance-TopK sampling
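The abstract does not specify how Enhance-TopK adjusts the word distribution, so the sketch below shows only plain top-k sampling, the baseline technique the name builds on: keep the k highest-scoring tokens, renormalize with a softmax, and sample among them. The function name, the logits, and the seed are assumptions for illustration and do not reproduce the paper's algorithm.

```python
import math
import random
from typing import Sequence


def top_k_sample(logits: Sequence[float], k: int, rng: random.Random) -> int:
    """Sample a token index from the k highest-scoring entries of logits.

    This is ordinary top-k sampling, not the Enhance-TopK variant.
    """
    # Indices of the k largest logits.
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    # Softmax over the kept logits, shifted by the maximum for numerical stability.
    peak = max(logits[i] for i in top)
    weights = [math.exp(logits[i] - peak) for i in top]
    return rng.choices(top, weights=weights, k=1)[0]


rng = random.Random(0)
logits = [2.0, 1.0, 0.5, -1.0, -3.0]
samples = [top_k_sample(logits, 2, rng) for _ in range(10)]
print(samples)
```

Truncating to the top k entries removes the long, low-probability tail entirely; a method that dynamically reshapes the distribution, as the abstract describes, would replace the fixed truncation step.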
Recorded recurrent deep reinforcement learning guidance laws for intercepting endoatmospheric maneuvering missiles
4
Authors: Xiaoqi Qiu, Peng Lai, Changsheng Gao, Wuxing Jing. Defence Technology (防务技术) (SCIE, EI, CAS, CSCD), 2024, No. 1, pp. 457-470 (14 pages).
This work proposes a recorded recurrent twin delayed deep deterministic (RRTD3) policy gradient algorithm to solve the challenge of constructing guidance laws for intercepting endoatmospheric maneuvering missiles with uncertainties and observation noise. The attack-defense engagement scenario is modeled as a partially observable Markov decision process (POMDP). Given the benefits of recurrent neural networks (RNNs) in processing sequence information, an RNN layer is incorporated into the agent’s policy network to alleviate the bottleneck of traditional deep reinforcement learning methods when dealing with POMDPs. The measurements from the interceptor’s seeker during each guidance cycle are combined into one sequence as the input to the policy network, since the detection frequency of an interceptor is usually higher than its guidance frequency. During training, the hidden states of the RNN layer in the policy network are recorded to overcome the partially observable problem that this RNN layer causes inside the agent. The training curves show that the proposed RRTD3 successfully enhances data efficiency, training speed, and training stability. The test results confirm the advantages of the RRTD3-based guidance laws over some conventional guidance laws.
Keywords: endoatmospheric interception; missile guidance; reinforcement learning; Markov decision process; recurrent neural networks
A Random Fusion of Mix 3D and Polar Mix to Improve Semantic Segmentation Performance in 3D Lidar Point Cloud
5
Authors: Bo Liu, Li Feng, Yufeng Chen. Computer Modeling in Engineering & Sciences (SCIE, EI), 2024, No. 7, pp. 845-862 (18 pages).
This paper focuses on the effective utilization of data augmentation techniques for 3D lidar point clouds to enhance the performance of neural network models. These point clouds, which represent spatial information through a collection of 3D coordinates, have found wide-ranging applications. Data augmentation has emerged as a potent solution to the challenges posed by limited labeled data and the need to enhance model generalization capabilities. Much of the existing research is devoted to crafting novel data augmentation methods specifically for 3D lidar point clouds. However, there has been a lack of focus on making the most of the numerous existing augmentation techniques. Addressing this deficiency, this research investigates the possibility of combining two fundamental data augmentation strategies. The paper introduces PolarMix and Mix3D, two commonly employed augmentation techniques, and presents a new approach, named RandomFusion. Instead of using a fixed or predetermined combination of augmentation methods, RandomFusion randomly chooses one method from a pool of options for each instance or sample. This innovative data augmentation technique randomly augments each point in the point cloud with either PolarMix or Mix3D. The crux of this strategy is the random choice between PolarMix and Mix3D for the augmentation of each point within the point cloud data set. The results of the experiments conducted validate the efficacy of the RandomFusion strategy in enhancing the performance of neural network models for 3D lidar point cloud semantic segmentation tasks. This is achieved without compromising computational efficiency. By examining the potential of merging different augmentation techniques, the research contributes significantly to a more comprehensive understanding of how to utilize existing augmentation methods for 3D lidar point clouds. The RandomFusion data augmentation technique offers a simple yet effective method to leverage the diversity of augmentation techniques and boost the robustness of models. The insights gained from this research can pave the way for future work aimed at developing more advanced and efficient data augmentation strategies for 3D lidar point cloud analysis.
Keywords: 3D lidar point cloud; data augmentation; RandomFusion; semantic segmentation
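The selection step described above, randomly picking one augmentation from a pool for each sample, can be sketched in a few lines. The abstract describes the choice both per sample and per point; the sketch below assumes the per-sample reading. The two augmentation functions are simplified, hypothetical stand-ins (scene concatenation and an azimuth-sector swap) and are not the real PolarMix or Mix3D implementations.

```python
import math
import random
from typing import List, Tuple

# A point is (x, y, z, label); the scans below are toy data.
Point = Tuple[float, float, float, str]
Scan = List[Point]


def mix3d_like(scan_a: Scan, scan_b: Scan) -> Scan:
    """Stand-in for Mix3D: merge two scans into a single scene."""
    return scan_a + scan_b


def polarmix_like(scan_a: Scan, scan_b: Scan,
                  start: float = 0.0, end: float = math.pi) -> Scan:
    """Stand-in for PolarMix: replace one azimuth sector of scan_a with scan_b's."""
    def in_sector(point: Point) -> bool:
        return start <= math.atan2(point[1], point[0]) < end

    kept = [p for p in scan_a if not in_sector(p)]
    swapped_in = [p for p in scan_b if in_sector(p)]
    return kept + swapped_in


def random_fusion(scan_a: Scan, scan_b: Scan, rng: random.Random) -> Scan:
    """Pick one augmentation at random for this sample and apply it."""
    augment = rng.choice([mix3d_like, polarmix_like])
    return augment(scan_a, scan_b)


scan_a = [(1.0, 1.0, 0.0, "car"), (1.0, -1.0, 0.0, "road")]
scan_b = [(-1.0, 1.0, 0.0, "tree"), (-1.0, -1.0, 0.0, "wall")]
print(random_fusion(scan_a, scan_b, random.Random(0)))
```

Because the choice is made independently for every training sample, the model sees both kinds of augmented scene over an epoch without any extra cost per sample beyond the chosen augmentation.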
Generative Multi-Modal Mutual Enhancement Video Semantic Communications
6
Authors: Yuanle Chen, Haobo Wang, Chunyu Liu, Linyi Wang, Jiaxin Liu, Wei Wu. Computer Modeling in Engineering & Sciences (SCIE, EI), 2024, No. 6, pp. 2985-3009 (25 pages).
Recently, there have been significant advancements in the study of semantic communication in single-modal scenarios. However, the ability to process information in multi-modal environments remains limited. Inspired by the research and applications of natural language processing across different modalities, our goal is to accurately extract frame-level semantic information from videos and ultimately transmit high-quality videos. Specifically, we propose a deep learning-based Multi-Modal Mutual Enhancement Video Semantic Communication system, called M3E-VSC. Built upon a Vector Quantized Generative Adversarial Network (VQGAN), our system aims to leverage mutual enhancement among different modalities by using text as the main carrier of transmission. With it, semantic information can be extracted from key-frame images and audio of the video, and a differential value operation is performed to ensure that the extracted text conveys accurate semantic information with fewer bits, thus improving the capacity of the system. Furthermore, a multi-frame semantic detection module is designed to facilitate semantic transitions during video generation. Simulation results demonstrate that our proposed model maintains high robustness in complex noise environments, particularly in low signal-to-noise ratio conditions, significantly improving the accuracy and speed of semantic transmission in video communication by approximately 50 percent.
Keywords: generative adversarial networks; multi-modal mutual enhancement; video semantic transmission; deep learning
A Joint Entity Relation Extraction Model Based on Relation Semantic Template Automatically Constructed
7
Authors: Wei Liu, Meijuan Yin, Jialong Zhang, Lunchong Cui. Computers, Materials & Continua (SCIE, EI), 2024, No. 1, pp. 975-997 (23 pages).
The joint entity relation extraction model that integrates the semantic information of relations is favored by researchers because of its effectiveness in handling overlapping entities, and the method of manually defining the semantic template of a relation is particularly prominent in extraction effect because it can obtain the deep semantic information of the relation. However, this method has some problems, such as reliance on expert experience and poor portability. Inspired by the rule-based entity relation extraction method, this paper proposes a joint entity relation extraction model based on an automatically constructed relation semantic template, abbreviated as RSTAC. This model refines the extraction rules of relation semantic templates from a relation corpus through dependency parsing and realizes the automatic construction of relation semantic templates. Based on the relation semantic template, the process of relation classification and triplet extraction is constrained, and finally, the entity relation triplet is obtained. The experimental results on the three major Chinese datasets DuIE, SanWen, and FinRE show that the RSTAC model successfully obtains rich deep semantics of relations and improves the extraction of entity relation triples, and that the F1 scores are increased by an average of 0.96% compared with classical joint extraction models such as CasRel, TPLinker, and RFBFN.
Keywords: natural language processing; deep learning; information extraction; relation extraction; relation semantic template
SHEL: a semantically enhanced hardware-friendly entity linking method
8
Authors: 亓东林, CHEN Shudong, DU Rong, TONG Da, YU Yong. High Technology Letters (EI, CAS), 2024, No. 1, pp. 13-22 (10 pages).
With the help of pre-trained language models, the accuracy of the entity linking task has made great strides in recent years. However, most models with excellent performance require fine-tuning on a large amount of training data using large pre-trained language models, which imposes a hardware threshold for accomplishing this task. Some researchers have achieved competitive results with less training data through ingenious methods, such as utilizing information provided by the named entity recognition model. This paper presents a novel semantic-enhancement-based entity linking approach, named semantically enhanced hardware-friendly entity linking (SHEL), which is designed to be hardware friendly and efficient while maintaining good performance. Specifically, SHEL's semantic enhancement approach consists of three aspects: (1) semantic compression of entity descriptions using a text summarization model; (2) maximizing the capture of mention contexts using asymmetric heuristics; (3) calculating a fixed-size mention representation through pooling operations. This series of semantic enhancement methods effectively improves the model's ability to capture semantic information while taking into account the hardware constraints, and significantly improves the model's convergence speed by more than 50% compared with the strong baseline model proposed in this paper. In terms of performance, SHEL is comparable to the previous method, with superior performance on six well-established datasets, even though SHEL is trained using a smaller pre-trained language model as the encoder.
Keywords: entity linking (EL); pre-trained models; knowledge graph; text summarization; semantic enhancement
Trajectory tracking guidance of interceptor via prescribed performance integral sliding mode with neural network disturbance observer
9
Authors: Wenxue Chen, Yudong Hu, Changsheng Gao, Ruoming An. Defence Technology (防务技术) (SCIE, EI, CAS, CSCD), 2024, No. 2, pp. 412-429 (18 pages).
This paper investigates the trajectory tracking guidance problem of interception missiles under wind field and external disturbances in the boost phase. Indeed, velocity control in such trajectory tracking guidance systems of missiles is challenging. As our contribution, the velocity control channel is designed to deal with the intractable velocity problem and improve tracking accuracy. The global prescribed performance function, which guarantees that the tracking error stays within the set range and that the tracking guidance system converges globally, is first proposed based on the traditional PPF. Then, a tracking guidance strategy is derived using integral sliding mode control techniques to make the sliding manifold and tracking errors converge to zero and to avoid singularities. Meanwhile, an improved switching control law is introduced into the designed tracking guidance algorithm to deal with the chattering problem. A back propagation neural network (BPNN) extended state observer (BPNNESO) is employed in the inner loop to identify disturbances. The obtained results indicate that the proposed tracking guidance approach achieves the trajectory tracking guidance objective both without and with disturbances and outperforms the existing tracking guidance schemes with the lowest tracking errors, convergence times, and overshoots.
Keywords: BP neural network; integral sliding mode control (ISMC); missile defense; prescribed performance function (PPF); state observer; tracking guidance system
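For orientation, the traditional PPF that the abstract builds on is usually written in the prescribed performance control literature as an exponentially decaying envelope around the tracking error. The form below is that commonly used textbook expression, given here as an assumption; the abstract does not state the paper's global PPF, and the symbols are illustrative.

```latex
% Traditional prescribed performance function (commonly used form; symbols are assumptions)
\rho(t) = (\rho_0 - \rho_\infty)\, e^{-\kappa t} + \rho_\infty,
\qquad \rho_0 > \rho_\infty > 0,\quad \kappa > 0
% One common form of the constraint on the tracking error e(t):
-\rho(t) < e(t) < \rho(t)
```

Here \rho_0 bounds the initial error, \rho_\infty bounds the steady-state error, and \kappa sets the minimum convergence rate. Because this form requires the initial error to lie inside the initial envelope, it is not global, which is the limitation a global PPF is meant to remove.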
Six-Dimensional Guidance: The Strategies of Thinking Quality Cultivation in Senior High School English Discourse Learning
10
Author: Junjie Sun. Journal of Contemporary Educational Research, 2024, No. 3, pp. 237-245 (9 pages).
Taking the discourse learning of the new senior high school English textbook published by the People’s Education Press as an example, combined with the “six-dimensional guidance” deep reading strategy, and through the six-skill training strategies of “memory skill training, understanding skill training, application skill training, analytical skill training, evaluation skill training, creative skill training,” this paper aims to cultivate students’ thinking profundity, logic, flexibility, sensitivity, criticality, and originality. It also promotes the real implementation of senior high school English deep reading that points to the cultivation of thinking quality in classroom teaching, and realizes the transformation from “conventional reading” to “deep reading” that reflects the core literacy of the discipline.
Keywords: six-dimensional guidance; high school English; discourse learning; thinking quality; strategy
A Semantic-Sensitive Approach to Indoor and Outdoor 3D Data Organization
11
Author: Youchen Wei. Journal of World Architecture, 2024, No. 1, pp. 1-6 (6 pages).
Building model data organization is often programmed to solve a specific problem, resulting in the inability to organize indoor and outdoor 3D scenes in an integrated manner. In this paper, existing building spatial data models are studied, and the characteristics of building information modeling standards (IFC), city geographic modeling language (CityGML), indoor modeling language (IndoorGML), and other models are compared and analyzed. CityGML and IndoorGML models face challenges in satisfying diverse application scenarios and requirements due to limitations in their expression capabilities. It is proposed to combine the semantic information of the model objects to effectively partition and organize the indoor and outdoor spatial 3D model data and to construct the indoor and outdoor data organization mechanism of “chunk-layer-subobject-entrances-area-detail object.” This method is verified by proposing a 3D data organization method for indoor and outdoor space and constructing a 3D visualization system based on it.
Keywords: integrated data organization; indoor and outdoor 3D data models; semantic models; spatial segmentation
A User-Centric Approach to Activity Recognition and Guidance in Semantic Smart Home
12
Authors: LI Haitao, GUO Kun, LU Yueming, LI Yonghua. China Communications (SCIE, CSCD), 2015, No. S2, pp. 103-113 (11 pages).
Wireless smart home (SH) systems are meant to facilitate people's lives, and they tend to adopt more intelligent ways to provide services. It is very desirable in the recent SH market for the system to recognize users' behaviors and automatically respond with the corresponding activities to satisfy users' actual demands. However, activity models in the existing approaches are usually defined separately through knowledge-driven methods. As a result, the activity models cannot be matched with the services dynamically. To address the problem, we develop a semantic association model and present a novel approach to activity recognition and guidance. In our approach, the smart devices and users' requirements are described by semantic models. When the requirements are detected and understood, the smart gateway can provide appropriate services, achieving activity assistance. The semantic association model allows all related elements in the smart home to connect with each other logically. The approach has been implemented, and the results show that the success rate of the approach based on the semantic association model is on average more than 33% higher than that of the approach based on predefined models. The proposed approach can effectively help people who have trouble with learning or remembering in daily life.
Keywords: Internet of Things; smart home; activity recognition; activity guidance; semantic model
Multi-task Learning of Semantic Segmentation and Height Estimation for Multi-modal Remote Sensing Images (cited by: 1)
13
Authors: Mengyu WANG, Zhiyuan YAN, Yingchao FENG, Wenhui DIAO, Xian SUN. Journal of Geodesy and Geoinformation Science (CSCD), 2023, No. 4, pp. 27-39 (13 pages).
Deep learning based methods have been successfully applied to semantic segmentation of optical remote sensing images. However, as more and more remote sensing data becomes available, it is a new challenge to comprehensively utilize multi-modal remote sensing data to break through the performance bottleneck of single-modal interpretation. In addition, semantic segmentation and height estimation in remote sensing data are two strongly correlated tasks, but existing methods usually study the individual tasks separately, which leads to high computational resource overhead. To this end, we propose a Multi-Task learning framework for Multi-Modal remote sensing images (MM_MT). Specifically, we design a Cross-Modal Feature Fusion (CMFF) method, which aggregates complementary information of different modalities to improve the accuracy of semantic segmentation and height estimation. Besides, a dual-stream multi-task learning method is introduced for Joint Semantic Segmentation and Height Estimation (JSSHE), extracting common features in a shared network to save time and resources, and then learning task-specific features in two task branches. Experimental results on the public multi-modal remote sensing image dataset Potsdam show that, compared to training two tasks independently, multi-task learning saves 20% of training time and achieves competitive performance, with an mIoU of 83.02% for semantic segmentation and an accuracy of 95.26% for height estimation.
Keywords: multi-modal; multi-task; semantic segmentation; height estimation; convolutional neural network
Application Studies on And Then There Were None under the Guidance of Semantic and Communicative Translation Theories
14
Authors: 白学新, 黄巍. 《海外英语》 (Overseas English), 2020, No. 1, pp. 191-192, 204 (3 pages).
And Then There Were None is one of Agatha Christie's representative detective novels. Since the detective novel differs from general literary work, its translation calls for particular translation skills. This paper analyzes the vivid effect of Newmark's semantic and communicative translation on De's Chinese version of And Then There Were None, which lends literary flavor to the detective novel. De's version accurately expresses Agatha's literary accomplishment, even though the detective novel always pays attention to logical and calm detection and the race of wits with the reader. Using Newmark's theory, the paper finds that semantic translation is applied to the key clues, while communicative translation is applied to the creation of the characters and the special description of the environment. In conclusion, with different applications in the novel, semantic and communicative translation have their unique effects on the literary vision. Semantic translation is reflected in the novel's clues and psychological interrogations. Communicative translation is mainly reflected in the characterization and the description of the environment, providing guidance for the translation of detective novels.
Keywords: Newmark; semantic translation; communicative translation; Agatha Christie; And Then There Were None
Semantic Document Layout Analysis of Handwritten Manuscripts
15
Author: Emad Sami Jaha. Computers, Materials & Continua (SCIE, EI), 2023, No. 5, pp. 2805-2831 (27 pages).
A document layout can be more informative than merely a document’s visual and structural appearance. Thus, document layout analysis (DLA) is considered a necessary prerequisite for advanced processing and detailed document image analysis to be further used in several applications and for different objectives. This research extends the traditional approaches of DLA and introduces the concept of semantic document layout analysis (SDLA) by proposing a novel framework for semantic layout analysis and characterization of handwritten manuscripts. The proposed SDLA approach enables the derivation of implicit information and semantic characteristics, which can be effectively utilized in dozens of practical applications for various purposes, in a way bridging the semantic gap and providing more understandable high-level document image analysis and more invariant characterization via absolute and relative labeling. This approach is validated and evaluated on a large dataset of Arabic handwritten manuscripts comprising complex layouts. The experimental work shows promising results in terms of accurate and effective semantic characteristic-based clustering and retrieval of handwritten manuscripts. It also indicates the expected efficacy of using the capabilities of the proposed approach in automating and facilitating many functional, real-life tasks such as effort estimation and pricing of transcription or typing of such complex manuscripts.
Keywords: semantic characteristics; semantic labeling; document layout analysis; semantic document layout analysis; handwritten manuscripts; clustering; retrieval; image processing; computer vision; machine learning
Bilateral U-Net semantic segmentation with spatial attention mechanism
16
Authors: Guangzhe Zhao, Yimeng Zhang, Maoning Ge, Min Yu. CAAI Transactions on Intelligence Technology (SCIE, EI), 2023, No. 2, pp. 297-307 (11 pages).
To address the problem that existing models segment poorly on imbalanced data sets with small-scale samples, a bilateral U-Net network model with a spatial attention mechanism is designed. The model uses the lightweight MobileNetV2 as the backbone network for hierarchical feature extraction and proposes an Attentive Pyramid Spatial Attention (APSA) module, in comparison with the Attenuated Spatial Pyramid module, which can increase the receptive field and enhance the information. Finally, it adds a context fusion prediction branch that fuses high-semantic and low-semantic prediction results, and the model effectively improves the segmentation accuracy on small data sets. The experimental results on the CamVid data set show that, compared with some existing semantic segmentation networks, the algorithm has a better segmentation effect and segmentation accuracy, and its mIOU reaches 75.85%. Moreover, to verify the generality of the model and the effectiveness of the APSA module, experiments were conducted on the VOC 2012 data set, and the APSA module improved mIOU by about 12.2%.
Keywords: attention mechanism; receptive field; semantic fusion; semantic segmentation; spatial attention module; U-Net
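Several abstracts on this page report mIOU (mean intersection over union). As a reminder of how that metric is computed, the sketch below derives it from a confusion matrix whose rows are ground-truth classes and whose columns are predicted classes. The matrix values are invented for illustration and have no connection to the CamVid or VOC 2012 results quoted above.

```python
from typing import List


def mean_iou(confusion: List[List[int]]) -> float:
    """Mean IoU from a square confusion matrix (rows: truth, columns: prediction).

    Classes that never occur in truth or prediction are skipped.
    """
    num_classes = len(confusion)
    ious = []
    for c in range(num_classes):
        true_pos = confusion[c][c]
        false_pos = sum(confusion[r][c] for r in range(num_classes)) - true_pos
        false_neg = sum(confusion[c]) - true_pos
        union = true_pos + false_pos + false_neg
        if union > 0:
            ious.append(true_pos / union)
    return sum(ious) / len(ious)


# Two classes: 8 and 9 pixels correct, 2 and 1 pixels confused.
confusion = [[8, 2],
             [1, 9]]
score = mean_iou(confusion)
print(round(score, 4))
```

Averaging per-class IoU, rather than pooling all pixels, is what makes the metric sensitive to rare classes, which matters for the imbalanced data sets the abstract targets.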
An Improved High Precision 3D Semantic Mapping of Indoor Scenes from RGB-D Images
17
Authors: Jing Xin, Kenan Du, Jiale Feng, Mao Shan. Computer Modeling in Engineering & Sciences (SCIE, EI), 2023, Issue 12, pp. 2621-2640 (20 pages)
This paper proposes an improved high-precision 3D semantic mapping method for indoor scenes using RGB-D images. The current semantic mapping algorithms suffer from low semantic annotation accuracy and insufficient real-time performance. To address these issues, we first adopt the Elastic Fusion algorithm to select key frames from indoor environment image sequences captured by the Kinect sensor and construct the indoor environment space model. Then, an indoor RGB-D image semantic segmentation network is proposed, which uses multi-scale feature fusion to quickly and accurately obtain object labeling information at the pixel level of the spatial point cloud model. Finally, Bayesian updating is used to conduct incremental semantic label fusion on the established spatial point cloud model. We also employ dense conditional random fields (CRF) to optimize the 3D semantic map model, resulting in a high-precision spatial semantic map of indoor scenes. Experimental results show that the proposed semantic mapping system can process image sequences collected by RGB-D sensors in real time and output accurate semantic segmentation results of indoor scene images and the current local spatial semantic map. Finally, it constructs a globally consistent, high-precision 3D semantic map of indoor scenes.
Keywords: 3D semantic map; online reconstruction; RGB-D images; semantic segmentation; indoor mobile robot
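The abstract mentions Bayesian updating for incremental semantic label fusion but does not give the update rule. As a hedged illustration only, the sketch below shows the generic recursive Bayes update for a single map point: the stored class distribution is multiplied by the per-frame class probabilities and renormalized. The names `bayes_update` and `fuse_labels`, the uniform initial prior, and the treatment of network outputs as likelihoods are all assumptions, not details taken from the paper.

```python
from typing import List, Sequence


def bayes_update(prior: Sequence[float],
                 likelihood: Sequence[float]) -> List[float]:
    """One recursive Bayes step: posterior is proportional to prior * likelihood."""
    unnormalized = [p * l for p, l in zip(prior, likelihood)]
    total = sum(unnormalized)
    if total == 0:
        raise ValueError("observation is incompatible with the current belief")
    return [u / total for u in unnormalized]


def fuse_labels(observations: Sequence[Sequence[float]],
                num_classes: int) -> List[float]:
    """Fuse per-frame class probabilities for one point, frame by frame."""
    # Assumption: start from a uniform prior over the classes.
    belief = [1.0 / num_classes] * num_classes
    for obs in observations:
        belief = bayes_update(belief, obs)
    return belief


# Two frames that both favour class 0 sharpen the belief beyond either frame.
belief = fuse_labels([[0.6, 0.4], [0.7, 0.3]], num_classes=2)
```

In the usage example the fused probability of class 0 is 0.42 / 0.54, which is higher than either single-frame estimate; this is the sense in which repeated consistent observations make the map labels more confident.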
A Survey on Image Semantic Segmentation Using Deep Learning Techniques
18
Authors: Jieren Cheng, Hua Li, Dengbo Li, Shuai Hua, Victor S. Sheng. Computers, Materials & Continua (SCIE, EI), 2023, Issue 1, pp. 1941-1957 (17 pages)
Image semantic segmentation is an important branch of computer vision with a wide variety of practical applications such as medical image analysis, autonomous driving, and virtual or augmented reality. In recent years, due to the remarkable performance of transformers and multilayer perceptrons (MLPs) in computer vision, which is comparable to that of convolutional neural networks (CNNs), a substantial amount of image semantic segmentation work has aimed at developing different types of deep learning architecture. This survey aims to provide a comprehensive overview of deep learning methods in the field of general image semantic segmentation. Firstly, the commonly used image segmentation datasets are listed. Next, extensive pioneering works are studied in depth from multiple perspectives (e.g., network structures, feature fusion methods, attention mechanisms) and are divided into four categories according to network architecture: CNN-based architectures, transformer-based architectures, MLP-based architectures, and others. Furthermore, this paper presents some common evaluation metrics and compares the respective advantages and limitations of popular techniques, both in terms of architectural design and their experimental value on the most widely used datasets. Finally, possible future research directions and challenges are discussed for the reference of other researchers.
Keywords: deep learning; semantic segmentation; CNN; MLP; transformer
Semantic Segmentation by Using Down-Sampling and Subpixel Convolution: DSSC-UNet
19
Authors: Young-Man Kwon, Sunghoon Bae, Dong-Keun Chung, Myung-Jae Lim. Computers, Materials & Continua (SCIE, EI), 2023, Issue 4, pp. 683-696 (14 pages)
Recently, semantic segmentation has been widely applied to image processing, scene understanding, and many other tasks. In deep learning-based semantic segmentation in particular, the U-Net, with its convolutional encoder-decoder architecture, is a representative model that was proposed for image segmentation in the biomedical field. It uses max pooling to reduce the size of the image and to make the model robust to noise. However, although it reduces the complexity of the model, max pooling has the disadvantage of discarding some information about the image when reducing it. So, this paper used two diagonal elements of the down-sampling operation instead. We think that the down-sampled feature maps intrinsically carry more information than max-pooled feature maps because they respect the Nyquist theorem and allow latent information to be extracted from them. In addition, this paper used the two other diagonal elements for the skip connection. In decoding, we used Subpixel Convolution rather than transposed convolution to efficiently decode the encoded feature maps. Combining all these ideas, this paper proposed a new encoder-decoder model called Down-Sampling and Subpixel Convolution U-Net (DSSC-UNet). To demonstrate the better performance of the proposed model, this paper measured the performance of the U-Net and DSSC-UNet on Cityscapes. As a result, DSSC-UNet achieved 89.6% Mean Intersection Over Union (Mean-IoU) and U-Net achieved 85.6% Mean-IoU, confirming that DSSC-UNet achieved better performance.
Keywords: semantic segmentation; encoder-decoder; U-Net; DSSC-UNet; subpixel convolution; down-sampling
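The two operations named in this abstract, strided down-sampling and subpixel convolution, both rest on a lossless rearrangement between spatial positions and channels. The sketch below shows only that rearrangement step in plain Python; it is not the DSSC-UNet implementation. The function names `pixel_unshuffle` and `pixel_shuffle` and the channel ordering are assumptions chosen for illustration, and the abstract does not specify exactly which of the resulting sub-sampled maps feed the encoder path and which feed the skip connection.

```python
from typing import List

Tensor3 = List[List[List[float]]]  # indexed as [channel][row][column]


def pixel_unshuffle(x: Tensor3, r: int) -> Tensor3:
    """Split each r x r block into r*r sub-sampled maps: (C,H,W) -> (C*r*r,H/r,W/r)."""
    h, w = len(x[0]), len(x[0][0])
    if h % r or w % r:
        raise ValueError("height and width must be divisible by r")
    out: Tensor3 = []
    for channel in x:
        for i in range(r):          # row offset inside each block
            for j in range(r):      # column offset inside each block
                out.append([[channel[hh * r + i][ww * r + j]
                             for ww in range(w // r)]
                            for hh in range(h // r)])
    return out


def pixel_shuffle(x: Tensor3, r: int) -> Tensor3:
    """Inverse rearrangement used by subpixel convolution: (C*r*r,H,W) -> (C,H*r,W*r)."""
    if len(x) % (r * r):
        raise ValueError("channel count must be divisible by r*r")
    h, w = len(x[0]), len(x[0][0])
    c_out = len(x) // (r * r)
    out = [[[0.0] * (w * r) for _ in range(h * r)] for _ in range(c_out)]
    for c in range(c_out):
        for i in range(r):
            for j in range(r):
                src = x[c * r * r + i * r + j]
                for hh in range(h):
                    for ww in range(w):
                        out[c][hh * r + i][ww * r + j] = src[hh][ww]
    return out


# A single-channel 4x4 image split into four 2x2 maps and reassembled.
image = [[[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]]
parts = pixel_unshuffle(image, 2)
restored = pixel_shuffle(parts, 2)
```

Unlike max pooling, which keeps one value per block and discards the rest, the four maps in `parts` together retain every input value, so `restored` equals `image`. In a subpixel convolution layer, a learned convolution would produce the `C*r*r` channels that `pixel_shuffle` then rearranges into the up-sampled output.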