The precise and automatic segmentation of prostate magnetic resonance imaging (MRI) images is vital for assisting doctors in diagnosing prostate diseases. In recent years, many advanced methods have been applied to prostate segmentation, but the variability caused by prostate diseases still makes automatic segmentation of the prostate very challenging. In this paper, we propose an attention-guided multi-scale feature fusion network (AGMSF-Net) to segment prostate MRI images. We propose an attention mechanism for extracting multi-scale features, and introduce a 3D transformer module at the transition from encoder to decoder to enhance global feature representation. In the decoder stage, a feature fusion module is proposed to obtain global context information. We evaluate our model on MRI images of the prostate acquired from a local hospital. The relative volume difference (RVD) and Dice similarity coefficient (DSC) between the automatic prostate segmentation results and the ground truth were 1.21% and 93.68%, respectively. AGMSF-Net is intended for the quantitative evaluation of prostate volume on MRI, which is of clinical significance. The performance evaluation and validation experiments demonstrate the effectiveness of our method in automatic prostate segmentation.
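The two reported metrics can be illustrated with a minimal sketch. It assumes flattened binary masks and the common signed definition of RVD (predicted volume minus reference volume, divided by reference volume); the abstract does not say whether the authors report the signed or absolute value, and the function names and toy masks are hypothetical.

```python
def dice_coefficient(pred, truth):
    """Dice similarity coefficient between two flat binary masks."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same length")
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    # Two empty masks are treated as a perfect match (a convention assumed here).
    return 1.0 if total == 0 else 2.0 * intersection / total


def relative_volume_difference(pred, truth):
    """Signed RVD: (predicted volume - reference volume) / reference volume."""
    v_pred = sum(1 for p in pred if p)
    v_truth = sum(1 for t in truth if t)
    if v_truth == 0:
        raise ValueError("reference mask is empty")
    return (v_pred - v_truth) / v_truth


# Toy masks (hypothetical data): 4 predicted voxels, 5 reference voxels, 3 shared.
pred = [0, 1, 1, 1, 0, 0, 1, 0]
truth = [1, 1, 1, 0, 0, 0, 1, 1]
dsc = dice_coefficient(pred, truth)
rvd = relative_volume_difference(pred, truth)
```

A DSC near 1 and an RVD near 0 both indicate close agreement, but RVD only compares volumes, so it can be small even when the two masks overlap poorly.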
Time series anomaly detection is crucial in various industrial applications to identify unusual behaviors within time series data. Because anomaly events are difficult to annotate, time series reconstruction has become a prevalent approach for unsupervised anomaly detection. However, effectively learning representations and achieving accurate detection results remain challenging due to the intricate temporal patterns and dependencies in real-world time series. In this paper, we propose a cross-dimension attentive feature fusion network for time series anomaly detection, referred to as CAFFN. Specifically, a series and feature mixing block is introduced to learn representations in 1D space. Additionally, a fast Fourier transform is employed to convert the time series into 2D space, enabling 2D feature extraction. Finally, a cross-dimension attentive feature fusion mechanism is designed that adaptively integrates features across different dimensions for anomaly detection. Experimental results on real-world time series datasets demonstrate that CAFFN performs better than competing methods in time series anomaly detection.
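One common way to turn a 1D series into a 2D array with a Fourier transform is to estimate the dominant period and fold the series into rows of that length. The sketch below assumes this folding scheme, which the abstract does not confirm for CAFFN, and uses a plain discrete Fourier transform for self-containment (an FFT computes the same coefficients faster). The names `dominant_period` and `fold_to_2d` are hypothetical.

```python
import cmath
import math


def dominant_period(series):
    """Return the period (in samples) of the strongest non-zero frequency."""
    n = len(series)
    mean = sum(series) / n
    centered = [x - mean for x in series]  # remove the zero-frequency component
    best_k, best_amp = 1, -1.0
    for k in range(1, n // 2 + 1):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * t / n)
            for t, x in enumerate(centered)
        )
        if abs(coeff) > best_amp:
            best_k, best_amp = k, abs(coeff)
    return n // best_k


def fold_to_2d(series, period):
    """Stack consecutive periods as rows; a trailing partial period is dropped."""
    rows = len(series) // period
    return [series[r * period:(r + 1) * period] for r in range(rows)]


# Synthetic sine wave with a period of 8 samples (hypothetical data).
series = [math.sin(2 * math.pi * t / 8) for t in range(32)]
period = dominant_period(series)
grid = fold_to_2d(series, period)
```

In the folded array, columns line up the same phase across periods and rows hold the variation within one period, which is what makes 2D feature extractors applicable.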
Convolutional neural networks (CNNs) are widely used in image classification tasks, but their increasing model size and computation make them challenging to implement on embedded systems with constrained hardware resources. To address this issue, the MobileNetV1 network was developed, which employs depthwise convolution to reduce network complexity. MobileNetV1 employs a stride of 2 in several convolutional layers to decrease the spatial resolution of feature maps, thereby lowering computational costs. However, this stride setting can lead to a loss of spatial information, particularly affecting the detection and representation of smaller objects or finer details in images. To balance complexity against model performance, a lightweight convolutional neural network with hierarchical multi-scale feature fusion based on the MobileNetV1 network is proposed. The network consists of two main subnetworks. The first subnetwork uses a depthwise dilated separable convolution (DDSC) layer to learn imaging features with fewer parameters, which results in a lightweight and computationally inexpensive network. Furthermore, the depthwise dilated convolution in the DDSC layer effectively expands the field of view of the filters, allowing them to incorporate a larger context. The second subnetwork is a hierarchical multi-scale feature fusion (HMFF) module that uses a parallel multi-resolution branch architecture to process the input feature map and extract multi-scale feature information from the input image. Experimental results on the CIFAR-10, Malaria, and KvasirV1 datasets demonstrate that the proposed method is efficient, reducing the network parameters and computational cost by 65.02% and 39.78%, respectively, while maintaining the network performance compared to the MobileNetV1 baseline.
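The parameter saving of a depthwise separable layer, and the way dilation widens the field of view without adding weights, can be checked with a little arithmetic. This is a generic sketch with bias terms ignored; the layer sizes are illustrative and are not taken from the paper.

```python
def standard_conv_params(k, c_in, c_out):
    """Weights in a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out


def depthwise_separable_params(k, c_in, c_out):
    """One k x k filter per input channel, then a 1 x 1 pointwise convolution."""
    return k * k * c_in + c_in * c_out


def effective_kernel_size(k, dilation):
    """Spatial extent covered by a k x k kernel with the given dilation rate."""
    return k + (k - 1) * (dilation - 1)


# Illustrative layer: 3 x 3 kernel, 64 input channels, 128 output channels.
standard = standard_conv_params(3, 64, 128)          # 73728
separable = depthwise_separable_params(3, 64, 128)   # 8768
extent = effective_kernel_size(3, 2)                  # 5
```

Dilation changes only where the kernel samples the input, so a dilated depthwise filter has the same number of weights as an undilated one while covering a 5 x 5 window at rate 2.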
Credit Card Fraud Detection (CCFD) is an essential technology for banking institutions to control fraud risks and safeguard their reputation. Class imbalance and insufficient representation of feature data relating to credit card transactions are two prevalent issues in current CCFD research, and both significantly impact the performance of classification models. To address these issues, this research proposes a novel CCFD model based on Multi-feature Fusion and Generative Adversarial Networks (MFGAN). The MFGAN model consists of two modules: a multi-feature fusion module for integrating static and dynamic behavior data of cardholders into a unified high-dimensional feature space, and a balance module based on the generative adversarial network to decrease the class imbalance ratio. The effectiveness of the MFGAN model is validated on two real credit card datasets. The impacts of different class balance ratios on the performance of the four resampling models are analyzed, and the contribution of the two modules to the performance of the MFGAN model is investigated via ablation experiments. Experimental results demonstrate that the proposed model outperforms state-of-the-art models in terms of recall, F1, and Area Under the Curve (AUC) metrics, which means that the MFGAN model can help banks find more fraudulent transactions and reduce fraud losses.
With the intensifying aging of the population, the number of elderly people living alone is also increasing. Using modern Internet of Things technology to monitor the daily behavior of the elderly indoors is therefore a meaningful line of study. Video-based action recognition is easily affected by object occlusion and weak ambient light, resulting in poor recognition performance. This paper therefore proposes an indoor human behavior recognition method based on the fusion of wireless fidelity (Wi-Fi) perception and video features, exploiting the ability of Wi-Fi signals to carry environmental information as they propagate. This paper uses the public WiFi-based activity recognition dataset (WIAR), which contains Wi-Fi channel state information and essential action videos, and extracts video feature vectors and Wi-Fi signal feature vectors from the dataset through a two-stream convolutional neural network and standard statistical algorithms, respectively. The two sets of feature vectors are then fused, and finally action classification and recognition are performed by a support vector machine (SVM). The experiments compare the two-stream network model with the proposed method under three different environments. The accuracy of action recognition after adding Wi-Fi signal feature fusion is improved by 10% on average.
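The statistical-feature and fusion steps can be sketched as follows. The abstract does not list which statistics are used or how the vectors are fused, so the choice of mean, standard deviation, minimum and maximum, and fusion by simple concatenation, are assumptions; all names and numbers are hypothetical.

```python
import statistics


def csi_statistics(amplitudes):
    """Summarise one stream of CSI amplitudes with four basic statistics."""
    return [
        statistics.mean(amplitudes),
        statistics.pstdev(amplitudes),  # population standard deviation
        min(amplitudes),
        max(amplitudes),
    ]


def fuse_features(video_features, wifi_features):
    """Feature-level fusion by concatenation (an assumed, simple strategy)."""
    return list(video_features) + list(wifi_features)


# Hypothetical inputs: a 3-dimensional video descriptor and one CSI stream.
video_vector = [0.12, 0.80, 0.33]
csi_amplitudes = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
wifi_vector = csi_statistics(csi_amplitudes)
fused = fuse_features(video_vector, wifi_vector)
```

In practice the two blocks would normally be scaled to comparable ranges before concatenation, since an SVM is sensitive to feature magnitude.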
Intelligent fault diagnosis in modern mechanical equipment maintenance is increasingly adopting deep learning technology. However, conventional bearing fault diagnosis models often suffer from low accuracy and unstable performance in noisy environments because they rely on a single input data source. Therefore, this paper proposes a dual-channel convolutional neural network (DDCNN) model that leverages dual data inputs. The DDCNN model introduces two key improvements. Firstly, one of the channels substitutes its convolution with a larger kernel, simplifying the structure while addressing the lack of global information and shallow features. Secondly, the feature layer combines data from different sensors based on their primary and secondary importance, extracting details through small-kernel convolution for primary data and obtaining global information through large-kernel convolution for secondary data. Extensive experiments conducted on two bearing fault datasets demonstrate the superiority of the dual-channel convolution model, which exhibits high accuracy and robustness even in strong noise environments. Notably, it achieved 98.84% accuracy at a signal-to-noise ratio (SNR) of −4 dB, outperforming other advanced convolutional models.
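Noise-robustness tests of this kind are usually run by adding white Gaussian noise to a clean signal at a target SNR. The sketch below shows that standard procedure; it is an assumption that the paper follows it, and the test signal, seed, and function names are hypothetical.

```python
import math
import random


def add_noise_at_snr(signal, snr_db, rng):
    """Add zero-mean Gaussian noise so that the result has the requested SNR."""
    signal_power = sum(x * x for x in signal) / len(signal)
    # SNR(dB) = 10 * log10(P_signal / P_noise), solved for P_noise.
    noise_power = signal_power / (10 ** (snr_db / 10))
    sigma = math.sqrt(noise_power)
    return [x + rng.gauss(0.0, sigma) for x in signal]


def measured_snr_db(clean, noisy):
    """Empirical SNR computed from the clean signal and the noisy copy."""
    p_signal = sum(c * c for c in clean)
    p_noise = sum((n - c) ** 2 for c, n in zip(clean, noisy))
    return 10 * math.log10(p_signal / p_noise)


rng = random.Random(0)  # fixed seed for repeatability
clean = [math.sin(0.05 * t) for t in range(20000)]
noisy = add_noise_at_snr(clean, -4.0, rng)
snr = measured_snr_db(clean, noisy)
```

At −4 dB the noise power is about 2.5 times the signal power, which is why accuracy at this level is treated as a demanding robustness check.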
A system for classifying four basic table tennis strokes using wearable devices and deep learning networks is proposed in this study. The wearable device consisted of a six-axis sensor, a Raspberry Pi 3, and a power bank. Multiple kernel sizes were used in a convolutional neural network (CNN) to evaluate their performance in extracting features. Moreover, a multi-scale CNN with two kernel sizes was used to perform feature fusion at different scales by concatenation. The CNN achieved recognition of the four table tennis strokes. Experimental data were obtained from 20 research participants who wore sensors on the back of their hands while performing the four table tennis strokes in a laboratory environment. The data were collected to verify the performance of the proposed models for wearable devices. Finally, the sensor and multi-scale CNN designed in this study achieved accuracy and F1 scores of 99.58% and 99.16%, respectively, for the four strokes. The accuracy under five-fold cross-validation was 99.87%, which also indicates that the multi-scale convolutional neural network is robust.
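Multi-scale fusion by concatenation can be illustrated with two 1D convolution branches that use different kernel sizes on the same sensor channel. This is a minimal sketch with fixed moving-average kernels standing in for learned weights; the kernel sizes 3 and 5 and all names are assumptions, not the paper's configuration.

```python
def conv1d_same(signal, kernel):
    """Sliding dot product with zero padding; output length equals input length.

    Assumes an odd kernel size so the window is centred on each sample.
    """
    half = len(kernel) // 2
    padded = [0.0] * half + list(signal) + [0.0] * half
    return [
        sum(padded[i + j] * kernel[j] for j in range(len(kernel)))
        for i in range(len(signal))
    ]


def multiscale_concat(signal, kernels):
    """Run one branch per kernel and concatenate the outputs per time step."""
    branches = [conv1d_same(signal, k) for k in kernels]
    return [list(step) for step in zip(*branches)]


signal = [1.0, 2.0, 3.0, 4.0, 5.0]
small_kernel = [1 / 3] * 3  # short window: local detail
large_kernel = [1 / 5] * 5  # long window: broader context
fused = multiscale_concat(signal, [small_kernel, large_kernel])
```

Each time step of `fused` carries one value per branch, so later layers see both the short-window and the long-window response side by side.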
Object detection is the task of localizing and classifying objects in a video or image. It has recently gained importance because of its widespread applications. In the modern world, waste pollution is a significant environmental problem. The importance of recycling is well known for both ecological and economic reasons, and the industry needs higher efficiency. Waste object detection using deep learning (DL) involves training a machine learning model to detect and classify various types of waste in videos or images. This technology serves several purposes: recycling and sorting waste, enhancing waste management, and reducing environmental pollution. Recent studies of automatic waste detection are difficult to compare because benchmarks and broadly accepted standards for the data and metrics employed are lacking. Therefore, this study designs an Entropy-based Feature Fusion using Deep Learning for Waste Object Detection and Classification (EFFDL-WODC) algorithm. The presented EFFDL-WODC system builds on feature fusion and DL techniques for the effective recognition and classification of various kinds of waste objects. The EFFDL-WODC system comprises two major procedures: waste object detection and waste object classification. For object detection, the EFFDL-WODC technique uses a YOLOv7 object detector with a fusion-based backbone network. In addition, entropy feature fusion-based models such as VGG-16, SqueezeNet, and NASNet are used. Finally, the EFFDL-WODC technique uses a graph convolutional network (GCN) model for the classification of detected waste objects. The EFFDL-WODC approach was validated on a benchmark database. The comprehensive comparative results demonstrated the improved performance of the EFFDL-WODC technique over recent approaches.
Gait is a biological characteristic that describes the way people walk. Walking is one of the most important activities for sustaining daily life and physical condition. Surface electromyography (sEMG) is a weak bioelectric signal that reflects, to some extent, the functional state between the human muscles and the nervous system. Gait classifiers based on sEMG signals are widely used in analysing muscle diseases and as a guide for rehabilitation treatment. Several approaches have been reported in the literature for gait recognition using conventional and deep learning (DL) methods. This study designs an Enhanced Artificial Algae Algorithm with Hybrid Deep Learning based Human Gait Classification (EAAA-HDLGR) technique on sEMG signals. The EAAA-HDLGR technique extracts time domain (TD) and frequency domain (FD) features from the sEMG signals and fuses them. In addition, the EAAA-HDLGR technique exploits a hybrid deep learning (HDL) model for gait recognition. Finally, an EAAA-based hyperparameter optimizer is applied to the HDL model; the optimizer is mainly derived from the quasi-oppositional based learning (QOBL) concept, which constitutes the novelty of the work. The classification results of the EAAA-HDLGR technique are examined under diverse aspects, and they indicate that the inclusion of the EAAA improves gait recognition.
In the field of medical image diagnosis, the challenge lies in tracking and identifying the defective cells and the extent of the defective region within the complex structure of a brain cavity. Locating the defective cells precisely during the diagnosis phase helps to fight the greatest exterminator of mankind. Early detection of these defective cells requires an accurate computer-aided diagnostic (CAD) system that supports early treatment and improves the survival rates of patients. Earlier CAD systems relied greatly on the expertise of the radiologist and took more time to identify the defective region. The manuscript exploits the efficacy of combining features such as the intensity, shape, and texture of the magnetic resonance image (MRI). In the Enhanced Feature Fusion Segmentation based classification method (EEFS), the image is enhanced and segmented to extract the prominent features. To achieve the desired effect, the EEFS method uses the Enhanced Local Binary Pattern (EnLBP), the Partisan Gray Level Co-occurrence Matrix Histogram of Oriented Gradients (PGLCMHOG), and the iGrab cut method to segment the image. These prominent features are combined with deep features into a single one-dimensional feature vector that is used for prediction. The combined vector is used with existing classifiers so that the results of these classifiers can be compared with those obtained using the generated vector. The generated vector provides promising results with commendably low computational time for pre-processing and classification of MR medical images.
In geometry processing, symmetry research benefits from the global geometric features of complete shapes, but the shape of an object captured in real-world applications is often incomplete due to limited sensor resolution, a single viewpoint, and occlusion. Unlike existing works that predict symmetry from the complete shape, we propose a learning approach for symmetry prediction based on a single RGB-D image. Instead of directly predicting the symmetry from incomplete shapes, our method consists of two modules: the multi-modal feature fusion module and the detection-by-reconstruction module. Firstly, we build a channel-transformer network (CTN) to extract cross-fusion features from the RGB-D image as the multi-modal feature fusion module, which helps us aggregate features from the color and the depth separately. Then, our self-reconstruction network based on a 3D variational auto-encoder (3D-VAE) takes the global geometric features as input, followed by a symmetry prediction network that detects the symmetry. Our experiments are conducted on three public datasets: ShapeNet, YCB, and ScanNet. The results demonstrate that our method can produce reliable and accurate results.
The deployment of vehicle micro-motors has expanded with the progress of electrification and intelligent technologies. However, some micro-motors may exhibit design deficiencies, component wear, assembly errors, and other imperfections arising during the design or manufacturing phases. Consequently, these micro-motors may generate abnormal noises during operation, which substantially degrades the comfort of drivers and passengers. Automobile micro-motors come in a wide variety of structures, which leads to many distinct types of abnormal noise. To identify these diverse forms of abnormal noise, this research presents a novel approach based on a vibro-acoustic fusion convolutional neural network (VAF-CNN). The method uses distinct network branches, each capturing different features from the multi-sensor data, while taking into account the perceptual characteristics of the human auditory system. The intermediate layer applies adaptive weighting to the multi-sensor features, providing a calibration mechanism for features from multiple sensors and enabling further refinement of features within the branch network. A feature fusion mechanism is implemented in the final layer to optimise model performance. To validate the proposed approach, this paper first applies a data augmentation method inspired by a modified SpecAugment to the dataset of abnormal noise samples, covering scenarios both with and without in-vehicle interior noise, which mitigates the issue of limited sample availability. Comparative evaluations are then carried out, contrasting the model based on single-sensor data with other feature fusion models that rely on multi-sensor data. The experimental results show that the proposed method yields higher recognition accuracy and greater resilience against interference. It also has practical engineering significance, as it supports the targeted management of noise from vehicle micro-motors.
Infrared target intrusion detection has significant applications in the fields of military defence and intelligent warning. In view of the characteristics of intrusion targets and the difficulty of inspection, an infrared target intrusion detection algorithm based on feature fusion and enhancement is proposed. The algorithm combines static target mode analysis and dynamic multi-frame correlation detection to extract infrared target features at different levels. LBP texture analysis can effectively identify the posterior feature patterns already contained in the target library, while the motion frame difference method can detect the moving regions of the image and improve the integrity of target regions affected by camouflage, sheltering, and deformation. To integrate the advantages of the two methods, an enhanced convolutional neural network was designed, and the feature images obtained by the two methods were fused and enhanced. The enhancement module of the network strengthened and screened the targets and suppressed the background of the infrared images. In the experiments, the background suppression and detection performance of the proposed method and the comparison methods were evaluated. The results showed that the SCRG and BSF values of the proposed method were better on multiple data sets, and its detection performance was far better than that of the comparison algorithms. The experimental results indicate that, compared with traditional infrared target detection methods, the proposed method detects infrared intrusion targets more accurately and suppresses background noise more effectively.
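The motion frame difference step can be sketched in its simplest two-frame form: pixels whose intensity changes by more than a threshold between consecutive frames are marked as moving. The abstract mentions multi-frame correlation, so this two-frame version, the threshold value, and the toy frames are simplifying assumptions.

```python
def frame_difference_mask(prev_frame, curr_frame, threshold):
    """Binary motion mask: 1 where |current - previous| exceeds the threshold."""
    return [
        [1 if abs(c - p) > threshold else 0 for p, c in zip(prev_row, curr_row)]
        for prev_row, curr_row in zip(prev_frame, curr_frame)
    ]


# Two tiny grayscale frames (hypothetical data); two pixels change strongly.
prev_frame = [[10, 10, 10],
              [10, 10, 10]]
curr_frame = [[10, 60, 10],
              [12, 10, 90]]
mask = frame_difference_mask(prev_frame, curr_frame, threshold=20)
```

The small change from 10 to 12 stays below the threshold and is ignored, which is how the method filters sensor noise while keeping genuinely moving regions.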
Single-structure deep learning fault diagnosis models suffer from insufficient feature extraction and weak fault classification capability. This paper proposes a multi-scale deep feature fusion intelligent fault diagnosis method based on information entropy. First, a normal autoencoder, a denoising autoencoder, a sparse autoencoder, and a contractive autoencoder are used in parallel to construct a multi-scale deep neural network feature extraction structure. A deep feature fusion strategy based on information entropy is then proposed to obtain low-dimensional features and ensure the robustness of the model and the quality of the deep features. Finally, a deep belief network, with its advantages as a probabilistic model, is used as the fault classifier to identify the faults. The effectiveness of the proposed method was verified on a gearbox test bed. Experimental results show that, compared with traditional and existing intelligent fault diagnosis methods, the proposed method obtains representative information and features from the raw data and achieves higher classification accuracy.
Scene recognition is a popular open problem in the computer vision field. Among the many methods proposed in recent years, Convolutional Neural Network (CNN) based approaches achieve the best performance in scene recognition. In this paper, we propose an advanced feature fusion algorithm using Multiple Convolutional Neural Networks (Multi-CNN) for scene recognition. Unlike existing works that usually use an individual convolutional neural network, a fusion of multiple different convolutional neural networks is applied for scene recognition. Firstly, we split the training images in two directions and feed them to three deep CNN models, and then extract features from the last fully connected (FC) layer and the probabilistic layer of each model. Finally, the feature vectors are fused in groups with different fusion strategies and forwarded to a SoftMax classifier. Our proposed algorithm is evaluated on three scene datasets for scene recognition. The experimental results demonstrate the effectiveness of the proposed algorithm compared with other state-of-the-art approaches.
Rosewood is a high-quality and precious wood in China. The correct identification of rosewood species is of great significance to the import and export trade and to the species identification of furniture materials. In this paper, micro CT was used to obtain micro images of cross sections, radial sections, and tangential sections of 24 kinds of rosewood, and the data sets were constructed. The PCA method was used to reduce the dimension of four features: local binary pattern (LBP), local configuration pattern (LCP), rotation-invariant LBP, and uniform LBP. These four features and one feature without dimension reduction (rotation-invariant uniform LBP) were each fused with Gray Level Co-occurrence Matrix (GLCM) and Tamura features, giving a total of five fused features: LBP+GLCM+Tamura, LCP+GLCM+Tamura, LBP_(P,R)^(u2)+GLCM+Tamura, LBP_(P,R)^(ri)+GLCM+Tamura, and LBP_(P,R)^(riu2)+GLCM+Tamura. The five fused features were classified by an extreme learning machine (ELM) and a BP neural network. The feature LBP_(P,R)^(u2)+GLCM+Tamura combined with the extreme learning machine gave the best classification, with accuracies on cross, radial, and tangential sections of 100%, 97.63%, and 94.72%, respectively, which are 0.83%, 2.77%, and 5.70% higher than those of the BP neural network. The classification running time of the ELM is less than 1 s, so the classification efficiency is high. In conclusion, the LBP_(P,R)^(u2)+GLCM+Tamura method combined with an extreme learning machine can be used as a quick and accurate classifier, providing an efficient and feasible method for classifying rosewood.
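The basic LBP code and the "uniform" test behind the u2 variant can be sketched for a single 3 x 3 neighbourhood (P = 8, R = 1). The neighbour ordering and bit assignment below are one common convention and are an assumption; the paper's exact implementation is not given in the abstract.

```python
def lbp_code(patch):
    """8-bit LBP code of a 3 x 3 patch, comparing each neighbour to the centre."""
    center = patch[1][1]
    # Neighbours in clockwise order starting at the top-left corner.
    neighbours = [
        patch[0][0], patch[0][1], patch[0][2], patch[1][2],
        patch[2][2], patch[2][1], patch[2][0], patch[1][0],
    ]
    return sum((1 if v >= center else 0) << i for i, v in enumerate(neighbours))


def is_uniform(code, bits=8):
    """A pattern is uniform if its circular bit string has at most 2 transitions."""
    transitions = sum(
        ((code >> i) & 1) != ((code >> ((i + 1) % bits)) & 1)
        for i in range(bits)
    )
    return transitions <= 2


patch = [[6, 5, 2],
         [7, 6, 1],
         [9, 8, 7]]
code = lbp_code(patch)       # bits 0, 4, 5, 6, 7 set
uniform = is_uniform(code)
```

Uniform LBP keeps a separate histogram bin for each uniform pattern and pools all non-uniform patterns into one bin, which shortens the texture descriptor before it is fused with other features.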
Edge detection is one of the core steps of image processing and computer vision. Accurate and fine image edges make subsequent target detection and semantic segmentation more effective. The holistically-nested edge detection (HED) network has been shown to be a deep learning network with strong edge detection performance. However, when the HED network is used for automatic object identification in complex scenarios with overlapping multiple edges, the detected edges are incomplete and not smooth, among other problems. To solve these problems, an image edge detection algorithm based on improved HED and feature fusion is proposed. On the one hand, features are extracted using the improved HED network, in which the HED convolution layer is improved. A residual variable convolution block replaces the normal convolution to enhance the model's ability to extract features from edges of different sizes and shapes. Meanwhile, dilated (atrous) convolution replaces the original pooling layer to expand the receptive field and retain more global information, yielding comprehensive feature information. On the other hand, edges are extracted using the Otsu-Canny algorithm, which adaptively adjusts the threshold for the global scene so that edge detection is performed under the optimal threshold. Finally, the edges extracted by the improved HED network and the Otsu-Canny algorithm are fused to obtain the final edge. Experimental results show that on the Berkeley Segmentation Data Set (BSDS500) the optimal dataset scale (ODS) F-measure of the proposed algorithm is 0.793, the average precision (AP) is 0.849, and the detection speed can exceed 25 frames per second (FPS), which confirms the effectiveness of the proposed method.
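The adaptive threshold in an Otsu-Canny scheme comes from Otsu's method, which picks the gray level that maximises the between-class variance of the histogram. The sketch below shows only that Otsu step on a flat list of 8-bit pixels; how the paper maps the result onto Canny's high and low thresholds is not stated in the abstract and is not modelled here.

```python
def otsu_threshold(pixels, levels=256):
    """Return the level t that best separates pixels <= t from pixels > t."""
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))

    w0 = 0        # number of pixels in the lower class
    sum0 = 0.0    # intensity sum of the lower class
    best_t, best_var = 0, -1.0
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue
        w1 = total - w0
        if w1 == 0:
            break
        mu0 = sum0 / w0
        mu1 = (sum_all - sum0) / w1
        # Between-class variance, up to a constant factor of 1 / total**2.
        between = w0 * w1 * (mu0 - mu1) ** 2
        if between > best_var:
            best_t, best_var = t, between
    return best_t


# Bimodal toy image (hypothetical data): half dark pixels, half bright pixels.
pixels = [10] * 50 + [200] * 50
threshold = otsu_threshold(pixels)
```

Every level from 10 to 199 separates the two modes equally well here; the strict comparison keeps the first such level, so the function returns 10.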
Hashing technology reduces data storage and improves the efficiency of the learning system, which has made it more and more widely used in image retrieval. Multi-view data describes image information more comprehensively than traditional single-view methods. How to use hashing to combine multi-view data for image retrieval is still a challenge. In this paper, a multi-view fusion hashing method based on RKCCA (Random Kernel Canonical Correlation Analysis) is proposed. To describe image content more accurately, we construct multiple views by combining the deep learning dense convolutional network (DenseNet) feature with the GIST feature or the BoW_SIFT (Bag-of-Words model + SIFT feature) feature. The algorithm uses the RKCCA method to fuse the multi-view features into association features and applies them to image retrieval. It generates binary hash codes with minimal distortion error by designing quantization regularization terms. A large number of experiments on benchmark datasets show that this method is superior to other multi-view hashing methods.
The latest advancements in vision technology have a clear impact on multi-object recognition and scene understanding. Scene understanding is a demanding part of several technologies, such as augmented reality-based scene integration, robotic navigation, autonomous driving, and tourist guidance. By incorporating visual information into contextually unified segments, convolutional neural network-based approaches significantly mitigate the clutter that is usual in classical frameworks during scene understanding. In this paper, we propose a convolutional neural network (CNN) based segmentation method for the recognition of multiple objects in an image. Initially, after acquisition and preprocessing, the image is segmented using a CNN. Then, CNN features are extracted from the segmented objects, and discrete cosine transform (DCT) and discrete wavelet transform (DWT) features are computed. After the extraction of CNN features and the computation of classical machine learning features, the features are fused. Then, genetic algorithm-based feature selection is used to select a minimal set of features. To recognize and understand the multiple objects in the scene, a neuro-fuzzy approach is applied. Once the objects in the scene are recognized, the relationship between them is examined using the object-to-object relation approach. Finally, a decision tree is used to assign the relevant labels to the scenes based on the recognized objects in the image. The experimental results on complex scene datasets, including SUN Red Green Blue-Depth (RGB-D) and Cityscapes, demonstrated remarkable performance.
Medical image segmentation is an important application field of computer vision in medical image processing. Due to the close location and high similarity of different organs in medical images, current segmentation algorithms suffer from mis-segmentation and poor edge segmentation. To address these challenges, we propose a medical image segmentation network (AF-Net) based on an attention mechanism and feature fusion, which can effectively capture global information while focusing the network on the object area. First, we add dual attention blocks (DA-block) to the backbone network; each block comprises parallel channel and spatial attention branches that adaptively calibrate and weigh features. Secondly, the multi-scale feature fusion block (MFF-block) is proposed to obtain feature maps with different receptive fields and capture multi-scale information at a lower computational cost. Finally, to restore the locations and shapes of organs, we adopt global feature fusion blocks (GFF-block) to fuse high-level and low-level information, which yields accurate pixel positioning. We evaluate our method on multiple datasets (the aorta and lungs datasets), and the experimental results reach 94.0% in mIoU and 96.3% in DICE, showing that our approach performs better than U-Net and other state-of-the-art methods.
Funding: This work was supported in part by the National Natural Science Foundation of China (Grant #: 82260362), in part by the National Key R&D Program of China (Grant #: 2021ZD0111000), in part by the Key R&D Project of Hainan Province (Grant #: ZDYF2021SHFZ243), and in part by the Major Science and Technology Project of Haikou (Grant #: 2020-009).
Abstract: The precise and automatic segmentation of prostate magnetic resonance imaging (MRI) images is vital for assisting doctors in diagnosing prostate diseases. In recent years, many advanced methods have been applied to prostate segmentation, but due to the variability caused by prostate diseases, automatic segmentation of the prostate presents significant challenges. In this paper, we propose an attention-guided multi-scale feature fusion network (AGMSF-Net) to segment prostate MRI images. We propose an attention mechanism for extracting multi-scale features, and introduce a 3D transformer module to enhance global feature representation by adding it during the transition phase from encoder to decoder. In the decoder stage, a feature fusion module is proposed to obtain global context information. We evaluate our model on MRI images of the prostate acquired from a local hospital. The relative volume difference (RVD) and dice similarity coefficient (DSC) between the results of automatic prostate segmentation and ground truth were 1.21% and 93.68%, respectively. AGMSF-Net is proposed to support the quantitative evaluation of prostate volume on MRI, which is of considerable clinical significance. The essential performance evaluation and validation experiments have demonstrated the effectiveness of our method in automatic prostate segmentation.
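The two metrics reported in the abstract above can be illustrated with a minimal sketch. It assumes binary masks flattened to sequences of 0/1 and a signed RVD normalised by the ground-truth volume; the abstract does not state which RVD convention the authors use, and the function names are hypothetical.

```python
def dice_similarity(pred, truth):
    """Dice similarity coefficient between two binary masks (flat 0/1 sequences)."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of voxels")
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    # Two empty masks are treated as a perfect match by convention.
    return 1.0 if total == 0 else 2.0 * intersection / total


def relative_volume_difference(pred, truth):
    """Signed volume difference as a fraction of the ground-truth volume (assumed convention)."""
    v_pred = sum(1 for p in pred if p)
    v_truth = sum(1 for t in truth if t)
    if v_truth == 0:
        raise ValueError("ground-truth mask is empty")
    return (v_pred - v_truth) / v_truth


# Toy masks: 4 predicted voxels, 4 true voxels, 3 of them shared.
pred = [0, 1, 1, 1, 0, 0, 1, 0]
truth = [0, 1, 1, 0, 0, 1, 1, 0]
dsc = dice_similarity(pred, truth)
rvd = relative_volume_difference(pred, truth)
print(f"DSC = {dsc:.2%}, RVD = {rvd:.2%}")
```

The toy example also shows why the two metrics are reported together: the volumes match exactly (RVD of zero) even though the masks only partly overlap, which the DSC captures.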
Funding: Supported in part by the National Natural Science Foundation of China (Grants 62376172, 62006163, 62376043), in part by the National Postdoctoral Program for Innovative Talents (Grant BX20200226), and in part by the Sichuan Science and Technology Planning Project (Grants 2022YFSY0047, 2022YFQ0014, 2023ZYD0143, 2022YFH0021, 2023YFQ0020, 24QYCX0354, 24NSFTD0025).
Abstract: Time series anomaly detection is crucial in various industrial applications to identify unusual behaviors within the time series data. Due to the challenges associated with annotating anomaly events, time series reconstruction has become a prevalent approach for unsupervised anomaly detection. However, effectively learning representations and achieving accurate detection results remain challenging due to the intricate temporal patterns and dependencies in real-world time series. In this paper, we propose a cross-dimension attentive feature fusion network for time series anomaly detection, referred to as CAFFN. Specifically, a series and feature mixing block is introduced to learn representations in 1D space. Additionally, a fast Fourier transform is employed to convert the time series into 2D space, providing the capability for 2D feature extraction. Finally, a cross-dimension attentive feature fusion mechanism is designed that adaptively integrates features across different dimensions for anomaly detection. Experimental results on real-world time series datasets demonstrate that CAFFN performs better than other competing methods in time series anomaly detection.
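The abstract above does not say how the Fourier transform is used to move from 1D to 2D. One common approach, sketched here purely as an assumption, is to pick the dominant period from the spectrum and fold the series into rows of that length. A naive discrete Fourier transform is used so the sketch needs only the standard library; the function names are hypothetical.

```python
import cmath
import math


def dominant_period(series):
    """Estimate the dominant period from the largest non-DC DFT amplitude."""
    n = len(series)
    mean = sum(series) / n
    centered = [x - mean for x in series]  # remove the DC component
    best_k, best_amp = 1, -1.0
    for k in range(1, n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(centered))
        if abs(coeff) > best_amp:
            best_k, best_amp = k, abs(coeff)
    return round(n / best_k)


def fold_to_2d(series, period):
    """Reshape a 1D series into rows of one period each, dropping any remainder."""
    rows = len(series) // period
    return [series[r * period:(r + 1) * period] for r in range(rows)]


# A sine wave with period 8, sampled for 64 steps.
series = [math.sin(2 * math.pi * t / 8) for t in range(64)]
period = dominant_period(series)
grid = fold_to_2d(series, period)
print(period, len(grid), len(grid[0]))
```

In the folded grid, moving along a row follows variation within one period and moving down a column follows variation across periods, which is what makes 2D feature extractors applicable.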
Abstract: Convolutional neural networks (CNNs) are widely used in image classification tasks, but their increasing model size and computation make them challenging to implement on embedded systems with constrained hardware resources. To address this issue, the MobileNetV1 network was developed, which employs depthwise convolution to reduce network complexity. MobileNetV1 employs a stride of 2 in several convolutional layers to decrease the spatial resolution of feature maps, thereby lowering computational costs. However, this stride setting can lead to a loss of spatial information, particularly affecting the detection and representation of smaller objects or finer details in images. To maintain the trade-off between complexity and model performance, a lightweight convolutional neural network with hierarchical multi-scale feature fusion based on the MobileNetV1 network is proposed. The network consists of two main subnetworks. The first subnetwork uses a depthwise dilated separable convolution (DDSC) layer to learn imaging features with fewer parameters, which results in a lightweight and computationally inexpensive network. Furthermore, the depthwise dilated convolution in the DDSC layer effectively expands the field of view of the filters, allowing them to incorporate a larger context. The second subnetwork is a hierarchical multi-scale feature fusion (HMFF) module that uses a parallel multi-resolution branch architecture to process the input feature map and extract the multi-scale feature information of the input image. Experimental results on the CIFAR-10, Malaria, and KvasirV1 datasets demonstrate that the proposed method is efficient, reducing the network parameters and computational cost by 65.02% and 39.78%, respectively, while maintaining the network performance compared to the MobileNetV1 baseline.
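The parameter saving that motivates depthwise separable layers, and the wider field of view that dilation adds, can be checked with simple arithmetic. The sketch below uses the standard formulas with bias terms ignored; the layer sizes are illustrative assumptions and are not taken from the paper.

```python
def standard_conv_params(k, c_in, c_out):
    """Weights in a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out


def depthwise_separable_params(k, c_in, c_out):
    """Weights in a k x k depthwise convolution followed by a 1 x 1 pointwise convolution."""
    return k * k * c_in + c_in * c_out


def effective_kernel_size(k, dilation):
    """Spatial extent covered by a k x k kernel with the given dilation rate."""
    return k + (k - 1) * (dilation - 1)


# Illustrative layer: 3 x 3 kernel, 64 input channels, 128 output channels.
std = standard_conv_params(3, 64, 128)
sep = depthwise_separable_params(3, 64, 128)
eff = effective_kernel_size(3, 2)
print(std, sep, f"{1 - sep / std:.1%} fewer weights", eff)
```

Dilation changes only the spacing of the kernel taps, so the dilated depthwise layer covers a 5 x 5 area here at the same parameter count as the undilated 3 x 3 one.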
Funding: Supported by the National Key R&D Program of China (Nos. 2022YFB3104103 and 2019QY1406) and the National Natural Science Foundation of China (Nos. 61732022, 61732004, 61672020, and 62072131).
Abstract: Credit Card Fraud Detection (CCFD) is an essential technology for banking institutions to control fraud risks and safeguard their reputation. Class imbalance and insufficient representation of feature data relating to credit card transactions are two prevalent issues in the current study field of CCFD, which significantly impact classification models' performance. To address these issues, this research proposes a novel CCFD model based on Multifeature Fusion and Generative Adversarial Networks (MFGAN). The MFGAN model consists of two modules: a multi-feature fusion module for integrating static and dynamic behavior data of cardholders into a unified high-dimensional feature space, and a balance module based on the generative adversarial network to decrease the class imbalance ratio. The effectiveness of the MFGAN model is validated on two actual credit card datasets. The impacts of different class balance ratios on the performance of the four resampling models are analyzed, and the contribution of the two different modules to the performance of the MFGAN model is investigated via ablation experiments. Experimental results demonstrate that the proposed model does better than state-of-the-art models in terms of recall, F1, and Area Under the Curve (AUC) metrics, which means that the MFGAN model can help banks find more fraudulent transactions and reduce fraud losses.
Funding: Supported by the National Natural Science Foundation of China (No. 62006135) and the Natural Science Foundation of Shandong Province (No. ZR2020QF116).
Abstract: With the intensifying aging of the population, the phenomenon of the elderly living alone is also increasing. Therefore, using modern internet of things technology to monitor the daily behavior of the elderly indoors is a meaningful study. Video-based action recognition tasks are easily affected by object occlusion and weak ambient light, resulting in poor recognition performance. Therefore, this paper proposes an indoor human behavior recognition method based on wireless fidelity (Wi-Fi) perception and video feature fusion by utilizing the ability of Wi-Fi signals to carry environmental information during the propagation process. This paper uses the public WiFi-based activity recognition dataset (WIAR) containing Wi-Fi channel state information and essential action videos, and then extracts video feature vectors and Wi-Fi signal feature vectors in the datasets through the two-stream convolutional neural network and standard statistical algorithms, respectively. Then the two sets of feature vectors are fused, and finally, the action classification and recognition are performed by the support vector machine (SVM). The experiments compare the two-stream network model with the proposed method under three different environments. The accuracy of action recognition after adding Wi-Fi signal feature fusion is improved by 10% on average.
Funding: Supported by the Key Research and Development Plan of Shanxi Province (Grant No. 202102030201012).
Abstract: Intelligent fault diagnosis in modern mechanical equipment maintenance is increasingly adopting deep learning technology. However, conventional bearing fault diagnosis models often suffer from low accuracy and unstable performance in noisy environments due to their reliance on a single data input. Therefore, this paper proposes a dual-channel convolutional neural network (DDCNN) model that leverages dual data inputs. The DDCNN model introduces two key improvements. Firstly, one of the channels substitutes its convolution with a larger kernel, simplifying the structure while addressing the lack of global information and shallow features. Secondly, the feature layer combines data from different sensors based on their primary and secondary importance, extracting details through small kernel convolution for primary data and obtaining global information through large kernel convolution for secondary data. Extensive experiments conducted on two bearing fault datasets demonstrate the superiority of the two-channel convolution model, exhibiting high accuracy and robustness even in strong noise environments. Notably, it achieved an impressive 98.84% accuracy at a Signal to Noise Ratio (SNR) of −4 dB, outperforming other advanced convolutional models.
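Noise-robustness results such as the −4 dB figure above are usually obtained by corrupting clean signals with noise scaled to a target SNR. The abstract does not describe the authors' noise protocol, so the sketch below assumes additive white Gaussian noise, which is a common choice; the function names and the toy sine signal are hypothetical.

```python
import math
import random


def add_gaussian_noise(signal, snr_db, rng):
    """Add white Gaussian noise so that the expected SNR equals snr_db."""
    signal_power = sum(x * x for x in signal) / len(signal)
    # SNR(dB) = 10 * log10(P_signal / P_noise), solved for P_noise.
    noise_power = signal_power / (10 ** (snr_db / 10))
    sigma = math.sqrt(noise_power)
    return [x + rng.gauss(0.0, sigma) for x in signal]


def measured_snr_db(clean, noisy):
    """Empirical SNR in dB, treating (noisy - clean) as the noise."""
    p_signal = sum(c * c for c in clean)
    p_noise = sum((n - c) ** 2 for c, n in zip(clean, noisy))
    return 10 * math.log10(p_signal / p_noise)


rng = random.Random(0)  # fixed seed for repeatability
clean = [math.sin(2 * math.pi * t / 50) for t in range(20000)]
noisy = add_gaussian_noise(clean, -4.0, rng)
snr = measured_snr_db(clean, noisy)
print(f"measured SNR: {snr:.2f} dB")
```

At −4 dB the noise carries roughly 2.5 times the power of the signal, which gives a sense of how demanding the reported test condition is.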
Funding: Supported by the Ministry of Science and Technology, MOST (Grant Nos. MOST 108–2221-E-150–022-MY3 and MOST 110–2634-F-019–002), and the National Taiwan Ocean University, China.
Abstract: A system for classifying four basic table tennis strokes using wearable devices and deep learning networks is proposed in this study. The wearable device consisted of a six-axis sensor, Raspberry Pi 3, and a power bank. Multiple kernel sizes were used in a convolutional neural network (CNN) to evaluate their performance for extracting features. Moreover, a multiscale CNN with two kernel sizes was used to perform feature fusion at different scales in a concatenated manner. The CNN achieved recognition of the four table tennis strokes. Experimental data were obtained from 20 research participants who wore sensors on the back of their hands while performing the four table tennis strokes in a laboratory environment. The data were collected to verify the performance of the proposed models for wearable devices. Finally, the sensor and multi-scale CNN designed in this study achieved accuracy and F1 scores of 99.58% and 99.16%, respectively, for the four strokes. The accuracy for five-fold cross validation was 99.87%. This result also shows that the multi-scale convolutional neural network has better robustness after five-fold cross validation.
Funding: Funded by Institutional Fund Projects under Grant No. (IFPIP: 557-135-1443).
Abstract: Object detection is the task of localizing and classifying objects in a video or image. It has recently gained importance because of its widespread applications. In the modern world, waste pollution is a significant environmental problem. The importance of recycling is well known for both ecological and economic reasons, and the industry needs higher efficiency. Waste object detection using deep learning (DL) involves training a machine-learning method to classify and detect various types of waste in videos or images. This technology is used for several purposes, including recycling and sorting waste, enhancing waste management, and reducing environmental pollution. Recent studies of automatic waste detection are difficult to compare because benchmarks and broadly accepted standards concerning the employed data and metrics are lacking. Therefore, this study designs an Entropy-based Feature Fusion using Deep Learning for Waste Object Detection and Classification (EFFDL-WODC) algorithm. The presented EFFDL-WODC system inherits the concepts of feature fusion and DL techniques for the effectual recognition and classification of various kinds of waste objects. The EFFDL-WODC system comprises two major procedures: waste object detection and waste object classification. For object detection, the EFFDL-WODC technique uses a YOLOv7 object detector with a fusion-based backbone network. In addition, entropy feature fusion-based models such as the VGG-16, SqueezeNet, and NASNet models are used. Finally, the EFFDL-WODC technique uses a graph convolutional network (GCN) model for the classification of detected waste objects. The performance of the EFFDL-WODC approach was validated on the benchmark database. The comprehensive comparative results demonstrated the improved performance of the EFFDL-WODC technique over recent approaches.
Funding: Supported by a grant from the Korea Health Technology R&D Project through the Korea Health Industry Development Institute (KHIDI), funded by the Ministry of Health & Welfare, Republic of Korea (grant number: HI21C1831), and by the Soonchunhyang University Research Fund.
Abstract: Gait is a biological trait that describes the manner in which people walk. Walking is among the most important activities for sustaining daily life and physical condition. Surface electromyography (sEMG) is a weak bioelectric signal that reflects, to some extent, the functional state between the human muscles and the nervous system. Gait classifiers based on sEMG signals are widely used in analysing muscle diseases and as a guide for recovery treatment. Several approaches for gait recognition, using both conventional and deep learning (DL) methods, have been established in the literature. This study designs an Enhanced Artificial Algae Algorithm with Hybrid Deep Learning based Human Gait Classification (EAAA-HDLGR) technique on sEMG signals. The EAAA-HDLGR technique extracts the time domain (TD) and frequency domain (FD) features from the sEMG signals and fuses them. In addition, the EAAA-HDLGR technique exploits the hybrid deep learning (HDL) model for gait recognition. At last, an EAAA-based hyperparameter optimizer is applied for the HDL model, which is mainly derived from the quasi-oppositional based learning (QOBL) concept, showing the novelty of the work. The classification performance of the EAAA-HDLGR technique is examined under diverse aspects, and the results indicate the improved performance of the EAAA-HDLGR technique. The results imply that the inclusion of EAAA leads to improved gait recognition results.
Abstract: In the field of diagnosis of medical images, the challenge lies in tracking and identifying the defective cells and the extent of the defective region within the complex structure of a brain cavity. Locating the defective cells precisely during the diagnosis phase helps to fight the greatest exterminator of mankind. Early detection of these defective cells requires an accurate computer-aided diagnostic (CAD) system that supports early treatment and promotes survival rates of patients. Earlier versions of CAD systems relied greatly on the expertise of the radiologist and consumed more time to identify the defective region. The manuscript exploits the efficacy of coalescing features like intensity, shape, and texture of the magnetic resonance image (MRI). In the Enhanced Feature Fusion Segmentation based classification method (EEFS), the image is enhanced and segmented to extract the prominent features. To bring out the desired effect, the EEFS method uses Enhanced Local Binary Pattern (EnLBP), Partisan Gray Level Co-occurrence Matrix Histogram of Oriented Gradients (PGLCMHOG), and the iGrab cut method to segment the image. These prominent features along with deep features are coalesced to provide a single-dimensional feature vector that is effectively used for prediction. The coalesced vector is used with the existing classifiers to compare the results of these classifiers with that of the generated vector. The generated vector provides promising results with commendably less computational time for pre-processing and classification of MR medical images.
Abstract: In geometry processing, symmetry research benefits from global geometric features of complete shapes, but the shape of an object captured in real-world applications is often incomplete due to the limited sensor resolution, single viewpoint, and occlusion. Different from the existing works predicting symmetry from the complete shape, we propose a learning approach for symmetry prediction based on a single RGB-D image. Instead of directly predicting the symmetry from incomplete shapes, our method consists of two modules, i.e., the multi-modal feature fusion module and the detection-by-reconstruction module. Firstly, we build a channel-transformer network (CTN) to extract cross-fusion features from the RGB-D as the multi-modal feature fusion module, which helps us aggregate features from the color and the depth separately. Then, our self-reconstruction network based on a 3D variational auto-encoder (3D-VAE) takes the global geometric features as input, followed by a prediction symmetry network to detect the symmetry. Our experiments are conducted on three public datasets, ShapeNet, YCB, and ScanNet, and demonstrate that our method can produce reliable and accurate results.
Funding: The author received funding from the Sichuan Natural Science Foundation (2022NSFSC1892).
Abstract: The deployment of vehicle micro-motors has witnessed an expansion owing to the progression in electrification and intelligent technologies. However, some micro-motors may exhibit design deficiencies, component wear, assembly errors, and other imperfections that may arise during the design or manufacturing phases. Consequently, these micro-motors might generate anomalous noises during their operation, consequently exerting a substantial adverse influence on the overall comfort of drivers and passengers. Automobile micro-motors exhibit a diverse array of structural variations, consequently leading to the manifestation of a multitude of distinctive auditory irregularities. To address the identification of diverse forms of abnormal noise, this research presents a novel approach rooted in the utilization of vibro-acoustic fusion-convolutional neural network (VAF-CNN). This method entails the deployment of distinct network branches, each serving to capture disparate features from the multi-sensor data, all the while considering the auditory perception traits inherent in the human auditory system. The intermediary layer integrates the concept of adaptive weighting of multi-sensor features, thus affording a calibration mechanism for the features hailing from multiple sensors, thereby enabling a further refinement of features within the branch network. For optimal model efficacy, a feature fusion mechanism is implemented in the concluding layer. To substantiate the efficacy of the proposed approach, this paper initially employs an augmented data methodology inspired by modified SpecAugment, applied to the dataset of abnormal noise samples, encompassing scenarios both with and without in-vehicle interior noise. This serves to mitigate the issue of limited sample availability. Subsequent comparative evaluations are executed, contrasting the performance of the model founded upon single-sensor data against other feature fusion models reliant on multi-sensor data. The experimental results substantiate that the suggested methodology yields heightened recognition accuracy and greater resilience against interference. Moreover, it holds notable practical significance in the engineering domain, as it furnishes valuable support for the targeted management of noise emanating from vehicle micro-motors.
Funding: This work was supported by the National Natural Science Foundation of China (grant number: 61671470), the National Key Research and Development Program of China (grant number: 2016YFC0802904), and the Postdoctoral Science Foundation Funded Project of China (grant number: 2017M623423).
Abstract: Infrared target intrusion detection has significant applications in the fields of military defence and intelligent warning. In view of the characteristics of intrusion targets as well as inspection difficulties, an infrared target intrusion detection algorithm based on feature fusion and enhancement was proposed. This algorithm combines static target mode analysis and dynamic multi-frame correlation detection to extract infrared target features at different levels. Among them, LBP texture analysis can be used to effectively identify the posterior feature patterns which have been contained in the target library, while the motion frame difference method can detect the moving regions of the image and improve the integrity of target regions affected by camouflage, sheltering and deformation. In order to integrate the advantages of the two methods, the enhanced convolutional neural network was designed and the feature images obtained by the two methods were fused and enhanced. The enhancement module of the network strengthened and screened the targets, and realized the background suppression of infrared images. Based on the experiments, the effect of the proposed method and the comparison method on the background suppression and detection performance was evaluated, and the results showed that the SCRG and BSF values of the method in this paper had a better performance in multiple data sets, and its detection performance was far better than the comparison algorithm. The experiment results indicated that, compared with traditional infrared target detection methods, the proposed method could detect the infrared invasion target more accurately, and suppress the background noise more effectively.
Funding: Supported by the National Natural Science Foundation of China and Civil Aviation Administration of China Joint Funded Project (Grant No. U1733108) and the Key Project of Tianjin Science and Technology Support Program (Grant No. 16YFZCSY00860).
Abstract: A single-structure deep learning fault diagnosis model suffers from insufficient feature extraction and weak fault classification capability. This paper proposes a multi-scale deep feature fusion intelligent fault diagnosis method based on information entropy. First, a normal autoencoder, denoising autoencoder, sparse autoencoder, and contractive autoencoder are used in parallel to construct a multi-scale deep neural network feature extraction structure. A deep feature fusion strategy based on information entropy is proposed to obtain low-dimensional features and ensure the robustness of the model and the quality of deep features. Finally, the advantage of the deep belief network probability model is used as the fault classifier to identify the faults. The effectiveness of the proposed method was verified by a gearbox test-bed. Experimental results show that, compared with traditional and existing intelligent fault diagnosis methods, the proposed method can obtain representative information and features from the raw data with higher classification accuracy.
Abstract: Scene recognition is a popular open problem in the computer vision field. Among the many methods proposed in recent years, Convolutional Neural Network (CNN) based approaches achieve the best performance in scene recognition. In this paper, we propose an advanced feature fusion algorithm using Multiple Convolutional Neural Networks (Multi-CNN) for scene recognition. Unlike existing works that usually use an individual convolutional neural network, a fusion of multiple different convolutional neural networks is applied for scene recognition. Firstly, we split training images in two directions and feed them to three deep CNN models, and then extract features from the last fully connected (FC) layer and probabilistic layer of each model. Finally, feature vectors are fused in groups with different fusion strategies and forwarded into a SoftMax classifier. Our proposed algorithm is evaluated on three scene datasets for scene recognition. The experimental results demonstrate the effectiveness of the proposed algorithm compared with other state-of-the-art approaches.
Funding: The Natural Science Foundation of Shandong Province, China (Grant No. ZR2020QC174), The Application of Computed Tomography (CT) Scanning Technology to Damage Detection of Timber Frames of Architectural Heritage; the Taishan Scholar Project of Shandong Province, China (Grant No. 2015162).
Abstract: Rosewood is a kind of high-quality and precious wood in China. The correct identification of rosewood species is of great significance to the import and export trade and species identification of furniture materials. In this paper, micro CT was used to obtain the micro images of cross sections, radial sections and tangential sections of 24 kinds of rosewood, and the data sets were constructed. The PCA method was used to reduce the dimension of four features: local binary pattern, local configuration pattern, rotation invariant LBP, and uniform LBP. These four features and one feature without dimension reduction (rotation invariant uniform LBP) were each fused with Gray Level Co-Occurrence Matrix and Tamura features, giving a total of five fused features: LBP+GLCM+Tamura, LCP+GLCM+Tamura, LBP_(P,R)^(u2)+GLCM+Tamura, LBP_(P,R)^(ri)+GLCM+Tamura and LBP_(P,R)^(riu2)+GLCM+Tamura. The five fused features were classified by extreme learning machine (ELM) and BP neural network. The classification effect of feature LBP_(P,R)^(u2)+GLCM+Tamura combined with extreme learning machine was the best, and the classification accuracy of cross, radial and tangential sections reached 100%, 97.63% and 94.72%, respectively, which is 0.83%, 2.77% and 5.70% higher than that of BP neural network. The classification running time of ELM is less than 1 s, and the classification efficiency is high. In conclusion, the LBP_(P,R)^(u2)+GLCM+Tamura method combined with extreme learning machine can be used as a quick and accurate classifier, providing an efficient and feasible classification method of rosewood.
Funding: This research was funded by the College Student Innovation and Entrepreneurship Training Program, Grant Numbers 2021055Z and S202110082031, and the Special Project for Cultivating Scientific and Technological Innovation Ability of College and Middle School Students in Hebei Province, Grant Numbers 2021H011404 and 2021H010203.
Abstract: Edge detection is one of the core steps of image processing and computer vision. Accurate and fine image edges make subsequent target detection and semantic segmentation more effective. The Holistically-Nested Edge Detection (HED) network has been proved to be a deep-learning network with better performance for edge detection. However, when the HED network is used for automatic object identification in complex scenes with overlapping edges, the detected edges are incomplete and not smooth, among other problems. To solve these problems, an image edge detection algorithm based on improved HED and feature fusion is proposed. On the one hand, features are extracted using the improved HED network, in which the HED convolution layer is improved. The residual variable convolution block is used to replace the normal convolution, enhancing the model's ability to extract features from edges of different sizes and shapes. Meanwhile, dilated convolution is used to replace the original pooling layer to expand the receptive field and retain more global information to obtain comprehensive feature information. On the other hand, edges are extracted using the Otsu algorithm: the Otsu-Canny algorithm is used to adaptively adjust the threshold value in the global scene to achieve edge detection under the optimal threshold value. Finally, the edges extracted by the improved HED network and the Otsu-Canny algorithm are fused to obtain the final edge. Experimental results show that on the Berkeley Segmentation Data Set (BSDS500) the optimal dataset scale (ODS) F-measure of the proposed algorithm is 0.793; the average precision (AP) of the algorithm is 0.849; and detection speed can reach more than 25 frames per second (FPS), which confirms the effectiveness of the proposed method.
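The adaptive thresholding step in the abstract above rests on Otsu's method, which picks the grey level that maximises the between-class variance of the histogram. The sketch below implements that step in plain Python. How the Otsu value is mapped to Canny's two hysteresis thresholds is not given in the abstract, so `canny_thresholds` and its 0.5 ratio are hypothetical and labelled as an assumption.

```python
def otsu_threshold(pixels, levels=256):
    """Return the grey level t that maximises between-class variance.

    Pixels with value <= t form one class, the rest form the other.
    """
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))
    w_bg = 0      # number of pixels at or below the candidate threshold
    sum_bg = 0    # sum of their grey levels
    best_t, best_var = 0, -1.0
    for t in range(levels):
        w_bg += hist[t]
        sum_bg += t * hist[t]
        if w_bg == 0:
            continue
        w_fg = total - w_bg
        if w_fg == 0:
            break
        mean_bg = sum_bg / w_bg
        mean_fg = (sum_all - sum_bg) / w_fg
        # Between-class variance, up to a constant factor of 1 / total**2.
        var_between = w_bg * w_fg * (mean_bg - mean_fg) ** 2
        if var_between > best_var:
            best_t, best_var = t, var_between
    return best_t


def canny_thresholds(otsu_t, low_ratio=0.5):
    """Hypothetical mapping: Otsu value as the high threshold, a fraction of it as the low one."""
    return low_ratio * otsu_t, float(otsu_t)


# Toy bimodal image: 50 dark pixels and 50 bright pixels.
pixels = [10] * 50 + [200] * 50
t = otsu_threshold(pixels)
low, high = canny_thresholds(t)
print(t, low, high)
```

Because the threshold is recomputed from each image's histogram, no per-scene tuning is needed, which is the adaptivity the abstract refers to.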
Funding: This work is supported by the National Natural Science Foundation of China (No. 61772561), the Key Research & Development Plan of Hunan Province (No. 2018NK2012), the Science Research Projects of Hunan Provincial Education Department (Nos. 18A174, 18C0262), and the Science & Technology Innovation Platform and Talent Plan of Hunan Province (2017TP1022).
Abstract: Hashing technology has the advantages of reducing data storage and improving the efficiency of the learning system, making it more and more widely used in image retrieval. Multi-view data describes image information more comprehensively than traditional methods using a single view. How to use hashing to combine multi-view data for image retrieval is still a challenge. In this paper, a multi-view fusion hashing method based on RKCCA (Random Kernel Canonical Correlation Analysis) is proposed. In order to describe image content more accurately, we use the deep learning dense convolutional network feature DenseNet to construct multiple views by combining it with the GIST feature or the BoW_SIFT (Bag-of-Words model + SIFT feature) feature. This algorithm uses the RKCCA method to fuse multi-view features to construct association features and apply them to image retrieval. The algorithm generates binary hash codes with minimal distortion error by designing quantization regularization terms. A large number of experiments on benchmark datasets show that this method is superior to other multi-view hashing methods.
Funding: This research was supported by a grant (2021R1F1A1063634) of the Basic Science Research Program through the National Research Foundation (NRF) funded by the Ministry of Education, Republic of Korea.
Abstract: The latest advancements in vision technology have an evident impact on multi-object recognition and scene understanding. Scene understanding is a demanding part of several technologies, like augmented reality-based scene integration, robotic navigation, autonomous driving, and tourist guides. By incorporating visual information in contextually unified segments, convolutional neural network-based approaches significantly mitigate the clutter that is usual in classical frameworks during scene understanding. In this paper, we propose a convolutional neural network (CNN) based segmentation method for the recognition of multiple objects in an image. Initially, after acquisition and preprocessing, the image is segmented by using CNN. Then, CNN features are extracted from these segmented objects, and discrete cosine transform (DCT) and discrete wavelet transform (DWT) features are computed. After the extraction of CNN features and computation of classical machine learning features, fusion is performed using a fusion technique. Then, to select the minimal set of features, genetic algorithm-based feature selection is used. In order to recognize and understand the multi-objects in the scene, a neuro-fuzzy approach is applied. Once objects in the scene are recognized, the relationship between these objects is examined by employing the object-to-object relation approach. Finally, a decision tree is incorporated to assign the relevant labels to the scenes based on recognized objects in the image. The experimental results over complex scene datasets including SUN Red Green Blue-Depth (RGB-D) and Cityscapes demonstrated a remarkable performance.
Funding: This work was supported in part by the National Natural Science Foundation of China under Grant 61772561, author J.Q, http://www.nsfc.gov.cn/; in part by the Science Research Projects of Hunan Provincial Education Department under Grant 18A174, author X.X, http://kxjsc.gov.hnedu.cn/; in part by the Science Research Projects of Hunan Provincial Education Department under Grant 19B584, author Y.T, http://kxjsc.gov.hnedu.cn/; in part by the Natural Science Foundation of Hunan Province (No. 2020JJ4140), author Y.T, http://kjt.hunan.gov.cn/; in part by the Natural Science Foundation of Hunan Province (No. 2020JJ4141), author X.X, http://kjt.hunan.gov.cn/; in part by the Key Research and Development Plan of Hunan Province under Grant 2019SK2022, author Y.T, http://kjt.hunan.gov.cn/; in part by the Key Research and Development Plan of Hunan Province under Grant CX20200730, author G.H, http://kjt.hunan.gov.cn/; and in part by the Graduate Science and Technology Innovation Fund Project of Central South University of Forestry and Technology under Grant CX20202038, author G.H, http://jwc.csuft.edu.cn/.
Abstract: Medical image segmentation is an important application field of computer vision in medical image processing. Due to the close location and high similarity of different organs in medical images, the current segmentation algorithms have problems with mis-segmentation and poor edge segmentation. To address these challenges, we propose a medical image segmentation network (AF-Net) based on attention mechanism and feature fusion, which can effectively capture global information while focusing the network on the object area. In this approach, we add dual attention blocks (DA-block), which comprise parallel channel and spatial attention branches, to the backbone network to adaptively calibrate and weigh features. Secondly, the multi-scale feature fusion block (MFF-block) is proposed to obtain feature maps of different receptive fields and get multi-scale information with less computational consumption. Finally, to restore the locations and shapes of organs, we adopt the global feature fusion blocks (GFF-block) to fuse high-level and low-level information, which can obtain accurate pixel positioning. We evaluate our method on multiple datasets (the aorta and lungs dataset), and the experimental results achieve 94.0% in mIoU and 96.3% in DICE, showing that our approach performs better than U-Net and other state-of-the-art methods.