159,130 articles found
Multi-Objective Equilibrium Optimizer for Feature Selection in High-Dimensional English Speech Emotion Recognition
1
Authors: Liya Yue, Pei Hu, Shu-Chuan Chu, Jeng-Shyang Pan. Computers, Materials & Continua (SCIE, EI), 2024, Issue 2, pp. 1957-1975 (19 pages).
Speech emotion recognition (SER) uses acoustic analysis to find features for emotion recognition and examines variations in voice that are caused by emotions. The number of features acquired with acoustic analysis is extremely high, so we introduce a hybrid filter-wrapper feature selection algorithm based on an improved equilibrium optimizer for constructing an emotion recognition system. The proposed algorithm implements multi-objective emotion recognition with the minimum number of selected features and maximum accuracy. First, we use the information gain and Fisher Score to sort the features extracted from signals. Then, we employ a multi-objective ranking method to evaluate these features and assign different importance to them. Features with high rankings have a large probability of being selected. Finally, we propose a repair strategy to address the problem of duplicate solutions in multi-objective feature selection, which can improve the diversity of solutions and avoid falling into local traps. Using random forest and K-nearest neighbor classifiers, four English speech emotion datasets are employed to test the proposed algorithm (MBEO) as well as other multi-objective emotion identification techniques. The results illustrate that it performs well in inverted generational distance, hypervolume, Pareto solutions, and execution time, and MBEO is appropriate for high-dimensional English SER.
Keywords: speech emotion recognition; filter-wrapper; high-dimensional; feature selection; equilibrium optimizer; multi-objective
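The filter stage described in this abstract ranks features before the wrapper search. As a rough illustration of one of the two named criteria, the sketch below computes a Fisher score per feature (between-class spread of the class means divided by within-class variance) and sorts features by it. The function names, the toy data, and the labels are assumptions made for illustration; the abstract does not give the authors' implementation, and information gain is not covered here.

```python
from collections import defaultdict
from statistics import fmean, pvariance


def fisher_scores(X, y):
    """Return one Fisher score per feature column of X (a list of rows)."""
    by_class = defaultdict(list)
    for row, label in zip(X, y):
        by_class[label].append(row)

    scores = []
    for j in range(len(X[0])):
        overall_mean = fmean(row[j] for row in X)
        between = 0.0  # class-size-weighted spread of the class means
        within = 0.0   # class-size-weighted within-class variance
        for rows in by_class.values():
            col = [row[j] for row in rows]
            between += len(col) * (fmean(col) - overall_mean) ** 2
            within += len(col) * pvariance(col)
        scores.append(between / within if within > 0 else float("inf"))
    return scores


def rank_features(X, y):
    """Feature indices sorted from most to least discriminative."""
    scores = fisher_scores(X, y)
    return sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)


# Hypothetical toy data: feature 0 separates the classes, feature 1 does not.
X = [[1.0, 5.0], [1.2, 3.0], [0.8, 4.0], [3.0, 4.5], [3.2, 3.5], [2.8, 4.0]]
y = ["calm", "calm", "calm", "angry", "angry", "angry"]
ranking = rank_features(X, y)
```

In a hybrid filter-wrapper setup, a ranking like this would typically bias which features the wrapper search tries first, rather than select features on its own.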
Exploring Sequential Feature Selection in Deep Bi-LSTM Models for Speech Emotion Recognition
2
Authors: Fatma Harby, Mansor Alohali, Adel Thaljaoui, Amira Samy Talaat. Computers, Materials & Continua (SCIE, EI), 2024, Issue 2, pp. 2689-2719 (31 pages).
Machine Learning (ML) algorithms play a pivotal role in Speech Emotion Recognition (SER), although they encounter a formidable obstacle in accurately discerning a speaker's emotional state. The examination of the emotional states of speakers holds significant importance in a range of real-time applications, including but not limited to virtual reality, human-robot interaction, emergency centers, and human behavior assessment. Accurately identifying emotions in the SER process relies on extracting relevant information from audio inputs. Previous studies on SER have predominantly utilized short-time characteristics such as Mel Frequency Cepstral Coefficients (MFCCs) due to their ability to capture the periodic nature of audio signals effectively. Although these traits may improve their ability to perceive and interpret emotional depictions appropriately, MFCCs have some limitations. This study therefore aims to tackle the aforementioned issue by systematically picking multiple audio cues, enhancing the classifier model's efficacy in accurately discerning human emotions. The dataset is taken from the EMO-DB database. Input speech is preprocessed with a 2D Convolutional Neural Network (CNN), which applies convolutional operations to spectrograms, as they afford a visual representation of the way the frequency content of the audio signal changes over time. The next step is spectrogram data normalization, which is crucial for Neural Network (NN) training as it aids faster convergence. Then the five auditory features MFCCs, Chroma, Mel-Spectrogram, Contrast, and Tonnetz are extracted from the spectrogram sequentially. The aim of feature selection is to retain only dominant features by excluding the irrelevant ones. In this paper, the Sequential Forward Selection (SFS) and Sequential Backward Selection (SBS) techniques were employed to select features from the multiple audio cues. Finally, the feature sets composed from the hybrid feature extraction methods are fed into the deep Bidirectional Long Short Term Memory (Bi-LSTM) network to discern emotions. Since the deep Bi-LSTM can hierarchically learn complex features and increase model capacity by achieving more robust temporal modeling, it is more effective than a shallow Bi-LSTM in capturing the intricate tones of emotional content existent in speech signals. The effectiveness and resilience of the proposed SER model were evaluated by experiments, comparing it to state-of-the-art SER techniques. The results indicated that the model achieved accuracy rates of 90.92%, 93%, and 92% over the Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS), Berlin Database of Emotional Speech (EMO-DB), and The Interactive Emotional Dyadic Motion Capture (IEMOCAP) datasets, respectively. These findings signify a prominent enhancement in the ability to identify emotional depictions in speech, showcasing the potential of the proposed model in advancing the SER field.
Keywords: Artificial intelligence application; multi features sequential selection; speech emotion recognition; deep Bi-LSTM
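Sequential Forward Selection, named in this abstract, is a greedy procedure: start from an empty set and repeatedly add the single feature that most improves a score, stopping when nothing helps. The sketch below shows that loop in a generic form. The scoring function, its gain values, and the redundancy penalty are hypothetical stand-ins for a real validation metric (for example, classifier accuracy); they are not taken from the paper.

```python
def sequential_forward_selection(candidates, score, max_features):
    """Greedily grow a feature subset while the score keeps improving."""
    selected = []
    best_score = float("-inf")
    remaining = list(candidates)
    while remaining and len(selected) < max_features:
        # Try adding each remaining feature and keep the best trial.
        trial_score, trial_feature = max(
            (score(selected + [f]), f) for f in remaining
        )
        if trial_score <= best_score:
            break  # no candidate improves the subset, so stop early
        selected.append(trial_feature)
        remaining.remove(trial_feature)
        best_score = trial_score
    return selected, best_score


# Hypothetical per-feature gains (arbitrary units), for illustration only.
GAIN = {"mfcc": 60, "chroma": 10, "mel": 55, "contrast": 5, "tonnetz": -2}


def toy_score(subset):
    """Stand-in for a validation metric; penalises one redundant pair."""
    total = sum(GAIN[f] for f in subset)
    if "mfcc" in subset and "mel" in subset:
        total -= 58  # assumed redundancy between MFCC and Mel-spectrogram
    return total


selected, best = sequential_forward_selection(list(GAIN), toy_score, 5)
```

Sequential Backward Selection follows the mirror-image loop: start from the full set and repeatedly drop the feature whose removal improves the score most.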
Improved Speech Emotion Recognition Focusing on High-Level Data Representations and Swift Feature Extraction Calculation
3
Authors: Akmalbek Abdusalomov, Alpamis Kutlimuratov, Rashid Nasimov, Taeg Keun Whangbo. Computers, Materials & Continua (SCIE, EI), 2023, Issue 12, pp. 2915-2933 (19 pages).
The performance of a speech emotion recognition (SER) system is heavily influenced by the efficacy of its feature extraction techniques. The study was designed to advance the field of SER by optimizing feature extraction techniques, specifically through the incorporation of high-resolution Mel-spectrograms and the expedited calculation of Mel Frequency Cepstral Coefficients (MFCC). This initiative aimed to refine the system's accuracy by identifying and mitigating the shortcomings commonly found in current approaches. Ultimately, the primary objective was to elevate both the intricacy and effectiveness of our SER model, with a focus on augmenting its proficiency in the accurate identification of emotions in spoken language. The research employed a dual-strategy approach for feature extraction. Firstly, a rapid computation technique for MFCC was implemented and integrated with a Bi-LSTM layer to optimize the encoding of MFCC features. Secondly, a pretrained ResNet model was utilized in conjunction with feature Stats pooling and dense layers for the effective encoding of Mel-spectrogram attributes. These two sets of features underwent separate processing before being combined in a Convolutional Neural Network (CNN) outfitted with a dense layer, with the aim of enhancing their representational richness. The model was rigorously evaluated using two prominent databases: CMU-MOSEI and RAVDESS. Notable findings include an accuracy rate of 93.2% on the CMU-MOSEI database and 95.3% on the RAVDESS database. Such exceptional performance underscores the efficacy of this innovative approach, which not only meets but also exceeds the accuracy benchmarks established by traditional models in the field of speech emotion recognition.
Keywords: feature extraction; MFCC; ResNet; speech emotion recognition
Modified Cepstral Feature for Speech Anti-spoofing
4
Authors: 何明瑞, ZAIDI Syed Faham Ali, 田娩鑫, 单志勇, 江政儒, 徐珑婷. Journal of Donghua University (English Edition) (CAS), 2023, Issue 2, pp. 193-201 (9 pages).
The hidden danger of the automatic speaker verification (ASV) system is various spoofed speeches. These threats can be classified into two categories, namely logical access (LA) and physical access (PA). To improve the identification capability of spoofed speech detection, this paper considers the research on features. Firstly, following the idea of modifying the constant-Q-based features, this work considered adding variance or mean to the constant-Q-based cepstral domain to obtain good performance. Secondly, linear frequency cepstral coefficients (LFCCs) performed comparably with constant-Q-based features. Finally, we proposed linear frequency variance-based cepstral coefficients (LVCCs) and linear frequency mean-based cepstral coefficients (LMCCs) for identification of speech spoofing. LVCCs and LMCCs could be attained by adding the frame variance or the mean to the log magnitude spectrum based on LFCC features. The proposed novel features were evaluated on the ASVspoof 2019 dataset. The experimental results show that, compared with known hand-crafted features, LVCCs and LMCCs are more effective in resisting spoofed speech attacks.
Keywords: spoofed speech detection; log magnitude spectrum; linear frequency cepstral coefficient (LFCC); hand-crafted feature
Japanese Sign Language Recognition by Combining Joint Skeleton-Based Handcrafted and Pixel-Based Deep Learning Features with Machine Learning Classification
5
Authors: Jungpil Shin, Md. Al Mehedi Hasan, Abu Saleh Musa Miah, Kota Suzuki, Koki Hirooka. Computer Modeling in Engineering & Sciences (SCIE, EI), 2024, Issue 6, pp. 2605-2625 (21 pages).
Sign language recognition is vital for enhancing communication accessibility among the Deaf and hard-of-hearing communities. In Japan, approximately 360,000 individuals with hearing and speech disabilities rely on Japanese Sign Language (JSL) for communication. However, existing JSL recognition systems have faced significant performance limitations due to inherent complexities. In response to these challenges, we present a novel JSL recognition system that employs a strategic fusion approach, combining joint skeleton-based handcrafted features and pixel-based deep learning features. Our system incorporates two distinct streams: the first stream extracts crucial handcrafted features, emphasizing the capture of hand and body movements within JSL gestures. Simultaneously, a deep learning-based transfer learning stream captures hierarchical representations of JSL gestures in the second stream. Then, we concatenated the critical information of the first stream and the hierarchy of the second stream features to produce the multiple levels of the fusion features, aiming to create a comprehensive representation of the JSL gestures. After reducing the dimensionality of the feature, a feature selection approach and a kernel-based support vector machine (SVM) were used for the classification. To assess the effectiveness of our approach, we conducted extensive experiments on our Lab JSL dataset and a publicly available Arabic sign language (ArSL) dataset. Our results unequivocally demonstrate that our fusion approach significantly enhances JSL recognition accuracy and robustness compared to individual feature sets or traditional recognition methods.
Keywords: Japanese Sign Language (JSL); hand gesture recognition; geometric feature; distance feature; angle feature; GoogleNet
Artificial intelligence-driven radiomics study in cancer: the role of feature engineering and modeling
6
Authors: Yuan-Peng Zhang, Xin-Yun Zhang, Yu-Ting Cheng, Bing Li, Xin-Zhi Teng, Jiang Zhang, Saikit Lam, Ta Zhou, Zong-Rui Ma, Jia-Bao Sheng, Victor CW Tam, Shara WY Lee, Hong Ge, Jing Cai. Military Medical Research (SCIE, CAS, CSCD), 2024, Issue 1, pp. 115-147 (33 pages).
Modern medicine is reliant on various medical imaging technologies for non-invasively observing patients' anatomy. However, the interpretation of medical images can be highly subjective and dependent on the expertise of clinicians. Moreover, some potentially useful quantitative information in medical images, especially that which is not visible to the naked eye, is often ignored during clinical practice. In contrast, radiomics performs high-throughput feature extraction from medical images, which enables quantitative analysis of medical images and prediction of various clinical endpoints. Studies have reported that radiomics exhibits promising performance in diagnosis and predicting treatment responses and prognosis, demonstrating its potential to be a non-invasive auxiliary tool for personalized medicine. However, radiomics remains in a developmental phase as numerous technical challenges have yet to be solved, especially in feature engineering and statistical modeling. In this review, we introduce the current utility of radiomics by summarizing research on its application in the diagnosis, prognosis, and prediction of treatment responses in patients with cancer. We focus on machine learning approaches for feature extraction and selection during feature engineering, and for imbalanced datasets and multi-modality fusion during statistical modeling. Furthermore, we introduce the stability, reproducibility, and interpretability of features, and the generalizability and interpretability of models. Finally, we offer possible solutions to current challenges in radiomics research.
Keywords: Artificial intelligence; Radiomics; feature extraction; feature selection; Modeling; Interpretability; Multimodalities; Head and neck cancer
Cross-Dimension Attentive Feature Fusion Network for Unsupervised Time-Series Anomaly Detection
7
Authors: Rui Wang, Yao Zhou, Guangchun Luo, Peng Chen, Dezhong Peng. Computer Modeling in Engineering & Sciences (SCIE, EI), 2024, Issue 6, pp. 3011-3027 (17 pages).
Time series anomaly detection is crucial in various industrial applications to identify unusual behaviors within the time series data. Due to the challenges associated with annotating anomaly events, time series reconstruction has become a prevalent approach for unsupervised anomaly detection. However, effectively learning representations and achieving accurate detection results remain challenging due to the intricate temporal patterns and dependencies in real-world time series. In this paper, we propose a cross-dimension attentive feature fusion network for time series anomaly detection, referred to as CAFFN. Specifically, a series and feature mixing block is introduced to learn representations in 1D space. Additionally, a fast Fourier transform is employed to convert the time series into 2D space, providing the capability for 2D feature extraction. Finally, a cross-dimension attentive feature fusion mechanism is designed that adaptively integrates features across different dimensions for anomaly detection. Experimental results on real-world time series datasets demonstrate that CAFFN performs better than other competing methods in time series anomaly detection.
Keywords: Time series anomaly detection; unsupervised feature learning; feature fusion
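The abstract states only that a fast Fourier transform is used to convert the series into 2D space; it does not say how. One common way to do this, shown below purely as an assumption, is to estimate the dominant period from the frequency spectrum and fold the series into rows of one period each. The sketch uses a naive discrete Fourier transform from the standard library instead of an optimized FFT, and all function names are illustrative.

```python
import cmath
import math


def dominant_period(series):
    """Period (in samples) of the strongest non-constant DFT component."""
    n = len(series)
    mean = sum(series) / n
    centred = [x - mean for x in series]
    best_k, best_power = 1, -1.0
    for k in range(1, n // 2 + 1):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * t / n)
            for t, x in enumerate(centred)
        )
        if abs(coeff) > best_power:
            best_k, best_power = k, abs(coeff)
    return round(n / best_k)


def fold_to_2d(series, period):
    """Reshape a 1D series into rows of one period each (tail is dropped)."""
    rows = len(series) // period
    return [series[r * period:(r + 1) * period] for r in range(rows)]


# Synthetic example: a sine wave with a period of 8 samples.
series = [math.sin(2 * math.pi * t / 8) for t in range(64)]
period = dominant_period(series)
grid = fold_to_2d(series, period)
```

With this layout, columns align the same phase across periods and rows hold consecutive samples within a period, which is what makes 2D feature extraction meaningful on the folded array.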
Aerodynamic Features of High-Speed Maglev Trains with Different Marshaling Lengths Running on a Viaduct under Crosswinds
8
Authors: Zun-Di Huang, Zhen-Bin Zhou, Ning Chang, Zheng-Wei Chen, Su-Mei Wang. Computer Modeling in Engineering & Sciences (SCIE, EI), 2024, Issue 7, pp. 975-996 (22 pages).
The safety and stability of high-speed maglev trains traveling on viaducts in crosswinds critically depend on their aerodynamic characteristics. Therefore, this paper uses an improved delayed detached eddy simulation (IDDES) method to investigate the aerodynamic features of high-speed maglev trains with different marshaling lengths under crosswinds. The effects of marshaling lengths (varying from 3-car to 8-car groups) on the train's aerodynamic performance, surface pressure, and the flow field surrounding the train were investigated using the three-dimensional unsteady compressible Navier-Stokes (N-S) equations. The results showed that the marshaling lengths had minimal influence on the aerodynamic performance of the head and middle cars. Conversely, the marshaling lengths are negatively correlated with the time-average side force coefficient (CS) and time-average lift force coefficient (Cl) of the tail car. Compared to the tail car of the 3-car groups, the CS and Cl fell by 27.77% and 18.29%, respectively, for the tail car of the 8-car groups. It is essential to pay more attention to the operational safety of the head car, as it exhibits the highest time-average CS. Additionally, the mean pressure difference between the two sides of the tail car body increased with the marshaling lengths, and the side force direction on the tail car was opposite to that of the head and middle cars. Furthermore, the turbulent kinetic energy of the wake structure on the windward side quickly decreased as marshaling lengths increased.
Keywords: High-speed maglev train; marshaling lengths; crosswinds; aerodynamic features
Feature Matching via Topology-Aware Graph Interaction Model
9
Authors: Yifan Lu, Jiayi Ma, Xiaoguang Mei, Jun Huang, Xiao-Ping Zhang. IEEE/CAA Journal of Automatica Sinica (SCIE, EI, CSCD), 2024, Issue 1, pp. 113-130 (18 pages).
Feature matching plays a key role in computer vision. However, due to the limitations of the descriptors, the putative matches are inevitably contaminated by massive outliers. This paper attempts to tackle the outlier filtering problem from two aspects. First, a robust and efficient graph interaction model is proposed, with the assumption that matches are correlated with each other rather than independently distributed. To this end, we construct a graph based on the local relationships of matches and formulate the outlier filtering task as a binary labeling energy minimization problem, where the pairwise term encodes the interaction between matches. We further show that this formulation can be solved globally by a graph cut algorithm. Our new formulation always improves the performance of the previous locality-based method without noticeable deterioration in processing time, adding a few milliseconds. Second, to construct a better graph structure, a robust and geometrically meaningful topology-aware relationship is developed to capture the topology relationship between matches. The two components together lead to topology interaction matching (TIM), an effective and efficient method for outlier filtering. Extensive experiments on several large and diverse datasets for multiple vision tasks, including general feature matching, relative pose estimation, homography and fundamental matrix estimation, loop-closure detection, and multi-modal image matching, demonstrate that our TIM is more competitive than current state-of-the-art methods in terms of generality, efficiency, and effectiveness. The source code is publicly available at http://github.com/YifanLu2000/TIM.
Keywords: feature matching; graph cut; outlier filtering; topology preserving
Clinical features and prognostic factors of duodenal neuroendocrine tumours: A comparative study of ampullary and nonampullary regions
10
Authors: Sa Fang, Yu-Peng Shi, Lu Wang, Shuang Han, Yong-Quan Shi. World Journal of Gastrointestinal Oncology (SCIE), 2024, Issue 3, pp. 907-918 (12 pages).
BACKGROUND: Duodenal neuroendocrine tumours (DNETs) are rare neoplasms. However, the incidence of DNETs has been increasing in recent years, especially as an incidental finding during endoscopic studies. Regrettably, there is no consensus regarding the ideal treatment of DNETs, and there are few studies on the clinical features and survival analysis of DNETs. AIM: To analyze the clinical characteristics and prognostic factors of patients with duodenal neuroendocrine tumours. METHODS: The clinical data of DNETs diagnosed in the First Affiliated Hospital of Air Force Military Medical University from June 2011 to July 2022 were collected. Neuroendocrine tumours located in the ampulla area of the duodenum were assigned to the ampullary region group; neuroendocrine tumours in any part of the duodenum outside the ampullary area were assigned to the nonampullary region group. In a retrospective study, the clinical characteristics of the two groups and risk factors affecting the survival of DNET patients were analysed. RESULTS: Twenty-nine DNET patients were screened. The male to female ratio was 1:1.9, and females comprised the majority. The ampullary region group accounted for 24.1% (7/29), while the nonampullary region group accounted for 75.9% (22/29). When diagnosed, the clinical symptoms of the ampullary region group were mainly abdominal pain (85.7%), while those of the nonampullary region group were mainly abdominal distension (59.1%). There were differences in the composition of staging of tumours between the two groups (Fisher's exact probability method, P=0.001), with nonampullary stage II tumours (68.2%) being the main stage (P<0.05). After the diagnosis of DNETs, the survival rate of the ampullary region group was 14.3% (1/7), which was lower than the 72.7% (16/22) in the nonampullary region group (Fisher's exact probability method, P=0.011). The survival time of the ampullary region group was shorter than that of the nonampullary region group (P<0.000). The median survival time of the ampullary region group was 10.0 months and that of the nonampullary region group was 451.0 months. Multivariate analysis showed that tumours in the ampulla region and no surgical treatment after diagnosis were independent risk factors for the survival of DNET patients (HR=0.029, 95%CI: 0.004-0.199, P<0.000; HR=12.609, 95%CI: 2.889-55.037, P=0.001). Further analysis of nonampullary DNET patients showed that the survival time of patients with a tumour diameter <2 cm was longer than that of patients with a tumour diameter ≥2 cm (t=7.243, P=0.048). As of follow-up, 6 patients who died of nonampullary DNETs had a tumour diameter that was ≥2 cm, and 3 patients in stage IV had liver metastasis. Patients with a tumour diameter <2 cm underwent surgical treatment, and all survived after surgery. CONCLUSION: Surgical treatment is a protective factor for prolonging the survival of DNET patients. Compared to DNETs in the ampullary region, patients in the nonampullary region group had a longer survival period. The liver is the organ most susceptible to distant metastasis of nonampullary DNETs.
Keywords: Duodenum; Neuroendocrine tumour; Ampullary; Nonampullary; Clinical features; Prognostic
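The survival comparison in this abstract (1 of 7 versus 16 of 22 survivors) is reported with Fisher's exact test. The sketch below recomputes a two-sided p-value for that 2x2 table with the standard hypergeometric formulation, summing all tables no more likely than the observed one. The counts of deaths (6 and 6) are derived from the reported fractions, and the function name is illustrative; the abstract does not say which software or sidedness convention the authors used.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    total = row1 + row2
    denom = comb(total, row1)

    def prob(k):
        # Hypergeometric probability that the top-left cell equals k.
        return comb(col1, k) * comb(total - col1, row1 - k) / denom

    p_observed = prob(a)
    lo = max(0, row1 - (total - col1))
    hi = min(row1, col1)
    # Sum every table that is no more likely than the observed one.
    return sum(
        prob(k) for k in range(lo, hi + 1)
        if prob(k) <= p_observed * (1 + 1e-9)
    )


# Rows: ampullary, nonampullary. Columns: survived, died.
p_value = fisher_exact_two_sided(1, 6, 16, 6)
print(round(p_value, 3))  # 0.011
```

Under these assumptions the result agrees with the P=0.011 quoted in the abstract.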
Audio-Text Multimodal Speech Recognition via Dual-Tower Architecture for Mandarin Air Traffic Control Communications
11
Authors: Shuting Ge, Jin Ren, Yihua Shi, Yujun Zhang, Shunzhi Yang, Jinfeng Yang. Computers, Materials & Continua (SCIE, EI), 2024, Issue 3, pp. 3215-3245 (31 pages).
In air traffic control communications (ATCC), misunderstandings between pilots and controllers could result in fatal aviation accidents. Fortunately, advanced automatic speech recognition technology has emerged as a promising means of preventing miscommunications and enhancing aviation safety. However, most existing speech recognition methods merely incorporate external language models on the decoder side, leading to insufficient semantic alignment between speech and text modalities during the encoding phase. Furthermore, it is challenging to model acoustic context dependencies over long distances because speech sequences are longer than text, especially for the extended ATCC data. To address these issues, we propose a speech-text multimodal dual-tower architecture for speech recognition. It employs cross-modal interactions to achieve close semantic alignment during the encoding stage and strengthen its capabilities in modeling auditory long-distance context dependencies. In addition, a two-stage training strategy is elaborately devised to derive semantics-aware acoustic representations effectively. The first stage focuses on pre-training the speech-text multimodal encoding module to enhance inter-modal semantic alignment and aural long-distance context dependencies. The second stage fine-tunes the entire network to bridge the input modality variation gap between the training and inference phases and boost generalization performance. Extensive experiments demonstrate the effectiveness of the proposed speech-text multimodal speech recognition method on the ATCC and AISHELL-1 datasets. It reduces the character error rate to 6.54% and 8.73%, respectively, and exhibits substantial performance gains of 28.76% and 23.82% compared with the best baseline model. The case studies indicate that the obtained semantics-aware acoustic representations aid in accurately recognizing terms with similar pronunciations but distinctive semantics. The research provides a novel modeling paradigm for semantics-aware speech recognition in air traffic control communications, which could contribute to the advancement of intelligent and efficient aviation safety management.
Keywords: speech-text multimodal; automatic speech recognition; semantic alignment; air traffic control communications; dual-tower architecture
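The results in this abstract are reported as character error rate (CER). For readers unfamiliar with the metric, the sketch below uses the usual definition: the character-level edit distance (substitutions, insertions, deletions) between the reference and the recognized text, divided by the reference length. The example transcript pair is made up for illustration and does not come from the ATCC or AISHELL-1 datasets.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences, one row at a time."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            curr.append(min(
                prev[j] + 1,             # deletion
                curr[j - 1] + 1,         # insertion
                prev[j - 1] + (r != h),  # substitution, or match at no cost
            ))
        prev = curr
    return prev[-1]


def character_error_rate(ref, hyp):
    """Edit distance normalised by the number of reference characters."""
    return edit_distance(ref, hyp) / len(ref)


# Hypothetical pair: one similar-sounding digit is misrecognised.
reference = "上升到九千米"
hypothesis = "上升到六千米"
cer = character_error_rate(reference, hypothesis)
```

Here one of six reference characters is wrong, so the CER is 1/6. Because insertions are counted, CER can exceed 1 when the hypothesis is much longer than the reference.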
A Fusion Localization Method Based on Target Measurement Error Feature Complementarity and Its Application
12
Authors: Xin Yang, Hongming Liu, Xiaoke Wang, Wen Yu, Jingqiu Liu, Sipei Zhang. Journal of Beijing Institute of Technology (EI, CAS), 2024, Issue 1, pp. 75-88 (14 pages).
In multi-radar networking systems, to address the difficulty and low accuracy of cooperatively locating long-distance targets, a dual-station joint positioning method based on the complementarity of target measurement error features is proposed. For dual-station joint positioning, by constructing the target positioning error distribution model and using the complementarity of spatial measurement errors of the same long-distance target, the area with a high probability of target existence can be obtained. Then, based on the target distance information, the midpoint of the intersection between the target positioning sphere and the positioning tangent plane can be solved to acquire the target's optimal positioning result. The simulation demonstrates that this method greatly improves the positioning accuracy of the target in the azimuth direction. Compared with the traditional dynamic weighted fusion (DWF) algorithm and the filter-based dynamic weighted fusion (FBDWF) algorithm, it not only effectively eliminates the influence of systematic error in the azimuth direction, but also has low computational complexity. Furthermore, for the application scenarios of multi-radar collaborative positioning and multi-sensor data compression filtering in centralized information fusion, it is recommended to use radars with higher ranging accuracy and baseline lengths between radars of 20–100 km.
Keywords: dual-station positioning; feature complementarity; information fusion; engineering applicability
Olive Leaf Disease Detection via Wavelet Transform and Feature Fusion of Pre-Trained Deep Learning Models
13
Authors: Mahmood A. Mahmood, Khalaf Alsalem. Computers, Materials & Continua (SCIE, EI), 2024, Issue 3, pp. 3431-3448 (18 pages).
Olive trees are susceptible to a variety of diseases that can cause significant crop damage and economic losses. Early detection of these diseases is essential for effective management. We propose a novel transformed wavelet, feature-fused, pre-trained deep learning model for detecting olive leaf diseases. The proposed model combines wavelet transforms with pre-trained deep-learning models to extract discriminative features from olive leaf images. The model has four main phases: preprocessing using data augmentation, three-level wavelet transformation, learning using pre-trained deep learning models, and a fused deep learning model. In the preprocessing phase, the image dataset is augmented using techniques such as resizing, rescaling, flipping, rotation, zooming, and contrasting. In wavelet transformation, the augmented images are decomposed into three frequency levels. Three pre-trained deep learning models, EfficientNet-B7, DenseNet-201, and ResNet-152-V2, are used in the learning phase. The models were trained using the approximate images of the third-level sub-band of the wavelet transform. In the fused phase, the fused model consists of a merge layer, three dense layers, and two dropout layers. The proposed model was evaluated using a dataset of images of healthy and infected olive leaves. It achieved an accuracy of 99.72% in the diagnosis of olive leaf diseases, which exceeds the accuracy of other methods reported in the literature. This finding suggests that our proposed method is a promising tool for the early detection of olive leaf diseases.
Keywords: Olive leaf diseases; wavelet transform; deep learning; feature fusion
Point Cloud Classification Using Content-Based Transformer via Clustering in Feature Space
14
Authors: Yahui Liu, Bin Tian, Yisheng Lv, Lingxi Li, Fei-Yue Wang. IEEE/CAA Journal of Automatica Sinica (SCIE, EI, CSCD), 2024, Issue 1, pp. 231-239 (9 pages).
Recently, there have been some attempts to apply Transformers to 3D point cloud classification. In order to reduce computations, most existing methods focus on local spatial attention, but ignore the content of the points and fail to establish relationships between distant but relevant points. To overcome the limitation of local spatial attention, we propose a point content-based Transformer architecture, called PointConT for short. It exploits the locality of points in the feature space (content-based), which clusters the sampled points with similar features into the same class and computes the self-attention within each class, thus enabling an effective trade-off between capturing long-range dependencies and computational complexity. We further introduce an inception feature aggregator for point cloud classification, which uses parallel structures to aggregate high-frequency and low-frequency information in each branch separately. Extensive experiments show that our PointConT model achieves a remarkable performance on point cloud shape classification. In particular, our method exhibits 90.3% Top-1 accuracy on the hardest setting of ScanObjectNN. Source code of this paper is available at https://github.com/yahuiliu99/PointConT.
Keywords: content-based Transformer; deep learning; feature aggregator; local attention; point cloud classification
A Weakly-Supervised Crowd Density Estimation Method Based on Two-Stage Linear Feature Calibration
15
Authors: Yong-Chao Li, Rui-Sheng Jia, Ying-Xiang Hu, Hong-Mei Sun. IEEE/CAA Journal of Automatica Sinica (SCIE, EI, CSCD), 2024, Issue 4, pp. 965-981 (17 pages)
In a crowd density estimation dataset, annotating crowd locations is an extremely laborious task, and the locations are not taken into account in the evaluation metrics. In this paper, we aim to reduce the annotation cost of crowd datasets and propose a crowd density estimation method based on weakly-supervised learning that, in the absence of crowd position supervision information, directly reduces the number of crowds by using the number of pedestrians in the image as the supervised information. For this purpose, we design a new training method, which exploits the correlation between global and local image features by incremental learning to train the network. Specifically, we design a parent-child network (PC-Net) focusing on the global and local image respectively, and propose a linear feature calibration structure to train the PC-Net simultaneously. The child network learns feature transfer factors and feature bias weights, and uses them to linearly calibrate the features extracted from the parent network, improving the convergence of the network by using local features hidden in the crowd images. In addition, we use the pyramid vision transformer as the backbone of the PC-Net to extract crowd features at different levels, and design a global-local feature loss function (L2). We combine it with a crowd counting loss (LC) to enhance the sensitivity of the network to crowd features during training, which effectively improves the accuracy of crowd density estimation. The experimental results show that the PC-Net significantly reduces the gap between fully-supervised and weakly-supervised crowd density estimation, and outperforms the comparison methods on five datasets: ShanghaiTech Part A, ShanghaiTech Part B, UCF_CC_50, UCF_QNRF and JHU-CROWD++.
Keywords: crowd density estimation; linear feature calibration; vision transformer; weakly-supervised learning
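The linear feature calibration mentioned in this abstract can be read as an affine adjustment of the parent network's features. The sketch below assumes an element-wise form, with one transfer factor and one bias weight per feature channel; the abstract does not give the exact formulation, and the function name `calibrate` is hypothetical.

```python
def calibrate(parent_features, transfer_factors, bias_weights):
    """Element-wise linear calibration: factor * feature + bias (assumed form)."""
    if not (len(parent_features) == len(transfer_factors) == len(bias_weights)):
        raise ValueError("all three vectors must have the same length")
    return [
        factor * feature + bias
        for feature, factor, bias in zip(parent_features, transfer_factors, bias_weights)
    ]


# Toy values (assumed): in the paper's setting the factors and biases
# would be produced by the child network rather than fixed by hand.
calibrated = calibrate([1.0, 2.0, 4.0], [2.0, 0.5, 1.0], [0.0, 1.0, -1.0])
print(calibrated)  # [2.0, 2.0, 3.0]
```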
An Improved Binary Quantum-based Avian Navigation Optimizer Algorithm to Select Effective Feature Subset from Medical Data:A COVID-19 Case Study
16
Authors: Ali Fatahi, Mohammad H. Nadimi-Shahraki, Hoda Zamani. Journal of Bionic Engineering (SCIE, EI, CSCD), 2024, Issue 1, pp. 426-446 (21 pages)
Feature Subset Selection (FSS) is an NP-hard problem of removing redundant and irrelevant features, particularly from medical data, and it can be effectively addressed by metaheuristic algorithms. However, existing binary versions of metaheuristic algorithms have issues with convergence and lack an effective binarization method, resulting in suboptimal solutions that hinder diagnosis and prediction accuracy. This paper proposes an Improved Binary Quantum-based Avian Navigation Optimizer Algorithm (IBQANA) for FSS in medical data preprocessing to address the suboptimal solutions arising from binary versions of metaheuristic algorithms. The contributions of IBQANA include the Hybrid Binary Operator (HBO) and the Distance-based Binary Search Strategy (DBSS). HBO is designed to convert continuous values into binary solutions, even for values outside the [0, 1] range, ensuring accurate binary mapping. DBSS is a two-phase search strategy that enhances the performance of inferior search agents and accelerates convergence. By combining exploration and exploitation phases based on an adaptive probability function, DBSS effectively avoids local optima. The effectiveness of applying HBO is compared with five transfer function families and thresholding on 12 medical datasets, with feature numbers ranging from 8 to 10,509. IBQANA's effectiveness is evaluated in terms of accuracy, fitness, and number of selected features and compared with seven binary metaheuristic algorithms. Furthermore, IBQANA is utilized to detect COVID-19. The results reveal that the proposed IBQANA outperforms all comparative algorithms on COVID-19 and 11 other medical datasets. The proposed method presents a promising solution to the FSS problem in medical data preprocessing.
Keywords: feature subset selection; optimization; binary metaheuristic algorithms; bioinspired; machine learning; medical datasets
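For context, the two baseline binarization approaches this abstract compares HBO against (transfer functions and thresholding) can be sketched as below. This is not the paper's HBO, whose details the abstract does not give; it shows a generic S-shaped (sigmoid) transfer function and a fixed threshold, and all function names are hypothetical.

```python
import math
import random


def sigmoid_transfer(x):
    """S-shaped transfer function mapping any real value into [0, 1]."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)  # numerically stable branch for negative inputs
    return e / (1.0 + e)


def binarize_stochastic(position, rng):
    """Select feature i with probability sigmoid_transfer(position[i])."""
    return [1 if rng.random() < sigmoid_transfer(x) else 0 for x in position]


def binarize_threshold(position, threshold=0.5):
    """Select feature i when its continuous value exceeds a fixed threshold."""
    return [1 if x > threshold else 0 for x in position]


# Toy continuous position of one search agent (assumed values).
position = [-0.3, 0.2, 0.9, 1.7]
print(binarize_threshold(position))  # [0, 0, 1, 1]
print(binarize_stochastic(position, random.Random(0)))
```

A bit value of 1 means the corresponding feature is kept in the candidate subset. Fixed thresholding treats every value above the cut-off alike, which illustrates why values outside the [0, 1] range are a concern for binary mapping.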
Attention Guided Multi Scale Feature Fusion Network for Automatic Prostate Segmentation
17
Authors: Yuchun Li, Mengxing Huang, Yu Zhang, Zhiming Bai. Computers, Materials & Continua (SCIE, EI), 2024, Issue 2, pp. 1649-1668 (20 pages)
The precise and automatic segmentation of prostate magnetic resonance imaging (MRI) images is vital for assisting doctors in diagnosing prostate diseases. In recent years, many advanced methods have been applied to prostate segmentation, but due to the variability caused by prostate diseases, automatic segmentation of the prostate presents significant challenges. In this paper, we propose an attention-guided multi-scale feature fusion network (AGMSF-Net) to segment prostate MRI images. We propose an attention mechanism for extracting multi-scale features, and introduce a 3D transformer module to enhance global feature representation by adding it during the transition phase from encoder to decoder. In the decoder stage, a feature fusion module is proposed to obtain global context information. We evaluate our model on MRI images of the prostate acquired from a local hospital. The relative volume difference (RVD) and dice similarity coefficient (DSC) between the results of automatic prostate segmentation and ground truth were 1.21% and 93.68%, respectively. To quantitatively evaluate prostate volume on MRI, which is of great clinical significance, we propose a unique AGMSF-Net. The essential performance evaluation and validation experiments have demonstrated the effectiveness of our method in automatic prostate segmentation.
Keywords: prostate segmentation; multi-scale attention; 3D Transformer; feature fusion; MRI
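The two metrics reported in this abstract can be computed from binary masks as sketched below. The definitions used are the common ones, DSC = 2|A∩B| / (|A| + |B|) and RVD = |V_pred - V_truth| / V_truth; the abstract does not state which variant of RVD the authors use, so the absolute-value form here is an assumption, and the function names are hypothetical.

```python
def dice_similarity(pred, truth):
    """Dice similarity coefficient between two flattened binary masks."""
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    return 1.0 if total == 0 else 2.0 * intersection / total


def relative_volume_difference(pred, truth):
    """Absolute volume difference relative to the ground-truth volume."""
    v_pred = sum(1 for p in pred if p)
    v_truth = sum(1 for t in truth if t)
    if v_truth == 0:
        raise ValueError("ground-truth mask is empty")
    return abs(v_pred - v_truth) / v_truth


# Toy flattened masks (assumed data): 1 marks a prostate voxel.
pred = [1, 1, 0, 0]
truth = [1, 1, 1, 1]
print(round(dice_similarity(pred, truth), 4))   # 0.6667
print(relative_volume_difference(pred, truth))  # 0.5
```

The two metrics are complementary: a mask can match the ground-truth volume exactly (RVD of 0) while overlapping it poorly, which only DSC reveals.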
Epidemiology, Clinical Features and Antifungal Resistance Profile of Candida auris in Africa: Systematic Review
18
Authors: Isidore Wendkièta Yerbanga, Seydou Nakanabo Diallo, Toussaint Rouamba, Delwendé Florence Ouedraogo, Katrien Lagrou, Rita Oladele, Jean-Pierre Gangneux, Olivier Denis, Hector Rodriguez-Villalobos, Isabel Montesinos, Sanata Bamba. Journal of Biosciences and Medicines, 2024, Issue 1, pp. 126-149 (24 pages)
Since its discovery in 2009, Candida auris has become a severe threat to human health due to its rapid spread and its high worldwide resistance to systemic antifungal drugs. In resource-constrained settings where several conditions are met for its emergence and spread, this worrisome fungus could cause large hospital and/or community-based outbreaks. This review aimed to summarize the available data on C. auris in Africa, focusing on its epidemiology and antifungal resistance profile. Major databases were searched for articles on the epidemiology and antifungal resistance profile of C. auris in Africa. Out of 2,521 articles identified, 22 met the inclusion criteria. Nearly 89% of African countries have no published data on C. auris. The prevalence of C. auris in Africa was 8.74%. The case fatality rate of C. auris infection in Africa was 39.46%. The main C. auris risk factors reported in Africa were cardiovascular disease, renal failure, diabetes, HIV, recent intake of antimicrobial drugs, ICU admissions, surgery, hemodialysis, parenteral nutrition and indwelling devices. Four phylogenetic clades were reported in Africa, namely clades I, II, III and IV. Candida auris showed a pan-African very high resistance rate to fluconazole, moderate resistance to amphotericin B, and high susceptibility to echinocandins. Finally, C. auris clade-specific mutations were observed within the ERG2, ERG3, ERG9, ERG11, FKS1, TAC1b and MRR1 genes in Africa. This systematic review showed the presence of C. auris in the African continent and a worrying unavailability of data on this resilient fungus in most African countries.
Keywords: Africa; antifungal resistance; Candida auris; clinical features; phylogenetic clades
A process-oriented approach for identifying potential landslides considering time-dependent behaviors beyond geomorphological features
19
Authors: Xiang Sun, Guoqing Chen, Xing Yang, Zhengxuan Xu, Jingxi Yang, Zhiheng Lin, Yunpeng Liu. Journal of Rock Mechanics and Geotechnical Engineering (SCIE, CSCD), 2024, Issue 3, pp. 961-978 (18 pages)
Geomorphological features are commonly used to identify potential landslides. Nevertheless, overemphasis on these features could lead to misjudgment. This research proposes a process-oriented approach for potential landslide identification that considers time-dependent behaviors. The method integrates comprehensive remote sensing and geological analysis to qualitatively assess slope stability, and employs numerical analysis to quantitatively calculate aging stability. Specifically, a time-dependent stability calculation method for anticlinal slopes is developed and implemented in discrete element software, incorporating time-dependent mechanical and strength reduction calculations. By considering the time-dependent evolution of slopes, this method highlights the importance of both geomorphological features and time-dependent behaviors in landslide identification. This method has been applied to the Jiarishan slope (JRS) on the Qinghai-Tibet Plateau as a case study. The results show that the JRS, despite having landslide geomorphology, is a stable slope, highlighting the risk of misjudgment when relying solely on geomorphological features. This work provides insights into the geomorphological characterization and evolution history of the JRS and offers valuable guidance for studying slopes with similar landslide geomorphology. Furthermore, the process-oriented method incorporating time-dependent evolution provides a means to evaluate potential landslides, reducing misjudgment due to excessive reliance on geomorphological features.
Keywords: geomorphological features; evolution history; time-dependent stability calculation; landslides identification; Qinghai-Tibet Plateau
Spatial Distribution Feature Extraction Network for Open Set Recognition of Electromagnetic Signal
20
Authors: Hui Zhang, Huaji Zhou, Li Wang, Feng Zhou. Computer Modeling in Engineering & Sciences (SCIE, EI), 2024, Issue 4, pp. 279-296 (18 pages)
This paper proposes a novel open set recognition method, the Spatial Distribution Feature Extraction Network (SDFEN), to address the problem of electromagnetic signal recognition in an open environment. The spatial distribution feature extraction layer in SDFEN replaces convolutional output neural networks with the spatial distribution features that focus more on inter-sample information by incorporating class center vectors. The designed hybrid loss function considers both intra-class distance and inter-class distance, thereby enhancing the similarity among samples of the same class and increasing the dissimilarity between samples of different classes during training. Consequently, this method allows unknown classes to occupy a larger space in the feature space. This reduces the possibility of overlap with known class samples and makes the boundaries between known and unknown samples more distinct. Additionally, the feature comparator threshold can be used to reject unknown samples. For signal open set recognition, seven methods, including the proposed method, are applied to two kinds of electromagnetic signal data: modulation signal and real-world emitter. The experimental results demonstrate that the proposed method outperforms the other six methods overall in a simulated open environment. Specifically, compared to the state-of-the-art Openmax method, the novel method achieves up to 8.87% and 5.25% higher micro-F-measures, respectively.
Keywords: electromagnetic signal recognition; deep learning; feature extraction; open set recognition
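The threshold-based rejection of unknown samples mentioned in this abstract can be illustrated with a nearest-class-center rule. This is a generic sketch, not SDFEN's feature comparator: it assumes Euclidean distance to fixed class center vectors and a single global threshold, and the names `classify_open_set`, `class_centers` and the label `"unknown"` are hypothetical.

```python
import math


def classify_open_set(feature, class_centers, threshold):
    """Return the nearest known class, or "unknown" if it is farther than threshold."""
    best_label, best_dist = "unknown", math.inf
    for label, center in class_centers.items():
        dist = math.sqrt(sum((f - c) ** 2 for f, c in zip(feature, center)))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label if best_dist <= threshold else "unknown"


# Toy 2D feature space with two known classes (assumed data).
centers = {"A": [0.0, 0.0], "B": [10.0, 0.0]}
print(classify_open_set([1.0, 0.0], centers, threshold=2.0))  # A
print(classify_open_set([5.0, 0.0], centers, threshold=2.0))  # unknown
```

Tighter clusters around each center (smaller intra-class distance) and centers that lie farther apart (larger inter-class distance) leave more of the feature space outside every threshold radius, which is the region where unknown samples are rejected.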