Journal Articles: 72,806 articles found
1. Interactive System for Video Summarization Based on Multimodal Fusion (Cited by: 1)
Authors: Zheng Li, Xiaobing Du, Cuixia Ma, Yanfeng Li, Hongan Wang. Journal of Beijing Institute of Technology (EI, CAS), 2019, No. 1, pp. 27-34.
Biography videos, built from the life achievements of prominent historical figures, aim to describe great men's lives. This paper proposes a novel interactive video summarization method for biography videos based on multimodal fusion: a new way of visualizing the distinctive features of a biography video and interacting with its content that takes advantage of multiple modalities. In general, a film's story progresses through character dialogue, and the subtitles produced from that dialogue contain all the information related to the film. Because a biography video depicts the many aspects of a character's whole life, JGibbsLDA is applied to extract keywords from the subtitles. To fuse keywords and key frames, affinity propagation is adopted to calculate the similarity between each key-frame cluster and the keywords. Through this method, a video summarization based on multimodal fusion is presented that describes the video content more completely. To reduce the time spent searching for video content of interest and to capture the relationships between main characters, a map-style visualization is adopted to present the video content and interact with the summarization. An evaluation experiment demonstrates that the system facilitates the exploration of video content while improving interaction and finding events of interest efficiently.
Keywords: video visualization, interaction, multimodal fusion, video summarization
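A minimal sketch of the keyword/key-frame fusion step this abstract describes: key frames are clustered with affinity propagation, and each cluster is tagged with the keywords whose embeddings are most similar to the cluster center. The frame descriptors and keyword embeddings are random placeholders; in the paper they would come from the video frames and from JGibbsLDA, respectively.

```python
import numpy as np
from sklearn.cluster import AffinityPropagation
from sklearn.metrics.pairwise import cosine_similarity

rng = np.random.default_rng(0)
frame_feats = rng.normal(size=(60, 128))   # hypothetical key-frame descriptors
kw_embeds = rng.normal(size=(10, 128))     # hypothetical keyword embeddings
keywords = [f"kw_{i}" for i in range(10)]

# Cluster key frames; higher damping helps convergence on unstructured data.
ap = AffinityPropagation(damping=0.9, random_state=0).fit(frame_feats)
sim = cosine_similarity(ap.cluster_centers_, kw_embeds)  # clusters x keywords

for c, row in enumerate(sim):
    top = np.argsort(row)[::-1][:3]        # 3 most similar keywords per cluster
    print(f"cluster {c}: {[keywords[i] for i in top]}")
```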
2. Deep Learning Based Optimal Multimodal Fusion Framework for Intrusion Detection Systems for Healthcare Data
Authors: Phong Thanh Nguyen, Vy Dang Bich Huynh, Khoa Dang Vo, Phuong Thanh Phan, Mohamed Elhoseny, Dac-Nhuong Le. Computers, Materials & Continua (SCIE, EI), 2021, No. 3, pp. 2555-2571.
Data fusion is a multidisciplinary research area involving several domains. It is used to attain minimum detection-error probability and maximum reliability with the help of data retrieved from multiple healthcare sources. The generation of huge quantities of data from medical devices has resulted in big data, for which data fusion techniques become essential. Securing medical data is a crucial issue in a rapidly accelerating computing world and can be achieved by Intrusion Detection Systems (IDS). Since a single modality is not adequate to attain a high detection rate, diverse techniques need to be merged using a decision-based multimodal fusion process. In this view, this article presents a new multimodal fusion-based IDS to secure healthcare data using Spark. The proposed model involves a decision-based fusion model with several processes, namely initialization, pre-processing, Feature Selection (FS), and multimodal classification, for effective detection of intrusions. For FS, a chaotic Butterfly Optimization (BO) algorithm called CBOA is introduced. Although the classic BO algorithm offers effective exploration, it fails to achieve fast convergence; to improve the convergence rate, this work modifies the required parameters of the BO algorithm using chaos theory. Finally, to detect intrusions, a multimodal classifier is applied by incorporating three Deep Learning (DL)-based classification models. Hadoop MapReduce and Spark are also utilized to achieve faster computation of big data on a parallel computing platform. To validate the presented model, a series of experiments was performed on the benchmark NSL-KDDCup99 dataset. The proposed model demonstrated its effectiveness on the applied dataset, offering a maximum accuracy of 99.21%, precision of 98.93%, and detection rate of 99.59%, confirming its superiority.
Keywords: big data, data fusion, deep learning, intrusion detection, bio-inspired algorithm, Spark
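A minimal sketch of the chaotic butterfly optimization idea named above: the sensory-modality parameter c of the classic BO update is driven by a logistic chaotic map rather than a fixed schedule, which is one common way to apply chaos theory to BO parameters. The objective, bounds, and hyper-parameters are illustrative placeholders, not the paper's settings.

```python
import numpy as np

def cboa(obj, dim=10, pop=20, iters=100, p_switch=0.8, a=0.1, seed=0):
    rng = np.random.default_rng(seed)
    x = rng.uniform(-5, 5, size=(pop, dim))
    fit = np.apply_along_axis(obj, 1, x)
    c = 0.7                                   # chaotic sensory-modality seed
    for _ in range(iters):
        c = 4.0 * c * (1.0 - c)               # logistic map: x_{k+1} = 4 x_k (1 - x_k)
        best = x[np.argmin(fit)]
        frag = c * np.abs(fit) ** a           # fragrance f = c * I^a
        for i in range(pop):
            r = rng.random()
            if rng.random() < p_switch:       # global search toward the best butterfly
                cand = x[i] + (r**2 * best - x[i]) * frag[i]
            else:                             # local search between two random peers
                j, k = rng.integers(0, pop, 2)
                cand = x[i] + (r**2 * x[j] - x[k]) * frag[i]
            f_cand = obj(cand)
            if f_cand < fit[i]:               # greedy replacement
                x[i], fit[i] = cand, f_cand
    return x[np.argmin(fit)], fit.min()

sol, val = cboa(lambda v: float(np.sum(v**2)))  # sphere test function
print(f"best value: {val:.3e}")
```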
3. Multimodal Fusion of Brain Imaging Data: Methods and Applications
Authors: Na Luo, Weiyang Shi, Zhengyi Yang, Ming Song, Tianzi Jiang. Machine Intelligence Research (EI, CSCD), 2024, No. 1, pp. 136-152.
Neuroimaging data typically include multiple modalities, such as structural or functional magnetic resonance imaging, diffusion tensor imaging, and positron emission tomography, which provide multiple views for observing and analyzing the brain. To leverage the complementary representations of different modalities, multimodal fusion is needed to dig out both inter-modality and intra-modality information. With this rich information, it is becoming popular to combine multiple modalities to explore the structural and functional characteristics of the brain in both health and disease. In this paper, we first review a wide spectrum of advanced machine learning methodologies for fusing multimodal brain imaging data, broadly categorized into unsupervised and supervised learning strategies. We then discuss representative applications, including how they help to understand brain arealization, how they improve the prediction of behavioral phenotypes and brain aging, and how they accelerate biomarker exploration for brain diseases. Finally, we discuss exciting emerging trends and important future directions. Collectively, we intend to offer a comprehensive overview of brain imaging fusion methods and their successful applications, along with the challenges imposed by multi-scale and big data, which raise an urgent demand for new models and platforms.
Keywords: multimodal fusion, supervised learning, unsupervised learning, brain atlas, cognition, brain disorders
4. Multimodality Medical Image Fusion Based on Pixel Significance with Edge-Preserving Processing for Clinical Applications
Authors: Bhawna Goyal, Ayush Dogra, Dawa Chyophel Lepcha, Rajesh Singh, Hemant Sharma, Ahmed Alkhayyat, Manob Jyoti Saikia. Computers, Materials & Continua (SCIE, EI), 2024, No. 3, pp. 4317-4342.
Multimodal medical image fusion has attained immense popularity in recent years as a robust technology for clinical diagnosis. It fuses multiple images into a single image of improved quality that retains significant information, aiding diagnostic practitioners in diagnosing and treating many diseases. However, recent image fusion techniques have encountered several challenges, including fusion artifacts, algorithm complexity, and high computing costs. To solve these problems, this study presents a novel medical image fusion strategy that combines the benefits of pixel significance with edge-preserving processing. First, the method employs a cross-bilateral filter (CBF) that uses one image to determine the kernel and the other for filtering, and vice versa, considering both the geometric closeness and the gray-level similarity of neighboring pixels without smoothing edges. The outputs of the CBF are then subtracted from the original images to obtain detail images. The method further applies edge-preserving processing that combines linear low-pass filtering with a non-linear technique enabling the selection of relevant regions in the detail images while maintaining structural properties. These regions are selected using morphologically processed linear filter residuals that identify significant regions with high-amplitude edges and adequate size. The outputs of the low-pass filtering are fused with the meaningfully restored regions to reconstruct the original shape of the edges. In addition, weights are computed from these reconstructed images and fused with the original input images to produce the final fusion result by estimating the strength of horizontal and vertical details. Numerous standard quality evaluation metrics with complementary properties are used for objective comparison with existing, well-known algorithms to validate the fusion results. Experimental results exhibit superior performance compared to competing techniques in both qualitative and quantitative evaluation. The proposed method also offers lower computational complexity and execution time while improving diagnostic accuracy, making it efficient in practical applications. The results reveal that the proposed method exceeds the latest state-of-the-art methods in terms of detailed information, edge contour, and overall contrast.
Keywords: image fusion, fractal data analysis, biomedical, diseases research, multiresolution analysis, numerical analysis
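A minimal sketch of the cross-bilateral detail-extraction step: one image supplies the range kernel while the other is filtered, and vice versa, and the detail layer is the input minus the filtered output. This is a naive O(N·k²) grayscale implementation for illustration; the synthetic arrays stand in for two registered modalities.

```python
import numpy as np

def cross_bilateral(src, guide, radius=3, sigma_s=2.0, sigma_r=0.1):
    """Filter `src` with spatial weights plus range weights taken from `guide`."""
    src_p = np.pad(src, radius, mode="reflect")
    gd_p = np.pad(guide, radius, mode="reflect")
    ys, xs = np.mgrid[-radius:radius + 1, -radius:radius + 1]
    w_spatial = np.exp(-(ys**2 + xs**2) / (2 * sigma_s**2))   # geometric closeness
    out = np.empty_like(src, dtype=np.float64)
    h, w = src.shape
    for i in range(h):
        for j in range(w):
            s_win = src_p[i:i + 2*radius + 1, j:j + 2*radius + 1]
            g_win = gd_p[i:i + 2*radius + 1, j:j + 2*radius + 1]
            w_range = np.exp(-(g_win - guide[i, j])**2 / (2 * sigma_r**2))
            wgt = w_spatial * w_range                         # guide-driven kernel
            out[i, j] = (wgt * s_win).sum() / wgt.sum()
    return out

mri = np.random.rand(64, 64)                  # stand-ins for two modalities
ct = np.random.rand(64, 64)
detail_mri = mri - cross_bilateral(mri, ct)   # CT guides the kernel
detail_ct = ct - cross_bilateral(ct, mri)     # MRI guides the kernel
```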
5. 3D Vehicle Detection Algorithm Based on Multimodal Decision-Level Fusion
Authors: Peicheng Shi, Heng Qi, Zhiqiang Liu, Aixi Yang. Computer Modeling in Engineering & Sciences (SCIE, EI), 2023, No. 6, pp. 2007-2023.
3D vehicle detection based on LiDAR-camera fusion is an emerging research topic in autonomous driving. The algorithm based on the Camera-LiDAR Object Candidates fusion method (CLOCs) is currently considered an effective decision-level fusion algorithm, but it does not fully utilize the extracted 3D and 2D features. We therefore propose a 3D vehicle detection algorithm based on multimodal decision-level fusion. First, the anchor point of the 3D detection bounding box is projected into the 2D image, the distance between the 2D and 3D anchor points is calculated, and this distance is used as a new fusion feature to enhance the feature redundancy of the network. Subsequently, an attention module, squeeze-and-excitation networks, is added to weight each feature channel, enhancing the important features of the network and suppressing useless ones. The experimental results show that the mean average precision of the algorithm on the KITTI dataset is 82.96%, which outperforms previous state-of-the-art multimodal fusion-based methods, and the average accuracy on the Easy, Moderate, and Hard evaluation indicators reaches 88.96%, 82.60%, and 77.31%, respectively, higher than the original CLOCs model by 1.02%, 2.29%, and 0.41%. Compared with the original CLOCs algorithm, our algorithm has higher accuracy and better performance in 3D vehicle detection.
Keywords: 3D vehicle detection, multimodal fusion, CLOCs, network structure optimization, attention module
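A minimal sketch of the new fusion feature described above: the 3D box anchor is projected into the image with the camera matrix, and its distance to the matched 2D detection's anchor is turned into a bounded feature for the decision-level fusion network. The intrinsics, anchor values, and the exponential normalization are placeholder assumptions.

```python
import numpy as np

K = np.array([[721.5, 0.0, 609.6],        # hypothetical KITTI-style intrinsics
              [0.0, 721.5, 172.9],
              [0.0, 0.0, 1.0]])

def project(pt_cam):
    """Project a 3D point in camera coordinates to pixel coordinates."""
    uvw = K @ pt_cam
    return uvw[:2] / uvw[2]

center_3d = np.array([2.1, 1.4, 18.7])    # 3D box anchor (camera frame, meters)
center_2d = np.array([695.0, 185.0])      # matched 2D box anchor (pixels)

dist = np.linalg.norm(project(center_3d) - center_2d)
feat = np.exp(-dist / 50.0)               # normalized fusion feature in (0, 1]
print(f"anchor distance: {dist:.1f}px, fusion feature: {feat:.3f}")
```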
6. MFF-Net: Multimodal Feature Fusion Network for 3D Object Detection
Authors: Peicheng Shi, Zhiqiang Liu, Heng Qi, Aixi Yang. Computers, Materials & Continua (SCIE, EI), 2023, No. 6, pp. 5615-5637.
In complex traffic scenarios, it is very important for autonomous vehicles to accurately perceive in advance the dynamic information of other vehicles around them. The accuracy of 3D object detection is affected by problems such as illumination changes, object occlusion, and detection distance. We face these challenges by proposing a multimodal feature fusion network for 3D object detection (MFF-Net). In this research, a spatial transformation projection algorithm first maps image features into the feature space so that they share the same spatial dimension as the point cloud features during fusion. Then, feature channel weighting is performed using an adaptive expression augmentation fusion network to enhance important network features, suppress useless ones, and increase the directionality of the network toward features. Finally, the probability of false and missed detections in the non-maximum suppression algorithm is reduced by adjusting the one-dimensional threshold. Together these steps constitute a complete 3D object detection network based on multimodal feature fusion. The experimental results show that the proposed network achieves an average accuracy of 82.60% on the Karlsruhe Institute of Technology and Toyota Technological Institute (KITTI) dataset, outperforming previous state-of-the-art multimodal fusion networks. On the Easy, Moderate, and Hard evaluation indicators, the accuracy reaches 90.96%, 81.46%, and 75.39%, respectively. This shows that the MFF-Net network has good performance in 3D object detection.
Keywords: 3D object detection, multimodal fusion, neural network, autonomous driving, attention mechanism
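A minimal numpy sketch of the squeeze-and-excitation-style channel weighting that both this network and the previous entry rely on: global average pooling squeezes each channel to a scalar, a small FC-ReLU-FC-sigmoid bottleneck produces per-channel gates, and the gates rescale the feature map. The weight matrices are random placeholders for learned parameters.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def se_reweight(fmap, w1, w2):
    """fmap: (C, H, W) feature map -> channel-reweighted feature map."""
    squeeze = fmap.mean(axis=(1, 2))                      # global average pool, (C,)
    excite = sigmoid(w2 @ np.maximum(w1 @ squeeze, 0.0))  # FC-ReLU-FC-sigmoid gates
    return fmap * excite[:, None, None]                   # scale each channel

rng = np.random.default_rng(0)
C, r = 64, 16                                   # channels, reduction ratio
fmap = rng.normal(size=(C, 32, 32))
w1 = rng.normal(size=(C // r, C)) * 0.1         # placeholder for learned weights
w2 = rng.normal(size=(C, C // r)) * 0.1
out = se_reweight(fmap, w1, w2)
print(out.shape)                                # (64, 32, 32)
```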
7. Data-driven multimodal fusion: approaches and applications in psychiatric research
Authors: Jing Sui, Dongmei Zhi, Vince D. Calhoun. Psychoradiology, 2023, No. 1, pp. 135-153.
In the era of big data, where vast amounts of information are generated and collected at an unprecedented rate, there is a pressing demand for innovative data-driven multimodal fusion methods. These methods aim to integrate diverse neuroimaging perspectives to extract meaningful insights and attain a more comprehensive understanding of complex psychiatric disorders. Analyzing each modality separately may reveal only partial insights or miss important correlations between different types of data. This is where data-driven multimodal fusion techniques come into play: by combining information from multiple modalities in a synergistic manner, they enable us to uncover hidden patterns and relationships that would otherwise remain unnoticed. In this paper, we present an extensive overview of data-driven multimodal fusion approaches with or without prior information, with specific emphasis on canonical correlation analysis and independent component analysis. The applications of such fusion methods are wide-ranging and allow us to incorporate multiple factors such as genetics, environment, cognition, and treatment outcomes across various brain disorders. After summarizing the diverse neuropsychiatric magnetic resonance imaging fusion applications, we further discuss emerging trends in analyzing big neuroimaging data, such as N-way multimodal fusion, deep learning approaches, and clinical translation. Overall, multimodal fusion emerges as an imperative approach providing valuable insights into the underlying neural basis of mental disorders, uncovering subtle abnormalities or potential biomarkers that may benefit targeted treatments and personalized medical interventions.
Keywords: multimodal fusion approach, data driven, functional magnetic resonance imaging (fMRI), structural MRI, diffusion magnetic resonance imaging, independent component analysis, canonical correlation analysis, psychiatric disorder
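A minimal sketch of the canonical correlation analysis fusion this review emphasizes: features from two modalities (say, fMRI- and sMRI-derived measures per subject) are projected onto maximally correlated components. The subjects-by-features matrices are synthetic, generated from a shared latent factor so that CCA has something to find.

```python
import numpy as np
from sklearn.cross_decomposition import CCA

rng = np.random.default_rng(0)
shared = rng.normal(size=(100, 2))                    # hidden shared factors
X = shared @ rng.normal(size=(2, 30)) + 0.5 * rng.normal(size=(100, 30))
Y = shared @ rng.normal(size=(2, 20)) + 0.5 * rng.normal(size=(100, 20))

cca = CCA(n_components=2).fit(X, Y)
Xc, Yc = cca.transform(X, Y)                          # canonical variates
for k in range(2):
    r = np.corrcoef(Xc[:, k], Yc[:, k])[0, 1]         # per-component correlation
    print(f"component {k}: canonical correlation r = {r:.2f}")
```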
8. Multimodal Sentiment Analysis Using BiGRU and Attention-Based Hybrid Fusion Strategy (Cited by: 1)
Authors: Zhizhong Liu, Bin Zhou, Lingqiang Meng, Guangyu Huang. Intelligent Automation & Soft Computing (SCIE), 2023, No. 8, pp. 1963-1981.
Recently, multimodal sentiment analysis has attracted increasing attention with the popularity of complementary data streams, and it has great potential to surpass unimodal sentiment analysis. One challenge of multimodal sentiment analysis is how to design an efficient multimodal feature fusion strategy. Unfortunately, existing work considers only feature-level fusion or decision-level fusion, and few research works focus on hybrid strategies that contain both. To improve the performance of multimodal sentiment analysis, we present a novel model using BiGRU and an attention-based hybrid fusion strategy (BAHFS). First, we apply BiGRU to learn the unimodal features of text, audio, and video. Then we fuse the unimodal features into bimodal features using the bimodal attention fusion module. Next, BAHFS feeds the unimodal and bimodal features into the trimodal attention fusion module and the trimodal concatenation fusion module simultaneously to obtain two sets of trimodal features. Finally, BAHFS performs classification with each of the two sets of trimodal features and obtains the final analysis results with decision-level fusion. Extensive experiments on the CMU-MOSI and CMU-MOSEI datasets verify BAHFS's superiority.
Keywords: multimodal sentiment analysis, BiGRU, attention mechanism, feature-level fusion, hybrid fusion strategy
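A minimal sketch of the decision-level half of the hybrid strategy above: the two classification heads (one fed the attention-fused trimodal features, one the concatenation-fused features) each emit class probabilities, and the final label comes from their weighted average. The probabilities and the mixing weight are placeholders.

```python
import numpy as np

def decision_fuse(p_attn, p_concat, alpha=0.6):
    """Weighted average of two heads' softmax outputs; alpha is a tunable mix."""
    return alpha * p_attn + (1.0 - alpha) * p_concat

p_attn = np.array([0.70, 0.30])     # head on attention-fused trimodal features
p_concat = np.array([0.45, 0.55])   # head on concatenated trimodal features
p_final = decision_fuse(p_attn, p_concat)
print(f"fused probs {p_final}, predicted class {p_final.argmax()}")
```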
9. Fusion of color and hallucinated depth features for enhanced multimodal deep learning-based damage segmentation
Authors: Tarutal Ghosh Mondal, Mohammad Reza Jahanshahi. Earthquake Engineering and Engineering Vibration (SCIE, EI, CSCD), 2023, No. 1, pp. 55-68.
Recent advances in computer vision and deep learning have shown that the fusion of depth information can significantly enhance the performance of RGB-based damage detection and segmentation models. However, alongside the advantages, depth sensing also presents many practical challenges. For instance, depth sensors impose an additional payload burden on robotic inspection platforms, limiting operation time and increasing inspection cost. Additionally, some lidar-based depth sensors perform poorly outdoors due to sunlight contamination during the daytime. In this context, this study investigates the feasibility of abolishing depth sensing at test time without compromising segmentation performance. An autonomous damage segmentation framework is developed based on recent advancements in vision-based multi-modal sensing, namely modality hallucination (MH) and monocular depth estimation (MDE), which require depth data only during model training. At deployment, depth data become expendable, as they can be simulated from the corresponding RGB frames. This makes it possible to reap the benefits of depth fusion without any depth perception per se. The study explores two depth encoding techniques and three fusion strategies in addition to a baseline RGB-based model. The proposed approach is validated on computer-generated RGB-D data of reinforced concrete buildings subjected to seismic damage. The surrogate techniques were observed to increase the segmentation IoU by up to 20.1% with a negligible increase in computation cost. Overall, this study is believed to make a positive contribution to enhancing the resilience of critical civil infrastructure.
Keywords: multimodal data fusion, depth sensing, vision-based inspection, UAV-assisted inspection, damage segmentation, post-disaster reconnaissance, modality hallucination, monocular depth estimation
10. A 3D Object Detection Model for Autonomous Driving Based on an Improved CenterFusion
Authors: Huang Jun, Liu Jiasen. Radio Engineering (《无线电工程》), 2024, No. 2, pp. 507-514.
To address missed and false detections of objects on the road in autonomous driving, a 3D object detection model based on an improved CenterFusion is proposed. The model fuses camera information with radar features to form a multi-channel feature input, which strengthens the robustness of the detection network and reduces missed detections. To obtain more accurate and richer 3D detection information, an improved attention mechanism is introduced to enhance the fusion of radar point clouds and visual information within the frustum grid, and an improved loss function is used to optimize the accuracy of bounding-box prediction. Validation and comparison experiments on the nuScenes dataset show that, compared with the original CenterFusion model, the proposed model improves the mean Average Precision (mAP) by 1.3% and the nuScenes Detection Score (NDS) by 1.2%.
Keywords: sensor fusion, 3D object detection, attention mechanism, millimeter-wave radar
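A minimal sketch of the multi-channel input construction described above: radar returns are projected into the image plane and rasterized as sparse depth and radial-velocity channels stacked onto the RGB frame. The intrinsics and radar points are placeholder values at a nuScenes-like image size; the frustum-level attention itself is omitted.

```python
import numpy as np

K = np.array([[1266.4, 0.0, 816.3],       # hypothetical nuScenes-style intrinsics
              [0.0, 1266.4, 491.5],
              [0.0, 0.0, 1.0]])
H, W = 900, 1600
rgb = np.zeros((H, W, 3), dtype=np.float32)

radar = np.array([                        # x, y, z (m) and radial velocity (m/s)
    [5.2, 0.8, 22.0, -3.1],
    [-3.9, 0.6, 41.5, 1.7],
])

depth_ch = np.zeros((H, W), dtype=np.float32)
vel_ch = np.zeros((H, W), dtype=np.float32)
for x, y, z, v in radar:
    uvw = K @ np.array([x, y, z])         # pinhole projection into the image
    u, v_pix = int(uvw[0] / uvw[2]), int(uvw[1] / uvw[2])
    if 0 <= u < W and 0 <= v_pix < H:
        depth_ch[v_pix, u] = z            # sparse depth channel
        vel_ch[v_pix, u] = v              # sparse radial-velocity channel

fused_input = np.dstack([rgb, depth_ch, vel_ch])   # 5-channel network input
print(fused_input.shape)                            # (900, 1600, 5)
```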
11. Leveraging Multimodal Ensemble Fusion-Based Deep Learning for COVID-19 on Chest Radiographs
Authors: Mohamed Yacin Sikkandar, K. Hemalatha, M. Subashree, S. Srinivasan, Seifedine Kadry, Jungeun Kim, Keejun Han. Computer Systems Science & Engineering (SCIE, EI), 2023, No. 10, pp. 873-889.
Recently, COVID-19 has posed a challenging threat to researchers, scientists, healthcare professionals, and administrations across the globe, from its diagnosis to its treatment. Researchers are making persistent efforts to derive probable solutions for managing the pandemic in their areas. One widespread and effective way to detect COVID-19 is to utilize radiological images comprising X-rays and computed tomography (CT) scans. At the same time, recent advances in machine learning (ML) and deep learning (DL) models show promising results in medical imaging. In particular, convolutional neural network (CNN) models can be applied to identify abnormalities on chest radiographs. During the COVID-19 epidemic, much research has focused on processing such data with DL techniques, particularly CNNs. This study develops an improved fruit fly optimization with a deep learning-enabled fusion (IFFO-DLEF) model for COVID-19 detection and classification. The major intention of the IFFO-DLEF model is to investigate the presence or absence of COVID-19. To do so, the presented model applies image pre-processing at the initial stage. In addition, an ensemble of three DL models, DenseNet169, EfficientNet, and ResNet50, is used for feature extraction. Moreover, the IFFO algorithm with a multilayer perceptron (MLP) classification model is utilized to identify and classify COVID-19. Parameter optimization of the MLP approach using the IFFO technique helps accomplish enhanced classification performance. The experimental analysis of the IFFO-DLEF model on a CXR image database portrayed its better performance over recent approaches.
Keywords: COVID-19, computer vision, deep learning, image classification, fusion model
12. Disparity estimation for multi-scale multi-sensor fusion
Authors: SUN Guoliang, PEI Shanshan, LONG Qian, ZHENG Sifa, YANG Rui. Journal of Systems Engineering and Electronics (SCIE, CSCD), 2024, No. 2, pp. 259-274.
The perception module of advanced driver assistance systems plays a vital role. Perception schemes often use a single sensor for data processing and environmental perception, or fuse the information processing results of multiple sensors at the detection layer. This paper proposes a multi-scale, multi-sensor data fusion strategy at the front end of perception and accomplishes a multi-sensor disparity map generation scheme. A binocular stereo vision sensor composed of two cameras and a light detection and ranging (LiDAR) sensor jointly perceive the environment, and a multi-scale fusion scheme is employed to improve the accuracy of the disparity map. This solution not only retains the dense perception of binocular stereo vision sensors but also exploits the perception accuracy of LiDAR sensors. Experiments demonstrate that the proposed multi-scale, multi-sensor scheme significantly improves disparity map estimation.
Keywords: stereo vision, light detection and ranging (LiDAR), multi-sensor fusion, multi-scale fusion, disparity map
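A minimal sketch of the multi-sensor disparity idea above, with OpenCV's SGBM standing in for the stereo front end: the dense but noisy stereo disparity map is overwritten wherever sparse, accurate LiDAR-derived disparities are available. Images and LiDAR samples are synthetic placeholders, and the paper's multi-scale scheme is not reproduced.

```python
import cv2
import numpy as np

rng = np.random.default_rng(0)
left = rng.integers(0, 255, (240, 320), dtype=np.uint8)
right = np.roll(left, -4, axis=1)                 # fake 4-px disparity shift

sgbm = cv2.StereoSGBM_create(minDisparity=0, numDisparities=64, blockSize=5)
disp = sgbm.compute(left, right).astype(np.float32) / 16.0  # SGBM output is fixed-point

lidar_disp = np.full_like(disp, np.nan)           # sparse LiDAR-derived disparities
pts = rng.integers(0, [240, 320], size=(500, 2))
lidar_disp[pts[:, 0], pts[:, 1]] = 4.0            # accurate values at sparse pixels

fused = np.where(np.isnan(lidar_disp), disp, lidar_disp)  # LiDAR overrides stereo
print(float(np.mean(np.abs(fused - 4.0))))        # mean error vs the true shift
```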
13. Effect of different anesthetic modalities with multimodal analgesia on postoperative pain level in colorectal tumor patients
Authors: Ji-Chun Tang, Jia-Wei Ma, Jin-Jin Jian, Jie Shen, Liang-Liang Cao. World Journal of Gastrointestinal Oncology (SCIE), 2024, No. 2, pp. 364-371.
BACKGROUND: According to clinical data, a significant percentage of patients experience pain after surgery, highlighting the importance of alleviating postoperative pain. The current approach involves intravenous patient-controlled analgesia, often utilizing opioid analgesics such as morphine, sufentanil, and fentanyl. Surgery for colorectal cancer typically involves general anesthesia, so optimizing anesthetic management and postoperative analgesic programs can effectively reduce perioperative stress and enhance postoperative recovery. AIM: To explore the effects of different anesthesia methods coupled with multimodal analgesia on postoperative pain in patients with colorectal cancer. METHODS: Following the inclusion and exclusion criteria, 126 patients with colorectal cancer admitted to our hospital from January 2020 to December 2022 were included: 63 received general anesthesia with multimodal analgesia and were set as the control group, and 63 received general anesthesia combined with epidural anesthesia and multimodal analgesia and were set as the research group. After data collection, the postoperative analgesia, sedation, and recovery outcomes were compared. RESULTS: Compared to the control group, the research group had shorter recovery times for orientation, extubation, eye opening, and spontaneous respiration (P<0.05). The research group also showed lower visual analog scale scores at 24 h and 48 h, higher Ramsay scores at 6 h and 12 h, and improved cognitive function at 24 h, 48 h, and 72 h (P<0.05). Additionally, interleukin-6 and interleukin-10 levels were significantly reduced at various time points in the research group compared to the control group (P<0.05). Levels of CD3+, CD4+, and CD4+/CD8+ were also lower in the research group at multiple time points (P<0.05). CONCLUSION: For patients with colorectal cancer, general anesthesia combined with epidural anesthesia and multimodal analgesia achieves better postoperative analgesia and sedation, promotes postoperative rehabilitation, improves inflammatory stress and immune status, and has higher safety.
Keywords: multimodal analgesia, anesthesia, colorectal cancer, postoperative pain
14. A Robust Framework for Multimodal Sentiment Analysis with Noisy Labels Generated from Distributed Data Annotation
Authors: Kai Jiang, Bin Cao, Jing Fan. Computer Modeling in Engineering & Sciences (SCIE, EI), 2024, No. 6, pp. 2965-2984.
Multimodal sentiment analysis utilizes multimodal data such as text, facial expressions, and voice to detect people's attitudes. With the advent of distributed data collection and annotation, we can easily obtain and share such multimodal data. However, due to professional discrepancies among annotators and lax quality control, noisy labels might be introduced. Recent research suggests that deep neural networks (DNNs) overfit noisy labels, leading to poor performance. To address this challenging problem, we present a Multimodal Robust Meta Learning framework (MRML) for multimodal sentiment analysis that resists noisy labels and correlates distinct modalities simultaneously. Specifically, we propose a two-layer fusion net to deeply fuse the different modalities and improve the quality of the multimodal data features for label correction and network training. Besides, a multiple meta-learner (label corrector) strategy is proposed to enhance label correction and prevent models from overfitting to noisy labels. We conducted experiments on three popular multimodal datasets to verify the superiority of our method against four baselines.
Keywords: distributed data collection, multimodal sentiment analysis, meta learning, learning with noisy labels
15. A Hand Features Based Fusion Recognition Network with Enhancing Multi-Modal Correlation
Authors: Wei Wu, Yuan Zhang, Yunpeng Li, Chuanyang Li, Yan Hao. Computer Modeling in Engineering & Sciences (SCIE, EI), 2024, No. 7, pp. 537-555.
Fusing hand-based features in multi-modal biometric recognition enhances anti-spoofing capabilities and leverages inter-modal correlation to enhance recognition performance; the robustness of the system can likewise be enhanced by judiciously exploiting the correlation among multimodal features. Nevertheless, two issues persist in multi-modal feature fusion recognition. First, improvements to recognition performance have not comprehensively considered the inter-modality correlations among distinct modalities. Second, during modal fusion, improper weight selection diminishes the salience of crucial modal features, lowering overall recognition performance. To address these two issues, we introduce an enhanced DenseNet multimodal recognition network founded on feature-level fusion. The information from the three modalities is fused akin to RGB, and the input network augments the correlation between modes through channel correlation. Within the enhanced DenseNet network, the Efficient Channel Attention Network (ECA-Net) dynamically adjusts the weight of each channel to amplify the salience of crucial information in each modal feature, and depthwise separable convolution markedly reduces the training parameters while further enhancing the feature correlation. Experimental evaluations were conducted on four multimodal databases, comprising six unimodal databases, including the multispectral palmprint and palm vein databases of the Chinese Academy of Sciences. The Equal Error Rate (EER) values were 0.0149%, 0.0150%, 0.0099%, and 0.0050%, respectively. Compared with other network methods for palmprint, palm vein, and finger vein fusion recognition, this approach substantially enhances recognition performance, rendering it suitable for high-security environments with practical applicability. The experiments utilized a modest sample database comprising 200 individuals; the next phase involves extending the method to larger databases.
Keywords: biometrics, multi-modal, correlation, deep learning, feature-level fusion
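A minimal numpy sketch of the ECA-Net channel weighting used above: global average pooling yields one descriptor per channel, a cheap 1-D convolution across neighboring channels (no dimensionality reduction) produces attention logits, and a sigmoid gate rescales each channel. The kernel weights are random placeholders for learned parameters.

```python
import numpy as np

def eca_reweight(fmap, kernel, k=5):
    """fmap: (C, H, W); kernel: (k,) learned 1-D conv weights (placeholder)."""
    squeeze = fmap.mean(axis=(1, 2))                     # channel descriptors, (C,)
    padded = np.pad(squeeze, k // 2, mode="edge")
    logits = np.array([padded[i:i + k] @ kernel          # 1-D conv over channels
                       for i in range(squeeze.size)])
    weights = 1.0 / (1.0 + np.exp(-logits))              # sigmoid gate
    return fmap * weights[:, None, None]                 # rescale each channel

rng = np.random.default_rng(0)
fmap = rng.normal(size=(64, 28, 28))                     # fused palm/vein features
out = eca_reweight(fmap, kernel=rng.normal(size=5) * 0.3)
print(out.shape)                                         # (64, 28, 28)
```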
16. A Novel Multi-Stream Fusion Network for Underwater Image Enhancement
Authors: Guijin Tang, Lian Duan, Haitao Zhao, Feng Liu. China Communications (SCIE, CSCD), 2024, No. 2, pp. 166-182.
Due to the selective absorption of light and the large quantity of floating media in sea water, underwater images often suffer from color casts and detail blur, making color correction and detail restoration necessary. However, existing enhancement algorithms cannot achieve the desired results. To solve these problems, this paper proposes a multi-stream feature fusion network. First, an underwater image is preprocessed to obtain potential information from the illumination stream, color stream, and structure stream by histogram equalization with contrast limitation, gamma correction, and white balance, respectively. Next, these three streams and the original raw stream are sent to residual blocks to extract features, which are subsequently fused; this enhances feature representation in underwater images. In the meantime, a composite loss function with three terms ensures the quality of the enhanced image in terms of color balance, structure preservation, and image smoothness, so that the enhanced image is more in line with human visual perception. Finally, the effectiveness of the proposed method is verified by comparison with many state-of-the-art underwater image enhancement algorithms. Experimental results show that the proposed method provides superior results in terms of MSE, PSNR, SSIM, UIQM, and UCIQE, and the enhanced images are more similar to their ground-truth images.
Keywords: image enhancement, multi-stream fusion, underwater image
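A minimal sketch of the three preprocessing streams, paired with the operations exactly as the abstract lists them (illumination: contrast-limited histogram equalization; color: gamma correction; structure: white balance). Parameter values are illustrative, and the residual-block fusion network itself is omitted.

```python
import cv2
import numpy as np

def illumination_stream(bgr):
    """CLAHE on the lightness channel only, preserving chroma."""
    lab = cv2.cvtColor(bgr, cv2.COLOR_BGR2LAB)
    clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
    lab[..., 0] = clahe.apply(lab[..., 0])
    return cv2.cvtColor(lab, cv2.COLOR_LAB2BGR)

def color_stream(bgr, gamma=0.7):
    """Gamma correction via a lookup table."""
    lut = ((np.arange(256) / 255.0) ** gamma * 255).astype(np.uint8)
    return cv2.LUT(bgr, lut)

def structure_stream(bgr):
    """Gray-world white balance: scale channels to a common mean."""
    means = bgr.reshape(-1, 3).mean(axis=0)
    gains = means.mean() / means
    return np.clip(bgr * gains, 0, 255).astype(np.uint8)

img = np.random.randint(0, 255, (120, 160, 3), dtype=np.uint8)  # placeholder frame
streams = [img, illumination_stream(img), color_stream(img), structure_stream(img)]
net_input = np.concatenate(streams, axis=2)       # 12-channel multi-stream input
print(net_input.shape)
```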
17. Research on Optimal Preload Method of Controllable Rolling Bearing Based on Multisensor Fusion
Authors: Kuosheng Jiang, Chengrui Han, Yasheng Chang. Computer Modeling in Engineering & Sciences (SCIE, EI), 2024, No. 6, pp. 3329-3352.
Angular contact ball bearings are widely used in machine tool spindles, where the bearing preload plays an important role in spindle performance. To overcome the limitations and uncertainties of traditional optimal preload prediction methods under actual operating conditions, a rolling bearing preload test method based on an improved D-S evidence theory multi-sensor fusion method is proposed. First, a novel controllable preload system is proposed and evaluated. Subsequently, multiple sensors are employed to collect data on the bearing parameters while preload is applied. Finally, a multi-sensor fusion algorithm is used to make predictions, and a neural network is used to optimize the fitting of the preload data. The limitations of conventional preload testing methods are identified, and the integration of complementary information from multiple sensors is used to achieve accurate predictions, offering valuable insights into the optimal preload force. Experimental results demonstrate that the multi-sensor fusion approach outperforms traditional methods in accurately measuring the optimal preload for rolling bearings.
Keywords: multi-sensor, information fusion, neural network, preload force
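A minimal sketch of Dempster's rule of combination, the core of the D-S evidence fusion named above: two sensors' basic probability assignments over the same hypotheses are multiplied and renormalized after discarding conflicting mass. The hypothesis set and mass values are placeholders; the paper's improvement to D-S is not reproduced.

```python
import numpy as np
from itertools import product

def dempster_combine(m1, m2):
    """m1, m2: dicts mapping frozenset hypotheses -> basic probability mass."""
    combined, conflict = {}, 0.0
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            combined[inter] = combined.get(inter, 0.0) + wa * wb
        else:
            conflict += wa * wb                   # mass on empty intersection
    # Renormalize by the non-conflicting mass (assumes conflict < 1).
    return {k: v / (1.0 - conflict) for k, v in combined.items()}

OPTIMAL, UNDER, OVER = map(frozenset, (["opt"], ["under"], ["over"]))
vib = {OPTIMAL: 0.6, UNDER: 0.3, OVER: 0.1}      # vibration-sensor evidence
temp = {OPTIMAL: 0.7, UNDER: 0.1, OVER: 0.2}     # temperature-sensor evidence
print(dempster_combine(vib, temp))               # fused belief in each state
```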
18. Relativistic Heavy Ion Collider and the Large Hadron Collider for Heavy Ion Fusion
Author: Ardeshir Irani. Journal of High Energy Physics, Gravitation and Cosmology (CAS), 2024, No. 2, pp. 825-827.
Heavy ion fusion makes use of the Relativistic Heavy Ion Collider at Brookhaven National Lab and the Large Hadron Collider in Geneva, Switzerland, for inertial confinement fusion. Two storage rings, which may or may not initially be needed, added to each of the colliders would increase the intensity of the heavy ion beams, making it comparable to the total energy delivered to the DT target by the National Ignition Facility at Lawrence Livermore Lab. The basic physics involved gives heavy ion fusion an advantage over laser fusion because heavy ions have greater penetration power than photons. The Relativistic Heavy Ion Collider can be used as a prototype heavy ion fusion reactor for the Large Hadron Collider.
Keywords: heavy ion fusion, Relativistic Heavy Ion Collider, Large Hadron Collider, inertial confinement fusion, National Ignition Facility
19. Audio-Text Multimodal Speech Recognition via Dual-Tower Architecture for Mandarin Air Traffic Control Communications
Authors: Shuting Ge, Jin Ren, Yihua Shi, Yujun Zhang, Shunzhi Yang, Jinfeng Yang. Computers, Materials & Continua (SCIE, EI), 2024, No. 3, pp. 3215-3245.
In air traffic control communications (ATCC), misunderstandings between pilots and controllers could result in fatal aviation accidents. Fortunately, advanced automatic speech recognition technology has emerged as a promising means of preventing miscommunications and enhancing aviation safety. However, most existing speech recognition methods merely incorporate external language models on the decoder side, leading to insufficient semantic alignment between speech and text modalities during the encoding phase. Furthermore, it is challenging to model acoustic context dependencies over long distances because speech sequences are longer than text, especially for extended ATCC data. To address these issues, we propose a speech-text multimodal dual-tower architecture for speech recognition. It employs cross-modal interactions to achieve close semantic alignment during the encoding stage and to strengthen its capabilities in modeling auditory long-distance context dependencies. In addition, a two-stage training strategy is elaborately devised to derive semantics-aware acoustic representations effectively. The first stage focuses on pre-training the speech-text multimodal encoding module to enhance inter-modal semantic alignment and aural long-distance context dependencies. The second stage fine-tunes the entire network to bridge the input modality variation gap between the training and inference phases and boost generalization performance. Extensive experiments demonstrate the effectiveness of the proposed speech-text multimodal speech recognition method on the ATCC and AISHELL-1 datasets. It reduces the character error rate to 6.54% and 8.73%, respectively, and exhibits substantial performance gains of 28.76% and 23.82% compared with the best baseline model. The case studies indicate that the obtained semantics-aware acoustic representations aid in accurately recognizing terms with similar pronunciations but distinctive semantics. The research provides a novel modeling paradigm for semantics-aware speech recognition in air traffic control communications, which could contribute to the advancement of intelligent and efficient aviation safety management.
Keywords: speech-text multimodal, automatic speech recognition, semantic alignment, air traffic control communications, dual-tower architecture
20. Adaptation analysis and fusion correction method of CMIP6 precipitation simulation data on the Qinghai-Tibetan Plateau
Authors: PENG Hao, QIN Dahui, WANG Zegen, ZHANG Menghan, YANG Yanmei, YONG Zhiwei. Journal of Mountain Science (SCIE, CSCD), 2024, No. 2, pp. 555-573.
To obtain more accurate precipitation data and better simulate precipitation on the Tibetan Plateau, the ability of 14 Coupled Model Intercomparison Project Phase 6 (CMIP6) models to simulate historical precipitation (1982-2014) on the Qinghai-Tibetan Plateau was evaluated in this study. Analysis of the Taylor index and temporal and spatial statistical parameters indicates that all models overestimate precipitation. To correct the overestimation, a fusion correction method combining Backpropagation Neural Network (BP) correction and Quantile Mapping (QM) correction, named the BQ method, was proposed. With this method, the historical precipitation of each model was corrected in space and time, respectively. The correction results were then compared, in time, space, and analysis of variance (ANOVA), with those of the BP and QM methods individually. Finally, the fusion-corrected results for each model were compared with Climatic Research Unit (CRU) data for significance analysis to obtain the precipitation trends of each model. The results show that, among the uncorrected data, the IPSL-CM6A-LR model simulates historical precipitation on the Qinghai-Tibetan Plateau relatively well (R=0.7, RMSE=0.15). In time, the total precipitation corrected by the fusion method has the same interannual trend as, and the closest precipitation values to, the CRU data; in space, the annual average precipitation corrected by the fusion method differs least from the CRU data, and the total historical annual average precipitation is not significantly different from the CRU data, outperforming BP and QM. Therefore, the fusion method corrects the historical precipitation of each model better than the QM and BP methods do. Precipitation in the central and northeastern parts of the plateau shows a significant increasing trend. The correlation coefficients between monthly precipitation and station-observed precipitation for all models after BQ correction exceed 0.8.
Keywords: GCM, CMIP6, precipitation correction, BP-QM fusion correction, spatio-temporal characteristics
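A minimal sketch of the quantile-mapping half of the BQ correction above: each simulated precipitation value is mapped to the observed value at the same empirical quantile, which removes a systematic wet bias. The series are synthetic; the BP neural-network half and the space/time split are omitted.

```python
import numpy as np

def quantile_map(sim, obs, value):
    """Map `value` from the simulated distribution onto the observed one."""
    q = np.searchsorted(np.sort(sim), value) / len(sim)   # empirical quantile
    return np.quantile(obs, min(q, 1.0))

rng = np.random.default_rng(0)
obs = rng.gamma(shape=2.0, scale=2.0, size=1000)          # "CRU-like" series
sim = rng.gamma(shape=2.0, scale=2.6, size=1000)          # wet-biased model output

corrected = np.array([quantile_map(sim, obs, v) for v in sim[:5]])
print(np.round(sim[:5], 2), "->", np.round(corrected, 2))
```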