Journal Article

A Multimodal Emotion Classification Model for Image and Text Fusion with Attention Mechanism

Cited by: 2
Abstract: In image-text bimodal emotion classification, insufficient feature extraction and information redundancy in multimodal feature fusion are common problems. This paper introduces attention mechanisms into the multi-channel feature extraction and fusion process and proposes a multimodal emotion classification model that integrates attention. First, TextCNN and BERT are used to extract local text features and contextual text features respectively, and a residual network (ResNet) extracts image features. Second, a cross-modal attention mechanism enables information interaction between the modalities, enhancing each modality's feature representation. Then, a self-attention mechanism fuses the multimodal features. Finally, a Softmax classifier produces the emotion classification result. On the public TumEmo image-text dataset, the model achieves 75.2% accuracy and 74.3% F1 on seven-class emotion classification, demonstrating good performance.
Authors: PENG Junwen; LI Lei (School of Statistics and Data Science, Xinjiang University of Finance and Economics, Urumqi, Xinjiang 830012)
Source: Software (《软件》), 2023, No. 12, pp. 176-180 (5 pages)
Keywords: sentiment classification; multimodal; attention mechanism; BERT; ResNet; TextCNN
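The cross-modal attention step described in the abstract can be sketched as follows. This is a minimal NumPy illustration with hypothetical feature dimensions (768-d BERT-style text tokens, 512-d image regions) and random weights standing in for learned projections; it is not the authors' implementation, only the scaled dot-product pattern in which each modality queries the other before fusion:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_modal_attention(query_feats, kv_feats, d_k=64, seed=0):
    """Scaled dot-product attention: one modality queries the other.

    query_feats: (n_q, d_q) features of the querying modality
    kv_feats:    (n_kv, d_kv) features of the attended modality
    Returns enhanced query-side features of shape (n_q, d_k).
    """
    rng = np.random.default_rng(seed)
    d_q, d_kv = query_feats.shape[-1], kv_feats.shape[-1]
    # Random stand-ins for learned projection matrices W_Q, W_K, W_V.
    Wq = rng.standard_normal((d_q, d_k)) / np.sqrt(d_q)
    Wk = rng.standard_normal((d_kv, d_k)) / np.sqrt(d_kv)
    Wv = rng.standard_normal((d_kv, d_k)) / np.sqrt(d_kv)
    Q, K, V = query_feats @ Wq, kv_feats @ Wk, kv_feats @ Wv
    weights = softmax(Q @ K.T / np.sqrt(d_k))  # (n_q, n_kv) attention map
    return weights @ V

# Hypothetical inputs: 10 text tokens (BERT-sized) and 49 image regions.
text_feats = np.random.default_rng(1).standard_normal((10, 768))
img_feats = np.random.default_rng(2).standard_normal((49, 512))

text_enhanced = cross_modal_attention(text_feats, img_feats)  # text attends to image
img_enhanced = cross_modal_attention(img_feats, text_feats)   # image attends to text

# Concatenate both enhanced streams before the fusion/classification stages.
fused = np.concatenate([text_enhanced, img_enhanced], axis=0)  # (59, 64)

# Toy classification head: pool and project to 7 emotion classes.
W_out = np.random.default_rng(3).standard_normal((64, 7))
probs = softmax(fused.mean(axis=0) @ W_out)
print(text_enhanced.shape, fused.shape, probs.shape)
```

In the paper's pipeline, a self-attention layer would further fuse the concatenated features before the Softmax classifier; here a simple mean-pool stands in for that stage.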