Abstract
In recent years, multi-modal sentiment analysis has become an increasingly popular research area, extending traditional text-based sentiment analysis to a multi-modal level that combines text, images, and sound. Multi-modal sentiment analysis usually requires capturing both the independent information within each single modality and the interactive information between different modalities. To exploit the context of the language expression in each modality when acquiring these two kinds of information, this paper proposes a multi-modal sentiment analysis approach based on a context-augmented LSTM. Specifically, each modality is first encoded by an LSTM in combination with its context features, so as to capture the independent information within that single modality. The independent unimodal representations are then fused, and another LSTM layer is applied to obtain the interactive information between the modalities, forming a multi-modal feature representation. Finally, a max-pooling strategy reduces the dimension of the multi-modal representation, which is fed to the sentiment classifier. The proposed method achieves 75.3% accuracy (ACC) and an F1 score of 74.9 on the MOSI dataset. Compared with traditional machine learning methods such as SVM, its ACC is 8.1% higher and its F1 is 7.3 higher. Compared with the current state-of-the-art deep learning method, it is 0.9% higher in ACC and 1.3 higher in F1, while using only about 1/20 of the trainable parameters and training roughly 10 times faster. Extensive comparative experiments demonstrate that the proposed approach significantly outperforms competitive multi-modal sentiment classification baselines.
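The pipeline described in the abstract (per-modality context LSTMs, fusion of the unimodal states, a second LSTM for cross-modal interaction, max-pooling, and a classifier) can be summarized as a minimal PyTorch sketch. All layer sizes, the class name, and the feature dimensions (300-d text, 74-d audio, 47-d visual, typical for MOSI) are illustrative assumptions, not the authors' released implementation.

```python
import torch
import torch.nn as nn

class ContextAugmentedLSTM(nn.Module):
    """Minimal sketch of the architecture described in the abstract.

    1. One LSTM per modality runs over the utterance sequence, so each
       utterance representation absorbs its context.
    2. The unimodal outputs are concatenated and passed through a second
       LSTM that models interaction between modalities.
    3. Max-pooling over the utterance axis yields one vector, which a
       linear layer maps to sentiment classes.

    Hyper-parameters below are hypothetical defaults, not the paper's.
    """

    def __init__(self, text_dim=300, audio_dim=74, visual_dim=47,
                 hidden=64, num_classes=2):
        super().__init__()
        self.text_lstm = nn.LSTM(text_dim, hidden, batch_first=True)
        self.audio_lstm = nn.LSTM(audio_dim, hidden, batch_first=True)
        self.visual_lstm = nn.LSTM(visual_dim, hidden, batch_first=True)
        # Second LSTM over the fused (concatenated) unimodal states
        # captures cross-modal interactive information.
        self.fusion_lstm = nn.LSTM(3 * hidden, hidden, batch_first=True)
        self.classifier = nn.Linear(hidden, num_classes)

    def forward(self, text, audio, visual):
        # Each input: (batch, num_utterances, feature_dim). Running the
        # LSTM over the utterance sequence injects context information.
        t, _ = self.text_lstm(text)
        a, _ = self.audio_lstm(audio)
        v, _ = self.visual_lstm(visual)
        fused, _ = self.fusion_lstm(torch.cat([t, a, v], dim=-1))
        # Max-pool over the utterance axis to a fixed-size vector.
        pooled, _ = fused.max(dim=1)
        return self.classifier(pooled)

# Quick shape check with random inputs (batch of 8 videos, 20 utterances):
model = ContextAugmentedLSTM()
out = model(torch.randn(8, 20, 300), torch.randn(8, 20, 74),
            torch.randn(8, 20, 47))
print(out.shape)  # torch.Size([8, 2])
```

The small per-layer hidden size is consistent with the abstract's claim of roughly 1/20 the trainable parameters of prior deep models, though the exact sizes used in the paper are not given here.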
Authors
LIU Qi-yuan; ZHANG Dong; WU Liang-qing; LI Shou-shan (School of Computer Science & Technology, Soochow University, Suzhou, Jiangsu 215006, China)
Source
Computer Science (《计算机科学》)
CSCD; Peking University Core Journal
2019, No. 11, pp. 181-185 (5 pages)
Funding
Supported by the National Natural Science Foundation of China (61331011, 61375073)
Keywords
Multi-modal
Sentiment analysis
Context enhancement