
Considering the Relative Order of Emotional Degree in Dimensional Speech Emotion Recognition (cited by: 2)
Abstract: Dimensional speech emotion recognition (Dim-SER) is an emerging branch of affective computing. It treats emotion as multidimensional and continuous, and models SER as a regression task that predicts continuous values. Current Dim-SER systems do not consider the relative order of emotional degree between utterances when predicting emotion, which seriously impairs a human-machine interaction system's grasp of how a speaker's emotion is changing. To address this need, this paper takes the characteristics of human emotion cognition as a reference and constructs a Dim-SER system that is sensitive to the relative order of emotional degree, and it introduces the Gamma statistic to supplement the criteria for evaluating SER performance. In building the system, a Top-rank probability distribution is constructed to describe the emotional ordering of utterances, and the Kullback-Leibler divergence is used to measure the loss of order consistency caused by prediction. Finally, an order-sensitive neural network algorithm, the Order-Sensitive Network (OSNet), is proposed to minimize the system's prediction loss. Emotion prediction experiments show that, compared with the commonly used k-nearest neighbor (k-NN) and support vector regression (SVR) algorithms, the proposed system effectively improves the correctness of the relative order of emotional degree between utterances.
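The abstract names three ingredients without giving formulas: a Top-rank probability distribution over utterances, a Kullback-Leibler measure of order-consistency loss, and the Gamma statistic for evaluation. The sketch below is one plausible reading, not the paper's implementation. It assumes a ListNet-style top-one probability (a softmax over emotion scores) and the Goodman-Kruskal form of the Gamma statistic; the function names and example values are hypothetical.

```python
import math


def top_one_probabilities(scores):
    """Probability that each utterance ranks first.

    Assumption: a softmax over the emotion scores (ListNet-style top-one
    probability). The abstract does not state the exact definition.
    """
    shift = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q):
    """Kullback-Leibler divergence D(p || q) between two discrete distributions."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def order_loss(true_scores, predicted_scores):
    """Order-consistency loss: KL divergence between the two top-one distributions."""
    return kl_divergence(
        top_one_probabilities(true_scores),
        top_one_probabilities(predicted_scores),
    )


def goodman_kruskal_gamma(true_scores, predicted_scores):
    """Gamma = (C - D) / (C + D) over all utterance pairs; tied pairs are ignored.

    Assumption: the abstract's "Gamma statistic" is the Goodman-Kruskal gamma.
    """
    concordant = discordant = 0
    n = len(true_scores)
    for i in range(n):
        for j in range(i + 1, n):
            product = (true_scores[i] - true_scores[j]) * (
                predicted_scores[i] - predicted_scores[j]
            )
            if product > 0:
                concordant += 1
            elif product < 0:
                discordant += 1
    if concordant + discordant == 0:
        return 0.0  # every pair is tied, so the ordering carries no information
    return (concordant - discordant) / (concordant + discordant)


# Hypothetical arousal labels for three utterances and two sets of predictions.
labels = [0.1, 0.5, 0.9]
same_order = [0.2, 0.4, 0.8]
reversed_order = [0.8, 0.4, 0.2]
```

Under these assumptions, predictions that preserve the label ordering give a Gamma of 1 and predictions that reverse it give -1, while the KL-based loss is zero only when the two top-one distributions coincide. A network trained by gradient descent on such a loss would be pushed toward preserving relative order rather than only matching absolute values.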
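Source: Journal of Signal Processing (《信号处理》), indexed in CSCD and the Peking University Core Journals list, 2011, No. 11, pp. 1658-1663 (6 pages)
Funding: Natural Science Foundation (60772076); Open Fund of the Ministry of Education-Microsoft Key Laboratory of Language and Speech (HIT.KLOF.2009015); Specialized Research Fund for the Doctoral Program of Higher Education (No. 20050213032)
Keywords: dimensional speech emotion recognition; emotion space; Kullback-Leibler divergence; neural network; gradient descent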

