Speaker recognition system based on fusion of cochlear filter cepstral coefficients and Teager energy operator phase (Cited by: 4)
Abstract To improve the performance of speaker recognition systems, this paper builds on traditional features and proposes compensating auditory cepstral features with phase features. The method uses the Teager energy operator (TEO), which models the nonlinear vortex-flow behavior of the airflow as it passes through the vocal tract. The Hilbert transform is then applied to derive the instantaneous phase of the analytic signal from the TEO output, and this phase is combined with cochlear filter cepstral coefficients (CFCC) to obtain fused feature parameters. This compensates the feature parameters and improves the recognition rate of the speaker recognition system. Experiments are carried out on the NIST-2002 speaker recognition evaluation (SRE) database with a Gaussian mixture model-universal background model (GMM-UBM) speaker recognition system. The results show that combining the TEO phase with CFCC performs better than CFCC alone: recognition accuracy improves by 8.32% and 3.15% over the existing CFCC feature and the linear prediction Mel frequency cepstral coefficient (LPMFCC), respectively. This indicates that the TEO phase carries information complementary to the CFCC feature and yields a high recognition rate.
Source Journal of Nanjing University of Science and Technology (indexed in EI, CAS, CSCD, Peking University Core), 2018, No. 1, pp. 82-88, 7 pages
Funding National Natural Science Foundation of China (60973095); Natural Science Foundation of Jiangsu Province (BK20131107)
Keywords energy operator; cochlear filter cepstral coefficient; Gaussian mixture model-universal background model; speaker recognition
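
The phase-extraction step described in the abstract (TEO first, then the Hilbert transform to get the instantaneous phase of the analytic signal) can be sketched as follows. This is only a minimal illustration under stated assumptions, not the authors' implementation: the abstract gives no formulas, so the code assumes the standard discrete TEO, psi[n] = x[n]^2 - x[n-1]*x[n+1], and an FFT-based analytic signal. The function names `teager_energy`, `analytic_signal`, and `teo_instantaneous_phase` are hypothetical, and framing, windowing, CFCC extraction, and the fusion rule are not shown because the abstract does not specify them.

```python
import numpy as np


def teager_energy(x):
    """Discrete Teager energy operator (assumed form): x[n]^2 - x[n-1]*x[n+1]."""
    x = np.asarray(x, dtype=float)
    psi = np.empty_like(x)
    psi[1:-1] = x[1:-1] ** 2 - x[:-2] * x[2:]
    # The endpoints lack a two-sided neighbourhood; reuse the nearest interior value.
    psi[0], psi[-1] = psi[1], psi[-2]
    return psi


def analytic_signal(x):
    """Analytic signal via the FFT: keep DC, double positive bins, zero negative bins."""
    x = np.asarray(x, dtype=float)
    n = x.size
    weights = np.zeros(n)
    weights[0] = 1.0
    if n % 2 == 0:
        weights[n // 2] = 1.0        # Nyquist bin is kept once for even lengths
        weights[1:n // 2] = 2.0
    else:
        weights[1:(n + 1) // 2] = 2.0
    return np.fft.ifft(np.fft.fft(x) * weights)


def teo_instantaneous_phase(frame):
    """Unwrapped instantaneous phase of the analytic signal of the TEO profile."""
    psi = teager_energy(frame)
    return np.unwrap(np.angle(analytic_signal(psi)))


if __name__ == "__main__":
    # Hypothetical test frame: an amplitude-modulated tone, 256 samples.
    n = np.arange(256)
    frame = (1.0 + 0.5 * np.cos(2 * np.pi * 4 * n / 256)) * np.cos(2 * np.pi * 32 * n / 256)
    phase = teo_instantaneous_phase(frame)
    print(phase.shape)
```

For a pure tone A*cos(omega*n), the assumed TEO form returns the constant A^2 * sin^2(omega), which makes a convenient sanity check before applying the sketch to speech frames.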