
BLIND SPEECH SEPARATION FOR ROBOTS WITH INTELLIGENT HUMAN-MACHINE INTERACTION

Abstract: The speech recognition rate deteriorates sharply in human-machine interaction when the speaker's speech is mixed with a bystander's voice. This paper proposes a time-frequency approach to Blind Source Separation (BSS) for intelligent Human-Machine Interaction (HMI). The main idea of the algorithm is to simultaneously diagonalize the correlation matrices of the pre-whitened signals at different time delays for every frequency bin in the time-frequency domain. The proposed method has two merits: (1) fast convergence; (2) a high signal-to-interference ratio of the separated signals. Numerical evaluations compare the performance of the proposed algorithm with that of two other deconvolution algorithms. An efficient algorithm for resolving the permutation ambiguity is also proposed. With properly selected parameters, the proposed algorithm saves more than 10% of computational time and performs well on both simulated convolutive mixtures and speech recorded in a real room.
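The core step named in the abstract, simultaneous diagonalization of time-delayed correlation matrices of pre-whitened signals, can be illustrated with a minimal sketch. It rests on several assumptions that are not taken from the paper: the mixture is real-valued and instantaneous (the paper works on complex coefficients separately in every frequency bin), the joint diagonalizer is a generic Jacobi-rotation scheme in the style of Cardoso and Souloumiac rather than the authors' own update rule, and the function names `whiten`, `lagged_correlations`, `joint_diagonalize`, and `separate` as well as the default lags are hypothetical. Permutation alignment across bins is not covered.

```python
import numpy as np

def whiten(x):
    """Pre-whiten x (channels x samples) so its covariance becomes identity."""
    x = x - x.mean(axis=1, keepdims=True)
    cov = x @ x.T / x.shape[1]
    eigval, eigvec = np.linalg.eigh(cov)
    W = (eigvec / np.sqrt(eigval)).T          # W = D^(-1/2) E^T
    return W @ x, W

def lagged_correlations(z, lags):
    """Symmetrized correlation matrices of z at the given time delays."""
    T = z.shape[1]
    mats = []
    for tau in lags:
        R = z[:, tau:] @ z[:, :T - tau].T / (T - tau)
        mats.append(0.5 * (R + R.T))
    return np.array(mats)                     # shape (n_lags, n, n)

def joint_diagonalize(mats, sweeps=50, tol=1e-10):
    """Find an orthogonal V such that V.T @ R @ V is close to diagonal
    for every R in mats, using Jacobi (Givens) rotations."""
    A = mats.copy()
    n = A.shape[1]
    V = np.eye(n)
    for _ in range(sweeps):
        rotated = False
        for p in range(n - 1):
            for q in range(p + 1, n):
                g = np.array([A[:, p, p] - A[:, q, q],
                              A[:, p, q] + A[:, q, p]])
                G = g @ g.T
                ton = G[0, 0] - G[1, 1]
                toff = G[0, 1] + G[1, 0]
                theta = 0.5 * np.arctan2(toff, ton + np.hypot(ton, toff))
                c, s = np.cos(theta), np.sin(theta)
                if abs(s) > tol:
                    rotated = True
                    rot = np.array([[c, -s], [s, c]])
                    V[:, [p, q]] = V[:, [p, q]] @ rot
                    A[:, [p, q], :] = rot.T @ A[:, [p, q], :]
                    A[:, :, [p, q]] = A[:, :, [p, q]] @ rot
        if not rotated:
            break
    return V, A

def separate(x, lags=(1, 2, 4, 8, 16)):
    """Estimate an unmixing matrix B and the separated signals B @ x."""
    z, W = whiten(x)
    V, _ = joint_diagonalize(lagged_correlations(z, lags))
    B = V.T @ W
    return B @ (x - x.mean(axis=1, keepdims=True)), B
```

In a frequency-domain variant, the same three steps would be applied to the complex short-time Fourier coefficients of each bin, with transposes replaced by conjugate transposes and complex-valued rotations; the separated bins would then still need a consistent ordering before resynthesis.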
Source: Journal of Electronics (China), 2012, No. 3, pp. 286-293 (8 pages); Chinese title: 电子科学学刊(英文版)
Keywords: Blind Source Separation (BSS); Blind deconvolution; Speech signal processing; Human-machine interaction; Simultaneous diagonalization

