BLIND SPEECH SEPARATION FOR ROBOTS WITH INTELLIGENT HUMAN-MACHINE INTERACTION



Abstract The speech recognition rate deteriorates greatly in human-machine interaction when the speaker's speech mixes with a bystander's voice. This paper proposes a time-frequency approach to Blind Source Separation (BSS) for intelligent Human-Machine Interaction (HMI). The main idea of the algorithm is to simultaneously diagonalize the correlation matrices of the pre-whitened signals at different time delays for every frequency bin in the time-frequency domain. The proposed method has two merits: (1) fast convergence; (2) a high signal-to-interference ratio of the separated signals. Numerical evaluations compare the performance of the proposed algorithm with two other deconvolution algorithms. An efficient algorithm for resolving the permutation ambiguity is also proposed. With properly selected parameters, the proposed algorithm saves more than 10% of computational time and performs well on both simulated convolutive mixtures and speech recorded in a real room.
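The abstract gives no implementation details, so the following is only a minimal sketch of the general idea it names: pre-whiten the signals of one frequency bin, estimate correlation matrices at several time delays, and find a single unitary matrix that approximately diagonalizes all of them. It uses a generic SOBI-style procedure with Jacobi rotations in the style of Cardoso and Souloumiac, which is an assumption and not necessarily the authors' algorithm. All function names, the delays, the synthetic AR(1) sources, and the mixing matrix are hypothetical, and permutation alignment across bins is not covered.

```python
import numpy as np

def whiten(X):
    """Whiten zero-mean data X (channels x frames) so its covariance is I."""
    R0 = X @ X.conj().T / X.shape[1]
    d, E = np.linalg.eigh(R0)
    Q = (E / np.sqrt(d)).conj().T          # Q = D^{-1/2} E^H
    return Q @ X, Q

def delayed_correlations(Z, delays):
    """Estimate R(tau) = E[z(t) z(t - tau)^H] for each delay (K x n x n)."""
    T = Z.shape[1]
    mats = []
    for tau in delays:
        R = Z[:, tau:] @ Z[:, :T - tau].conj().T / (T - tau)
        mats.append(0.5 * (R + R.conj().T))  # Hermitian part (a design choice)
    return np.stack(mats)

def joint_diagonalize(mats, sweeps=50, tol=1e-10):
    """Find unitary U so that U^H M_k U is nearly diagonal for every k."""
    M = mats.astype(complex).copy()
    n = M.shape[1]
    U = np.eye(n, dtype=complex)
    for _ in range(sweeps):
        rotated = False
        for p in range(n - 1):
            for q in range(p + 1, n):
                # 3 x K statistics that determine the optimal rotation
                g = np.stack([M[:, p, p] - M[:, q, q],
                              M[:, p, q] + M[:, q, p],
                              1j * (M[:, q, p] - M[:, p, q])])
                _, vecs = np.linalg.eigh(np.real(g @ g.conj().T))
                v = vecs[:, -1]             # eigenvector of largest eigenvalue
                x, y, z = v if v[0] >= 0 else -v
                c = np.sqrt(0.5 + x / 2)
                s = 0.5 * (y - 1j * z) / c
                if abs(s) < tol:
                    continue
                rotated = True
                G = np.array([[c, -np.conj(s)], [s, c]])
                U[:, [p, q]] = U[:, [p, q]] @ G
                M[:, [p, q], :] = G.conj().T @ M[:, [p, q], :]
                M[:, :, [p, q]] = M[:, :, [p, q]] @ G
        if not rotated:
            break
    return U, M

def demo(seed=0, T=20000):
    """Separate two synthetic complex AR(1) sources in one 'frequency bin'."""
    rng = np.random.default_rng(seed)
    noise = rng.standard_normal((2, T)) + 1j * rng.standard_normal((2, T))
    S = np.zeros((2, T), dtype=complex)
    for i, a in enumerate((0.9, -0.5)):     # different temporal correlation
        for t in range(1, T):
            S[i, t] = a * S[i, t - 1] + noise[i, t]
    S = S / S.std(axis=1, keepdims=True)
    A = np.array([[1.0, 0.6 + 0.3j], [0.5 - 0.4j, 1.0]])  # hypothetical mixing
    X = A @ S
    X = X - X.mean(axis=1, keepdims=True)
    Z, Q = whiten(X)
    U, _ = joint_diagonalize(delayed_correlations(Z, delays=(1, 2, 3, 4)))
    W = U.conj().T @ Q                      # unmixing matrix for this bin
    return np.abs(W @ A)                    # ideally a scaled permutation

if __name__ == "__main__":
    print(demo())
```

Calling `demo()` returns the magnitude of the global matrix `W @ A`; when separation succeeds, each row has one dominant entry. In a full frequency-domain system this step would be repeated for every bin of a short-time Fourier transform, and the per-bin outputs would still need to be aligned to resolve the permutation ambiguity mentioned in the abstract.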

Authors Huang Yulei, Ding Zhizhong, Dai Lirong, Chen Xiaoping

Affiliations Department of Communication Engineering; Department of Electronic Engineering and Information Science

Source Journal of Electronics (China) [电子科学学刊（英文版）], 2012, No. 3, pp. 286-293 (8 pages)

Keywords Blind Source Separation (BSS); Blind deconvolution; Speech signal processing; Human-machine interaction; Simultaneous diagonalization

Classification code TN911 [Electronics and Telecommunications: Communication and Information Systems]



