Abstract
Estimating non-stationary noise is an important factor in improving noise reduction for noisy speech. A convolution module is therefore used to enrich the information carried by each single frame of noisy speech, and the self-attention module from the Transformer enables the model to distinguish the noise and speech components of noisy speech more accurately, so that the transposed convolution module can complete speech noise reduction more efficiently. For 15 noise types from the Noisex-92 noise library, an LSTM network, a convolutional recurrent network, and a convolutional recurrent network based on a channel attention mechanism are applied for comparative analysis, and the noisy speech in the test set is denoised. The experimental results show that speech denoised by the proposed self-attention-based convolutional recurrent network scores considerably higher on both PESQ and STOI, and the spectrograms show that residual noise is effectively reduced.
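The self-attention step named in the abstract can be illustrated with a minimal sketch. It is not the authors' model: it assumes a single attention head, uses the frame vectors themselves as queries, keys, and values (no learned projection matrices), and runs on made-up toy frames. The names `softmax`, `self_attention`, and `frames` are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(frames):
    """Single-head scaled dot-product self-attention over frame vectors.

    Assumption: queries, keys, and values are the frames themselves
    (identity projections); a trained model would use learned weights.
    """
    dim = len(frames[0])
    scale = math.sqrt(dim)
    outputs, weights = [], []
    for query in frames:
        # Similarity of this frame to every frame in the sequence.
        scores = [sum(q * k for q, k in zip(query, key)) / scale
                  for key in frames]
        row = softmax(scores)
        weights.append(row)
        # Each output frame is a weighted average of all frames.
        outputs.append([sum(w * value[d] for w, value in zip(row, frames))
                        for d in range(dim)])
    return outputs, weights


# Toy input: three "frames" with two features each (hypothetical values).
frames = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
outputs, weights = self_attention(frames)
```

In this toy case the first frame gives more weight to the similar second frame than to the dissimilar third one, which is the kind of frame-to-frame weighting that could help separate speech-dominated frames from noise-dominated frames.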
Authors
徐浩森
姜囡
齐志坤
XU Hao-sen; JIANG Nan; QI Zhi-kun (College of Public Security Information Technology and Intelligence, Criminal Investigation Police University of China, Shenyang, Liaoning 110854, China; Key Laboratory of Evidence Science, Ministry of Education, China University of Political Science and Law, Beijing 100088, China)
Source
《计算机仿真》
2024, No. 4, pp. 500-506 (7 pages)
Computer Simulation
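Funding
Open Fund of the Key Laboratory of Evidence Science, Ministry of Education (China University of Political Science and Law) (2021KFKT09)
Joint Open Fund of the Liaoning Provincial Department of Science and Technology and Open Fund of the State Key Laboratory of Robotics (2020-KF-12-11)
Major Program Cultivation Project of the Criminal Investigation Police University of China (3242019010)
Key Research Project of the Ministry of Education (E-AQGABQ20202710)
Natural Science Foundation of Liaoning Province (2019-ZD-0168)
Innovation Program for Basic Theoretical Research in Public Security Disciplines
Fundamental Research Funds for the Central Universities (3242019010)
Innovation Program for Basic Theoretical Research in Public Security Disciplines, "Research on the Basic Theory and Discipline System of Public Security Technology" (Research on the Basic Theory and Discipline System of Security and Protection Technology and Engineering, 2022XKGJ0110).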
Keywords
Speech noise reduction
Non-stationary noise
Self-attention mechanism
Deep learning