Abstract
Different types of noise have different spectral characteristics, which limits how well convolutional neural networks denoise speech corrupted by varied noise. To address this limitation, a channel attention mechanism is introduced as an intermediate layer of a convolutional recurrent network. It assigns different weights to the convolution kernels in the convolutional layers according to their function, so that during training the model removes the noise component of the input in a more targeted way and achieves better denoising. Noisy speech containing 15 types of noise was denoised with three models: a recurrent neural network, an encoder-decoder convolutional network, and a convolutional recurrent neural network. The results show that the model with the attention mechanism achieves higher perceptual evaluation of speech quality (PESQ) and short-time objective intelligibility (STOI) scores than the other two models, and that it better preserves the harmonic information of speech.
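The channel weighting described in the abstract can be illustrated with a minimal sketch. It assumes a squeeze-and-excitation style channel attention, which the abstract does not confirm as the authors' exact design. The function name `channel_attention`, the weight matrices `w1` and `w2`, and the toy input are all hypothetical and chosen only for illustration.

```python
import math


def channel_attention(features, w1, w2):
    """Reweight the channels of `features` (shape [C][T][F]).

    Assumed squeeze-and-excitation style attention, not the paper's
    confirmed layer. w1 ([C_reduced][C]) and w2 ([C][C_reduced]) are
    hypothetical weights that would normally be learned.
    Returns (scaled_features, channel_weights).
    """
    # Squeeze: global average over time and frequency for each channel.
    squeezed = [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in features
    ]
    # Excitation, step 1: bottleneck fully connected layer with ReLU.
    hidden = [max(0.0, sum(w * s for w, s in zip(row, squeezed))) for row in w1]
    # Excitation, step 2: expansion layer with sigmoid, giving one
    # weight in (0, 1) per channel.
    weights = [
        1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
        for row in w2
    ]
    # Scale: multiply every time-frequency bin by its channel's weight.
    scaled = [
        [[weights[c] * value for value in row] for row in channel]
        for c, channel in enumerate(features)
    ]
    return scaled, weights


# Toy input: 2 channels, 2 time frames, 3 frequency bins (made-up values).
features = [
    [[1.0, 1.0, 1.0], [1.0, 1.0, 1.0]],
    [[0.0, 2.0, 4.0], [0.0, 2.0, 4.0]],
]
w1 = [[0.5, 0.5]]      # 2 channels -> 1 hidden unit
w2 = [[1.0], [-1.0]]   # 1 hidden unit -> 2 channels
scaled, weights = channel_attention(features, w1, w2)
```

In a trained network the weights would be learned jointly with the convolutional and recurrent layers, so channels whose kernels respond mainly to noise could receive smaller weights than channels that capture speech structure.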
Authors
徐浩森
姜囡
齐志坤
XU Hao-sen; JIANG Nan; QI Zhi-kun (College of Public Security Information Technology and Intelligence, Criminal Investigation Police University of China, Shenyang 110854, China)
Source
《科学技术与工程》
Peking University Core Journal
2022, Issue 5, pp. 1950-1957 (8 pages)
Science Technology and Engineering
Funding
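Guangzhou Science and Technology Program (2019030004)
Joint Open Fund of the Liaoning Provincial Department of Science and Technology and Open Fund of the State Key Laboratory of Robotics (2020-KF-12-11)
Fundamental Research Funds for the Central Universities (3242019010)
Natural Science Foundation of Liaoning Province (2019-ZD-0168)
National Key Research and Development Program of the Ministry of Science and Technology (2017YFC0821005)
Key Research Project of the Ministry of Education (E-AQGABQ20202710)
Open Fund of the Key Laboratory of Evidence Science, Ministry of Education (2021KFKT09)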
Keywords
speech denoising
autoencoder network
convolutional recurrent network
channel attention mechanism