Echo Cancellation Method Combining Kalman Filtering and Taylor Residual Expansion
Abstract: Existing deep-learning-based acoustic echo cancellation algorithms mainly adopt an end-to-end structure, which makes it difficult to build interpretability into the design of the neural network model. To address this problem, an echo cancellation method combining Kalman filtering and Taylor residual expansion is proposed, which makes the design of the network structure more interpretable. The method consists of two parts: linear adaptive filtering and a deep neural network. First, Neural Kalman Filtering (NKF) is used as the adaptive filter to remove linear noise and obtain a rough spectral estimate of the target speech. Then, a Taylor expansion neural network further processes the rough spectral estimate to suppress nonlinear residual echo and progressively restore the complex spectrum of the target speech. Within the Taylor expansion neural network, an encoder-decoder network that fuses time-frequency features at different scales is designed for zero-order term estimation, and a lightweight network is built for high-order term estimation, so that the complex spectrum of the target speech is reconstructed from coarse to fine granularity. Experimental results show that the proposed method significantly outperforms existing mainstream echo cancellation methods. In the double-talk case, both Perceptual Evaluation of Speech Quality (PESQ) and Short-Time Objective Intelligibility (STOI) improve substantially; in the single-talk case, the Echo Return Loss Enhancement (ERLE) reaches 56.106, a 6.5% improvement over the advanced UNET neural network method.
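As a rough illustration of the first (linear) stage and of the ERLE metric quoted in the abstract, the following is a minimal sketch only. It assumes a classical time-domain Kalman filter with a random-walk echo-path model as a stand-in for the paper's Neural Kalman Filtering, which is not reproduced here, and it does not model the Taylor expansion network at all. The synthetic signals, the echo path `h`, the noise settings `q` and `r`, and the function names `kalman_echo_canceller` and `erle_db` are all hypothetical choices made for this example. ERLE is computed with its standard definition, 10·log10 of microphone power over residual power.

```python
import numpy as np

def kalman_echo_canceller(far_end, mic, taps=16, q=1e-10, r=1e-6):
    """Classical Kalman filter that tracks an FIR echo path (illustrative stand-in, not NKF)."""
    w = np.zeros(taps)                      # echo-path estimate (state)
    P = np.eye(taps)                        # state error covariance
    padded = np.concatenate([np.zeros(taps - 1), far_end])
    residual = np.empty_like(mic)
    for n in range(len(mic)):
        x = padded[n:n + taps][::-1]        # [far[n], far[n-1], ..., far[n-taps+1]]
        P = P + q * np.eye(taps)            # predict: random-walk process noise
        e = mic[n] - x @ w                  # a priori error = echo-cancelled sample
        Px = P @ x
        k = Px / (x @ Px + r)               # Kalman gain
        w = w + k * e                       # update echo-path estimate
        P = P - np.outer(k, Px)             # update covariance
        residual[n] = e
    return residual, w

def erle_db(mic, residual):
    """Echo Return Loss Enhancement in dB (meaningful in single-talk segments)."""
    return 10.0 * np.log10(np.sum(mic ** 2) / np.sum(residual ** 2))

# Synthetic single-talk scenario (assumed data, not from the paper).
rng = np.random.default_rng(0)
N, L = 4000, 16
far = rng.standard_normal(N)                # far-end reference signal
h = 0.8 ** np.arange(L)                     # hypothetical echo path
echo = np.convolve(far, h)[:N]              # linear echo at the microphone
mic = echo + 1e-3 * rng.standard_normal(N)  # echo plus a little sensor noise

residual, w_hat = kalman_echo_canceller(far, mic, taps=L)
erle = erle_db(mic[N // 2:], residual[N // 2:])  # measure after convergence
print(f"ERLE over the second half: {erle:.1f} dB")
```

In the method described above, the residual from this kind of linear stage would then be passed to the Taylor expansion network to suppress nonlinear residual echo; that second stage is outside the scope of this sketch.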
Authors: LI Yong; SUN Cheng-li; CHEN Fei-long (School of Information and Engineering, Nanchang Hangkong University, Nanchang 330063, China; School of Information and Communication Engineering, Guangzhou Maritime University, Guangzhou 510725, China)
Source: Journal of Nanchang Hangkong University (Natural Sciences), CAS, 2024, No. 1, pp. 32-42 (11 pages)
Funding: National Natural Science Foundation of China (61861033); Jiangxi Province Ganpo Juncai Support Program (20232BCJ22050); Science and Technology Project of the Jiangxi Provincial Department of Education (DA202104170); Natural Science Foundation of Shandong Province (ZR2020MF020); Key Project of the Natural Science Foundation of Jiangxi Province (20202ACBL202007); Doctoral Start-up Fund of Nanchang Hangkong University (EA201904283).
Keywords: echo cancellation; adaptive filtering; Taylor expansion; progressive learning; neural networks