Abstract
Power dispatching calls are inevitably accompanied by noise interference. Existing models perform well under clean conditions, but their recognition accuracy drops rapidly under noisy conditions, which reduces the recognition rate and security of command issuance. To improve the noise robustness and deep dynamic feature extraction ability of the speaker verification system in power dispatching, a speaker verification model is proposed based on wavelet packet cepstral coefficients and the emphasized channel attention, propagation and aggregation in time delay neural network (ECAPA-TDNN). Starting from Mel-frequency cepstral coefficients, the fast Fourier transform is replaced by wavelet packet decomposition, cepstral mean and variance normalization and delta and delta-delta coefficients are added, and the resulting wavelet packet cepstral coefficients are used as the model input. Experimental results on the TIMIT dataset show that the noise resistance and recognition success rate of the proposed method improve to varying degrees compared with traditional models.
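The feature pipeline described in the abstract (wavelet packet decomposition in place of the FFT, followed by cepstral mean and variance normalization and delta/delta-delta coefficients) can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the Haar wavelet, the decomposition level, the frame length and hop, the number of coefficients, the order of normalization and delta computation, and all function names are assumptions chosen for illustration, since the abstract does not specify them.

```python
import numpy as np

def haar_wavelet_packet(frame, level):
    """Full wavelet packet tree with the Haar wavelet (an assumed basis).
    Returns 2**level subbands in natural order, not strict frequency order."""
    nodes = [np.asarray(frame, dtype=float)]
    for _ in range(level):
        nxt = []
        for x in nodes:
            even, odd = x[0::2], x[1::2]
            nxt.append((even + odd) / np.sqrt(2.0))  # low-pass branch
            nxt.append((even - odd) / np.sqrt(2.0))  # high-pass branch
        nodes = nxt
    return nodes

def dct_ii(x, n_coef):
    """Unnormalized DCT-II, keeping the first n_coef coefficients."""
    n = len(x)
    k = np.arange(n_coef)[:, None]
    m = np.arange(n)[None, :]
    return np.cos(np.pi * k * (2 * m + 1) / (2 * n)) @ x

def wpcc_frame(frame, level, n_coef):
    """Log subband energies followed by a DCT, mirroring the MFCC recipe."""
    energies = np.array([np.sum(b ** 2) for b in haar_wavelet_packet(frame, level)])
    return dct_ii(np.log(energies + 1e-10), n_coef)

def cmvn(feats):
    """Cepstral mean and variance normalization over one utterance."""
    return (feats - feats.mean(axis=0)) / (feats.std(axis=0) + 1e-10)

def delta(feats, width=2):
    """Regression-based delta coefficients with edge padding."""
    t = feats.shape[0]
    padded = np.pad(feats, ((width, width), (0, 0)), mode="edge")
    denom = 2 * sum(n * n for n in range(1, width + 1))
    out = np.zeros_like(feats, dtype=float)
    for n in range(1, width + 1):
        out += n * (padded[width + n: width + n + t] - padded[width - n: width - n + t])
    return out / denom

def wpcc_features(signal, frame_len=512, hop=256, level=4, n_coef=12):
    """Static WPCC + delta + delta-delta; frame_len must be divisible by 2**level."""
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = [signal[i * hop: i * hop + frame_len] for i in range(n_frames)]
    static = cmvn(np.stack([wpcc_frame(f, level, n_coef) for f in frames]))
    d1 = delta(static)
    d2 = delta(d1)
    return np.hstack([static, d1, d2])
```

Calling `wpcc_features` on a one-dimensional waveform returns one row per frame with `3 * n_coef` columns, which is the kind of matrix that would be fed to a speaker embedding network such as ECAPA-TDNN. A practical system would likely use a smoother wavelet and a tuned decomposition tree rather than the Haar basis assumed here.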
Authors
ZHANG Zhiwei (张志伟)
YANG Kelin (杨可林)
FENG Zhichang (冯志常)
WANG Tianyu (王天俣)
State Grid Heze Power Supply Company, Heze 274002, China
Source
Shandong Electric Power (《山东电力技术》)
2023, No. 2, pp. 52-57 (6 pages)
Funding
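Science and Technology Project of State Grid Shandong Electric Power Company, "Research on the Application of Voiceprint-Based Identity Verification in Dispatching Command Calls" (5206002000UD).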
Keywords
electric power dispatching
speaker verification
wavelet packet decomposition
wavelet packet cepstral coefficient
ECAPA-TDNN