Abstract
To capture a speaker's pronunciation characteristics, a bionics-inspired method based on spectrogram statistics is proposed: a speaker's short-time spectrograms are linearly superposed to obtain a characteristic spectrogram that represents the speaker's stable pronunciation characteristics. To address the slow network training and low recognition efficiency of speaker recognition systems on resource-constrained devices, an adaptive clustering self-organizing feature map (AC-SOM) algorithm is proposed, based on the traditional self-organizing feature map (SOM) neural network. As the number of speakers to be recognized increases, the number of neurons in the competition layer is increased automatically until the number of clusters equals the number of speakers. The AC-SOM model was applied to cluster and recognize a self-built database of characteristic spectrogram samples from 100 speakers; the maximum training time was only 304 s, and the maximum recognition time for a single sample was less than 28 ms. For the same number of speakers, the method trains and recognizes considerably faster than the other recognition methods it was compared with, which suggests that it can meet the real-time data processing and execution requirements of edge intelligence systems.
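The linear superposition step described in the abstract can be illustrated with a minimal sketch. It assumes that each short-time spectrogram is already available as an equally sized 2-D grid of magnitudes, and that the superposition is an element-wise sum normalized by the number of segments; the abstract specifies neither the weighting nor any normalization, so both are assumptions. The function name `characteristic_spectrogram` is hypothetical and not taken from the paper.

```python
from typing import List

# A spectrogram is modeled as rows of frequency bins by columns of time frames.
Spectrogram = List[List[float]]


def characteristic_spectrogram(segments: List[Spectrogram]) -> Spectrogram:
    """Average equally sized short-time spectrograms element by element.

    Assumption: equal weights and division by the segment count; the
    abstract only states that the superposition is linear.
    """
    if not segments:
        raise ValueError("need at least one short-time spectrogram")
    rows, cols = len(segments[0]), len(segments[0][0])
    for seg in segments:
        if len(seg) != rows or any(len(row) != cols for row in seg):
            raise ValueError("all spectrograms must share one shape")

    # Accumulate the element-wise sum over all segments.
    total = [[0.0] * cols for _ in range(rows)]
    for seg in segments:
        for i in range(rows):
            for j in range(cols):
                total[i][j] += seg[i][j]

    # Normalize so the result stays on the scale of a single segment.
    count = len(segments)
    return [[value / count for value in row] for row in total]
```

Averaging many segments damps content that varies from one utterance to the next and keeps the energy pattern that recurs across them, which is one plausible reading of how a stable representation of the speaker emerges.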
Authors
贾艳洁
陈曦
于洁琼
王连明
JIA Yan-jie; CHEN Xi; YU Jie-qiong; WANG Lian-ming (Institute of Computational Intelligence, School of Physics, Northeast Normal University, Changchun 130024, China)
Source
《科学技术与工程》
Peking University Core Journal
2019, Issue 15, pp. 211-218 (8 pages)
Science Technology and Engineering
Funding
National Natural Science Foundation of China (21227008)
Jilin Province Science and Technology Development Plan Project (20170204035GX)
Keywords
speaker recognition
characteristic spectrogram
adaptive clustering
neural network
statistics
deep learning