
Audio Segmentation of Broadcast Speech (cited by: 11)

Broadcasting Segmentation
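Abstract: The broadcast television news segmentation system described in this paper has three parts: segmentation, classification, and clustering. The segmentation part uses the algorithm proposed in this paper, which is based on detecting the trend of entropy change, to find the acoustic change points of a continuous audio signal and thereby separate audio of different kinds. Unlike traditional change-point detection methods, which require a threshold, this method examines every candidate split point inside a window of fixed length and detects an acoustic change point from the trend in the entropy of the two segments that the candidate point produces. This avoids segmentation errors caused by a poorly chosen threshold. The classification part uses a conventional Gaussian classifier based on Gaussian mixture models (GMM), and the clustering part uses a speaker clustering algorithm based on vector quantization (VQ). Applied to three 30-minute news broadcasts, the system segmented the continuous audio successfully, removed all background music, and grouped the speech of each speaker into a single class with high accuracy.

Speaker change point detection based on the BIC criterion is the most widely used method for speaker change detection in broadcast segmentation. Although its authors assert that the method is free of thresholds, the requirement that the BIC value at a change point be above 0 is too strict for some short utterances. Because speakers differ from one another, the BIC values for pairs of different speakers were spread over a large range in our tests. In this paper, a speaker change detection method based on the entropy change trend is used to locate the change point in a sliding window of fixed length. The entropy change trend is tested for every hypothesized speaker change point in the window. Detecting this trend avoids the threshold, which makes the proposed method suitable for detecting different kinds of speaker change, including speaker changes in short utterances.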
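The abstract does not give the exact trend test, so the following is only a minimal sketch of the general idea under stated assumptions. It assumes a one-dimensional feature stream (a real system would more likely use multivariate cepstral features), models each segment as a single Gaussian, and treats a window as containing a change point when the length-weighted entropy of the two segments reaches its minimum at an interior candidate split. The names `gaussian_entropy`, `split_entropy_curve`, `detect_change`, and the `margin` parameter are hypothetical and do not come from the paper.

```python
import math
import random


def gaussian_entropy(xs):
    """Differential entropy (in nats) of a 1-D Gaussian fitted to xs."""
    n = len(xs)
    mean = sum(xs) / n
    var = sum((x - mean) ** 2 for x in xs) / n
    var = max(var, 1e-12)  # guard against log(0) on constant segments
    return 0.5 * math.log(2 * math.pi * math.e * var)


def split_entropy_curve(window, margin):
    """Length-weighted entropy of the two segments for each candidate split.

    Candidate splits closer than `margin` samples to either edge are skipped,
    because a variance estimated from very few samples is unreliable.
    """
    n = len(window)
    curve = []
    for t in range(margin, n - margin + 1):
        left, right = window[:t], window[t:]
        h = (t * gaussian_entropy(left) + (n - t) * gaussian_entropy(right)) / n
        curve.append((t, h))
    return curve


def detect_change(window, margin=10):
    """Return the split index where the entropy curve has an interior minimum.

    Assumed rule (not taken from the paper): if the minimum sits on the first
    or last candidate, the curve has no falling-then-rising trend inside the
    window, so no change point is reported and None is returned.
    """
    curve = split_entropy_curve(window, margin)
    best = min(range(len(curve)), key=lambda i: curve[i][1])
    if best == 0 or best == len(curve) - 1:
        return None
    return curve[best][0]


if __name__ == "__main__":
    random.seed(0)
    # Synthetic window: 100 samples from one source, then 100 from another.
    signal = [random.gauss(0, 1) for _ in range(100)]
    signal += [random.gauss(8, 1) for _ in range(100)]
    print("estimated change point:", detect_change(signal))
```

No decision threshold appears in this sketch; the decision depends only on where the minimum of the curve falls. On noisy homogeneous data this simplified rule can still report spurious interior minima, so it should be read as an illustration of the threshold-free idea rather than as the paper's detector.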
Source: Journal of Chinese Information Processing (《中文信息学报》), CSCD, Peking University Core Journal (北大核心), 2002, No. 1, pp. 37-42 (6 pages)
Funding: Key Project of the National Natural Science Foundation of China (69835003); National "973" Program (G1998030504)
Keywords: broadcast speech; audio segmentation; acoustic change point detection; BIC criterion; entropy change trend; speech processing; broadcasting segmentation; speaker change detection

