Hybrid approach for audio segmentation based on GLR distance and BIC (cited by: 3)

Abstract: Traditional audio segmentation algorithms that rely on a single criterion produce too many redundant segmentation points. To address this, the paper studies a sequential segmentation algorithm for broadcast audio that combines the generalized likelihood ratio (GLR) with the Bayesian information criterion (BIC). It proposes a criterion for identifying the potential region of a candidate change point, gives a method for detecting the change point within that region, and finally validates the detected change points. Experimental results show that, compared with traditional audio segmentation algorithms, the overall performance of the proposed algorithm is substantially improved and it achieves good segmentation results.
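The abstract does not give formulas, so the following is only a minimal sketch of the two quantities it names, in their commonly used single-Gaussian form, followed by a simple detect-then-validate loop. The function names, the window and step sizes, the penalty weight `lam`, and the "local maximum of the GLR curve" rule used to pick candidates are illustrative assumptions; in particular, the local-maximum rule stands in for the paper's potential-region criterion, which is not described here.

```python
import numpy as np

def _logdet_cov(x):
    """Log-determinant of the ML covariance of the rows of x (lightly regularised)."""
    d = x.shape[1]
    cov = np.cov(x, rowvar=False, bias=True).reshape(d, d) + 1e-6 * np.eye(d)
    _, logdet = np.linalg.slogdet(cov)
    return logdet

def glr_distance(x, y):
    """GLR distance between two segments, each modelled by one full-covariance Gaussian."""
    z = np.vstack([x, y])
    return 0.5 * (len(z) * _logdet_cov(z)
                  - len(x) * _logdet_cov(x)
                  - len(y) * _logdet_cov(y))

def delta_bic(x, y, lam=1.0):
    """GLR distance minus a model-complexity penalty; a positive value supports a change."""
    n, d = len(x) + len(y), x.shape[1]
    penalty = 0.5 * lam * (d + 0.5 * d * (d + 1)) * np.log(n)
    return glr_distance(x, y) - penalty

def detect_change_points(feats, win=100, step=10, lam=1.0):
    """Assumed rule: local maxima of the GLR curve are candidates; BIC validates them."""
    centers = list(range(win, len(feats) - win + 1, step))
    glr = [glr_distance(feats[c - win:c], feats[c:c + win]) for c in centers]
    accepted = []
    for i in range(1, len(centers) - 1):
        if glr[i] > glr[i - 1] and glr[i] >= glr[i + 1]:
            c = centers[i]
            # Validation step: discard candidates the BIC penalty does not support.
            if delta_bic(feats[c - win:c], feats[c:c + win], lam) > 0:
                accepted.append(c)
    return accepted
```

Here `feats` is assumed to be a frames-by-dimensions array of acoustic features (for example MFCCs), and the returned values are frame indices. The GLR pass is sensitive and tends to over-generate candidates, while the BIC penalty prunes those that a single Gaussian explains almost as well, which is the general motivation for combining the two measures.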

Authors: Zheng Jiming (郑继明), Yu Jia (俞佳)

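Affiliations: Institute of Applied Mathematics, Chongqing University of Posts and Telecommunications; College of Computer Science and Technology, Chongqing University of Posts and Telecommunications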

Source: Computer Engineering and Design (《计算机工程与设计》), indexed in CSCD and the Peking University Core Journals list (北大核心), 2009, No. 13, pp. 3120-3123 (4 pages)

Funding: Science and Technology Research Fund Project of the Chongqing Municipal Education Commission (KJ080524)

Keywords: broadcast audio; segmentation; generalized likelihood ratio (GLR); Bayesian information criterion (BIC); acoustic features; change points; validation

Classification: TP391 [Automation and Computer Technology: Computer Application Technology]

References (8)

1 Zhou Bowen, Hansen John H L. Efficient audio stream segmentation via the combined T^2 statistic and Bayesian information criterion [J]. IEEE Transactions on Speech and Audio Processing, 2005, 13(4): 467-474.
2 Nishida Masafumi, Kawahara Tatsuya. Speaker model selection based on the Bayesian information criterion applied to unsupervised speaker indexing [J]. IEEE Transactions on Speech and Audio Processing, 2005, 13(4): 583-592.
3 Zhou Bowen, Hansen John H L. Unsupervised audio stream segmentation and clustering via Bayesian information criterion [C]. Proceedings of the International Conference of Spoken Language Processing, 2000: 714-717.
4 Zhang Shilei, Zhang Shuwu, Xu Bo. A two-level method for unsupervised speaker-based audio segmentation [C]. Proceedings of the 18th International Conference on Pattern Recognition, 2006: 298-301.
5 Gangadharaiah Rashmi, Narayanaswamy B, Balakrishnan N. A novel method for two-speaker segmentation [C]. Proceedings of the 8th International Conference on Spoken Language, 2004: 2337-2340.
6 Lu Jian, Mao Bing, Sun Zhengxing, Zhang Fuyan. An improved speaker-based speech segmentation algorithm [J]. Journal of Software (软件学报), 2002, 13(2): 274-279. Cited by: 17
7 Cheng Shi-sian, Wang Hsin-min. METRIC-SEQDAC: a hybrid approach for audio segmentation [C]. Proc of the International Conference of Spoken Language Processing, 2004: 1617-1620.
8 Cheng Shi-sian, Wang Hsin-min. A sequential metric-based audio segmentation method via the Bayesian information criterion [C]. Proceedings of Euro Speech, 2003: 945-948.
