
Gaussian Mixture Model of Network Traffic Based on Clustering Analysis (cited 2 times)
Abstract: Clustering algorithms group data objects by considering several of their attributes jointly. Building on this property, the paper studies the Gaussian mixture model (GMM) of network traffic and its log-normal distribution at the flow scale. The EM algorithm is used to classify network traffic with interactive features; a comparison with the K-means algorithm shows that EM is better suited to traffic clustering; and cluster analysis of both balanced and unbalanced traffic shows that GMM modeling is effective for different types of traffic. The paper also studies the power-law relationship of traffic and its transitivity across scales: once user behavior and application characteristics are decomposed and passed down to the IP layer through the transport-layer control protocols, the traffic exhibits fractal and self-similar behavior at the packet scale and a log-normal distribution at the flow scale.
Authors: Cheng Hua (程华), Fang Yiquan (房一泉)
Source: Journal of East China University of Science and Technology (Natural Science Edition) (《华东理工大学学报(自然科学版)》), indexed in CAS, CSCD, and the Peking University Core Journals list; 2010, No. 2, pp. 255-260 (6 pages)
Keywords: Gaussian mixture model; EM algorithm; clustering; log-normal distribution; power law
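
The idea summarized in the abstract, fitting a Gaussian mixture with EM to traffic whose flow-scale sizes are log-normal, can be illustrated with a small sketch. This is not the authors' implementation: the data are synthetic, the flow-size parameters are made-up values for an unbalanced two-class mix, and the helper names `normal_pdf` and `fit_gmm_em` are hypothetical. It relies only on the fact that a log-normal variable becomes Gaussian after taking logarithms, so a 1-D GMM is fitted in the log domain.

```python
import math
import random


def normal_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def fit_gmm_em(data, k=2, iters=100):
    """Fit a k-component 1-D Gaussian mixture with plain EM.

    Returns (weights, means, variances), ordered by increasing mean.
    """
    n = len(data)
    lo, hi = min(data), max(data)
    # Deterministic initialisation: means spread evenly over the data range,
    # every component starting from the overall variance and equal weight.
    means = [lo + (j + 0.5) * (hi - lo) / k for j in range(k)]
    overall_mean = sum(data) / n
    overall_var = sum((x - overall_mean) ** 2 for x in data) / n
    variances = [overall_var] * k
    weights = [1.0 / k] * k

    for _ in range(iters):
        # E-step: responsibility of each component for each point.
        resp = []
        for x in data:
            p = [weights[j] * normal_pdf(x, means[j], variances[j]) for j in range(k)]
            total = sum(p) or 1e-300  # guard against numerical underflow
            resp.append([pj / total for pj in p])
        # M-step: re-estimate weights, means and variances.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            variances[j] = max(var, 1e-6)  # floor keeps a component from collapsing

    order = sorted(range(k), key=lambda j: means[j])
    return ([weights[j] for j in order],
            [means[j] for j in order],
            [variances[j] for j in order])


# Synthetic, unbalanced "flow sizes" (assumed values, not from the paper):
# 90% small interactive-style flows and 10% large bulk-style flows.
rng = random.Random(0)
flows = [rng.lognormvariate(6.0, 0.5) for _ in range(1800)]
flows += [rng.lognormvariate(12.0, 0.8) for _ in range(200)]

# Log-normal sizes are Gaussian in the log domain, so fit the mixture there.
log_flows = [math.log(s) for s in flows]
weights, means, variances = fit_gmm_em(log_flows, k=2)
print("weights:", weights)
print("means:", means)
print("variances:", variances)
```

Unlike K-means, which assigns each flow to exactly one cluster, the E-step here keeps soft responsibilities and the M-step estimates a separate weight and variance per component, which is one plausible reason a mixture model copes with unbalanced classes of different spread.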

References (12)

  • 1. Moore A W, Zuev D. Internet traffic classification using Bayesian analysis techniques [J]. ACM SIGMETRICS Performance Evaluation Review, 2005, 33(1): 50-60.
  • 2. Erman J, Mahanti A, Arlitt M. Traffic classification using clustering algorithms [C] // The 2006 SIGCOMM Workshop on Mining Network Data. USA: ACM Press, 2006: 281-286.
  • 3. McGregor A, Hall M, Lorier P, et al. Flow clustering using machine learning techniques [C] // The Fifth International Workshop in PAM 2004. France: Springer, 2004: 205-214.
  • 4. Erman J, Mahanti A, Arlitt M. Internet traffic identification using machine learning [C] // The 49th IEEE Global Telecommunications Conference (GLOBECOM 2006). USA: IEEE Computer Society, 2006: 1-6.
  • 5. Zhang Jingshun, Li Mingchu. 基于统计学的网络业务流建模及分类研究 [Research on statistics-based modeling and classification of network traffic flows] [EB/OL]. [2009-4-20]. http://www.paper.edu.cn/down-loadpaper.php?seriaLnumber=200811-288.
  • 6. Tan Pangning, Steinbach M, Kumar V. 数据挖掘导论 [Introduction to Data Mining] [M]. Fan Ming, Fan Hongjian, trans. Beijing: Posts & Telecom Press, 2006: 241-327.
  • 7. Antoniou I, Ivanov V V, Ivanov Valery V, et al. On the lognormal distribution of network traffic [J]. Physica D: Nonlinear Phenomena, 2002, 167(1-2): 72-85.
  • 8. van de Meent R, Mandjes M R H, Pras A. Gaussian traffic everywhere? [C] // IEEE International Conference on Communications. Turkey: IEEE Computer Society, 2006: 573-578.
  • 9. Lin Zhiyong, Hao Zhifeng, Yang Xiaowei. 不平衡数据分类的研究现状 [Research status of imbalanced data classification] [J]. Application Research of Computers (计算机应用研究), 2008, 25(2): 332-336. (cited 46 times)
  • 10. Schwardt Ludwig. Gaussian mixture models [EB/OL]. [2009-4-20]. http://staff.ee.sun.ac.za/-dupreez/pr813/lectures/lecture06/pr813_lecture06.pdf.

