Abstract
Internet traffic classification is the process of identifying network applications and classifying the corresponding traffic, and it is regarded as the most basic function of modern network management and security systems. Application-related traffic classification is a foundational technology for network security. Traditional traffic classification methods include port-based prediction and payload-based deep inspection. In the current network environment, these traditional methods face practical problems such as dynamic ports and encrypted applications, so Machine Learning (ML) techniques based on statistical features of traffic are used to classify and identify traffic instead. Machine learning can perform a centralized, automatic search over the provided traffic data and describe useful structural patterns, which helps classify traffic intelligently. Initially, the Naive Bayes method was used to identify and classify network traffic; in experiments it performed well on specific kinds of traffic, with accuracy above 90%, but reached only about 50% accuracy on traffic such as peer-to-peer (P2P) traffic. Later, methods such as Support Vector Machine (SVM) and Neural Network (NN) were applied, and the neural network method raised the classification accuracy for overall network traffic to above 80%. A number of studies show that applying a variety of machine learning methods, together with subsequent improvements to them, has substantially improved the accuracy of traffic classification.
Authors
ZOU Tengkuan (邹腾宽)
WANG Yuying (汪钰颖)
WU Chengrong (吴承荣)
Affiliations: School of Computer Science, Fudan University, Shanghai 200433, China; Engineering Research Center of Cyber Security Auditing and Monitoring, Ministry of Education, Shanghai 200433, China
Source
Journal of Computer Applications (《计算机应用》)
CSCD
Peking University Core Journals (北大核心)
2019, Issue 3, pp. 802-811 (10 pages)
Funding
National Key Research and Development Program of China (2017YFB0803203)
Keywords
traffic classification
background traffic
Machine Learning (ML)
Deep Packet Inspection (DPI) technology
classification based on behavior patterns