Data Type Recognition Method Based on Machine Learning
Abstract This work aims to improve data type identification and, in particular, to address the difficulty current techniques have in recognizing compound files. Based on experiments with 8 common data types, the author first shortlists several classification algorithms, including naive Bayes, and then proposes a multi-aspect parameter selection method based on the support vector machine (SVM). The new data type identification method is compared experimentally with traditional file type identification, and the function analysis method for data type identification is determined. The experiments show that the SVM-based method takes a long time to build its model but achieves a high recognition rate, so it is selected as the machine-learning-based data type identification method to adopt going forward.
Author LI Rui (李锐) (School of Mathematics and Computational Science, Hunan University of Science and Technology, Xiangtan, Hunan 411201, China)
Source Information & Computer (《信息与电脑》), 2021, No. 16, pp. 150-153 (4 pages)
Keywords machine learning; file type; support vector machine algorithm; file fragment
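
The abstract names naive Bayes among the shortlisted classifiers but gives no detail on features or parameters. The sketch below is only an illustration of the general idea of classifying data fragments by content, not the author's method: it assumes byte-frequency histograms as features (an assumption, not stated in the abstract), uses a multinomial naive Bayes classifier rather than the SVM the paper ultimately selects, and trains on tiny synthetic fragments. The names `byte_histogram` and `ByteNaiveBayes` are hypothetical.

```python
import math
from collections import Counter


def byte_histogram(fragment):
    """Count how often each of the 256 byte values occurs in a fragment."""
    counts = Counter(fragment)
    return [counts.get(b, 0) for b in range(256)]


class ByteNaiveBayes:
    """Multinomial naive Bayes over byte counts with Laplace smoothing."""

    def fit(self, fragments, labels):
        per_class = {}                 # label -> summed byte counts
        class_sizes = Counter(labels)  # label -> number of training fragments
        for fragment, label in zip(fragments, labels):
            acc = per_class.setdefault(label, [0] * 256)
            for b, c in enumerate(byte_histogram(fragment)):
                acc[b] += c
        n = len(labels)
        self.log_prior = {lab: math.log(k / n) for lab, k in class_sizes.items()}
        self.log_like = {}
        for lab, acc in per_class.items():
            denom = sum(acc) + 256     # add-one smoothing over 256 byte values
            self.log_like[lab] = [math.log((c + 1) / denom) for c in acc]
        return self

    def predict(self, fragment):
        hist = byte_histogram(fragment)

        def score(lab):
            ll = self.log_like[lab]
            return self.log_prior[lab] + sum(c * ll[b] for b, c in enumerate(hist) if c)

        return max(self.log_prior, key=score)


# Synthetic training fragments (illustrative only, not the paper's data set).
train_fragments = [
    b"the quick brown fox jumps over the lazy dog " * 4,
    b"data type recognition with machine learning " * 4,
    bytes(range(128, 256)) * 2,
    bytes((i * 7 + 3) % 256 for i in range(256)),
]
train_labels = ["text", "text", "binary", "binary"]

model = ByteNaiveBayes().fit(train_fragments, train_labels)
guess_text = model.predict(b"hello world, this is plain text")
guess_binary = model.predict(bytes([200, 201, 250, 130, 255, 129] * 5))
```

A real experiment along the lines of the abstract would use fragments from the 8 data types studied and would compare such a baseline against an SVM in terms of both model-building time and recognition rate.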
