On High Order Path Naive Bayes Text Classification Based on Nesterov Smoothing (cited by: 2)
Abstract: To improve the classification of electronic texts and to address the parameter-estimation problem that independent and identically distributed (i.i.d.) models face when labeled data are scarce, a high-order path naive Bayes text classification algorithm based on Nesterov smoothing is proposed. First, a high-order path text classification model is built on the conventional naive Bayes event model, and the implicit link information in high-order paths is used to improve classification performance. Second, because the second-order difference process of the Laplace smoothing used in the naive Bayes event model tends to lose information and amplify noise, an improved high-order path naive Bayes text classification algorithm based on Nesterov smoothing is proposed. Finally, experiments on benchmark datasets and on library electronic texts verify the effectiveness of the proposed algorithm.
Authors: DENG Guang-biao; HUANG Zheng-gong; YUE Xiao-guang (School of Mathematics and Computer Sciences, Guangxi Normal University for Nationalities, Chongzuo, Guangxi 532200, China; Department of Engineering Management, Wuhan University, Wuhan 430070, China)
Source: Journal of Southwest China Normal University (Natural Science Edition), indexed in CAS and the Peking University Core Journals list, 2018, No. 7, pp. 107-112 (6 pages)
Funding: 2015 Guangxi Higher Education Institutions Science and Technology Research Project (KY2015LX539)
Keywords: text classification; Nesterov smoothing; high-order path; naive Bayes; library text
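For orientation, the conventional baseline that the abstract contrasts against, a multinomial naive Bayes event model with Laplace (add-one) smoothing, can be sketched as follows. This is not the paper's Nesterov-smoothed high-order path algorithm, whose details are not given in this record. The function names, the `alpha` parameter, and the toy documents are illustrative assumptions.

```python
import math
from collections import Counter, defaultdict


def train_multinomial_nb(docs, labels, alpha=1.0):
    """Fit a multinomial naive Bayes model with additive (Laplace) smoothing.

    docs   -- list of token lists (hypothetical toy input)
    labels -- list of class labels, one per document
    alpha  -- smoothing constant; alpha=1.0 is classic add-one smoothing
    """
    vocab = sorted({w for doc in docs for w in doc})
    class_doc_counts = Counter(labels)
    word_counts = defaultdict(Counter)
    for doc, label in zip(docs, labels):
        word_counts[label].update(doc)

    # Class priors estimated from document frequencies.
    log_prior = {c: math.log(n / len(docs)) for c, n in class_doc_counts.items()}

    # Smoothed per-class word likelihoods: (count + alpha) / (total + alpha * |V|).
    log_likelihood = {}
    for c in class_doc_counts:
        denom = sum(word_counts[c].values()) + alpha * len(vocab)
        log_likelihood[c] = {
            w: math.log((word_counts[c][w] + alpha) / denom) for w in vocab
        }
    return log_prior, log_likelihood


def predict(model, doc):
    """Return the class with the highest posterior log-score for a token list."""
    log_prior, log_likelihood = model
    scores = {}
    for c in log_prior:
        # Tokens outside the training vocabulary are ignored.
        scores[c] = log_prior[c] + sum(
            log_likelihood[c][w] for w in doc if w in log_likelihood[c]
        )
    return max(scores, key=scores.get)


if __name__ == "__main__":
    docs = [
        ["library", "book", "loan"],
        ["book", "catalog"],
        ["match", "goal", "team"],
        ["team", "score"],
    ]
    labels = ["lib", "lib", "sport", "sport"]
    model = train_multinomial_nb(docs, labels)
    print(predict(model, ["book", "team", "catalog"]))
```

With few labeled documents, most per-class word counts are zero and the smoothing term dominates the estimates. That is the parameter-estimation weakness the abstract attributes to i.i.d. models when labeled data are scarce.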