An Ensemble SVM Approach Integrated with Confidence for Detecting Bookmark Spam
Abstract: Existing bookmark spam detection methods lose accuracy when little user profile information is available. To address this problem, an ensemble SVM approach that incorporates confidence is proposed. First, Bootstrap sampling with replacement is applied to the training samples to obtain a training subset for each individual SVM. Next, a sigmoid function is fitted directly to the standard output of each SVM to obtain a posterior probability, which serves as the confidence of the class output. Finally, a confidence-based fusion method, which performs better than the voting strategy, is proposed to combine the outputs of the individual SVMs. Experimental results show that the proposed approach outperforms existing methods when little user profile information is available.
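The three steps named in the abstract (Bootstrap sampling, sigmoid-based confidence, and confidence-based fusion) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the paper's implementation: SVM training is omitted, the sigmoid parameters `a` and `b` are assumed to have been fitted already, and the fusion rule shown here (averaging the posterior probabilities and thresholding at 0.5) is a hypothetical stand-in, since the abstract does not specify the exact rule. All function names are made up for this sketch.

```python
import math
import random


def bootstrap_subsets(samples, n_subsets, seed=0):
    """Draw n_subsets training subsets by sampling with replacement.

    Each subset has the same size as the original sample list.
    """
    rng = random.Random(seed)
    n = len(samples)
    return [[samples[rng.randrange(n)] for _ in range(n)]
            for _ in range(n_subsets)]


def sigmoid_confidence(f, a, b):
    """Map a raw SVM decision value f to P(spam | f) = 1 / (1 + exp(a*f + b)).

    a and b are assumed to be fitted beforehand; with a < 0, a larger
    decision value gives a higher spam probability.
    """
    return 1.0 / (1.0 + math.exp(a * f + b))


def majority_vote(decision_values):
    """Baseline fusion: each SVM casts one vote (+1 spam, -1 non-spam)."""
    votes = sum(1 if f > 0 else -1 for f in decision_values)
    return 1 if votes > 0 else -1


def confidence_fusion(decision_values, params):
    """Hypothetical fusion rule: average the posteriors, threshold at 0.5."""
    probs = [sigmoid_confidence(f, a, b)
             for f, (a, b) in zip(decision_values, params)]
    mean_p = sum(probs) / len(probs)
    return (1 if mean_p > 0.5 else -1), mean_p


# Three individual SVMs: two weakly say "spam", one strongly says "non-spam".
decision_values = [0.1, 0.2, -3.0]
params = [(-2.0, 0.0)] * 3  # assumed sigmoid parameters, identical for simplicity

vote_label = majority_vote(decision_values)
fused_label, fused_prob = confidence_fusion(decision_values, params)
subsets = bootstrap_subsets(list(range(10)), n_subsets=3)
```

In this toy case the two weak "spam" votes win under majority voting, while the confidence-based rule sides with the single highly confident "non-spam" output. That illustrates why weighting by confidence can differ from plain voting; it says nothing about which rule the paper actually uses.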
Source: Pattern Recognition and Artificial Intelligence (《模式识别与人工智能》), indexed in EI, CSCD, and the PKU Core Journals list; 2011, No. 4, pp. 591-596 (6 pages)
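Funding: Supported by the National Basic Research Program of China (973 Program) (No. 2005CB321902), the Natural Science Foundation of Hebei Province (No. F2008000877, F2011203219), and the Special Research Project on Rapid Sharing of Scientific Papers in the Network Era of the Science and Technology Development Center, Ministry of Education (No. 20091333110011)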
Keywords: Bookmark Spam, Spam Detection, Support Vector Machine, Confidence, Ensemble Learning
References (13)

  • [1] Koutrika G, Effendi F A, Gyongyi Z, et al. Combating Spam in Tagging Systems // Proc of the 3rd International Workshop on Adversarial Information Retrieval on the Web. Banff, Canada, 2007: 57-64.
  • [2] Heymann P, Koutrika G, Garcia-Molina H. Fighting Spam on Social Web Sites: A Survey of Approaches and Future Challenges. IEEE Internet Computing, 2007, 11(6): 36-45.
  • [3] Krause B, Schmitz C, Hotho A, et al. The Anti-Social Tagger: Detecting Spam in Social Bookmarking Systems // Proc of the 4th International Workshop on Adversarial Information Retrieval on the Web. Beijing, China, 2008: 61-68.
  • [4] Kyriakopoulou A, Kalamboukis T. Combining Clustering with Classification for Spam Detection in Social Bookmarking Systems // Proc of the International Workshop at the European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases. Antwerp, Belgium, 2008: 47-54.
  • [5] Markines B, Cattuto C, Menczer F. Social Spam Detection // Proc of the 5th International Workshop on Adversarial Information Retrieval on the Web. Madrid, Spain, 2009: 41-48.
  • [6] Gkanogiannis A, Kalamboukis T. A Novel Supervised Learning Algorithm and Its Use for Spam Detection // Proc of the International Workshop at the European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases. Antwerp, Belgium, 2008: 13-20.
  • [7] Lin H J, Yeh J P. Optimal Reduction of Solutions for Support Vector Machines. Applied Mathematics and Computation, 2009, 214(2): 329-335.
  • [8] Rokach L. Ensemble-Based Classifiers. Artificial Intelligence Review, 2010, 33(1/2): 1-39.
  • [9] Hanczar B, Nadif M. Using the Bagging Approach for Biclustering of Gene Expression Data. Neurocomputing, 2011, 74(10): 1595-1605.
  • [10] Acevedo F J, Maldonado S, Dominguez E, et al. Probabilistic Support Vector Machines for Multi-Class Alcohol Identification. Sensors and Actuators B: Chemical, 2007, 122(1): 227-235.