
Feature selection using new document frequency and improved Tabu search
Abstract Feature selection is a core problem in text categorization. This paper first presents a document frequency method based on minimum word frequency. It then introduces rough sets (RS) and Tabu search, analyzes the problems that arise when Tabu search is applied to attribute reduction, and gives corresponding solutions. On this basis, an attribute reduction method based on an improved Tabu search is designed in detail. Finally, the two methods are combined into a comprehensive feature selection method, which first uses the document frequency method based on minimum word frequency to extract the initial features and then applies the proposed attribute reduction method to eliminate redundancy, yielding a more representative feature subset. Experimental results show that the comprehensive method outperforms the IG, CHI and MI methods.
Authors Zhu Haodong (朱颢东), Zhong Yong (钟勇)
Source Journal of Huazhong University of Science and Technology (Natural Science Edition), indexed in EI, CAS, CSCD and the Peking University Core Journals list, 2010, No. 2, pp. 4-7, 40 (5 pages in total)
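Funding Sichuan Province Science and Technology Plan Project (2008GZ0003); Sichuan Province Key Science and Technology Research Project (07GG006-019)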
Keywords feature selection; text categorization; document frequency; Tabu search; attribute reduction
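
The two stages named in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the paper's exact definitions, so everything here is an assumption made for illustration: a term is counted toward a document's frequency only if it occurs at least `min_tf` times in that document, redundancy is measured with the standard rough-set dependency degree, and the Tabu search flips one attribute per move with a fixed tenure and a simple aspiration rule. The function names and parameters (`min_tf_document_frequency`, `dependency`, `tabu_reduct`, `min_tf`, `tenure`, `iters`) are hypothetical and are not taken from the paper.

```python
from collections import Counter


def min_tf_document_frequency(docs, min_tf=2):
    """Count, per term, the documents in which it occurs at least min_tf times."""
    df = Counter()
    for tokens in docs:
        tf = Counter(tokens)
        df.update(term for term, n in tf.items() if n >= min_tf)
    return df


def dependency(rows, labels, attrs):
    """Rough-set dependency: share of rows whose attribute values fix the label."""
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(tuple(row[a] for a in attrs), []).append(label)
    consistent = sum(len(ys) for ys in groups.values() if len(set(ys)) == 1)
    return consistent / len(rows)


def tabu_reduct(rows, labels, n_attrs, iters=50, tenure=2):
    """Search for a small attribute subset that keeps the full dependency."""
    def score(subset):
        # Higher dependency first, then fewer attributes.
        return (dependency(rows, labels, sorted(subset)), -len(subset))

    current = best = frozenset(range(n_attrs))
    tabu_until = {}  # attribute -> last iteration at which flipping it is tabu
    for it in range(iters):
        candidates = []
        for a in range(n_attrs):
            neighbor = current ^ {a}  # add or drop one attribute
            if not neighbor:
                continue
            is_tabu = tabu_until.get(a, -1) >= it
            # Aspiration: a tabu move is allowed only if it beats the best so far.
            if is_tabu and score(neighbor) <= score(best):
                continue
            candidates.append((score(neighbor), -a, neighbor))
        if not candidates:
            break
        _, neg_a, current = max(candidates, key=lambda c: c[:2])
        tabu_until[-neg_a] = it + tenure
        if score(current) > score(best):
            best = current
    return sorted(best)
```

In a pipeline of this shape, terms whose filtered document frequency passes a chosen threshold would form the initial feature set, each document would become a row of discretized term indicators, and `tabu_reduct` would then prune attributes that do not change how well the rows determine the class label.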