Abstract
Feature selection is a key technique in automatic text categorization. This paper proposes a new text feature selection method based on the quantum genetic algorithm. The method encodes text vectors with quantum bits and updates chromosomes with the quantum rotation gate and the quantum NOT gate. In addition, the fitness function is improved to suit the characteristics of information filtering, taking full account of feature weight, text similarity, and vector dimension. Experiments show that the method greatly reduces the dimension of the text vector and improves classification accuracy.
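The abstract names the building blocks (qubit encoding, rotation-gate and NOT-gate updates, a fitness function that rewards feature weight and penalizes dimension) but does not give formulas. The following is a minimal Python sketch of how such a loop might look, not the paper's implementation. All function names, the angle-based qubit representation, the rotation step `delta`, the mutation rate, and the fitness formula are assumptions made for illustration; the text-similarity term mentioned in the abstract is left out because its definition is not given here.

```python
import math
import random

def init_population(n_chrom, n_features):
    # Each qubit is stored as an angle theta, with alpha = cos(theta) and
    # beta = sin(theta). theta = pi/4 makes 0 and 1 equally likely.
    return [[math.pi / 4] * n_features for _ in range(n_chrom)]

def observe(chrom, rng):
    # Collapse each qubit to a bit; P(bit = 1) = sin(theta)^2.
    return [1 if rng.random() < math.sin(t) ** 2 else 0 for t in chrom]

def fitness(bits, weights, penalty=0.5):
    # Hypothetical fitness (assumption): share of total feature weight kept,
    # minus a penalty proportional to the fraction of features selected.
    kept = sum(w for b, w in zip(bits, weights) if b)
    return kept / sum(weights) - penalty * sum(bits) / len(bits)

def rotate(chrom, bits, best_bits, delta=0.05 * math.pi):
    # Rotation gate: where the observed bit differs from the best solution,
    # turn the angle toward the best bit, then clamp to [0, pi/2].
    out = []
    for t, b, g in zip(chrom, bits, best_bits):
        if b != g:
            t += delta if g == 1 else -delta
        out.append(min(max(t, 0.0), math.pi / 2))
    return out

def not_gate(chrom, rate, rng):
    # NOT gate used as mutation: swap alpha and beta, i.e. theta -> pi/2 - theta.
    return [math.pi / 2 - t if rng.random() < rate else t for t in chrom]

def select_features(weights, n_chrom=10, generations=60, seed=0):
    rng = random.Random(seed)
    pop = init_population(n_chrom, len(weights))
    best_bits, best_fit = None, -math.inf
    for _ in range(generations):
        observed = [observe(c, rng) for c in pop]
        # Track the best bit string seen so far.
        for bits in observed:
            f = fitness(bits, weights)
            if f > best_fit:
                best_bits, best_fit = bits, f
        # Update every chromosome: rotate toward the best, then mutate.
        pop = [not_gate(rotate(c, b, best_bits), 0.01, rng)
               for c, b in zip(pop, observed)]
    return best_bits, best_fit
```

Calling `select_features([0.9, 0.8, 0.05, 0.02, 0.7, 0.01])` returns a 0/1 mask over the six features together with its fitness. Because the search is stochastic, the mask found depends on the seed and the parameter choices.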
Source
《计算机工程与应用》
CSCD
Peking University Core Journal
2008, Issue 25, pp. 140-142, 154 (4 pages)
Computer Engineering and Applications
Funding
Natural Science Foundation of Shandong Province, No. Y2006G20
Keywords
text categorization
feature selection
quantum genetic algorithm