Abstract
The method used to weight feature terms in a text representation determines which text features are extracted, and therefore strongly affects the accuracy of text categorization. This paper systematically reviews several commonly used feature-weighting methods and compares and analyzes each in turn. On that basis it proposes a better-performing method, the position-based weighting function (据位定权函数). Experiments confirm that this function effectively improves the accuracy of text categorization.
Source
Journal of Gansu Sciences (《甘肃科学学报》)
2005, No. 3, pp. 86-89 (4 pages)
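Funding
"Chunhui Plan" of the Ministry of Education (20455)
Gansu Province Key Science and Technology Program project (ZGS045-352-009)
Open Fund of the Key Laboratory of Opto-Electronic Technology and Intelligent Control, Ministry of Education (Lanzhou Jiaotong University) (K040103)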
Keywords
text categorization
feature term
feature weight determination