Abstract
Researchers in text classification have long worked to improve classification accuracy. In our experiments, we found that documents belonging to classes with similar topics have a relatively high misclassification rate. This paper therefore proposes a new feature weighting strategy, called second weight distribution, and constructs a weight function that measures how well each feature term discriminates between a pair of similar, hard-to-distinguish classes, with the aim of reducing classification errors on documents from such classes.
Source
《计算机工程与应用》
CSCD
PKU Core Journal (北大核心)
2004, Issue 13, pp. 185-188 (4 pages)
Computer Engineering and Applications
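Funding
Major Program of the National Natural Science Foundation of China (No. 79990580)
Supported by the National 973 Key Basic Research Development Program of China (No. G1998030414)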