Abstract
To describe accurately the Web pages that a user has visited and shown interest in, this paper analyzes the scope of characteristic extraction and the methods for computing the weights of characteristic words in Web page characteristic description. Building on the "nonlinear weighting method for topic-related words", it proposes an improved method for computing characteristic-word weights. The method takes into account the importance of characteristic words that appear in the title, and it applies a nonlinear function to the frequency of characteristic words, which makes the weight calculation more accurate. Using the improved weighting method increases the accuracy of Web page characteristic description and thereby improves the efficiency of personalized search for the user.
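The abstract does not state the weighting formula, so the following is only a minimal sketch of the general idea under stated assumptions: term frequency is dampened with a nonlinear function (here `log(1 + tf)`, chosen for illustration), and words that also occur in the title receive a multiplicative boost. The function name `term_weights`, the parameter `title_boost`, and its default value are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def term_weights(title_tokens, body_tokens, title_boost=2.0):
    """Illustrative weighting: nonlinear frequency damping plus a title boost.

    Assumptions (not from the paper): log(1 + tf) as the nonlinear
    function, a constant multiplicative boost for title words, and
    normalization by the largest weight.
    """
    # Count occurrences over title and body together.
    tf = Counter(title_tokens) + Counter(body_tokens)
    if not tf:
        return {}

    title_terms = set(title_tokens)
    raw = {}
    for term, count in tf.items():
        # Nonlinear damping: repeated occurrences add less and less weight.
        weight = math.log1p(count)
        # Words appearing in the title are treated as more important.
        if term in title_terms:
            weight *= title_boost
        raw[term] = weight

    # Scale so that the highest-weighted term has weight 1.0.
    top = max(raw.values())
    return {term: weight / top for term, weight in raw.items()}


weights = term_weights(
    title_tokens=["search"],
    body_tokens=["search", "web", "web", "web", "page"],
)
```

In this example "web" occurs three times as often as "page" but receives only about twice its weight, which is the effect of the damping, while "search" ranks highest because of the assumed title boost.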
Source
《计算机工程与应用》
CSCD
PKU Core Journal (北大核心)
2011, No. 11, pp. 94-97 (4 pages)
Computer Engineering and Applications
Keywords
personalized search
Web page characteristics
term weighting
characteristic words
nonlinear function