Abstract
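This paper proposes a method for computing the degree of relatedness between knowledge domains using Baidu Baike, an open Chinese-language encyclopedia. After extracting the explanations and category information of Baidu Baike entries and segmenting the text into words, the entries in each category are quantified with the vector space model (VSM). The weight of each entry within a single domain is then computed by iterating over the relation matrix of the entries in that domain. To compute the relevance of two domains, the weight of every entry in each domain is calculated first. The two concept spaces are then extended into one common vector space, in which the relevance of the two domains is computed as the cosine of the angle between their vectors. The results can help discover associations between domains and accelerate the integration of knowledge across different domains.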
As modern society grows more diversified, interdisciplinary research has become an inherent need of this irreversible trend. However, there are thousands of well-developed disciplines, and opportunities to integrate different domains can currently be identified only by specialists working separately within each domain, because no one can master the knowledge of every domain. An algorithm is therefore needed to calculate the relevance between two domains. Such a method can identify which domains are most relevant to a specified domain, making it possible to start cross-domain research or to establish a new discipline.
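The pipeline summarized in the abstract (VSM vectors for entries, entry weights from an iterated relation matrix, extension to a common vector space, cosine relevance) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the abstract does not give the exact iteration scheme, so a normalized power iteration over pairwise cosine similarities is assumed here, and the function names `cosine`, `entry_weights`, and `domain_relevance` are hypothetical. Entry texts are assumed to be already segmented into word counts.

```python
import math


def cosine(u, v):
    """Cosine similarity of two sparse vectors stored as dicts."""
    dot = sum(w * v.get(k, 0.0) for k, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def entry_weights(entries, iterations=50):
    """Weight each entry of one domain.

    entries maps an entry name to its word-count vector (a dict).
    Assumption: weights come from a normalized power iteration on the
    matrix of pairwise cosine similarities between entries.
    """
    names = sorted(entries)
    n = len(names)
    if n == 0:
        return {}
    matrix = [[cosine(entries[a], entries[b]) for b in names] for a in names]
    weights = [1.0 / n] * n
    for _ in range(iterations):
        updated = [sum(matrix[i][j] * weights[j] for j in range(n))
                   for i in range(n)]
        total = sum(updated)
        if total == 0:
            break
        weights = [x / total for x in updated]
    return dict(zip(names, weights))


def domain_relevance(domain_a, domain_b):
    """Cosine relevance of two domains in their common entry space."""
    wa = entry_weights(domain_a)
    wb = entry_weights(domain_b)
    # Extend both weight vectors to the union of entry names (zero-padded).
    shared = set(wa) | set(wb)
    va = {k: wa.get(k, 0.0) for k in shared}
    vb = {k: wb.get(k, 0.0) for k in shared}
    return cosine(va, vb)
```

In this sketch two domains are related only through entries that appear in both of them, which matches the idea of extending each domain's concept space to a shared one. Other choices of axes for the common space are equally compatible with the abstract.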
Source
《广西师范大学学报(自然科学版)》
CAS
Peking University Core Journals (北大核心)
2011, No. 4, pp. 28-34 (7 pages)
Journal of Guangxi Normal University:Natural Science Edition
Funding
National Natural Science Foundation of China funded project (70871115)
Planning fund project "Research on Legal Information Metadata and Its Semantic Retrieval" (08JA820039)