Abstract
Query expansion adds related words or phrases to the original query terms in order to overcome the ambiguity of natural language and to describe the user's query intent more accurately. Expanding query terms in a concept semantic space makes it possible to fully exploit the degree of association among the query terms and to capture the query intent as a whole. The proposed algorithm uses the context and similarity relations in the WordNet semantic dictionary to build a semantic tree for each original query term, then traces these trees upward to a common root node to construct a complete concept semantic space. Words in the expansion source are filtered using co-occurrence information as the feature parameter, which avoids the query drift caused by over-expansion. A dynamic observation window weighting model is also introduced to strengthen how well co-occurrence information represents the association between pairs of words. Experimental results show that the expansion quality of the algorithm is clearly better than that of the traditional pseudo-relevance feedback algorithm.
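The co-occurrence filtering step can be illustrated with a minimal sketch. The abstract does not give the formula of the dynamic observation window weighting model, so everything below is an assumption made for illustration: a fixed window size, a 1/d weight for two words that are d positions apart, and a simple threshold on the summed association with the query terms. The function names `weighted_cooccurrence` and `filter_candidates` are hypothetical and do not come from the paper.

```python
from collections import defaultdict


def weighted_cooccurrence(tokens, window=4):
    """Accumulate distance-weighted co-occurrence scores for unordered word pairs.

    Assumption (not from the paper): two different words that are d positions
    apart, with 1 <= d <= window, contribute 1/d to the score of their pair.
    """
    scores = defaultdict(float)
    for i, left in enumerate(tokens):
        for d in range(1, window + 1):
            j = i + d
            if j >= len(tokens):
                break
            right = tokens[j]
            if left != right:
                scores[frozenset((left, right))] += 1.0 / d
    return scores


def filter_candidates(query_terms, candidates, scores, threshold=1.0):
    """Keep candidates whose summed association with the query terms reaches the threshold."""
    kept = []
    for cand in candidates:
        total = sum(scores.get(frozenset((cand, q)), 0.0) for q in query_terms)
        if total >= threshold:
            kept.append(cand)
    return kept


# Toy corpus; in practice the tokens would come from the document collection.
corpus = "the river bank flooded after the bank of the river eroded".split()
scores = weighted_cooccurrence(corpus, window=3)

# Hypothetical expansion candidates for the query term "bank".
expanded = filter_candidates(["bank"], ["river", "flooded", "money"], scores, threshold=1.0)
print(expanded)
```

In this toy run, "money" never appears near "bank" and is dropped, which is the kind of pruning meant to limit query drift. A dynamic window would presumably vary the window size or weights with context; that part is left out because the abstract does not specify it.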
Source
Computer Engineering and Applications (《计算机工程与应用》)
CSCD
2012, No. 35, pp. 106-109, 193 (5 pages)
Funding
Henan Province Science and Technology Research Project (No. 102102210159)
Keywords
query expansion
pseudo-relevance feedback
semantic space
observation window
weighting
Mean Reciprocal Rank (MRR)