Abstract
A multi-objective model is constructed for the clustering-based splitting of nodes in the inverted-file R-tree (IR-tree), a spatial-text index. The solving process of the non-dominated sorting genetic algorithm III (NSGA-Ⅲ) is improved, and a variant with a prior initial population strategy (PIPS-NSGA-Ⅲ) is proposed that is better suited to the clustering-based splitting of spatial-text objects. PIPS-NSGA-Ⅲ searches for the Pareto-optimal front over several objectives: the overlap and coverage area of the objects' minimum bounding rectangles (MBRs), the average distance between object groups, and semantic similarity. PIPS-NSGA-Ⅲ is compared with the evolutionary multi-objective algorithms NSGA-Ⅱ, NSGA-Ⅲ and SPEA-Ⅱ in terms of object classification time, efficiency, query time and accuracy. Experimental results show that PIPS-NSGA-Ⅲ clusters and splits spatial-text objects efficiently; compared with the simplified traditional R-tree (STR-tree) and R-tree spatial index structures, the spatial-text index based on the improved NSGA-Ⅲ reduces the average query time by 24.8% and improves the average accuracy by 3.75%.
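The abstract names the objectives but not their formulas, so the following is only a minimal sketch of how one candidate two-way split of a node could be scored and compared. Everything in it is an assumption made for illustration: the function names, the use of centroid distance for the distance between groups, Jaccard similarity over keyword sets as the semantic measure, and the sign convention (every objective is written so that smaller is better). It is not the paper's model.

```python
from itertools import combinations
from math import dist


def mbr(points):
    """Minimum bounding rectangle (xmin, ymin, xmax, ymax) of 2-D points."""
    xs, ys = zip(*points)
    return (min(xs), min(ys), max(xs), max(ys))


def area(r):
    return max(0.0, r[2] - r[0]) * max(0.0, r[3] - r[1])


def overlap_area(a, b):
    w = min(a[2], b[2]) - max(a[0], b[0])
    h = min(a[3], b[3]) - max(a[1], b[1])
    return max(0.0, w) * max(0.0, h)


def centroid(points):
    xs, ys = zip(*points)
    return (sum(xs) / len(xs), sum(ys) / len(ys))


def jaccard(a, b):
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def mean_similarity(keyword_sets):
    """Mean pairwise Jaccard similarity; a single-object group counts as 1.0."""
    pairs = list(combinations(keyword_sets, 2))
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs) if pairs else 1.0


def split_objectives(objects, assignment):
    """Score a two-way split; all four returned values are to be minimised.

    objects: list of ((x, y), keyword_set); assignment: list of 0/1 labels.
    """
    groups = [[o for o, g in zip(objects, assignment) if g == k] for k in (0, 1)]
    if not all(groups):
        raise ValueError("each group needs at least one object")
    pts = [[p for p, _ in g] for g in groups]
    rects = [mbr(p) for p in pts]
    overlap = overlap_area(rects[0], rects[1])
    coverage = area(rects[0]) + area(rects[1])
    # Assumed convention: well-separated groups are better, so negate.
    separation = -dist(centroid(pts[0]), centroid(pts[1]))
    # Assumed convention: similar text inside a group is better, so negate.
    similarity = -sum(mean_similarity([kw for _, kw in g]) for g in groups) / 2
    return (overlap, coverage, separation, similarity)


def dominates(u, v):
    """Pareto dominance for minimisation: u is no worse everywhere, better somewhere."""
    return all(a <= b for a, b in zip(u, v)) and any(a < b for a, b in zip(u, v))


objects = [
    ((0, 0), {"cafe"}),
    ((1, 1), {"cafe", "wifi"}),
    ((5, 5), {"park"}),
    ((6, 6), {"park", "lake"}),
]
good = split_objectives(objects, [0, 0, 1, 1])  # groups nearby, similar objects
bad = split_objectives(objects, [0, 1, 0, 1])   # interleaves the two clusters
```

In this toy data the first split dominates the second on every objective. A non-dominated sorting algorithm such as NSGA-Ⅲ would rank candidate splits by this kind of dominance relation rather than by a single weighted score.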
Authors
马武彬
王锐
吴亚辉
邓苏
MA Wubin; WANG Rui; WU Yahui; DENG Su (Science and Technology on Information System Engineering Laboratory, National University of Defense Technology, Changsha 410073, China)
Source
《华中科技大学学报(自然科学版)》
EI
CAS
CSCD
PKU Core
2020, No. 5, pp. 86-92 (7 pages)
Journal of Huazhong University of Science and Technology (Natural Science Edition)
Funding
Natural Science Foundation of Hunan Province (2018JJ3619)
National Natural Science Foundation of China (61871388)
Keywords
inverted spatial-text index
genetic algorithm
non-dominated sorting
prior initial population strategy
multi-objective optimization