Abstract
Indexing is an important means of improving query efficiency over massive data in distributed environments. Various heterogeneous data index optimization methods have been proposed to build efficient index structures. This paper presents an index optimization method based on a decision tree classification algorithm. First, an index decision tree is built with the decision tree classification algorithm. The tree is then used to make a decision on the attribute columns of each subspace table, producing an index table, and an index is built for each subspace from the index table data. Finally, a global index is constructed from the indexes on the individual subspaces. The resulting two-level index structure supports fast location of index information and reduces data search time. Experimental results show that the index decision tree is a suitable method for optimizing heterogeneous data indexes.
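The two-level structure described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation. The function names, the column statistics (`query_freq`, `distinct_ratio`), the thresholds, and the sample data are all hypothetical, and `should_index` is a hand-written rule standing in for the trained index decision tree, whose features the abstract does not specify.

```python
from collections import defaultdict


def should_index(stats):
    """Hypothetical stand-in for the index decision tree.

    Hand-written rules over assumed column statistics, not a trained classifier.
    """
    if stats["query_freq"] < 0.1:
        return False
    return stats["distinct_ratio"] >= 0.5


def build_local_index(rows, column):
    """Map each value of `column` to the row positions that hold it."""
    index = defaultdict(list)
    for pos, row in enumerate(rows):
        index[row[column]].append(pos)
    return dict(index)


def build_two_level_index(subspaces, column_stats):
    """Build per-subspace local indexes and a global index that routes to them."""
    local = {}
    global_index = defaultdict(set)
    for name, rows in subspaces.items():
        for column, stats in column_stats[name].items():
            if not should_index(stats):
                continue
            idx = build_local_index(rows, column)
            local[(name, column)] = idx
            # The global level only records which subspaces hold a value.
            for value in idx:
                global_index[(column, value)].add(name)
    return local, dict(global_index)


def lookup(subspaces, local, global_index, column, value):
    """Find matching rows: global index first, then the local indexes it names."""
    hits = []
    for name in sorted(global_index.get((column, value), ())):
        for pos in local[(name, column)][value]:
            hits.append((name, subspaces[name][pos]))
    return hits


# Hypothetical sample data: two subspace tables and assumed column statistics.
subspaces = {
    "node_a": [{"id": 1, "type": "road"}, {"id": 2, "type": "river"}],
    "node_b": [{"id": 3, "type": "road"}],
}
column_stats = {
    name: {
        "id": {"query_freq": 0.05, "distinct_ratio": 1.0},
        "type": {"query_freq": 0.6, "distinct_ratio": 0.8},
    }
    for name in subspaces
}

local, global_index = build_two_level_index(subspaces, column_stats)
road_rows = lookup(subspaces, local, global_index, "type", "road")
```

In this sketch a query consults the global index to find which subspaces contain the value, then reads only those subspaces' local indexes, so subspaces without a match are never scanned.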
Source
Electronic Science and Technology (《电子科技》)
2018, No. 3, pp. 48-52, 60 (6 pages in total)
Funding
National Natural Science Foundation of China, Young Scientists Fund (61402288)
Keywords
decision tree
index structure
big data
index optimization