Abstract
The Auto-Sharding mechanism in the MongoDB database performs shard migration based only on data volume, which leads to load imbalance. To address this problem, the paper proposes an optimized Auto-Sharding mechanism based on the hot and cold access characteristics of data. A naive Bayes algorithm classifies data as hot or cold according to its access characteristics. The proportion of hot data in a data shard is taken as that shard's heat load, which determines when migration is triggered, and a new data migration strategy is built on the heat-load differences between data shards. Experimental results show that, under high concurrency, the optimized mechanism achieves higher data throughput than the original Auto-Sharding mechanism.
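The abstract gives no implementation details, so the following is only a minimal sketch of the three steps it names: naive Bayes hot/cold classification, heat load as the share of hot data in a shard, and a migration decision based on heat-load differences. Everything concrete here is an assumption made for illustration: the two discretized features (access frequency and recency), the training rows, the 0.2 threshold, and the names `TinyNaiveBayes`, `heat_load`, and `pick_migration`. None of it is the paper's actual model or MongoDB's balancer API.

```python
from collections import Counter
from math import log


class TinyNaiveBayes:
    """Categorical naive Bayes with Laplace smoothing (illustrative only)."""

    def fit(self, rows, labels):
        self.classes = sorted(set(labels))
        self.class_counts = Counter(labels)
        n_features = len(rows[0])
        # counts[label][feature_index][value] -> number of occurrences
        self.counts = {c: [Counter() for _ in range(n_features)]
                       for c in self.classes}
        # distinct values seen per feature, used for smoothing
        self.values = [set() for _ in range(n_features)]
        for row, label in zip(rows, labels):
            for i, value in enumerate(row):
                self.counts[label][i][value] += 1
                self.values[i].add(value)
        return self

    def predict(self, row):
        total = sum(self.class_counts.values())
        best, best_score = None, float("-inf")
        for c in self.classes:
            # log prior plus sum of smoothed log likelihoods
            score = log(self.class_counts[c] / total)
            for i, value in enumerate(row):
                num = self.counts[c][i][value] + 1
                den = self.class_counts[c] + len(self.values[i])
                score += log(num / den)
            if score > best_score:
                best, best_score = c, score
        return best


def heat_load(shard_rows, model):
    """Fraction of records in a shard that the model labels 'hot'."""
    if not shard_rows:
        return 0.0
    hot = sum(1 for row in shard_rows if model.predict(row) == "hot")
    return hot / len(shard_rows)


def pick_migration(loads, threshold=0.2):
    """Return (source, target) if the largest heat-load gap exceeds threshold."""
    source = max(loads, key=loads.get)
    target = min(loads, key=loads.get)
    if loads[source] - loads[target] > threshold:
        return source, target
    return None


# Hypothetical training data: (access frequency, recency) -> label
train_rows = [("high", "recent"), ("high", "recent"), ("high", "old"),
              ("low", "old"), ("low", "old"), ("low", "recent")]
train_labels = ["hot", "hot", "hot", "cold", "cold", "cold"]
model = TinyNaiveBayes().fit(train_rows, train_labels)

# Hypothetical shards holding the same number of records
shard_a = [("high", "recent"), ("high", "old"), ("high", "recent"), ("low", "old")]
shard_b = [("low", "old"), ("low", "recent"), ("low", "old"), ("high", "recent")]
loads = {"shard_a": heat_load(shard_a, model), "shard_b": heat_load(shard_b, model)}
decision = pick_migration(loads)
```

In this toy setup both shards hold four records, so a balancer that looks only at data volume would see them as even, while the heat loads differ and the sketch proposes moving data from `shard_a` to `shard_b`.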
Source
《计算机工程》
CAS
CSCD
PKU Core (北大核心)
2017, No. 3, pp. 7-10, 17 (5 pages in total)
Computer Engineering
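Funding
Science and Technology Research Project of the Chongqing Municipal Education Commission (KJ1400414)
Ministry of Industry and Information Technology 2012 Special Project for Internet of Things Development (2-5)
Doctoral Start-up Fund of Chongqing University of Posts and Telecommunications (A2015-17)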