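In the Ceph distributed storage system, the Controlled Replication Under Scalable Hashing (CRUSH) data distribution algorithm can leave the amount of data stored on different devices differing by as much as 40%, so that "hot spots" become a system performance bottleneck under large data volumes and high concurrency. To address this problem, this paper studies the CRUSH algorithm in depth and designs and implements the Writing_Balance algorithm to optimize data distribution, with the aim of eliminating the load imbalance and excessive disk utilization caused by hot spots. Experiments show that Writing_Balance improves the optimization rate of the placement group (PG) count distribution on hot spots by 4.4% over the previous approach, improves the stability of disk utilization by about 3%, and also noticeably improves overall data balance when the input key space is small.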
Based on the characteristics of clusters of workstations (COW) and recent developments in parallel computers, this paper analyzes the data skew problem in data distribution for parallel databases on COW. On the basis of this analysis, we derive an adaptive, dynamic algorithm for balanced data distribution.