Abstract
By analyzing the similarity between SQL queries, we propose a novel distance function for SQL queries. Using this distance function, we cluster a SQL workload and extract a representative subset of its queries, which reduces the size of the workload. This improves the scalability of workload-based performance tuning tools (physical design tuning, specifically index tuning, is used as the example) without substantially degrading the tuning result. We implemented the algorithm and ran experiments with the IBM DB2 Index Advisor, using a TPC-H workload and a real workload from a customer database. The results show that the algorithm prunes the two workloads to 65% and 43% of their original size and reduces the running time of the Index Advisor by 63% and 72%, while the performance loss is 8% and 4%, respectively, which indicates that the algorithm is effective.
Source
《中国矿业大学学报》
EI
CAS
CSCD
Peking University Core Journals
2006, Issue 2, pp. 269-273 (5 pages)
Journal of China University of Mining & Technology