Quotient Cube Query in Distributed Computing Environment (商立方体分布式查询研究)

Abstract: Traditional databases perform poorly when analyzing large volumes of historical data and do not give satisfactory results. To address this problem, the paper studies the quotient cube and proposes the concept of the equivalence interval. Because the intervals are independent of one another, the quotient cube can be better adapted to querying in a distributed environment. The paper also proposes a parallel query algorithm for quotient cubes on a Spark cluster. The algorithm exploits the property that a point query is answered by hitting a whole equivalence interval, so that the query is parallelized as far as possible while remaining valid. Finally, experiments verify the efficiency of the algorithm.
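The abstract does not give the algorithm itself, so the following is only a minimal sketch of how a point query against independent equivalence intervals might look. It assumes each interval is stored as a (lower bound, upper bound, aggregate) triple, that a query cell belongs to an interval when the lower bound generalizes it and it generalizes the upper bound, and it uses a local thread pool as a stand-in for Spark partitions. The names `generalizes`, `lookup`, `point_query`, and `PARTITIONS`, and the three-dimensional sample data, are hypothetical and not taken from the paper.

```python
from concurrent.futures import ThreadPoolExecutor

ALL = "*"  # wildcard meaning "any value" in a dimension


def generalizes(a, b):
    """Return True if cell a is equal to or more general than cell b."""
    return all(x == ALL or x == y for x, y in zip(a, b))


def lookup(partition, query):
    """Scan one partition; return the aggregate of the interval containing query."""
    for lower, upper, agg in partition:
        if generalizes(lower, query) and generalizes(query, upper):
            return agg
    return None


def point_query(partitions, query):
    """Search all partitions in parallel; intervals are independent of each other."""
    with ThreadPoolExecutor() as pool:
        hits = pool.map(lambda p: lookup(p, query), partitions)
        return next((h for h in hits if h is not None), None)


# Hypothetical SUM quotient cube for the base table
# (a1, b1, c1) -> 10 and (a1, b2, c1) -> 20, split into two partitions.
PARTITIONS = [
    [(("*", "*", "*"), ("a1", "*", "c1"), 30)],
    [
        (("*", "b1", "*"), ("a1", "b1", "c1"), 10),
        (("*", "b2", "*"), ("a1", "b2", "c1"), 20),
    ],
]
```

With this sample data, `point_query(PARTITIONS, ("a1", "*", "*"))` returns 30 and `point_query(PARTITIONS, ("*", "b1", "c1"))` returns 10, while a cell with no matching tuples, such as `("a2", "*", "*")`, returns `None`. Since no interval depends on another, each partition can be scanned without coordination, which is the property the abstract relies on for parallelization.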

Authors: ZHANG Zheng-fan (张正凡); DU Yi-min (都仪敏) (Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650500, China)

Affiliation: Faculty of Information Engineering and Automation, Kunming University of Science and Technology

Source: Software Guide (《软件导刊》), 2018, No. 11, pp. 37-39, 44 (4 pages)

Keywords: quotient cube; big data; Spark; MapReduce; equivalence class

Classification: TP301 [Automation and Computer Technology: Computer System Architecture]
