All Pairs Label-constraint Path Query in Large Graph (大规模图上标签集约束路径的集合查询)

Cited by: 2


Abstract: The graph data model is widely used to model data in open, heterogeneous settings such as social networks, biotechnology, and the Semantic Web. The label-constraint path query is one of the basic path query problems and has drawn research attention because it describes paths flexibly. Existing work concentrates on the Boolean query: decide whether a given pair of vertices is connected by a path that satisfies the label-set constraint, and return true or false. This paper studies the orthogonal problem, called the set query (the all pairs label-constraint path query): given a set of label constraints, return every pair of vertices that is reachable under the constraint. The set query poses two difficulties: 1) reducing it to repeated Boolean queries requires exhaustively enumerating all vertex pairs; 2) the spanning-tree structure that compresses the transitive closure answers Boolean queries efficiently but does not support set queries well, because a set query must search for all vertex pairs connected under the constraint. The paper therefore keeps the spanning tree to compress the label-path transitive closure, uses an inverted index to speed up the search that a set query triggers, and gives two further optimized algorithms. Tests on large-scale datasets indicate that the method has advantages in both time and space efficiency.
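To make the problem statement concrete, the following is a minimal Python sketch of the set query as a naive baseline: one breadth-first search per vertex over the edges whose labels fall in the allowed set. It is not the spanning-tree and inverted-index method the abstract describes; the function name `all_pairs_label_constrained`, the `(source, label, target)` edge format, and the toy graph are assumptions made for illustration only.

```python
from collections import defaultdict, deque


def all_pairs_label_constrained(edges, allowed):
    """Return every pair (u, v), u != v, such that v is reachable from u
    along directed edges whose labels all belong to `allowed`.

    Naive baseline: filter the edges once, then run one BFS per vertex.
    """
    allowed = set(allowed)
    adj = defaultdict(list)
    vertices = set()
    for u, label, v in edges:
        vertices.update((u, v))
        if label in allowed:  # keep only edges that satisfy the constraint
            adj[u].append(v)

    pairs = set()
    for source in vertices:
        seen = {source}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for nxt in adj.get(node, ()):
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
                    pairs.add((source, nxt))
    return pairs


# Hypothetical toy graph with two edge labels.
EDGES = [
    ("a", "friend", "b"),
    ("b", "colleague", "c"),
    ("a", "colleague", "c"),
    ("c", "friend", "d"),
]

print(sorted(all_pairs_label_constrained(EDGES, {"friend"})))
# → [('a', 'b'), ('c', 'd')]
```

The cost of this baseline grows with the number of vertices times the size of the filtered graph for every query, which is the kind of exhaustive work the abstract says an index is meant to avoid.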

Authors: 包佳佳 (Bao Jiajia), 田伟 (Tian Wei)

Affiliation: School of Computer Science and Engineering, Southeast University

Source: Computer Science (《计算机科学》), CSCD, Peking University Core Journal, 2013, No. 4, pp. 172-176, 192 (6 pages)

Funding: Supported by the National Natural Science Foundation of China (60973023, 61003057)

Keywords: Graph; Label-constraint path query; All pairs label-constraint path query; Inverted index

Classification number: TP392 [Automation and Computer Technology: Computer Application Technology]


