一种高效挖掘生物网络闭合频繁子图的算法被引量：1

An efficient algorithm for detecting closed frequent subgraphs in biological networks

下载PDF

导出

摘要针对生物网络中频繁子图的挖掘问题,提出了一种基于FP-树结构的MaxFP算法。此算法以代谢路径作为研究对象,在适合于生物网络图简化模型的基础上,采用一种不产生候选集的改进FP-growth算法挖掘生物网络中的闭合频繁子图。此算法考虑了基于频繁项目集的算法应用于网络的缺陷,根据生物网络的特点对FP-growth算法进行了改进。实验证明,提出的MaxFP算法比基于Apriori的频繁模式挖掘算法运行速度快,不仅能挖掘出最大的频繁子图,且能找到更多具有生物意义的频繁子图。 In this paper, the MaxFP, an algorithm for detecting frequent subgraphs in biological networks based on the FP-tree structure is proposed. The algorithm takes the metabolic pathway as the object of study, and uses an improved FP-growth algorithm without generating candidates for detecting closed frequent subgraphs in biological networks based on the simplification model appropriate to biological networks. Accordingly the MaxFP Considers the defects of the algorithms based on item-set mining which are applied to networks, and improves the FP-growth algorithm according to the features of biological networks. The experimental results show that the MaxFP runs faster than the algorithms based on Apriori, and the MaxFP not only detects maximal frequent subgraphs, but also finds more frequent subgraphs having biological meaning.

作者彭佳扬杨路明王建新刘振李敏

机构地区中南大学信息科学与工程学院长沙

出处《高技术通讯》 CAS CSCD 北大核心 2009年第2期188-193,共6页 Chinese High Technology Letters

基金国家自然科学基金(60433020) 新世纪优秀人才支持计划(No.NCET-05-0683)资助项目

关键词生物网络图挖掘闭合频繁子图 FP-树 FP-GROWTH算法 biological networks, graph mining, closed frequent subgraph, FP-tree, FP-growth algorithm

分类号 TP301.6 [自动化与计算机技术—计算机系统结构]

引文网络
相关文献

参考文献16

1Altschul S F, Madden T L, Scheffer A A, et al. Gapped BLAST and PSIBLAST: a new generation of protein database search programs. Nucleic Acids Res, 1997, 25 ( 17 ) : 3389- 34O2
2Thompson J D, Higgins D G, Gibson T J. CLUSTALW: im- proving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res, 1994,22 (22), 4673-4680
3Hartwell L H, Hopfield J J, Leibler S, et al. From molecular to modular cell biology. Nature, 1999,402(6761) : C47-C51
4Oltvai Z N, Barabasi A L. Life's complexity pyramid. Science, 2002,298 (5594): 763-764
5Cook D J, Holder L B. Graph-based data mining. IEEE Intell Syst , 2000,15(2): 32-41
6Inokuchi A, Washio T, Okada T, et al. Applying the Apriori-based graph mining method to mutagenesis data analysis. Comput Aided Chem, 2001,2:87-92
7Kuramochi M, Karypis G. Frequent subgraph discovery. In: Proceedings of the 2001 IEEE International Conference on Data Mining, California, USA, 2001. 313-320
8Agrawal R, Srikant R. Fast algorithms for mining association rules. In: Proceedings of the 20th International Conference on Very Large Data Bases (VLDB' 94), Sanfiaogo di Chile,Chile, 1994. 487-499
9Yah X, Han J. gSpan: Graph-based substructure pattern mining. In: Proceedings of the 2002 IEEE International Conference on Data Mining (ICDM'02), Maebashi City, Japan, 2002. 721-724
10Koyuturk M, Grama A, Szpankowski W. An eficient algorithm for detecting frequent subgraphs in biological networks. Bioinformatics, 2004,20(1) : i200-i207

二级参考文献9

1Pasquier N,Bastide Y.Discovering frequent closed itemsets for association rules[C]//Proceedings of the 7th Int'l Conf on Database Theory.Jerusalem:Springer-Verlag,1999:398-416.
2Pei J,Han J,Mao R.CLOSET:an efficient algorithm for mining frequent closed itemsets[C]//Proc 2000 ACM-SIGMOD Int Workshop Data Mining and Knowledge Discovery.New York:ACM Press,2000:11-20.
3Wang J,Han J,Pei J.Closet+:searching for the best strategies for mining frequent closed itemsets[C]//Proc 2003 ACM SIGKDD.New York:ACM Press,2003:236-245.
4Zaki M,Hsiao C.ChARM:an efficient algorithm for closed association rule mining[C]//Proc of 2002 SIAM Data Mining Conf.Arlington,VA,2002:457473.
5Pan F,Cong G,Zaki M.CARPENTER:finding closed patterns in long biological datasets[C]//Proc ACM SIGKDD 2003.New York:ACM Press,2003:637-642.
6Cong G,Tung A,Xu X.FARMER:finding interesting rule groups in microarray datasets[C]//Proc 23rd ACM Int Conf Management of Data.New York:ACM Press,2004:143-154.
7Liu H,Han J,Xin D,et al.Mining interesting patterns from very high dimensional data:a top-down row enumeration approach[C/OL]//Proc of the 6th SIAM International Conference on Data Mining.Bethesda,MD,2006.http://www.siam.org/meetings/sdmob/proceedings/026liuh.pdf.
8Liu H,Han J,Xin D,et al.Top-down mining of interesting patterns from very high dimensional data[C]//Proc 22nd International Conference on Data Engineering.Los Alamitos:IEEE Computer Society Press,2006:114-116.
9Creighton C,Hanash S.Mining gene expression databases for association rules[J].Bioinformatics,2003,19(1):79-86.

同被引文献10

1李先通,李建中,高宏.一种高效频繁子图挖掘算法[J].软件学报,2007,18(10):2469-2480. 被引量：35
2Johnson D S,Garey M R.Computers and Intractability:A Guide to the Theory of Np-completeness[M].[S.1.]:W.H.Freeman and Company,1979.
3Giugno R,Shasha D.Graph Grep:A Fast and Universal Method for Querying Graphs[C]//Proceedings of ICPR’02.Quebec,Canada:IEEE Press,2002:123-129.
4Zhao Peixiang,Yu J X,Yu P S.Graph Indexing:Tree+delta>=graph[C]//Proceedings of VLDB’2007.[S.1.]:IEEE Press,2007:233-241.
5Klein K,Kriege N,Mutzel P.CT-Index:Fingerprint-based Graph Indexing Combining Cycles and Trees[C]//Proceedings of ICDE’11.Hannover,Germany:IEEE Press,2011:258-265.
6Yan Xifeng,Yu P S,Han Jiawei.Graph Indexing:A Frequent Structure-based Approach[C]//Proceedings of SIGMOD’04.Paris,France:ACM Press,2004:568-576.
7Cheng J,Ke Y,Ng A,et al.FG-index:Towards Verification-free Query Processing on Graph Data-bases[C]//Proceedings of SIGMOD’07.Beijing,China:[s.n.],2007:541-549.
8Yan Xifeng,Han Jiawei.g Span:Graph-based Substructure Pattern Mining[C]//Proceedings of ICDM’02.Maebashi,Japan:IEEE Press,2002:236-246.
9Zou Lei,Chen Lei,Xu J,et al.A Novel Spectral Coding in A Large Graph Database[C]//Proceedings of EDBT’08.Nantes,France:ACM Press,2008:321-329.
10楼宇波,马坚,周皓峰,袁晴晴,施伯乐.基于频繁链接的Web权威资源挖掘[J].计算机研究与发展,2003,40(7):1095-1103. 被引量：6

引证文献1

1陆慧琳,黄博.基于双索引的子图查询算法[J].计算机工程,2015,41(1):44-48. 被引量：2

二级引证文献2

1张宇彤,王思檬,曹佳.基于邻域等价类的同构子图搜索算法[J].计算机工程,2017,43(9):7-11. 被引量：2
2施炜杰,董一鸿,王雄,潘剑飞.基于索引的子图查询技术研究进展[J].计算机应用,2019,39(1):39-45.

1谢飞君.朱频频:做有“脑子”的机器人[J].今日中国,2016,0(10):17-19.
2张延萍.基于关联规则数据挖掘的研究及应用[J].盐城工学院学报（自然科学版）,2007,20(1):47-49. 被引量：2
3曾安平,黄永平,唐远翔,阳万安.基于关联系数的Apriori算法改进[J].苏州科技学院学报（工程技术版）,2008,21(3):61-65.
4张浩然,张兆祥,刘娟,胡黔楠.基于Android的代谢网络可视化和生化反应搜索的实现[J].计算机应用与软件,2016,33(6):224-226. 被引量：1
5周义建,王轶,王辉.基于Apriori数据挖掘优化方法研究[J].计算机与数字工程,2008,36(2):20-22. 被引量：1
6朱娴,马卫.一种基于层次聚类的双聚类算法[J].微计算机应用,2009,30(5):12-17. 被引量：4
7赵卫绩,赵文正,刘井莲.基于SQL的Apriori改进算法[J].科学技术与工程,2006,6(17):2759-2761. 被引量：4
8刘钟涛,王虹.基于Apriori优化算法的IDS模型研究[J].河南科学,2013,31(12):2186-2189.
9张慧,陈恩红,童庆.基因组序列中转录始点的预测方法研究[J].计算机科学,2005,32(12):151-154.
10彭敢.Apriori算法及其改进算法Apriori-new的应用分析[J].现代计算机,2012,18(22):3-5.

高技术通讯

2009年第2期

浏览历史

内容加载中请稍等...

一种高效挖掘生物网络闭合频繁子图的算法被引量：1

参考文献16

二级参考文献9

同被引文献10

引证文献1

二级引证文献2

相关作者

相关机构

相关主题

浏览历史

一种高效挖掘生物网络闭合频繁子图的算法 被引量：1

参考文献16

二级参考文献9

同被引文献10

引证文献1

二级引证文献2

相关作者

相关机构

相关主题

浏览历史

一种高效挖掘生物网络闭合频繁子图的算法被引量：1