期刊文献+

一种高效挖掘生物网络闭合频繁子图的算法 被引量:1

An efficient algorithm for detecting closed frequent subgraphs in biological networks
下载PDF
导出
摘要 针对生物网络中频繁子图的挖掘问题,提出了一种基于FP-树结构的MaxFP算法。此算法以代谢路径作为研究对象,在适合于生物网络图简化模型的基础上,采用一种不产生候选集的改进FP-growth算法挖掘生物网络中的闭合频繁子图。此算法考虑了基于频繁项目集的算法应用于网络的缺陷,根据生物网络的特点对FP-growth算法进行了改进。实验证明,提出的MaxFP算法比基于Apriori的频繁模式挖掘算法运行速度快,不仅能挖掘出最大的频繁子图,且能找到更多具有生物意义的频繁子图。 In this paper, the MaxFP, an algorithm for detecting frequent subgraphs in biological networks based on the FP-tree structure is proposed. The algorithm takes the metabolic pathway as the object of study, and uses an improved FP-growth algorithm without generating candidates for detecting closed frequent subgraphs in biological networks based on the simplification model appropriate to biological networks. Accordingly the MaxFP Considers the defects of the algorithms based on item-set mining which are applied to networks, and improves the FP-growth algorithm according to the features of biological networks. The experimental results show that the MaxFP runs faster than the algorithms based on Apriori, and the MaxFP not only detects maximal frequent subgraphs, but also finds more frequent subgraphs having biological meaning.
出处 《高技术通讯》 CAS CSCD 北大核心 2009年第2期188-193,共6页 Chinese High Technology Letters
基金 国家自然科学基金(60433020) 新世纪优秀人才支持计划(No.NCET-05-0683)资助项目
关键词 生物网络 图挖掘 闭合频繁子图 FP-树 FP-GROWTH算法 biological networks, graph mining, closed frequent subgraph, FP-tree, FP-growth algorithm
  • 相关文献

参考文献16

  • 1Altschul S F, Madden T L, Scheffer A A, et al. Gapped BLAST and PSIBLAST: a new generation of protein database search programs. Nucleic Acids Res, 1997, 25 ( 17 ) : 3389- 34O2
  • 2Thompson J D, Higgins D G, Gibson T J. CLUSTALW: im- proving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res, 1994,22 (22), 4673-4680
  • 3Hartwell L H, Hopfield J J, Leibler S, et al. From molecular to modular cell biology. Nature, 1999,402(6761) : C47-C51
  • 4Oltvai Z N, Barabasi A L. Life's complexity pyramid. Science, 2002,298 (5594): 763-764
  • 5Cook D J, Holder L B. Graph-based data mining. IEEE Intell Syst , 2000,15(2): 32-41
  • 6Inokuchi A, Washio T, Okada T, et al. Applying the Apriori-based graph mining method to mutagenesis data analysis. Comput Aided Chem, 2001,2:87-92
  • 7Kuramochi M, Karypis G. Frequent subgraph discovery. In: Proceedings of the 2001 IEEE International Conference on Data Mining, California, USA, 2001. 313-320
  • 8Agrawal R, Srikant R. Fast algorithms for mining association rules. In: Proceedings of the 20th International Conference on Very Large Data Bases (VLDB' 94), Sanfiaogo di Chile,Chile, 1994. 487-499
  • 9Yah X, Han J. gSpan: Graph-based substructure pattern mining. In: Proceedings of the 2002 IEEE International Conference on Data Mining (ICDM'02), Maebashi City, Japan, 2002. 721-724
  • 10Koyuturk M, Grama A, Szpankowski W. An eficient algorithm for detecting frequent subgraphs in biological networks. Bioinformatics, 2004,20(1) : i200-i207

二级参考文献9

  • 1Pasquier N,Bastide Y.Discovering frequent closed itemsets for association rules[C]//Proceedings of the 7th Int'l Conf on Database Theory.Jerusalem:Springer-Verlag,1999:398-416.
  • 2Pei J,Han J,Mao R.CLOSET:an efficient algorithm for mining frequent closed itemsets[C]//Proc 2000 ACM-SIGMOD Int Workshop Data Mining and Knowledge Discovery.New York:ACM Press,2000:11-20.
  • 3Wang J,Han J,Pei J.Closet+:searching for the best strategies for mining frequent closed itemsets[C]//Proc 2003 ACM SIGKDD.New York:ACM Press,2003:236-245.
  • 4Zaki M,Hsiao C.ChARM:an efficient algorithm for closed association rule mining[C]//Proc of 2002 SIAM Data Mining Conf.Arlington,VA,2002:457473.
  • 5Pan F,Cong G,Zaki M.CARPENTER:finding closed patterns in long biological datasets[C]//Proc ACM SIGKDD 2003.New York:ACM Press,2003:637-642.
  • 6Cong G,Tung A,Xu X.FARMER:finding interesting rule groups in microarray datasets[C]//Proc 23rd ACM Int Conf Management of Data.New York:ACM Press,2004:143-154.
  • 7Liu H,Han J,Xin D,et al.Mining interesting patterns from very high dimensional data:a top-down row enumeration approach[C/OL]//Proc of the 6th SIAM International Conference on Data Mining.Bethesda,MD,2006.http://www.siam.org/meetings/sdmob/proceedings/026liuh.pdf.
  • 8Liu H,Han J,Xin D,et al.Top-down mining of interesting patterns from very high dimensional data[C]//Proc 22nd International Conference on Data Engineering.Los Alamitos:IEEE Computer Society Press,2006:114-116.
  • 9Creighton C,Hanash S.Mining gene expression databases for association rules[J].Bioinformatics,2003,19(1):79-86.

同被引文献10

  • 1李先通,李建中,高宏.一种高效频繁子图挖掘算法[J].软件学报,2007,18(10):2469-2480. 被引量:35
  • 2Johnson D S,Garey M R.Computers and Intractability:A Guide to the Theory of Np-completeness[M].[S.1.]:W.H.Freeman and Company,1979.
  • 3Giugno R,Shasha D.Graph Grep:A Fast and Universal Method for Querying Graphs[C]//Proceedings of ICPR’02.Quebec,Canada:IEEE Press,2002:123-129.
  • 4Zhao Peixiang,Yu J X,Yu P S.Graph Indexing:Tree+delta>=graph[C]//Proceedings of VLDB’2007.[S.1.]:IEEE Press,2007:233-241.
  • 5Klein K,Kriege N,Mutzel P.CT-Index:Fingerprint-based Graph Indexing Combining Cycles and Trees[C]//Proceedings of ICDE’11.Hannover,Germany:IEEE Press,2011:258-265.
  • 6Yan Xifeng,Yu P S,Han Jiawei.Graph Indexing:A Frequent Structure-based Approach[C]//Proceedings of SIGMOD’04.Paris,France:ACM Press,2004:568-576.
  • 7Cheng J,Ke Y,Ng A,et al.FG-index:Towards Verification-free Query Processing on Graph Data-bases[C]//Proceedings of SIGMOD’07.Beijing,China:[s.n.],2007:541-549.
  • 8Yan Xifeng,Han Jiawei.g Span:Graph-based Substructure Pattern Mining[C]//Proceedings of ICDM’02.Maebashi,Japan:IEEE Press,2002:236-246.
  • 9Zou Lei,Chen Lei,Xu J,et al.A Novel Spectral Coding in A Large Graph Database[C]//Proceedings of EDBT’08.Nantes,France:ACM Press,2008:321-329.
  • 10楼宇波,马坚,周皓峰,袁晴晴,施伯乐.基于频繁链接的Web权威资源挖掘[J].计算机研究与发展,2003,40(7):1095-1103. 被引量:6

引证文献1

二级引证文献2

相关作者

内容加载中请稍等...

相关机构

内容加载中请稍等...

相关主题

内容加载中请稍等...

浏览历史

内容加载中请稍等...
;
使用帮助 返回顶部