摘要
针对生物网络中频繁子图的挖掘问题,提出了一种基于FP-树结构的MaxFP算法。此算法以代谢路径作为研究对象,在适合于生物网络图简化模型的基础上,采用一种不产生候选集的改进FP-growth算法挖掘生物网络中的闭合频繁子图。此算法考虑了基于频繁项目集的算法应用于网络的缺陷,根据生物网络的特点对FP-growth算法进行了改进。实验证明,提出的MaxFP算法比基于Apriori的频繁模式挖掘算法运行速度快,不仅能挖掘出最大的频繁子图,且能找到更多具有生物意义的频繁子图。
In this paper, the MaxFP, an algorithm for detecting frequent subgraphs in biological networks based on the FP-tree structure is proposed. The algorithm takes the metabolic pathway as the object of study, and uses an improved FP-growth algorithm without generating candidates for detecting closed frequent subgraphs in biological networks based on the simplification model appropriate to biological networks. Accordingly the MaxFP Considers the defects of the algorithms based on item-set mining which are applied to networks, and improves the FP-growth algorithm according to the features of biological networks. The experimental results show that the MaxFP runs faster than the algorithms based on Apriori, and the MaxFP not only detects maximal frequent subgraphs, but also finds more frequent subgraphs having biological meaning.
出处
《高技术通讯》
CAS
CSCD
北大核心
2009年第2期188-193,共6页
Chinese High Technology Letters
基金
国家自然科学基金(60433020)
新世纪优秀人才支持计划(No.NCET-05-0683)资助项目