Abstract
This paper studies how to mine frequent patterns from XML data quickly and effectively, and proposes FreqtTree, an incremental algorithm for this task. The algorithm first converts the XML documents into a DOM tree and then mines all frequent patterns from that tree. Because FreqtTree uses the rightmost-extension technique and traverses the DOM tree only once, it is highly efficient. Building on this, the paper describes DFreqtTree, a DOM-tree-based association rule mining algorithm for discovering global frequent patterns in XML data in a distributed environment. Finally, the algorithms are implemented in Java and their performance is analyzed; the results show that they are efficient and feasible.
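As a rough illustration of the first two steps named in the abstract (convert XML to a DOM tree, then count patterns in a single traversal), the following is a minimal sketch and not the FreqtTree algorithm itself. It only counts one-node patterns (element labels) and two-node patterns (parent/child label pairs) by raw occurrence; full rightmost extension to larger subtrees is omitted. The class name `DomPatternSketch`, its methods, the `minSupport` parameter, and the sample XML are all assumptions made for illustration.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.TreeMap;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;

// Hypothetical sketch: not the paper's implementation.
public class DomPatternSketch {

    // Parse an XML string into a DOM Document using the standard JAXP API.
    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
    }

    // One depth-first pass over the tree: count every element label and
    // every parent/child label pair (written as "parent/child").
    static void count(Node node, Map<String, Integer> counts) {
        if (node.getNodeType() != Node.ELEMENT_NODE) {
            return;
        }
        counts.merge(node.getNodeName(), 1, Integer::sum);
        for (Node c = node.getFirstChild(); c != null; c = c.getNextSibling()) {
            if (c.getNodeType() == Node.ELEMENT_NODE) {
                counts.merge(node.getNodeName() + "/" + c.getNodeName(), 1, Integer::sum);
                count(c, counts);
            }
        }
    }

    // Keep only the patterns whose occurrence count reaches minSupport
    // (an assumed absolute threshold; the paper's support measure may differ).
    static Map<String, Integer> frequent(String xml, int minSupport) throws Exception {
        Map<String, Integer> counts = new TreeMap<>();
        count(parse(xml).getDocumentElement(), counts);
        counts.values().removeIf(v -> v < minSupport);
        return counts;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<lib><book><title/><author/></book><book><title/></book></lib>";
        System.out.println(frequent(xml, 2));
    }
}
```

In a fuller implementation, each frequent two-node pattern would presumably be grown further by attaching new nodes along its rightmost path, which is the step this sketch leaves out.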
Source
《微计算机信息》
2009, No. 12, pp. 221-222 (2 pages)
Control & Automation