Abstract
Commercial transaction data are characterized by many items per transaction, a large volume of newly added data, and an even larger volume of historical data. Based on their commercial characteristics, frequent itemsets are divided into four types (new, mature, aging, and expired), and statistics are kept for each type. An algorithm is proposed that uses these classified statistics to incrementally mine association rules from newly added business data. The algorithm scans the newly added database only twice and does not scan the historical database. It also classifies the discovered rules, according to the commercial characteristics they reflect, into four types: new rules, mature rules, aging rules, and expired rules. This improves the efficiency of identifying rule content while strengthening the ability to identify rule characteristics.
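The abstract does not state the criteria that separate the four classes, so the following is only a minimal sketch under assumed definitions, not the authors' algorithm. It assumes that statistics for historically frequent itemsets are stored as counts (the hypothetical `hist_counts` and `hist_n`), that "new" means frequent only in the increment, "mature" means frequent in both, "aging" means infrequent in the increment but still frequent overall, and "expired" means infrequent in both. For brevity it only generates candidates of size one and two from the increment. The function name and the single threshold `min_sup` are likewise illustrative.

```python
from collections import Counter
from itertools import combinations


def classify_itemsets(hist_counts, hist_n, increment, min_sup):
    """Classify itemsets using stored historical counts plus two scans of the increment.

    hist_counts: dict mapping frozenset -> count for historically frequent itemsets
                 (assumed to be the stored statistics; history itself is never rescanned).
    hist_n:      number of historical transactions.
    increment:   list of newly added transactions (each an iterable of items).
    min_sup:     minimum support as a fraction (assumed single threshold).
    """
    inc_n = len(increment)
    if inc_n == 0:
        raise ValueError("increment must contain at least one transaction")

    # Scan 1 of the increment: count single items.
    item_counts = Counter(item for t in increment for item in set(t))
    frequent_items = {i for i, c in item_counts.items() if c / inc_n >= min_sup}

    # Candidates: every tracked historical itemset, plus singletons and pairs
    # built from items that are frequent in the increment.
    candidates = set(hist_counts)
    candidates |= {frozenset([i]) for i in frequent_items}
    candidates |= {frozenset(p) for p in combinations(sorted(frequent_items), 2)}

    # Scan 2 of the increment: count every candidate.
    inc_counts = Counter()
    for t in increment:
        items = set(t)
        for c in candidates:
            if c <= items:
                inc_counts[c] += 1

    total_n = hist_n + inc_n
    classes = {}
    for c in candidates:
        frequent_in_increment = inc_counts[c] / inc_n >= min_sup
        if c not in hist_counts:
            # No historical count is available without rescanning history,
            # so such itemsets are judged on the increment alone.
            if frequent_in_increment:
                classes[c] = "new"
        elif frequent_in_increment:
            classes[c] = "mature"
        elif (hist_counts[c] + inc_counts[c]) / total_n >= min_sup:
            classes[c] = "aging"
        else:
            classes[c] = "expired"
    return classes


# Illustrative data only; the item names and counts are made up.
hist = {frozenset({"milk"}): 80, frozenset({"bread"}): 60, frozenset({"tea"}): 50}
new_data = [["milk", "eggs"], ["milk", "eggs"], ["bread", "eggs"], ["milk"]]
result = classify_itemsets(hist, 100, new_data, 0.5)
```

Rules could then inherit the class of the itemset they are derived from, although the abstract does not say whether the paper assigns rule classes this way.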
Source
Journal of Chongqing Technology and Business University: Natural Science Edition (《重庆工商大学学报(自然科学版)》)
2015, No. 12, pp. 43-47 (5 pages)
Funding
Fujian Provincial Department of Education Class A Project "Batch Modeling Techniques for Data Mining" (JA12401)
Keywords
classified frequent itemsets
statistical information
incremental updating
classified association rules