Abstract
Objective: To explore the application of association rule methods to the analysis of liver cancer patient data. Methods: Association rule analysis was applied to the liver cancer data by setting a minimum support and a minimum confidence, with rules pruned using a contingency-table-based χ² test and the improvement in confidence. Results: With minimum support and minimum confidence set to 0.10 and 0.60, respectively, and a sample of 1457 patients, 51 rules were obtained with contingency coefficient C > 0.45 and 19 rules with lift > 2.4; restricting the rule consequent to whether liver cancer recurred yielded 19 rules. These rules revealed variable values, and combinations of variable values, that may influence the recurrence of liver cancer, and they reflected relationships among highly correlated variable values in the data. Conclusion: Applying association rule analysis to liver cancer patient data can reveal potential, valuable relationships among multiple factors.
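The rule measures named in the abstract (support, confidence, lift, the χ² statistic and the contingency coefficient C) can be illustrated with a minimal sketch for a single rule A → B. The counts below are hypothetical and are not taken from the study. The function name `rule_metrics` is illustrative, and the formula C = sqrt(χ² / (χ² + n)), Pearson's contingency coefficient, is an assumption about which coefficient the authors used. The thresholds are the ones quoted in the abstract. Whether the C and lift filters were applied separately or together is not stated, so they are reported as separate flags.

```python
import math


def rule_metrics(n_ab, n_a, n_b, n):
    """Measures for a rule A -> B computed from record counts.

    n_ab: records containing both A and B
    n_a:  records containing A
    n_b:  records containing B
    n:    total number of records
    Assumes every margin of the 2x2 table is non-zero.
    """
    support = n_ab / n
    confidence = n_ab / n_a
    lift = confidence / (n_b / n)

    # Observed 2x2 contingency table: rows are A / not A, columns are B / not B.
    observed = [
        [n_ab, n_a - n_ab],
        [n_b - n_ab, n - n_a - n_b + n_ab],
    ]
    row_totals = [n_a, n - n_a]
    col_totals = [n_b, n - n_b]

    # Pearson chi-square statistic without continuity correction.
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed[i][j] - expected) ** 2 / expected

    # Pearson's contingency coefficient (assumed definition of C).
    c = math.sqrt(chi2 / (chi2 + n))

    return {
        "support": support,
        "confidence": confidence,
        "lift": lift,
        "chi2": chi2,
        "C": c,
        # Thresholds quoted in the abstract.
        "meets_min_support": support >= 0.10,
        "meets_min_confidence": confidence >= 0.60,
        "C_above_0.45": c > 0.45,
        "lift_above_2.4": lift > 2.4,
    }


# Hypothetical counts, chosen only to make the arithmetic easy to follow.
example = rule_metrics(n_ab=150, n_a=200, n_b=250, n=1000)
for name, value in example.items():
    print(name, value)
```

For these hypothetical counts the rule has support 0.15, confidence 0.75, lift 3.0 and C of about 0.5, so it would pass all four thresholds.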
Source
《中国卫生统计》
CSCD
Peking University Core Journals (北大核心)
2006, Issue 1, pp. 34-38 (5 pages)
Chinese Journal of Health Statistics
Funding
National Natural Science Foundation of China (30471502)
Shanghai Natural Science Foundation (04ZR14049)
Keywords
Liver neoplasm
Association rules
Confidence
Data mining