
Study on the application of classification tree model in screening the risk factors of malignant tumor (cited by: 24)
Abstract Objective: To introduce the basic principles, partitioning algorithm, and application value of the classification tree model for screening risk factors of malignant tumors, and to explore the value of this data mining technique in analyzing data on multifactorial diseases such as malignant tumors. Methods: Data came from a field survey of 84 breast cancer patients and 273 randomly selected cancer-free controls in Jiashan County, Zhejiang Province. A classification tree model was built with the Exhaustive CHAID method to screen risk factors, and the model was evaluated with the misclassification probability (Risk statistic) and the area under the ROC curve. Results: The model selected 9 risk factors from 105 candidate variables. Occupation was the most important factor: workers, teachers, and retirees had a significantly higher probability of breast cancer than other groups. The number of pregnancies, breast examination, reason for menopause, age at menarche, and intake of shrimp, crab, kipper, kelp, and laver were also identified as risk factors. The model further showed that the effect of regular physical exercise on breast cancer differed between subgroups. The Risk statistic of the model was 0.174, and the area under the ROC curve drawn from the predicted probabilities was 0.872, which differed significantly from 0.5, indicating that the model fit the data well. Conclusion: The classification tree model can screen out the major influencing factors quickly and effectively, identify cut-points for continuous and ordinal variables, and reveal complex interactions among factors at multiple levels. It has high application value in epidemiological research and may become a powerful tool for exploring the complexity of disease risk.
Source Chinese Journal of Epidemiology (《中华流行病学杂志》), 2006, Issue 6, pp. 540-543 (4 pages); indexed in CAS, CSCD, and the Peking University Core Journals list
Funding National Natural Science Foundation of China (grant 30471492)
Keywords Classification tree model; Breast neoplasm; Risk factor; Exhaustive chi-square automatic interaction detection method
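
The abstract evaluates the tree with two numbers: the misclassification probability (Risk) and the area under the ROC curve built from predicted probabilities. The sketch below shows one common way such quantities can be computed. It is an illustration, not the authors' procedure: the function names, the 0.5 classification threshold, and the eight toy observations are all assumptions made for the example and are unrelated to the Jiashan survey data.

```python
from typing import Sequence


def misclassification_risk(y_true: Sequence[int], p_pred: Sequence[float],
                           threshold: float = 0.5) -> float:
    """Share of subjects whose predicted class differs from the observed class.

    The 0.5 threshold is an assumption made for this illustration.
    """
    wrong = sum((p >= threshold) != bool(y) for y, p in zip(y_true, p_pred))
    return wrong / len(y_true)


def roc_auc(y_true: Sequence[int], p_pred: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen case receives a higher
    predicted probability than a randomly chosen control; ties count as 1/2.
    """
    cases = [p for y, p in zip(y_true, p_pred) if y == 1]
    controls = [p for y, p in zip(y_true, p_pred) if y == 0]
    wins = sum((a > b) + 0.5 * (a == b) for a in cases for b in controls)
    return wins / (len(cases) * len(controls))


# Hypothetical toy data: 1 = case, 0 = control, with made-up leaf probabilities.
y = [1, 1, 1, 0, 0, 0, 0, 0]
p = [0.9, 0.6, 0.3, 0.7, 0.2, 0.2, 0.1, 0.1]

risk = misclassification_risk(y, p)
auc = roc_auc(y, p)
print(f"Risk = {risk:.3f}, AUC = {auc:.3f}")
```

An AUC of 0.5 corresponds to a model that ranks cases and controls no better than chance, which is why the abstract compares the reported 0.872 against 0.5.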
