Abstract
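The EM clustering algorithm and the C4.5 classification algorithm from data mining are used together to design and carry out a series of experiments on the relationship between risk factors for the onset of type 2 diabetes and changes in blood glucose. The results include: a new blood glucose threshold of 5.26 and a non-onset threshold of 4.28; the identification and verification of the eight most influential onset risk factors, together with a series of important critical values for each; and a combined qualitative and quantitative account of how the influence of each important risk factor changes under different blood glucose warning thresholds.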
This paper combines the expectation maximization (EM) algorithm and the C4.5 algorithm to build a Type 2 diabetes data processing system. With this system, a series of data mining experiments is designed to identify important Type 2 diabetes risk factors and their relationships with blood glucose. The experiments yield pathological knowledge about Type 2 diabetes, including two new blood glucose thresholds, 5.26 and 4.28, and eight important risk factors. Based on these factors and results, the relationship between the effects of the risk factors and different blood glucose thresholds is studied and illustrated, and the relationship between the important risk factors and blood glucose is analyzed.
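The abstract does not spell out the experimental pipeline, so the following is only a minimal sketch of the general idea, under stated assumptions: EM fits a two-component Gaussian mixture to blood glucose values, the point where the two components are equally likely serves as a candidate threshold, and a C4.5-style gain ratio then scores a cut on a risk factor against the resulting labels. The data are synthetic, the `factor` attribute is hypothetical, and the function names are illustrative; none of this reproduces the paper's thresholds of 5.26 and 4.28.

```python
import math
import random

def log_pdf(x, mu, var):
    """Log density of a 1-D Gaussian."""
    return -0.5 * math.log(2 * math.pi * var) - (x - mu) ** 2 / (2 * var)

def em_two_gaussians(values, iters=100):
    """Fit a two-component 1-D Gaussian mixture with plain EM."""
    mu = [min(values), max(values)]     # start the means at the extremes
    var = [1.0, 1.0]
    weight = [0.5, 0.5]
    n = len(values)
    for _ in range(iters):
        # E-step: responsibility of each component for each value
        resp = []
        for x in values:
            p = [weight[k] * math.exp(log_pdf(x, mu[k], var[k])) for k in range(2)]
            total = p[0] + p[1]
            resp.append([p[0] / total, p[1] / total])
        # M-step: re-estimate means, variances and mixing weights
        for k in range(2):
            nk = sum(r[k] for r in resp)
            mu[k] = sum(r[k] * x for r, x in zip(resp, values)) / nk
            var[k] = max(sum(r[k] * (x - mu[k]) ** 2
                             for r, x in zip(resp, values)) / nk, 1e-6)
            weight[k] = nk / n
    return mu, var, weight

def boundary(mu, var, weight):
    """Bisect between the two means for the point of equal posterior."""
    def diff(x):
        return (math.log(weight[1]) + log_pdf(x, mu[1], var[1])
                - math.log(weight[0]) - log_pdf(x, mu[0], var[0]))
    a, b = mu[0], mu[1]
    for _ in range(60):
        mid = (a + b) / 2
        if diff(mid) < 0:
            a = mid
        else:
            b = mid
    return (a + b) / 2

def entropy(labels):
    n = len(labels)
    return -sum((labels.count(c) / n) * math.log2(labels.count(c) / n)
                for c in set(labels))

def gain_ratio(xs, labels, cut):
    """C4.5-style gain ratio of the binary split xs <= cut."""
    left = [l for x, l in zip(xs, labels) if x <= cut]
    right = [l for x, l in zip(xs, labels) if x > cut]
    if not left or not right:
        return 0.0
    n = len(labels)
    cond = len(left) / n * entropy(left) + len(right) / n * entropy(right)
    split_info = entropy(["L"] * len(left) + ["R"] * len(right))
    return (entropy(labels) - cond) / split_info

# Synthetic glucose readings (made up for illustration, not the paper's data)
random.seed(0)
glucose = ([random.gauss(4.6, 0.4) for _ in range(300)]
           + [random.gauss(7.0, 0.8) for _ in range(100)])
mu, var, weight = em_two_gaussians(glucose)
cut = boundary(mu, var, weight)
labels = ["high" if g > cut else "low" for g in glucose]

# Hypothetical risk factor, loosely correlated with glucose
factor = [3 * g + random.gauss(0, 1.5) for g in glucose]
best_cut = max(sorted(set(factor)), key=lambda c: gain_ratio(factor, labels, c))
best_ratio = gain_ratio(factor, labels, best_cut)
print(f"glucose threshold ~ {cut:.2f}, best factor cut ~ {best_cut:.2f}")
```

A full C4.5 implementation would apply the gain-ratio criterion recursively over all attributes and prune the resulting tree; the sketch scores only a single split.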
Source
《计算机工程》
CAS
CSCD
PKU Core (北大核心)
2007, Issue 9, pp. 103-105 (3 pages)
Computer Engineering
Funding
National "Tenth Five-Year Plan" Key Science and Technology Project (2001BA702B01)
National Natural Science Foundation of China (60671008)
Keywords
EM algorithm
C4.5 algorithm
Type 2 diabetes
Risk factors for disease onset
Expectation maximization algorithm
C4.5 algorithm
Type 2 diabetes
Risk factors of disease