Abstract
This paper analyzes and mines 13 consecutive years of cause-of-death registration data from one region to explore what the date of birth may reveal, in particular the relationship between the date of birth and the date of death. The statistical analysis shows that the proportion of deaths falling on the deceased's own birthday ("birthday as death day") is markedly elevated among both men and women. The proportion dying on the birthday is also relatively high among people whose highest-level diagnosing hospital was a county (district) hospital, whose cause of death was cancer, or whose education level was university; in each of these groups it exceeds the summed proportion for an ordinary week of days other than the birthday. A random forest algorithm is used to identify the features that most strongly influence this phenomenon. It ranks the highest-level diagnosing hospital as the most influential feature, followed in order by cause of death, education level, highest-level diagnostic method, place of death, marital status, and gender, among others. Association rule mining likewise indicates that the highest-level diagnosing hospital has an important influence on the "birthday as death day" phenomenon. By examining the relationship between birth date and death date, the paper contributes to a deeper understanding of the intrinsic connection between birth and death.
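The "birthday as death day" proportion described in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the function names, the record layout (pairs of birth date and death date), and the handling of 29 February birthdays are all assumptions made for illustration, and the uniform baseline is simply what one would expect if deaths were spread evenly over the year.

```python
from datetime import date

# Assumed baseline: if deaths were uniform over the year, any 7 given days
# would account for roughly 7 / 365.25 of all deaths.
UNIFORM_WEEK_SHARE = 7 / 365.25


def days_from_birthday(birth: date, death: date) -> int:
    """Signed number of days from the nearest birthday anniversary to the death date.

    0 means the person died on their birthday; negative values mean the
    death fell shortly before the nearest anniversary.
    """
    offsets = []
    for year in (death.year - 1, death.year, death.year + 1):
        try:
            anniversary = birth.replace(year=year)
        except ValueError:
            # Assumption: a 29 February birthday is observed on 28 February
            # in non-leap years.
            anniversary = date(year, 2, 28)
        offsets.append((death - anniversary).days)
    return min(offsets, key=abs)


def birthday_death_share(records) -> float:
    """Fraction of (birth, death) records in which death falls on the birthday."""
    offsets = [days_from_birthday(b, d) for b, d in records]
    return sum(1 for o in offsets if o == 0) / len(offsets)


# Hypothetical toy records, not data from the paper.
toy_records = [
    (date(1950, 5, 10), date(2010, 5, 10)),
    (date(1950, 5, 10), date(2010, 5, 13)),
    (date(1950, 1, 2), date(2009, 12, 31)),
    (date(1940, 3, 1), date(2000, 3, 1)),
]
share = birthday_death_share(toy_records)
exceeds_uniform_week = share > UNIFORM_WEEK_SHARE
```

On real registration data, the same share could be computed per subgroup (for example by diagnosing hospital level or cause of death) and compared with the combined share of an ordinary week, which is one reading of the comparison the abstract reports.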
Authors
LIN Qiang (林强)
TANG Jia-shan (唐加山)
College of Science, Nanjing University of Posts and Telecommunications, Nanjing 210023, P. R. China
Source
South China Population (《南方人口》)
CSSCI
2017, No. 6, pp. 34-41 (8 pages)
Keywords
Birthday
Day of Death
Data Analysis
Random Forest
Association Rule Mining