Abstract
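Objective: To introduce the application of the weighted Fisher linear discriminant to imbalanced medical datasets. Methods: In two-class classification, when the covariance matrices of the two classes differ, class imbalance degrades the performance of the Fisher linear discriminant. The weighted Fisher linear discriminant oversamples both classes simultaneously by different multiples, which pushes the two class sizes toward balance. Results: Validated on epidemiological blood glucose survey data from community residents, the weighted Fisher linear discriminant achieved higher sensitivity than the conventional Fisher linear discriminant, and classification performance improved markedly. Conclusion: The weighted Fisher linear discriminant is suitable for imbalanced datasets; the algorithm is simple and efficient and adds essentially no computational complexity.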
Objective: To introduce the application of the weighted Fisher linear discriminant to the classification of imbalanced medical datasets. Methods: Most two-class classification methods assume that their training sets are well balanced, but when the covariance matrices of the two classes are not identical, class imbalance degrades the performance of the Fisher linear discriminant. A weighted Fisher linear discriminant is introduced to reduce the negative effects of class imbalance. Results: Using a DM data set from a community survey, the new algorithm was compared with the conventional method. The experimental results show that the new algorithm achieves higher sensitivity and better classification performance than the conventional one. Conclusion: The weighted Fisher linear discriminant performs better on imbalanced medical datasets; the algorithm is simple and effective and adds essentially no computational complexity.
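The abstract does not give the weighting formula, so the following is only a minimal sketch of one common reading of the method: each class scatter matrix is multiplied by a class weight, which has the same effect as replicating (oversampling) that class by the same factor. The function names `weighted_fisher_direction` and `predict`, the default weights (inverse class sizes), and the midpoint threshold are assumptions made for illustration, not the authors' implementation.

```python
import numpy as np


def weighted_fisher_direction(X_pos, X_neg, w_pos=None, w_neg=None):
    """Sketch of a weighted Fisher linear discriminant for two classes.

    X_pos, X_neg: arrays of shape (n_samples, n_features).
    w_pos, w_neg: class weights applied to the class scatter matrices.
    Assumption: when omitted, weights default to 1 / class size, which
    corresponds to oversampling both classes until they are balanced.
    """
    n_pos, n_neg = len(X_pos), len(X_neg)
    if w_pos is None:
        w_pos = 1.0 / n_pos
    if w_neg is None:
        w_neg = 1.0 / n_neg

    m_pos = X_pos.mean(axis=0)
    m_neg = X_neg.mean(axis=0)

    # Class scatter matrices (sums of outer products, not covariances).
    S_pos = (X_pos - m_pos).T @ (X_pos - m_pos)
    S_neg = (X_neg - m_neg).T @ (X_neg - m_neg)

    # Weighted within-class scatter; replicating a class k times
    # multiplies its scatter by k and leaves its mean unchanged.
    S_w = w_pos * S_pos + w_neg * S_neg

    direction = np.linalg.solve(S_w, m_pos - m_neg)
    # Assumption: threshold at the midpoint of the projected class means.
    threshold = direction @ (m_pos + m_neg) / 2.0
    return direction, threshold


def predict(X, direction, threshold):
    """Return 1 for the positive class and 0 for the negative class."""
    return (X @ direction > threshold).astype(int)
```

With weights of 1 for both classes this reduces to the conventional Fisher linear discriminant, so the two variants can be compared on the same data by changing only the weights.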
Source
《数理医药学杂志》
2009, No. 1, pp. 59-61 (3 pages)
Journal of Mathematical Medicine
Funding
Guangdong Provincial Medical Research Fund, 2008 (B2008082)