Abstract
Based on an analysis of the results of the cover algorithm (CA), each cover of a given class of samples is treated as a Gaussian distribution, and the cover algorithm is optimized with the expectation-maximization (EM) algorithm through maximum-likelihood fitting of a finite mixture model. The iteration repeatedly adjusts the center and "radius" of each cover, together with the coefficients of their linear combination, and gradually approaches the optimal solution. The aim is to improve the precision of the cover algorithm. In text classification experiments, the means, variances, and linear combination coefficients were computed iteratively with EM, and the average precision obtained when the resulting parameters were used for testing was higher than both the highest classification precision of the original cover algorithm and the classification precision of SVM on the same data.
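The iteration described in the abstract can be illustrated with a minimal sketch of EM for a mixture of Gaussians. The abstract gives no formulas, so several choices here are assumptions rather than the authors' method: each cover is modeled as a spherical Gaussian, the initial standard deviation is set equal to the cover radius, the initial mixing coefficients are uniform, and the function names `log_gauss` and `em_refine` are hypothetical.

```python
import math


def log_gauss(x, mu, var):
    """Log density of a spherical Gaussian N(mu, var * I) at point x."""
    d = len(x)
    sq = sum((xi - mi) ** 2 for xi, mi in zip(x, mu))
    return -0.5 * (d * math.log(2 * math.pi * var) + sq / var)


def em_refine(points, centers, radii, n_iter=50, min_var=1e-6):
    """Refine cover centers, spreads, and mixing coefficients with EM.

    points  : list of samples of one class, each a list of floats
    centers : initial cover centers (assumed to come from the cover algorithm)
    radii   : initial cover radii; sigma = radius is an assumption of this sketch
    """
    k, n, d = len(centers), len(points), len(points[0])
    weights = [1.0 / k] * k                      # uniform start (assumption)
    means = [list(c) for c in centers]
    variances = [max(r * r, min_var) for r in radii]

    for _ in range(n_iter):
        # E-step: responsibility of each component for each point,
        # computed in log space to avoid underflow.
        resp = []
        for x in points:
            logs = [math.log(weights[j]) + log_gauss(x, means[j], variances[j])
                    for j in range(k)]
            top = max(logs)
            exps = [math.exp(v - top) for v in logs]
            total = sum(exps)
            resp.append([e / total for e in exps])

        # M-step: re-estimate mixing coefficient, mean, and variance.
        for j in range(k):
            nj = max(sum(r[j] for r in resp), 1e-12)  # guard against empty component
            weights[j] = nj / n
            means[j] = [sum(r[j] * x[i] for r, x in zip(resp, points)) / nj
                        for i in range(d)]
            sq = sum(r[j] * sum((xi - mi) ** 2 for xi, mi in zip(x, means[j]))
                     for r, x in zip(resp, points))
            variances[j] = max(sq / (nj * d), min_var)

    return weights, means, variances
```

Calling `em_refine(points, centers, radii)` on the samples of one class returns the updated mixing coefficients, centers, and variances. How these parameters are then used to classify test documents is not specified in the abstract and is left out of the sketch.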
Source
Computer Technology and Development (《计算机技术与发展》)
2010, No. 6, pp. 18-20, 24 (4 pages)
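Funding
Anhui Provincial Philosophy and Social Science Planning Fund (AHSKF07-08D13)
Anhui Provincial Humanities and Social Science Research Fund (2009sk038)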
Keywords
finite mixture model
EM algorithm
cover algorithm
text classification