Abstract
This paper proposes a classification algorithm based on the mathematical expectation of the classes already present in the training set. The algorithm first maps discrete attribute values to corresponding numeric values and computes the mathematical expectation of each attribute within each class; these per-attribute expectations serve as the coordinates of that class. To classify a new record, its attribute values are taken as coordinates, its distance to each class is computed, and the record is assigned to the class at the shortest distance. The algorithm is not restricted by whether attributes are discrete or by the number of classes, so it can be applied when attribute types are mixed (both discrete and continuous attributes) and when the number of classes is large.
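The procedure in the abstract can be illustrated with a minimal Python sketch. The abstract does not specify the discrete-to-numeric mapping or the distance measure, so this sketch assumes integer codes assigned in order of first appearance and Euclidean distance; the function names (`fit_encoders`, `encode`, `fit_centroids`, `predict`) and the toy data are hypothetical and are not taken from the paper.

```python
from collections import defaultdict
from math import sqrt


def fit_encoders(rows):
    """Assign each distinct string value in a column a numeric code.

    Assumption: codes follow order of first appearance (0.0, 1.0, ...).
    """
    encoders = defaultdict(dict)
    for row in rows:
        for j, value in enumerate(row):
            if isinstance(value, str) and value not in encoders[j]:
                encoders[j][value] = float(len(encoders[j]))
    return dict(encoders)


def encode(row, encoders):
    """Turn a mixed discrete/continuous record into a numeric vector.

    Raises KeyError for a discrete value never seen during training.
    """
    return [encoders[j][v] if j in encoders else float(v)
            for j, v in enumerate(row)]


def fit_centroids(rows, labels, encoders):
    """Compute the per-class mean (sample expectation) of every attribute."""
    sums, counts = {}, defaultdict(int)
    for row, label in zip(rows, labels):
        x = encode(row, encoders)
        if label not in sums:
            sums[label] = [0.0] * len(x)
        sums[label] = [s + v for s, v in zip(sums[label], x)]
        counts[label] += 1
    return {label: [s / counts[label] for s in total]
            for label, total in sums.items()}


def predict(row, centroids, encoders):
    """Return the class whose centroid is nearest (assumed Euclidean distance)."""
    x = encode(row, encoders)

    def distance(center):
        return sqrt(sum((a - b) ** 2 for a, b in zip(x, center)))

    return min(centroids, key=lambda label: distance(centroids[label]))


# Toy training set: one discrete attribute (colour) and one continuous attribute.
train_rows = [("red", 1.0), ("red", 1.2), ("blue", 5.0), ("blue", 5.4)]
train_labels = ["A", "A", "B", "B"]

encoders = fit_encoders(train_rows)
centroids = fit_centroids(train_rows, train_labels, encoders)

print(predict(("red", 1.5), centroids, encoders))
print(predict(("blue", 4.8), centroids, encoders))
```

On this toy data the two queries are assigned to classes A and B respectively. Because every class is summarised by a single point, prediction cost grows with the number of classes rather than the number of training records; the result does depend on the assumed numeric codes and on attribute scales, which the abstract leaves open.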
Source
《安庆师范学院学报(自然科学版)》
2013, No. 1, pp. 31-34 (4 pages)
Journal of Anqing Teachers College (Natural Science Edition)
Funding
Supported by the Anhui Provincial Natural Science Research Project for Higher Education Institutions (KJ2012B065)
Keywords
data mining, classification, numerical characteristics, mathematical expectation