Abstract
In massive data processing, large numbers of similar features introduce redundant interference into data classification and lead to repeated verification and duplication when classification centers are determined. To address these drawbacks, a high-performance method for eliminating redundant data under massive data interference is proposed. An active sampling method is used to extract the features of redundant data under massive data interference and to classify them. A mean shift transfer function is then introduced to classify the redundant data and obtain its activity level, thereby achieving high-performance elimination of redundant data. The results show that, compared with traditional elimination methods, the proposed method performs well, requires less time, and has certain advantages.
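The abstract gives no formulas, so the following is only a minimal sketch of the general idea of grouping near-duplicate feature vectors with mean shift and keeping one record per group. It is a generic flat-kernel mean shift, not the paper's transfer function; the active sampling step and the activity measure are not modeled. The function names, the `bandwidth` parameter, and the merge threshold of half the bandwidth are all assumptions made for illustration.

```python
import math

def mean_shift_modes(points, bandwidth, max_iter=100, tol=1e-6):
    """Move each point to the mean of its neighbours (flat kernel) until it settles."""
    modes = []
    for p in points:
        x = list(p)
        for _ in range(max_iter):
            # Neighbours of the current estimate within the bandwidth radius.
            near = [q for q in points if math.dist(x, q) <= bandwidth]
            if not near:  # defensive guard against floating-point edge cases
                break
            new_x = [sum(c) / len(near) for c in zip(*near)]
            moved = math.dist(x, new_x)
            x = new_x
            if moved < tol:
                break
        modes.append(x)
    return modes

def eliminate_redundant(points, bandwidth):
    """Keep the first record of every mean-shift cluster (assumed redundancy rule)."""
    modes = mean_shift_modes(points, bandwidth)
    centers = []  # one converged mode per discovered cluster
    kept = []     # representative record for each cluster
    for p, m in zip(points, modes):
        # A mode farther than bandwidth / 2 from all known centers starts a new cluster.
        if all(math.dist(m, c) > bandwidth / 2 for c in centers):
            centers.append(m)
            kept.append(p)
    return kept

# Hypothetical feature vectors: three near-duplicates and two near-duplicates.
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0), (5.1, 5.0)]
print(eliminate_redundant(points, bandwidth=1.0))
```

In this toy input the five records collapse to two representatives, one per dense region. The choice of bandwidth controls how aggressively records are treated as redundant, and would need tuning for real data.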
Source
《沈阳工业大学学报》
EI
CAS
Peking University Core Journal
2017, Issue 6, pp. 686-690 (5 pages)
Journal of Shenyang University of Technology
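Funding
Science and Technology Research Project of the Henan Provincial Department of Science and Technology (172102210445)
Soft Science Research Project of the Henan Provincial Department of Science and Technology (152400410345)
Project of the Henan Provincial Department of Education (15A520093)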
Keywords
massive data
interference
redundant data
high performance
elimination method
improvement
mean value
transfer function