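To address the lack of identification and detection of unstable data in current cloud computing servers, an unstable-data mining system for cloud servers is designed and implemented. The overall structure of the system is introduced. A data sampling and preprocessing module maps source data to mining data and performs discretization, data filtering, and related processing. Based on the 2.0 mm ERmet Hard Metric connector and the RapidIO protocol, an interface module handles data transfer so as to meet the requirements for signal transmission efficiency and stability. A data mining module identifies and mines unstable data in the cloud server and passes the mining results to a control module for processing. In the software design, the system is analyzed in detail, and the implementation process of unstable data mining and part of the system's program code are given. Experimental results show that the designed system is highly practical and reliable.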
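The data filtering and discretization stage of the preprocessing module can be illustrated with a minimal sketch. The abstract does not give the actual filtering rules or binning scheme, so everything here is an assumption for illustration: the function name `preprocess`, the valid range `[low, high]`, and the choice of equal-width bins are all hypothetical.

```python
import math
from typing import Iterable, List, Optional


def preprocess(raw: Iterable[Optional[float]],
               low: float, high: float, bins: int) -> List[int]:
    """Filter raw readings, then discretize them into equal-width bins.

    Assumed rules (not taken from the paper): missing values, NaN values,
    and readings outside [low, high] are dropped; the rest are mapped to
    a bin index in 0..bins-1.
    """
    if bins < 1 or high <= low:
        raise ValueError("need bins >= 1 and high > low")
    width = (high - low) / bins
    result: List[int] = []
    for x in raw:
        # Data filtering: skip missing, NaN, and out-of-range readings.
        if x is None or math.isnan(x) or not (low <= x <= high):
            continue
        # Discretization: equal-width bin index; x == high goes in the last bin.
        result.append(min(int((x - low) / width), bins - 1))
    return result
```

For example, `preprocess([0.0, 2.5, None, 10.0, 11.0], 0.0, 10.0, 4)` keeps three readings and maps them to bin indices, dropping the missing and out-of-range values.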
A novel data stream partitioning method is proposed to resolve the problems of range-aggregation continuous queries over parallel streams in the power industry. The first step of the method is to sample the data in parallel, which is implemented as an extended reservoir-sampling algorithm. A skip factor based on the change ratio of data values is introduced to describe the distribution characteristics of the data values adaptively. The second step is to partition the fluxes of the data streams evenly, which is implemented with two alternative equal-depth histogram generating algorithms that fit different cases: one for incremental maintenance based on heuristics, and the other for periodic updates that generate an approximate partition vector. Experimental results on actual data show that the method is efficient, practical, and suitable for processing time-varying data streams.
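For orientation, the classic reservoir-sampling algorithm that the first step extends can be sketched as follows. This is the textbook single-stream version (often called Algorithm R), not the paper's extended parallel variant. The adaptive skip factor is not reproduced because the abstract does not define it. The function name and the `seed` parameter are illustrative choices.

```python
import random
from typing import Iterable, List, Optional


def reservoir_sample(stream: Iterable[float], k: int,
                     seed: Optional[int] = None) -> List[float]:
    """Return a uniform random sample of up to k items from a stream.

    Textbook reservoir sampling: one pass, O(k) memory, and the stream
    length does not need to be known in advance.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    rng = random.Random(seed)
    reservoir: List[float] = []
    for i, value in enumerate(stream):
        if i < k:
            # Fill the reservoir with the first k items.
            reservoir.append(value)
        else:
            # Keep the new item with probability k / (i + 1) by drawing
            # a slot index uniformly from 0..i (inclusive).
            j = rng.randint(0, i)
            if j < k:
                reservoir[j] = value
    return reservoir
```

Calling `reservoir_sample(readings, 100)` on an iterator of readings yields at most 100 values, each item of the stream being equally likely to appear in the result.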
Funding: The High Technology Research Plan of Jiangsu Province (No. BG2004034); the Foundation of Graduate Creative Program of Jiangsu Province (No. xm04-36).
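The idea behind an equal-depth partition vector can be shown with a small sketch: given a sample of data values, choose boundaries so that each partition receives roughly the same number of values. This is a plain sort-based construction under assumed names (`equal_depth_boundaries`, `assign_partition`). It corresponds to neither of the paper's two algorithms (heuristic incremental maintenance and periodic update), whose details the abstract does not give.

```python
from bisect import bisect_right
from typing import List, Sequence


def equal_depth_boundaries(sample: Sequence[float], parts: int) -> List[float]:
    """Return parts - 1 boundary values that split the sample into
    partitions holding roughly equal numbers of values."""
    if parts < 1:
        raise ValueError("parts must be at least 1")
    if not sample:
        raise ValueError("sample must not be empty")
    ordered = sorted(sample)
    n = len(ordered)
    # The i-th boundary is the value at the i/parts quantile position.
    return [ordered[(i * n) // parts] for i in range(1, parts)]


def assign_partition(value: float, boundaries: Sequence[float]) -> int:
    """Map a value to a partition index in 0..len(boundaries)."""
    # Values equal to a boundary fall into the partition to its right.
    return bisect_right(boundaries, value)
```

With boundaries computed from a sample, each incoming tuple can be routed by `assign_partition(value, boundaries)`. When the value distribution drifts, the boundaries have to be recomputed, which is the situation the paper's incremental and periodic update algorithms are meant to handle.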