
Research on a Rough Set Mass Data Partition Algorithm Based on Attribute Reduction (cited by: 1)

Mass Data Partition for Rough Set on Attribute Reduction Algorithm
Abstract: Drawing on rough set theory, this paper studies a key problem in the distributed processing of mass data: how to partition a massive data set. Classical rough set algorithms require the data to reside in memory, so they cannot handle mass data effectively. To process mass data sets directly, the paper proposes, based on the definition of the best partition and the idea of attribute reduction, a rough set mass data partition algorithm based on attribute reduction (Mass Data Partition for Rough Set on Attribute Reduction, MDPRS-AR). Experiments show that MDPRS-AR partitions about 70% more efficiently than traditional algorithms, with little loss of correctness compared with algorithms that process the whole data set.
Source: Computer Technology and Development (《计算机技术与发展》), 2010, No. 4, pp. 5-7, 11 (4 pages)
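Funding: National Natural Science Foundation of China (60973139, 60773041); Natural Science Foundation of Jiangsu Province (BK2008451); National High-Tech 863 Program (2007AA01Z404, 2007AA01Z478); State Key Laboratory of Modern Communications Fund (9140C1105040805); National and Jiangsu Province Postdoctoral Funds (0801019C, 20090451240, 20090451241); Jiangsu Universities Science and Technology Innovation Program (CX08B-086Z); Jiangsu Province Six Talent Peaks Project (2008118); Jiangsu Province Qing Lan Project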
Keywords: mass data; rough set; data partition; distributed processing; attribute reduction
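
The abstract names attribute reduction as the basis for partitioning but does not give the procedure. As a rough illustration of the underlying rough set notions only (not the MDPRS-AR algorithm itself), the sketch below computes indiscernibility classes, the dependency degree of a decision attribute on a set of condition attributes, and a greedy reduct. All function names, the toy decision table, and the greedy selection strategy are assumptions made for this example.

```python
from collections import defaultdict


def partition(rows, attrs):
    """Group row indices into indiscernibility classes over `attrs`."""
    blocks = defaultdict(list)
    for i, row in enumerate(rows):
        blocks[tuple(row[a] for a in attrs)].append(i)
    return list(blocks.values())


def dependency(rows, cond_attrs, dec_attr):
    """Fraction of rows in the positive region, i.e. rows whose
    indiscernibility class carries a single decision value."""
    if not rows:
        return 1.0
    positive = 0
    for block in partition(rows, cond_attrs):
        if len({rows[i][dec_attr] for i in block}) == 1:
            positive += len(block)
    return positive / len(rows)


def greedy_reduct(rows, cond_attrs, dec_attr):
    """Add attributes one at a time (largest dependency gain first) until
    the dependency degree of the full condition set is reached.
    Greedy selection is an assumption; it need not yield a minimal reduct."""
    target = dependency(rows, cond_attrs, dec_attr)
    chosen = []
    while dependency(rows, chosen, dec_attr) < target:
        best = max(
            (a for a in cond_attrs if a not in chosen),
            key=lambda a: dependency(rows, chosen + [a], dec_attr),
        )
        chosen.append(best)
    return chosen


# Hypothetical toy decision table: condition attributes a, b, c; decision d.
rows = [
    {"a": 0, "b": 0, "c": 1, "d": "no"},
    {"a": 0, "b": 1, "c": 1, "d": "yes"},
    {"a": 1, "b": 0, "c": 0, "d": "no"},
    {"a": 1, "b": 1, "c": 0, "d": "yes"},
]

reduct = greedy_reduct(rows, ["a", "b", "c"], "d")
blocks = partition(rows, reduct)
print(reduct, blocks)
```

One plausible use of such a reduct, again an assumption rather than the paper's stated method, is to group rows by their values on the reduct attributes so that indiscernible rows stay in the same block when the data set is split across workers.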

