Abstract
Rough sets and formal concept analysis perform sample operations on databases without any theory of the underlying population. To address this, the paper proposes a sample theory for relational databases that takes the factor space as the population, providing a new basis of credibility for non-traditional probabilistic and statistical methods. It reviews the history and achievements of factor space theory, describes its relationship to relational databases, and argues that factor space is the most fitting mathematical foundation for data science. The results show that the sample theory built on factor spaces differs essentially from traditional probability and statistics: samples are not only the basis of analysis but also objects of cultivation. For a convex background relation, an analyst facing a big data stream needs only to hold a small number of sample base points and adjust them by rule as data arrive in order to obtain the complete information about the population.
Source
《辽宁工程技术大学学报(自然科学版)》
CAS
PKU Core (北大核心)
2015, Issue 2, pp. 273-280 (8 pages)
Journal of Liaoning Technical University (Natural Science)
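Funding
National Natural Science Foundation of China (61350003)
Specialized Research Fund for the Doctoral Program of Higher Education, Ministry of Education of China (20102121110002)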
Keywords
factor space
factor database
independence and correlation of factors
factor background relation
basic sample
sample cultivation
rough sets
formal concept analysis