Random Forest-Based Fast Integration of Multi-Source Small-Sample Data
Abstract: Because the attributes of multi-source small-sample data are complex, integrating such data is prone to pronounced overfitting and underfitting. To address this, the article proposes a fast integration method for multi-source small-sample data based on random forest. Taking the attribute characteristics of the data into account, the model-construction stage exploits the close fit between granular vectors and the features of multi-source small-sample data and uses granular vectors as the basic structure of the random forest. A granulation layer normalizes the multi-source small-sample data, and the granulation results it outputs serve as the nodes of the decision layer. In the integration stage, the data are integrated according to the distance between the multi-source small-sample data and the decision-layer nodes. In the test results, overfitting accounted for only 0.29% of data-integration cases and underfitting for only 0.27%, indicating good integration performance.
Authors: 何昀 (HE Yun), 张川 (ZHANG Chuan), 张继夫 (ZHANG Jifu), 陈伟 (CHEN Wei); Aviation University of Air Force, Changchun, Jilin 130021, China
Affiliation: Aviation University of Air Force (空军航空大学)
Source: Information & Computer (《信息与电脑》), 2024, No. 1, pp. 52-54 (3 pages)
Keywords: random forest; multi-source small-sample data; fast integration; attribute characteristics; random forest model
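
The two stages named in the abstract (normalizing each source in a granulation layer, then integrating records by their distance to decision-layer nodes) can be illustrated with a minimal sketch. It is not the authors' implementation. The record gives no formulas, so per-attribute min-max scaling, Euclidean distance, and the function names `min_max_normalize`, `nearest_node`, and `integrate` are all assumptions made for illustration, and no random forest is built here.

```python
import math


def min_max_normalize(rows):
    """Scale each attribute column of one source to [0, 1].

    Assumption: stands in for the granulation layer's normalization.
    """
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]


def nearest_node(sample, nodes):
    """Return the index of the node closest to `sample`.

    Assumption: Euclidean distance; the abstract only says "distance".
    """
    return min(range(len(nodes)), key=lambda i: math.dist(sample, nodes[i]))


def integrate(sources, nodes):
    """Group normalized records from every source under their nearest node."""
    groups = {i: [] for i in range(len(nodes))}
    for name, rows in sources.items():
        for row in min_max_normalize(rows):
            groups[nearest_node(row, nodes)].append((name, row))
    return groups


# Hypothetical toy data: two sources with different attribute scales.
sources = {
    "a": [[0, 10], [2, 12], [10, 30]],
    "b": [[100, 1], [200, 3]],
}
# Hypothetical decision-layer nodes in the normalized space.
nodes = [[0.0, 0.0], [1.0, 1.0]]
groups = integrate(sources, nodes)
```

Normalizing each source separately before measuring distance keeps a source with large raw values from dominating the grouping, which is one plausible reading of why the abstract places normalization ahead of the decision layer.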