
FSSD Algorithm Based on Ensemble Feature Selection (cited by: 2)
Abstract: Fast and efficient subgroup set discovery (FSSD) is a subgroup discovery algorithm that aims to return a diverse pattern set in a short time. To reduce running time, however, FSSD selects a feature subset whose features have small domains; when that subset is irrelevant or only weakly related to the target class, the quality of the pattern set drops. To address this problem, this study proposes an FSSD algorithm based on ensemble feature selection. In the preprocessing stage, ensemble feature selection based on ReliefF (Relief-F) and analysis of variance (ANOVA) yields a feature subset that is both diverse and strongly relevant, and FSSD is then run on it to return a high-quality pattern set. Experiments on UCI datasets and the National Health and Nutrition Examination Survey (NHANES) dataset show that the improved FSSD algorithm raises the quality of the pattern set and summarizes more interesting knowledge. On the NHANES dataset, the feature validity and positive predictive value of the pattern set are analyzed further.
Authors: ZHANG Yin (张崟), HE Zhen-Feng (何振峰) (College of Mathematics and Computer Science, Fuzhou University, Fuzhou 350108, China)
Source: Computer Systems & Applications (《计算机系统应用》), 2022, No. 3, pp. 275-281 (7 pages)
Funding: Natural Science Foundation of Fujian Province (2018J01794).
Keywords: subgroup discovery; ensemble feature selection; ReliefF; analysis of variance
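
Illustrative sketch (not taken from the paper): the abstract says that ReliefF and analysis of variance are combined in a preprocessing step, but it does not say how the two scores are aggregated or how many features are kept. The Python below is one possible reading under stated assumptions: features are numeric, ReliefF is a simplified variant that uses every sample as a probe, and the two rankings are merged by averaging ranks. The function names `anova_f_scores`, `relieff_scores`, and `ensemble_select`, the parameter `n_neighbors`, and the synthetic data are all hypothetical.

```python
import numpy as np

def anova_f_scores(X, y):
    """One-way ANOVA F statistic per feature (between-class over within-class variance)."""
    classes = np.unique(y)
    n, k = X.shape[0], classes.size
    grand_mean = X.mean(axis=0)
    ss_between = np.zeros(X.shape[1])
    ss_within = np.zeros(X.shape[1])
    for c in classes:
        Xc = X[y == c]
        ss_between += Xc.shape[0] * (Xc.mean(axis=0) - grand_mean) ** 2
        ss_within += ((Xc - Xc.mean(axis=0)) ** 2).sum(axis=0)
    # The small epsilon guards against division by zero for constant features.
    return (ss_between / (k - 1)) / (ss_within / (n - k) + 1e-12)

def relieff_scores(X, y, n_neighbors=5):
    """Simplified ReliefF for numeric features (assumption: every sample is a probe)."""
    span = X.max(axis=0) - X.min(axis=0)
    span[span == 0] = 1.0
    Xn = (X - X.min(axis=0)) / span            # scale each feature to [0, 1]
    classes, counts = np.unique(y, return_counts=True)
    prior = dict(zip(classes, counts / y.size))
    weights = np.zeros(X.shape[1])
    for i in range(X.shape[0]):
        diff = np.abs(Xn - Xn[i])              # per-feature difference to sample i
        dist = diff.sum(axis=1)                # Manhattan distance to sample i
        for c in classes:
            idx = np.flatnonzero(y == c)
            idx = idx[idx != i]                # never use the probe as its own neighbor
            if idx.size == 0:
                continue
            nearest = idx[np.argsort(dist[idx])[:n_neighbors]]
            mean_diff = diff[nearest].mean(axis=0)
            if c == y[i]:
                weights -= mean_diff           # near hits: differing is penalized
            else:
                # near misses: differing is rewarded, weighted by class prior
                weights += prior[c] / (1.0 - prior[y[i]]) * mean_diff
    return weights / X.shape[0]

def ensemble_select(X, y, n_features):
    """Assumed aggregation rule: average each feature's rank under both scorers."""
    ranks = []
    for scores in (anova_f_scores(X, y), relieff_scores(X, y)):
        ranks.append(np.argsort(np.argsort(-scores)))   # rank 0 = highest score
    mean_rank = np.mean(ranks, axis=0)
    return np.argsort(mean_rank, kind="stable")[:n_features]

# Synthetic data (hypothetical): only features 0 and 3 depend on the class label.
rng = np.random.default_rng(0)
y = rng.integers(0, 2, size=200)
X = rng.normal(size=(200, 6))
X[:, 0] += 3.0 * y
X[:, 3] -= 3.0 * y

selected = ensemble_select(X, y, n_features=2)
print(sorted(selected.tolist()))
```

In a pipeline like the one the abstract outlines, the selected column indices would be used to restrict the dataset before it is passed to FSSD; rank averaging is chosen here only because the F statistic and the ReliefF weight live on different scales.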