FSSD Algorithm Based on Ensemble Feature Selection

Cited by: 2

Abstract: Fast and efficient subgroup set discovery (FSSD) is a subgroup discovery algorithm that aims to provide a diverse set of patterns in a short time. To reduce running time, however, the algorithm selects a feature subset whose features have a small number of domain values. When that feature subset is irrelevant or only weakly related to the target class, the quality of the pattern set decreases. To address this problem, this study proposes an FSSD algorithm based on ensemble feature selection. In the preprocessing stage, it uses ensemble feature selection based on ReliefF (Relief-F) and analysis of variance to obtain a feature subset that is both diverse and strongly correlated with the target, and then uses the FSSD algorithm to return a high-quality pattern set. Experimental results on UCI datasets and the National Health and Nutrition Examination Survey (NHANES) dataset show that the improved FSSD algorithm raises the quality of the pattern set and thereby summarizes more interesting knowledge. On the NHANES dataset, the feature validity and positive predictive value of the pattern set are analyzed further.
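The preprocessing idea in the abstract, scoring features with both a Relief-style method and analysis of variance and then combining the two, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the abstract does not say how the two rankings are combined, so the sketch simply sums the ranks; the Relief routine is a simplified variant (nearest misses are pooled across all other classes rather than weighted by class prior as in full ReliefF); and the function names `anova_f`, `relief_scores`, and `ensemble_select` are hypothetical. The FSSD step itself is not shown.

```python
from statistics import mean


def anova_f(column, labels):
    """One-way ANOVA F statistic of one numeric feature across class labels."""
    groups = {}
    for x, y in zip(column, labels):
        groups.setdefault(y, []).append(x)
    grand = mean(column)
    k, n = len(groups), len(column)
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups.values())
    ss_within = sum((x - mean(g)) ** 2 for g in groups.values() for x in g)
    if ss_within == 0:
        # Degenerate case: no within-class variation at all.
        return float("inf") if ss_between > 0 else 0.0
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def relief_scores(rows, labels, k=1):
    """Simplified Relief-style weights (assumption: misses pooled over classes)."""
    n_feat = len(rows[0])
    lo = [min(r[j] for r in rows) for j in range(n_feat)]
    # Range of each feature; constant features get span 1.0 to avoid dividing by 0.
    span = [(max(r[j] for r in rows) - lo[j]) or 1.0 for j in range(n_feat)]

    def diff(a, b, j):
        return abs(a[j] - b[j]) / span[j]

    def dist(a, b):
        return sum(diff(a, b, j) for j in range(n_feat))

    weights = [0.0] * n_feat
    for i, (r, y) in enumerate(zip(rows, labels)):
        others = [t for t in range(len(rows)) if t != i]
        by_dist = sorted(others, key=lambda t: dist(r, rows[t]))
        hits = [t for t in by_dist if labels[t] == y][:k]
        misses = [t for t in by_dist if labels[t] != y][:k]
        for j in range(n_feat):
            if hits:  # same-class neighbours should look similar
                weights[j] -= sum(diff(r, rows[t], j) for t in hits) / len(hits)
            if misses:  # other-class neighbours should look different
                weights[j] += sum(diff(r, rows[t], j) for t in misses) / len(misses)
    return [w / len(rows) for w in weights]


def ranks(scores):
    """Rank 0 for the highest score, 1 for the next, and so on."""
    order = sorted(range(len(scores)), key=lambda j: -scores[j])
    rank = [0] * len(scores)
    for pos, j in enumerate(order):
        rank[j] = pos
    return rank


def ensemble_select(rows, labels, n_keep, k=1):
    """Keep the n_keep features with the best summed rank (assumed aggregation)."""
    n_feat = len(rows[0])
    f_scores = [anova_f([r[j] for r in rows], labels) for j in range(n_feat)]
    r_scores = relief_scores(rows, labels, k)
    total = [a + b for a, b in zip(ranks(f_scores), ranks(r_scores))]
    best = sorted(range(n_feat), key=lambda j: (total[j], j))[:n_keep]
    return sorted(best)


if __name__ == "__main__":
    # Toy data: feature 0 separates the classes, feature 1 is noise, feature 2 is constant.
    rows = [[0.1, 5.0, 1.0], [0.2, 1.0, 1.0], [0.15, 3.0, 1.0],
            [0.9, 4.9, 1.0], [0.8, 1.2, 1.0], [0.95, 3.1, 1.0]]
    labels = [0, 0, 0, 1, 1, 1]
    print(ensemble_select(rows, labels, n_keep=1))
```

Summing ranks rather than raw scores keeps the two criteria on a comparable scale, since F statistics and Relief weights have very different ranges. The selected column indices would then be passed to the subgroup discovery step in place of the original feature subset.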

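Authors: ZHANG Yin; HE Zhen-Feng (College of Mathematics and Computer Science, Fuzhou University, Fuzhou 350108, China)

Affiliation: College of Mathematics and Computer Science, Fuzhou University

Source: Computer Systems & Applications (计算机系统应用), 2022, Issue 3, pp. 275-281 (7 pages)

Funding: Natural Science Foundation of Fujian Province (2018J01794).

Keywords: subgroup discovery; ensemble feature selection; ReliefF; analysis of variance

Classification: TP18 [Automation and Computer Technology: Control Theory and Control Engineering]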

