Ensemble Filter-Wrapper Text Feature Selection Methods for Text Classification

Abstract: Feature selection is a crucial technique in text classification: by reducing the dataset's dimensionality, it improves the efficiency and effectiveness of classifiers and other machine learning techniques. This involves eliminating irrelevant, redundant, and noisy features to streamline the classification process. Various methods, from single feature selection techniques to ensemble filter-wrapper methods, have been used in the literature. Metaheuristic algorithms have become popular due to their ability to handle optimization complexity and the continuous influx of text documents. Feature selection is inherently multi-objective, balancing feature relevance and accuracy against the reduction of redundant features. This research presents a two-fold objective for feature selection. The first objective is to identify the top-ranked features using an ensemble of three multi-univariate filter methods: Information Gain (Infogain), Chi-Square (Chi²), and Analysis of Variance (ANOVA). This aims to maximize feature relevance while minimizing redundancy. The second objective is to reduce the number of selected features and increase accuracy through a hybrid approach combining Artificial Bee Colony (ABC) and Genetic Algorithms (GA). This hybrid method operates in a wrapper framework to identify the most informative subset of text features. Support Vector Machine (SVM) was employed as the performance evaluator for the proposed model, which was tested on two high-dimensional multiclass datasets. The experimental results demonstrated that the ensemble filter combined with the ABC+GA hybrid approach is a promising solution for text feature selection, offering superior performance compared to other existing feature selection algorithms.
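The first stage described in the abstract, ranking features with an ensemble of Information Gain, Chi-Square, and ANOVA, can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation. The abstract does not say how the three rankings are combined, so mean-rank aggregation is an assumption here, as are the binary term-presence encoding, the toy data, and all function names. The ABC+GA wrapper stage is not shown.

```python
import math
from collections import Counter, defaultdict

def entropy(labels):
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def info_gain(col, y):
    """H(Y) minus the conditional entropy of Y given the feature value."""
    groups = defaultdict(list)
    for v, label in zip(col, y):
        groups[v].append(label)
    cond = sum(len(g) / len(y) * entropy(g) for g in groups.values())
    return entropy(y) - cond

def chi_square(col, y):
    """Pearson chi-square statistic of the (feature value x class) table."""
    n = len(y)
    obs = Counter(zip(col, y))
    val_tot, cls_tot = Counter(col), Counter(y)
    stat = 0.0
    for v in val_tot:
        for c in cls_tot:
            exp = val_tot[v] * cls_tot[c] / n
            stat += (obs[(v, c)] - exp) ** 2 / exp
    return stat

def anova_f(col, y):
    """One-way ANOVA F statistic: between-class over within-class variance."""
    groups = defaultdict(list)
    for v, label in zip(col, y):
        groups[label].append(v)
    n, k = len(col), len(groups)
    if k < 2 or n <= k:
        return 0.0
    grand = sum(col) / n
    between = sum(len(g) * (sum(g) / len(g) - grand) ** 2 for g in groups.values())
    within = sum((v - sum(g) / len(g)) ** 2 for g in groups.values() for v in g)
    if within == 0:
        return math.inf if between > 0 else 0.0
    return (between / (k - 1)) / (within / (n - k))

def ranks(scores):
    """Rank 0 is the highest score; ties are broken by feature index."""
    order = sorted(range(len(scores)), key=lambda j: (-scores[j], j))
    r = [0] * len(scores)
    for pos, j in enumerate(order):
        r[j] = pos
    return r

def ensemble_top_k(X, y, k):
    """Assumed aggregation rule: average each feature's rank over the three filters."""
    cols = list(zip(*X))
    per_filter = [ranks([f(c, y) for c in cols])
                  for f in (info_gain, chi_square, anova_f)]
    mean_rank = [sum(r[j] for r in per_filter) / len(per_filter)
                 for j in range(len(cols))]
    return sorted(range(len(cols)), key=lambda j: (mean_rank[j], j))[:k]

# Hypothetical toy corpus: rows are documents, columns are term-presence flags.
X = [
    [1, 0, 1, 0],
    [1, 0, 0, 1],
    [0, 1, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
    [0, 0, 0, 1],
]
y = ["sports", "sports", "politics", "politics", "tech", "tech"]

selected = ensemble_top_k(X, y, k=2)
print(selected)
```

In this toy corpus, terms 0 and 1 each occur in only one class, while terms 2 and 3 are spread evenly across all classes, so the ensemble keeps the two class-specific terms. In the pipeline the abstract describes, a reduced feature set like this would then be passed to the ABC+GA wrapper, with an SVM scoring candidate subsets.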

Authors: Oluwaseun Peter Ige, Keng Hoon Gan

Affiliations: School of Computer Sciences; Universal Basic Education Commission

Source: Computer Modeling in Engineering & Sciences (SCIE, EI; English-language journal), 2024, Issue 11, pp. 1847-1865 (19 pages)

Funding: Supported by Universiti Sains Malaysia (USM) and the School of Computer Sciences, USM.

Keywords: metaheuristic algorithms; text classification; multi-univariate filter feature selection; ensemble filter-wrapper techniques

Classification code: TP3 [Automation and Computer Technology: Computer Science and Technology]

