579 articles found
Multi-Label Feature Selection Based on Improved Ant Colony Optimization Algorithm with Dynamic Redundancy and Label Dependence
1
Authors: Ting Cai, Chun Ye, Zhiwei Ye, Ziyuan Chen, Mengqing Mei, Haichao Zhang, Wanfang Bai, Peng Zhang. Journal: Computers, Materials & Continua (SCIE, EI), 2024, No. 10, pp. 1157-1175, 19 pages.
The world produces vast quantities of high-dimensional multi-semantic data. However, extracting valuable information from such a large amount of high-dimensional, multi-label data is arduous. Feature selection aims to mitigate the adverse impacts of high dimensionality in multi-label data by eliminating redundant and irrelevant features. The ant colony optimization algorithm has shown encouraging results in multi-label feature selection because of its simplicity, efficiency, and similarity to reinforcement learning. Nevertheless, existing methods do not consider crucial correlation information such as dynamic redundancy and label correlation. To address this, the paper proposes a multi-label feature selection technique based on the ant colony optimization algorithm (MFACO) that focuses on dynamic redundancy and label correlation. First, the dynamic redundancy between the selected feature subset and the candidate features is assessed. Meanwhile, the ant colony optimization algorithm extracts label correlation from the label set, which is then incorporated into the heuristic factor as label weights. Experimental results demonstrate that the proposed strategies effectively enhance the search ability of the ant colony and outperform the other algorithms compared in the paper.
Keywords: multi-label feature selection; ant colony optimization algorithm; dynamic redundancy; high-dimensional data; label correlation
Oncological features and prognosis of colorectal cancer in human immunodeficiency virus-positive patients: A retrospective study
2
Authors: Fu-Yu Yang, Fan He, De-Fei Chen, Cheng-Lin Tang, Saed Woraikat, Yao Li, Kun Qian. Journal: World Journal of Gastrointestinal Surgery (SCIE), 2024, No. 1, pp. 29-39, 11 pages.
BACKGROUND: Due to the prolonged life expectancy and increased risk of colorectal cancer (CRC) among patients with human immunodeficiency virus (HIV) infection, the prognosis and pathological features of CRC in HIV-positive patients require examination. AIM: To compare the differences in oncological features, surgical safety, and prognosis between patients with and without HIV infection who have CRC at the same tumor stage and site. METHODS: In this retrospective study, we collected data from HIV-positive and HIV-negative patients who underwent radical resection for CRC. Using random stratified sampling, 24 HIV-positive and 363 HIV-negative patients with colorectal adenocarcinoma after radical resection were selected. Using propensity score matching, we selected 72 patients, matched 1:2 (HIV-positive:HIV-negative = 24:48). Differences in basic characteristics, HIV acquisition, perioperative serological indicators, surgical safety, oncological features, and long-term prognosis were compared between the two groups. RESULTS: Fewer patients with HIV infection underwent chemotherapy than patients without. Compared to HIV-negative patients, HIV-positive patients had fewer preoperative and postoperative leukocytes, fewer preoperative lymphocytes, lower carcinoembryonic antigen levels, more intraoperative blood loss, more metastatic lymph nodes, higher node stage, higher tumor node metastasis stage, shorter overall survival, and shorter progression-free survival. CONCLUSION: Compared with HIV-negative CRC patients, patients with HIV infection have more metastatic lymph nodes and worse long-term survival after surgery. Standard treatment options for HIV-positive patients with CRC should be explored.
Keywords: colorectal cancer; human immunodeficiency virus; propensity score matching; oncological features; surgical safety; prognosis
Feature Selection via Analysis of Relevance and Redundancy (Cited by: 2)
3
Authors: 王飒, 王克勇, 郑链. Journal: Journal of Beijing Institute of Technology (EI, CAS), 2008, No. 3, pp. 300-304, 5 pages.
Feature selection is an important problem in pattern classification systems. The high dimension Fisher criterion (HDF) is a good indicator of class separability. However, calculating the high dimension Fisher ratio is difficult. A new feature selection method, called Fisher-and-correlation (FC), is proposed. The method combines the Fisher criterion and the correlation criterion based on an analysis of feature relevance and redundancy. The methodology is tested in five different classification applications. The results confirm that FC performs as well as HDF at much lower computational complexity.
Keywords: feature selection; high dimension Fisher criterion (HDF); relevance; redundancy
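Tumor Gene Selection Method Based on F-score and Binary Grey Wolf Optimization
4
Authors: 穆晓霞, 郑李婧. Journal: 《南京师大学报(自然科学版)》 (CAS, PKU Core), 2024, No. 1, pp. 111-120, 10 pages.
To address the high dimensionality, heavy noise, and high redundancy of tumor gene data, the F-score algorithm is improved with the Spearman correlation coefficient and, on that basis, the binary grey wolf algorithm is optimized, yielding a tumor gene selection algorithm based on the improved F-score and the binary grey wolf algorithm. First, taking the correlation between features into account, the F-score of each feature and the absolute value of the Spearman correlation coefficient between features are computed. Then, weight coefficients are calculated to obtain a weight for each feature, the features are ranked by importance, and a preliminary feature subset is selected. Finally, the binary grey wolf algorithm is optimized through the decay curve of the convergence factor and the initialization method, adjusting the proportions of global and local search to strengthen global search and speed up local search; this saves time, improves the classification performance and efficiency of feature selection, and yields the optimal feature subset. The proposed algorithm is tested on 9 tumor gene datasets in simulation experiments on two metrics, classification accuracy and number of selected features, and is compared with 4 other algorithms. The results show that the proposed algorithm performs well, effectively reduces the dimensionality of gene data, and achieves good classification accuracy.
Keywords: tumor genes; Fisher-score; Spearman correlation coefficient; binary grey wolf optimization algorithm; feature selection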
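The first step described in this abstract (a per-feature F-score plus the absolute Spearman correlation between features) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes the common two-class F-score definition, the toy data and the names `g1`, `g2`, `g3` are hypothetical, and the paper's weight coefficients and grey wolf search are not reproduced.

```python
from statistics import mean

def f_score(values, labels):
    """Two-class F-score of one feature: class-mean separation over within-class variance."""
    pos = [v for v, y in zip(values, labels) if y == 1]
    neg = [v for v, y in zip(values, labels) if y == 0]
    m, mp, mn = mean(values), mean(pos), mean(neg)
    # Sample variances of each class (n - 1 in the denominator).
    var_p = sum((v - mp) ** 2 for v in pos) / (len(pos) - 1)
    var_n = sum((v - mn) ** 2 for v in neg) / (len(neg) - 1)
    return ((mp - m) ** 2 + (mn - m) ** 2) / (var_p + var_n)

def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            r[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return r

def spearman(a, b):
    """Spearman coefficient, computed as the Pearson correlation of the rank vectors."""
    ra, rb = ranks(a), ranks(b)
    ma, mb = mean(ra), mean(rb)
    cov = sum((x - ma) * (y - mb) for x, y in zip(ra, rb))
    sa = sum((x - ma) ** 2 for x in ra) ** 0.5
    sb = sum((y - mb) ** 2 for y in rb) ** 0.5
    return cov / (sa * sb)

# Hypothetical toy data: g1 separates the classes, g2 duplicates g1's ordering, g3 is noise.
labels = [0, 0, 0, 1, 1, 1]
g1 = [1.0, 1.2, 0.9, 3.0, 3.1, 2.9]
g2 = [1.1, 1.3, 1.0, 3.2, 3.3, 3.1]
g3 = [2.0, 1.0, 3.0, 2.5, 1.5, 2.2]

scores = {name: f_score(col, labels) for name, col in {"g1": g1, "g2": g2, "g3": g3}.items()}
redundancy_g1_g2 = abs(spearman(g1, g2))
```

A high F-score marks a gene as discriminative, while a high absolute Spearman value between two genes suggests that keeping both adds little information.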
Chemical Reactivity Properties, Drug-Likeness Features and Bioactivity Scores of the Cholecystokinin Peptide Hormone (Cited by: 2)
5
Authors: Norma Flores-Holguín, Juan Frau, Daniel Glossman-Mitnik. Journal: Computational Molecular Bioscience, 2019, No. 2, pp. 41-47, 7 pages.
Five density functionals, CAM-B3LYP, LC-ωPBE, MN12SX, N12SX and ωB97XD, in connection with the Def2TZVP basis set were assessed together with the SMD solvation model for the calculation of the molecular and chemical reactivity properties of the Cholecystokinin peptide hormone (CCK-8) in the presence of water. All the chemical reactivity descriptors for the systems were calculated via Conceptual Density Functional Theory (CDFT). The potential bioavailability and druggability as well as the bioactivity scores for CCK-8 were predicted through different methodologies already reported in the literature, which have been previously validated in the study of different peptidic systems. The conclusion was that the CCK-8 peptide will be moderately bioactive regarding all the interactions.
Keywords: Cholecystokinin peptide hormone (CCK-8); Conceptual DFT; chemical reactivity; drug-likeness features; bioactivity scores
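Multi-Label Feature Selection Based on Center-Shifted Fisher Score and Intuitionistic Neighborhood Fuzzy Entropy (Cited by: 1)
6
Authors: 孙林, 马天娇. Journal: 《计算机科学》 (CSCD, PKU Core), 2024, No. 7, pp. 96-107, 12 pages.
In existing multi-label Fisher score models, edge samples degrade the classification performance of the algorithm. Given that neighborhood intuitionistic fuzzy entropy has stronger expressive and discriminative power when handling uncertain information, this paper proposes a multi-label feature selection method based on center-shifted Fisher score and neighborhood intuitionistic fuzzy entropy. First, the multi-label universe is partitioned into multiple sample sets according to the labels, and the feature mean of each sample set is taken as the original center point of the samples under that label; the distance to the farthest sample is multiplied by a distance coefficient to remove the edge sample set, and a new effective sample set is defined. After this center-shift processing, the score of each feature under each label and the feature score over the label set are computed, and a center-shift-based multi-label Fisher score model is established to preprocess the multi-label data. Then, the multi-label classification margin is introduced as an adaptive fuzzy neighborhood radius parameter, and the fuzzy neighborhood similarity relation and fuzzy neighborhood granule are defined, from which the upper and lower approximations of the multi-label fuzzy neighborhood rough set are constructed; on this basis, multi-label neighborhood rough intuitionistic membership and non-membership functions are proposed, and the multi-label neighborhood intuitionistic fuzzy entropy is defined. Finally, formulas for the external and internal significance of features are given, and a multi-label feature selection algorithm based on neighborhood intuitionistic fuzzy entropy is designed to select the optimal feature subset. Experimental results on 9 multi-label datasets with the multi-label K-nearest-neighbor classifier show that the optimal subset selected by the proposed algorithm has good classification performance.
Keywords: multi-label learning; feature selection; Fisher score; multi-label fuzzy neighborhood rough set; neighborhood intuitionistic fuzzy entropy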
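Microbial Feature Selection Method Combining Single-Class F-score and Genetic Algorithm
7
Authors: 卢福梅, 温柳英. Journal: 《信息技术》, 2024, No. 11, pp. 125-131, 7 pages.
Because microbial data are high-dimensional and class-imbalanced, traditional machine learning classification algorithms do not achieve satisfactory results on them. Traditional feature selection algorithms can reduce dimensionality but fall short on the class imbalance problem. This paper therefore proposes a feature selection method that combines single-class F-score with a genetic algorithm. First, the single-class F-score operation is used to generate the initial population for the genetic operations; second, the AUC of an SVM classification model is used as the fitness value of each individual; third, genetic operations are applied to update the population; finally, the optimal feature subset is obtained. Experiments on five microbial datasets, with comparison against four feature selection algorithms, show that the proposed method outperforms the other methods to some extent.
Keywords: F-score; high dimensionality; imbalance; genetic algorithm; feature selection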
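Power System Transient Stability Assessment Method Based on Fisher Score Feature Selection (Cited by: 8)
8
Authors: 李鹏, 董鑫剑, 孟庆伟, 陈继明. Journal: 《电力自动化设备》 (EI, CSCD, PKU Core), 2023, No. 7, pp. 117-123, 7 pages.
Different electrical input features correlate with power system transient stability to different degrees, and assessment accuracy drops markedly when input features are disturbed. To address these problems, a power system transient stability assessment method based on Fisher Score feature selection is proposed. A scheme is designed for computing the Fisher Score of sample features for the binary classification problem of power system transient stability assessment; ranking by Fisher Score effectively separates important features from redundant ones and noisy features from non-noisy ones; the selected electrical features are then fed into different machine learning models for training and evaluation. Simulation results on the New England 39-bus system and the IEEE 145-bus system show that the proposed feature selection scheme effectively identifies the most important features for power system transient stability assessment and improves the predictive performance of the assessment models.
Keywords: power system; transient stability assessment; feature selection; Fisher score algorithm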
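Fisher Score Multi-Label Feature Selection Based on Mutual Information (Cited by: 2)
9
Authors: 孙林, 张起峰, 徐久成. Journal: 《南京大学学报(自然科学版)》 (CAS, CSCD, PKU Core), 2023, No. 1, pp. 55-66, 12 pages.
At present, the Fisher Score model does not consider the relationship between samples and the whole feature space, or between features and labels, when handling multi-label data. A Fisher Score multi-label feature selection method based on mutual information is proposed. First, the influence of the whole sample space on feature selection is considered in the multi-label decision system: a weight formula is defined from the Euclidean distances between heterogeneous and homogeneous samples, and labels are weighted in the feature space to measure their importance. Then, based on mutual information theory, the mutual information between a feature and each label is defined to compute the relevance between each feature and each label, and this relevance is combined with the label's weight to define the total relevance between a feature and the label set. The Fisher score is combined with the total relevance to define a new Fisher score for each feature, from which a multi-label Fisher Score model is built. Finally, a mutual-information-based Fisher Score multi-label feature selection algorithm is designed. Experiments on six multi-label datasets show that, compared with other algorithms, the proposed algorithm performs well on all four evaluation metrics and delivers excellent classification performance.
Keywords: multi-label learning; feature selection; mutual information; Fisher score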
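Multi-Label Feature Selection Algorithm Based on Fisher Score and Fuzzy Neighborhood Entropy (Cited by: 3)
10
Authors: 孙林, 马天娇, 薛占熬. Journal: 《计算机应用》 (CSCD, PKU Core), 2023, No. 12, pp. 3779-3789, 11 pages.
Fisher score does not fully consider the correlation between features and labels or among labels, and some neighborhood rough set models tend to ignore the uncertainty of knowledge granules in the boundary region, which lowers classification performance. To address these problems, a multi-label feature selection algorithm based on Fisher score and fuzzy neighborhood entropy (MLFSF) is proposed. First, the maximal information coefficient (MIC) is used to measure the degree of association between features and labels and to build a feature-label relation matrix; a label relation matrix is defined based on adjusted cosine similarity to analyze the correlation among labels. Second, a second-order strategy is given to obtain multiple second-order label relation groups, by which the multi-label universe is repartitioned; the score of each feature is obtained by strengthening strong correlations and weakening weak correlations among labels, thereby improving the Fisher score model and preprocessing the multi-label data. Third, the multi-label classification margin is introduced, the adaptive neighborhood radius and neighborhood class are defined, and the upper and lower approximations are constructed; on this basis, a multi-label rough membership function is proposed that maps the multi-label neighborhood rough set to a fuzzy set, the upper and lower approximations and the multi-label fuzzy neighborhood rough set model are given based on the multi-label fuzzy neighborhood, and the fuzzy neighborhood entropy and multi-label fuzzy neighborhood entropy are defined to measure the uncertainty of the boundary region effectively. Finally, a multi-label Fisher score feature selection algorithm based on second-order label correlation (MFSLC) is designed, from which MLFSF is built. Experimental results on 11 multi-label datasets with the multi-label K-nearest-neighbor (MLKNN) classifier show that, compared with six state-of-the-art algorithms including ReliefF multi-label feature selection (MFSR), MLFSF improves the mean average precision (AP) by 2.47 to 6.66 percentage points; on most datasets, MLFSF also achieves the best values on all 5 evaluation metrics.
Keywords: multi-label learning; feature selection; Fisher score; multi-label fuzzy neighborhood rough set; fuzzy neighborhood entropy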
Improved color feature arrangement for mean shift tracking
11
Authors: Xiaowei An, Youngjoon Han, Hernsoo Hahn. Journal: Journal of Measurement Science and Instrumentation (CAS), 2013, No. 1, pp. 38-42, 5 pages.
In order to reduce redundant empty bin capacity for mean shift tracking objects in the probability representation, we present a new color feature arrangement mechanism. In the proposed mechanism, the important optimal color, which we call the optimal color vector, is clustered by closest Euclidean distance inside the original RGB color 3-D spatial domain. After the clustering colors are obtained from the RGB spatial domain of the reference image, the novel clustering groups substitute for the original color data, so the new color substitution distribution is similar to the original one. The target region in the candidate frame is then mapped by the constructed optimal clustering colors and the cluster indices. Finally, the mean shift algorithm is run on the new optimal color distribution. A comparison under the same conditions between the proposed algorithm and the conventional mean shift algorithm shows that the former has a certain advantage in computation cost.
Keywords: color feature arrangement; optimal color vector; cluster; redundant bin
MRMR Based Feature Vector Design for Efficient Citrus Disease Detection
12
Authors: Bobbinpreet, Sultan Aljahdali, Tripti Sharma, Bhawna Goyal, Ayush Dogra, Shubham Mahajan, Amit Kant Pandit. Journal: Computers, Materials & Continua (SCIE, EI), 2022, No. 9, pp. 4771-4787, 17 pages.
In recent times, images and videos have emerged as one of the most important information sources depicting real-time scenarios. Digital images now serve as input for many applications and are replacing manual methods because of their ability to represent a 3D scene in a 2D plane. Combined with machine learning methodologies, digital images are showing promising accuracy in many prediction and pattern recognition applications. One such application field is the detection of plant diseases, which are destroying fields over wide areas. Traditionally, disease detection was done by a domain expert using manual examination and laboratory tests. This is a tedious and time-consuming process and does not reach sufficient accuracy levels. This leaves room for research into automated methods in which images captured through sensors and cameras are used to detect disease and control its spread. The digital images captured in the field form the dataset that trains machine learning models to predict the nature of the disease. The accuracy of these models is greatly affected by the amount of noise and ailments present in the input images, the segmentation methodology, the feature vector development, and the choice of machine learning algorithm. To ensure high performance of the designed system, research is moving toward fine-tuning each stage separately while considering its dependencies on subsequent stages. The most suitable solution can therefore be obtained by applying image processing methodologies to improve image quality and then applying statistical methods for feature extraction and selection. The training vector thus developed is capable of presenting the relationship between the feature values and the target class. In this article, a highly accurate system model for detecting diseases occurring in citrus fruits using a hybrid feature development approach is proposed. The overall improvement in accuracy is measured and depicted.
Keywords: citrus diseases; classification; feature vector design; plant disease detection; redundancy reduction
An Approach to Fault Diagnosis of Rotating Machinery Using the Second-Order Statistical Features of Thermal Images and Simplified Fuzzy ARTMAP
13
Authors: Faisal Al Thobiani, Van Tung Tran, Tiedo Tinga. Journal: Engineering (科研), 2017, No. 6, pp. 524-539, 16 pages.
The thermal image, or thermogram, has become a new type of signal for machine condition monitoring and fault diagnosis because it can display the real-time temperature distribution and can indicate the machine's operating condition through its temperature. In this paper, the use of second-order statistical features of thermograms in association with minimum redundancy maximum relevance (mRMR) feature selection and simplified fuzzy ARTMAP (SFAM) classification is investigated for rotating machinery fault diagnosis. The thermograms of different machine conditions are first preprocessed to improve the image contrast, remove noise, and crop the regions of interest (ROIs). Then, an enhanced algorithm based on bi-dimensional empirical mode decomposition is implemented to further increase the quality of the ROIs before the second-order statistical features are extracted from their gray-level co-occurrence matrix (GLCM). The features most relevant to the machine condition are selected from the total feature set by mRMR and are fed into SFAM to accomplish the fault diagnosis. To verify this investigation, thermograms acquired from different conditions of a fault simulator, including normal, misalignment, faulty bearing, and mass unbalance, are used. The investigation also provides a comparative study of SFAM and other traditional methods such as back-propagation and probabilistic neural networks. The results show that the second-order statistical features used in this framework can provide a plausible accuracy in fault diagnosis of rotating machinery.
Keywords: thermal images; second-order statistical features; gray-level co-occurrence matrix; minimum redundancy maximum relevance; rotating machinery fault diagnosis; simplified fuzzy ARTMAP
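The mRMR selection step named in this abstract can be illustrated with a greedy sketch. This is not the paper's pipeline: it assumes the features have already been discretized into integer bins (GLCM statistics are continuous), uses the difference form of the criterion (relevance minus mean redundancy), and the feature names `a`, `b`, `c` and the toy data are hypothetical.

```python
from collections import Counter
from math import log

def mutual_information(x, y):
    """I(X;Y) in nats for two equal-length sequences of discrete values."""
    n = len(x)
    px, py, pxy = Counter(x), Counter(y), Counter(zip(x, y))
    # p(a,b) * log(p(a,b) / (p(a) p(b))), written with raw counts.
    return sum((c / n) * log(c * n / (px[a] * py[b])) for (a, b), c in pxy.items())

def mrmr(features, target, k):
    """Greedily pick k feature names by relevance to target minus mean redundancy."""
    relevance = {name: mutual_information(col, target) for name, col in features.items()}
    selected = []
    while len(selected) < k:
        best, best_score = None, float("-inf")
        for name, col in features.items():
            if name in selected:
                continue
            if selected:
                redundancy = sum(mutual_information(col, features[s]) for s in selected) / len(selected)
            else:
                redundancy = 0.0
            score = relevance[name] - redundancy
            if score > best_score:  # strict comparison: ties keep the earlier feature
                best, best_score = name, score
        selected.append(best)
    return selected

# Hypothetical toy data: the condition label depends on both a and c; b duplicates a.
features = {
    "a": [0, 0, 0, 0, 1, 1, 1, 1],
    "b": [0, 0, 0, 0, 1, 1, 1, 1],
    "c": [0, 0, 1, 1, 0, 0, 1, 1],
}
target = [0, 0, 1, 1, 2, 2, 3, 3]
chosen = mrmr(features, target, 2)
```

On this toy data the duplicate feature `b` is skipped in favor of `c`, because `b` is fully redundant with the already selected `a` while `c` carries new information about the target.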
Information Hiding Method Based on Block DWT Sub-Band Feature Encoding
14
Authors: Qiudong SUN, Wenxin MA, Wenying YAN, Hong DAI. Journal: Journal of Software Engineering and Applications, 2009, No. 5, pp. 383-387, 5 pages.
To realize long-text information hiding and covert communication, a binary watermark sequence is first obtained from a text file and encoded by a redundant encoding method. Then, two neighboring blocks are selected at a time from the Hilbert scanning sequence of carrier image blocks and transformed by a 1-level discrete wavelet transformation (DWT). The double-block-based JNDs (just noticeable differences) are then calculated with a visual model. According to the codes of each two watermark bits, the average values of the two corresponding detail sub-bands are modified using one of the JNDs to hide information in the carrier image. The experimental results show that the hidden information is invisible to human eyes and that the algorithm is robust to some common image processing operations. The conclusion is that the algorithm is effective and practical.
Keywords: sub-band feature encoding; redundant encoding; visual model; discrete wavelet transformation; information hiding
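Fault Feature Selection Method for Three-Level Inverters Based on a Hybrid Model of Fisher Score and Maximal Information Coefficient
15
Authors: 杜磊, 任晓红, 刘显策, 韩向栋, 俞啸. Journal: 《电子设计工程》, 2023, No. 1, pp. 83-88, 6 pages.
Feature extraction for three-level inverters suffers from inconsistent feature representation and redundancy. To improve the accuracy of three-level inverter fault identification, a fault feature selection method based on a hybrid model of Fisher Score and the maximal information coefficient is proposed. The method uses Fisher Score to rank the fault features of the original feature set by importance and uses the maximal information coefficient to evaluate the correlation between features, adjusting the ranking accordingly. With fault classification accuracy as the criterion, the hybrid model is then refined using the random forest algorithm to screen and classify sensitive fault features. Experiments on inverter fault datasets from simulation and from a test bench show that the proposed fault diagnosis model reaches accuracies of 93.3% and 90.2%, respectively. Compared with the traditional reliefF feature selection method, the sensitive features selected by the proposed method are better suited to three-level inverter fault diagnosis and classification, and fault identification accuracy improves by 2.1% and 1.3%, respectively.
Keywords: three-level inverter; Fisher score; feature selection; maximal information coefficient; random forest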
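Epileptic EEG Signal Recognition Method Based on F-Score Feature Selection
16
Authors: 凌宇, 杜玉晓, 李向欢. Journal: 《自动化与信息工程》, 2023, No. 5, pp. 58-62, 73, 6 pages.
As research on automatic detection algorithms for epileptic EEG signals deepens, the number of feature dimensions to be processed keeps growing, and redundant features increase algorithm complexity and degrade performance. An epileptic EEG signal recognition method based on F-Score feature selection is therefore proposed. First, features are extracted from the original epileptic EEG dataset and the F-Score statistic of each feature is computed; then, guided by the classification accuracy of the classification model, the optimal feature set is selected by sequential forward search; finally, experiments are run with support vector machine and logistic regression classifiers, and the method is compared with PCA, a traditional feature dimensionality reduction method. The experimental results show that the method effectively reduces the dimensionality of the feature matrix and improves computational efficiency.
Keywords: F-score; PCA; feature extraction; feature selection; epileptic EEG signal recognition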
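The sequential forward search mentioned in this abstract can be sketched generically as below. This is an illustration under stated assumptions, not the authors' code: `evaluate` stands in for the cross-validated accuracy of the SVM or logistic regression model, and `toy_accuracy` with its feature names `f1`, `f2`, `f3` is a hypothetical scoring function invented for the example.

```python
def sequential_forward_search(candidates, evaluate):
    """Grow a feature subset greedily; stop when no single addition improves the score."""
    selected, best = [], float("-inf")
    remaining = list(candidates)
    while remaining:
        # Score every one-feature extension of the current subset and keep the best.
        score, feature = max(
            ((evaluate(selected + [f]), f) for f in remaining),
            key=lambda pair: pair[0],
        )
        if score <= best:
            break
        selected.append(feature)
        remaining.remove(feature)
        best = score
    return selected, best

def toy_accuracy(subset):
    """Hypothetical stand-in for classifier accuracy: useful features add, every feature costs."""
    gain = {"f1": 0.30, "f2": 0.20, "f3": 0.0}
    return 0.5 + sum(gain[f] for f in subset) - 0.01 * len(subset)

# Candidates would normally be ordered by descending F-Score.
subset, accuracy = sequential_forward_search(["f1", "f2", "f3"], toy_accuracy)
```

In practice the candidate list could be truncated to the top-ranked features by F-Score before the search, which keeps the number of classifier evaluations manageable.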
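Personal Credit Scoring Method Based on Feature Extraction and Ensemble Learning (Cited by: 1)
17
Authors: 康海燕, 胡成倩. Journal: 《计算机仿真》, 2024, No. 1, pp. 311-320, 10 pages.
With the rapid growth of big data, the information economy has reached every part of society, and building a personal credit system has become increasingly important. Traditional credit systems, however, suffer from insufficient coverage, high-dimensional evaluation features, and data silos. To address these problems, a personal credit scoring method based on feature extraction and Stacking ensemble learning (PSL-Stacking) is proposed. The method first uses the Pearson and Spearman coefficients for an initial analysis of the data to remove irrelevant data, and uses the LightGBM algorithm for feature selection to reduce the influence of redundant features on the model. It then selects algorithms including XGBoost, LightGBM, Random Forest, and Huber regression and builds a personal credit scoring model with Stacking ensemble learning. Finally, the scoring ability of the model is validated on data from a telecom operator. The experimental results show that the model has good predictive ability, scores user credit accurately, and effectively lowers the risk of enterprises suffering financial fraud, group arbitrage, and similar problems.
Keywords: credit scoring; feature extraction; ensemble learning; fraud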
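Feature Selection Algorithm for High-Dimensional Imbalanced Data
18
Authors: 王振飞, 袁佩瑶, 曹中亚, 张利莹. Journal: 《小型微型计算机系统》 (CSCD, PKU Core), 2024, No. 8, pp. 1839-1846, 8 pages.
Traditional classification algorithms for high-dimensional imbalanced datasets are biased toward the majority class and neglect the minority class. To address this, the paper proposes a feature selection algorithm based on density clustering and importance measure (DBIM). First, multiple balanced subsets are constructed by random undersampling, and the DBSCAN density clustering method is used as the base classifier to generate the initial feature subspace. Then, the features are ranked by importance to select those with stronger classification ability. Finally, to avoid redundancy among features, a method that combines a class-distribution-based weight index with a redundancy evaluation index is designed to generate a high-quality feature subset. Experimental results on 8 public datasets show that the proposed DBIM algorithm generates feature subsets with high relevance and low redundancy, effectively reduces the dimensionality of high-dimensional imbalanced datasets, and improves classification performance.
Keywords: high-dimensional imbalanced dataset; density clustering; feature selection; relevance; redundancy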
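SAR Target Open-Set Recognition Method Combining Unknown-Class Feature Generation and Classification Score Modification
19
Authors: 陈健, 雍奇锋, 杜兰, 尹林伟. Journal: 《电子与信息学报》 (EI, CAS, CSCD, PKU Core), 2024, No. 10, pp. 3890-3907, 18 pages.
Most existing synthetic aperture radar (SAR) target recognition methods are limited to the closed-set assumption, namely that the training target classes in the training template library cover all the target classes to be tested; they are therefore unsuited to real open recognition environments in which known in-library classes and unknown out-of-library classes coexist. For SAR target recognition when the target classes in the training template library are incomplete, this paper proposes a SAR target open-set recognition method that combines unknown-class feature generation with classification score modification. After learning a prototype network on the known classes to guarantee their recognition accuracy, the method uses prior knowledge of the potential unknown-class feature distribution to generate unknown-class features that update the network, further ensuring that known-class and unknown-class features are discriminable in the feature space. Once the prototype network has been updated, the method selects boundary features of each known class and computes the distance from each boundary feature to its class prototype (the maximum distance); extreme value theory is used to fit a probability model to the maximum distances of each known class, which determines the maximum distribution region of each known class. In the test phase, the distances between the test sample's features and the known-class prototypes are measured to predict closed-set classification scores; the probability of each distance under the corresponding known class's maximum-distance distribution is then computed and used to modify the closed-set scores, so that the rejection probability is determined automatically. Experimental results on the MSTAR measured dataset show that the proposed method effectively represents the real unknown-class feature distribution, improves the discriminability of known-class and unknown-class features in the network's feature space, and achieves both accurate recognition of known in-library targets and accurate rejection of unknown out-of-library targets.
Keywords: SAR target recognition; open-set recognition; unknown-class feature generation; extreme value theory; classification score modification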
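Network Intrusion Detection Model Based on Multi-Scale Fusion and Spatio-Temporal Features
20
Authors: 龚星宇, 来源, 李娜, 雷璇. Journal: 《计算机工程与设计》 (PKU Core), 2024, No. 6, pp. 1640-1646, 7 pages.
Intrusion detection models have limited feature extraction ability, and traffic data contain redundant noise. To address these problems, an ML-PFN intrusion detection model based on multi-scale fusion and spatio-temporal features is proposed. Multi-scale feature fusion is used to extract shallow and deep feature information from the data separately, so that the model learns richer features; a soft threshold function and an attention mechanism automatically select suitable thresholds, reducing interference from noise and irrelevant information; spatio-temporal features are fused to form the multi-scale spatial feature extraction long short-term memory parallel feature network (MSFE LSTM-parallel feature network, ML-PFN) model, which is applied to network intrusion detection. Performance is evaluated on 3 public datasets. The experimental results show that ML-PFN performs best on every metric among the models compared, which include 5 other classification models, and reaches an accuracy of 96.45% with a moderate training time.
Keywords: intrusion detection; redundant noise; multi-scale fusion; spatio-temporal features; soft threshold; attention mechanism; long short-term memory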