Abstract
Social media networks produce massive amounts of high-dimensional, unlabeled data, which poses a major challenge for data processing. Meanwhile, the link graph formed among data samples is difficult for existing pattern recognition algorithms to exploit effectively. To address this, the link graph of social media network data is mined and combined with a small amount of supervised information to propose a semi-supervised feature selection algorithm based on link relations (SSLFS). Using spectral analysis and a sparsity constraint, SSLFS selects a feature subset that preserves the local manifold structure and sparsity of the original data. Experimental results on the Flickr social media dataset show that the feature subset obtained by SSLFS yields notably better classification performance than those obtained by other feature selection methods.
Source
Pattern Recognition and Artificial Intelligence (《模式识别与人工智能》)
EI
CSCD
Peking University Core Journal (北大核心)
2014, No. 2, pp. 166-172 (7 pages)
Funding
Supported by the National High Technology Research and Development Program of China (863 Program), No. 2012AA01A510
Keywords
Social Media Network
Linked Data
Semi-Supervised Learning
Feature Selection
Sparse Learning
Manifold Learning