Abstract
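To cope with the high dimensionality of data, dimensionality reduction has become an active research direction. The various manifold learning algorithms all try to discover the intrinsic structure and regularities of high-dimensional data, but they all learn from small neighborhoods, and how to combine global and local learning of the data remains an unsolved problem. Fiber bundles are an important theory in differential manifolds; for example, each subspace of a linear space can be regarded as a fiber, and the collection of these fibers is a fiber bundle. Building on manifold learning, this paper introduces fiber bundles, gives a fiber bundle model, and proposes a vector-space dimensionality reduction algorithm based on the local principal directions of the tangent bundle. The algorithm partitions the data set with k-means and computes the principal components on each block; the first principal directions form a section of the tangent bundle, and isometric mapping (ISOMAP) is then applied on the section manifold to reduce the dimension. Finally, experiments on simulated data and face data demonstrate the effectiveness of the algorithm.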
Faced with high-dimensional, nonlinear, and unstructured data, a serious problem is how to find the rules behind the data sets. Manifold learning is a dimensionality reduction method oriented to such high-dimensional data. By finding the low-dimensional manifold in the high-dimensional space and the corresponding embedding projection, it accomplishes the goal of reduction. Fiber bundle theory, a chief topic of global differential geometry that combines topology and differential geometry, is an important part of 20th-century research in geometry. Its particular way of connecting and processing different geometric spaces, and geometric quantities in different spaces, provides a feasible method for studying the global and local relations of data sets.
To address the high dimensionality of data, dimensionality reduction has become a focus of current research. Many manifold learning algorithms try to find the intrinsic structure and rules of high-dimensional data, but most of them are based on local neighborhoods, so how to learn the data from both the local and the global perspective remains an unsolved problem. Fiber bundles are an important part of the theory of differential manifolds. For example, each subspace of a linear space can be described as a fiber, and the set of these fibers is a fiber bundle. This paper introduces fiber bundles on the basis of manifold learning, proposes a fiber bundle model, and gives a vector-space dimensionality reduction algorithm based on the local principal directions of the tangent bundle. Given a set of unorganized data points sampled with noise from a parameterized manifold, the local geometry of the manifold is learned by constructing an approximation of the tangent space at each data point. First, the algorithm uses k-means to partition the data set into blocks and applies PCA to each block to obtain its principal components. Then it constructs a section of the tangent bundle from the first principal directions and uses isometric mapping (ISOMAP) to reduce the dimension of the section manifold. Finally, experiments on artificial data and a face data set confirm the validity of the algorithm.
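The pipeline described above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function name `tangent_section_isomap`, its parameters, and the way the section is formed (projecting each point onto the line through its block centroid along the block's first principal direction) are assumptions, since the abstract does not give these details. The sketch relies on scikit-learn's `KMeans`, `PCA`, and `Isomap`.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.manifold import Isomap


def tangent_section_isomap(X, n_blocks=20, n_neighbors=10, n_components=2, seed=0):
    """Hypothetical sketch: k-means blocks -> local PCA -> section -> ISOMAP."""
    X = np.asarray(X, dtype=float)

    # Step 1: partition the data set into blocks with k-means.
    labels = KMeans(n_clusters=n_blocks, n_init=10, random_state=seed).fit_predict(X)

    section = np.empty_like(X)
    directions = np.zeros((n_blocks, X.shape[1]))
    for b in range(n_blocks):
        idx = np.flatnonzero(labels == b)
        block = X[idx]
        if len(block) < 2:
            # A single point has no principal direction; keep it unchanged.
            section[idx] = block
            continue
        center = block.mean(axis=0)

        # Step 2: first principal direction of the block (local PCA).
        v = PCA(n_components=1).fit(block).components_[0]
        directions[b] = v

        # Step 3 (assumed construction): replace each point by its projection
        # onto the line through the block centroid along that direction.
        coords = (block - center) @ v
        section[idx] = center + np.outer(coords, v)

    # Step 4: ISOMAP on the projected ("section") points.
    embedding = Isomap(n_neighbors=n_neighbors, n_components=n_components).fit_transform(section)
    return embedding, labels, directions


if __name__ == "__main__":
    # Synthetic swiss-roll-like data with small noise (illustrative only).
    rng = np.random.default_rng(0)
    t = 1.5 * np.pi * (1 + 2 * rng.random(400))
    h = 21 * rng.random(400)
    X = np.column_stack([t * np.cos(t), h, t * np.sin(t)])
    X += 0.05 * rng.standard_normal(X.shape)

    emb, labels, dirs = tangent_section_isomap(X, n_blocks=40, n_neighbors=12)
    print(emb.shape)
```

In this sketch, choosing `n_neighbors` at least as large as the typical block size helps the neighborhood graph link points across blocks rather than only along a single projected line.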
Source
《南京大学学报(自然科学版)》
CAS
CSCD
Peking University Core Journals (北大核心)
2008, No. 5, pp. 477-485 (9 pages)
Journal of Nanjing University(Natural Science)
Funding
National Natural Science Foundation of China (60775045)
Keywords
dimensionality reduction
manifold learning
local principal component analysis
fiber bundle
k-means