Abstract
For regression problems arising in industry, information, and other fields that involve relatively large-scale data with nonstationary variation, existing algorithms cannot simultaneously achieve low computational cost and good fitting performance. Therefore, a distributed regularized regression learning algorithm based on multi-scale Gaussian kernels is proposed. The hypothesis space of the algorithm is the sum of reproducing kernel Hilbert spaces generated by multiple Gaussian kernels with different scales. Since the disjoint subsets partitioned from the whole dataset fluctuate to different degrees, kernel-function approximation models with different combination coefficients are established for them. Each approximation model is solved independently and simultaneously by the least squares regularization method, yielding a local estimator on each subset. Finally, a global approximation model is obtained by weighting all the local estimators. Experiments on two simulated datasets and four real datasets show that the proposed algorithm achieves strong fitting performance while reducing the running time.
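The workflow summarized in the abstract (partition the data into disjoint subsets, fit a regularized kernel regressor with a multi-scale Gaussian kernel on each subset, then combine the local estimators by weighting) can be sketched as follows. This is a minimal illustration, not the paper's implementation: the class and parameter names are invented here, the combination coefficients are fixed and shared across subsets rather than adapted to each subset's fluctuation, and the local estimators are combined with uniform weights.

```python
import numpy as np

def gaussian_kernel(X, Z, sigma):
    """Gaussian (RBF) kernel matrix between the rows of X and Z."""
    d2 = ((X[:, None, :] - Z[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def multi_scale_kernel(X, Z, sigmas, coeffs):
    """Weighted sum of Gaussian kernels with different scales,
    i.e. the kernel of a sum of reproducing kernel Hilbert spaces."""
    return sum(c * gaussian_kernel(X, Z, s) for c, s in zip(coeffs, sigmas))

class DistributedMultiScaleKRR:
    """Sketch: distributed least squares regularized regression
    with a multi-scale Gaussian kernel (illustrative, not the paper's code)."""

    def __init__(self, sigmas=(0.5, 1.0, 2.0), coeffs=(1/3, 1/3, 1/3),
                 lam=1e-3, n_parts=4, seed=0):
        self.sigmas, self.coeffs, self.lam = sigmas, coeffs, lam
        self.n_parts, self.seed = n_parts, seed

    def fit(self, X, y):
        # Partition the data into disjoint subsets at random.
        rng = np.random.default_rng(self.seed)
        idx = rng.permutation(len(X))
        self.models_ = []
        for part in np.array_split(idx, self.n_parts):
            Xp, yp = X[part], y[part]
            # Least squares regularized solution on this subset:
            # (K + lam * n * I) alpha = y  (each solve is independent,
            # so the subsets could be processed in parallel).
            K = multi_scale_kernel(Xp, Xp, self.sigmas, self.coeffs)
            alpha = np.linalg.solve(K + self.lam * len(Xp) * np.eye(len(Xp)), yp)
            self.models_.append((Xp, alpha))
        return self

    def predict(self, X):
        # Combine the local estimators; uniform weights are an assumption here.
        preds = [multi_scale_kernel(X, Xp, self.sigmas, self.coeffs) @ alpha
                 for Xp, alpha in self.models_]
        return np.mean(preds, axis=0)
```

The per-subset solves dominate the cost; since each involves only n/m samples, the cubic cost of the linear solve drops from O(n^3) to m * O((n/m)^3), which is the source of the running-time reduction claimed in the abstract.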
Authors
DONG Xuemei; WANG Jiewei (School of Statistics and Mathematics, Zhejiang Gongshang University, Hangzhou 310018)
Source
Pattern Recognition and Artificial Intelligence
EI
CSCD
Peking University Core Journal
2019, Issue 7, pp. 589-599 (11 pages)
Funding
National Natural Science Foundation of China (No. 11571031, 11701509)
Zhejiang Provincial First-Class Discipline A (Statistics, Zhejiang Gongshang University)
Keywords
Multi-scale Kernels
Kernel Method
Distributed Learning
Least Square Regularized Regression