Abstract
To improve the generality of near-infrared spectroscopy models and to address the over-fitting and model instability that the direct orthogonal signal correction (DOSC) algorithm can exhibit during spectral processing, a model transfer method that combines random forest with DOSC, Random Forest-Direct Orthogonal Signal Correction (RF-DOSC), is proposed. The method first uses the random forest algorithm to select near-infrared wavelength points, then applies DOSC to process the spectra and establishes a regression equation; the regression coefficients are computed by partial least squares (PLS) to obtain the model transfer matrix. In the experiments, corn near-infrared spectral datasets D, D1 and D2, measured on spectrometers S, S1 and S2 respectively, were used to build the transfer model. The standard errors of prediction (SEP) for moisture, oil, protein and starch were 0.1267, 0.0982, 0.1569 and 0.4051 on D1, and 0.1548, 0.0819, 0.1366 and 0.3836 on D2, all lower than those of the conventional methods. The results indicate that the proposed method effectively removes spectral noise, reduces the difference between master and slave instrument spectra, improves model stability and accuracy, and enables models to be shared across instruments.
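The DOSC step mentioned in the abstract can be illustrated with a minimal NumPy sketch. It follows the commonly described DOSC formulation (project y onto the column space of X, remove that direction from X, take principal components of the remainder, and subtract them), not necessarily the authors' exact implementation. The random forest wavelength selection and the PLS-based transfer matrix are not shown. The function names `dosc_fit` and `dosc_apply`, the number of components, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np

def dosc_fit(X, y, n_components=2):
    """Fit a DOSC correction (hypothetical helper, illustration only).

    Returns weights W and loadings P such that the corrected spectra are
    X - (X @ W) @ P.T. X and y are assumed to be mean-centered.
    """
    y = y.reshape(-1, 1)
    X_pinv = np.linalg.pinv(X)
    # Projection of y onto the column space of X.
    y_hat = X @ (X_pinv @ y)
    # Remove the y_hat direction from X.
    Z = X - y_hat @ (y_hat.T @ X) / (y_hat.T @ y_hat)
    # Leading principal-component scores of Z (orthogonal to y_hat).
    U, s, _ = np.linalg.svd(Z, full_matrices=False)
    T = U[:, :n_components] * s[:n_components]
    # Express the scores as linear combinations of the columns of X.
    W = X_pinv @ T
    T = X @ W
    # Loadings of the component to be removed.
    P = X.T @ T @ np.linalg.inv(T.T @ T)
    return W, P

def dosc_apply(X, W, P):
    """Subtract the y-orthogonal component from (new) centered spectra."""
    return X - (X @ W) @ P.T

# Synthetic stand-in for spectra: 40 samples, 15 "wavelengths" (assumed sizes).
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 15))
X = X - X.mean(axis=0)
y = 2.0 * X[:, 0] - X[:, 3] + 0.1 * rng.normal(size=40)
y = y - y.mean()

W, P = dosc_fit(X, y, n_components=2)
T_removed = X @ W              # scores of the removed component
X_corr = dosc_apply(X, W, P)   # DOSC-corrected spectra
```

In this sketch the removed scores `T_removed` are orthogonal to `y`, which is the defining property of the correction. In a transfer setting, the same `W` and `P` would be applied to spectra from another instrument via `dosc_apply` before any regression step.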
Authors
王其滨
杨辉华
潘细朋
李灵巧
WANG Qi-bin; YANG Hui-hua; PAN Xi-peng; LI Ling-qiao (College of Electronic Engineering and Automation, Guilin University of Electronic Technology, Guilin 541004, China; College of Automation, Beijing University of Posts & Telecommunications, Beijing 100876, China)
Source
《激光与红外》
CAS
CSCD
PKU Core (北大核心)
2020, Issue 9, pp. 1081-1087 (7 pages)
Laser & Infrared
Funding
Supported by the National Natural Science Foundation of China (No. 21365008, No. 61105004).
Keywords
near-infrared spectroscopy (近红外光谱)
model transfer (模型传递)
Random Forest (随机森林)
Direct Orthogonal Signal Correction (直接正交信号校正)