Abstract
In this study, near-infrared spectroscopy (NIRS) was used to analyze 280 batches of Zingiberis Rhizoma (dried ginger) samples collected from nine producing areas and to acquire their near-infrared spectra. Principal component analysis (PCA), partial least squares-discriminant analysis (PLS-DA), and orthogonal partial least squares-discriminant analysis (OPLS-DA), together with the machine learning algorithms K-nearest neighbors (KNN), support vector machine (SVM), random forest (RF), artificial neural network (ANN), and gradient boosting (GB), were applied to trace the geographical origin of the samples. Spectral preprocessing by standard normal variate (SNV) transformation combined with the first derivative gave a discrimination accuracy of 93.9% and was therefore used to build the discrimination models. PCA and PLS-DA showed that samples from Sichuan, Shandong, Yunnan, and Guizhou could be effectively distinguished, whereas samples from the remaining producing areas partially overlapped. When the machine learning algorithms were used to model and validate samples from the different producing areas, the AUC values of KNN, SVM, RF, ANN, and GB were 0.96, 0.99, 0.99, 0.99, and 0.98, respectively, and the overall prediction accuracies were 83.3%, 89.3%, 90.5%, 91.7%, and 89.3%, respectively. These results indicate that the established models are reliable and that machine learning combined with NIRS can be used to trace the origin of Zingiberis Rhizoma. OPLS-DA of Zingiberis Rhizoma from Sichuan (the Dao-di, or genuine, producing area) against samples from non-Dao-di producing areas showed that the Sichuan samples could be significantly distinguished from those of the other areas with good discrimination accuracy, suggesting that NIRS combined with chemometrics, as established in this study, can serve as a means of identifying Dao-di Zingiberis Rhizoma from Sichuan. This study provides a rapid, nondestructive detection approach and a reliable data analysis method for tracing the origin of Zingiberis Rhizoma, and is expected to offer a methodological reference for origin tracing of Chinese medicinal materials.
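The preprocessing step named in the abstract (standard normal variate transformation followed by a first derivative) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not state the derivative method or its parameters, so a plain finite-difference derivative (`np.gradient`) is assumed here, and the function names, wavenumber range, and synthetic spectra are all hypothetical.

```python
import numpy as np


def snv(spectra):
    """Standard normal variate: center and scale each spectrum (row) on its own."""
    spectra = np.asarray(spectra, dtype=float)
    mean = spectra.mean(axis=1, keepdims=True)
    std = spectra.std(axis=1, ddof=1, keepdims=True)
    return (spectra - mean) / std


def first_derivative(spectra, wavenumbers):
    """First derivative along the spectral axis by finite differences.

    Assumption: the study may have used a smoothing derivative instead;
    the abstract does not say.
    """
    return np.gradient(spectra, wavenumbers, axis=1)


def preprocess(spectra, wavenumbers):
    """SNV followed by the first derivative, in the order given in the abstract."""
    return first_derivative(snv(spectra), wavenumbers)


# Synthetic data (hypothetical): one band shape observed with a different
# multiplicative gain and additive offset per sample, mimicking scatter effects.
rng = np.random.default_rng(0)
wavenumbers = np.linspace(4000.0, 10000.0, 200)
band = np.exp(-(((wavenumbers - 7000.0) / 800.0) ** 2))
gain = rng.uniform(0.5, 2.0, size=(5, 1))
offset = rng.uniform(-0.2, 0.2, size=(5, 1))
raw = gain * band + offset

processed = preprocess(raw, wavenumbers)
```

In this toy case the five raw spectra differ only by gain and offset, so SNV maps them onto the same curve and the processed rows coincide. Real spectra would retain the chemical differences that the classifiers exploit.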
Authors
余代鑫
郭盛
张霞
严辉
张振宇
李海洋
杨健
段金廒
YU Dai-xin; GUO Sheng; ZHANG Xia; YAN Hui; ZHANG Zhen-yu; LI Hai-yang; YANG Jian; DUAN Jin-ao (National and Local Collaborative Engineering Center of Chinese Medicinal Resources Industrialization and Formulae Innovative Medicine/Jiangsu Collaborative Innovation Center of Chinese Medicinal Resources Industrialization/Jiangsu Key Laboratory for High Technology Research of Traditional Chinese Medicine Formulae, Nanjing University of Chinese Medicine, Nanjing 210023, China; College of Artificial Intelligence and Information Technology, Nanjing University of Chinese Medicine, Nanjing 210023, China; State Key Laboratory Breeding Base of Dao-di Herbs, National Resource Center for Chinese Materia Medica, China Academy of Chinese Medical Sciences, Beijing 100700, China)
Source
《中国中药杂志》
CAS
CSCD
PKU Core (Peking University Core Journals)
2022, Issue 17, pp. 4583-4592 (10 pages)
China Journal of Chinese Materia Medica
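Funding
National Key Research and Development Program of China (2020YFC1712700)
Central-Level Major Expenditure Increase and Reduction Project (2060302)
Innovation Team and Talents Cultivation Program of the National Administration of Traditional Chinese Medicine (ZYYCXTD-D-202005)
China Agriculture Research System of the Ministry of Finance and the Ministry of Agriculture and Rural Affairs (CARS-21)
Postgraduate Research and Practice Innovation Program of Jiangsu Province (SJCX21_0698)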
Keywords
Zingiberis Rhizoma
NIRS
chemometrics
machine learning algorithms
traceability