Abstract
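Compared with other pathological specimens such as cells and tissue, blood samples are easier to collect clinically, and changes in their biochemical composition often appear before clinical symptoms become detectable by medical imaging, which favors early screening and diagnosis of malignant disease. Raman spectroscopy (RS) is rapid, label-free, nondestructive, and noninvasive, and it yields specific information on biomolecular structure and composition, so it holds great promise for cancer diagnosis from clinical blood samples (plasma, serum). In this work, micro-Raman spectroscopy was used to analyze the biochemical composition of breast cancer serum samples at different disease stages (healthy, early cancer, and advanced cancer). On this basis, principal component analysis (PCA) was combined with linear discriminant analysis (LDA), support vector machines (SVM), and partial least squares discriminant analysis (PLS-DA) to build spectral classification models. Leave-one-out cross-validation (LOOCV) was used to evaluate and compare the sensitivity, specificity, and accuracy of these models, with the aim of exploring a serum Raman spectroscopy-based method for breast cancer diagnosis. Building on the observed resonance Raman scattering of serum carotenoids, the study further analyzed changes in the protein and lipid spectral features of serum during the pathological progression of breast cancer. In addition, by extracting and identifying more representative molecular spectral features with several spectral data models, fairly accurate discrimination of the serum spectra was achieved. The PCA-LDA model reached a classification accuracy of 99%. For the PCA-SVM models (three kernel functions: linear, polynomial, and RBF), the linear-kernel model reached 92% accuracy on the test set with error penalty parameter C = 0.003; the RBF-kernel model reached the highest accuracy, 94%, on the test set with C = 0.125 and γ = 256; and the polynomial-kernel model reached 92% on the test set with C = 0.003 and polynomial degree d = 1. The PLS-DA model, however, reached only 80%. These results describe the composition of serum under different pathological conditions in terms of spectral features, and they lay an experimental and theoretical foundation for extending serum Raman spectroscopy to early screening and to pathological staging and grading of breast cancer.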
Compared with cell and sliced tissue samples, blood samples are easier to collect, and their biochemical constitution shows relevant variations before clinical pathological symptoms appear. Raman spectroscopy provides molecular information about biochemical contents for clinical investigations in a rapid, label-free, nondestructive, and noninvasive way, which gives it significant application prospects for blood sample-based diagnosis. In this study, we present a reliable method for detecting breast cancer using blood serum combined with multivariate analysis methods. The blood serum samples were divided into healthy, early cancer, and advanced cancer groups based on clinical pathological diagnosis. Using a quartz capillary tube as the sample holder, spectra were acquired to illustrate the biochemical constitution of the serum samples. Spectral classification models built on principal component analysis (PCA), linear discriminant analysis (LDA), support vector machines (SVM), and partial least squares discriminant analysis (PLS-DA) were used to reveal the spectral variances among the investigated groups, and leave-one-out cross-validation (LOOCV) was adopted to evaluate their classification performance. We observed the resonance Raman spectral phenomena of carotenoids in serum and also identified the spectral variations of protein and lipid contents during breast cancer progression. Using the multivariate analysis methods, the representative spectral features were recognized. The PCA-LDA model reached a classification accuracy of 99%. Among the three kernel-based PCA-SVM models, the linear kernel model reached 92% accuracy with C = 0.003, the RBF kernel model reached 94% accuracy with C = 0.125 and γ = 256, and the polynomial kernel model reached 92% accuracy with C = 0.003 and d = 1. The classification accuracy of PLS-DA was 80%. These results lay a theoretical and experimental foundation for serum Raman spectroscopy-based early screening and diagnosis of breast cancer.
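The evaluation scheme described in the abstract (dimensionality reduction by PCA, a classifier on the PCA scores, and leave-one-out cross-validation reporting sensitivity, specificity, and accuracy) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it uses scikit-learn, the spectra are synthetic stand-ins generated in the script, the number of principal components (10) and the helper names `make_model`, `one_vs_rest_metrics`, and `loocv_report` are hypothetical, and the SVM uses library-default hyperparameters because the reported C and γ values depend on the authors' own data and preprocessing. PLS-DA is omitted for brevity.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import LeaveOneOut, cross_val_predict
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Synthetic stand-in for serum spectra (NOT the study's data):
# 3 groups x 20 spectra, each with 300 spectral bins and one group-specific band.
n_per_group, n_bins = 20, 300
axis = np.linspace(0.0, 1.0, n_bins)
X_parts, y_parts = [], []
for label, centre in enumerate([0.30, 0.50, 0.70]):  # 0=healthy, 1=early, 2=advanced
    band = np.exp(-((axis - centre) ** 2) / 0.002)
    X_parts.append(band + 0.3 * rng.standard_normal((n_per_group, n_bins)))
    y_parts.append(np.full(n_per_group, label))
X = np.vstack(X_parts)
y = np.concatenate(y_parts)


def make_model(classifier, n_components=10):
    """Scale, project onto PCA scores, then classify (n_components is an assumption)."""
    return make_pipeline(StandardScaler(), PCA(n_components=n_components), classifier)


def one_vs_rest_metrics(cm):
    """Accuracy plus per-class sensitivity and specificity from a confusion matrix.

    Rows of cm are true classes, columns are predicted classes.
    """
    cm = np.asarray(cm, dtype=float)
    tp = np.diag(cm)
    fn = cm.sum(axis=1) - tp          # true members of the class that were missed
    fp = cm.sum(axis=0) - tp          # other classes predicted as this class
    tn = cm.sum() - tp - fn - fp
    accuracy = tp.sum() / cm.sum()
    return accuracy, tp / (tp + fn), tn / (tn + fp)


def loocv_report(model, X, y):
    """Leave-one-out predictions; the whole pipeline is refit on every fold."""
    y_pred = cross_val_predict(model, X, y, cv=LeaveOneOut())
    cm = confusion_matrix(y, y_pred, labels=[0, 1, 2])
    return one_vs_rest_metrics(cm)


models = {
    "PCA-LDA": make_model(LinearDiscriminantAnalysis()),
    "PCA-SVM (linear)": make_model(SVC(kernel="linear")),
    "PCA-SVM (RBF)": make_model(SVC(kernel="rbf")),
}

for name, model in models.items():
    acc, sens, spec = loocv_report(model, X, y)
    print(f"{name}: accuracy={acc:.2f}, "
          f"sensitivity={np.round(sens, 2)}, specificity={np.round(spec, 2)}")
```

Placing the scaler and PCA inside the pipeline means they are refit on each training fold, so the left-out spectrum never influences the projection it is later classified in. Sensitivity and specificity are computed one-vs-rest per group, which is one common convention for a three-class problem; the abstract does not state which convention the authors used.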
Authors
张宝萍
宁甜
张富荣
陈一申
张占琴
王爽
ZHANG Bao-ping; NING Tian; ZHANG Fu-rong; CHEN Yi-shen; ZHANG Zhan-qin; WANG Shuang (Institute of Photonics and Photon-Technology, Northwest University, Xi’an 721710, China; The First Affiliated Hospital of Xi’an Jiaotong University, Xi’an 710061, China)
Source
《光谱学与光谱分析》
SCIE
EI
CAS
CSCD
PKU Core (Peking University Core Journals)
2023, Issue 2, pp. 426-434 (9 pages)
Spectroscopy and Spectral Analysis
Funding
Supported by the National Natural Science Foundation of China (61911530695).