Abstract
A method for evaluating the similarity of organic mass spectra based on the Pearson correlation coefficient was studied. With mass number as the independent variable and abundance as the dependent variable, the spectra of two compounds were converted, after a data preprocessing step, into two arrays, so that the correlation between different compounds could be calculated with the Pearson correlation coefficient. The method was used to calculate intra-group and inter-group similarity for the mass spectra of two groups of organic compounds exhibiting isomeric similarity and structural-formula similarity. Spectra within the same group, which share a degree of similarity, showed high correlation coefficient scores, whereas spectra from different groups showed very low scores. The Pearson correlation coefficient is therefore a feasible measure of spectral similarity. Applying a nonlinear transformation to the abundance greatly increased the coefficient of variation of the algorithm and improved the efficiency of mass spectral database searching.
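The procedure summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the preprocessing steps or which nonlinear transformation was used, so the integer mass binning, the zero-filling of absent masses, the square-root transform, and all function names (`to_vector`, `pearson`, `spectrum_similarity`) are assumptions made purely for illustration.

```python
import math


def to_vector(peaks, mz_min, mz_max):
    """Place (mass number, abundance) peaks on an integer mass axis.

    Assumption: masses are rounded to integers and masses with no peak
    get abundance 0. The paper's actual preprocessing is not given.
    """
    vec = [0.0] * (mz_max - mz_min + 1)
    for mz, abundance in peaks:
        idx = int(round(mz)) - mz_min
        if 0 <= idx < len(vec):
            # If two peaks round to the same mass, keep the larger one.
            vec[idx] = max(vec[idx], float(abundance))
    return vec


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    if n == 0 or n != len(y):
        raise ValueError("inputs must be non-empty and of equal length")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0.0 or syy == 0.0:
        # Correlation is undefined for a constant array; return 0 by convention.
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def spectrum_similarity(peaks_a, peaks_b, transform=None):
    """Correlate two spectra on a shared mass axis.

    `transform` is an optional function applied to every abundance
    before correlation (hypothetical hook for a nonlinear transform).
    """
    masses = [int(round(mz)) for mz, _ in list(peaks_a) + list(peaks_b)]
    mz_min, mz_max = min(masses), max(masses)
    a = to_vector(peaks_a, mz_min, mz_max)
    b = to_vector(peaks_b, mz_min, mz_max)
    if transform is not None:
        a = [transform(v) for v in a]
        b = [transform(v) for v in b]
    return pearson(a, b)


# Made-up peak lists, for illustration only (not data from the paper).
spec_1 = [(41, 30), (43, 100), (57, 80)]
spec_2 = [(41, 25), (43, 100), (57, 70), (71, 10)]
spec_3 = [(77, 100), (105, 60)]

score_similar = spectrum_similarity(spec_1, spec_2)
score_different = spectrum_similarity(spec_1, spec_3)
score_sqrt = spectrum_similarity(spec_1, spec_2, transform=math.sqrt)
```

In this sketch, spectra that share their main peaks score close to 1, while spectra with no peaks in common score at or below zero. Passing `transform=math.sqrt` shows one way a nonlinear abundance transformation could be plugged in; whether it reproduces the improvement reported in the paper depends on the transformation the authors actually used.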
Source
《化学分析计量》
CAS
2015, Issue 3, pp. 33-37 (5 pages)
Chemical Analysis And Meterage
Funding
Xianyang Vocational Technical College 2013 Research Fund Project (2013KYC05)
Keywords
Pearson correlation coefficient
mass spectrometry
similarity retrieval