期刊文献+

基于动态规划的Needleman-Wunsch双序列比对算法的分析与研究 被引量:4

Analysis and research on the pairwise alignment Needleman-Wunsch algorithm based on dynamic programming
下载PDF
导出
摘要 生物序列比对是生物信息学中最基础的研究课题之一。基于动态规划的Needleman-Wunsch双序列比对算法主要采用迭代算法及空位罚分规则对基因序列进行逐一比对,计算二者相似性得分,最后通过回溯分析得出序列之间的最佳比对。虽然该算法可以得到最佳比对结果,但是时间复杂度和空间复杂度较高。首先对原算法进行分析,对计算得分和回溯进行改进。接着设计2次实验,以金黄色葡萄球菌和银葡萄球菌分别作为目标序列和待比对序列,分别生成序列长度范围相同和不同的5组数据进行实验测试。最后通过对新型冠状病毒和SARS病毒全序列进行比对,进一步验证了改进算法的有效性。实验结果表明,改进后的算法可以缩短序列比对时间,提高序列比对效率。 Sequence alignment is one of the most fundamental research problems in bioinformatics.The pairwise alignment Needleman-Wunsch based on dynamic programming mainly uses the iterative algorithm and the vacancy penalty rule to compare gene sequences one by one,calculates their similarity score,and finally obtains the best alignment between sequences through backtracking analysis.Although the algorithm can get the best result,it has high time and space complexity.Firstly,the original algorithm is analyzed and improved from the aspects of calculation score and backtracking.Secondly,two experiments are designed.In the experiments,Staphylococcus aureus is used as the target sequence,and Staphylococcus aureus is used as the counterpart sequence.Five groups of experiments with the same and different sequence length range are conducted.Finally,the novel coronavirus and SARS virus sequences are compared to verify the effectiveness of the algorithm.The experimental results show that the improved algorithm can reduce the sequence alignment time and improve the efficiency of sequence alignment.
作者 甘秋云 GAN Qiu-yun(School of Computing and Information Science,Fuzhou Institute of Technology,Fuzhou 350014,China)
出处 《计算机工程与科学》 CSCD 北大核心 2021年第2期340-346,共7页 Computer Engineering & Science
基金 福建省教育厅中青年教师教育科研项目(JAT191019)。
关键词 序列比对 动态规划 相似性 sequence alignment dynamic programming identity
  • 相关文献

参考文献6

二级参考文献55

  • 1Davic W Mount. Bioinformatics: Sequence and Genome Analysis[M]. USA: Cold Spring Harbor Laboratory Press, 2002, 53-54.
  • 2Needleman S, Wunsch C. A general method applicable to the search for similarities in the amino acid sequences of two proteins[J]. Journal of Molecular Biology, 1970, 48(3):443-453.
  • 3Hirschberg D. A linear space algorithm for conputing maximal common subsequences[J]. Comm ACM, 1975,18 (6) :341-343.
  • 4Hirschberg D. Serial Computations of Levenshtein Distances[M]. in: A. Apostolicl, Z. Galil (Eds.), Pattern Matching Algorithms, Oxford University Press, 1997, 123-141.
  • 5Ukkonen E. On approximate string matching[J]. Found Comput Theory, 1983, 158(6):487-495.
  • 6David R, Powell, Lloyd Allison, et al. A versatile divide and conquer technique for optimal string alignment[J]. Information Processing Letters, 1999, 70(3):127-139.
  • 7Mount D W.Bioinformatics:Sequence and Genome Analysis[M/OL].2nd ed.US:Cold Spring Harbor Laboratory Press,2004:2-12[2013-10-28].http://hydra.icgeb.trieste.it/~pongor/biophyshomepage/Mount_book/COVERS.pdf.
  • 8Sequence alignment[EB/OL].(2013-10-24)[2013-11-02]http://en.wikipedia.org/wiki/Sequence_alignment#cite_note-mount-1.
  • 9Dayhoff M O,Schwartz R M,Oreutt B C.In Atlas of Protein Sequence and structure[J].Bol,1978,5(3):345.
  • 10Miyamoto M M,Fitch W M.Testing the covarion hypothesis of molecular evolution[J].Mol Biol Evol,1995,12(3):503-512.

共引文献15

同被引文献29

引证文献4

二级引证文献3

相关作者

内容加载中请稍等...

相关机构

内容加载中请稍等...

相关主题

内容加载中请稍等...

浏览历史

内容加载中请稍等...
;
使用帮助 返回顶部