Abstract
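Exons in the gene sequences of Arabidopsis thaliana (A. thaliana) and the nematode Caenorhabditis elegans (C. elegans) are divided into three classes: first exons, internal exons, and last exons. The three positions of the triplets near exon/intron splice sites and near translation initiation and termination sites are treated as three subchains, and the counts of the different bases in each subchain serve as the diversity source parameters (12 parameters in total) from which the measure of diversity of each exon class is computed. The increment of diversity is then used to predict the three sequence types, with prediction success rates above 80% in every case. The relative frequencies of bases near the splice sites are also tabulated, and the differences in prediction results caused by the position and number of triplets chosen are compared, illustrating the conservation of bases near splice sites.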
Based on the structural properties of genes in the A. thaliana and C. elegans genomes, exons are divided into three types: first exons, internal exons, and last exons. Using the frequencies of the four bases at three positions (12 parameters in total) near exon/intron boundaries and translation initiation and termination sites, exons are predicted by an algorithm in which the increment of diversity is the sole prediction index. The results indicate that the prediction accuracy is higher than 80%. In addition, the mononucleotide and dinucleotide frequencies near exon/intron boundaries and translation initiation and termination sites are calculated, and the conservation of these positions is analyzed.
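The abstract does not give formulas, so the following is only a minimal sketch of how a prediction based on the increment of diversity might look. It assumes the standard definitions D(X) = N log N - sum(n_i log n_i) for the measure of diversity and ID(X, S) = D(X + S) - D(X) - D(S) for the increment of diversity, and it assumes the 12 counts are pooled into a single source. The function names and the toy triplets are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter

BASES = "ACGT"

def diversity(counts):
    """Measure of diversity D = N*log(N) - sum(n_i*log(n_i)), taking 0*log(0) as 0."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return total * math.log(total) - sum(n * math.log(n) for n in counts if n > 0)

def increment_of_diversity(x, s):
    """ID(X, S) = D(X + S) - D(X) - D(S); it is 0 when X and S are proportional."""
    merged = [a + b for a, b in zip(x, s)]
    return diversity(merged) - diversity(x) - diversity(s)

def triplet_counts(triplets):
    """Return 12 parameters: counts of A, C, G, T at each of the 3 triplet positions."""
    counts = []
    for pos in range(3):
        tally = Counter(t[pos] for t in triplets)
        counts.extend(tally[b] for b in BASES)
    return counts

def predict(query_triplets, class_sources):
    """Assign the query to the class whose source gives the smallest increment."""
    q = triplet_counts(query_triplets)
    return min(class_sources,
               key=lambda name: increment_of_diversity(q, class_sources[name]))

# Toy, made-up training triplets standing in for two exon classes (assumption).
class_sources = {
    "first": triplet_counts(["ATG"] * 10),
    "last": triplet_counts(["TAA"] * 10),
}
predicted = predict(["ATG", "ATG"], class_sources)
```

A smaller increment means the query's base composition is closer to that of the class source, which is why the class with the minimum value is chosen. Real use would build each class source from many aligned triplets around annotated sites.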
Source
《内蒙古大学学报(自然科学版)》
CAS
CSCD
PKU Core (Peking University Core Journals)
2006, Issue 3, pp. 279-284 (6 pages)
Journal of Inner Mongolia University: Natural Science Edition
Funding
Supported by the National Natural Science Foundation of China (30560039)
Keywords
exon
increment of diversity
splice site
conservation