Abstract
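Using the Perl language, SSR loci were mined at high throughput from tea plant (Camellia sinensis) flower transcriptome sequences. A total of 12 582 SSRs were found in 10 290 sequences, an average of one SSR per 2.41 kb. In all, 340 repeat motifs were found in the floral transcriptome, the most frequent being (AG/CT)n (44.99%). Among the 49 586 successfully annotated floral unigenes, 10 490 SSR loci were found, of which 1917 were located in coding regions; their frequency there was only 0.102 SSRs per 1000 bp, compared with 3.072 SSRs per 1000 bp in non-coding regions. Within coding regions, trinucleotide microsatellites were the most frequent (1140, 59.5%), followed by hexanucleotide microsatellites (524, 27.3%). Most microsatellites in the floral transcriptome had repeat lengths under 20 bp; those longer than 20 bp accounted for only 25.2%. In the floral transcriptome, the mean expression level of genes containing microsatellites was significantly lower than that of genes without microsatellites, and genes containing compound microsatellites had the lowest mean expression level.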
Microsatellites, or simple sequence repeats (SSRs), in the Camellia sinensis floral transcriptome were characterized. A total of 12 582 SSRs were identified in 10 290 unigenes, with one SSR per 2.41 kb. Among all 340 SSR motifs, (AG/CT)n was the most frequent repeat motif (44.99%). A total of 10 409 SSRs occurred in the 49 586 unigenes with BLAST matches to annotated proteins in four databases, only 1917 of which occurred in protein-coding regions of these sequences. The density of SSRs was much higher in non-coding regions than in coding regions (0.102 SSRs per 1000 base pairs in coding regions vs. 3.072 in non-coding regions). Among the six repeat types, trinucleotide repeats were the most abundant in coding regions (1140), followed by hexanucleotide repeats (524). Microsatellites shorter than 20 bp made up the largest proportion, while microsatellites longer than 20 bp accounted for only 25.22%. The expression level of genes containing microsatellites was significantly lower than that of genes not containing microsatellites. The overall expression level of genes containing compound microsatellites was the lowest.
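As a rough illustration of the kind of SSR mining the abstract describes, the sketch below scans a sequence for perfect mono- to hexanucleotide repeats with regular expressions and computes a density in SSRs per 1000 bp. It is written in Python rather than the Perl used in the study, it does not handle compound microsatellites, and the minimum repeat counts in `MIN_REPEATS`, as well as the function names, are assumptions made for illustration; the abstract does not state the thresholds or the scripts actually used.

```python
import re

# Assumed minimum repeat counts per motif length (mono- to hexanucleotide).
# These thresholds are illustrative; the abstract does not state the ones used.
MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def is_primitive(motif):
    """True if the motif is not itself a repeat of a shorter unit (e.g. 'ATAT')."""
    k = len(motif)
    return not any(
        k % d == 0 and motif == motif[:d] * (k // d) for d in range(1, k)
    )


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs."""
    seq = seq.upper()
    hits = []
    for k, n in min_repeats.items():
        # A lookahead lets a candidate run be tested at every start position.
        pattern = re.compile(r"(?=(([ACGT]{%d})\2{%d,}))" % (k, n - 1))
        last_end = 0
        for m in pattern.finditer(seq):
            run, motif = m.group(1), m.group(2)
            # Skip starts inside an accepted run and motifs like 'AA' or 'ATAT'.
            if m.start() < last_end or not is_primitive(motif):
                continue
            hits.append((m.start(), motif, len(run) // k))
            last_end = m.start() + len(run)
    return sorted(hits)


def density_per_kb(n_ssrs, total_bp):
    """SSR density expressed as SSRs per 1000 bp."""
    return n_ssrs / total_bp * 1000


# Toy sequence (not real data): one (AG)7 run and one (A)12 run.
toy = "CC" + "AG" * 7 + "TT" + "A" * 12 + "GCA"
print(find_ssrs(toy))
```

On the toy sequence this reports the (AG)7 run at position 2 and the (A)12 run at position 18. Applying `density_per_kb` separately to coding and non-coding sequence lengths would give figures comparable in form to the per-1000-bp densities quoted in the abstract.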
Source
《作物学报》
CAS
CSCD
Peking University Core Journals (北大核心)
2014, Issue 1, pp. 80-85 (6 pages)
Acta Agronomica Sinica
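Funding
Special Fund for the Construction of the National Modern Agricultural Industry Technology System (nycytx-23)
Natural Science Foundation of Zhejiang Province (LY13C160004)
Supported by the Zhejiang Tea Industry Technology Innovation Strategic Alliance project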
Keywords
Tea plant (Camellia sinensis)
Microsatellites
Flower
Transcriptome