Many recent exciting discoveries have revealed the versatility of RNAs and their importance in a variety of cellular functions which are strongly coupled to RNA structures. To understand the functions of RNAs, some structure prediction models have been developed in recent years. In this review, the progress in computational models for RNA structure prediction is introduced and the distinguishing features of many outstanding algorithms are discussed, emphasizing three-dimensional (3D) structure prediction. A promising coarse-grained model for predicting RNA 3D structure, stability and salt effect is also introduced briefly. Finally, we discuss the major challenges in RNA 3D structure modeling.
RNA molecules serve a wide range of functions that are closely linked to their structures. The basic structural units of RNA consist of single- and double-stranded regions. In order to carry out advanced functions such as catalysis and ligand binding, certain types of RNAs can adopt higher-order structures. The analysis of RNA structures has progressed alongside advancements in structural biology techniques, but it comes with its own set of challenges and corresponding solutions. In this review, we discuss recent advances in RNA structure analysis techniques, including structural probing methods, X-ray crystallography, nuclear magnetic resonance, cryo-electron microscopy, and small-angle X-ray scattering. Often, a combination of multiple techniques is employed for the integrated analysis of RNA structures. We also survey important RNA structures that have been recently determined using various techniques.
To enable diverse functions and precise regulation, an RNA sequence often folds into complex yet distinct structures in different cellular states. Probing RNA in its native environment is essential to uncovering RNA structures in their biological contexts. However, current methods generally require large amounts of input RNA and are challenging to apply in physiologically relevant settings. Here, we report smartSHAPE, a new RNA structure probing method that requires very low amounts of RNA input owing to greatly reduced artefacts in the probing signals and increased efficiency of library construction. Using smartSHAPE, we profiled the RNA structure landscape of mouse intestinal macrophages upon inflammation, and provided evidence that RNA conformational changes regulate immune responses. These results demonstrate that smartSHAPE can greatly expand the scope of RNA structure-based investigations in practical biological systems, and also provide a research paradigm for the study of post-transcriptional regulation.
As more information is gathered on the mechanisms of transcription and translation, it is becoming apparent that these processes are highly regulated. The formation of mRNA secondary and tertiary structures is one such regulatory process that until recently had not been analysed in depth. Formation of these mRNA structures has the potential to enhance and inhibit alternative splicing of transcripts, and to regulate the rate and amount of translation. As this regulatory mechanism potentially acts at both the transcriptional and translational level, while also potentially utilising the vast array of non-coding RNAs, it warrants further investigation. Currently, a variety of high-throughput sequencing techniques, including parallel analysis of RNA structure (PARS), fragmentation sequencing (FragSeq) and selective 2'-hydroxyl acylation analysed by primer extension (SHAPE), lead the way in the genome-wide identification and analysis of mRNA structure formation. These new sequencing techniques highlight the diversity and complexity of the transcriptome, and demonstrate another regulatory mechanism that could become a target for new therapeutic approaches.
RNA folds into intricate structures that are crucial for its functions and regulation. To date, a multitude of approaches for probing structures of the whole transcriptome, i.e., RNA structuromes, have been developed. Applications of these approaches to different cell lines and tissues have generated a rich resource for the study of RNA structure-function relationships at a systems biology level. In this review, we first introduce the designs of these methods and their applications to the study of different RNA structuromes. We emphasize their technological differences, especially their unique advantages and caveats. We then summarize the structural insights into RNA functions and regulation obtained from studies of RNA structuromes. Finally, we propose potential directions for future improvements and studies.
RNAs play crucial and versatile roles in biological processes. Computational prediction approaches can help to understand RNA structures and their stabilizing factors, thus providing information on their functions and facilitating the design of new RNAs. Machine learning (ML) techniques have made tremendous progress in many fields in the past few years. Although their usage in protein-related fields has a long history, the use of ML methods in predicting RNA tertiary structures is new and rare. Here, we review recent advances in using ML methods for RNA structure prediction and discuss the advantages and limitations, as well as the difficulties and potential, of these approaches when applied in the field.
[Objective] To examine a grammar model based on lexical substring extraction for RNA secondary structure prediction. [Method] By introducing the cloud model into the stochastic grammar model, a machine learning algorithm suitable for the lexicalized stochastic grammar model was proposed. The word grid mode was used to extract and divide the RNA sequence to acquire lexical substrings, and the cloud classifier was used to search for the maximum probability of each lemma being marked as a certain secondary structure type. Then, the lemma information was introduced into the stochastic grammar training process as prior information, realizing the prediction of RNA secondary structure, and the method was tested experimentally. [Result] The experimental results showed that the prediction accuracy and search speed of the stochastic grammar cloud model were significantly improved over prediction with a simple stochastic grammar. [Conclusion] This study laid the foundation for the wide application of stochastic grammar models to RNA secondary structure prediction.
RNAs play crucial and versatile roles in cellular biochemical reactions. Since experimental approaches for determining their three-dimensional (3D) structures are costly and relatively inefficient, it is greatly advantageous to develop computational methods to predict RNA 3D structures. For these methods, designing a model or scoring function for structure quality assessment is an essential step, but this step poses challenges. In this study, we designed and trained a deep learning model to tackle this problem. The model was based on a graph convolutional network (GCN) and named RNAGCN. The model provided a natural way of representing RNA structures, avoided complex algorithms to preserve atomic rotational equivalence, and was capable of extracting features automatically from structural patterns. Testing results on two datasets demonstrated that RNAGCN performs similarly to or better than four leading scoring functions. Our approach provides an alternative way of assessing RNA tertiary structures and may facilitate RNA structure prediction. RNAGCN can be downloaded from https://gitee.com/dcw-RNAGCN/rnagcn.
We have previously reported that the human ACAT1 gene produces a chimeric mRNA through the interchromosomal processing of two discontinuous RNAs transcribed from chromosomes 1 and 7. The chimeric mRNA uses AUG1397-1399 and GGC1274-1276 as translation initiation codons to produce normal 50-kDa ACAT1 and a novel enzymatically active 56-kDa isoform, respectively, with the latter being authentically present in human cells, including human monocyte-derived macrophages. In this work, we report that RNA secondary structures located in the vicinity of the GGC1274-1276 codon are required for production of the 56-kDa isoform. The effects of the three predicted stem-loops (nt 1255-1268, 1286-1342 and 1355-1384) were tested individually by transfecting cells with expression plasmids that contained the wild-type, deleted or mutant stem-loop sequences linked to a partial ACAT1 AUG open reading frame (ORF) or to the ORFs of other genes. The expression patterns were monitored by western blot analyses. We found that the upstream stem-loop 1255-1268 from chromosome 7 and the downstream stem-loop 1286-1342 from chromosome 1 were needed for production of the 56-kDa isoform, whereas the last stem-loop 1355-1384 from chromosome 1 was dispensable. The results of experiments using both monocistronic and bicistronic vectors with a stable hairpin showed that translation initiation from the GGC1274-1276 codon was mediated by an internal ribosome entry site (IRES). Further experiments revealed that translation initiation from the GGC1274-1276 codon requires the upstream AU-constituted RNA secondary structure and the downstream GC-rich structure. This mechanistic work provides further support for the biological significance of the chimeric nature of the human ACAT1 transcript.
A novel method for the prediction of RNA secondary structure was proposed based on particle swarm optimization (PSO). PSO is known to be effective in solving many different types of optimization problems and for being able to approximate the global optimum in the solution space. We designed an efficient objective function according to the minimum free energy, the number of selected stems and the average length of selected stems. We calculated how many legal stems there were in the sequence, and selected some of them to obtain an optimal result using PSO guided by the objective function. A method based on improved particle swarm optimization (IPSO) was proposed to predict RNA secondary structure, which consisted of three stages. The first stage encoded the source sequences and explored all the legal stems, creating a set of encoded stems as input data for the second stage. In the second stage, IPSO was responsible for structure selection. Finally, the optimal result was obtained from the secondary structures selected via IPSO. Nine sequences from the comparative RNA website were selected for the evaluation of the proposed method. Compared with six other methods, the proposed method decreased the complexity and enhanced the sensitivity and specificity according to the experimental results.
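The first stage described above, exploring all legal stems, can be illustrated with a minimal Python sketch. It is not the authors' implementation: the pairing rules (Watson-Crick plus G-U wobble), the minimum stem length `min_len`, the minimum hairpin loop size `min_loop`, and the function name `find_stems` are all assumptions chosen for illustration, and the PSO/IPSO selection stage is not shown.

```python
from typing import List, Tuple

# Assumed pairing rules: Watson-Crick pairs plus the G-U wobble pair.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def find_stems(seq: str, min_len: int = 3, min_loop: int = 3) -> List[Tuple[int, int, int]]:
    """Enumerate maximal candidate stems in an RNA sequence.

    Each stem is returned as (i, j, length), meaning that base i+k pairs
    with base j-k for every k in range(length) (0-based indices).
    """
    seq = seq.upper()
    n = len(seq)
    stems = []
    for i in range(n):
        for j in range(i + min_loop + 1, n):
            if (seq[i], seq[j]) not in PAIRS:
                continue
            # Skip pairs that merely continue a stem starting one step outward.
            if i > 0 and j < n - 1 and (seq[i - 1], seq[j + 1]) in PAIRS:
                continue
            length = 0
            # Extend inward while bases pair and the enclosed loop stays large enough.
            while (i + length < j - length - min_loop
                   and (seq[i + length], seq[j - length]) in PAIRS):
                length += 1
            if length >= min_len:
                stems.append((i, j, length))
    return stems


if __name__ == "__main__":
    print(find_stems("GGGAAAACCC"))
```

A selection stage such as the PSO described in the abstract would then choose a mutually compatible subset of these candidate stems according to its objective function.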
Secondary structures of RNAs are the basis for understanding their tertiary structures and functions, so their prediction is widely needed given the increasing discovery of noncoding RNAs. In recent decades, many methods have been proposed to predict RNA secondary structures, but their accuracy has reached a bottleneck. Here we present a method for RNA secondary structure prediction using direct coupling analysis and a remove-and-expand algorithm that shows better performance than four existing popular multiple-sequence methods. We further show that the results can also be used to improve the prediction accuracy of single-sequence methods.
A simple stepwise folding process has been developed to simulate RNA secondary structure formation. Modifications to the energy parameters of various loops were included in the program. Five possible types of pseudoknots, including the well-known H-type pseudoknot, were permitted to occur if reasonable. We have applied this approach to a number of RNA sequences. The prediction accuracies we obtained were higher than those in published papers.
RNA tertiary structure is essential to understanding RNA function and biological processes. Unfortunately, it is still challenging to determine the structure of large RNAs by direct experimentation or computational modeling. One promising approach is first to predict the tertiary contacts and then use the contacts as constraints to model the structure. RNA structure modeling therefore depends on the accuracy of contact prediction. Although many contact prediction methods have been developed in the protein field, there are only a few contact prediction methods in the RNA field at present. Here, we first review the theoretical basis and test the performance of recent RNA contact prediction methods for tertiary structure and complex modeling problems. Then, we summarize the advantages and limitations of these RNA contact prediction methods. Finally, we suggest some future directions for this rapidly expanding field.
RNAs carry out diverse biological functions, partly because different conformations of the same RNA sequence can play different roles in cellular activities. Fully understanding the biological functions of RNAs requires a conceptual framework for investigating the folding kinetics of RNA molecules, rather than native structures alone. Over the past several decades, many experimental and theoretical methods have been developed to address RNA folding. The helix-based RNA folding theory uses helices as building blocks to calculate the folding kinetics of secondary structures with pseudoknots of long RNAs in two different folding scenarios. Here, we briefly review the helix-based RNA folding theory and its application in exploring the regulation mechanisms of several riboswitches and the self-cleavage activities of the hepatitis delta virus (HDV) ribozyme.
The attenuated vaccine strains of CSFV have a 12-nucleotide (nt) insertion in the 3'-UTR of the genome compared with CSFV virulent strains. In this study, we found distinct heterogeneity in the 3'-UTR of the attenuated Thiverval and HCLV strains. The longest 3'-UTR of the Thiverval strain was 259 base pairs (bp) with a 32-nt insertion, and the shortest 3'-UTR had only 233 bp with a 6-nt insertion. The longest 3'-UTR of the HCLV strain was 244 bp with a 17-nt insertion and the shortest 3'-UTR was 235 bp with an 8-nt insertion. Compared with the published 3'-UTR sequences of vaccine and virulent strains, the 3'-UTRs of CSFV vaccine strains have two variable regions where insertions among the different vaccine strains were frequently found. The first is located between the second conservative TALk codon and the start of the T-rich region, where we found variable-length insertions within the same vaccine strain (Thiverval or HCLV). The second is located between the end of the T-rich region and the front of the GAA codon; however, a 4-nt deletion was found in this region in the virulent Shimen strain. These two regions may represent "hot spots" for mutation. Modeling the secondary structures of the 3'-UTR suggests that the T-rich insertion could change the structure and free energy, thus affecting the stability of the 3'-UTR structure. These findings will help to understand the mechanism of attenuated vaccines and improve vaccine safety, stability, and efficacy.
To investigate how synonymous codons have been adapted to the formation of ribonucleic acid (RNA) G-quadruplex (rG4) structures, the computational searching algorithm G4Hunter was applied to detect rG4 structures in protein-coding sequences of mRNAs in five eukaryotic species. The native sequences forming rG4s were then compared with randomized sequences to evaluate selection on synonymous codons. Factors that may influence the formation of rG4s were also investigated, and the selection pressures on rG4s in different gene regions were compared to explore their potential roles in gene regulation. The results show that universal selective pressure acts on synonymous codons in rG4 regions to facilitate rG4 formation in five eukaryotic organisms. While G-rich codon combinations are preferred in the rG4 structural region, C-rich codon combinations are selectively unfavorable for rG4 formation. A gene's codon usage bias, nucleotide composition, and evolutionary rate can account for the selective variations on synonymous codons among rG4 structures within a species. Moreover, rG4 structures in the translational initiation region showed significantly higher selective pressures than those in the translational elongation region.
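As a rough illustration of how a G4Hunter-style search scores a sequence, the sketch below follows the commonly described scheme: each base in a G run scores +min(run length, 4), each base in a C run scores the corresponding negative value, every other base scores 0, and the per-base scores are averaged over a sliding window. The function names and the default window of 25 nt are assumptions for illustration; the abstract does not state which parameters or thresholds were used in the study.

```python
from typing import List


def base_scores(seq: str) -> List[int]:
    """Per-base G4Hunter-style scores: G runs positive, C runs negative, capped at 4."""
    seq = seq.upper()
    scores = [0] * len(seq)
    i = 0
    while i < len(seq):
        j = i
        # Find the end of the run of identical bases starting at i.
        while j < len(seq) and seq[j] == seq[i]:
            j += 1
        run = min(j - i, 4)
        if seq[i] == "G":
            value = run
        elif seq[i] == "C":
            value = -run
        else:
            value = 0
        for k in range(i, j):
            scores[k] = value
        i = j
    return scores


def window_scores(seq: str, window: int = 25) -> List[float]:
    """Mean per-base score for every window of the given size (assumed default 25)."""
    scores = base_scores(seq)
    if len(scores) < window:
        return []
    total = sum(scores[:window])
    out = [total / window]
    for start in range(1, len(scores) - window + 1):
        # Slide the window by one base: add the new base, drop the old one.
        total += scores[start + window - 1] - scores[start - 1]
        out.append(total / window)
    return out


if __name__ == "__main__":
    print(base_scores("GGGACC"))
    print(window_scores("GGGACC", window=3))
```

Windows whose mean score exceeds a chosen threshold would be reported as candidate rG4-forming regions; strongly negative windows indicate C-rich stretches, consistent with the abstract's observation that C-rich codon combinations disfavor rG4 formation.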
As a cornerstone of the central dogma of molecular biology, RNA plays vital roles in living organisms. Over the past few decades, many RNA labeling technologies have been developed to elucidate the biological function of RNA. These technologies have significantly advanced our understanding of RNA secondary structure, localization, and turnover. Additionally, taking advantage of these innovative RNA labeling approaches, many toolkits have been devised for the regulation of RNA-related biological processes, such as gene expression and gene editing. In this review, we primarily focus on an array of intracellular RNA labeling methods, encompassing chemical probe-based labeling, metabolic labeling, and proximity-dependent labeling. We also provide a brief overview of their applications in RNA biology research. Finally, perspectives on RNA labeling are discussed.
The potato rot nematode (Ditylenchus destructor) is an economically important nematode of agronomic and horticultural plants worldwide. In this study, 43 populations of D. destructor were collected from different hosts across China, including 37 populations from Chinese herbal medicine plants. The sequences obtained for the ITS-rDNA and the D2–D3 region of 28S-rDNA of D. destructor were compared and analyzed. Nine types of significant length variation in ITS sequences were observed among all populations. The differences in ITS1 length were mainly caused by the presence of repetitive elements with substantial base substitutions. Reconstructions of ITS1 secondary structures showed that the minisatellites formed a stem structure. Ten haplotypes were observed in all populations based on mutations and variations of helix H9. Among them, 3 known haplotypes (A–C) were found in 7 populations isolated from potato, sweet potato, and Codonopsis pilosula, and 7 unique haplotypes were found in the other 36 populations collected from C. pilosula and Angelica sinensis when compared with the 7 haplotypes (A–G) of Subbotin's system. These unique haplotypes were different from haplotypes A–G, and we named them haplotypes H–N. The present results show that a total of 14 haplotypes (A–N) of ITS-rDNA have been found in D. destructor. Phylogenetic analyses of ITS-rDNA and D2–D3 showed that all populations of D. destructor clustered into two major clades: one clade containing only haplotype A from sweet potato and the other containing haplotypes B–N from other plants. For further verification, PCR-ITS-RFLP profiles were generated for the 7 new haplotypes. Collectively, our study suggests that D. destructor populations on Chinese medicinal materials are very different from those on other hosts, and this work provides a paradigm for related research.
Genomic surveillance of monkeypox virus (MPXV) is essential to explore the reasons for its unusual outbreak. Current phylogenomic analysis of the MPXV genome mainly focuses on the effect of amino acid mutations. Herein, we explore the evolutionary variation of an RNA G-quadruplex (RG4) of MPXV and find that the genome evolution of MPXV can also produce new effects through changes in the RG4 structure. This RG4 is located in MPXV’s only Kelch-like gene, C9L, which encodes an antagonist of the innate immune response. The evolution of this virus increases the unfolding kinetic constant of the C9L RG4 and promotes the C9 protein level in living cells. Importantly, all reported MPXV genomes in 2022 carry the C9L-RG4-5 pattern with the highest unfolding kinetic constant. Additionally, the RG4 ligand RGB-1 can impede the unfolding of C9L-RG4-5 and thereby reduce the C9 protein level. These findings carve out a new path to comprehensively understanding MPXV virology.
Protein binding is essential to the transport, decay and regulation of almost all RNA molecules. However, the structural preference of protein binding on RNAs and their cellular functions and dynamics upon changing environmental conditions are poorly understood. Here, we integrated various high-throughput data and introduced a computational framework to describe the global interactions between RNA binding proteins (RBPs) and structured RNAs in yeast at single-nucleotide resolution. We found that on average, in terms of percentage of total length, ~15% of mRNA untranslated regions (UTRs), ~37% of canonical non-coding RNAs (ncRNAs) and ~11% of long ncRNAs (lncRNAs) are bound by proteins. The RBP binding sites, in general, tend to occur at single-stranded loops, with evolutionarily conserved signatures, and often facilitate a specific RNA structure conformation in vivo. We found that four nucleotide modifications of tRNA are significantly associated with RBP binding. We also identified various structural motifs bound by RBPs in the UTRs of mRNAs, associated with localization, degradation and stress responses. Moreover, we identified >200 novel lncRNAs bound by RBPs, and about half of them contain conserved secondary structures. We present the first ensemble pattern of RBP binding sites in the structured non-coding regions of a eukaryotic genome, emphasizing their structural context and cellular functions.
Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 11074191, 11175132, and 11374234), the National Basic Research Program of China (Grant No. 2011CB933600), and the Program for New Century Excellent Talents of China (Grant No. NCET 08-0408).
Funding: National Key R&D Program of China (2021YFA1301500, 2017YFA0504600, 2022YFC2303700, 2022YFA1302700, 2022YFF1203100); National Natural Science Foundation of China (U1832215, 32171191, 91940302, 32230018 and 32125007); Strategic Priority Research Program of Chinese Academy of Sciences (XDB37010201, XDB0490000); Center for Advanced Interdisciplinary Science and Biomedicine of IHM (QYPY20220019); the Fundamental Research Funds for the Central Universities (WK9100000032 and WK9100000044); Guangdong Science and Technology Department (2022A1515010328, 2020B1212060018 and 2020B1212030004); the Postdoctoral Foundation of Tsinghua-Peking Center for Life Sciences [to J.Z.]; the Beijing Advanced Innovation Center for Structural Biology [to Q.C.Z.]; the Tsinghua-Peking Joint Center for Life Sciences [to Q.C.Z.].
Funding: The National Key R&D Program of China (2019YFA0110002 and 2018YFA0107603 to Q.C.Z., and 2020YFA0509100 to X.H.); National Natural Science Foundation of China (Grant Nos. 32125007, 91940306, 91740204, and 31761163007 to Q.C.Z., and 31725010, 31821003, 31991174, 32030037, 82150105 to X.H.); Research Grants Council of the Hong Kong SAR, China, Project No. N_CityU110/17 to C.K.K.
Funding: Supported by the National Natural Science Foundation of China (Grant No. 31671355) and the National Thousand Young Talents Program of China to QCZ.
Funding: Project supported by the National Natural Science Foundation of China (Grant Nos. 11774158, 11974173, 11774157, and 11934008).
Abstract: RNAs play crucial and versatile roles in biological processes. Computational prediction approaches can help to understand RNA structures and their stabilizing factors, thus providing information on their functions and facilitating the design of new RNAs. Machine learning (ML) techniques have made tremendous progress in many fields in the past few years. Although their usage in protein-related fields has a long history, the use of ML methods in predicting RNA tertiary structures is new and rare. Here, we review recent advances in using ML methods for RNA structure prediction and discuss the advantages and limitations, as well as the difficulties and potential, of these approaches when applied in the field.
Funding: Supported by the Science Foundation of Hengyang Normal University of China (09A36).
Abstract: [Objective] To examine a grammar model based on lexical substring extraction for RNA secondary structure prediction. [Method] By introducing the cloud model into the stochastic grammar model, a machine learning algorithm suitable for the lexicalized stochastic grammar model was proposed. The word grid mode was used to extract and divide the RNA sequence to acquire lexical substrings, and the cloud classifier was used to search for the maximum probability of each lemma being marked as a certain secondary structure type. The lemma information was then introduced into the stochastic grammar training process as prior information, realizing the prediction of RNA secondary structure, and the method was tested experimentally. [Result] The experimental results showed that the prediction accuracy and search speed of the stochastic grammar cloud model were significantly improved over prediction with a simple stochastic grammar. [Conclusion] This study laid the foundation for the wide application of stochastic grammar models to RNA secondary structure prediction.
Funding: Funded by the National Natural Science Foundation of China (Grant Nos. 11774158 to JZ, 11934008 to WW, and 11974173 to WFL).
Abstract: RNAs play crucial and versatile roles in cellular biochemical reactions. Since experimental approaches to determining their three-dimensional (3D) structures are costly and less efficient, it is greatly advantageous to develop computational methods to predict RNA 3D structures. For these methods, designing a model or scoring function for structure quality assessment is an essential step, but this step poses challenges. In this study, we designed and trained a deep learning model to tackle this problem. The model was based on a graph convolutional network (GCN) and named RNAGCN. The model provided a natural way of representing RNA structures, avoided complex algorithms to preserve atomic rotational equivalence, and was capable of extracting features automatically out of structural patterns. Testing results on two datasets convincingly demonstrated that RNAGCN performs similarly to or better than four leading scoring functions. Our approach provides an alternative way of assessing RNA tertiary structure and may facilitate RNA structure prediction. RNAGCN can be downloaded from https://gitee.com/dcw-RNAGCN/rnagcn.
Abstract: We have previously reported that the human ACAT1 gene produces a chimeric mRNA through the interchromosomal processing of two discontinuous RNAs transcribed from chromosomes 1 and 7. The chimeric mRNA uses AUG1397-1399 and GGC1274-1276 as translation initiation codons to produce normal 50-kDa ACAT1 and a novel enzymatically active 56-kDa isoform, respectively, with the latter being authentically present in human cells, including human monocyte-derived macrophages. In this work, we report that RNA secondary structures located in the vicinity of the GGC1274-1276 codon are required for production of the 56-kDa isoform. The effects of the three predicted stem-loops (nt 1255-1268, 1286-1342 and 1355-1384) were tested individually by transfecting cells with expression plasmids that contained the wild-type, deleted or mutant stem-loop sequences linked to a partial ACAT1 AUG open reading frame (ORF) or to the ORFs of other genes. The expression patterns were monitored by western blot analyses. We found that the upstream stem-loop1255-1268 from chromosome 7 and the downstream stem-loop1286-1342 from chromosome 1 were needed for production of the 56-kDa isoform, whereas the last stem-loop1355-1384 from chromosome 1 was dispensable. The results of experiments using both monocistronic and bicistronic vectors with a stable hairpin showed that translation initiation from the GGC1274-1276 codon was mediated by an internal ribosome entry site (IRES). Further experiments revealed that translation initiation from the GGC1274-1276 codon requires the upstream AU-constituted RNA secondary structure and the downstream GC-rich structure. This mechanistic work provides further support for the biological significance of the chimeric nature of the human ACAT1 transcript.
Funding: Supported by the National Natural Science Foundation of China (No. 60971089).
Abstract: A novel method for the prediction of RNA secondary structure was proposed based on particle swarm optimization (PSO). PSO is known to be effective in solving many different types of optimization problems and to be able to approximate the global optimum in the solution space. We designed an efficient objective function according to the minimum free energy, the number of selected stems and the average length of selected stems. We calculated how many legal stems there were in the sequence, and selected some of them to obtain an optimal result using PSO in light of the objective function. A method based on improved particle swarm optimization (IPSO) was proposed to predict RNA secondary structure, which consisted of three stages. The first stage encoded the source sequences and explored all the legal stems; a set of encoded stems was then created to prepare input data for the second stage. In the second stage, IPSO was responsible for structure selection. Finally, the optimal result was obtained from the secondary structures selected via IPSO. Nine sequences from the comparative RNA website were selected for the evaluation of the proposed method. Compared with six other methods, the proposed method decreased the complexity and enhanced the sensitivity and specificity according to the experimental results.
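The first stage described in this abstract (exploring all legal stems in a sequence) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `find_stems`, the minimum stem length of 3 base pairs, the minimum hairpin loop of 3 nucleotides, and the inclusion of G-U wobble pairs are all assumptions chosen for illustration.

```python
# Hypothetical sketch: enumerate maximal candidate stems in an RNA sequence.
# Pairing rules (Watson-Crick plus G-U wobble) are an assumption.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def find_stems(seq, min_len=3, min_loop=3):
    """Return (i, j, length) tuples describing maximal stems.

    A stem pairs position i+k with position j-k for k in range(length),
    using 0-based indices. Every pair must enclose at least `min_loop`
    bases, and only stems of at least `min_len` pairs are reported.
    """
    seq = seq.upper()
    n = len(seq)

    def can_pair(a, b):
        # Valid indices, enough enclosed bases, and complementary bases.
        return 0 <= a < b < n and b - a - 1 >= min_loop and (seq[a], seq[b]) in PAIRS

    stems = []
    for i in range(n):
        for j in range(i + min_loop + 1, n):
            if not can_pair(i, j):
                continue
            # Skip pairs that merely continue a stem starting further out.
            if can_pair(i - 1, j + 1):
                continue
            length = 0
            while can_pair(i + length, j - length):
                length += 1
            if length >= min_len:
                stems.append((i, j, length))
    return stems
```

For the toy sequence `GGGAAAACCC`, the only stem found under these assumptions pairs positions 0-2 with positions 9-7. A selection step such as PSO would then choose a compatible subset of the enumerated stems according to the objective function.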
Funding: Project supported by the National Natural Science Foundation of China (Grant No. 31570722).
Abstract: Secondary structures of RNAs are the basis for understanding their tertiary structures and functions, so their prediction is widely needed given the increasing discovery of noncoding RNAs. In recent decades, many methods have been proposed to predict RNA secondary structures, but their accuracies have hit a bottleneck. Here we present a method for RNA secondary structure prediction using direct coupling analysis and a remove-and-expand algorithm that shows better performance than four existing popular multiple-sequence methods. We further show that the results can also be used to improve the prediction accuracy of single-sequence methods.
Abstract: A simple stepwise folding process has been developed to simulate RNA secondary structure formation. Modifications of the energy parameters of various loops were included in the program. Five possible types of pseudoknots, including the well-known H-type pseudoknot, were permitted to occur if reasonable. We have applied this approach to a number of RNA sequences. The prediction accuracies we obtained were higher than those in published papers.
Funding: Project supported by the National Natural Science Foundation of China (Grant No. 11704140) and the Self-determined Research Funds of CCNU from the Colleges' Basic Research and Operation of MOE (Grant No. CCNU20TS004).
Abstract: RNA tertiary structure is essential to understanding RNA function and biological processes. Unfortunately, it is still challenging to determine large RNA structures by direct experimentation or computational modeling. One promising approach is first to predict the tertiary contacts and then to use the contacts as constraints to model the structure. RNA structure modeling therefore depends on the accuracy of contact prediction. Although many contact prediction methods have been developed in the protein field, there are only a few contact prediction methods in the RNA field at present. Here, we first review the theoretical basis and test the performance of recent RNA contact prediction methods for tertiary structure and complex modeling problems. Then, we summarize the advantages and limitations of these RNA contact prediction methods. Finally, we suggest some future directions for this rapidly expanding field.
Funding: Project supported by the Science Fund from the Key Laboratory of Hubei Province, China (Grant No. 201932003) and the National Natural Science Foundation of China (Grant Nos. 1157324 and 31600592).
Abstract: RNAs carry out diverse biological functions, partly because different conformations of the same RNA sequence can play different roles in cellular activities. Fully understanding the biological functions of RNAs requires a conceptual framework for investigating the folding kinetics of RNA molecules, rather than native structures alone. Over the past several decades, many experimental and theoretical methods have been developed to address RNA folding. The helix-based RNA folding theory uses helices as building blocks to calculate the folding kinetics of secondary structures with pseudoknots for long RNAs in two different folding scenarios. Here, we briefly review the helix-based RNA folding theory and its application in exploring the regulation mechanisms of several riboswitches and the self-cleavage activities of the hepatitis delta virus (HDV) ribozyme.
Funding: Supported by the National Natural Science Foundation of China (30571377) and the National High-Tech R&D Program of China (863 Program, 2006AA10A204).
Abstract: The attenuated vaccine strains of CSFV have a 12-nucleotide (nt) insertion in the 3'-UTR of the genome compared with CSFV virulent strains. In this study, we found distinct heterogeneity in the 3'-UTR of the attenuated Thiverval and HCLV strains. The longest 3'-UTR of the Thiverval strain was 259 base pairs (bp) with a 32-nt insertion, and the shortest had only 233 bp with a 6-nt insertion. The longest 3'-UTR of the HCLV strain was 244 bp with a 17-nt insertion, and the shortest was 235 bp with an 8-nt insertion. Compared with the published 3'-UTR sequences of vaccine and virulent strains, the 3'-UTRs of CSFV vaccine strains have two variable regions where insertions among the different vaccine strains were frequently found. The first is located between the second conservative TALk codon and the start of the T-rich region, where we found variable-length insertions within the same vaccine strain (Thiverval or HCLV). The second is located between the end of the T-rich region and the front of the GAA codon; however, a 4-nt deletion was found in this region in the virulent Shimen strain. These two regions may represent "hot spots" for mutation. Modeling the secondary structures of the 3'-UTR suggests that the T-rich insertion could change the structure and free energy, thus affecting the stability of the 3'-UTR structure. These findings will help to understand the mechanism of attenuated vaccines and improve vaccine safety, stability, and efficacy.
Funding: The National Key Research and Development Program of China (Nos. 2018YFC1314900, 2018YFC1314902), the National Natural Science Foundation of China (No. 61571109), and the Fundamental Research Funds for the Central Universities (No. 2242017K3DN04).
Abstract: To investigate how synonymous codons have been adapted to the formation of ribonucleic acid (RNA) G-quadruplex (rG4) structures, the computational search algorithm G4Hunter was applied to detect rG4 structures in protein-coding sequences of mRNAs in five eukaryotic species. The native sequences forming rG4s were then compared with randomized sequences to evaluate selection on synonymous codons. Factors that may influence the formation of rG4s were also investigated, and the selection pressures on rG4s in different gene regions were compared to explore their potential roles in gene regulation. The results show that universal selective pressure acts on synonymous codons in rG4 regions to facilitate rG4 formation in the five eukaryotic organisms. While G-rich codon combinations are preferred in rG4 structural regions, C-rich codon combinations are selectively unfavorable for rG4 formation. A gene's codon usage bias, nucleotide composition, and evolutionary rate can account for the selective variations on synonymous codons among rG4 structures within a species. Moreover, rG4 structures in the translational initiation region showed significantly higher selective pressures than those in the translational elongation region.
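To make the detection step concrete, the following Python sketch shows a G4Hunter-style score as the scheme is commonly described: each G in a run of n consecutive Gs scores min(n, 4), each C in a run scores the negative equivalent, other bases score 0, and the per-base scores are averaged over a sliding window. The function name `g4hunter_scores` and the default window of 25 nt are assumptions for illustration; the abstract does not state the window or threshold used in the study, and this is not the official G4Hunter code.

```python
def g4hunter_scores(seq, window=25):
    """Return (per_base_scores, window_means) for a G4Hunter-style scan.

    Hypothetical sketch: runs of G score +min(run, 4) per base, runs of C
    score -min(run, 4) per base, and all other bases score 0.
    """
    seq = seq.upper()
    n = len(seq)
    per_base = [0] * n
    i = 0
    while i < n:
        base = seq[i]
        j = i
        # Find the end of the current run of identical bases.
        while j < n and seq[j] == base:
            j += 1
        run = j - i
        if base == "G":
            value = min(run, 4)
        elif base == "C":
            value = -min(run, 4)
        else:
            value = 0
        for k in range(i, j):
            per_base[k] = value
        i = j
    if n < window:
        # Sequence shorter than one window: no window means to report.
        return per_base, []
    window_means = [
        sum(per_base[s:s + window]) / window for s in range(n - window + 1)
    ]
    return per_base, window_means
```

Windows whose mean score exceeds a chosen threshold would be flagged as candidate rG4 regions; the threshold is a user parameter and is left out of the sketch deliberately.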
Funding: Supported by grants from the National Natural Science Foundation of China (92253202 and 22177087 to X.W.), the Ministry of Science and Technology (2023YFC3402200), and the Fundamental Research Funds for the Central Universities (2042023kfyq05).
Abstract: As a cornerstone of the central dogma of molecular biology, RNA plays vital roles in living organisms. Over the past few decades, many RNA labeling technologies have been developed to elucidate the biological function of RNA. These technologies have significantly advanced our understanding of RNA secondary structure, localization, and turnover. Additionally, taking advantage of these innovative RNA labeling approaches, plenty of tool kits have been devised for the regulation of RNA-related biological processes, such as gene expression and gene editing. In this review, we primarily focus on an array of intracellular RNA labeling methods, encompassing chemical probe-based labeling, metabolic labeling, and proximity-dependent labeling. We also provide a brief overview of their applications in RNA biology research. Finally, perspectives on RNA labeling are discussed.
Funding: Supported by the National Natural Science Foundation of China (31760507) and the National Key R&D Program of China (2018YFC1706301).
Abstract: The potato rot nematode (Ditylenchus destructor) is an economically important nematode of agronomic and horticultural plants worldwide. In this study, 43 populations of D. destructor were collected from different hosts across China, including 37 populations from Chinese herbal medicine plants. The obtained sequences of the ITS-rDNA and D2–D3 of 28S-rDNA genes of D. destructor were compared and analyzed. Nine types of significant length variation in ITS sequences were observed among all populations. The differences in ITS1 length were mainly caused by the presence of repetitive elements with substantial base substitutions. Reconstructions of ITS1 secondary structures showed that the minisatellites formed a stem structure. Ten haplotypes were observed in all populations based on mutations and variations of helix H9. Among them, 3 known haplotypes (A–C) were found in 7 populations isolated from potato, sweet potato, and Codonopsis pilosula, and 7 unique haplotypes were found in the other 36 populations collected from C. pilosula and Angelica sinensis when compared with the 7 haplotypes (A–G) of Subbotin's system. These unique haplotypes differed from haplotypes A–G, and we named them haplotypes H–N. The present results show that a total of 14 haplotypes (A–N) of ITS-rDNA have been found in D. destructor. Phylogenetic analyses of ITS-rDNA and D2–D3 showed that all populations of D. destructor clustered into two major clades: one containing only haplotype A from sweet potato and the other containing haplotypes B–N from other plants. For further verification, PCR-ITS-RFLP profiles were generated for the 7 new haplotypes. Collectively, our study suggests that D. destructor populations on Chinese medicinal materials are very different from those on other hosts, and this work provides a paradigm for related research.
Funding: Supported by the National Natural Science Foundation of China (grant nos. 22034004 and 22027807), the National Key Research and Development Program of China (grant no. 2021YFA1200104), the Strategic Priority Research Program of the Chinese Academy of Sciences (grant no. XDB36000000), and the Vanke Special Fund for Public Health and Health Discipline Development (grant no. 2022Z82WKJ003).
Abstract: Genomic surveillance of monkeypox virus (MPXV) is essential for exploring the reasons for its unusual outbreak. Current phylogenomic analysis of the MPXV genome mainly focuses on the effects of amino acid mutations. Herein, we explore the evolutionary variation of an RNA G-quadruplex (RG4) of MPXV and find that the genome evolution of MPXV can also produce new effects through changes in the RG4 structure. This RG4 is located in MPXV's only Kelch-like gene, C9L, which encodes an antagonist of the innate immune response. The evolution of this virus increases the unfolding kinetic constant of the C9L RG4 and raises the C9 protein level in living cells. Importantly, all reported MPXV genomes in 2022 carry the C9L-RG4-5 pattern, which has the highest unfolding kinetic constant. Additionally, the RG4 ligand RGB-1 can impede the unfolding of C9L-RG4-5 and thereby reduce the C9 protein level. These findings carve out a new path to comprehensively understanding MPXV virology.
Funding: Supported by the National Natural Science Foundation of China (31271402 and 31100601) and the National Key Basic Research Program (2012CB316503).
Abstract: Protein binding is essential to the transport, decay and regulation of almost all RNA molecules. However, the structural preference of protein binding on RNAs and their cellular functions and dynamics upon changing environmental conditions are poorly understood. Here, we integrated various high-throughput data and introduced a computational framework to describe the global interactions between RNA binding proteins (RBPs) and structured RNAs in yeast at single-nucleotide resolution. We found that on average, in terms of percent total lengths, ~15% of mRNA untranslated regions (UTRs), ~37% of canonical non-coding RNAs (ncRNAs) and ~11% of long ncRNAs (lncRNAs) are bound by proteins. The RBP binding sites, in general, tend to occur at single-stranded loops, with evolutionarily conserved signatures, and often facilitate a specific RNA structure conformation in vivo. We found that four nucleotide modifications of tRNA are significantly associated with RBP binding. We also identified various structural motifs bound by RBPs in the UTRs of mRNAs, associated with localization, degradation and stress responses. Moreover, we identified >200 novel lncRNAs bound by RBPs, and about half of them contain conserved secondary structures. We present the first ensemble pattern of RBP binding sites in the structured non-coding regions of a eukaryotic genome, emphasizing their structural context and cellular functions.