12 articles found
Comprehensive functional annotation of susceptibility variants identifies genetic heterogeneity between lung adenocarcinoma and squamous cell carcinoma (Cited by: 3)
1
Authors: Na Qin, Yuancheng Li, Cheng Wang, Meng Zhu, Juncheng Dai, Tongtong Hong, Demetrius Albanes, Stephen Lam, Adonina Tardon, Chu Chen, Gary Goodman, Stig E. Bojesen, Maria Teresa Landi, Mattias Johansson, Angela Risch, H-Erich Wichmann, Heike Bickeboller, Gadi Rennert, Susanne Arnold, Paul Brennan, John K. Field, Sanjay Shete, Loic Le Marchand, Olle Melander, Hans Brunnstrom, Geoffrey Liu, Rayjean J. Hung, Angeline Andrew, Lambertus A. Kiemeney, Shan Zienolddiny, Kjell Grankvist, Mikael Johansson, Neil Caporaso, Penella Woll, Philip Lazarus, Matthew B. Schabath, Melinda C. Aldrich, Victoria L. Stevens, Guangfu Jin, David C. Christiani, Zhibin Hu, Christopher I. Amos, Hongxia Ma, Hongbing Shen
Source: Frontiers of Medicine (SCIE, CAS, CSCD), 2021, Issue 2, pp. 275-291 (17 pages)
Although genome-wide association studies have identified more than eighty genetic variants associated with non-small cell lung cancer (NSCLC) risk, the biological mechanisms of these variants remain largely unknown. By integrating large-scale genotype data from 15581 lung adenocarcinoma (AD) cases, 8350 squamous cell carcinoma (SqCC) cases, and 27355 controls, as well as multiple transcriptome and epigenomic databases, we conducted histology-specific meta-analyses and functional annotations of both reported and novel susceptibility variants. We identified 3064 credible risk variants for NSCLC, which were overrepresented in enhancer-like and promoter-like histone modification peaks as well as DNase I hypersensitive sites. Transcription factor enrichment analysis revealed that USF1 was AD-specific while CREB1 was SqCC-specific. Functional annotation and gene-based analysis implicated 894 target genes, including 274 specific to AD and 123 specific to SqCC, which were overrepresented in somatic driver genes (ER = 1.95, P = 0.005). Pathway enrichment analysis and Gene-Set Enrichment Analysis revealed that AD genes were primarily involved in immune-related pathways, while SqCC genes were related to homologous recombination deficiency. Our results illustrate the molecular basis of both well-studied and new susceptibility loci of NSCLC, providing not only novel insights into the genetic heterogeneity between AD and SqCC but also a set of plausible gene targets for post-GWAS functional experiments.
Keywords: lung cancer; genome-wide association study; function annotation; immune; homologous recombination repair deficiency; genetic heterogeneity
Development of a panel of unigene-derived polymorphic EST–SSR markers in lentil using public database information (Cited by: 2)
2
Authors: Debjyoti Sen Gupta, Peng Cheng, Gaurav Sablok, Dil Thavarajah, Pushparajah Thavarajah, Clarice J. Coyne, Shiv Kumar, Michael Baum, Rebecca J. McGee
Source: The Crop Journal (SCIE, CAS, CSCD), 2016, Issue 5, pp. 425-433 (9 pages)
Lentil (Lens culinaris Medik.), a diploid (2n = 14) with a genome size greater than 4000 Mbp, is an important cool season food legume grown worldwide. The availability of genomic resources is limited in this crop species. The objective of this study was to develop polymorphic markers in lentil using publicly available curated expressed sequence tag (EST) information. In this study, 9513 ESTs were downloaded from the National Center for Biotechnology Information (NCBI) database to develop unigene-based simple sequence repeat (SSR) markers. The ESTs were assembled into 4053 unigenes and then analyzed to identify 374 SSRs using the MISA microsatellite identification tool. Among the 374 SSRs, 26 compound SSRs were observed. Primer pairs for these SSRs were designed using Primer3 version 1.14. To classify the functional annotation of ESTs and EST–SSRs, BLASTx searches (using E-value 1 × 10^(-5)) against the public UniProt (http://www.uniprot.org/) and NCBI (http://www.ncbi.nlh.nih.gov/) databases were performed. Further functional annotation was performed using PLAZA (version 3.0) comparative genomics, and GO annotation was summarized using the Plant GO slim category. Among the 312 synthesized primers, 219 successfully amplified Lens DNA. A diverse panel of 24 Lens genotypes was used to identify polymorphic markers. A polymorphic set of 57 markers successfully discriminated the test genotypes. This set of polymorphic markers with functional annotation data could be used as molecular tools in lentil breeding.
Keywords: Lens culinaris; EST-SSRs; functional annotation; Unigene sequences; EST database; Genetic resources
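The SSR identification step described in the abstract above can be illustrated with a minimal sketch. This is not the MISA tool itself: the function name `find_ssrs`, the restriction to di-, tri-, and tetranucleotide motifs, and the minimum repeat counts are all assumptions chosen for illustration, not values reported by the authors.

```python
import re

# Minimum number of tandem repeats per motif length.
# These thresholds are illustrative assumptions, not the study's settings.
MIN_REPEATS = {2: 6, 3: 5, 4: 5}


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return (start, motif, repeat_count) for perfect tandem repeats in seq."""
    seq = seq.upper()
    hits = []
    for unit_len, min_rep in sorted(min_repeats.items()):
        # A motif of unit_len bases followed by at least (min_rep - 1) copies.
        pattern = re.compile(
            r"(([ACGT]{%d})\2{%d,})" % (unit_len, min_rep - 1)
        )
        for match in pattern.finditer(seq):
            motif = match.group(2)
            # Skip motifs that are themselves repeats of a shorter unit
            # (e.g. "AA" or "ATAT"); those belong to the shorter motif class.
            if any(
                motif == motif[:k] * (unit_len // k)
                for k in range(1, unit_len)
                if unit_len % k == 0
            ):
                continue
            hits.append((match.start(), motif, len(match.group(1)) // unit_len))
    return sorted(hits)


if __name__ == "__main__":
    example = "GGC" + "AT" * 7 + "CCGT" + "GAA" * 5 + "TTC"
    for start, motif, count in find_ssrs(example):
        print(start, motif, count)
```

Each hit could then be passed, together with its flanking sequence, to a primer design step; that part is omitted here.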
LjaFGD: Lonicera japonica functional genomics database (Cited by: 4)
3
Authors: Qiaoqiao Xiao, Zhongqiu Li, Mengmeng Qu, Wenying Xu, Zhen Su, Jiaotong Yang
Source: Journal of Integrative Plant Biology (SCIE, CAS, CSCD), 2021, Issue 8, pp. 1422-1436 (15 pages)
Lonicera japonica Thunb., a traditional Chinese herb, has been used for treating human diseases for thousands of years. Recently, the genome of L. japonica has been decoded, providing valuable information for research into gene function. However, no comprehensive database for gene functional analysis and mining is available for L. japonica. We therefore constructed LjaFGD (www.gzybioinformatics.cn/LjaFGD and bioinformatics.cau.edu.cn/LjaFGD), a database for analyzing and comparing gene function in L. japonica. We constructed a gene co-expression network based on 77 RNA-seq samples, and then annotated genes of L. japonica by alignment against protein sequences from public databases. We also introduced several tools for gene functional analysis, including Blast, motif analysis, gene set enrichment analysis, heatmap analysis, and JBrowse. Our co-expression network revealed that MYB and WRKY transcription factor family genes were co-expressed with genes encoding key enzymes in the biosynthesis of chlorogenic acid and luteolin in L. japonica. We used flavonol synthase 1 (LjFLS1) as an example to show the reliability and applicability of our database. LjaFGD and its various associated tools will provide researchers with an accessible platform for retrieving functional information on L. japonica genes to further biological discovery.
Keywords: co-expression network; functional annotation; functional analysis tools; Lonicera japonica
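The abstract above mentions building a gene co-expression network from RNA-seq samples but does not state the method. One common approach, shown here purely as a sketch and not necessarily what the LjaFGD authors did, is to connect gene pairs whose expression profiles have a high absolute Pearson correlation. The gene names, the helper names, and the 0.9 threshold are hypothetical.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0  # a constant profile carries no co-expression signal
    return cov / (sd_x * sd_y)


def coexpression_edges(expr, threshold=0.9):
    """Return (gene1, gene2, r) for pairs with |r| >= threshold.

    expr maps a gene name to its expression values across samples.
    The threshold is an assumed, illustrative cutoff.
    """
    edges = []
    for g1, g2 in combinations(sorted(expr), 2):
        r = pearson(expr[g1], expr[g2])
        if abs(r) >= threshold:
            edges.append((g1, g2, round(r, 3)))
    return edges


if __name__ == "__main__":
    # Toy expression matrix with hypothetical gene names and four samples.
    toy = {
        "geneA": [1, 2, 3, 4],
        "geneB": [2, 4, 6, 8.5],
        "geneC": [4, 1, 3, 2],
    }
    print(coexpression_edges(toy))
```

With real data the expression values would typically be normalized first, and the all-pairs loop would be replaced by a vectorized correlation matrix.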
Protein sequence databases generated from metagenomics and public databases produced similar soil metaproteomic results of microbial taxonomic and functional changes
4
Authors: Yi XIONG, Lu ZHENG, Xiangxiang MENG, Ren Fang SHEN, Ping LAN
Source: Pedosphere (SCIE, CAS, CSCD), 2022, Issue 4, pp. 507-520 (14 pages)
Soil metaproteomics has excellent potential as a tool to elucidate the structural and functional changes in soil microbial communities in response to environmental alterations. However, soil metaproteomics is hindered by several challenges and gaps. Soil microbial communities possess extremely complex microbial composition, including many uncultured microorganisms without whole genome sequencing. Thus, how to select a suitable protein sequence database remains challenging in soil metaproteomics. In this study, the Public database and Meta-database were constructed using protein sequences from public databases and metagenomics, respectively. We comprehensively analyzed and compared the soil metaproteomic results using these two kinds of protein sequence databases for protein identification based on published soil metaproteomic raw data. The results demonstrated that many more proteins, higher sequence coverage, and even more microbial species and functional annotations could be identified using the Meta-database compared with those identified using the Public database. These findings indicated that the Meta-database was more specific as a protein sequence database. However, the follow-up in-depth metaproteomic analyses exhibited similar main results regardless of the database used. The microbial community composition at the genus level was similar between the two databases, especially the species annotations with high peptide-spectrum match and high abundance. The functional analyses in response to stress, such as the gene ontology enrichment of biological process and molecular function and the key functional microorganisms, were also similar regardless of the database. Our analysis revealed that the Public database could also meet the demand to explore the functional responses of microbial proteins to some extent. This study provides valuable insights into the choice of protein sequence databases and their impacts on subsequent bioinformatic analysis in soil metaproteomic research and will facilitate the optimization of experimental design for different purposes.
Keywords: bioinformatics; differentially accumulated protein; functional annotation; functional microorganism; Meta-database; microbial community; microbial species; Public database
m6A-TSHub: Unveiling the Context-specific m^(6)A Methylation and m^(6)A-affecting Mutations in 23 Human Tissues (Cited by: 1)
5
Authors: Bowen Song, Daiyun Huang, Yuxin Zhang, Zhen Wei, Jionglong Su, João Pedro de Magalhães, Daniel J. Rigden, Jia Meng, Kunqi Chen
Source: Genomics, Proteomics & Bioinformatics (SCIE, CAS, CSCD), 2023, Issue 4, pp. 678-694 (17 pages)
As the most pervasive epigenetic marker present on mRNAs and long non-coding RNAs (lncRNAs), N6-methyladenosine (m^(6)A) RNA methylation has been shown to participate in essential biological processes. Recent studies have revealed the distinct patterns of the m^(6)A methylome across human tissues, and a major challenge remains in elucidating the tissue-specific presence and circuitry of m^(6)A methylation. We present here a comprehensive online platform, m^(6)A-TSHub, for unveiling the context-specific m^(6)A methylation and genetic mutations that potentially regulate the m^(6)A epigenetic mark. m^(6)A-TSHub consists of four core components: (1) m^(6)A-TSDB, a comprehensive database of 184,554 functionally annotated m^(6)A sites derived from 23 human tissues and 499,369 m^(6)A sites from 25 tumor conditions; (2) m^(6)A-TSFinder, a web server for high-accuracy prediction of m^(6)A methylation sites within a specific tissue from RNA sequences, which was constructed using multi-instance deep neural networks with gated attention; (3) m^(6)A-TSVar, a web server for assessing the impact of genetic variants on tissue-specific m^(6)A RNA modifications; and (4) m^(6)A-CAVar, a database of 587,983 The Cancer Genome Atlas (TCGA) cancer mutations (derived from 27 cancer types) that were predicted to affect m^(6)A modifications in the primary tissue of cancers. The database should be a useful resource for studying the m^(6)A methylome and the genetic factors of epitranscriptome disturbance in a specific tissue (or cancer type). m^(6)A-TSHub is accessible at www.xjtlu.edu.cn/biologicalsciences/m6ats.
Keywords: N^(6)-methyladenosine; Context-specific analysis; Cancer mutation; Genome analysis; functional annotation
iHypoxia: An Integrative Database of Protein Expression Dynamics in Response to Hypoxia in Animals
6
Authors: Ze-Xian Liu, Panqin Wang, Qingfeng Zhang, Shihua Li, Yuxin Zhang, Yutong Guo, Chongchong Jia, Tian Shao, Lin Li, Han Cheng, Zhenlong Wang
Source: Genomics, Proteomics & Bioinformatics (SCIE, CAS, CSCD), 2023, Issue 2, pp. 267-277 (11 pages)
Mammals have evolved mechanisms to sense hypoxia and induce hypoxic responses. Recently, high-throughput techniques have greatly promoted global studies of protein expression changes during hypoxia and the identification of candidate genes associated with hypoxia-adaptive evolution, which have contributed to the understanding of the complex regulatory networks of hypoxia. In this study, we developed an integrated resource for the expression dynamics of proteins in response to hypoxia (iHypoxia). This database contains 2589 expression events of 1944 proteins identified by low-throughput experiments (LTEs) and 422,553 quantitative expression events of 33,559 proteins identified by high-throughput experiments from five mammals that exhibit a response to hypoxia. Various experimental details, such as the hypoxic experimental conditions, expression patterns, and sample types, were carefully collected and integrated. Furthermore, 8788 candidate genes from diverse species inhabiting low-oxygen environments were also integrated. In addition, we conducted an orthologous search and computationally identified 394,141 proteins that may respond to hypoxia among 48 animals. An enrichment analysis of human proteins identified from LTEs shows that these proteins are enriched in certain drug targets and cancer genes. Annotation of known posttranslational modification (PTM) sites in the proteins identified by LTEs reveals that these proteins undergo extensive PTMs, particularly phosphorylation, ubiquitination, and acetylation. iHypoxia provides a convenient and user-friendly method for users to obtain hypoxia-related information of interest.
Keywords: Hypoxia; Expression dynamics; Low-throughput experiment; High-throughput experiment; functional annotation
WheatCENet: A Database for Comparative Co-expression Network Analysis of Allohexaploid Wheat and Its Progenitors
7
Authors: Zhongqiu Li, Yiheng Hu, Xuelian Ma, Lingling Da, Jiajie She, Yue Liu, Xin Yi, Yaxin Cao, Wenying Xu, Yuannian Jiao, Zhen Su
Source: Genomics, Proteomics & Bioinformatics (SCIE, CAS, CSCD), 2023, Issue 2, pp. 324-336 (13 pages)
Genetic and epigenetic changes after polyploidization events could result in variable gene expression and modified regulatory networks. Here, using large-scale transcriptome data, we constructed co-expression networks for diploid, tetraploid, and hexaploid wheat species, and built a platform for comparing co-expression networks of allohexaploid wheat and its progenitors, named WheatCENet. WheatCENet is a platform for searching and comparing specific functional co-expression networks, as well as identifying the related functions of the genes clustered therein. Functional annotations like pathways, gene families, protein-protein interactions, microRNAs (miRNAs), and several lines of epigenome data are integrated into this platform, and Gene Ontology (GO) annotation, gene set enrichment analysis (GSEA), motif identification, and other useful tools are also included. Using WheatCENet, we found that the network of WHEAT ABERRANT PANICLE ORGANIZATION 1 (WAPO1) has more co-expressed genes related to spike development in hexaploid wheat than in its progenitors. We also found a novel motif of CCWWWWWWGG (CArG) specifically in the promoter region of WAPO-A1, suggesting that neofunctionalization of the WAPO-A1 gene affects spikelet development in hexaploid wheat. WheatCENet is useful for investigating co-expression networks and conducting other analyses, and thus facilitates comparative and functional genomic studies in wheat.
Keywords: Co-expression network; Species comparison; Diploid and polyploid wheat; functional annotation
The first draft genome assembly and data analysis of the Malaysian mahseer (Tor tambroides)
8
Authors: Melinda Mei Lin Lau, Leonard Whye Kit Lim, Hung Hui Chung, Han Ming Gan
Source: Aquaculture and Fisheries (CSCD), 2023, Issue 5, pp. 481-491 (11 pages)
The Malaysian mahseer (Tor tambroides), one of the most valuable freshwater fish in the world, is mainly targeted for human consumption. Mitogenomic data for this species are already available, but genomic information is still lacking. For the first time, we sequenced the whole genome of an adult fish on both Illumina and Nanopore platforms. The hybrid genome assembly yielded 1.23 Gb of genomic sequence in 44,726 contigs, with an N50 length of 44 kb and a BUSCO genome completeness of 87.6%. Four types of SSRs were detected and identified within the genome, with a greater abundance of AT than GC. Predicted protein sequences were functionally annotated against public databases, namely GO, KEGG and COG. A maximum likelihood phylogenomic tree containing 52 Actinopterygii species and one Sarcopterygii species as outgroup was constructed, providing first insights into the genome-based evolutionary relationship of T. tambroides with other ray-finned fish. These data are crucial in facilitating the study of population genomics, species identification, morphological variations, and evolutionary biology, which are helpful in the conservation of this species.
Keywords: Genome; Gene annotation; Tor tambroides; Phylogenetic; functional annotation
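The abstract above reports an N50 contig length. As a reminder of what that statistic measures, here is a minimal sketch using the usual definition: the length of the contig at which the running total, taken over contigs sorted from longest to shortest, first reaches half of the total assembly length. The function name `n50` and the toy lengths are illustrative only and are not taken from the study.

```python
def n50(contig_lengths):
    """Return the N50 of a collection of contig lengths."""
    if not contig_lengths:
        raise ValueError("at least one contig length is required")
    total = sum(contig_lengths)
    running = 0
    # Walk contigs from longest to shortest until half the assembly is covered.
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length


if __name__ == "__main__":
    # Toy contig lengths, not data from the study.
    print(n50([2, 3, 4, 5, 6, 7, 8, 9, 10]))
```

A larger N50 generally indicates a more contiguous assembly, although it says nothing about correctness, which is why completeness measures such as BUSCO are reported alongside it.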
Single-cell Long Non-coding RNA Landscape of T Cells in Human Cancer Immunity
9
Authors: Haitao Luo, Dechao Bu, Lijuan Shao, Yang Li, Liang Sun, Ce Wang, Jing Wang, Wei Yang, Xiaofei Yang, Jun Dong, Yi Zhao, Furong Li
Source: Genomics, Proteomics & Bioinformatics (SCIE, CAS, CSCD), 2021, Issue 3, pp. 377-393 (17 pages)
The development of new biomarkers or therapeutic targets for cancer immunotherapies requires deep understanding of T cells. To date, the complete landscape and systematic characterization of long noncoding RNAs (lncRNAs) in T cells in cancer immunity are lacking. Here, by systematically analyzing full-length single-cell RNA sequencing (scRNA-seq) data of more than 20,000 libraries of T cells across three cancer types, we provided the first comprehensive catalog and the functional repertoires of lncRNAs in human T cells. Specifically, we developed a custom pipeline for de novo transcriptome assembly and obtained a novel lncRNA catalog containing 9433 genes. This increased the size of the current human lncRNA catalog by 16% and nearly doubled the number of lncRNAs expressed in T cells. We found that a portion of expressed genes in single T cells were lncRNAs which had been overlooked by the majority of previous studies. Based on metacell maps constructed by the MetaCell algorithm, which partitions scRNA-seq datasets into disjoint and homogeneous groups of cells (metacells), 154 signature lncRNA genes were identified. They were associated with effector, exhausted, and regulatory T cell states. Moreover, 84 of them were functionally annotated based on the co-expression networks, indicating that lncRNAs might broadly participate in the regulation of T cell functions. Our findings provide a new point of view and resource for investigating the mechanisms of T cell regulation in cancer immunity as well as for novel cancer-immune biomarker development and cancer immunotherapies.
Keywords: Long non-coding RNA; Transcriptome assembly; Metacell; Immune regulation; functional annotation
A post-GWAS replication study confirming the association of C14H8orf33 gene with milk production traits in dairy cattle
10
Authors: Shaohua YANG, Chao QI, Yan XIE, Xiaogang CUI, Yahui GAO, Jianping JIANG, Li JIANG, Shengli ZHANG, Qin ZHANG, Dongxiao SUN
Source: Frontiers of Agricultural Science and Engineering, 2014, Issue 4, pp. 321-330 (10 pages)
Genome-wide association studies with an Illumina Bovine50K chip have detected 105 SNPs associated with one or multiple milk production traits in the Chinese Holstein population. Of these, 38 significant SNPs detected with high confidence by both L1-TDT and MMRA methods were selected to further mine potential key genes affecting milk yield and milk composition. By BLAST searches of the flanking sequences of these 38 SNPs against the bovine genome sequence, combined with comparative genomics analysis, 26 genes were found to contain or be near to such SNPs. Among them, the C14H8orf33 gene is merely 87 bp away from the significant SNP, Hapmap30383-BTC-005848. Hence, we report herein genotype-phenotype associations to further validate the genetic effects of the C14H8orf33 gene. By pooled DNA sequencing of 14 unrelated Holstein sires, a total of 18 SNPs, seven of them novel, were identified. Among them, nine SNPs were in the 5′ regulatory region, one in exon 6, and the others in the 3′ UTR and 3′ regulatory region. A total of nine of these identified SNPs were successfully genotyped and analyzed by mass spectrometry for association with five milk production traits in an independent resource population. The results showed that these SNPs were statistically significant for more than two traits [P < (0.0001–0.0267)]. In addition, mRNA expression analyses revealed that C14H8orf33 was ubiquitously expressed in eight different tissues, with a relatively higher expression level in the mammary gland than in other tissues. These findings, therefore, provide strong evidence for association of C14H8orf33 variants with milk yield and milk composition traits and may be applied in Chinese Holstein breeding programs.
Keywords: GWAS; functional annotation; Chinese Holstein; milk production traits; C14H8orf33 gene; single nucleotide polymorphisms; association study
TripletGO: Integrating Transcript Expression Profiles with Protein Homology Inferences for Gene Function Prediction
11
Authors: Yi-Heng Zhu, Chengxin Zhang, Yan Liu, Gilbert S. Omenn, Peter L. Freddolino, Dong-Jun Yu, Yang Zhang
Source: Genomics, Proteomics & Bioinformatics (SCIE, CAS, CSCD), 2022, Issue 5, pp. 1013-1027 (15 pages)
Gene Ontology (GO) has been widely used to annotate functions of genes and gene products. Here, we proposed a new method, TripletGO, to deduce GO terms of protein-coding and noncoding genes, through the integration of four complementary pipelines built on transcript expression profile, genetic sequence alignment, protein sequence alignment, and naïve probability. TripletGO was tested on a large set of 5754 genes from 8 species (human, mouse, Arabidopsis, rat, fly, budding yeast, fission yeast, and nematode) and 2433 proteins with available expression data from the third Critical Assessment of Protein Function Annotation challenge (CAFA3). Experimental results show that TripletGO achieves function annotation accuracy significantly beyond the current state-of-the-art approaches. Detailed analyses show that the major advantage of TripletGO lies in the coupling of a new triplet network-based profiling method with the feature space mapping technique, which can accurately recognize function patterns from transcript expression profiles. Meanwhile, the combination of multiple complementary models, especially those from transcript expression and protein-level alignments, improves the coverage and accuracy of the final GO annotation results. The standalone package and an online server of TripletGO are freely available at https://zhanggroup.org/TripletGO/.
Keywords: Gene function annotation; Gene Ontology; Transcript expression profile; Triplet network; Protein-level alignment
In silico protein function prediction: the rise of machine learning-based approaches
12
Authors: Jiaxiao Chen, Zhonghui Gu, Luhua Lai, Jianfeng Pei
Source: Medical Review, 2023, Issue 6, pp. 487-510 (24 pages)
Proteins function as integral actors in essential life processes, making protein research a fundamental domain with the potential to propel advances in pharmaceuticals and disease investigation. Within protein research, there is a pressing need to uncover protein functions and untangle their intricate underlying mechanisms. Because experimental investigations are costly and limited in throughput, computational models offer a promising alternative to accelerate protein function annotation. In recent years, protein pre-training models have shown noteworthy progress across multiple prediction tasks. This progress highlights a notable prospect for effectively tackling the intricate downstream task of protein function prediction. In this review, we elucidate the historical evolution and research paradigms of computational methods for predicting protein function. Subsequently, we summarize the progress in protein and molecule representation as well as feature extraction techniques. Furthermore, we assess the performance of machine learning-based algorithms across various objectives in protein function prediction, thereby offering a comprehensive perspective on the progress within this field.
Keywords: protein function prediction; pre-training models; protein interaction prediction; protein function annotation; biological knowledge graph