Abstract
Objective To identify differentially expressed genes (DEGs) between hepatocellular carcinoma (HCC) and normal tissues using bioinformatics, and to explore prognostic biomarkers for HCC.
Methods Three microarray datasets (GSE101685, GSE84402, and GSE62232) were selected and downloaded from the Gene Expression Omnibus (GEO) database. The datasets were analyzed with the GEO2R online platform to obtain the DEGs between cancerous and non-cancerous tissues. Gene Ontology (GO) functional enrichment analysis and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analysis were performed with the Database for Annotation, Visualization, and Integrated Discovery (DAVID). A protein-protein interaction (PPI) network was constructed with STRING and imported into Cytoscape, where the CytoHubba plug-in was used to select the top 10 core genes. Survival analysis was performed for each core gene with Kaplan-Meier Plotter, and survival curves were drawn. Expression levels were then analyzed with the GEPIA database, and the signaling pathways involving the core genes were examined further.
Results A total of 459, 471, and 292 DEGs were identified in GSE101685, GSE84402, and GSE62232, respectively. A Venn diagram showed that 169 DEGs were shared by all three datasets, of which 43 were up-regulated and 126 were down-regulated. The analyses above yielded 10 core genes: DLGAP5, BIRC5, CCNB1, CCNA2, TTK, NDC80, NCAPG, MAD2L1, BUB1B, and RRM2. Prognostic analysis with Kaplan-Meier Plotter showed that overexpression of each of the 10 core genes was associated with lower overall survival. Expression analysis of the 10 core genes with GEPIA showed that the expression differences of 9 core genes (DLGAP5, BIRC5, CCNB1, CCNA2, NDC80, NCAPG, MAD2L1, BUB1B, and RRM2) were statistically significant (all P<0.05). The core genes BUB1B, CCNA2, CCNB1, MAD2L1, and RRM2 were mainly enriched in the p53 signaling pathway and the cell cycle.
Conclusions Overexpression of BUB1B, CCNA2, CCNB1, MAD2L1, and RRM2 is associated with poor survival in HCC. These genes may serve as prognostic biomarkers for HCC and may provide a theoretical basis for the treatment of HCC patients. DLGAP5, BIRC5, NDC80, and NCAPG merit further investigation for prognostic evaluation in HCC.
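The Venn-diagram step in the abstract amounts to a set intersection over the per-dataset DEG lists, followed by a split into up- and down-regulated genes. The sketch below is illustrative only and is not the authors' code. The function name `shared_degs`, the toy log2 fold-change values, and the rule that a shared gene must change in the same direction in every dataset are all assumptions made for this example.

```python
from typing import Dict, Set, Tuple

# Hypothetical toy input: dataset -> {gene symbol: log2 fold change, tumor vs. non-tumor}.
# The values are invented for illustration and are NOT taken from the study.
deg_tables: Dict[str, Dict[str, float]] = {
    "GSE101685": {"CCNB1": 2.1, "RRM2": 2.4, "CYP1A2": -3.0, "TTK": 1.8},
    "GSE84402": {"CCNB1": 1.9, "RRM2": 2.0, "CYP1A2": -2.5, "MT1M": -2.2},
    "GSE62232": {"CCNB1": 1.7, "RRM2": 2.2, "CYP1A2": -2.8, "TTK": 1.5},
}


def shared_degs(tables: Dict[str, Dict[str, float]]) -> Tuple[Set[str], Set[str]]:
    """Return (up, down): genes listed in every table with a consistent sign."""
    # Genes that appear in the DEG list of every dataset (the Venn overlap).
    common = set.intersection(*(set(t) for t in tables.values()))
    # Assumed rule: keep a gene only if its direction agrees across datasets.
    up = {g for g in common if all(t[g] > 0 for t in tables.values())}
    down = {g for g in common if all(t[g] < 0 for t in tables.values())}
    return up, down


up, down = shared_degs(deg_tables)
print(sorted(up), sorted(down))
```

In this toy input, TTK and MT1M are excluded because they are not listed in all three datasets. In practice the per-dataset lists would come from an export of the differential expression results rather than a hard-coded dictionary.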
Authors
武硕
贾友鹏
WU Shuo; JIA Youpeng (Department of Hepatobiliary Surgery, Dalian Central Hospital, Dalian 116000, China; Graduate School of Dalian Medical University, Dalian 116000, China)
Source
《社区医学杂志》
CAS
2023, Issue 7, pp. 340-350 (11 pages)
Journal of Community Medicine
Keywords
hepatocellular carcinoma
bioinformatics analysis
differentially expressed genes
protein-protein interaction network