Bicluster algorithm on discrete time-series gene expression data (基于离散时序基因表达数据的双聚类算法)

Cited by: 1

Abstract: Most biclustering algorithms currently applied to gene expression data operate on real-valued data, which makes them susceptible to noise, and they rarely take the temporal order of the samples into account. This paper proposes DTCB, an efficient algorithm that mines maximal co-expression biclusters over contiguous time points from discretized time-series gene expression data. The algorithm uses a new data discretization method and defines three co-expression relations between genes in the discrete dataset. To improve mining efficiency, DTCB applies pruning and output strategies that let it mine all maximal co-expression biclusters in one run without generating candidate sets. Experimental analysis shows that DTCB is efficient and robust, and that its results have good statistical and biological significance.
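
To make the idea of a time-continuous bicluster on discretized data concrete, the following is a minimal sketch and not the DTCB algorithm itself. The abstract does not specify DTCB's discretization method, its three co-expression relations, or its pruning strategy, so everything here is an assumption: the threshold-based up/down/no-change discretization, the exact-match co-expression rule, the brute-force window enumeration, and the names `discretize` and `time_continuous_biclusters` are all hypothetical.

```python
from collections import defaultdict


def discretize(series, threshold=0.5):
    """Map each consecutive difference to 1 (up), -1 (down) or 0 (no change).

    Assumed discretization for illustration only; not the method from the paper.
    """
    symbols = []
    for prev, curr in zip(series, series[1:]):
        delta = curr - prev
        if delta > threshold:
            symbols.append(1)
        elif delta < -threshold:
            symbols.append(-1)
        else:
            symbols.append(0)
    return symbols


def time_continuous_biclusters(expr, threshold=0.5, min_genes=2, min_len=2):
    """Brute-force search for maximal gene groups that share an identical
    symbol string on a contiguous window of transitions.

    Returns (genes, start, end) tuples; the window [start, end) indexes
    transitions, so it spans time points start..end of the original series.
    """
    disc = {gene: discretize(values, threshold) for gene, values in expr.items()}
    n = min(len(symbols) for symbols in disc.values())
    found = []
    for start in range(n):
        for end in range(start + min_len, n + 1):
            groups = defaultdict(list)
            for gene, symbols in disc.items():
                groups[tuple(symbols[start:end])].append(gene)
            for genes in groups.values():
                if len(genes) >= min_genes:
                    found.append((frozenset(genes), start, end))
    # Keep a bicluster only if no other one has a superset of its genes
    # on a window that covers its window.
    return [
        (g, s, e)
        for (g, s, e) in found
        if not any(
            (g2, s2, e2) != (g, s, e) and g <= g2 and s2 <= s and e <= e2
            for (g2, s2, e2) in found
        )
    ]


if __name__ == "__main__":
    # Toy data: g1 and g2 rise, fall and plateau together; g3 does the opposite.
    toy = {
        "g1": [0, 1, 2, 1, 1],
        "g2": [0, 2, 4, 3, 3],
        "g3": [5, 4, 3, 4, 4],
    }
    for genes, start, end in time_continuous_biclusters(toy):
        print(sorted(genes), start, end)  # prints: ['g1', 'g2'] 0 4
```

The sketch enumerates every contiguous window explicitly, which is exactly the candidate-generation cost that the abstract says DTCB avoids; it is meant only to show what a maximal bicluster over contiguous time points looks like.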

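Authors: XU Tao, SHANG Xue-qun, YANG Mi-jing, WANG Miao

Affiliation: Department of Computer Software and Theory, School of Computer Science, Northwestern Polytechnical University

Source: Application Research of Computers (《计算机应用研究》), indexed in CSCD and the Peking University Core Journals list, 2013, No. 12, pp. 3551-3556, 3567 (7 pages)

Funding: National "973" Program of China (2012CB316203); National Natural Science Foundation of China (61272121)

Keywords: time-series gene expression data; bicluster; co-expression; time-continuous; discretization

Classification: TP311 [Automation and Computer Technology: Computer Software and Theory]; TP301.6 [Automation and Computer Technology: Computer System Architecture]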



