Funding: the Special Funds for Major National Basic Research Projects, the National Natural Science Foundation of China, and Research Project 248 of Beijing
Abstract: Weight matrix models for signal sequence motifs are simple. A main limitation of these models is the assumption of independence between positions. Signal enhancement is achieved by taking the total likelihood as the objective function to be maximized, which clusters the sequences into groups with different patterns. As an example, the initial and terminal signals for translation in the rice genome are examined.
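The idea of clustering by maximizing total likelihood can be illustrated with a minimal sketch. It is not the paper's procedure, which the abstract does not specify. It assumes DNA sequences of equal length, a fixed number of groups, a pseudocount of 1, and a simple alternating scheme (refit one weight matrix per group, then reassign each sequence to the group under which it is most likely). The names `weight_matrix`, `log_likelihood`, `fit`, and `cluster` are hypothetical.

```python
import math

ALPHABET = "ACGT"

def weight_matrix(seqs, pseudocount=1.0):
    """Estimate base probabilities at each position, independently of other positions."""
    matrix = []
    for pos in range(len(seqs[0])):
        counts = {base: pseudocount for base in ALPHABET}  # pseudocount is an assumption
        for seq in seqs:
            counts[seq[pos]] += 1
        total = sum(counts.values())
        matrix.append({base: counts[base] / total for base in ALPHABET})
    return matrix

def log_likelihood(seq, matrix):
    """Log-probability of a sequence; positions contribute independently."""
    return sum(math.log(matrix[pos][base]) for pos, base in enumerate(seq))

def fit(seqs, labels, k):
    """One weight matrix per group; an empty group falls back to all sequences."""
    return [
        weight_matrix([s for s, lab in zip(seqs, labels) if lab == g] or seqs)
        for g in range(k)
    ]

def cluster(seqs, labels, max_iter=50):
    """Alternate refitting and reassignment until the labels stop changing."""
    k = max(labels) + 1
    for _ in range(max_iter):
        matrices = fit(seqs, labels, k)
        new_labels = [
            max(range(k), key=lambda g: log_likelihood(s, matrices[g])) for s in seqs
        ]
        if new_labels == labels:
            break
        labels = new_labels
    matrices = fit(seqs, labels, k)
    total = sum(log_likelihood(s, matrices[lab]) for s, lab in zip(seqs, labels))
    return labels, total

if __name__ == "__main__":
    toy = ["AAAA", "AAAT", "AAAA", "CCCC", "CCCG", "CCCC"]  # made-up toy data
    print(cluster(toy, [0, 1, 0, 1, 0, 1]))
```

On the toy input, the two groups separate the A-rich from the C-rich sequences, and the total log-likelihood under two matrices exceeds that under a single matrix fitted to all sequences, which is the sense in which grouping enhances the signal.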
Abstract: Robust clustering methods aim to avoid the unsatisfactory results caused by outlying observations in the input data of many practical applications, such as biological sequence analysis or gene expression analysis. This paper presents AVLINK, a fuzzy clustering algorithm based on average link and the possibilistic clustering paradigm. It minimizes the average dissimilarity between pairs of patterns within the same cluster while maximizing the size of the cluster, by computing the zeros of the derivative of the proposed objective function. AVLINK, together with the proposed initialization procedure, shows a high outlier rejection capability because it assigns outliers a very low membership. Furthermore, it does not require the number of clusters to be known in advance, and it can discover clusters of non-convex shape. The effectiveness and robustness of the proposed algorithms have been demonstrated on different types of protein data sets.
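To make the average-link quantity concrete, the sketch below computes a membership-weighted average of pairwise dissimilarities within one cluster. It is not the AVLINK objective function, which the abstract does not state. Weighting each pair by the product of the two memberships is an assumption made for illustration, and the function name `weighted_average_dissimilarity` is hypothetical.

```python
def weighted_average_dissimilarity(dissim, membership):
    """Average dissimilarity over distinct pairs in one cluster.

    Each pair (i, j) is weighted by membership[i] * membership[j]
    (an illustrative assumption), so low-membership patterns barely count.
    """
    n = len(membership)
    weighted_sum = 0.0
    weight_total = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            weight = membership[i] * membership[j]
            weighted_sum += weight * dissim[i][j]
            weight_total += weight
    return weighted_sum / weight_total if weight_total > 0 else 0.0

if __name__ == "__main__":
    points = [0.0, 0.1, 0.2, 5.0]  # made-up 1-D data; the last point is an outlier
    dissim = [[abs(a - b) for b in points] for a in points]
    print(weighted_average_dissimilarity(dissim, [1.0, 1.0, 1.0, 1.0]))
    print(weighted_average_dissimilarity(dissim, [1.0, 1.0, 1.0, 0.01]))
```

In the toy data, lowering the outlier's membership from 1 to 0.01 pulls the average dissimilarity down toward the value for the three close points alone, which is the kind of effect a low outlier membership is meant to have.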