
1.

SCPS: a fast implementation of a spectral method for detecting protein families on a genome-wide scale

Tamás Nepusz Rajkumar Sasidharan Alberto Paccanaro 《BMC bioinformatics》2010,11(1):120


2.

Biclustering via optimal re-ordering of data matrices in systems biology: rigorous methods and comparative studies

Peter A DiMaggio Jr Scott R McAllister Christodoulos A Floudas Xiao-Jiang Feng Joshua D Rabinowitz Herschel A Rabitz 《BMC bioinformatics》2008,9(1):458

Background

The analysis of large-scale data sets via clustering techniques is utilized in a number of applications. Biclustering in particular has emerged as an important problem in the analysis of gene expression data since genes may only jointly respond over a subset of conditions. Biclustering algorithms also have important applications in sample classification where, for instance, tissue samples can be classified as cancerous or normal. Many of the methods for biclustering, and clustering algorithms in general, utilize simplified models or heuristic strategies for identifying the "best" grouping of elements according to some metric and cluster definition and thus result in suboptimal clusters.

3.

New resampling method for evaluating stability of clusters

Irina M Gana Dresen Tanja Boes Johannes Huesing Markus Neuhaeuser Karl-Heinz Joeckel 《BMC bioinformatics》2008,9(1):42

Background

Hierarchical clustering is a widely applied tool in the analysis of microarray gene expression data. The assessment of cluster stability is a major challenge in clustering procedures. Statistical methods are required to distinguish between real and random clusters. Several methods for assessing cluster stability have been published, including resampling methods such as the bootstrap.

4.

FLAME, a novel fuzzy clustering method for the analysis of DNA microarray data (total citations: 3; self-citations: 0; citations by others: 3)

Limin Fu Enzo Medico 《BMC bioinformatics》2007,8(1):3

Background

Data clustering analysis has been extensively applied to extract information from gene expression profiles obtained with DNA microarrays. To this aim, existing clustering approaches, mainly developed in computer science, have been adapted to microarray data analysis. However, previous studies revealed that microarray datasets have very diverse structures, some of which may not be correctly captured by current clustering methods. We therefore approached the problem from a new starting point, and developed a clustering algorithm designed to capture dataset-specific structures at the beginning of the process.

5.

Reconstructing phylogenies from noisy quartets in polynomial time with a high success probability

Gang Wu Ming-Yang Kao Guohui Lin Jia-Huai You 《Algorithms for molecular biology : AMB》2008,3(1):1

Background

In recent years, quartet-based phylogeny reconstruction methods have received considerable attention in the computational biology community. Traditionally, the accuracy of a phylogeny reconstruction method is measured by simulations on synthetic datasets with known "true" phylogenies, while little theoretical analysis has been done. In this paper, we present a new model-based approach to measuring the accuracy of a quartet-based phylogeny reconstruction method. Under this model, we propose three efficient algorithms to reconstruct the "true" phylogeny with a high success probability.

6.

Iterative class discovery and feature selection using Minimal Spanning Trees

Sudhir Varma Richard Simon 《BMC bioinformatics》2004,5(1):126

Background

Clustering is one of the most commonly used methods for discovering hidden structure in microarray gene expression data. Most current methods for clustering samples are based on distance metrics utilizing all genes. This has the effect of obscuring clustering in samples that may be evident only when looking at a subset of genes, because noise from irrelevant genes dominates the signal from the relevant genes in the distance calculation.

7.

HAMSTER: visualizing microarray experiments as a set of minimum spanning trees

Raymond Wan Larisa Kiseleva Hajime Harada Hiroshi Mamitsuka Paul Horton 《Source code for biology and medicine》2009,4(1):1-18

Background

Visualization tools allow researchers to obtain a global view of the interrelationships between the probes or experiments of a gene expression (e.g. microarray) data set. Some existing methods include hierarchical clustering and k-means. In recent years, others have proposed applying minimum spanning trees (MST) for microarray clustering. Although MST-based clustering is formally equivalent to the dendrograms produced by hierarchical clustering under certain conditions, visually they can be quite different.

8.

Neural network analysis of lymphoma microarray data: prognosis and diagnosis near-perfect

Michael C O'Neill Li Song 《BMC bioinformatics》2003,4(1):13

Background

Microarray chips are being rapidly deployed as a major tool in genomic research. To date most of the analysis of the enormous amount of information provided on these chips has relied on clustering techniques and other standard statistical procedures. These methods, particularly with regard to cancer patient prognosis, have generally been inadequate in providing the reduced gene subsets required for perfect classification.

9.

Inferring biological functions and associated transcriptional regulators using gene set expression coherence analysis

Tae-Min Kim Yeun-Jun Chung Mun-Gan Rhyu Myeong Ho Jung 《BMC bioinformatics》2007,8(1):453

Background

Gene clustering has been widely used to group genes with similar expression patterns in microarray data analysis. Subsequent enrichment analysis using predefined gene sets can provide clues on which functional themes or regulatory sequence motifs are associated with individual gene clusters. In spite of their potential utility, gene clustering and enrichment analysis have been used on separate platforms; thus, the development of an integrative algorithm linking both methods is highly challenging.

10.

MULTI-K: accurate classification of microarray subtypes using ensemble k-means clustering

Eun-Youn Kim Seon-Young Kim Daniel Ashlock Dougu Nam 《BMC bioinformatics》2009,10(1):260

Background

Uncovering subtypes of disease from microarray samples has important clinical implications such as survival time and sensitivity of individual patients to specific therapies. Unsupervised clustering methods have been used to classify this type of data. However, most existing methods focus on clusters with compact shapes and do not reflect the geometric complexity of the high dimensional microarray clusters, which limits their performance.

11.

R/BHC: fast Bayesian hierarchical clustering for microarray data

Richard S Savage Katherine Heller Yang Xu Zoubin Ghahramani William M Truman Murray Grant Katherine J Denby David L Wild 《BMC bioinformatics》2009,10(1):242

Background

Although the use of clustering methods has rapidly become one of the standard computational approaches in the literature of microarray gene expression data analysis, little attention has been paid to uncertainty in the results obtained.

12.

Reuse of imputed data in microarray analysis increases imputation efficiency

Ki-Yeol Kim Byoung-Jin Kim Gwan-Su Yi 《BMC bioinformatics》2004,5(1):160

Background

The imputation of missing values is necessary for the efficient use of DNA microarray data, because many clustering algorithms and some statistical analyses require a complete data set. A few imputation methods for DNA microarray data have been introduced, but the efficiency of the methods was low and the validity of imputed values in these methods had not been fully checked.

13.

Data reduction for spectral clustering to analyze high throughput flow cytometry data

Habil Zare Parisa Shooshtari Arvind Gupta Ryan R Brinkman 《BMC bioinformatics》2010,11(1):403

Background

Recent biological discoveries have shown that clustering large datasets is essential for better understanding biology in many areas. Spectral clustering in particular has proven to be a powerful tool amenable to many applications. However, it cannot be directly applied to large datasets due to time and memory limitations. To address this issue, we have modified spectral clustering by adding an information-preserving sampling procedure and applying a post-processing stage. We call this entire algorithm SamSPECTRAL.

14.

Incremental genetic K-means algorithm and its application in gene expression data analysis (total citations: 1; self-citations: 0; citations by others: 1)

Yi Lu Shiyong Lu Farshad Fotouhi Youping Deng Susan J Brown 《BMC bioinformatics》2004,5(1):172

Background

In recent years, clustering algorithms have been effectively applied in molecular biology for gene expression data analysis. With the help of clustering algorithms such as K-means, hierarchical clustering, SOM, etc., genes are partitioned into groups based on the similarity between their expression profiles. In this way, functionally related genes are identified. As the amount of laboratory data in molecular biology grows exponentially each year due to advanced technologies such as microarrays, new efficient and effective methods for clustering must be developed to process this growing amount of biological data.

15.

A Hybrid Distance Measure for Clustering Expressed Sequence Tags Originating from the Same Gene Family

Keng-Hoong Ng Chin-Kuan Ho Somnuk Phon-Amnuaisuk 《PloS one》2012,7(10)


16.

Coenological study of snails (Mollusca: Gastropoda) in forest phytocoenoses of Medvednica mountain (NW Croatia, Yugoslavia)

V. Štamol 《Plant Ecology》1991,95(1):33-54

Coenological research on land snails recorded 42 taxa collected at 68 sites in 11 forest phytocoenoses of the Medvednica mountain area (NW Croatia, Yugoslavia). Relative abundance, constancy and density were established for every taxon of the snail community in each phytocoenosis. The total number of snail taxa, the average number of snail taxa per site, the Shannon-Wiener index (the qualitative characteristics) and the community density (the quantitative characteristic) were established for every snail community. The snail communities from thermophilic basophilous phytocoenoses, especially the community in the association Querco-Ostryetum, proved to be the richest in both quality and quantity. Markedly acidophilic phytocoenoses had the poorest snail communities: the snail community in the association Querco-Castanetum proved the poorest in quality, and the snail community in the association Luzulo-Fagetum was the poorest in quantity. The Shannon-Wiener index reached its highest values in the snail communities belonging to climatogenous phytocoenoses. The relationships among snail communities were assessed with objective methods by calculating the similarity of snail communities according to Sørensen (1943) and Kulczyński (1927), and by clustering according to the weight variable-group method (Sokal & Sneath 1963: 310, 311). The following common features were established among the results obtained by applying these methods:

- the grouping of snail communities belonging to thermophilic phytocoenoses on the calcareous soil

- the grouping of snail communities from the predominantly acidophilic woodlands

- the grouping of snail communities belonging to the acid soils of the highest region of Medvednica.
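
Two of the measures named in this abstract, the Shannon-Wiener index and the Sørensen similarity, can be sketched in a few lines. This is only a minimal illustration using the textbook definitions; the abstract does not state the logarithm base or the exact data layout, so the natural logarithm, the function names, and the example inputs are all assumptions rather than the author's procedure.

```python
import math


def shannon_wiener(counts):
    """Shannon-Wiener index H' = -sum(p_i * ln p_i) over taxa with non-zero counts.

    `counts` holds the number of individuals per taxon (hypothetical input).
    The natural logarithm is an assumption; the abstract does not name a base.
    """
    total = sum(counts)
    if total <= 0:
        raise ValueError("at least one individual is required")
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def sorensen_similarity(taxa_a, taxa_b):
    """Sørensen similarity 2c / (a + b) for two presence/absence taxon lists."""
    a, b = set(taxa_a), set(taxa_b)
    if not a and not b:
        raise ValueError("both communities are empty")
    return 2 * len(a & b) / (len(a) + len(b))


# Hypothetical communities, for illustration only.
h = shannon_wiener([5, 5])                                   # two equally common taxa
s = sorensen_similarity({"taxon1", "taxon2"}, {"taxon2", "taxon3"})
```

With two equally abundant taxa the index equals ln 2, and two communities that share one of their two taxa have a Sørensen similarity of 0.5.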

17.

Markov clustering versus affinity propagation for the partitioning of protein interaction graphs

James Vlasblom Shoshana J Wodak 《BMC bioinformatics》2009,10(1):99

Background

Genome-scale data on protein interactions are generally represented as large networks, or graphs, where hundreds or thousands of proteins are linked to one another. Since proteins tend to function in groups, or complexes, an important goal has been to reliably identify protein complexes from these graphs. This task is commonly executed using clustering procedures, which aim at detecting densely connected regions within the interaction graphs. There exists a wealth of clustering algorithms, some of which have been applied to this problem. One of the most successful clustering procedures in this context has been the Markov Cluster algorithm (MCL), which was recently shown to outperform a number of other procedures, some of which were specifically designed for partitioning protein interaction graphs. A novel promising clustering procedure termed Affinity Propagation (AP) was recently shown to be particularly effective, and much faster than other methods for a variety of problems, but has not yet been applied to partition protein interaction graphs.

18.

Many accurate small-discriminatory feature subsets exist in microarray transcript data: biomarker discovery (total citations: 1; self-citations: 0; citations by others: 1)

Leslie R Grate 《BMC bioinformatics》2005,6(1):97


19.

Reproducible Clusters from Microarray Research: Whither?

Garge NR Page GP Sprague AP Gorman BS Allison DB 《BMC bioinformatics》2005,6(Z2):S10

Motivation

In cluster analysis, the validity of specific solutions, algorithms, and procedures presents significant challenges because there is no null hypothesis to test and no 'right answer'. It has been noted that a replicable classification is not necessarily a useful one, but a useful one that characterizes some aspect of the population must be replicable. By replicable we mean reproducible across multiple samplings from the same population. Methodologists have suggested that the validity of clustering methods should be based on classifications that yield reproducible findings beyond chance levels. We used this approach to determine the performance of commonly used clustering algorithms and the degree of replicability achieved using several microarray datasets.

Methods

We considered four commonly used iterative partitioning algorithms (Self Organizing Maps (SOM), K-means, Clustering LARge Applications (CLARA), and Fuzzy C-means) and evaluated their performances on 37 microarray datasets, with sample sizes ranging from 12 to 172. We assessed reproducibility of the clustering algorithm by measuring the strength of relationship between clustering outputs of subsamples of 37 datasets. Cluster stability was quantified using Cramér's V² from a k × k table. Cramér's V² is equivalent to the squared canonical correlation coefficient between two sets of nominal variables. Potential scores range from 0 to 1, with 1 denoting perfect reproducibility.
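
As a rough illustration of the stability score described above, the following sketch computes Cramér's V² from the cross-tabulation of two cluster assignments of the same samples, using the standard definition V² = χ² / (n · (min(rows, columns) − 1)). The function name and the toy label lists are assumptions made for illustration; the abstract does not describe how the authors drew subsamples or built their tables.

```python
from collections import Counter


def cramers_v_squared(labels_a, labels_b):
    """Cramér's V² between two cluster assignments of the same items.

    Builds the contingency table of the two label lists, computes the
    Pearson chi-square statistic, and scales it by n * (min(r, c) - 1).
    """
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label lists must be non-empty and equally long")
    n = len(labels_a)
    joint = Counter(zip(labels_a, labels_b))  # observed cell counts
    row_totals = Counter(labels_a)
    col_totals = Counter(labels_b)

    chi2 = 0.0
    for a, row_n in row_totals.items():
        for b, col_n in col_totals.items():
            expected = row_n * col_n / n
            observed = joint.get((a, b), 0)
            chi2 += (observed - expected) ** 2 / expected

    scale = min(len(row_totals), len(col_totals)) - 1
    if scale == 0:
        raise ValueError("each assignment needs at least two clusters")
    return chi2 / (n * scale)


# Toy assignments (hypothetical): the same partition under swapped labels,
# and a second partition that is independent of the first.
same_partition = cramers_v_squared([0, 0, 1, 1], [1, 1, 0, 0])
independent = cramers_v_squared([0, 0, 1, 1], [0, 1, 0, 1])
```

The score depends only on how the items are grouped, not on the label names, so a relabelled copy of the same partition scores 1 and two independent partitions score 0.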

Results

All four clustering routines show increased stability with larger sample sizes. K-means and SOM showed a gradual increase in stability with increasing sample size. CLARA and Fuzzy C-means, however, yielded low stability scores until sample sizes approached 30 and then gradually increased thereafter. Average stability never exceeded 0.55 for the four clustering routines, even at a sample size of 50. These findings suggest several plausible scenarios: (1) microarray datasets lack natural clustering structure, thereby producing low stability scores on all four methods; (2) the algorithms studied do not produce reliable results; and/or (3) sample sizes typically used in microarray research may be too small to support derivation of reliable clustering results. Further research should be directed towards evaluating the stability of more clustering algorithms on more datasets, especially those with larger sample sizes and with larger numbers of clusters considered.


20.

Simultaneous clustering of gene expression data with clinical chemistry and pathological evaluations reveals phenotypic prototypes

Pierre R Bushel Russell D Wolfinger Greg Gibson 《BMC systems biology》2007,1(1):15-20

Background

Commonly employed clustering methods for analysis of gene expression data do not directly incorporate phenotypic data about the samples. Furthermore, clustering of samples with known phenotypes is typically performed in an informal fashion. The inability of clustering algorithms to incorporate biological data in the grouping process can limit proper interpretation of the data and its underlying biology.
