1.

A benchmark for statistical microarray data analysis that preserves actual biological and technical variance

Benoît De Hertogh Bertrand De Meulder Fabrice Berger Michael Pierre Eric Bareke Anthoula Gaigneaux Eric Depiereux 《BMC bioinformatics》2010,11(1):17

Background

Recent reanalysis of spike-in datasets underscored the need for new and more accurate benchmark datasets for statistical microarray analysis. We present here a fresh method using biologically-relevant data to evaluate the performance of statistical methods.

2.

Discover protein sequence signatures from protein-protein interaction data

Jianwen Fang Ryan J Haasl Yinghua Dong Gerald H Lushington 《BMC bioinformatics》2005,6(1):277

Background

The development of high-throughput technologies such as yeast two-hybrid systems and mass spectrometry technologies has made it possible to generate large protein-protein interaction (PPI) datasets. Mining these datasets for underlying biological knowledge has, however, remained a challenge.

3.

Missing value imputation for microarray gene expression data using histone acetylation information

Qian Xiang Xianhua Dai Yangyang Deng Caisheng He Jiang Wang Jihua Feng Zhiming Dai 《BMC bioinformatics》2008,9(1):252

Background

Accurately estimating missing values in microarray data is an important pre-processing step, because complete datasets are required in numerous expression profile analyses in bioinformatics. Although several methods have been suggested, their performance is not satisfactory for datasets with high missing percentages.

4.

Probabilistic prediction and ranking of human protein-protein interactions

Michelle S Scott Geoffrey J Barton 《BMC bioinformatics》2007,8(1):239

Background

Although the prediction of protein-protein interactions has been extensively investigated for yeast, few such datasets exist for the far larger proteome in human. Furthermore, it has recently been estimated that the overall average false positive rate of available computational and high-throughput experimental interaction datasets is as high as 90%.

5.

AutoSOME: a clustering method for identifying gene expression modules without prior knowledge of cluster number

Aaron M Newman James B Cooper 《BMC bioinformatics》2010,11(1):117

Background

Clustering the information content of large high-dimensional gene expression datasets has widespread application in "omics" biology. Unfortunately, the underlying structure of these natural datasets is often fuzzy, and the computational identification of data clusters generally requires knowledge about cluster number and geometry.

6.

Preferred analysis methods for Affymetrix GeneChips revealed by a wholly defined control dataset

Sung E Choe Michael Boutros Alan M Michelson George M Church Marc S Halfon 《Genome biology》2004,6(2):R16

Background

As more methods are developed to analyze RNA-profiling data, assessing their performance using control datasets becomes increasingly important.

7.

OligoSpawn: a software tool for the design of overgo probes from large unigene datasets

Jie Zheng Jan T Svensson Kavitha Madishetty Timothy J Close Tao Jiang Stefano Lonardi 《BMC bioinformatics》2006,7(1):7

Background

Expressed sequence tag (EST) datasets represent perhaps the largest collection of genetic information. ESTs can be exploited in a variety of biological experiments and analyses. Here we are interested in the design of overlapping oligonucleotide (overgo) probes from large unigene (EST-contigs) datasets.

8.

Integrative missing value estimation for microarray data

Jianjun Hu Haifeng Li Michael S Waterman Xianghong Jasmine Zhou 《BMC bioinformatics》2006,7(1):449-14

Background

Missing value estimation is an important preprocessing step in microarray analysis. Although several methods have been developed to solve this problem, their performance is unsatisfactory for datasets with high rates of missing data, high measurement noise, or limited numbers of samples. In fact, more than 80% of the time-series datasets in the Stanford Microarray Database contain fewer than eight samples.

9.

Gene identification and protein classification in microbial metagenomic sequence data via incremental clustering

Shibu Yooseph Weizhong Li Granger Sutton 《BMC bioinformatics》2008,9(1):182

Background

The identification and study of proteins from metagenomic datasets can shed light on the roles and interactions of the source organisms in their communities. However, metagenomic datasets are characterized by the presence of organisms with varying GC composition, codon usage biases, etc., and consequently gene identification is challenging. The vast amount of sequence data also requires faster protein family classification tools.

10.

Clustering gene expression data with a penalized graph-based metric

Ariel E Bayá Pablo M Granitto 《BMC bioinformatics》2011,12(1):2

Background

The search for cluster structure in microarray datasets is a basic problem for the so-called "-omic sciences". A difficult problem in clustering is how to handle data with a manifold structure, i.e. data that is not shaped in the form of compact clouds of points but instead forms arbitrary shapes or paths embedded in a high-dimensional space, as could be the case for some gene expression datasets.

11.

EDISA: extracting biclusters from multiple time-series of gene expression profiles

Jochen Supper Martin Strauch Dierk Wanke Klaus Harter Andreas Zell 《BMC bioinformatics》2007,8(1):334

Background

Cells dynamically adapt their gene expression patterns in response to various stimuli. This response is orchestrated into a number of gene expression modules consisting of co-regulated genes. A growing pool of publicly available microarray datasets allows the identification of modules by monitoring expression changes over time. These time-series datasets can be searched for gene expression modules by one of the many clustering methods published to date. For an integrative analysis, several time-series datasets can be joined into a three-dimensional gene-condition-time dataset, to which standard clustering or biclustering methods are, however, not applicable. We thus devise a probabilistic clustering algorithm for gene-condition-time datasets.

12.

Combining transcriptional datasets using the generalized singular value decomposition

Andreas W Schreiber Neil J Shirley Rachel A Burton Geoffrey B Fincher 《BMC bioinformatics》2008,9(1):335

13.

PubMatrix: a tool for multiplex literature mining

Kevin G Becker Douglas A Hosack Glynn Dennis Jr Richard A Lempicki Tiffani J Bright Chris Cheadle Jim Engel 《BMC bioinformatics》2003,4(1):61

Background

Molecular experiments using multiplex strategies such as cDNA microarrays or proteomic approaches generate large datasets requiring biological interpretation. Text-based data mining tools have recently been developed to query large biological datasets of this type. PubMatrix is a web-based tool that allows simple text-based mining of the NCBI literature search service PubMed using any two lists of keyword terms, resulting in a frequency matrix of term co-occurrence.

14.

Comparison study of microarray meta-analysis methods

Anna Campain Yee Hwa Yang 《BMC bioinformatics》2010,11(1):408

Background

Meta-analysis methods exist for combining multiple microarray datasets. However, there is a wide range of issues associated with microarray meta-analysis and a limited ability to compare the performance of different meta-analysis methods.

15.

AutoFACT: An Automatic Functional Annotation and Classification Tool

Liisa B Koski Michael W Gray B Franz Lang Gertraud Burger 《BMC bioinformatics》2005,6(1):151

Background

Assignment of function to new molecular sequence data is an essential step in genomics projects. The usual process involves similarity searches of a given sequence against one or more databases, an arduous process for large datasets.

16.

Data reduction for spectral clustering to analyze high throughput flow cytometry data

Habil Zare Parisa Shooshtari Arvind Gupta Ryan R Brinkman 《BMC bioinformatics》2010,11(1):403

Background

Recent biological discoveries have shown that clustering large datasets is essential for better understanding biology in many areas. Spectral clustering in particular has proven to be a powerful tool amenable to many applications. However, it cannot be directly applied to large datasets due to time and memory limitations. To address this issue, we have modified spectral clustering by adding an information-preserving sampling procedure and applying a post-processing stage. We call this entire algorithm SamSPECTRAL.

17.

Expression profiles of switch-like genes accurately classify tissue and infectious disease phenotypes in model-based classification

Michael Gormley Aydin Tozeren 《BMC bioinformatics》2008,9(1):486

Background

Large-scale compilation of gene expression microarray datasets across diverse biological phenotypes provided a means of gathering a priori knowledge in the form of identification and annotation of bimodal genes in the human and mouse genomes. These switch-like genes make up 15% of known human genes, and are enriched with genes coding for extracellular and membrane proteins. It is of interest to determine the prediction potential of bimodal genes for class discovery in large-scale datasets.

18.

Performance of a genetic algorithm for mass spectrometry proteomics (total citations: 1; self-citations: 0; citations by others: 1)

Neal O Jeffries 《BMC bioinformatics》2004,5(1):180

Background

Recently, mass spectrometry data have been mined using a genetic algorithm to produce discriminatory models that distinguish healthy individuals from those with cancer. This algorithm is the basis for claims of 100% sensitivity and specificity in two related publicly available datasets. To date, no detailed attempts have been made to explore the properties of this genetic algorithm within proteomic applications. Here the algorithm's performance on these datasets is evaluated relative to other methods.

19.

Visualization of three-way comparisons of omics data

Richard Baran Martin Robert Makoto Suematsu Tomoyoshi Soga Masaru Tomita 《BMC bioinformatics》2007,8(1):72

Background

Density plot visualizations (also referred to as heat maps or color maps) are widely used in different fields including large-scale omics studies in biological sciences. However, the current color-codings limit the visualizations to single datasets or pairwise comparisons.

20.

Meta-analysis of breast cancer microarray studies in conjunction with conserved cis-elements suggest patterns for coordinate regulation

David D Smith Pål Sætrom Ola Snøve Jr Cathryn Lundberg Guillermo E Rivas Carlotta Glackin Garrett P Larson 《BMC bioinformatics》2008,9(1):63

Background

Gene expression measurements from breast cancer (BrCa) tumors are established clinical predictive tools to identify tumor subtypes, identify patients showing poor/good prognosis, and identify patients likely to have disease recurrence. However, diverse breast cancer datasets in conjunction with diagnostic clinical arrays show little overlap in the sets of genes identified. One approach to identify a set of consistently dysregulated candidate genes in these tumors is to employ meta-analysis of multiple independent microarray datasets. This allows one to compare expression data from a diverse collection of breast tumor array datasets generated on either cDNA or oligonucleotide arrays.