1.

Combining transcriptional datasets using the generalized singular value decomposition

Andreas W Schreiber Neil J Shirley Rachel A Burton Geoffrey B Fincher 《BMC bioinformatics》2008,9(1):335

2.

A compatible exon-exon junction database for the identification of exon skipping events using tandem mass spectrum data

Fan Mo Xu Hong Feng Gao Lin Du Jun Wang Gilbert S Omenn Biaoyang Lin 《BMC bioinformatics》2008,9(1):537

Background

Alternative splicing is an important gene regulation mechanism. It is estimated that about 74% of multi-exon human genes undergo alternative splicing. High-throughput tandem (MS/MS) mass spectrometry provides valuable information for rapidly identifying potentially novel alternatively spliced protein products from experimental datasets. However, the ability to identify alternative splicing events through tandem mass spectrometry depends on the database against which the spectra are searched.

3.

Recursive Cluster Elimination (RCE) for classification and feature selection from gene expression data

Malik Yousef Segun Jung Louise C Showe Michael K Showe 《BMC bioinformatics》2007,8(1):144

Background

Classification studies using gene expression datasets are usually based on small numbers of samples and tens of thousands of genes. Selecting the genes that are important for distinguishing the sample classes being compared poses a challenging problem in high-dimensional data analysis. We describe a new procedure for selecting significant genes, recursive cluster elimination (RCE), as an alternative to recursive feature elimination (RFE). We have tested this algorithm on six datasets and compared its performance with that of two related classification procedures with RFE.

4.

Topological and organizational properties of the products of house-keeping and tissue-specific genes in protein-protein interaction networks

Wen-hsien Lin Wei-chung Liu Ming-jing Hwang 《BMC systems biology》2009,3(1):32-17

Background

Human cells of various tissue types differ greatly in morphology despite having the same set of genetic information. Some genes are expressed in all cell types to perform house-keeping functions, while some are selectively expressed to perform tissue-specific functions. In this study, we wished to elucidate how proteins encoded by human house-keeping genes and tissue-specific genes are organized in human protein-protein interaction networks. We constructed protein-protein interaction networks for different tissue types using two gene expression datasets and one protein-protein interaction database. We then calculated three network indices of topological importance, the degree, closeness, and betweenness centralities, to measure the network position of proteins encoded by house-keeping and tissue-specific genes, and quantified their local connectivity structure.
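The three centrality indices named in this abstract can be illustrated on a toy network. The sketch below is not the authors' code: the five-node graph, the node names, and the normalisation choices are assumptions made for illustration, and betweenness is computed with Brandes' algorithm for unweighted, undirected, connected graphs.

```python
from collections import deque

# Hypothetical toy interaction network as an adjacency list (undirected).
toy = {
    "H": ["A", "B", "C"],
    "A": ["H"],
    "B": ["H"],
    "C": ["H", "D"],
    "D": ["C"],
}

def degree_centrality(adj):
    """Fraction of the other nodes that each node is directly connected to."""
    n = len(adj)
    return {v: len(nbrs) / (n - 1) for v, nbrs in adj.items()}

def bfs_distances(adj, source):
    """Shortest-path hop counts from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for w in adj[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    return dist

def closeness_centrality(adj):
    """Reciprocal of the mean shortest-path distance to reachable nodes."""
    result = {}
    for v in adj:
        dist = bfs_distances(adj, v)
        total = sum(dist.values())
        result[v] = (len(dist) - 1) / total if total > 0 else 0.0
    return result

def betweenness_centrality(adj):
    """Unnormalised betweenness: number of shortest paths through each node."""
    bc = {v: 0.0 for v in adj}
    for s in adj:
        stack = []
        preds = {v: [] for v in adj}
        sigma = {v: 0 for v in adj}  # number of shortest paths from s
        sigma[s] = 1
        dist = {s: 0}
        queue = deque([s])
        while queue:
            v = queue.popleft()
            stack.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Accumulate pair dependencies in order of decreasing distance from s.
        delta = {v: 0.0 for v in adj}
        while stack:
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    # Each unordered pair was counted once from each endpoint.
    return {v: b / 2 for v, b in bc.items()}

if __name__ == "__main__":
    print(degree_centrality(toy))
    print(closeness_centrality(toy))
    print(betweenness_centrality(toy))
```

In this toy graph the hub "H" scores highest on all three indices, while "C" has a low degree but non-zero betweenness because it is the only route to "D".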

5.

Effect of data normalization on fuzzy clustering of DNA microarray data

Seo Young Kim Jae Won Lee Jong Sung Bae 《BMC bioinformatics》2006,7(1):134-14

Background

Microarray technology has made it possible to simultaneously measure the expression levels of large numbers of genes in a short time. Gene expression data is information rich; however, extensive data mining is required to identify the patterns that characterize the underlying mechanisms of action. Clustering is an important tool for finding groups of genes with similar expression patterns in microarray data analysis. However, hard clustering methods, which assign each gene to exactly one cluster, are poorly suited to the analysis of microarray datasets because in such datasets the clusters of genes frequently overlap.
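The contrast between hard and fuzzy assignment can be made concrete with a small sketch. It uses the standard fuzzy c-means membership formula with fixed centroids, which is an assumption: the abstract does not state which fuzzy clustering variant the paper evaluates. The centroids, the points, and the fuzzifier value `m` are hypothetical.

```python
import math

def hard_assignment(point, centroids):
    """Index of the single nearest centroid (ties go to the lowest index)."""
    dists = [math.dist(point, c) for c in centroids]
    return dists.index(min(dists))

def fuzzy_memberships(point, centroids, m=2.0):
    """Fuzzy c-means membership of one point in every cluster.

    u_i = d_i^(-2/(m-1)) / sum_k d_k^(-2/(m-1)), with fuzzifier m > 1.
    The memberships are non-negative and sum to 1.
    """
    dists = [math.dist(point, c) for c in centroids]
    # A point lying exactly on one or more centroids belongs only to those.
    if any(d == 0.0 for d in dists):
        hits = [1.0 if d == 0.0 else 0.0 for d in dists]
        return [h / sum(hits) for h in hits]
    exponent = 2.0 / (m - 1.0)
    weights = [d ** -exponent for d in dists]
    total = sum(weights)
    return [w / total for w in weights]

if __name__ == "__main__":
    # Hypothetical two-dimensional expression profiles and cluster centres.
    centroids = [(0.0, 0.0), (4.0, 0.0)]
    for gene in [(1.0, 0.0), (2.0, 0.0)]:
        print(gene, hard_assignment(gene, centroids),
              fuzzy_memberships(gene, centroids))
```

A gene midway between two centroids is forced into one cluster by the hard rule but receives equal membership in both under the fuzzy rule, which is the overlap the abstract refers to.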

6.

A first-draft human protein-interaction map (total citations: 3; self-citations: 2; citations by others: 1)

Lehner B Fraser AG 《Genome biology》2004,5(9):R63

Background

Protein-interaction maps are powerful tools for suggesting the cellular functions of genes. Although large-scale protein-interaction maps have been generated for several invertebrate species, projects of a similar scale have not yet been described for any mammal. Because many physical interactions are conserved between species, it should be possible to infer information about human protein interactions (and hence protein function) using model organism protein-interaction datasets.

Results

Here we describe a network of over 70,000 predicted physical interactions between around 6,200 human proteins generated using the data from lower eukaryotic protein-interaction maps. The physiological relevance of this network is supported by its ability to preferentially connect human proteins that share the same functional annotations, and we show how the network can be used to successfully predict the functions of human proteins. We find that combining interaction datasets from a single organism (but generated using independent assays) and combining interaction datasets from two organisms (but generated using the same assay) are both very effective ways of further improving the accuracy of protein-interaction maps.

Conclusions

The complete network predicts interactions for a third of human genes, including 448 human disease genes and 1,482 genes of unknown function, and so provides a rich framework for biomedical research.
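The idea of inferring human interactions from model-organism maps can be sketched as a lookup through a cross-species mapping. The abstract does not say how proteins are matched between species, so the ortholog table, the identifiers, and the function name below are hypothetical; this is a minimal illustration of the transfer step, not the authors' pipeline.

```python
from itertools import product

def transfer_interactions(model_pairs, orthologs):
    """Predict human protein pairs from model-organism interactions.

    model_pairs: iterable of (protein_a, protein_b) in the model organism.
    orthologs:   dict mapping a model-organism protein to a list of human
                 proteins (assumed mapping; may be empty or missing).
    Returns a set of frozensets, one per predicted unordered human pair.
    """
    predicted = set()
    for a, b in model_pairs:
        human_a = orthologs.get(a, ())
        human_b = orthologs.get(b, ())
        for ha, hb in product(human_a, human_b):
            if ha != hb:  # skip self-pairs produced by shared orthologs
                predicted.add(frozenset((ha, hb)))
    return predicted

if __name__ == "__main__":
    # Hypothetical yeast interactions and yeast-to-human mapping.
    yeast_pairs = [("y1", "y2"), ("y2", "y3")]
    yeast_to_human = {"y1": ["H1"], "y2": ["H2", "H2b"], "y3": []}
    for pair in sorted(sorted(p) for p in
                       transfer_interactions(yeast_pairs, yeast_to_human)):
        print(pair)
```

Running the same function on several source datasets and counting how many of them support each predicted pair is one simple way to combine evidence, in the spirit of the dataset combinations described in the Results.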

7.

Meta-analysis of breast cancer microarray studies in conjunction with conserved cis-elements suggest patterns for coordinate regulation

David D Smith Pål Sætrom Ola Snøve Jr Cathryn Lundberg Guillermo E Rivas Carlotta Glackin Garrett P Larson 《BMC bioinformatics》2008,9(1):63

Background

Gene expression measurements from breast cancer (BrCa) tumors are established clinical predictive tools to identify tumor subtypes, identify patients showing poor/good prognosis, and identify patients likely to have disease recurrence. However, diverse breast cancer datasets in conjunction with diagnostic clinical arrays show little overlap in the sets of genes identified. One approach to identify a set of consistently dysregulated candidate genes in these tumors is to employ meta-analysis of multiple independent microarray datasets. This allows one to compare expression data from a diverse collection of breast tumor array datasets generated on either cDNA or oligonucleotide arrays.

8.

Biclustering of gene expression data by non-smooth non-negative matrix factorization (total citations: 1; self-citations: 0; citations by others: 1)

Pedro Carmona-Saez Roberto D Pascual-Marqui F Tirado Jose M Carazo Alberto Pascual-Montano 《BMC bioinformatics》2006,7(1):78

Background

The extended use of microarray technologies has enabled the generation and accumulation of gene expression datasets that contain expression levels of thousands of genes across tens or hundreds of different experimental conditions. One of the major challenges in the analysis of such datasets is to discover local structures composed of sets of genes that show coherent expression patterns across subsets of experimental conditions. These patterns may provide clues about the main biological processes associated with different physiological states.

9.

A comparison of four clustering methods for brain expression microarray data

Alexander L Richards Peter Holmans Michael C O'Donovan Michael J Owen Lesley Jones 《BMC bioinformatics》2008,9(1):490

Background

DNA microarrays, which determine the expression levels of tens of thousands of genes from a sample, are an important research tool. However, the volume of data they produce can be an obstacle to interpretation of the results. Clustering the genes on the basis of similarity of their expression profiles can simplify the data, and potentially provides an important source of biological inference, but these methods have not been tested systematically on datasets from complex human tissues. In this paper, four clustering methods, CRC, k-means, ISA and memISA, are applied to three brain expression datasets. The results are compared on speed, gene coverage and GO enrichment. The effects of combining the clusters produced by each method are also assessed.

10.

Probabilistic prediction and ranking of human protein-protein interactions

Michelle S Scott Geoffrey J Barton 《BMC bioinformatics》2007,8(1):239

Background

Although the prediction of protein-protein interactions has been extensively investigated for yeast, few such datasets exist for the far larger human proteome. Furthermore, it has recently been estimated that the overall average false positive rate of available computational and high-throughput experimental interaction datasets is as high as 90%.

11.

Genome SEGE: A database for 'intronless' genes in eukaryotic genomes

Meena Kishore Sakharkar Pandjassarame Kangueane 《BMC bioinformatics》2004,5(1):67

Background

Data from a number of completely sequenced eukaryotic genomes are available in the public domain. Eukaryotic genes are either 'intron containing' or 'intronless'. Eukaryotic 'intronless' genes are interesting datasets for comparative genomics and evolutionary studies. The SEGE database, containing a collection of eukaryotic single exon genes, is available. However, SEGE is derived from GenBank. The redundant, incomplete and heterogeneous qualities of GenBank data are a bottleneck for biological investigation in comparative genomics and evolutionary studies. Such studies often require representative gene sets from each genome and this is possible only by deriving specific datasets from completely sequenced genome data. Thus Genome SEGE, a database for 'intronless' genes in completely sequenced eukaryotic genomes, has been constructed.

12.

Integrated analysis of gene expression by association rules discovery

Pedro Carmona-Saez Monica Chagoyen Andres Rodriguez Oswaldo Trelles Jose M Carazo Alberto Pascual-Montano 《BMC bioinformatics》2006,7(1):54-16

Background

Microarray technology is generating huge amounts of data about the expression level of thousands of genes, or even whole genomes, across different experimental conditions. To extract biological knowledge, and to fully understand such datasets, it is essential to include external biological information about genes and gene products in the analysis of expression data. However, most current approaches to analyzing microarray datasets focus mainly on the experimental data, and external biological information is incorporated only as a subsequent step.

13.

Genetic interaction motif finding by expectation maximization – a novel statistical model for inferring gene modules from synthetic lethality

Yan Qi Ping Ye Joel S Bader 《BMC bioinformatics》2005,6(1):288

Background

Synthetic lethality experiments identify pairs of genes with complementary function. More direct functional associations (for example greater probability of membership in a single protein complex) may be inferred between genes that share synthetic lethal interaction partners than between genes that are directly synthetic lethal. Probabilistic algorithms that identify gene modules based on motif discovery are highly appropriate for the analysis of synthetic lethal genetic interaction data and have great potential in integrative analysis of heterogeneous datasets.

14.

EDISA: extracting biclusters from multiple time-series of gene expression profiles

Jochen Supper Martin Strauch Dierk Wanke Klaus Harter Andreas Zell 《BMC bioinformatics》2007,8(1):334

Background

Cells dynamically adapt their gene expression patterns in response to various stimuli. This response is orchestrated into a number of gene expression modules consisting of co-regulated genes. A growing pool of publicly available microarray datasets allows the identification of modules by monitoring expression changes over time. These time-series datasets can be searched for gene expression modules by one of the many clustering methods published to date. For an integrative analysis, several time-series datasets can be joined into a three-dimensional gene-condition-time dataset, to which standard clustering or biclustering methods are, however, not applicable. We thus devise a probabilistic clustering algorithm for gene-condition-time datasets.

15.

A Platform for Processing Expression of Short Time Series (PESTS)

Anshu Sinha Marianthi Markatou 《BMC bioinformatics》2011,12(1):13

Background

Time course microarray profiles examine the expression of genes over a time domain. They are necessary in order to determine the complete set of genes that are dynamically expressed under given conditions, and to determine the interaction between these genes. Because of cost and resource issues, most time series datasets contain fewer than 9 points, and there are few tools available geared towards the analysis of this type of data.

16.

Protein interaction network topology uncovers melanogenesis regulatory network components within functional genomics datasets

Hsiang Ho Tijana Milenković Vesna Memišević Jayavani Aruri Nataša Pržulj Anand K Ganesan 《BMC systems biology》2010,4(1):84

Background

RNA-mediated interference (RNAi)-based functional genomics is a systems-level approach to identify novel genes that control biological phenotypes. Existing computational approaches can identify individual genes from RNAi datasets that regulate a given biological process. However, currently available methods cannot identify which RNAi screen "hits" are novel components of well-characterized biological pathways known to regulate the interrogated phenotype. In this study, we describe a method to identify genes from RNAi datasets that are novel components of known biological pathways. We experimentally validate our approach in the context of a recently completed RNAi screen to identify novel regulators of melanogenesis.

17.

Classification and biomarker identification using gene network modules and support vector machines

Malik Yousef Mohamed Ketany Larry Manevitz Louise C Showe Michael K Showe 《BMC bioinformatics》2009,10(1):337

Background

Classification using microarray datasets is usually based on a small number of samples for which tens of thousands of gene expression measurements have been obtained. The selection of the genes most significant to the classification problem is a challenging issue in high-dimensional data analysis and interpretation. A previous study with SVM-RCE (Recursive Cluster Elimination) suggested that classification based on groups of correlated genes sometimes exhibits better performance than classification using single genes. Large databases of gene interaction networks provide an important resource for the analysis of genetic phenomena and for classification studies using interacting genes.

18.

New components of the Dictyostelium PKA pathway revealed by Bayesian analysis of expression data

Anup Parikh Eryong Huang Christopher Dinh Blaz Zupan Adam Kuspa Devika Subramanian Gad Shaulsky 《BMC bioinformatics》2010,11(1):163

Background

Identifying candidate genes in genetic networks is important for understanding regulation and biological function. Large gene expression datasets contain relevant information about genetic networks, but mining the data is not a trivial task. Algorithms that infer Bayesian networks from expression data are powerful tools for learning complex genetic networks, since they can incorporate prior knowledge and uncover higher-order dependencies among genes. However, these algorithms are computationally demanding, so novel techniques that allow targeted exploration for discovering new members of known pathways are essential.

19.

Generation of Gene Ontology benchmark datasets with various types of positive signal

Petri Törönen Petri Pehkonen Liisa Holm 《BMC bioinformatics》2009,10(1):319

20.

Differential prioritization between relevance and redundancy in correlation-based feature selection techniques for multiclass gene expression data

Chia Huey Ooi Madhu Chetty Shyh Wei Teng 《BMC bioinformatics》2006,7(1):320

Background

Due to the large number of genes in a typical microarray dataset, feature selection looks set to play an important role in reducing noise and computational cost in gene expression-based tissue classification while improving accuracy at the same time. Surprisingly, this does not appear to be the case for all multiclass microarray datasets. The reason is that many feature selection techniques applied to microarray datasets are either rank-based, and hence do not take into account correlations between genes, or wrapper-based, which incur high computational cost and often yield difficult-to-reproduce results. In studies where correlations between genes are considered, attempts to establish the merit of the proposed techniques are hampered by evaluation procedures which are less than meticulous, resulting in overly optimistic estimates of accuracy.