20 similar documents found (search time: 15 ms)
1.
Gene Set Context Analysis (GSCA) is an open source software package to help researchers use massive amounts of publicly available gene expression data (PED) to make discoveries. Users can interactively visualize and explore gene and gene set activities in 25,000+ consistently normalized human and mouse gene expression samples representing diverse biological contexts (e.g. different cells, tissues and disease types, etc.). By providing one or multiple genes or gene sets as input and specifying a gene set activity pattern of interest, users can query the expression compendium to systematically identify biological contexts associated with the specified gene set activity pattern. In this way, researchers with new gene sets from their own experiments may discover previously unknown contexts of gene set functions and hence increase the value of their experiments. GSCA has a graphical user interface (GUI). The GUI makes the analysis convenient and customizable. Analysis results can be conveniently exported as publication quality figures and tables. GSCA is available at https://github.com/zji90/GSCA. This software significantly lowers the bar for biomedical investigators to use PED in their daily research for generating and screening hypotheses, which was previously difficult because of the complexity, heterogeneity and size of the data.
2.
MOTIVATION: Analysis of gene expression data can provide insights into the time-lagged co-regulation of genes/gene clusters. However, existing methods such as the Event Method and the Edge Detection Method are inefficient as they compare only two genes at a time. More importantly, they neglect some important information due to their scoring criteria. In this paper, we propose an efficient algorithm to identify time-lagged co-regulated gene clusters. The algorithm facilitates localized comparison and processes several genes simultaneously to generate detailed and complete time-lagged information for genes/gene clusters. RESULTS: We experimented with the time-series Yeast gene dataset and compared our algorithm with the Event Method. Our results show that our algorithm is not only efficient, but also delivers more reliable and detailed information on time-lagged co-regulation between genes/gene clusters. AVAILABILITY: The software is available upon request. CONTACT: jiliping@comp.nus.edu.sg SUPPLEMENTARY INFORMATION: Supplementary tables and figures for this paper can be found at http://www.comp.nus.edu.sg/~jiliping/p2.htm.
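As a generic illustration of what a time-lagged relationship between two expression profiles looks like, the sketch below scores each candidate lag by the Pearson correlation between one profile and a shifted copy of the other. This is not the algorithm proposed in the abstract, nor the Event Method or Edge Detection Method; the function names `pearson` and `best_lag` and the toy profiles are assumptions made for illustration only.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)

def best_lag(a, b, max_lag):
    """Return (lag, r) for the lag in 0..max_lag at which b best follows a.

    At lag k, a[t] is compared with b[t + k]. Assumes len(a) == len(b)
    and max_lag small enough that several time points still overlap.
    """
    best = (0, pearson(a, b))
    for lag in range(1, max_lag + 1):
        r = pearson(a[:-lag], b[lag:])
        if r > best[1]:
            best = (lag, r)
    return best

# Hypothetical profiles: gene_b repeats gene_a two time points later.
gene_a = [0, 1, 4, 2, 0, 3, 1, 0]
gene_b = [5, 5, 0, 1, 4, 2, 0, 3]
lag, r = best_lag(gene_a, gene_b, max_lag=3)
```

A pairwise scan like this compares only two genes at a time, which is the limitation the abstract attributes to earlier methods.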
3.
Recent work has used graphs to model expression data from microarray experiments, with a view to partitioning the genes into clusters. In this paper, we introduce the use of a decomposition by clique separators. Our aim is to improve the classical clustering methods in two ways: first we want to allow an overlap between clusters, as this seems biologically sound, and second we want to be guided by the structure of the graph to define the number of clusters. We test this approach with a well-known yeast database (Saccharomyces cerevisiae). Our results are good, as the expression profiles of the clusters we find are very coherent. Moreover, we are able to organize into another graph the clusters we find, and order them in a fashion which turns out to respect the chronological order defined by the sporulation process.
4.
5.
Microarrays and high-throughput sequencing methods can be used to measure the expression of thousands of genes in a biological sample in a few days, whereas PCR-based methods can be used to measure the expression of a few genes in thousands of samples in about the same amount of time. These methods become more costly as the number of biological samples increases or as the number of genes of interest increases, respectively, and these factors constrain experimental design. To address these issues, we introduced ‘vertical arrays’ in which RNA from each biological sample is converted into multiple, overlapping cDNA subsets and spotted on glass slides. These vertical arrays can be queried with single gene probes to assess the expression behavior in thousands of biological samples in a single hybridization reaction. The spotted subsets are less complex than the original RNA from which they derive, which improves signal-to-noise ratios. Here, we demonstrate the quantitative capabilities of vertical arrays, including the sensitivity and accuracy of the method and the number of subsets needed to achieve this accuracy for most expressed genes.
6.
MOTIVATION: Temporal gene expression profiles provide an important characterization of gene function, as biological systems are predominantly developmental and dynamic. We propose a method of classifying collections of temporal gene expression curves in which individual expression profiles are modeled as independent realizations of a stochastic process. The method uses a recently developed functional logistic regression tool based on functional principal components, aimed at classifying gene expression curves into known gene groups. The number of eigenfunctions in the classifier can be chosen by leave-one-out cross-validation with the aim of minimizing the classification error. RESULTS: We demonstrate that this methodology provides low-error-rate classification for both yeast cell-cycle gene expression profiles and Dictyostelium cell-type specific gene expression patterns. It also works well in simulations. We compare our functional principal components approach with a B-spline implementation of functional discriminant analysis for the yeast cell-cycle data and simulations. This indicates comparative advantages of our approach which uses fewer eigenfunctions/base functions. The proposed methodology is promising for the analysis of temporal gene expression data and beyond. AVAILABILITY: MATLAB programs are available upon request.
7.
J-Express is a Java application that allows the user to analyze gene expression (microarray) data in a flexible way giving access to multidimensional scaling, clustering, and visualization methods in an integrated manner. Specifically, J-Express includes implementations of hierarchical clustering, k-means, principal component analysis, and self-organizing maps. At present, it does not include methods for comparing two or more experiments for differentially expressed genes. The application is completely portable and requires only that a Java runtime environment 1.2 is installed on the system. Its efficiency allows interactive clustering of thousands of expression profiles on standard personal computers.
8.
MOTIVATION: Patient outcome prediction using microarray technologies is an important application in bioinformatics. Based on patients' genotypic microarray data, predictions are made to estimate patients' survival time and their risk of tumor metastasis or recurrence. So, accurate prediction can potentially help to provide better treatment for patients. RESULTS: We present a new computational method for patient outcome prediction. In the training phase of this method, we make use of two types of extreme patient samples: short-term survivors who got an unfavorable outcome within a short period and long-term survivors who were maintaining a favorable outcome after a long follow-up time. These extreme training samples yield a clear platform for us to identify relevant genes whose expression is closely related to the outcome. The selected extreme samples and the relevant genes are then integrated by a support vector machine to build a prediction model, by which each validation sample is assigned a risk score that falls into one of the special pre-defined risk groups. We apply this method to several public datasets. In most cases, patients in high and low risk groups stratified by our method have clearly distinguishable outcome status as seen in their Kaplan-Meier curves. We also show that the idea of selecting only extreme patient samples for training is effective for improving the prediction accuracy when different gene selection methods are used.
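The selection of extreme training samples described above can be sketched as follows. The record layout `(patient_id, follow_up_time, event_observed)`, the function name, and both cutoffs are assumptions; the abstract does not state how the thresholds were chosen, and the gene selection and support vector machine steps are omitted here.

```python
def select_extreme_samples(patients, short_cutoff, long_cutoff):
    """Split patients into short-term and long-term survivors.

    patients: iterable of (patient_id, follow_up_time, event_observed),
    where event_observed is True if the unfavorable outcome occurred.
    Patients matching neither rule are left out of training.
    """
    short_term, long_term = [], []
    for patient_id, follow_up_time, event_observed in patients:
        if event_observed and follow_up_time <= short_cutoff:
            # Unfavorable outcome within a short period.
            short_term.append(patient_id)
        elif not event_observed and follow_up_time >= long_cutoff:
            # Still event-free after a long follow-up.
            long_term.append(patient_id)
    return short_term, long_term

# Hypothetical cohort; times are in years.
cohort = [
    ("P1", 0.5, True),   # early event: short-term survivor
    ("P2", 6.0, False),  # long event-free follow-up: long-term survivor
    ("P3", 2.5, True),   # event, but not early: excluded
    ("P4", 1.0, False),  # censored early: excluded
]
short_term, long_term = select_extreme_samples(cohort, short_cutoff=1.0, long_cutoff=5.0)
```

Excluding the intermediate and early-censored patients is what leaves the two training classes clearly separated in outcome.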
9.
10.
Analysis of gene expression data using self-organizing maps. Total citations: 29 (self-citations: 0; citations by others: 29)
DNA microarray technologies together with rapidly increasing genomic sequence information are leading to an explosion in available gene expression data. Currently there is a great need for efficient methods to analyze and visualize these massive data sets. A self-organizing map (SOM) is an unsupervised neural network learning algorithm which has been successfully used for the analysis and organization of large data files. We have here applied the SOM algorithm to analyze published data of yeast gene expression and show that SOM is an excellent tool for the analysis and visualization of gene expression profiles.
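To make the SOM idea concrete, here is a minimal sketch of a one-dimensional self-organizing map in plain Python. It is a textbook-style illustration, not the implementation used in the paper: the deterministic initialization from data points, the linear decay of learning rate and neighborhood radius, and all parameter values are assumptions.

```python
import math

def nearest_unit(weights, x):
    """Index of the map unit whose weight vector is closest to x."""
    return min(
        range(len(weights)),
        key=lambda j: sum((w - v) ** 2 for w, v in zip(weights[j], x)),
    )

def train_som(data, n_units=2, epochs=50, lr0=0.5, radius0=1.0):
    """Train a 1-D SOM; returns one weight vector per map unit."""
    last = len(data) - 1
    # Deterministic initialization: copy evenly spaced data points.
    weights = [list(data[(j * last) // max(n_units - 1, 1)]) for j in range(n_units)]
    for t in range(epochs):
        decay = 1.0 - t / epochs
        lr = lr0 * decay                       # learning rate shrinks over time
        radius = max(radius0 * decay, 1e-3)    # neighborhood shrinks over time
        for x in data:
            bmu = nearest_unit(weights, x)     # best-matching unit
            for j, w in enumerate(weights):
                # Gaussian neighborhood on the map grid around the BMU.
                h = math.exp(-((j - bmu) ** 2) / (2 * radius ** 2))
                for k in range(len(w)):
                    w[k] += lr * h * (x[k] - w[k])
    return weights

# Hypothetical two-condition expression profiles forming two groups.
profiles = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [5.0, 5.0], [5.1, 5.0], [5.0, 5.1]]
som = train_som(profiles)
assignments = [nearest_unit(som, p) for p in profiles]
```

After training, genes with similar profiles map to the same unit, which is what makes the map usable for both clustering and visualization.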
11.
12.
Recently, a novel approach has been developed to study gene expression in single cells with high time resolution using RNA Fluorescent In Situ Hybridization (FISH). The technique allows individual mRNAs to be counted with high accuracy in wild-type cells, but requires cells to be fixed; thus, each cell provides only a "snapshot" of gene expression. Here we show how and when RNA FISH data on pairs of genes can be used to reconstruct real-time dynamics from a collection of such snapshots. Using maximum-likelihood parameter estimation on synthetically generated, noisy FISH data, we show that dynamical programs of gene expression, such as cycles (e.g., the cell cycle) or switches between discrete states, can be accurately reconstructed. In the limit that mRNAs are produced in short-lived bursts, binary thresholding of the FISH data provides a robust way of reconstructing dynamics. In this regime, prior knowledge of the type of dynamics--cycle versus switch--is generally required and additional constraints, e.g., from triplet FISH measurements, may also be needed to fully constrain all parameters. As a demonstration, we apply the thresholding method to RNA FISH data obtained from single, unsynchronized cells of Saccharomyces cerevisiae. Our results support the existence of metabolic cycles and provide an estimate of global gene-expression noise. The approach to FISH data presented here can be applied in general to reconstruct dynamics from snapshots of pairs of correlated quantities including, for example, protein concentrations obtained from immunofluorescence assays.
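The binary thresholding step mentioned above can be illustrated with a short sketch that converts per-cell mRNA counts for a gene pair into on/off states and tabulates how often each joint state occurs. Only the thresholding is shown; the reconstruction of dynamics from these frequencies is not attempted. The function name, the thresholds, and the counts are assumptions.

```python
from collections import Counter

def joint_state_fractions(counts, threshold_a, threshold_b):
    """Fraction of cells in each (gene A on, gene B on) state.

    counts: list of (mrna_count_a, mrna_count_b), one pair per fixed cell.
    A gene is called "on" when its count reaches the threshold.
    """
    states = Counter((a >= threshold_a, b >= threshold_b) for a, b in counts)
    total = len(counts)
    return {state: n / total for state, n in states.items()}

# Hypothetical snapshot of four cells.
cells = [(0, 0), (10, 0), (12, 9), (0, 8)]
fractions = joint_state_fractions(cells, threshold_a=5, threshold_b=5)
```

Because each fixed cell contributes a single snapshot, these occupancy fractions are the raw material from which a cycle or switch model would then be fitted.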
13.
14.
Cline MS, Smoot M, Cerami E, Kuchinsky A, Landys N, Workman C, Christmas R, Avila-Campilo I, Creech M, Gross B, Hanspers K, Isserlin R, Kelley R, Killcoyne S, Lotia S, Maere S, Morris J, Ono K, Pavlovic V, Pico AR, Vailaya A, Wang PL, Adler A, Conklin BR, Hood L, Kuiper M, Sander C, Schmulevich I, Schwikowski B, Warner GJ, Ideker T, Bader GD. Nature Protocols 2007, 2(10): 2366-2382
Cytoscape is a free software package for visualizing, modeling and analyzing molecular and genetic interaction networks. This protocol explains how to use Cytoscape to analyze the results of mRNA expression profiling, and other functional genomics and proteomics experiments, in the context of an interaction network obtained for genes of interest. Five major steps are described: (i) obtaining a gene or protein network, (ii) displaying the network using layout algorithms, (iii) integrating with gene expression and other functional attributes, (iv) identifying putative complexes and functional modules and (v) identifying enriched Gene Ontology annotations in the network. These steps provide a broad sample of the types of analyses performed by Cytoscape.
15.
The purpose of many microarray studies is to find the association between gene expression and sample characteristics such as treatment type or sample phenotype. There has been a surge of effort to develop different methods for delineating the association. Aside from the high dimensionality of microarray data, one well recognized challenge is the fact that genes can be inter-related in complicated ways, thus making many statistical methods inappropriate to use directly on the expression data. Multivariate methods such as principal component analysis (PCA) and clustering are often used as a part of the effort to capture the gene correlation, and the derived components or clusters are used to describe the association between gene expression and sample phenotype. We propose a method for patient population dichotomization using maximally selected test statistics in combination with the PCA method, which shows favorable results. The proposed method is compared with a currently well-recognized method.
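The idea of a maximally selected test statistic can be sketched as a scan over candidate cutpoints of a per-patient score, keeping the cutpoint that gives the largest two-sample statistic. This is a generic illustration rather than the authors' procedure: the use of a Welch t statistic, a continuous outcome, the minimum group size, and the assumption that the score is something like a first principal component are all choices made here.

```python
import math

def welch_t(x, y):
    """Absolute Welch t statistic for two samples (each of size >= 2)."""
    nx, ny = len(x), len(y)
    mx, my = sum(x) / nx, sum(y) / ny
    vx = sum((v - mx) ** 2 for v in x) / (nx - 1)
    vy = sum((v - my) ** 2 for v in y) / (ny - 1)
    se = math.sqrt(vx / nx + vy / ny)
    if se == 0:
        return 0.0 if mx == my else float("inf")
    return abs(mx - my) / se

def max_selected_cutpoint(scores, outcomes, min_group=2):
    """Return (cutpoint, statistic) maximizing the statistic over all splits."""
    pairs = sorted(zip(scores, outcomes))
    best_cut, best_stat = None, -1.0
    for i in range(min_group, len(pairs) - min_group + 1):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # cannot cut between tied scores
        low = [o for _, o in pairs[:i]]
        high = [o for _, o in pairs[i:]]
        stat = welch_t(low, high)
        if stat > best_stat:
            best_cut = (pairs[i - 1][0] + pairs[i][0]) / 2
            best_stat = stat
    return best_cut, best_stat

# Hypothetical component scores and outcomes for eight patients.
scores = [1, 2, 3, 4, 5, 6, 7, 8]
outcomes = [1.0, 1.2, 0.9, 1.1, 5.0, 5.2, 4.9, 5.1]
cut, stat = max_selected_cutpoint(scores, outcomes)
```

Because the cutpoint is chosen to maximize the statistic, its null distribution differs from that of a single pre-specified test, so any p-value needs a corresponding adjustment.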
16.
17.
18.
The hand-held gene gun provides a rapid and efficient method of incorporating fluorescent dyes into cells, a technique that is becoming known as diolistics. Transporting fluorescent dyes into cells has, in the past, used predominantly injection or chemical methods. The use of the gene gun, combined with the new generation of fluorescent dyes, circumvents some of the problems of using these methods and also enables the study of cells that have traditionally proved difficult to transfect (e.g. those deep in tissues and/or terminally differentiated); in addition, the use of ion- or metabolite-sensitive dyes provides a route to study cellular mechanisms. Diolistics is also ideal for loading cells with optical nanosensors--nanometre-sized sensors linked to fluorescent probes. Here, we discuss the theoretical considerations of using diolistics, the advantages compared with other methods of inserting dyes into cells and the current uses of the technique, with particular consideration of nanosensors.
19.
SUMMARY: The package HMMGEP performs cluster analysis on gene expression data using hidden Markov models. AVAILABILITY: HMMGEP, including the source code, documentation and sample data files, is available at http://www.bioinfo.tsinghua.edu.cn:8080/~rich/hmmgep_download/index.html.