Similar articles
Found 20 similar articles (search time: 15 ms)
1.

Background

We present a novel and systematic approach to analyze temporal microarray data. The approach includes normalization, clustering and network analysis of genes.

Methodology

Genes are normalized using an error-model-based uniform normalization method aimed at identifying and estimating the sources of variation. The model minimizes the correlation among error terms across replicates. The normalized gene expressions are then clustered in terms of their power spectrum density. The method of complex Granger causality is introduced to reveal interactions between sets of genes. Complex Granger causality, along with partial Granger causality, is applied in both the time and frequency domains, to selected genes as well as to all genes, to reveal the interesting networks of interactions. The approach is successfully applied to Arabidopsis leaf microarray data generated from 31,000 genes observed at 22 time points over 22 days. Three circuits are analyzed in detail: a circadian gene circuit, an ethylene circuit, and a new global circuit showing a hierarchical structure to determine the initiators of leaf senescence.

Conclusions

We use a totally data-driven approach to form biological hypotheses. Clustering using power-spectrum analysis helps us identify genes of potential interest. Their dynamics can be captured accurately in the time and frequency domains using the methods of complex and partial Granger causality. With the rise in availability of temporal microarray data, such methods can be useful tools in uncovering hidden biological interactions. We demonstrate our method step by step with the help of toy models as well as a real biological dataset. We also analyse three distinct gene circuits of potential interest to Arabidopsis researchers.
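As an illustration of the kind of test this abstract builds on, the sketch below computes plain pairwise, time-domain Granger causality with a single lag. It is not the complex or partial Granger causality the authors describe, and the toy system, the function names `lag1_granger` and `simulate`, and all coefficients are assumptions made only for this example.

```python
import math
import random


def lag1_granger(source, target):
    """Granger causality source -> target with one lag and no intercept.

    Returns ln(restricted residual variance / full residual variance).
    Values near 0 mean the past of `source` adds no predictive power.
    """
    now = target[1:]
    own_past = target[:-1]
    src_past = source[:-1]
    n = len(now)

    # Restricted model: target[t] = a * target[t-1] + error
    s_oo = sum(o * o for o in own_past)
    s_no = sum(v * o for v, o in zip(now, own_past))
    a_r = s_no / s_oo
    var_restricted = sum((v - a_r * o) ** 2 for v, o in zip(now, own_past)) / n

    # Full model: target[t] = a * target[t-1] + b * source[t-1] + error,
    # solved from the 2x2 normal equations by Cramer's rule.
    s_ss = sum(s * s for s in src_past)
    s_os = sum(o * s for o, s in zip(own_past, src_past))
    s_ns = sum(v * s for v, s in zip(now, src_past))
    det = s_oo * s_ss - s_os * s_os
    a_f = (s_no * s_ss - s_ns * s_os) / det
    b_f = (s_ns * s_oo - s_no * s_os) / det
    var_full = sum(
        (v - a_f * o - b_f * s) ** 2
        for v, o, s in zip(now, own_past, src_past)
    ) / n

    return math.log(var_restricted / var_full)


def simulate(n_points=500, seed=0):
    """Hypothetical toy system in which x drives y but y does not drive x."""
    rng = random.Random(seed)
    x, y = [0.0], [0.0]
    for _ in range(n_points - 1):
        x_next = 0.5 * x[-1] + rng.gauss(0, 1)
        y_next = 0.3 * y[-1] + 0.8 * x[-1] + rng.gauss(0, 1)
        x.append(x_next)
        y.append(y_next)
    return x, y


x, y = simulate()
print("x -> y:", round(lag1_granger(x, y), 3))
print("y -> x:", round(lag1_granger(y, x), 3))
```

In this toy system the x -> y value should be clearly positive while the y -> x value stays close to zero. A real analysis of 22 time points would need a proper significance test, which this sketch omits.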

2.
3.
4.
Extracting three-way gene interactions from microarray data   Total citations: 1 (self-citations: 0, citations by others: 1)
MOTIVATION: It is an important and difficult task to extract gene network information from high-throughput genomic data. A common approach is to cluster genes using pairwise correlation as a distance metric. However, pairwise correlation is clearly too simplistic to describe the complex relationships among real genes, since co-expression relationships are often restricted to a specific set of biological conditions/processes. In this study, we describe a three-way gene interaction model that captures the dynamic nature of the co-expression relationship between a gene pair through the introduction of a controller gene. RESULTS: We surveyed 0.4 billion possible three-way interactions among 1000 genes in a microarray dataset containing 678 human cancer samples. To test the reproducibility and statistical significance of our results, we randomly split the samples into a training set and a testing set. We found that the gene triplets with the strongest interactions (i.e. with the smallest P-values from appropriate statistical tests) in the training set also had the strongest interactions in the testing set. A distinctive pattern of three-way interaction emerged from these gene triplets: depending on whether the third gene is expressed or not, the remaining two genes can be either co-expressed or mutually exclusive (i.e. expression of either one of them would repress the other). Such three-way interactions can exist without apparent pairwise correlations. The identified three-way interactions may constitute candidates for further experimentation using techniques such as RNA interference, so that novel gene networks or pathways could be identified.
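The pattern described above, where a controller gene switches a pair between co-expression and mutual exclusion, can be sketched with a toy example. The abstract does not specify the statistical test used, so the split-by-threshold comparison below, the function name `controlled_correlations`, and the data are all assumptions chosen for illustration.

```python
import math


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / math.sqrt(var_a * var_b)


def controlled_correlations(gene_x, gene_y, controller, threshold):
    """Correlation of gene_x and gene_y in samples where the controller
    is above the threshold ("on") and where it is not ("off")."""
    on = [i for i, c in enumerate(controller) if c > threshold]
    off = [i for i, c in enumerate(controller) if c <= threshold]
    r_on = pearson([gene_x[i] for i in on], [gene_y[i] for i in on])
    r_off = pearson([gene_x[i] for i in off], [gene_y[i] for i in off])
    return r_on, r_off


# Hypothetical expression values for eight samples.
gene_x = [1, 2, 3, 4, 1, 2, 3, 4]
gene_y = [1, 2, 3, 4, 4, 3, 2, 1]
controller = [9, 9, 9, 9, 0, 0, 0, 0]

r_on, r_off = controlled_correlations(gene_x, gene_y, controller, threshold=5)
print("controller on :", r_on)
print("controller off:", r_off)
print("all samples   :", pearson(gene_x, gene_y))
```

Here the pair is perfectly correlated when the controller is on and perfectly anti-correlated when it is off, yet the correlation over all samples is zero, which mirrors the claim that three-way interactions can exist without apparent pairwise correlation.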

5.
6.
This paper introduces a method to study the variation of brain functional connectivity networks with respect to experimental conditions in fMRI data. It is related to the psychophysiological interaction technique introduced by Friston et al. and extends to networks of correlation modulation (CM networks). Extended networks containing several dozens of nodes are determined in which the links correspond to consistent correlation modulation across subjects. In addition, we assess inter-subject variability and determine networks in which the condition-dependent functional interactions can be explained by a subject-dependent variable. We applied the technique to data from a study on syntactical production in bilinguals and analysed functional interactions differentially across tasks (word reading or sentence production) and across languages. We find an extended network of consistent functional interaction modulation across tasks, whereas the network comparing languages shows fewer links. Interestingly, there is evidence for a specific network in which the differences in functional interaction across subjects can be explained by differences in the subjects' syntactical proficiency. Specifically, we find that regions, including ones that have previously been shown to be involved in syntax and in language production, such as the left inferior frontal gyrus, putamen, insula, precentral gyrus, as well as the supplementary motor area, are more functionally linked during sentence production in the second, compared with the first, language in syntactically more proficient bilinguals than in syntactically less proficient ones. Our approach extends conventional activation analyses to the notion of networks, emphasizing functional interactions between regions independently of whether or not they are activated. 
On the one hand, it gives rise to testable hypotheses and allows an interpretation of the results in terms of the previous literature, and on the other hand, it provides a basis for studying the structure of functional interactions as a whole, and hence represents a further step towards the notion of large-scale networks in functional imaging.

7.
8.
MOTIVATION: Large-scale gene expression data are often analysed by clustering genes based on gene expression data alone, though a priori knowledge in the form of biological networks is available. The use of this additional information promises to improve exploratory analysis considerably. RESULTS: We propose constructing a distance function which combines information from expression data and biological networks. Based on this function, we compute a joint clustering of genes and vertices of the network. This general approach is elaborated for metabolic networks. We define a graph distance function on such networks and combine it with a correlation-based distance function for gene expression measurements. A hierarchical clustering and an associated statistical measure are computed to arrive at a reasonable number of clusters. Our method is validated using expression data of the yeast diauxic shift. The resulting clusters are easily interpretable in terms of the biochemical network and the gene expression data and suggest that our method is able to automatically identify processes that are relevant under the measured conditions.
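One way to picture a distance that mixes expression and network information is sketched below. The abstract does not say how the two distances are combined, so the weighted average, the scaling of both terms to the range 0 to 1, the treatment of genes as network nodes, and the names `graph_distances` and `combined_distance` are assumptions for this example only.

```python
import math
from collections import deque


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / math.sqrt(var_a * var_b)


def graph_distances(adjacency, start):
    """Breadth-first search: edge count from start to each reachable node."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in adjacency[node]:
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return dist


def combined_distance(gene_a, gene_b, expression, adjacency, weight=0.5):
    """Weighted mean of an expression distance and a network distance.

    Both terms are scaled to [0, 1]; unreachable genes get the maximum
    network distance. The weighting scheme is an assumption.
    """
    expr_dist = (1 - pearson(expression[gene_a], expression[gene_b])) / 2
    max_hops = len(adjacency) - 1
    hops = graph_distances(adjacency, gene_a).get(gene_b, max_hops)
    net_dist = hops / max_hops
    return weight * expr_dist + (1 - weight) * net_dist


# Hypothetical four-gene chain network and expression profiles.
adjacency = {"g1": ["g2"], "g2": ["g1", "g3"], "g3": ["g2", "g4"], "g4": ["g3"]}
expression = {
    "g1": [1, 2, 3, 4],
    "g2": [2, 4, 6, 8],
    "g3": [1, 3, 2, 4],
    "g4": [4, 3, 2, 1],
}

print(combined_distance("g1", "g2", expression, adjacency))
print(combined_distance("g1", "g4", expression, adjacency))
```

The resulting pairwise distances could then be passed to any standard hierarchical clustering routine.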

9.
An efficient two-step Markov blanket method for modeling and inferring complex regulatory networks from large-scale microarray data sets is presented. The inferred gene regulatory network (GRN) is based on time series gene expression data capturing the underlying gene interactions. To construct a highly accurate GRN, the proposed method performs: 1) discovery of a gene's Markov Blanket (MB), 2) formulation of a flexible measure to determine the network's quality, 3) efficient searching with the aid of a guided genetic algorithm, and 4) pruning to obtain a minimal set of correct interactions. Investigations are carried out using both synthetic and yeast cell cycle gene expression data sets. The realistic synthetic data sets validate the robustness of the method by varying topology, sample size, time delay, noise, vertex in-degree, and the presence of hidden nodes. It is shown that the proposed approach has excellent inferential capabilities and high accuracy even in the presence of noise. The gene network inferred from yeast cell cycle data is investigated for its biological relevance using well-known interactions, sequence analysis, motif patterns, and GO data. Further, novel interactions are predicted for the unknown genes of the network and their influence on other genes is also discussed.
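For readers unfamiliar with the term, the sketch below shows what a Markov blanket is in a directed acyclic graph: a node's parents, its children, and the other parents of its children. It reads the blanket off a known, hypothetical graph, whereas the method in the abstract has to discover it from expression data; the function name `markov_blanket` and the example graph are assumptions.

```python
def markov_blanket(parents, node):
    """Markov blanket of `node` in a DAG.

    `parents` maps every node to the set of its direct parents.
    The blanket is parents + children + the children's other parents.
    """
    children = {child for child, ps in parents.items() if node in ps}
    co_parents = {p for child in children for p in parents[child]}
    return (set(parents[node]) | children | co_parents) - {node}


# Hypothetical regulatory graph: A and B regulate C, C regulates D,
# and E is unconnected.
parents = {
    "A": set(),
    "B": set(),
    "C": {"A", "B"},
    "D": {"C"},
    "E": set(),
}

print(sorted(markov_blanket(parents, "A")))
print(sorted(markov_blanket(parents, "C")))
```

Gene B appears in the blanket of A only because both regulate C, which is why blanket discovery can surface co-regulators that are not directly linked.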

10.
11.
12.
The increasing interest in systems biology has resulted in extensive experimental data describing networks of interactions (or associations) between molecules in metabolism, protein-protein interactions and gene regulation. Comparative analysis of these networks is central to understanding biological systems. We report a novel method (PHUNKEE: Pairing subgrapHs Using NetworK Environment Equivalence) by which similar subgraphs in a pair of networks can be identified. Like other methods, PHUNKEE explicitly considers the graphical form of the data and allows for gaps. However, it is novel in that it includes information about the context of the subgraph within the adjacent network. We also explore a new approach to quantifying the statistical significance of matching subgraphs. We report similar subgraphs in metabolic pathways and in protein-protein interaction networks. The most similar metabolic subgraphs were generally found to occur in processes central to all life, such as purine, pyrimidine and amino acid metabolism. The most similar pairs of subgraphs found in the protein-protein interaction networks of Drosophila melanogaster and Saccharomyces cerevisiae also include central processes such as cell division but, interestingly, also include protein sub-networks involved in pre-mRNA processing. The inclusion of network context information in the comparison of protein interaction networks increased the number of similar subgraphs found consisting of proteins involved in the same functional process. This could have implications for the prediction of protein function.

13.
14.
15.
Information regarding gene coexpression is useful to predict gene function. Several databases have been constructed for gene coexpression in model organisms based on a large amount of publicly available gene expression data measured by GeneChip platforms. In these databases, Pearson's correlation coefficients (PCCs) of gene expression patterns are widely used as a measure of gene coexpression. Although the coexpression measure and the GeneChip summarization method affect the performance of a gene coexpression database, previous studies tested these calculation procedures with only a small number of samples and a particular species. To evaluate the effectiveness of coexpression measures, assessments with large-scale microarray data are required. We first examined the characteristics of PCC and found that the optimal PCC threshold to retrieve functionally related genes was affected by the method of gene expression database construction and the target gene function. In addition, we found that this problem could be overcome when we used correlation ranks instead of correlation values. This observation was evaluated using large-scale gene expression data for four species: Arabidopsis, human, mouse and rat.
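The difference between a correlation value and a correlation rank can be sketched as follows. The abstract does not define its rank measure, so `correlation_rank` (position of one gene in another gene's PCC-sorted list) and `mutual_rank` (geometric mean of the two directional ranks, one common way to symmetrize them) are assumptions, as is the toy data.

```python
import math


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / math.sqrt(var_a * var_b)


def correlation_rank(gene, other, expression):
    """Rank of `other` among all genes ordered by descending PCC with
    `gene` (1 = most strongly correlated)."""
    scores = {
        name: pearson(expression[gene], profile)
        for name, profile in expression.items()
        if name != gene
    }
    ordered = sorted(scores, key=scores.get, reverse=True)
    return ordered.index(other) + 1


def mutual_rank(gene_a, gene_b, expression):
    """Geometric mean of the two directional ranks (assumed symmetrization)."""
    rank_ab = correlation_rank(gene_a, gene_b, expression)
    rank_ba = correlation_rank(gene_b, gene_a, expression)
    return math.sqrt(rank_ab * rank_ba)


# Hypothetical expression profiles over five samples.
expression = {
    "g1": [1, 2, 3, 4, 5],
    "g2": [1, 2, 3, 5, 4],
    "g3": [2, 1, 4, 3, 5],
    "g4": [5, 4, 3, 2, 1],
}

print(correlation_rank("g1", "g3", expression))
print(correlation_rank("g3", "g1", expression))
print(mutual_rank("g1", "g3", expression))
```

Note that the rank is directional: g3 is only the second-best partner of g1, while g1 is the best partner of g3, even though the PCC between them is the same in both directions.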

16.
Advances in mass spectrometry, among other technologies, have allowed for quantitative, reproducible, proteome-wide measurements of levels of phosphorylation as signals propagate through complex networks in response to external stimuli under different conditions. However, computational approaches to infer elements of the signaling network strictly from the quantitative aspects of proteomics data are not well established. We considered a method using the principle of maximum entropy to infer a network of interacting phosphotyrosine sites from pairwise correlations in a mass spectrometry data set and derive a phosphorylation-dependent interaction network solely from quantitative proteomics data. We first investigated the applicability of this approach by using a simulation of a model biochemical signaling network whose dynamics are governed by a large set of coupled differential equations. We found that in a simulated signaling system, the method detects interactions with significant accuracy. We then analyzed a growth-factor-mediated signaling network in a human mammary epithelial cell line that we inferred from mass spectrometry data and observed a biologically interpretable, small-world structure of signaling nodes, as well as a catalog of predictions regarding the interactions among previously uncharacterized phosphotyrosine sites. For example, the calculation places a recently identified tumor suppressor pathway through ARHGEF7 and Scribble in the context of growth factor signaling. Our findings suggest that maximum-entropy-derived network models are an important tool for interpreting quantitative proteomics data.
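A minimal sketch of the idea of separating direct from indirect associations is given below. It assumes a Gaussian model: among all distributions with given means and covariances the Gaussian has maximum entropy, and its pairwise couplings are the off-diagonal entries of the inverse covariance matrix. Whether the study used exactly this formulation is not stated in the abstract; the function names and the three-site correlation matrix are hypothetical.

```python
import math


def invert(matrix):
    """Gauss-Jordan inversion with partial pivoting for a small square matrix."""
    n = len(matrix)
    aug = [
        list(row) + [1.0 if i == j else 0.0 for j in range(n)]
        for i, row in enumerate(matrix)
    ]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def direct_couplings(correlation):
    """Partial correlations derived from the inverse correlation matrix.

    A value near 0 means the observed correlation between two sites is
    explained by the other sites rather than by a direct coupling.
    """
    precision = invert(correlation)
    n = len(precision)
    return [
        [
            1.0 if i == j
            else -precision[i][j] / math.sqrt(precision[i][i] * precision[j][j])
            for j in range(n)
        ]
        for i in range(n)
    ]


# Hypothetical chain of three sites: 0 -- 1 -- 2. Sites 0 and 2 are
# correlated (0.25) only through site 1.
correlation = [
    [1.00, 0.50, 0.25],
    [0.50, 1.00, 0.50],
    [0.25, 0.50, 1.00],
]

for row in direct_couplings(correlation):
    print([round(v, 3) for v in row])
```

In this example the coupling between sites 0 and 2 vanishes, so a network built from the couplings would contain only the two direct links, whereas thresholding the raw correlations could add a spurious third one.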

17.
18.
19.
Discovering gene networks with a neural-genetic hybrid   Total citations: 1 (self-citations: 0, citations by others: 1)
Recent advances in biology (namely, DNA arrays) allow an unprecedented view of the biochemical mechanisms contained within a cell. However, this technology raises new challenges for computer scientists and biologists alike, as the data created by these arrays is often highly complex. One of the challenges is the elucidation of the regulatory connections and interactions between genes, proteins and other gene products. In this paper, a novel method is described for determining gene interactions in temporal gene expression data using genetic algorithms combined with a neural network component. Experiments conducted on real-world temporal gene expression data sets confirm that the approach is capable of finding gene networks that fit the data. A further repeated approach shows that those genes significantly involved in interaction with other genes can be highlighted and hypothetical gene networks and circuits proposed for further laboratory testing.

20.