Similar articles
 Found 20 similar articles (search time: 0 ms)
1.
Multiple kinase activities are required for skeletal muscle differentiation. However, the mechanisms by which these kinase pathways converge to coordinate the myogenic process are unknown. Using multiple phosphoprotein and phosphopeptide enrichment techniques, we obtained phosphopeptides from growing and differentiating C2C12 muscle cells and determined specific peptide sequences using LC-MS/MS. To place these phosphopeptides into a rational context, a bioinformatics approach was used. Phosphorylation sites were matched to known site-specific and non-site-specific kinase-substrate interactions, and then other substrates and upstream regulators of the implicated kinases were incorporated into a model network of protein-protein interactions. The model network implicated several kinases of known relevance to myogenesis, including AKT, GSK3, CDK5, p38, DYRK, and MAPKAPK2 kinases. This combination of proteomics and bioinformatics technologies should offer great utility as the volume of protein-protein and kinase-substrate information continues to increase.

2.

Background  

Cellular processes are controlled by gene-regulatory networks. Several computational methods are currently used to learn the structure of gene-regulatory networks from data. This study focusses on time series gene expression and gene knock-out data in order to identify the underlying network structure. We compare the performance of different network reconstruction methods using synthetic data generated from an ensemble of reference networks. Data requirements as well as optimal experiments for the reconstruction of gene-regulatory networks are investigated. Additionally, the impact of prior knowledge on network reconstruction as well as the effect of unobserved cellular processes is studied.

3.
4.
I present an algorithm that determines the longest path between every gene pair in an arbitrarily large genetic network from large scale gene perturbation data. The algorithm's computational complexity is O(nk^2), where n is the number of genes in the network and k is the average number of genes affected by a genetic perturbation. The algorithm is able to distinguish a large fraction of direct regulatory interactions from indirect interactions, even if the accuracy of its input data is substantially compromised.
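To illustrate how direct and indirect regulatory interactions can be separated from perturbation data, here is a minimal Python sketch. It is not the published algorithm (which works with longest paths); it is a naive transitive reduction that assumes an acyclic network and complete, noise-free accessibility lists. The function name `direct_edges` and the input layout are hypothetical.

```python
def direct_edges(accessibility):
    """Keep only the interactions that cannot be explained via an intermediate gene.

    accessibility: dict mapping each gene to the set of genes whose expression
    changes when that gene is perturbed (assumed complete and acyclic).
    """
    direct = {}
    for gene, affected in accessibility.items():
        # Collect everything reachable through some other affected gene.
        indirect = set()
        for mid in affected:
            indirect |= accessibility.get(mid, set())
        # An affected gene is a direct target if no intermediate reaches it.
        direct[gene] = affected - indirect - {gene}
    return direct


# Toy network A -> B -> C: perturbing A also affects C, but only indirectly.
toy = {"A": {"B", "C"}, "B": {"C"}, "C": set()}
result = direct_edges(toy)
```

For each of the n genes the loop unions up to k sets of about k elements, so this sketch also runs in roughly O(nk^2) time under the same definitions of n and k.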

5.

Background  

Modeling of gene expression data from time course experiments often involves the use of linear models such as those obtained from principal component analysis (PCA), independent component analysis (ICA), or other methods. Such methods do not generally yield factors with a clear biological interpretation. Moreover, implicit assumptions about the measurement errors often limit the application of these methods to log-transformed data, destroying linear structure in the untransformed expression data.

6.
Biochemical pathways such as metabolic, regulatory or signal transduction pathways can be viewed as interconnected processes forming an intricate network of functional and physical interactions between molecular species in the cell. The amount of information available on such pathways for different organisms is increasing very rapidly. This is offering the possibility of performing various analyses on the structure of the full network of pathways for one organism as well as across different organisms, and has therefore generated interest in developing databases for storing and managing this information. Analysing these networks remains far from straightforward owing to the nature of the databases, which are often heterogeneous, incomplete or inconsistent. Pathway analysis is hence a challenging problem in systems biology and in bioinformatics. Various forms of data models have been devised for the analysis of biochemical pathways. This paper presents an overview of the types of models used for this purpose, concentrating on those concerned with the structural aspects of biochemical networks. In particular, the different types of data models found in the literature are classified using a unified framework. In addition, how these models have been used in the analysis of biochemical networks is described. This enables us to underline the strengths and weaknesses of the different approaches, as well as to highlight relevant future research directions.

7.
8.
9.
On gene ranking using replicated microarray time course data   Total citations: 1, self-citations: 0, citations by others: 1
Tai YC, Speed TP. Biometrics 2009, 65(1):40-51
Summary. Consider the ranking of genes using data from replicated microarray time course experiments, where there are multiple biological conditions, and the genes of interest are those whose temporal profiles differ across conditions. We derive a multisample multivariate empirical Bayes' statistic for ranking genes in the order of differential expression, from both longitudinal and cross-sectional replicated developmental microarray time course data. Our longitudinal multisample model assumes that time course replicates are independent and identically distributed multivariate normal vectors. On the other hand, we construct a cross-sectional model using a normal regression framework with any appropriate basis for the design matrices. In both cases, we use natural conjugate priors in our empirical Bayes' setting which guarantee closed form solutions for the posterior odds. The simulations and two case studies using published worm and mouse microarray time course datasets indicate that the proposed approaches perform satisfactorily.

10.
An embryonic stem cell is a powerful tool for investigation of early development in vitro. The study of embryonic stem cell mediated neuronal differentiation allows for improved understanding of the mechanisms involved in embryonic neuronal development. We investigated expression profile changes using time course cDNA microarray to identify clues for the signaling network of neuronal differentiation. For the short time course microarray data, pattern analysis based on the quadratic regression method is an effective approach for identification and classification of a variety of expressed genes that have biological relevance. We studied the expression patterns, at each of 5 stages, after neuronal induction at the mRNA level of embryonic stem cells using the quadratic regression method for pattern analysis. As a result, a total of 316 genes (3.1%) including 166 (1.7%) informative genes in 8 possible expression patterns were identified by pattern analysis. Among the selected genes associated with the neurological system, all three genes showing a linearly increasing pattern over time, and one gene showing a decreasing pattern over time, were verified by RT-PCR. Therefore, an increase in gene expression over time, in a linear pattern, may be associated with embryonic development. The genes Tcfap2c, Ttr, Wnt3a, Btg2 and Foxk1, detected by pattern analysis and verified by RT-PCR simultaneously, may be candidate markers associated with the development of the nervous system. Our study shows that pattern analysis, using the quadratic regression method, is very useful for investigation of time course cDNA microarray data. The pattern analysis used in this study has biological significance for the study of embryonic stem cells.
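The idea of classifying a short time course by fitting a quadratic can be sketched in Python as follows. This is only an illustration under assumptions: the abstract does not give the classification criteria, so the fixed tolerance `tol`, the pattern labels, and the function names `fit_quadratic` and `classify_pattern` are all hypothetical, and a real analysis would more likely test coefficients for statistical significance.

```python
def fit_quadratic(times, values):
    """Least-squares fit of values ~ b0 + b1*t + b2*t**2; returns (b0, b1, b2).

    Needs at least three distinct time points, otherwise the system is singular.
    """
    # Build the 3x3 normal equations (X^T X) b = X^T y for the basis 1, t, t^2.
    rows = [[1.0, t, t * t] for t in times]
    a = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    b = [sum(r[i] * y for r, y in zip(rows, values)) for i in range(3)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(3):
            if r != col:
                f = a[r][col] / a[col][col]
                a[r] = [x - f * p for x, p in zip(a[r], a[col])]
                b[r] -= f * b[col]
    return tuple(b[i] / a[i][i] for i in range(3))


def classify_pattern(times, values, tol=0.05):
    """Label a profile by its fitted coefficients (illustrative thresholds only)."""
    _, b1, b2 = fit_quadratic(times, values)
    if abs(b2) > tol:
        return "quadratic-convex" if b2 > 0 else "quadratic-concave"
    if abs(b1) > tol:
        return "linear-increasing" if b1 > 0 else "linear-decreasing"
    return "flat"


stages = [0, 1, 2, 3, 4]  # five stages, as in the abstract
rising = classify_pattern(stages, [1, 2, 3, 4, 5])
falling = classify_pattern(stages, [5, 4, 3, 2, 1])
curved = classify_pattern(stages, [0, 1, 4, 9, 16])
```

The labels above do not correspond to the 8 expression patterns reported in the study; they only show how the sign and size of the linear and quadratic terms can separate profile shapes.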

11.
12.
T Jombart, R M Eggo, P J Dodd, F Balloux. Heredity 2011, 106(2):383-390
Epidemiology and public health planning will increasingly rely on the analysis of genetic sequence data. In particular, genetic data coupled with dates and locations of sampled isolates can be used to reconstruct the spatiotemporal dynamics of pathogens during outbreaks. Thus far, phylogenetic methods have been used to tackle this issue. Although these approaches have proved useful for informing on the spread of pathogens, they do not aim at directly reconstructing the underlying transmission tree. Instead, phylogenetic models infer most recent common ancestors between pairs of isolates, which can be inadequate for densely sampled recent outbreaks, where the sample includes ancestral and descendent isolates. In this paper, we introduce a novel method based on a graph approach to reconstruct transmission trees directly from genetic data. Using simulated data, we show that our approach can efficiently reconstruct genealogies of isolates in situations where classical phylogenetic approaches fail to do so. We then illustrate our method by analyzing data from the early stages of the swine-origin A/H1N1 influenza pandemic. Using 433 isolates sequenced at both the hemagglutinin and neuraminidase genes, we reconstruct the likely history of the worldwide spread of this new influenza strain. The presented methodology opens new perspectives for the analysis of genetic data in the context of disease outbreaks.
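A graph-based reconstruction of ancestries from dated sequences can be sketched in Python as below. This is a simplified greedy illustration, not necessarily the authors' procedure: it assumes equal-length aligned sequences, uses Hamming distance as the genetic distance, and picks for each isolate the closest isolate sampled strictly earlier. The names `hamming` and `infer_ancestries` and the toy data are hypothetical.

```python
def hamming(a, b):
    """Number of differing positions between two aligned sequences."""
    return sum(x != y for x, y in zip(a, b))


def infer_ancestries(isolates):
    """Assign each isolate its most plausible sampled ancestor.

    isolates: dict mapping name -> (collection_day, sequence).
    Returns a dict mapping name -> ancestor name, or None for isolates
    with no earlier sample.
    """
    ancestors = {}
    for name, (day, seq) in isolates.items():
        # Only isolates sampled strictly earlier can be ancestors.
        # Ties are broken by the most recent date, then by name.
        candidates = [
            (hamming(seq, s), -d, other)
            for other, (d, s) in isolates.items()
            if d < day
        ]
        ancestors[name] = min(candidates)[2] if candidates else None
    return ancestors


toy_isolates = {
    "i1": (0, "AAAA"),
    "i2": (3, "AAAT"),
    "i3": (5, "AATT"),
    "i4": (6, "CAAA"),
}
tree = infer_ancestries(toy_isolates)
```

Because ancestors must precede descendants in time, the resulting directed graph cannot contain cycles, which is why choosing the best incoming edge per isolate yields a tree in this simplified setting.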

13.
14.
15.
MOTIVATION: The issue of high dimensionality in microarray data has been, and remains, a hot topic in statistical and computational analysis. Efficient gene filtering and differentiation approaches can reduce the dimensions of data, help to remove redundant genes and noise, and highlight the most relevant genes that are major players in the development of certain diseases or the effect of drug treatment. The purpose of this study is to investigate the efficiency of parametric (including Bayesian and non-Bayesian, linear and non-linear), non-parametric and semi-parametric gene filtering methods through the application of time course microarray data from multiple sclerosis patients being treated with interferon-beta-1a. The analysis of variance with bootstrapping (parametric), class dispersion (semi-parametric) and Pareto (non-parametric) with permutation methods are presented and compared for filtering and finding differentially expressed genes. The Bayesian linear correlated model, the Bayesian non-linear model and the non-Bayesian mixed effects model with bootstrap were also developed to characterize the differential expression patterns. Furthermore, trajectory-clustering approaches were developed in order to investigate the dynamic patterns and inter-dependency of drug treatment effects on gene expression. RESULTS: Results show that the presented methods performed significantly differently but all were adequate in capturing a small number of the genes potentially relevant to the disease. The parametric methods, such as the mixed model and the two Bayesian approaches, proved to be more conservative. This may be because these methods are based on overall variation in expression across all time points.
The semi-parametric (class dispersion) and non-parametric (Pareto) methods were appropriate in capturing variation in expression from time point to time point, thereby making them more suitable for investigating significant monotonic changes and trajectories of changes in gene expression in time course microarray data. Also, the non-linear Bayesian model proved to be less conservative than the linear Bayesian correlated growth model in filtering out the redundant genes, although the linear model showed a better fit than the non-linear model (smaller DIC). We also report the trajectories of significant genes, since we have been able to isolate trajectories of genes whose regulations appear to be inter-dependent.

16.
A data-driven clustering method for time course gene expression data   Total citations: 1, self-citations: 0, citations by others: 1
Gene expression over time is, biologically, a continuous process and can thus be represented by a continuous function, i.e. a curve. Individual genes often share similar expression patterns (functional forms). However, the shape of each function, the number of such functions, and the genes that share similar functional forms are typically unknown. Here we introduce an approach that allows direct discovery of related patterns of gene expression and their underlying functions (curves) from data without a priori specification of either cluster number or functional form. Smoothing spline clustering (SSC) models natural properties of gene expression over time, taking into account natural differences in gene expression within a cluster of similarly expressed genes, the effects of experimental measurement error, and missing data. Furthermore, SSC provides a visual summary of each cluster's gene expression function and goodness-of-fit by way of a 'mean curve' construct and its associated confidence bands. We apply this method to gene expression data over the life-cycle of Drosophila melanogaster and Caenorhabditis elegans to discover 17 and 16 unique patterns of gene expression in each species, respectively. New and previously described expression patterns in both species are discovered, the majority of which are biologically meaningful and exhibit statistically significant gene function enrichment. Software and source code implementing the algorithm, SSClust, is freely available (http://genemerge.bioteam.net/SSClust.html).

17.
18.
During the past year, X-ray crystallographers and solution NMR spectroscopists have made significant progress towards the complete structural characterization of conserved biochemical pathways and processes. Some of these advances were made in the context of nascent structural genomics programs, which promise to accelerate structural studies of biologically and medically important proteins. The results of high-throughput protein production, crystallization, structure determination, homology modeling and functional annotation published by two such programs have provided insight into the evolution and function of enzymes in the isoprenoid biosynthesis and ribulose monophosphate pathways.

19.
Allozyme data are widely used to infer the phylogenies of populations and closely-related species. Numerous parsimony, distance, and likelihood methods have been proposed for phylogenetic analysis of these data; the relative merits of these methods have been debated vigorously, but their accuracy has not been well explored. In this study, I compare the performance of 13 phylogenetic methods (six parsimony, six distance, and continuous maximum likelihood) by applying a congruence approach to eight allozyme data sets from the literature. Clades are identified that are supported by multiple data sets other than allozymes (e.g. morphology, DNA sequences), and the ability of different methods to recover these 'known' clades is compared. The results suggest that (1) distance and likelihood methods generally outperform parsimony methods, (2) methods that utilize frequency data tend to perform well, and (3) continuous maximum likelihood is among the most accurate methods, and appears to be robust to violations of its assumptions. These results are in agreement with those from recent simulation studies, and help provide a basis for empirical workers to choose among the many methods available for analysing allozyme characters.

20.
Computer-aided synthesis of biochemical pathways   Total citations: 5, self-citations: 0, citations by others: 5
The synthesis of biochemical pathways satisfying stoichiometric constraints is discussed. Stoichiometric constraints arise primarily from designating compounds as required or allowed reactants, and required or allowed products of the pathways; they also arise from similar restrictions on intermediate metabolites and bioreactions participating in the pathways. An algorithm for the complete and correct solution of the problem is presented; the algorithm satisfies each constraint by recursively transforming a base-set of pathways. The algorithm is applied to the problem of lysine synthesis from glucose and ammonia. In addition to the established synthesis routes, the algorithm constructs several alternative pathways that bypass key enzymes, such as malate dehydrogenase and pyruvate dehydrogenase. Apart from the construction of pathways with desired characteristics, the systematic synthesis of pathways can also uncover fundamental constraints in a particular problem, by demonstrating that no pathways exist to meet certain sets of specifications. In the case of lysine, the algorithm shows that oxaloacetate is a necessary intermediate in all pathways leading to lysine from glucose, and that the yield of lysine over glucose cannot exceed 67% in the absence of enzymatic recovery of carbon dioxide.
