Similar articles (20 results)
1.

Background

Systems biology has embraced computational modeling in response to the quantitative nature and increasing scale of contemporary data sets. The onslaught of data is accelerating as molecular profiling technology evolves. The Dialogue for Reverse Engineering Assessments and Methods (DREAM) is a community effort to catalyze discussion about the design, application, and assessment of systems biology models through annual reverse-engineering challenges.

Methodology and Principal Findings

We describe our assessments of the four challenges associated with the third DREAM conference which came to be known as the DREAM3 challenges: signaling cascade identification, signaling response prediction, gene expression prediction, and the DREAM3 in silico network challenge. The challenges, based on anonymized data sets, tested participants in network inference and prediction of measurements. Forty teams submitted 413 predicted networks and measurement test sets. Overall, a handful of best-performer teams were identified, while a majority of teams made predictions that were equivalent to random. Counterintuitively, combining the predictions of multiple teams (including the weaker teams) can in some cases improve predictive power beyond that of any single method.
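The observation that pooled predictions can beat individual ones can be illustrated with a small sketch. The abstract does not say how predictions were combined, so the mean-rank rule below is an assumption, as are the function name `average_rank_consensus` and the toy edge lists. The sketch also assumes every team ranks the same set of edges.

```python
from collections import defaultdict


def average_rank_consensus(team_rankings):
    """Combine ranked edge lists (best edge first) by mean rank.

    Assumption: every team ranks the same set of candidate edges.
    """
    rank_sums = defaultdict(float)
    for ranking in team_rankings:
        for position, edge in enumerate(ranking, start=1):
            rank_sums[edge] += position
    n_teams = len(team_rankings)
    # Lower mean rank means stronger community support; ties break by edge name.
    return sorted(rank_sums, key=lambda edge: (rank_sums[edge] / n_teams, edge))


# Hypothetical rankings from three teams (not DREAM3 data).
teams = [
    ["G1->G2", "G2->G3", "G1->G3"],
    ["G2->G3", "G1->G2", "G1->G3"],
    ["G1->G2", "G1->G3", "G2->G3"],
]
consensus = average_rank_consensus(teams)
```

Averaging ranks rather than raw scores sidesteps the fact that different methods report confidences on incomparable scales.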

Conclusions

DREAM provides valuable feedback to practitioners of systems biology modeling. Lessons learned from the predictions of the community provide much-needed context for interpreting claims of efficacy of algorithms described in the scientific literature.

2.

Background

Reverse-engineering gene networks from expression profiles is a difficult problem for which a multitude of techniques have been developed over the last decade. The yearly organized DREAM challenges allow for a fair evaluation and unbiased comparison of these methods.

Results

We propose an inference algorithm that combines confidence matrices, computed as the standard scores from single-gene knockout data, with the down-ranking of feed-forward edges. Substantial improvements on the predictions can be obtained after the execution of this second step.
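The two steps can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' algorithm: the deviation is measured from the wild-type level and scaled by the per-gene standard deviation across knockouts, and an edge i->j is down-ranked by a fixed `penalty` whenever some intermediate k has both i->k and k->j above `threshold`. The function names, the threshold, the penalty, and the toy data are all hypothetical.

```python
from statistics import pstdev


def knockout_confidence(wild_type, knockouts):
    """Score edge i->j by how strongly knocking out gene i shifts gene j.

    knockouts[i][j] is the steady-state level of gene j when gene i is deleted.
    Assumption: deviation from wild type, divided by the standard deviation of
    gene j across all knockouts of other genes.
    """
    n = len(wild_type)
    conf = [[0.0] * n for _ in range(n)]
    for j in range(n):
        column = [knockouts[i][j] for i in range(n) if i != j]
        sigma = pstdev(column)
        if sigma == 0:
            continue  # gene j never responds, so no evidence for any regulator
        for i in range(n):
            if i != j:
                conf[i][j] = abs(knockouts[i][j] - wild_type[j]) / sigma
    return conf


def downrank_feed_forward(conf, threshold, penalty=0.5):
    """Penalize i->j when a confident two-step path i->k->j could explain it."""
    n = len(conf)
    adjusted = [row[:] for row in conf]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            for k in range(n):
                if k != i and k != j and conf[i][k] >= threshold and conf[k][j] >= threshold:
                    adjusted[i][j] = conf[i][j] * penalty
                    break
    return adjusted


# Toy chain 0 -> 1 -> 2 with an unconnected gene 3 (made-up numbers).
WT = [1.0, 1.0, 1.0, 1.0]
KO = [
    [0.0, 0.2, 0.2, 1.0],
    [1.0, 0.0, 0.1, 1.0],
    [1.0, 1.0, 0.0, 1.0],
    [1.0, 1.0, 1.0, 0.0],
]
conf = knockout_confidence(WT, KO)
adjusted = downrank_feed_forward(conf, threshold=1.5)
```

In the toy data, knocking out gene 0 also shifts gene 2, but only through gene 1, so the second step lowers the score of the indirect edge 0->2 while leaving 0->1 and 1->2 untouched.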

Conclusions

Our algorithm was awarded the best overall performance at the DREAM4 In Silico 100-gene network sub-challenge, proving to be effective in inferring medium-size gene regulatory networks. This success demonstrates once again the decisive importance of gene expression data obtained after systematic gene perturbations and highlights the usefulness of graph analysis to increase the reliability of inference.

3.

Background

Prediction of gene regulatory networks (GRNs) from expression data is a challenging task. Many methods have been developed to address this challenge, ranging from supervised to unsupervised. Among the most promising are methods based on the support vector machine (SVM). There is a need for a comprehensive analysis of the prediction accuracy of the supervised SVM method using different kernels under different biological experimental conditions and network sizes.

Results

We developed a tool (CompareSVM) based on SVM to compare different kernel methods for inference of GRNs. Using CompareSVM, we investigated and evaluated in detail different SVM kernel methods on simulated microarray datasets of different sizes. The results obtained from CompareSVM showed that the accuracy of an inference method depends on the nature of the experimental condition and the size of the network.

Conclusions

For networks with fewer than 200 nodes, and on average over all network sizes, the SVM Gaussian kernel outperforms all the other inference methods on knockout, knockdown, and multifactorial datasets. For networks with a large number of nodes (~500), the choice of inference method depends on the nature of the experimental condition. CompareSVM is available at http://bis.zju.edu.cn/CompareSVM/.

Electronic supplementary material

The online version of this article (doi:10.1186/s12859-014-0395-x) contains supplementary material, which is available to authorized users.

4.
5.
The Dialogue for Reverse Engineering Assessments and Methods (DREAM) project was initiated in 2006 as a community-wide effort for the development of network inference challenges for rigorous assessment of reverse engineering methods for biological networks. We participated in the in silico network inference challenge of DREAM3 in 2008. Here we report the details of our approach and its performance on the synthetic challenge datasets. In our methodology, we first developed a model called relative change ratio (RCR), which took advantage of the heterozygous knockdown data and null-mutant knockout data provided by the challenge, in order to identify the potential regulators for the genes. With this information, a time-delayed dynamic Bayesian network (TDBN) approach was then used to infer gene regulatory networks from time series trajectory datasets. Our approach considerably reduced the search space of TDBN; hence, it gained a much higher efficiency and accuracy. The networks predicted using our approach were evaluated comparatively along with 29 other submissions by two metrics (area under the ROC curve and area under the precision-recall curve). The overall performance of our approach ranked second among all participating teams.
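The two evaluation metrics named above can be sketched for a ranked list of predicted edges. This is an illustration rather than the DREAM scoring code: the area under the ROC curve is computed from pairwise comparisons, and the area under the precision-recall curve is approximated by average precision, which is an assumption about the interpolation. The function names and example scores are hypothetical.

```python
def auroc(scores, labels):
    """Probability that a true edge outscores a false one (ties count half)."""
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    if not pos or not neg:
        raise ValueError("need at least one true and one false edge")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(scores, labels):
    """Mean of the precision values at the rank of each true edge."""
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    hits = 0
    precision_sum = 0.0
    for rank, (_, is_true) in enumerate(ranked, start=1):
        if is_true:
            hits += 1
            precision_sum += hits / rank
    if hits == 0:
        raise ValueError("need at least one true edge")
    return precision_sum / hits


# Hypothetical confidences for four candidate edges; 1 marks a true edge.
edge_scores = [0.9, 0.8, 0.7, 0.6]
edge_truth = [1, 0, 1, 0]
roc_area = auroc(edge_scores, edge_truth)
pr_area = average_precision(edge_scores, edge_truth)
```

Gene networks are sparse, so true edges are rare among candidates; the precision-recall view is usually the more demanding of the two in that setting.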

6.

Background

Researchers sorely need markers and approaches for biodiversity exploration (both specimen linked and metagenomics) using the full potential of next generation sequencing technologies (NGST). Currently, most studies rely on expensive multiple tagging, PCR primer universality and/or the use of few markers, sometimes with insufficient variability.

Methodology/Principal Findings

We propose a novel approach for the isolation and sequencing of a universal, useful and popular marker across distant, non-model metazoans: the complete mitochondrial genome. It relies on the properties of metazoan mitogenomes for enrichment, on careful choice of the organisms to multiplex, as well as on the wide collection of accumulated mitochondrial reference datasets for post-sequencing sorting and identification instead of individual tagging. Multiple divergent organisms can be sequenced simultaneously, and their complete mitogenome obtained at a very low cost. We provide in silico testing of dataset assembly for a selected set of example datasets.

Conclusions/Significance

This approach generates large mitogenome datasets. These sequences are useful for phylogenetics, molecular identification and molecular ecology studies, and are compatible with all existing projects or available datasets based on mitochondrial sequences, such as the Barcode of Life project. Our method can yield sequences both from identified samples and metagenomic samples. The use of the same datasets for both kinds of studies makes for a powerful approach, especially since the datasets have a high variability even at species level, and would be a useful complement to the less variable 18S rDNA currently prevailing in metagenomic studies.

7.

Background

Current technologies have led to the availability of multiple genomic data types in sufficient quantity and quality to serve as a basis for automatic global network inference. Accordingly, there is currently a large variety of network inference methods that learn regulatory networks to varying degrees of detail. These methods have different strengths and weaknesses and thus can be complementary. However, combining different methods in a mutually reinforcing manner remains a challenge.

Methodology

We investigate how three scalable methods can be combined into a useful network inference pipeline. The first is a novel t-test-based method that relies on a comprehensive steady-state knock-out dataset to rank regulatory interactions. The remaining two are previously published mutual information and ordinary differential equation based methods (tlCLR and Inferelator 1.0, respectively) that use both time-series and steady-state data to rank regulatory interactions; the latter has the added advantage of also inferring dynamic models of gene regulation which can be used to predict the system's response to new perturbations.

Conclusion/Significance

Our t-test-based method proved powerful at ranking regulatory interactions, tying for first place among the methods in the DREAM4 100-gene in silico network inference challenge. We demonstrate complementarity between this method and the two methods that take advantage of time-series data by combining the three into a pipeline whose ability to rank regulatory interactions is markedly improved compared to any of the methods alone. Moreover, the pipeline is able to accurately predict the response of the system to new conditions (in this case new double knock-out genetic perturbations). Our evaluation of the performance of multiple methods for network inference suggests avenues for future methods development and provides simple considerations for genomic experimental design. Our code is publicly available at http://err.bio.nyu.edu/inferelator/.

8.

Background and Aims

For 84 years, botanists have relied on calculating the highest common factor for series of haploid chromosome numbers to arrive at a so-called basic number, x. This was done without consistent (reproducible) reference to species relationships and frequencies of different numbers in a clade. Likelihood models that treat polyploidy, chromosome fusion and fission as events with particular probabilities now allow reconstruction of ancestral chromosome numbers in an explicit framework. We have used a modelling approach to reconstruct chromosome number change in the large monocot family Araceae and to test earlier hypotheses about basic numbers in the family.
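The algebraic approach described above amounts to taking the greatest common divisor of the observed haploid numbers. The short sketch below shows that calculation and why it drifts toward small values; the function name `basic_number` and the chromosome counts are made up for illustration and are not Araceae data.

```python
from functools import reduce
from math import gcd


def basic_number(haploid_numbers):
    """Highest common factor of a series of haploid chromosome numbers."""
    return reduce(gcd, haploid_numbers)


# Hypothetical clade: every count is a multiple of 14.
x_clean = basic_number([14, 28, 42])

# A single species with n = 21 halves the inferred basic number.
x_with_outlier = basic_number([14, 28, 42, 21])
```

Because the highest common factor can only stay the same or shrink as counts are added, one unusual count anywhere in the clade lowers x. The calculation also ignores how the species are related and how frequent each number is.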

Methods

Using a maximum likelihood approach and chromosome counts for 26 % of the 3300 species of Araceae and representative numbers for each of the other 13 families of Alismatales, polyploidization events and single chromosome changes were inferred on a genus-level phylogenetic tree for 113 of the 117 genera of Araceae.

Key Results

The previously inferred basic numbers x = 14 and x = 7 are rejected. Instead, maximum likelihood optimization revealed an ancestral haploid chromosome number of n = 16, Bayesian inference of n = 18. Chromosome fusion (loss) is the predominant inferred event, whereas polyploidization events occurred less frequently and mainly towards the tips of the tree.

Conclusions

The bias towards low basic numbers (x) introduced by the algebraic approach to inferring chromosome number changes, prevalent among botanists, may have contributed to an unrealistic picture of ancestral chromosome numbers in many plant clades. The availability of robust quantitative methods for reconstructing ancestral chromosome numbers on molecular phylogenetic trees (with or without branch length information), with confidence statistics, makes the calculation of x an obsolete approach, at least when applied to large clades.

9.

Background

GP-BAR1, a member of the G protein-coupled receptor superfamily, is a cell surface bile acid-activated receptor highly expressed in the ileum and colon. In monocytes, ligation of GP-BAR1 by secondary bile acids results in a cAMP-dependent attenuation of cytokine generation.

Aims

To investigate the role of GP-BAR1 in regulating intestinal homeostasis and inflammation-driven immune dysfunction in rodent models of colitis.

Methods

Colitis was induced in wild type and GP-BAR1−/− mice by DSS and TNBS administration. Potential GP-BAR1 agonists were identified by in silico screening and computational docking studies.

Results

GP-BAR1−/− mice develop an abnormal morphology of colonic mucous cells and an altered molecular architecture of epithelial tight junctions, with increased expression and abnormal subcellular distribution of zonulin 1, resulting in increased intestinal permeability and susceptibility to severe colitis in response to DSS at an early stage of life. By in silico screening and docking studies we identified ciprofloxacin as a GP-BAR1 ligand. In monocytes, ciprofloxacin increases cAMP concentrations and attenuates TNFα release induced by TLR4 ligation in a GP-BAR1-dependent manner. Treating mice rendered colitic by TNBS with ciprofloxacin and oleanolic acid, a well-characterized GP-BAR1 ligand, abrogates signs and symptoms of colitis. Colonic expression of GP-BAR1 mRNA increases in rodent models of colitis and in tissues from Crohn's disease patients. Flow cytometry analysis demonstrates that ≈90% of CD14+ cells isolated from the lamina propria of TNBS-treated mice stained positively for GP-BAR1.

Conclusions

GP-BAR1 regulates intestinal barrier structure. Its expression increases in rodent models of colitis and Crohn's disease. Ciprofloxacin is a GP-BAR1 ligand.

10.

Background and Aims

Laeliinae are a neotropical orchid subtribe with approx. 1500 species in 50 genera. In this study, an attempt is made to assess generic alliances based on molecular phylogenetic analysis of DNA sequence data.

Methods

Six DNA datasets were gathered: plastid trnL intron, trnL-F spacer, matK gene and trnK introns upstream and downstream from matK and nuclear ITS rDNA. Data were analysed with maximum parsimony (MP) and Bayesian analysis with mixed models (BA).

Key Results

Although relationships between Laeliinae and outgroups are well supported, within the subtribe sequence variation is low considering the broad taxonomic range covered. Localized incongruence between the ITS and plastid trees was found. A combined tree followed the ITS trees more closely, but the levels of support obtained with MP were low. The Bayesian analysis recovered more well-supported nodes. The trees from combined MP and BA allowed eight generic alliances to be recognized within Laeliinae, all of which show trends in morphological characters but lack unambiguous synapomorphies.

Conclusions

By using combined plastid and nuclear DNA data in conjunction with mixed-models Bayesian inference, it is possible to delimit smaller groups within Laeliinae and discuss general patterns of pollination and hybridization compatibility. Furthermore, these small groups can now be used for further detailed studies to explain morphological evolution and diversification patterns within the subtribe.

Key words: Laeliinae, Orchidaceae, ITS, trnL intron, trnL-F spacer, matK

11.

Background

There exist several computational tools which allow for the optimisation and inference of biological networks using a Boolean formalism. Nevertheless, the results from such tools yield only limited quantitative insights into the complexity of biological systems because of the inherited qualitative nature of Boolean networks.

Results

We introduce optPBN, a Matlab-based toolbox for the optimisation of probabilistic Boolean networks (PBN) which operates under the framework of the BN/PBN toolbox. optPBN offers an easy generation of probabilistic Boolean networks from rule-based Boolean model specification and it allows for flexible measurement data integration from multiple experiments. Subsequently, optPBN generates integrated optimisation problems which can be solved by various optimisers. In terms of functionality, optPBN allows for the construction of a probabilistic Boolean network from a given set of potential constitutive Boolean networks by optimising the selection probabilities for these networks so that the resulting PBN fits experimental data. Furthermore, the optPBN pipeline can also be operated on large-scale computational platforms to solve complex optimisation problems. Apart from exemplary case studies in which we correctly inferred the original network, we also successfully applied optPBN to study a large-scale Boolean model of apoptosis, where it allowed us to quantitatively identify the inverse correlation between UVB irradiation, NFκB and Caspase 3 activations, and apoptosis in primary hepatocytes. Also, the results from optPBN help elucidate the relevance of crosstalk interactions in the apoptotic network.

Summary

The optPBN toolbox provides a simple yet comprehensive pipeline for integrated optimisation problem generation in the PBN formalism that can readily be solved by various optimisers on local or grid-based computational platforms. optPBN can be further applied to various biological studies such as the inference of gene regulatory networks or the identification of the relevance of interactions in signal transduction networks.

12.
13.

Background

An increasing number of eQTL (expression quantitative trait loci) datasets facilitates genetics and systems biology research. Meta-analysis tools are needed to jointly analyze datasets of the same or similar tissue types to improve statistical power, especially in trans-eQTL mapping. A meta-analysis framework is also necessary for chrX eQTL discovery.

Results

We developed a novel tool, meta-eqtl, for fast eQTL meta-analysis of arbitrary sample size and an arbitrary number of datasets. Further, this tool accommodates versatile modeling, e.g., non-parametric models and mixed-effect models. In addition, meta-eqtl readily handles calculation of chrX eQTLs.

Conclusions

We demonstrated and validated meta-eqtl as a fast and comprehensive tool for meta-analysis of multiple datasets and for chrX eQTL discovery. Meta-eqtl is a set of command line utilities written in R, with some computationally intensive parts written in C. The software runs on Linux platforms and is designed to intelligently adapt to high performance computing (HPC) clusters. We applied the novel tool to liver and adipose tissue data, and revealed eSNPs underlying diabetes GWAS loci.

14.

Background

Optimal selection of multiple regulatory genes, known as targets, for deletion to enhance or suppress the activities of downstream genes or metabolites is an important problem in genetic engineering. Such problems become more feasible to address in silico due to the availability of more realistic dynamical system models of gene regulatory and metabolic networks. The goal of the computational problem is to search for a subset of genes to knock out so that the activity of a downstream gene or a metabolite is optimized.

Methodology/Principal Findings

Based on discrete dynamical system modeling of gene regulatory networks, an integer programming problem is formulated for the optimal in silico target gene deletion problem. In the first result, the integer programming problem is proved to be NP-hard and equivalent to a nonlinear programming problem. In the second result, a heuristic algorithm, called GKONP, is designed to approximate the optimal solution, involving an approach to prune insignificant terms in the objective function, and the parallel differential evolution algorithm. In the third result, the effectiveness of the GKONP algorithm is demonstrated by applying it to a discrete dynamical system model of the yeast pheromone pathways. The empirical accuracy and time efficiency are assessed in comparison to an optimal, but exhaustive search strategy.
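The exhaustive baseline mentioned above can be sketched on a toy synchronous Boolean network. This is not GKONP and not the yeast pheromone model: the four-gene rules, the gene names, the fixed-point simulation and the binary objective are all hypothetical, chosen only to show why the search space grows combinatorially with the number of candidate genes.

```python
from itertools import combinations


def simulate(rules, initial, knocked_out, max_steps=20):
    """Run synchronous updates until a fixed point or max_steps is reached."""
    state = {g: (False if g in knocked_out else v) for g, v in initial.items()}
    for _ in range(max_steps):
        nxt = {g: (False if g in knocked_out else rule(state))
               for g, rule in rules.items()}
        if nxt == state:
            break
        state = nxt
    return state


def best_knockout(rules, initial, candidates, target, max_size):
    """Try every deletion set up to max_size; keep the smallest best one."""
    best = None
    for size in range(max_size + 1):
        for subset in combinations(candidates, size):
            final = simulate(rules, initial, set(subset))
            score = int(final[target])  # 1 if the target ends up active
            if best is None or score > best[0]:
                best = (score, subset)
    return best


# Hypothetical network: T is active only when both A and B are off.
RULES = {
    "A": lambda s: s["A"],
    "B": lambda s: s["B"],
    "C": lambda s: s["A"],
    "T": lambda s: not s["A"] and not s["B"],
}
INITIAL = {"A": True, "B": True, "C": True, "T": False}
result = best_knockout(RULES, INITIAL, ["A", "B", "C"], "T", max_size=2)
```

Here no single deletion activates T, so the search has to reach pairs before it finds the deletion of A and B together. With m candidate genes the loop visits every subset up to the size limit, which is the cost a heuristic is meant to avoid.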

Significance

Although the in silico target gene deletion problem has enormous potential applications in genetic engineering, one must overcome the computational challenge due to its NP-hardness. The presented solution, which has been demonstrated to approximate the optimal solution in a practical amount of time, is among the few that address the computational challenge. In the experiment on the yeast pheromone pathways, the identified best subset of genes for deletion showed advantage over genes that were selected empirically. Once validated in vivo, the optimal target genes are expected to achieve higher genetic engineering effectiveness than a trial-and-error procedure.

15.
16.

Background

In silico models have recently been created in order to predict which genetic variants are more likely to contribute to the risk of a complex trait given their functional characteristics. However, there has been no comprehensive review as to which type of predictive accuracy measures and data visualization techniques are most useful for assessing these models.

Methods

We assessed the performance of the models for predicting risk using various methodologies, some of which include: receiver operating characteristic (ROC) curves, histograms of classification probability, and the novel use of the quantile-quantile plot. These measures have variable interpretability depending on factors such as whether the dataset is balanced in terms of numbers of genetic variants classified as risk variants versus those that are not.

Results

We conclude that the area under the curve (AUC) is a suitable starting place, and for models with similar AUCs, violin plots are particularly useful for examining the distribution of the risk scores.

Electronic supplementary material

The online version of this article (doi:10.1186/s12864-015-1616-z) contains supplementary material, which is available to authorized users.

17.

Background

The transmission networks of Plasmodium vivax characterize how the parasite transmits from one location to another, which are informative and insightful for public health policy makers to accurately predict the patterns of its geographical spread. However, such networks are not apparent from surveillance data because P. vivax transmission can be affected by many factors, such as the biological characteristics of mosquitoes and the mobility of human beings. Here, we pay special attention to the problem of how to infer the underlying transmission networks of P. vivax based on available tempo-spatial patterns of reported cases.

Methodology

We first define a spatial transmission model, which involves representing both the heterogeneous transmission potential of P. vivax at individual locations and the mobility of infected populations among different locations. Based on the proposed transmission model, we further introduce a recurrent neural network model to infer the transmission networks from surveillance data. Specifically, in this model, we take into account multiple real-world factors, including the length of P. vivax incubation period, the impact of malaria control at different locations, and the total number of imported cases.

Principal Findings

We implement our proposed models by focusing on the P. vivax transmission among 62 towns in Yunnan province, People's Republic of China, which have been experiencing high malaria transmission in the past years. By conducting scenario analysis with respect to different numbers of imported cases, we can (i) infer the underlying P. vivax transmission networks, (ii) estimate the number of imported cases for each individual town, and (iii) quantify the roles of individual towns in the geographical spread of P. vivax.

Conclusion

The demonstrated models have presented a general means for inferring the underlying transmission networks from surveillance data. The inferred networks will offer new insights into how to improve the predictability of P. vivax transmission.

18.

Background

Current methods for haplotype inference without pedigree information assume random mating populations. In animal and plant breeding, however, mating is often not random. A particular form of nonrandom mating occurs when parental individuals of opposite sex originate from distinct populations. In animal breeding this is called crossbreeding, and in plant breeding, hybridization. In these situations, the association between marker and putative gene alleles might differ between the founding populations, and the origin of alleles should be accounted for in studies that estimate breeding values with marker data. The sequence of alleles from one parent constitutes one haplotype of an individual. Haplotypes thus reveal allele origin in data of crossbred individuals.

Results

We introduce a new method for haplotype inference without pedigree that allows nonrandom mating and that can use genotype data of the parental populations and of a crossbred population. The aim of the method is to estimate line origin of alleles. The method has a Bayesian set up with a Dirichlet Process as prior for the haplotypes in the two parental populations. The basic idea is that only a subset of the complete set of possible haplotypes is present in the population.

Conclusion

Line origin of approximately 95% of the alleles at heterozygous sites was assessed correctly in both simulated and real data. Comparing accuracy of haplotype frequencies inferred with the new algorithm to the accuracy of haplotype frequencies inferred with PHASE, an existing algorithm for haplotype inference, showed that the DP algorithm outperformed PHASE in situations of crossbreeding and that PHASE performed better in situations of random mating.

19.

Background

Treatments designed to correct cystic fibrosis transmembrane conductance regulator (CFTR) defects must first be evaluated in preclinical experiments in the mouse model of cystic fibrosis (CF). The mouse nasal mucosa mimics the bioelectric defect seen in humans. The use of nasal potential difference (VTE) to assess ionic transport is a powerful test for evaluating the restoration of CFTR function. Nasal VTE in CF mice must be well characterized for correct interpretation.

Methods

We performed VTE measurements in large-scale studies of two mouse models of CF—B6;129 cftr knockout and FVB F508del-CFTR—and their respective wild-type (WT) littermates. We assessed the repeatability of the test for cftr knockout mice and defined cutoff points distinguishing between WT and F508del-CFTR mice.

Results

We determined the typical VTE values for CF and WT mice and demonstrated the existence of residual CFTR activity in F508del-CFTR mice. We characterized intra-animal variability in B6;129 mice and defined the cutoff points for F508del-CFTR chloride secretion rescue. Hyperpolarization of more than -2.15 mV after perfusion with a low-concentration Cl- solution was considered to indicate a normal response.

Conclusions

These data will make it possible to interpret changes in nasal VTE in mouse models of CF, in future preclinical studies.

20.