Similar articles
 20 similar articles found (search time: 31 ms)
1.

Background

Horizontal gene transfer (HGT), the acquisition and fixation of foreign genetic material, is an important biological phenomenon. Several approaches to HGT inference have been proposed. However, most of them rely either on approximate, non-phylogenetic methods or on tree reconciliation, which is computationally intensive and sensitive to parameter values.

Results

We investigate the locus tree inference problem as a possible alternative that combines the advantages of both approaches. We present several algorithms to solve the problem in the parsimony framework. We introduce a novel tree mapping, which allows us to obtain a heuristic solution to the problems of locus tree inference and duplication classification.

Conclusions

Our approach allows for faster comparisons of gene and species trees and improves known algorithms for duplication inference in the presence of polytomies in the species trees. We have implemented our algorithms in a software tool available at https://github.com/mciach/LocusTreeInference.
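As an illustration of how gene and species trees are compared in the parsimony framework, the following is a minimal sketch of the classical lowest-common-ancestor (LCA) mapping used to flag duplication nodes. It is not the novel tree mapping introduced in the abstract above, and it is not taken from the LocusTreeInference tool. The nested-tuple tree encoding, the function names, and the convention that gene leaves carry the name of their species are all assumptions made for this example.

```python
def leaf_set(tree):
    """Return the set of leaf labels below a node (a leaf is a plain string)."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaf_set(child) for child in tree))


def species_clusters(tree):
    """Collect the leaf set of every node of the species tree."""
    result = [leaf_set(tree)]
    if not isinstance(tree, str):
        for child in tree:
            result.extend(species_clusters(child))
    return result


def lca_map(gene_node, clusters):
    """Map a gene node to the smallest species cluster containing its species."""
    species = leaf_set(gene_node)
    return min((c for c in clusters if species <= c), key=len)


def count_duplications(gene_tree, species_tree):
    """Count gene nodes that map to the same species node as one of their children."""
    clusters = species_clusters(species_tree)

    def visit(node):
        if isinstance(node, str):
            return 0
        here = lca_map(node, clusters)
        is_duplication = any(lca_map(child, clusters) == here for child in node)
        return int(is_duplication) + sum(visit(child) for child in node)

    return visit(gene_tree)


# Hypothetical toy input: species a, b, c and a gene family with two copies in a.
species_tree = (("a", "b"), "c")
gene_tree = (("a", "b"), ("a", "c"))
duplications = count_duplications(gene_tree, species_tree)
```

In this toy input, only the root of the gene tree maps to the same species node as one of its children, so a single duplication is inferred. The cluster-based LCA lookup is chosen for readability; it is quadratic in tree size and would need a proper LCA data structure for large trees.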

2.

Background

Somatic copy number alterations (SCNAs) can be used to infer tumor subclonal populations in whole-genome sequencing studies, where the read count ratios between paired tumor and normal samples usually serve as the proxy for inference. Existing SCNA-based tools for subclonal population inference assume that the GC bias of the tumor and normal samples has the same character and is fully offset by the read count ratio. However, we found that the read count ratio on SCNA segments shows a log-linear bias pattern, which affects the performance of existing tools based on read count ratios. Currently, no correction tool takes this read ratio bias into account.

Results

We present Pre-SCNAClonal, a tool that improves tumor subclonal population inference by correcting GC bias at the SCNA level. Pre-SCNAClonal first corrects GC bias using a Markov chain Monte Carlo probability model, then accurately locates baseline DNA segments (those not containing any SCNAs) with a hierarchical clustering model. We show that Pre-SCNAClonal outperforms existing GC-bias correction methods at any level of subclonal population.

Conclusions

Pre-SCNAClonal can be run independently or serve as a pre-processing/GC-correction step in conjunction with existing SCNA-based subclonal inference tools.

3.

Background

Most studies inferring species phylogenies use sequences from single copy genes or sets of orthologs culled from gene families. For taxa such as plants, with very high levels of gene duplication in their nuclear genomes, this has limited the exploitation of nuclear sequences for phylogenetic studies, such as those available in large EST libraries. One rarely used method of inference, gene tree parsimony, can infer species trees from gene families undergoing duplication and loss, but its performance has not been evaluated at a phylogenomic scale for EST data in plants.

Results

A gene tree parsimony analysis based on EST data was undertaken for six angiosperm model species and Pinus, an outgroup. Although a large fraction of the tentative consensus sequences obtained from the TIGR database of ESTs was assembled into homologous clusters too small to be phylogenetically informative, some 557 clusters contained promising levels of information. Based on maximum likelihood estimates of the gene trees obtained from these clusters, gene tree parsimony correctly inferred the accepted species tree with strong statistical support. A slight variant of this species tree was obtained when maximum parsimony was used to infer the individual gene trees instead.

Conclusion

Despite the complexity of the EST data and the relatively small fraction eventually used in inferring a species tree, the gene tree parsimony method performed well in the face of very high apparent rates of duplication.

4.

Background

A challenging problem in current systems biology is that of parameter inference in biological pathways expressed as coupled ordinary differential equations (ODEs). Conventional methods that repeatedly numerically solve the ODEs have large associated computational costs. Aimed at reducing this cost, new concepts using gradient matching have been proposed, which bypass the need for numerical integration. This paper presents a recently established adaptive gradient matching approach, using Gaussian processes (GPs), combined with a parallel tempering scheme, and conducts a comparative evaluation with current state-of-the-art methods used for parameter inference in ODEs. Among these contemporary methods is a technique based on reproducing kernel Hilbert spaces (RKHS). This has previously shown promising results for parameter estimation, but under lax experimental settings. We look at a range of scenarios to test the robustness of this method. We also change the approach of inferring the penalty parameter from AIC to cross validation to improve the stability of the method.

Methods

Methodology for the recently proposed adaptive gradient matching method using GPs, upon which we build our new method, is provided. Details of a competing method using RKHS are also described here.

Results

We conduct a comparative analysis for the methods described in this paper, using two benchmark ODE systems. The analyses are repeated under different experimental settings, to observe the sensitivity of the techniques.

Conclusions

Our study reveals that for known noise variance, our proposed method based on GPs and parallel tempering achieves overall the best performance. When the noise variance is unknown, the RKHS method proves to be more robust.
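The core idea of gradient matching, estimating ODE parameters without numerically solving the ODE, can be sketched on a toy problem. The example below is a deliberately simplified stand-in: it replaces the Gaussian process interpolant of the abstract with central finite differences, omits parallel tempering entirely, and uses a hypothetical one-parameter decay model dx/dt = -theta * x with noise-free data. All names and values are assumptions for illustration.

```python
import math


def central_differences(times, values):
    """Approximate dx/dt at the interior time points by central differences."""
    return [
        (values[i + 1] - values[i - 1]) / (times[i + 1] - times[i - 1])
        for i in range(1, len(times) - 1)
    ]


def fit_decay_rate(times, values):
    """Least-squares estimate of theta in dx/dt = -theta * x.

    The slopes estimated from the data are matched against the right-hand
    side of the ODE, so the ODE itself is never integrated.
    """
    slopes = central_differences(times, values)
    states = values[1:-1]  # states at the same interior time points
    numerator = sum(s * x for s, x in zip(slopes, states))
    denominator = sum(x * x for x in states)
    return -numerator / denominator


# Synthetic, noise-free observations of x(t) = exp(-1.5 t) on a fine grid.
times = [0.01 * i for i in range(201)]
values = [math.exp(-1.5 * t) for t in times]
theta_hat = fit_decay_rate(times, values)
```

With noisy data, finite differences amplify the noise, which is one reason the methods compared above rely on smooth interpolants such as GPs or RKHS functions instead.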

5.

Background

Inference of gene networks from expression data is an important problem in computational biology. Many algorithms have been proposed for solving the problem efficiently. However, many of the available implementations are programming libraries that require users to write code, which limits their accessibility.

Results

We have developed a tool called CyNetworkBMA for inferring gene networks from expression data that integrates with Cytoscape. Our application offers a graphical user interface for networkBMA, an efficient implementation of Bayesian Model Averaging methods for network construction. The client-server architecture of CyNetworkBMA makes it possible to distribute or centralize computation depending on user needs.

Conclusions

CyNetworkBMA is an easy-to-use tool that makes network inference accessible to non-programmers through seamless integration with Cytoscape. CyNetworkBMA is available on the Cytoscape App Store at http://apps.cytoscape.org/apps/cynetworkbma.

6.

Background

Most phylogenetic studies using molecular data treat gaps in multiple sequence alignments as missing data or even completely exclude alignment columns that contain gaps.

Results

Here we show that gap patterns in large-scale, genome-wide alignments are themselves phylogenetically informative and can be used to infer reliable phylogenies provided the gap data are properly filtered to reduce noise introduced by the alignment method. We introduce here the notion of split-inducing indels (splids) that define an approximate bipartition of the taxon set. We show both in simulated data and in case studies on real-life data that splids can be efficiently extracted from phylogenomic data sets.

Conclusions

Suitably processed gap patterns extracted from genome-wide alignments provide a surprisingly clear phylogenetic signal and allow the inference of accurate phylogenetic trees.
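The idea that a gap pattern defines a bipartition of the taxon set can be sketched as follows. This is a simplified, column-wise illustration, not the splid extraction procedure of the study: real indels span several columns and the noise filtering described above is omitted. The dictionary-based alignment format, the function name, and the rule that each side of a split must contain at least two taxa are assumptions.

```python
from collections import Counter


def gap_splits(alignment):
    """Count, per gapped taxon set, the alignment columns inducing that split.

    `alignment` maps a taxon name to its aligned sequence; '-' marks a gap.
    Only splits with at least two taxa on each side are kept, because
    smaller splits carry no information about the tree topology.
    """
    taxa = sorted(alignment)
    length = len(alignment[taxa[0]])
    counts = Counter()
    for column in range(length):
        gapped = frozenset(t for t in taxa if alignment[t][column] == "-")
        if 2 <= len(gapped) <= len(taxa) - 2:
            counts[gapped] += 1
    return counts


# Hypothetical toy alignment of five taxa.
toy_alignment = {
    "A": "AC--GT",
    "B": "AC--GT",
    "C": "ACTTGT",
    "D": "ACTT-T",
    "E": "ACTTGT",
}
splits = gap_splits(toy_alignment)
```

Here the two-column gap shared by A and B supports the split {A, B} versus {C, D, E}, while the single gap in D is discarded as uninformative.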

7.

Introduction

Metabolomics is a well-established tool in systems biology, especially in the top–down approach. Metabolomics experiments often result in discovery studies that provide intriguing biological hypotheses but rarely offer a mechanistic explanation of such findings. In this light, the interpretation of metabolomics data can be boosted by deploying systems biology approaches.

Objectives

This review aims to provide an overview of systems biology approaches that are relevant to metabolomics and to discuss some successful applications of these methods.

Methods

We review the most recent applications of systems biology tools in the field of metabolomics, such as network inference and analysis, metabolic modelling and pathways analysis.

Results

We offer a broad overview of systems biology tools that can be applied to address metabolomics problems. The characteristics and application results of these tools are also discussed comparatively.

Conclusions

Systems biology-enhanced analysis of metabolomics data can provide insights into the molecular mechanisms originating the observed metabolic profiles and enhance the scientific impact of metabolomics studies.

8.

Background

Maximum parsimony phylogenetic tree reconciliation is an important technique for reconstructing the evolutionary histories of hosts and parasites, genes and species, and other interdependent pairs. Since the problem of finding temporally feasible maximum parsimony reconciliations is NP-complete, current methods use either exact algorithms with exponential worst-case running time or heuristics that do not guarantee optimal solutions.

Results

We offer an efficient new approach that begins with a potentially infeasible maximum parsimony reconciliation and iteratively “repairs” it until it becomes temporally feasible.

Conclusions

In a non-trivial number of cases, this approach finds solutions that are better than those found by the widely-used Jane heuristic.

9.

Background

Longitudinal measurement is commonly employed in health research and provides numerous benefits for understanding disease and trait progression over time. More broadly, it allows for proper treatment of correlated responses within clusters. We evaluated 3 methods for analyzing genome-by-epigenome interactions with longitudinal outcomes from family data.

Results

Linear mixed-effect models, generalized estimating equations, and quadratic inference functions were used to test a pharmacoepigenetic effect in 200 simulated posttreatment replicates. Adjustment for baseline outcome provided greater power and more accurate control of Type I error rates than computation of a pre-to-post change score.

Conclusions

Comparison of all modeling approaches indicated a need for bias correction in marginal models and similar power for each method, with quadratic inference functions providing a minor decrement in power compared to generalized estimating equations and linear mixed-effects models.

10.
Wang  Zhiwei  Liu  Kevin J. 《BMC genomics》2016,17(10):785-174

Background

The most widely used state-of-the-art methods for reconstructing species phylogenies from genomic sequence data assume that sampled loci are identically and independently distributed. In principle, free recombination between loci and a lack of intra-locus recombination are necessary to satisfy this assumption. Few studies have quantified the practical impact of recombination on species tree inference methods, and even fewer have used genomic sequence data for this purpose. One prominent exception is the 2012 study of Lanier and Knowles. A main finding from the study was that species tree inference methods are relatively robust to intra-locus recombination, assuming free recombination between loci. The latter assumption means that the open question regarding the impact of recombination on species tree analysis is not fully resolved.

Results

The goal of this study is to further investigate this open question. Using simulations based upon the multi-species coalescent-with-recombination model as well as empirical datasets, we compared common pipeline-based techniques for inferring species phylogenies. The simulation conditions included a range of dataset sizes and several choices for recombination rate which was either uniform across loci or incorporated recombination hotspots. We found that pipelines which explicitly utilize inferred recombination breakpoints to delineate recombination-free intervals result in greater accuracy compared to widely used alternatives that preprocess sequences based upon linkage disequilibrium decay. Furthermore, the use of a relatively simple approach for recombination breakpoint inference does not degrade the accuracy of downstream species tree inference compared to more accurate alternatives.

Conclusions

Our findings clarify the impact of recombination upon current phylogenomic pipelines for species tree inference. Pipeline-based approaches which utilize inferred recombination breakpoints to densely sample loci across genomic sequences can tolerate intra-locus recombination and violations of the assumption of free recombination between loci.

11.

Background

Little is known about the phylogeography of norovirus (NoV) in China. A clear understanding of the tree topology, migration patterns and demographic dynamics of circulating NoV is needed to identify its prevalence trends, which can help us better prepare for epidemics and develop useful control strategies. The aim of this study was to explore the genetic diversity, temporal distribution, demographic dynamics and migration patterns of NoV that circulated in China.

Results

Our analysis showed that two major genogroups, GI and GII, were identified in China, among which GII.3, GII.4 and GII.17 accounted for the majority, with a total proportion of around 70%. Our demographic inference suggested that during the long-term migration process, NoV evolved into multiple lineages and then experienced a selective sweep, which reduced its genetic diversity. The phylogeographic results suggested that NoV may have originated from South China (Hong Kong and Guangdong), followed by multicenter outbreaks across the country.

Conclusions

These analyses indicate that the domestic poultry trade and frequent contact among people from different regions have both contributed to the spread of NoV in China. Together with recent advances in phylogeographic inference, our research also illustrates how coalescent-based methods can extract adequate information in molecular epidemiology.

12.

Background and aims

Aluminum (Al) accumulator plants are occasionally found in certain genera or families of woody plant species that are broadly dispersed in the angiosperm phylogeny. However, spatial and seasonal patterns in Al accumulation within the closely related species of each group remain poorly understood.

Methods

We quantitatively monitored the internal Al levels of eight Theaceae and Ternstroemiaceae species growing on acidic soils at multiple sites.

Results

Among the eight species, seven other than Ternstroemia gymnanthera shared a rapid Al accumulation in the developing leaves. Species comparison revealed that Al accumulation in mature leaves saturates within a flushing year, regardless of differences in leaf structure, seasonality, and acidic soil pH (4.5–5.5) at multiple sites. In tall trees of Stewartia monadelpha, the Al contents of the leaves were constantly high irrespective of their height positions up to 12 m. Moreover, the Al content of the leaves was only slightly decreased in the last 2 weeks of autumn senescence, in which nitrogen (N) or phosphate (P) retranslocation had been completed.

Conclusion

These results suggest that most of the Theaceae and Ternstroemiaceae species possess an effective metal-transport mechanism that rapidly loads Al into the young leaves until each level reaches a species-specific threshold.

13.
14.

Background

During the last few years, knowledge about drugs, disease phenotypes and proteins has accumulated rapidly, and more and more scientists have turned their attention to inferring drug-disease associations by computational methods. Developing an integrated approach for the systematic discovery of drug-disease associations from these data is an important issue.

Methods

We combine three different networks (drug, genomic and disease phenotype) and assign weights to the edges based on available experimental data and knowledge. Given a specific disease, we use our network propagation approach to infer the drug-disease associations.

Results

We use prostate cancer and colorectal cancer as our test data, and the manually curated drug-disease associations from the Comparative Toxicogenomics Database as our benchmark. The ranked results show that our proposed method obtains higher specificity and sensitivity and clearly outperforms previous methods. Our results also show that our method performs better with off-target information than with primary drug targets only, on both test data sets.

Conclusions

We demonstrate the feasibility and benefits of using network-based analyses of chemical, genomic and phenotype data to reveal drug-disease associations. The potential associations inferred by our method provide new perspectives for toxicogenomics and drug repositioning evaluation.

15.

Background

Recent coevolutionary analysis has considered tree topology as a means to reduce the asymptotic complexity associated with inferring the complex coevolutionary interrelationships that arise between phylogenetic trees. Targeted algorithmic design for specific tree topologies has to date been highly successful, with one recent formulation providing a logarithmic space complexity reduction for the dated tree reconciliation problem.

Methods

In this work we build on this prior analysis and provide a further asymptotic space reduction through a new formulation of the dynamic programming table used by a number of popular coevolutionary analysis techniques. This model gives rise to a subquadratic running time solution for the dated tree reconciliation problem for selected tree topologies, and is shown to be, in practice, the fastest method for solving the dated tree reconciliation problem for expected evolutionary trees. This result is achieved through the analysis of not only the topology of the trees considered for coevolutionary analysis, but also the underlying structure of the dynamic programming algorithms that are traditionally applied to such analysis.

Conclusion

The newly inferred theoretical complexity bounds introduced herein are then validated using a combination of synthetic and biological data sets, where the proposed model is shown to provide an \(O(\sqrt{n})\) space saving, while it is observed to run in half the time compared to the fastest known algorithm for solving the dated tree reconciliation problem. What is even more significant is that the algorithm derived herein is able to guarantee the optimality of its inferred solution, something that algorithms of comparable speed have to date been unable to achieve.

16.

Background

Isometric gene tree reconciliation is a gene tree/species tree reconciliation problem where both the gene tree and the species tree include branch lengths, and these branch lengths must be respected by the reconciliation. The problem was introduced by Ma et al. in 2008 in the context of reconstructing evolutionary histories of genomes in the infinite sites model.

Results

In this paper, we show that the original algorithm by Ma et al. is incorrect, and we propose a modified algorithm that addresses the problems that we discovered. We have also improved the running time from \(O(N^2)\) to \(O(N\log N)\), where N is the total number of nodes in the two input trees. Finally, we examine two new variants of the problem: reconciliation of two unrooted trees and scaling of branch lengths of the gene tree during reconciliation of two rooted trees.

Conclusions

We provide several new algorithms for isometric reconciliation of trees. Some questions in this area remain open, most importantly extensions of the problem that allow for imprecise estimates of branch lengths.

17.
18.

Introduction

Collecting feces is easy. It offers direct access to endogenous and microbial metabolites.

Objectives

Given the lack of consensus about fecal sample preparation, especially in animal species, we developed a robust protocol allowing untargeted LC-HRMS fingerprinting.

Methods

The conditions of extraction (quantity, preparation, solvents, dilutions) were investigated in bovine feces.

Results

A rapid and simple protocol involving feces extraction with methanol (1/3, M/V) followed by centrifugation and a filtration step (10 kDa) was developed.

Conclusion

The workflow generated repeatable and informative fingerprints for robust metabolome characterization.

19.

Background

Discrete-state stochastic models have become a well-established approach to describe biochemical reaction networks that are influenced by the inherent randomness of cellular events. In recent years, several methods for accurately approximating the statistical moments of such models have become very popular, since they allow an efficient analysis of complex networks.

Results

We propose a generalized method of moments approach for inferring the parameters of reaction networks based on a sophisticated matching of the statistical moments of the corresponding stochastic model and the sample moments of population snapshot data. The proposed parameter estimation method exploits recently developed moment-based approximations and provides estimators with desirable statistical properties when a large number of samples is available. We demonstrate the usefulness and efficiency of the inference method on two case studies.

Conclusions

The generalized method of moments provides accurate and fast estimates of unknown parameters of reaction networks. The accuracy increases when moments of order higher than two are also considered. In addition, the variance of the estimator decreases when more samples are given or when higher-order moments are included.
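The moment-matching principle can be sketched on the simplest possible case. The example below assumes a constitutive birth-death process whose stationary copy-number distribution is Poisson with rate lambda, so that E[X] = lambda and E[X^2] = lambda + lambda^2 hold exactly and no moment-closure approximation is needed. The identity weighting matrix, the grid search, and all function names are assumptions made for this illustration and do not reproduce the estimator of the study.

```python
def sample_moments(sample):
    """Return the first and second raw sample moments."""
    n = len(sample)
    m1 = sum(sample) / n
    m2 = sum(x * x for x in sample) / n
    return m1, m2


def gmm_objective(lam, m1, m2):
    """Squared distance between sample and model moments (identity weights)."""
    g1 = m1 - lam                # first moment condition: E[X] = lam
    g2 = m2 - (lam + lam * lam)  # second moment condition: E[X^2] = lam + lam^2
    return g1 * g1 + g2 * g2


def gmm_poisson_rate(sample, grid_points=8000):
    """Minimize the GMM objective over a grid on (0, 2 * sample mean]."""
    m1, m2 = sample_moments(sample)
    upper = 2.0 * m1
    candidates = [upper * i / grid_points for i in range(1, grid_points + 1)]
    return min(candidates, key=lambda lam: gmm_objective(lam, m1, m2))


# Hypothetical snapshot data whose first two moments agree with lambda = 4.
rate_hat = gmm_poisson_rate([2, 6])
```

A grid search is used only to keep the sketch dependency-free. A practical estimator would use a numerical optimizer and a weighting matrix estimated from the data, which is what gives GMM estimators their efficiency properties for large samples.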

20.