Similar literature
 Found 20 similar articles (search time: 0 ms)
1.
MOTIVATION: Deciphering the location of gene duplications and multiple gene duplication episodes on the Tree of Life is fundamental to understanding the way gene families and genomes evolve. The multiple gene duplication problem provides a framework for placing gene duplication events onto nodes of a given species tree, and detecting episodes of multiple gene duplication. One version of the multiple gene duplication problem was defined by Guigó et al. in 1996. Several heuristic solutions have since been proposed for this problem, but no exact algorithms were known. RESULTS: In this article we solve this longstanding open problem by providing the first exact and efficient solution. We also demonstrate the improvement offered by our algorithm over the best heuristic approaches, by applying it to several simulated as well as empirical datasets.

2.
When Charles Darwin convinced the scientific community that species evolve, the long-held essentialist view of each species as fixed was rejected and a clear conceptual understanding of the term was lost. For the next century, a real species problem existed that became culturally entrenched within the scientific community. Although largely solved decades ago, the species problem remains entrenched today due to a suite of factors. Most of the factors that help maintain its perceived intractability have been revealed and logically dismissed; yet this is not widely known so those factors continue to be influential. It is time to recognize this false foundation and relegate the species problem to history.

3.
4.
To find unknown protein-coding genes, annotation pipelines use a combination of ab initio gene prediction and similarity to experimentally confirmed genes or proteins. Here, we show that although the ab initio predictions have an intrinsically high false-positive rate, they also have a consistently low false-negative rate. The incorporation of similarity information is meant to reduce the false-positive rate, but in doing so it increases the false-negative rate. The crucial variable is gene size (including introns): genes of the most extreme sizes, especially very large genes, are most likely to be incorrectly predicted.

5.

Background

The gene duplication (GD) problem seeks a species tree that implies the fewest gene duplication events across a given collection of gene trees. Solving this problem makes it possible to use large gene families with complex histories of duplication and loss to infer phylogenetic trees. However, the GD problem is NP-hard, and therefore, most analyses use heuristics that lack any performance guarantee.

Results

We describe the first integer linear programming (ILP) formulation to solve instances of the gene duplication problem exactly. With simulations, we demonstrate that the ILP solution can solve problem instances with up to 14 taxa. Furthermore, we apply the new ILP solution to solve the gene duplication problem for the seed plant phylogeny using a 12-taxon, 6,084-gene data set. The unique, optimal solution, which places Gnetales sister to the conifers, represents a new, large-scale genomic perspective on one of the most puzzling questions in plant systematics.

Conclusions

Although the GD problem is NP-hard, our novel ILP solution for it can solve instances with data sets consisting of as many as 14 taxa and 1,000 genes in a few hours. These are the largest instances that have been solved to optimality to date. Thus, this work can provide large-scale genomic perspectives on phylogenetic questions that previously could only be addressed by heuristic estimates.
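
The quantity that the GD problem minimizes can be illustrated with a small sketch. It is not the ILP formulation from the abstract: it only scores one fixed candidate species tree, using the standard LCA (least common ancestor) mapping, under which a gene-tree node counts as a duplication when it maps to the same species-tree node as one of its children. The tree encoding (nested tuples with species names as leaves) and the function names `clades`, `lca_clade`, `count_duplications` and `gd_cost` are assumptions made for this illustration.

```python
# Illustrative sketch only. Trees are nested tuples: a leaf is a species
# name (str), an internal node is a (left, right) pair. All names here are
# hypothetical and are not taken from the cited work.

def clades(tree):
    """Return every clade (frozenset of leaf names) of a rooted binary tree.

    The clade of the tree's own root is always the last element.
    """
    if isinstance(tree, str):
        return [frozenset([tree])]
    left, right = tree
    left_clades, right_clades = clades(left), clades(right)
    return left_clades + right_clades + [left_clades[-1] | right_clades[-1]]


def lca_clade(species_clades, taxa):
    """Smallest species-tree clade containing all taxa (the LCA mapping)."""
    return min((c for c in species_clades if taxa <= c), key=len)


def count_duplications(gene_tree, species_clades):
    """Return (clade the root maps to, number of duplication nodes)."""
    if isinstance(gene_tree, str):
        return lca_clade(species_clades, frozenset([gene_tree])), 0
    left_map, left_dups = count_duplications(gene_tree[0], species_clades)
    right_map, right_dups = count_duplications(gene_tree[1], species_clades)
    mapped = lca_clade(species_clades, left_map | right_map)
    # A node is a duplication if it maps to the same place as a child.
    is_duplication = mapped in (left_map, right_map)
    return mapped, left_dups + right_dups + int(is_duplication)


def gd_cost(species_tree, gene_trees):
    """Total duplications implied by a candidate species tree."""
    species_clades = clades(species_tree)
    return sum(count_duplications(g, species_clades)[1] for g in gene_trees)
```

An exact method has to find the species tree that minimizes `gd_cost` over all candidate topologies, which is where the NP-hardness arises; a brute-force loop over topologies using this function would only be feasible for very few taxa.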

6.
7.
8.
9.
How much horizontal gene transfer (HGT) between species influences bacterial phylogenomics is a controversial issue. This debate, however, lacks any quantitative assessment of the impact of HGT on phylogenies and of the ability of tree-building methods to cope with such events. I introduce a Markov model of genome evolution with HGT, accounting for the constraints on time: an HGT event can only occur between concomitantly living species. This model is used to simulate multigene sequence data sets with or without HGT. The consequences of HGT on phylogenomic inference are analyzed and compared to other well-known phylogenetic artefacts. It is found that supertree methods are quite robust to HGT, keeping high levels of performance even when gene trees are largely incongruent with each other. Gene tree incongruence per se is not indicative of HGT. HGT, however, removes the (otherwise observed) positive relationship between sequence length and gene tree congruence to the estimated species tree. Surprisingly, when applied to a bacterial and a eukaryotic multigene data set, this criterion rejects the HGT hypothesis for the former, but not the latter data set.

10.
The problem of the lipoid thromboplastins   Total citations: 5 (self-citations: 0; citations by others: 5)

11.
12.
Children who are retarded readers may present a complex problem involving physical impediments, emotional distress, or teaching methods. A child with specific reading disability has spatial confusion, an exaggeration or persistence of a normal childhood tendency to reversal of letters and symbols, ambidexterity, normal intelligence, and poor visual recall of words. Children with these characteristics fail to learn to read in a teaching system in which the main emphasis is on visual associations. Treatment of such reading difficulties, as well as prophylactic measures, is outlined.

13.
14.
15.
16.
17.
CMAJ 1916, 6(3): 239-241

18.
The species problem is the long-standing failure of biologists to agree on how we should identify species and how we should define the word 'species'. The innumerable attacks on the problem have turned the often-repeated question 'what are species?' into a philosophical conundrum. Today, the preferred form of attack is the well-crafted argument, and debaters seem to have stopped inquiring about what new information is needed to solve the problem. However, our knowledge is not complete and we have overlooked something. The species problem can be overcome if we understand our own role, as conflicted investigators, in causing the problem.

19.
Perkins TJ, Hallett M, Glass L. Bio Systems 2006, 84(2): 115-123
We study the inverse problem, or the "reverse-engineering" problem, for two abstract models of gene expression dynamics, discrete-time Boolean networks and continuous-time switching networks. Formally, the inverse problem is similar for both types of networks. For each gene, its regulators and its Boolean dynamics function must be identified. However, differences in the dynamical properties of these two types of networks affect the amount of data that is necessary for solving the inverse problem. We derive estimates for the average amounts of time series data required to solve the inverse problem for randomly generated Boolean and continuous-time switching networks. We also derive a lower bound on the amount of data needed that holds for both types of networks. We find that the amount of data required is logarithmic in the number of genes for Boolean networks, matching the general lower bound and previous theory, but is superlinear in the number of genes for continuous-time switching networks. We also find that the amount of data needed scales as 2^K, where K is the number of regulators per gene, rather than 2^(2K), as previous theory suggests.
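
To make the inverse problem for discrete-time Boolean networks concrete, here is a minimal sketch of a brute-force consistency search: for one target gene, it looks for the smallest set of at most `max_k` regulators whose values at time t always determine the target's value at time t+1 in the observed data. This is a generic illustration, not necessarily the procedure analysed by the authors. The function names, the 0/1 tuple encoding of states and the toy network `toy_step` are all assumptions introduced here.

```python
# Illustrative sketch only; names and the toy network are hypothetical.
from itertools import combinations, product


def consistent_table(transitions, target, regulators):
    """Return the (partial) truth table of `target` as a function of
    `regulators`, or None if two transitions contradict each other."""
    table = {}
    for state, next_state in transitions:
        key = tuple(state[r] for r in regulators)
        value = next_state[target]
        # setdefault stores the first value seen for this input pattern.
        if table.setdefault(key, value) != value:
            return None
    return table


def infer_regulators(transitions, target, max_k):
    """Smallest regulator set (size <= max_k) consistent with the data.

    Returns (regulators, truth_table), or None if no such set exists.
    """
    n_genes = len(transitions[0][0])
    for k in range(max_k + 1):
        for regulators in combinations(range(n_genes), k):
            table = consistent_table(transitions, target, regulators)
            if table is not None:
                return regulators, table
    return None


def toy_step(state):
    """Toy 3-gene network: x0' = x1 AND x2, x1' = NOT x0, x2' = x2."""
    x0, x1, x2 = state
    return (x1 & x2, 1 - x0, x2)


# Observe every possible state once (real data would be a time series).
toy_transitions = [(s, toy_step(s)) for s in product((0, 1), repeat=3)]
```

With all eight state transitions observed, `infer_regulators(toy_transitions, 0, 2)` recovers genes 1 and 2 as the regulators of gene 0. The truth table for K regulators has 2^K rows, which gives an intuition for why the required amount of data grows with 2^K.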

20.
The identification of loci influenced by positive selection is a major goal of evolutionary genetics. A popular approach is to perform scans of alignments on a genome-wide scale in order to find regions evolving at accelerated rates on a particular branch of a phylogenetic tree. However, positive selection is not the only process that can lead to accelerated evolution. Notably, GC-biased gene conversion (gBGC) is a recombination-associated process that results in the biased fixation of G and C nucleotides. This process can potentially generate bursts of nucleotide substitutions within hotspots of meiotic recombination. Here, we analyse the results of a scan for positive selection on genes on branches across the primate phylogeny. We show that genes identified as targets of positive selection have a significant tendency to exhibit the genomic signature of gBGC. Using a maximum-likelihood framework, we estimate that more than 20 per cent of cases of significantly elevated non-synonymous to synonymous substitution rates ratio (dN/dS), particularly in shorter branches, could be due to gBGC. We demonstrate that in some cases, gBGC can lead to very high dN/dS (more than 2). Our results indicate that gBGC significantly affects the evolution of coding sequences in primates, often leading to patterns of evolution that can be mistaken for positive selection.
