Similar articles
20 similar articles found (search time: 15 ms)
1.
Meligkotsidou L, Fearnhead P. Genetics, 2005, 171(4): 2073-2084
We develop a method for maximum-likelihood estimation of coalescence times in genealogical trees, based on population genetics data. For this purpose, a Viterbi-type algorithm is constructed to maximize the joint likelihood of the coalescence times. Marginal confidence intervals for the coalescence times based on the profile likelihoods are also computed. Our method of finding MLEs and calculating confidence intervals appears to be more accurate than alternative numerical maximization methods, and maximum-likelihood inference appears to be more accurate than other existing model-free approaches to estimating coalescence times. We demonstrate the method on two different data sets: human Y chromosome DNA data and fungus DNA data.

2.
We present a method called the G(A|B) method for estimating coalescence probabilities within population lineages from genome sequences when one individual is sampled from each population. Population divergence times can be estimated from these coalescence probabilities if additional assumptions about the history of population sizes are made. Our method is based on a method presented by Rasmussen et al. (2014) to test whether an archaic genome is from a population directly ancestral to a present-day population. The G(A|B) method does not require distinguishing ancestral from derived alleles or assumptions about demographic history before population divergence. We discuss the relationship of our method to two similar methods, one introduced by Green et al. (2010) and called the F(A|B) method and the other introduced by Schlebusch et al. (2017) and called the TT method. When our method is applied to individuals from three or more populations, it provides a test of whether the population history is treelike because coalescence probabilities are additive on a tree. We illustrate the use of our method by applying it to three high-coverage archaic genomes, two Neanderthals (Vindija and Altai) and a Denisovan.
Subject terms: Rare variants, Evolutionary genetics

One of the goals of population genetics is to estimate the divergence time of isolated populations. We will review several methods that have been proposed and present a new method that is closely related to two existing methods. We will emphasize the assumptions made when using different methods. It will be useful to make the distinction between estimating coalescence probabilities within populations and estimating population divergence times. We will also introduce a test for a treelike population history based on our method.
For distantly related populations, the numbers of mutational differences between sequences indicate relative times of divergence. Relative times are converted to absolute times by assuming a mutation rate. This method traces to Zuckerkandl and Pauling (1962, 1965) and has been used and refined extensively. This class of methods estimates genomic divergence times. Using it to estimate population or species divergence times assumes that those times are so large that the difference between them can be ignored.
For recently diverged populations, the numbers of mutational differences probably do not provide a reliable estimate of population divergence times both because there may be too few mutations that differentiate populations and because the difference between the genomic and population divergence times may be substantial. To overcome this problem, Green et al. (2010) (in Supplement 14) introduced a method that accounts for the difference between genomic and population divergence. This method was used in later papers from the same group (Meyer et al. 2012; Prüfer et al. 2014, 2017).
The Green et al. (2010) method is applicable when one genome is sampled from each of two populations. It depends on the statistic F(A|B), which is the fraction of sites in population A that carry the derived allele when that site is heterozygous in population B. Green et al. (2010) showed by simulation that the expectation of F(A|B) decreases roughly exponentially with the separation time of A and B. The rate of decrease depends on the history of population sizes both in B and in the population ancestral to A and B. Green et al. (2010) estimated population divergence times by interpolating their simulation results.
More recently, Schlebusch et al. (2017), in Section 9.1 of their supplementary materials, introduced a similar method, called the TT method. Their method is based on analytic expressions for the configuration probabilities of SNPs that are polymorphic in the two populations. The TT method assumes that ancestral and derived alleles can be distinguished and the population before divergence was of constant size. The TT method is developed and elaborated on by Sjödin et al. (2020).
In the present paper, we present a new method that is closely related to the F(A|B) and TT methods. We call it the G(A|B) method to emphasize its similarity to F(A|B). Our method is based on a method presented by Rasmussen et al. (2014) to test whether an ancient DNA sequence is from a population directly ancestral to a present-day population. We will show that our method provides a way to test whether the history of three or more populations is accurately represented by a population tree even if the demographic histories of those populations are not known.
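The definition of F(A|B) given in this entry can be illustrated with a short calculation. The following is a minimal sketch, not the procedure of Green et al. (2010): it assumes each diploid genome is coded per site as the number of derived alleles (0, 1, or 2), and it averages over the two alleles of A instead of sampling one at random. The function name `f_a_given_b` and the toy genotypes are hypothetical.

```python
from typing import Sequence


def f_a_given_b(geno_a: Sequence[int], geno_b: Sequence[int]) -> float:
    """Fraction of derived alleles in A at sites where B is heterozygous.

    Assumption: genotypes are counts of derived alleles per site (0, 1, 2),
    and both sequences cover the same sites in the same order.
    """
    if len(geno_a) != len(geno_b):
        raise ValueError("genotype vectors must have the same length")
    # Keep A's derived-allele count only at sites where B is heterozygous.
    a_at_het_b = [a for a, b in zip(geno_a, geno_b) if b == 1]
    if not a_at_het_b:
        raise ValueError("B has no heterozygous sites")
    # Each retained site contributes two alleles from A.
    return sum(a_at_het_b) / (2 * len(a_at_het_b))


if __name__ == "__main__":
    geno_a = [0, 1, 2, 0, 2, 1]  # toy data, not from any real genome
    geno_b = [1, 1, 0, 1, 2, 1]
    print(f_a_given_b(geno_a, geno_b))  # 0.25
```

In the toy data, B is heterozygous at four sites, and A carries two derived alleles out of eight at those sites, giving 0.25. Turning such a value into a divergence time would still require the demographic assumptions discussed in the entry.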

3.
4.
Two techniques for obtaining information about population structure from nucleotide sequences in DNA are summarized. The first focuses on the selection or neutrality of enzyme polymorphisms, the second on the detection of recombination. Neither method requires phylogeny estimation.

5.
Phylogeny reconstruction is a difficult computational problem, because the number of possible solutions increases with the number of included taxa. For example, for only 14 taxa, there are more than seven trillion possible unrooted phylogenetic trees. For this reason, phylogenetic inference methods commonly use clustering algorithms (e.g., the neighbor-joining method) or heuristic search strategies to minimize the amount of time spent evaluating nonoptimal trees. Even heuristic searches can be painfully slow, especially when computationally intensive optimality criteria such as maximum likelihood are used. I describe here a different approach to heuristic searching (using a genetic algorithm) that can tremendously reduce the time required for maximum-likelihood phylogenetic inference, especially for data sets involving large numbers of taxa. Genetic algorithms are simulations of natural selection in which individuals are encoded solutions to the problem of interest. Here, labeled phylogenetic trees are the individuals, and differential reproduction is effected by allowing the number of offspring produced by each individual to be proportional to that individual's rank likelihood score. Natural selection increases the average likelihood in the evolving population of phylogenetic trees, and the genetic algorithm is allowed to proceed until the likelihood of the best individual ceases to improve over time. An example is presented involving rbcL sequence data for 55 taxa of green plants. The genetic algorithm described here required only 6% of the computational effort required by a conventional heuristic search using tree bisection/reconnection (TBR) branch swapping to obtain the same maximum-likelihood topology.
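The growth in the number of candidate trees can be made concrete with the standard counting formulas for fully resolved (bifurcating) labeled trees: (2n-5)!! unrooted trees and (2n-3)!! rooted trees for n taxa. The sketch below is an illustration only; the function names are hypothetical and the abstract does not say which formula was used. Under these formulas, 14 taxa give roughly 3.2 × 10^11 unrooted trees, while the "seven trillion" figure corresponds to the rooted count for 14 taxa (equivalently, the unrooted count for 15 taxa).

```python
def odd_double_factorial(k: int) -> int:
    """Return 1 * 3 * 5 * ... * k for odd k; returns 1 when k <= 1."""
    result = 1
    for factor in range(3, k + 1, 2):
        result *= factor
    return result


def num_unrooted_trees(n_taxa: int) -> int:
    """Number of unrooted bifurcating labeled trees: (2n - 5)!!, n >= 3."""
    if n_taxa < 3:
        raise ValueError("need at least 3 taxa for an unrooted tree")
    return odd_double_factorial(2 * n_taxa - 5)


def num_rooted_trees(n_taxa: int) -> int:
    """Number of rooted bifurcating labeled trees: (2n - 3)!!, n >= 2."""
    if n_taxa < 2:
        raise ValueError("need at least 2 taxa for a rooted tree")
    return odd_double_factorial(2 * n_taxa - 3)


if __name__ == "__main__":
    print(num_unrooted_trees(14))  # 316234143225
    print(num_rooted_trees(14))    # 7905853580625
```

Either count makes exhaustive evaluation impractical well before 55 taxa, which is the motivation for heuristic and genetic-algorithm searches.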

6.
An estimate of the average number of evolutionarily acceptable substitutions per nucleotide since the most recent common ancestor of a pair of homologous sequences is found which uses nucleotide sequence data. The estimate is derived assuming a Poisson-like model for the evolutionary process. A method is also suggested for analyzing nucleotide sequence data in M homologous sequences (M ≥ 3). A simulation study is reported showing that the estimates are satisfactory provided there is sufficient homology between the sequences. To demonstrate the methods, a numerical example using some β-globin data is presented.
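The abstract does not state its estimator, so the following sketch substitutes a well-known stand-in: the Jukes-Cantor correction, which also assumes a Poisson substitution process with equal rates among the four nucleotides. It converts the observed proportion of differing sites p into an estimated number of substitutions per site, d = -(3/4) ln(1 - 4p/3). The function names and the toy sequences are hypothetical.

```python
import math


def proportion_different(seq1: str, seq2: str) -> float:
    """Proportion of aligned A/C/G/T sites at which two sequences differ."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    # Ignore gaps and ambiguity codes by keeping only unambiguous pairs.
    pairs = [
        (a, b)
        for a, b in zip(seq1.upper(), seq2.upper())
        if a in "ACGT" and b in "ACGT"
    ]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(a != b for a, b in pairs) / len(pairs)


def jukes_cantor_distance(p: float) -> float:
    """Jukes-Cantor estimate of substitutions per site from proportion p."""
    if not 0.0 <= p < 0.75:
        # At p >= 0.75 the sequences are saturated and d is undefined.
        raise ValueError("p must satisfy 0 <= p < 0.75")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


if __name__ == "__main__":
    p = proportion_different("ACGTACGT", "ACGTACGA")  # toy sequences
    print(p, jukes_cantor_distance(p))
```

The estimate exceeds p because it allows for multiple substitutions at the same site, and it becomes undefined as p approaches 0.75, which is one way to read the abstract's requirement of sufficient homology between the sequences.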

7.
8.
The partition matrix is a graphical tool for comparative analysis of nucleotide sequences following alignment. It is particularly useful for investigating the divergent phylogenies of sequence regions undergoing reticulate evolution. A partition matrix is generated by determining the consistency of the parsimoniously informative sites in a set of aligned sequences with the binary partitions inferred from the sequences. Since the linear order of sites is maintained, the matrix can be used to assess whether the distribution of sites either supporting or conflicting with particular partitions changes along the length of the alignment. The usefulness of the matrix in allowing visual identification of differences in evolutionary history among regions depends on the order in which partitions are shown; several suitable ordering schemes are proposed. We demonstrate the use of the partition matrix in interpreting the evolution of the pseudoautosomal boundary region on the sex chromosome of catarrhine primates. Its routine use should help to avoid attempts to derive single phylogenies from sequences whose evolution has been reticulate and to identify the gene conversion or recombination events underlying the reticulation. The method is relatively fast. It is exploratory, and it can form the basis for more formal analysis, which we discuss.

9.
10.
Here, shotgun metagenomic sequencing was conducted to reveal the hydrogen-oxidizing autotrophic-denitrifying metabolism in an enriched Thauera-dominated consortium. A draft genome named Thauera R4 of over 90% completeness (3.8 Mb) was retrieved mainly by a coverage-defined binning method from 3.5 Gb paired-end Illumina reads. We identified 1,263 genes (accounting for 33% of total genes in the finished genome of Thauera aminoaromatica MZ1T) with average nucleotide identity of 87.6% shared between Thauera R4 and T. aminoaromatica MZ1T. Although Thauera R4 and T. aminoaromatica shared quite similar nitrogen metabolism and a high nucleotide similarity (98.8%) in their 16S ribosomal RNA genes, they showed different functional potentials in several important environmentally relevant processes. Unlike T. aminoaromatica MZ1T, Thauera R4 carries an operon of [NiFe]-hydrogenase (EC 1.12.99.6) catalyzing molecular hydrogen oxidation in nitrate-rich solution. Moreover, Thauera R4 is a mixotrophic bacterium possessing key enzymes for autotrophic CO2-fixation and heterotrophic acetate assimilation metabolism. This Thauera R4 bin provides another genetic reference to better understand the niches of Thauera and demonstrates a model pipeline to reveal functional profiles and reconstruct novel and dominant genomes from a simplified mixed culture in environmental studies.

11.
The nucleotide sequence of a mitochondrial replicon from maize (total citations: 2; self-citations: 0; citations by others: 2)
S R Ludwig, R F Pohlman, J Vieira, A G Smith, J Messing. Gene, 1985, 38(1-3): 131-138
The 1913-bp maize mitochondrial (mt) plasmid was isolated from a suspension culture of a Black Mexican Sweet maize strain, cloned into M13mp vectors, and sequenced by a unidirectional progressive deletion method. The 1.9-kb extrachromosomal double-stranded circular DNA plasmid was found to contain regions of sequence which in other systems are known to be part of origins of replication (ori). This plasmid could be used as a carrier for chimeric genes and a molecular probe for replication.

12.
• Premise of the study: The Malpighiaceae include ~1300 tropical flowering plant species in which generic definitions and intergeneric relationships have long been problematic. The goals of our study were to resolve relationships among the 11 generic segregates from the New World genus Mascagnia, test the monophyly of the largest remaining Malpighiaceae genera, and clarify the placement of Old World Malpighiaceae.
• Methods: We combined DNA sequence data for four genes (plastid ndhF, matK, and rbcL and nuclear PHYC) from 338 ingroup accessions that represented all 77 currently recognized genera with morphological data from 144 ingroup species to produce a complete generic phylogeny of the family.
• Key results and conclusions: The genera are distributed among 14 mostly well-supported clades. The interrelationships of these major subclades have strong support, except for the clade comprising the wing-fruited genera (i.e., the malpighioid+Amorimia, Ectopopterys, hiraeoid, stigmaphylloid, and tetrapteroid clades). These results resolve numerous systematic problems, while others have emerged and constitute opportunities for future study. Malpighiaceae migrated from the New to Old World nine times, with two of those migrants being very recent arrivals from the New World. The seven other Old World clades dispersed much earlier, likely during the Tertiary. Comparison of floral morphology in Old World Malpighiaceae with their closest New World relatives suggests that morphological stasis in the New World likely results from selection by neotropical oil-bee pollinators and that the morphological diversity found in Old World flowers has evolved following their release from selection by those bees.

13.
Human papillomaviruses (HPV) of the beta group seem to be involved in the pathogenesis of non-melanoma skin cancer. Papillomaviruses are host specific and are considered to be closely co-evolving with their hosts. Evolutionary incongruence between early genes and late genes has been reported among oncogenic genital alpha-papillomaviruses and considerably challenges phylogenetic reconstructions. We investigated the relationships of 29 beta-HPV (25 types plus four putative new types, subtypes, or variants) as inferred from codon-aligned and amino acid sequence data of the genes E1, E2, E6, E7, L1, and L2 using likelihood, distance, and parsimony approaches. An analysis of an L1 fragment included additional nucleotide and amino acid sequences from seven non-human beta-papillomaviruses. Early-gene and late-gene evolution did not conflict significantly in beta-papillomaviruses based on partition homogeneity tests (p ≥ 0.001). As inferred from the complete genome analyses, beta-papillomaviruses were monophyletic and segregated into four highly supported monophyletic assemblages corresponding to species 1, 2, 3, and fused 4/5. They basically split into species 1 and the remainder of the beta-papillomaviruses, whose species 3, 4, and 5 constituted the sister group of species 2. Beta-papillomaviruses have been isolated from humans, apes, and monkeys, and phylogenetic analyses of the L1 fragment showed the non-human papillomaviruses to be highly polyphyletic, nesting within the HPV species. Thus, host and virus phylogenies were not congruent in beta-papillomaviruses, and multiple invasions across species borders may contribute (in addition to host-linked evolution) to their diversification.

14.
Phylogenetic relationships in Rosaceae were studied using parsimony analysis of nucleotide sequence data from two regions of the chloroplast genome, the matK gene and the trnL-trnF region. As in a previously published phylogeny of Rosaceae based upon rbcL sequences, monophyletic groups were resolved that correspond, with some modifications, to subfamilies Maloideae and Rosoideae, but Spiraeoideae were polyphyletic. Three main lineages appear to have diverged early in the evolution of the family: 1) Rosoideae sensu stricto, including taxa with a base chromosome number of 7 (occasionally 8); 2) actinorhizal Rosaceae, a group of taxa that engage in symbiotic nitrogen fixation; and 3) the rest of the family. The spiraeoid genus Gillenia, not included in the rbcL study, was strongly supported as the sister taxon to Maloideae sensu lato. A New World origin of Maloideae is suggested. The position of the economically important genus Prunus and the status of subfamily Amygdaloideae remain unresolved.
Received February 27, 2001; accepted October 11, 2001.

15.
Tribe Spiraeeae has generally been defined to include Aruncus, Kelseya, Luetkea, Pentactina, Petrophyton, Sibiraea, and Spiraea. Recent phylogenetic analyses have supported inclusion of Holodiscus in this group. Spiraea, with 50-80 species distributed throughout the north temperate regions of the world, is by far the largest and most widespread genus in the tribe; the remaining genera have one to several species each. Phylogenetic analyses of nuclear ITS and chloroplast trnL-trnF nucleotide sequences for 33 species representing seven of the aforementioned genera plus Xerospiraea divided the tribe into two well supported clades, one including Aruncus, Luetkea, Holodiscus, and Xerospiraea, the second including the other genera. Within Spiraea, none of the three sections recognized by Rehder based on inflorescence morphology is supported as monophyletic. Our analyses suggest a western North American origin for the tribe, with several biogeographic events involving vicariance or dispersal between the Old and New Worlds having occurred within this group.

16.
SUMMARY: G-language Genome Analysis Environment (G-language GAE) is an open-source, generic software package aimed at higher efficiency in bioinformatics analysis. G-language GAE provides a set of Perl libraries as an interface for software development and a graphical user interface for easy manipulation. Both Windows and Linux versions are available. AVAILABILITY: From http://www.g-language.org/ under the GNU General Public License. CD-ROMs are distributed free of charge at major conferences.

17.
Although multiple gene sequences are becoming increasingly available for molecular phylogenetic inference, the analysis of such data has largely relied on inference methods designed for single genes. One of the common approaches to analyzing data from multiple genes is concatenation of the individual gene data to form a single supergene to which traditional phylogenetic inference procedures - e.g., maximum parsimony (MP) or maximum likelihood (ML) - are applied. Recent empirical studies have demonstrated that concatenation of sequences from multiple genes prior to phylogenetic analysis often results in inference of a single, well-supported phylogeny. Theoretical work, however, has shown that the coalescent can produce substantial variation in single-gene histories. Using simulation, we combine these ideas to examine the performance of the concatenation approach under conditions in which the coalescent produces a high level of discord among individual gene trees and show that it leads to statistically inconsistent estimation in this setting. Furthermore, use of the bootstrap to measure support for the inferred phylogeny can result in moderate to strong support for an incorrect tree under these conditions. These results highlight the importance of incorporating variation in gene histories into multilocus phylogenetics.
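The concatenation step described in this abstract is simple to state in code. The sketch below is a minimal illustration, assuming each gene is supplied as an already aligned mapping from taxon name to sequence and that every gene covers the same taxa; the function name `concatenate_alignments` and the toy data are hypothetical. It only builds the supergene and does not perform any tree inference.

```python
from typing import Dict, List


def concatenate_alignments(genes: List[Dict[str, str]]) -> Dict[str, str]:
    """Join per-gene alignments end to end into one supergene per taxon."""
    if not genes:
        raise ValueError("need at least one gene alignment")
    taxa = sorted(genes[0])
    parts: Dict[str, List[str]] = {taxon: [] for taxon in taxa}
    for index, alignment in enumerate(genes):
        # Assumption: every gene must contain exactly the same taxa.
        if sorted(alignment) != taxa:
            raise ValueError(f"gene {index} has a different taxon set")
        # Within a gene, all sequences must share one aligned length.
        if len({len(seq) for seq in alignment.values()}) != 1:
            raise ValueError(f"gene {index} has unequal sequence lengths")
        for taxon in taxa:
            parts[taxon].append(alignment[taxon])
    return {taxon: "".join(chunks) for taxon, chunks in parts.items()}


if __name__ == "__main__":
    gene1 = {"A": "ACGT", "B": "ACGA", "C": "ACTT"}  # toy alignments
    gene2 = {"A": "GG", "B": "GC", "C": "CC"}
    print(concatenate_alignments([gene1, gene2]))
    # {'A': 'ACGTGG', 'B': 'ACGAGC', 'C': 'ACTTCC'}
```

Because the supergene treats all sites as sharing one history, any discord among the individual gene trees is invisible to the downstream analysis, which is the situation the abstract examines.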

18.
19.
20.