Similar literature
20 similar documents found
1.
2.
3.

Background  

Maximum parsimony phylogenetic tree reconstruction from genetic variation data is a fundamental problem in computational genetics with many practical applications in population genetics, whole genome analysis, and the search for genetic predictors of disease. Efficient methods are available for reconstruction of maximum parsimony trees from haplotype data, but such data are difficult to determine directly for autosomal DNA. Data are more commonly available in the form of genotypes, which consist of conflated combinations of pairs of haplotypes from homologous chromosomes. Currently, there are no general algorithms for the direct reconstruction of maximum parsimony phylogenies from genotype data. Phylogenetic applications for autosomal data must therefore rely on other methods for first computationally inferring haplotypes from genotypes.
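To make the relationship between haplotypes and genotypes concrete, the following minimal Python sketch conflates two binary haplotypes into a genotype and enumerates the haplotype pairs compatible with a given genotype. The 0/1/2 encoding (0 and 1 for homozygous sites, 2 for heterozygous sites) and the function names `conflate` and `compatible_pairs` are illustrative assumptions, not taken from the abstract.

```python
from itertools import product


def conflate(h1, h2):
    """Combine two binary haplotypes (strings of '0'/'1') into a genotype.

    Assumed encoding: a site keeps its allele when both haplotypes agree
    (homozygous) and becomes '2' when they differ (heterozygous).
    """
    if len(h1) != len(h2):
        raise ValueError("haplotypes must have the same length")
    return "".join(a if a == b else "2" for a, b in zip(h1, h2))


def compatible_pairs(genotype):
    """List every unordered haplotype pair that conflates to `genotype`."""
    het = [i for i, s in enumerate(genotype) if s == "2"]
    if not het:
        # No heterozygous site: both haplotypes equal the genotype itself.
        return [(genotype, genotype)]
    pairs = []
    # Fix the first heterozygous site to '0' in h1 so that each unordered
    # pair is produced exactly once.
    for bits in product("01", repeat=len(het) - 1):
        choice = ("0",) + bits
        h1, h2 = list(genotype), list(genotype)
        for i, b in zip(het, choice):
            h1[i] = b
            h2[i] = "1" if b == "0" else "0"
        pairs.append(("".join(h1), "".join(h2)))
    return pairs
```

A genotype with k heterozygous sites is compatible with 2^(k-1) unordered haplotype pairs, which is why haplotypes usually have to be inferred computationally before a haplotype-based tree method can be applied.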

4.
Webb, G.E. 1994 1015: Parallelism, non-biotic data and phylogeny reconstruction in paleobiology.
Many systematists equate parallelism and convergence. However, whereas convergence is relatively uncommon and easily recognized using divergent characters, parallelism is common but more difficult to recognize because divergent characters are less abundant. Cladists, in particular, equate homeomorphy with convergence and reject parallelism as a distinct concept. Unfortunately, cladistic parsimony analysis may not resolve most parallelism. Therefore, criteria for the a priori recognition and objective evaluation of parallelism are very significant. Non-biotic data (e.g., stratigraphic and geographic distribution) provide independent criteria for the construction of hypotheses of parallelism in cases where taxa (1) were geographically isolated during homeomorphic character-state transformations, (2) occurred with endemic faunas, and (3) evolved in similar environmental conditions as suggested by paleoecological data. Australian lithostrotionoid corals were long considered congeneric with European taxa. However, because of their geographic isolation, occurrence with endemic rugose corals and occurrence in similar depositional environments as European forms, they are now considered a homeomorphic clade, resulting from an extended sequence of parallel character-state transformations. The high degree of parallelism, combined with abundant symplesiomorphic characters, led to erroneous phylogenetic inferences when non-biotic data were excluded from analysis. Keywords: cladistics, homeomorphy, lithostrotionoid corals, parallelism, phylogeny.

5.
A comparison of several protein phylogeny reconstruction methods was carried out on a set of natural protein sequences. The programs of the PHYLIP package and the FastME, PhyML and TreeTop programs were tested. In contrast to several studied programs that used simulated sequences, our results demonstrate the superiority of distance methods over the maximum likelihood method.

6.

Background  

With microarray technology, variability in experimental environments such as RNA sources, microarray production, or the use of different platforms, can cause bias. Such systematic differences present a substantial obstacle to the analysis of microarray data, resulting in inconsistent and unreliable information. Therefore, one of the most pressing challenges in the field of microarray technology is how to integrate results from different microarray experiments or combine data sets prior to the specific analysis.

7.
8.
The phylogeny of selected members of the phylum Rotifera is examined based on analyses under parsimony direct optimization and Bayesian inference of phylogeny. Species of the higher metazoan lineages Acanthocephala, Micrognathozoa, Cycliophora, and potential outgroups are included to test rotiferan monophyly. The data include 74 morphological characters combined with DNA sequence data from four molecular loci, including the nuclear 18S rRNA, 28S rRNA, histone H3, and the mitochondrial cytochrome c oxidase subunit I. The combined molecular and total evidence analyses support the inclusion of Acanthocephala as a rotiferan ingroup, but do not support the inclusion of Micrognathozoa and Cycliophora. Within Rotifera, the monophyletic Monogononta is sister group to a clade consisting of Acanthocephala, Seisonidea, and Bdelloidea, for which we propose the name Hemirotifera. We also formally propose the inclusion of Acanthocephala within Rotifera, while maintaining the name Rotifera for the new expanded phylum. Within Monogononta, Gnesiotrocha and Ploima are also supported by the data. The relationships within Ploima remain poorly supported and unstable to parameter variation and to the method of phylogeny reconstruction, and the analyses showed that monophyly was questionable for the families Dicranophoridae, Notommatidae, and Brachionidae, and for the genus Proales. Otherwise, monophyly was generally supported for the represented ploimid families and genera.

9.

Background

The rapid accumulation of whole-genome data has renewed interest in the study of using gene-order data for phylogenetic analyses and ancestral reconstruction. Current software and web servers typically do not support duplication and loss events along with rearrangements.

Results

MLGO (Maximum Likelihood for Gene-Order Analysis) is a web tool for the reconstruction of phylogeny and/or ancestral genomes from gene-order data. MLGO is based on likelihood computation and shows advantages over existing methods in terms of accuracy, scalability and flexibility.

Conclusions

To the best of our knowledge, MLGO is the first web tool for analysis of large-scale genomic changes including not only rearrangements but also gene insertions, deletions and duplications. The web tool is available from http://www.geneorder.org/server.php.

10.
Phylogenetic relationships of 79 caniform carnivores were addressed based on four nuclear sequence-tagged sites (STS) and one nuclear exon, IRBP, using both supertree and supermatrix analyses. We recovered the three major arctoid lineages, Ursidae, Pinnipedia, and Musteloidea, as monophyletic, with Ursidae (bears) strongly supported as the basal arctoid lineage. Within Pinnipedia, Phocidae (true seals) were sister to the Otaroidea [Otariidae (fur seals and sea lions) and Odobenidae (walrus)]. Phocid subfamily and tribal designations were supported, but the otariid subfamily split between fur seals and sea lions was not. All family designations within Musteloidea were strongly supported: Mephitidae (skunks), Ailuridae (monotypic red panda), Mustelidae (weasels, badgers, otters), and Procyonidae (raccoons). A novel hypothesis for the position of the red panda was recovered, placing it as branching after Mephitidae and before Mustelidae+Procyonidae. Within Mustelidae, subfamily taxonomic changes are considered. This study represents the most comprehensive sampling to date of the Caniformia in a molecular study and contains the most complete molecular phylogeny for the Procyonidae. Our data set was also used in an empirical examination of the effect of missing data on both supertree and supermatrix analyses. Sequence for all genes in all taxa could not be obtained, so two variants of the data set with differing amounts of missing data were examined. The amount of missing data did not have a strong effect; instead, phylogenetic resolution was more dependent on the presence of sufficient informative characters. Supertree and supermatrix methods performed equivalently with incomplete data and were highly congruent; conflicts arose only in weakly supported areas, indicating that more informative characters are required to confidently resolve close species relationships.

11.
The ongoing generation of prodigious amounts of genomic sequence data from myriad vertebrates is providing unparalleled opportunities for establishing definitive phylogenetic relationships among species. The size and complexities of such comparative sequence data sets not only allow smaller and more difficult branches to be resolved but also present unique challenges, including large computational requirements and the negative consequences of systematic biases. To explore these issues and to clarify the phylogenetic relationships among mammals, we have analyzed a large data set of over 60 megabase pairs (Mb) of high-quality genomic sequence, which we generated from 41 mammals and 3 other vertebrates. All sequences are orthologous to a 1.9-Mb region of the human genome that encompasses the cystic fibrosis transmembrane conductance regulator gene (CFTR). To understand the characteristics and challenges associated with phylogenetic analyses of such a large data set, we partitioned the sequence data in several ways and utilized maximum likelihood, maximum parsimony, and Neighbor-Joining algorithms, implemented in parallel on Linux clusters. These studies yielded well-supported phylogenetic trees, largely confirming other recent molecular phylogenetic analyses. Our results provide support for rooting the placental mammal tree between Atlantogenata (Xenarthra and Afrotheria) and Boreoeutheria (Euarchontoglires and Laurasiatheria), illustrate the difficulty in resolving some branches even with large amounts of data (e.g., in the case of Laurasiatheria), and demonstrate the valuable role that very large comparative sequence data sets can play in refining our understanding of the evolutionary relationships of vertebrates.

12.
Genealogical relationships among Gasterosteidae (Teleostei: Gasterosteiformes) were tested with 84 morphological, 48 behavioral, and 2879 molecular characters. Phylogenetic analysis of the combined data set identified a single (CI = 0.735) best‐supported hypothesis (Spinachia (Apeltes ((Pungitius + Culaea)(Gasterosteus aculeatus + G. wheatlandi)))). This hypothesis is identical to previous phylogenetic hypotheses proposed on the basis of behavioral data and of behavioral plus morphological data. Our hypothesis, however, differed from a molecular‐based phylogeny in the placement of Apeltes. This analysis highlights the importance of combining all available evidence in order to produce the best‐supported proposition of genealogical relationships.

13.
Both mitochondrial and nuclear gene sequences have been employed in efforts to reconstruct deep-level phylogenetic relationships. A fundamental question in molecular systematics concerns the efficacy of different types of sequences in recovering clades at different taxonomic levels. We compared the performance of four mitochondrial data sets (cytochrome b, cytochrome oxidase II, NADH dehydrogenase subunit I, 12S rRNA-tRNA-16S rRNA) and eight nuclear data sets (exonic regions of alpha-2B adrenergic receptor, aquaporin, beta-casein, gamma-fibrinogen, interphotoreceptor retinoid binding protein, kappa-casein, protamine, von Willebrand Factor) in recovering deep-level mammalian clades. We employed parsimony and minimum-evolution with a variety of distance corrections for superimposed substitutions. In 32 different pairwise comparisons between these mitochondrial and nuclear data sets, we used the maximum set of overlapping taxa. In each case, the variable-length bootstrap was used to resample at the size of the smaller data set. The nuclear exons consistently performed better than mitochondrial protein and rRNA-tRNA coding genes on a per-residue basis in recovering benchmark clades. We also concatenated nuclear genes for overlapping taxa and made comparisons with concatenated mitochondrial protein-coding genes from complete mitochondrial genomes. The variable-length bootstrap was used to score the recovery of benchmark clades as a function of the number of resampled base pairs. In every case, the nuclear concatenations were more efficient than the mitochondrial concatenations in recovering benchmark clades. Among genes included in our study, the nuclear genes were much less affected by superimposed substitutions. Nuclear genes having appropriate rates of substitution should receive strong consideration in efforts to reconstruct deep-level phylogenetic relationships.
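The resampling step of a variable-length bootstrap can be sketched in a few lines of Python: alignment columns are drawn with replacement, but the number of columns drawn is a free parameter rather than the alignment length, so two data sets of different sizes can be compared at the same number of sites. This is a minimal illustration under stated assumptions, not the authors' implementation; the function name `variable_length_bootstrap` and the dictionary-of-strings alignment format are hypothetical.

```python
import random


def variable_length_bootstrap(alignment, n_sites, rng=None):
    """Return one pseudo-replicate with `n_sites` resampled columns.

    `alignment` maps taxon name -> aligned sequence (all the same length).
    Columns are sampled with replacement, and the same columns are used for
    every taxon so that site patterns stay intact.
    """
    rng = rng or random.Random()
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("all sequences must have the same aligned length")
    length = lengths.pop()
    if length == 0:
        raise ValueError("alignment has no columns")
    columns = [rng.randrange(length) for _ in range(n_sites)]
    return {
        taxon: "".join(seq[c] for c in columns)
        for taxon, seq in alignment.items()
    }
```

To compare a long and a short data set at equal size, one would call the function on each with `n_sites` set to the length of the shorter alignment, build a tree from every replicate, and count how often each benchmark clade is recovered.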

14.
We introduce and evaluate data analysis methods to interpret simultaneous measurement of multiple genomic features made on the same biological samples. Our tools use gene sets to provide an interpretable common scale for diverse genomic information. We show we can detect genetic effects, although they may act through different mechanisms in different samples, and show we can discover and validate important disease-related gene sets that would not be discovered by analyzing each data type individually.

15.

Background

Next-generation sequencing technologies are rapidly generating whole-genome datasets for an increasing number of organisms. However, phylogenetic reconstruction of genomic data remains difficult because de novo assembly for non-model genomes and multi-genome alignment are challenging.

Results

To greatly simplify the analysis, we present an Assembly and Alignment-Free (AAF) method (https://sourceforge.net/projects/aaf-phylogeny) that constructs phylogenies directly from unassembled genome sequence data, bypassing both genome assembly and alignment. Using mathematical calculations, models of sequence evolution, and simulated sequencing of published genomes, we address both evolutionary and sampling issues caused by direct reconstruction, including homoplasy, sequencing errors, and incomplete sequencing coverage. From these results, we calculate the statistical properties of the pairwise distances between genomes, allowing us to optimize parameter selection and perform bootstrapping. As a test case with real data, we successfully reconstructed the phylogeny of 12 mammals using raw sequencing reads. We also applied AAF to 21 tropical tree genome datasets with low coverage to demonstrate its effectiveness on non-model organisms.

Conclusion

Our AAF method opens up phylogenomics for species without an appropriate reference genome or high sequence coverage, and rapidly creates a phylogenetic framework for further analysis of genome structure and diversity among non-model organisms.

Electronic supplementary material

The online version of this article (doi:10.1186/s12864-015-1647-5) contains supplementary material, which is available to authorized users.
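The idea of estimating pairwise distances directly from unassembled reads can be illustrated with a small k-mer sketch in Python. It is a simplified illustration and not the AAF software: it ignores reverse complements, filtering of low-frequency k-mers caused by sequencing errors, and coverage effects, all of which the abstract identifies as issues the method must handle. The function names and the estimator d = -ln(shared / total) / k are assumptions made for this example. The estimator follows from the observation that, if each site changes independently at rate d, a k-mer survives unchanged with probability of roughly e^(-kd).

```python
import math


def kmers(seq, k):
    """Return the set of all length-k substrings of `seq`."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def kmer_distance(reads_a, reads_b, k):
    """Estimate a distance between two samples from their raw reads.

    The k-mer sets of all reads in each sample are pooled. The fraction of
    shared k-mers, relative to the smaller pool, is converted to a distance
    with d = -ln(shared / total) / k. Returns math.inf when nothing is shared.
    """
    pool_a = set().union(*(kmers(r, k) for r in reads_a))
    pool_b = set().union(*(kmers(r, k) for r in reads_b))
    shared = len(pool_a & pool_b)
    if shared == 0:
        return math.inf
    total = min(len(pool_a), len(pool_b))
    return -math.log(shared / total) / k
```

A matrix of such pairwise distances can then be passed to any distance-based tree-building method; the choice of k trades sensitivity to homoplasy (small k) against sensitivity to sequencing error and low coverage (large k).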

16.
17.
The grass subfamily Danthonioideae is one of the smaller in the family. We utilize DNA sequence data from three chloroplast regions (trnL, rpoC2 and rbcL) and one nuclear region (Internal Transcribed Spacer; ITS) both singly and in combination to elucidate the relationships of the genera in the subfamily. The topology retrieved by the ITS region is not congruent with that of the plastid data, but this conflict is not strongly supported. Nine well-supported clades are retrieved by all data sets. The relationships at the base of the subfamily are clearly established, comprising a series of three clades of Merxmuellera species. The earliest diverging clade probably does not belong in Danthonioideae. The other two clades are centered in the tropical African mountains and Cape mountains respectively. A clade of predominantly North and South American Danthonia species as well as D. archboldii from New Guinea is retrieved, but the African and Asian species of Danthonia are related to African species of Merxmuellera, thus rendering Danthonia polyphyletic. The relationships of the Danthonia clade remain equivocal, as do those of the two Cortaderia clades, the Pseudopentameris and Rytidosperma clades. Paseka Mafa was tragically killed in a vehicle accident in July 2001. This paper includes information he collected during the course of his MSc in Systematics and Biodiversity Science at the University of Cape Town.

18.
19.
Consensus on the evolutionary relationships of humans, chimpanzees, and gorillas has not been reached, despite the existence of a number of DNA sequence data sets relating to the phylogeny, partly because not all gene trees from these data sets agree. However, given the well-known phenomenon of gene tree-species tree mismatch, agreement among gene trees is not expected. A majority of gene trees from available DNA sequence data support one hypothesis, but is this evidence sufficient for statistical confidence in the majority hypothesis? All available DNA sequence data sets showing phylogenetic resolution among the hominoids are grouped according to genetic linkage of their corresponding genes to form independent data sets. Of the 14 independent data sets defined in this way, 11 support a human-chimpanzee clade, 2 support a chimpanzee-gorilla clade, and one supports a human-gorilla clade. The hypothesis of a trichotomous speciation event leading to Homo, Pan, and Gorilla can be firmly rejected on the basis of this data set distribution. The multiple-locus test (Wu 1991), which evaluates hypotheses using gene tree-species tree mismatch probabilities in a likelihood ratio test, favors the phylogeny with a Homo-Pan clade and rejects the other alternatives with a P value of 0.002. When the probabilities are modified to reflect effective population size differences among different types of genetic loci, the observed data set distribution is even more likely under the Homo-Pan clade hypothesis. Maximum-likelihood estimates for the time between successive hominoid divergences are in the range of 300,000-2,800,000 years, based on a reasonable range of estimates for long-term hominoid effective population size and for generation time.
The implication of the multiple-locus test is that existing DNA sequence data sets provide overwhelming and sufficient support for a human-chimpanzee clade: no additional DNA data sets need to be generated for the purpose of estimating hominoid phylogeny. Because DNA hybridization evidence (Caccone and Powell 1989) also supports a Homo-Pan clade, the problem of hominoid phylogeny can be confidently considered solved.
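The intuition behind rejecting a trichotomy from the 11 : 2 : 1 split of independent data sets can be illustrated with a simple binomial tail calculation in Python. This is not the multiple-locus likelihood ratio test of Wu (1991) described above; it is a back-of-the-envelope sketch that assumes each independent data set would support each of the three possible topologies with probability 1/3 under a true trichotomy. The function name `binomial_tail` is introduced here for illustration.

```python
from math import comb


def binomial_tail(n, k, p):
    """Probability of at least k successes in n independent trials."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Counts taken from the abstract: 14 independent data sets, 11 of which
# support the same (human-chimpanzee) clade.
N_DATASETS = 14
N_MAJORITY = 11

# Assumption: under a trichotomy each topology has probability 1/3.
p_one_topology = binomial_tail(N_DATASETS, N_MAJORITY, 1 / 3)

# Two topologies cannot both receive 11 or more of 14 data sets, so the
# events are mutually exclusive and the probability that any one of the
# three topologies reaches this level of support is exactly three times larger.
p_any_topology = 3 * p_one_topology

print(f"P(>= {N_MAJORITY} of {N_DATASETS} for one given topology) = {p_one_topology:.6f}")
print(f"P(>= {N_MAJORITY} of {N_DATASETS} for any topology)       = {p_any_topology:.6f}")
```

The likelihood ratio test in the abstract goes further than this sketch, because it replaces the fixed 1/3 probabilities with gene tree-species tree mismatch probabilities that depend on effective population size and the time between divergences.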

20.
In DNA microarray studies, gene-set analysis (GSA) has become the focus of gene expression data analysis. GSA utilizes the gene expression profiles of functionally related gene sets in Gene Ontology (GO) categories or a priori defined biological classes to assess the significance of gene sets associated with clinical outcomes or phenotypes. Many statistical approaches have been proposed to determine whether such functionally related gene sets express differentially (enrichment and/or deletion) in variations of phenotypes. However, little attention has been given to the discriminatory power of gene sets and classification of patients.
