Found 20 similar documents; search time 0 ms
1.
2.
3.
Background
Maximum parsimony phylogenetic tree reconstruction from genetic variation data is a fundamental problem in computational genetics with many practical applications in population genetics, whole genome analysis, and the search for genetic predictors of disease. Efficient methods are available for reconstruction of maximum parsimony trees from haplotype data, but such data are difficult to determine directly for autosomal DNA. Data are more commonly available in the form of genotypes, which consist of conflated combinations of pairs of haplotypes from homologous chromosomes. Currently, there are no general algorithms for the direct reconstruction of maximum parsimony phylogenies from genotype data. Phylogenetic applications for autosomal data must therefore rely on other methods that first computationally infer haplotypes from genotypes.
4.
GREGORY E. WEBB 《Lethaia: An International Journal of Palaeontology and Stratigraphy》1994,27(3):185-192
Webb, G.E. 1994 10 15: Parallelism, non-biotic data and phylogeny reconstruction in paleobiology.
Many systematists equate parallelism and convergence. However, whereas convergence is relatively uncommon and easily recognized using divergent characters, parallelism is common but more difficult to recognize because divergent characters are less abundant. Cladists, in particular, equate homeomorphy with convergence and reject parallelism as a distinct concept. Unfortunately, cladistic parsimony analysis may not resolve most parallelism. Therefore, criteria for the a priori recognition and objective evaluation of parallelism are very significant. Non-biotic data (e.g., stratigraphic and geographic distribution) provide independent criteria for the construction of hypotheses of parallelism in cases where taxa (1) were geographically isolated during homeomorphic character-state transformations, (2) occurred with endemic faunas, and (3) evolved in similar environmental conditions as suggested by paleoecological data. Australian lithostrotionoid corals were long considered congeneric with European taxa. However, because of their geographic isolation, occurrence with endemic rugose corals and occurrence in similar depositional environments as European forms, they are now considered a homeomorphic clade, resulting from an extended sequence of parallel character-state transformations. The high degree of parallelism, combined with abundant symplesiomorphic characters, led to erroneous phylogenetic inferences when non-biotic data were excluded from analysis.
Cladistics, homeomorphy, lithostrotionoid corals, parallelism, phylogeny.
5.
Several protein phylogeny reconstruction methods were compared on a set of natural protein sequences. The programs of the PHYLIP package and the FastME, PhyML and TreeTop programs were tested. In contrast to several studies that used simulated sequences, our results demonstrate the superiority of distance methods over the maximum likelihood method.
6.
Ki-Yeol Kim Dong Hyuk Ki Ha Jin Jeong Hei-Cheul Jeung Hyun Cheol Chung Sun Young Rha 《BMC bioinformatics》2007,8(1):218
Background
With microarray technology, variability in experimental environments such as RNA sources, microarray production, or the use of different platforms, can cause bias. Such systematic differences present a substantial obstacle to the analysis of microarray data, resulting in inconsistent and unreliable information. Therefore, one of the most pressing challenges in the field of microarray technology is how to integrate results from different microarray experiments or combine data sets prior to the specific analysis.
7.
8.
A modern approach to rotiferan phylogeny: combining morphological and molecular data (total citations: 5, self-citations: 0, citations by others: 5)
The phylogeny of selected members of the phylum Rotifera is examined based on analyses under parsimony direct optimization and Bayesian inference of phylogeny. Species of the higher metazoan lineages Acanthocephala, Micrognathozoa, Cycliophora, and potential outgroups are included to test rotiferan monophyly. The data include 74 morphological characters combined with DNA sequence data from four molecular loci, including the nuclear 18S rRNA, 28S rRNA, histone H3, and the mitochondrial cytochrome c oxidase subunit I. The combined molecular and total evidence analyses support the inclusion of Acanthocephala as a rotiferan ingroup, but do not support the inclusion of Micrognathozoa and Cycliophora. Within Rotifera, the monophyletic Monogononta is sister group to a clade consisting of Acanthocephala, Seisonidea, and Bdelloidea, for which we propose the name Hemirotifera. We also formally propose the inclusion of Acanthocephala within Rotifera, while maintaining the name Rotifera for the new expanded phylum. Within Monogononta, Gnesiotrocha and Ploima are also supported by the data. The relationships within Ploima remain poorly supported and unstable to parameter variation and to the method of phylogeny reconstruction, and the analyses showed that monophyly was questionable for the families Dicranophoridae, Notommatidae, and Brachionidae, and for the genus Proales. Otherwise, monophyly was generally supported for the represented ploimid families and genera.
9.
Background
The rapid accumulation of whole-genome data has renewed interest in the study of using gene-order data for phylogenetic analyses and ancestral reconstruction. Current software and web servers typically do not support duplication and loss events along with rearrangements.
Results
MLGO (Maximum Likelihood for Gene-Order Analysis) is a web tool for the reconstruction of phylogeny and/or ancestral genomes from gene-order data. MLGO is based on likelihood computation and shows advantages over existing methods in terms of accuracy, scalability and flexibility.
Conclusions
To the best of our knowledge, it is the first web tool for analysis of large-scale genomic changes including not only rearrangements but also gene insertions, deletions and duplications. The web tool is available from http://www.geneorder.org/server.php.
10.
Phylogenetic relationships of 79 caniform carnivores were addressed based on four nuclear sequence-tagged sites (STS) and one nuclear exon, IRBP, using both supertree and supermatrix analyses. We recovered the three major arctoid lineages, Ursidae, Pinnipedia, and Musteloidea, as monophyletic, with Ursidae (bears) strongly supported as the basal arctoid lineage. Within Pinnipedia, Phocidae (true seals) were sister to the Otaroidea [Otariidae (fur seals and sea lions) and Odobenidae (walrus)]. Phocid subfamily and tribal designations were supported, but the otariid subfamily split between fur seals and sea lions was not. All family designations within Musteloidea were strongly supported: Mephitidae (skunks), Ailuridae (monotypic red panda), Mustelidae (weasels, badgers, otters), and Procyonidae (raccoons). A novel hypothesis for the position of the red panda was recovered, placing it as branching after Mephitidae and before Mustelidae+Procyonidae. Within Mustelidae, subfamily taxonomic changes are considered. This study represents the most comprehensive sampling to date of the Caniformia in a molecular study and contains the most complete molecular phylogeny for the Procyonidae. Our data set was also used in an empirical examination of the effect of missing data on both supertree and supermatrix analyses. Sequence for all genes in all taxa could not be obtained, so two variants of the data set with differing amounts of missing data were examined. The amount of missing data did not have a strong effect; instead, phylogenetic resolution was more dependent on the presence of sufficient informative characters. Supertree and supermatrix methods performed equivalently with incomplete data and were highly congruent; conflicts arose only in weakly supported areas, indicating that more informative characters are required to confidently resolve close species relationships.
11.
Prasad AB Allard MW;NISC Comparative Sequencing Program Green ED 《Molecular biology and evolution》2008,25(9):1795-1808
The ongoing generation of prodigious amounts of genomic sequence data from myriad vertebrates is providing unparalleled opportunities for establishing definitive phylogenetic relationships among species. The size and complexities of such comparative sequence data sets not only allow smaller and more difficult branches to be resolved but also present unique challenges, including large computational requirements and the negative consequences of systematic biases. To explore these issues and to clarify the phylogenetic relationships among mammals, we have analyzed a large data set of over 60 megabase pairs (Mb) of high-quality genomic sequence, which we generated from 41 mammals and 3 other vertebrates. All sequences are orthologous to a 1.9-Mb region of the human genome that encompasses the cystic fibrosis transmembrane conductance regulator gene (CFTR). To understand the characteristics and challenges associated with phylogenetic analyses of such a large data set, we partitioned the sequence data in several ways and utilized maximum likelihood, maximum parsimony, and Neighbor-Joining algorithms, implemented in parallel on Linux clusters. These studies yielded well-supported phylogenetic trees, largely confirming other recent molecular phylogenetic analyses. Our results provide support for rooting the placental mammal tree between Atlantogenata (Xenarthra and Afrotheria) and Boreoeutheria (Euarchontoglires and Laurasiatheria), illustrate the difficulty in resolving some branches even with large amounts of data (e.g., in the case of Laurasiatheria), and demonstrate the valuable role that very large comparative sequence data sets can play in refining our understanding of the evolutionary relationships of vertebrates.
12.
Michelle Y. Mattern Deborah A. McLennan 《Cladistics : the international journal of the Willi Hennig Society》2004,20(1):14-22
Genealogical relationships among Gasterosteidae (Teleostei: Gasterosteiformes) were tested with 84 morphological, 48 behavioral, and 2879 molecular characters. Phylogenetic analysis of the combined data set identified a single (CI = 0.735) best‐supported hypothesis (Spinachia (Apeltes ((Pungitius + Culaea)(Gasterosteus aculeatus + G. wheatlandi)))). This hypothesis is identical to previous phylogenetic hypotheses proposed on the basis of behavioral data and of behavioral plus morphological data. Our hypothesis, however, differed from a molecular‐based phylogeny in the placement of Apeltes. This analysis highlights the importance of combining all available evidence in order to produce the best‐supported proposition of genealogical relationships.
13.
Springer MS DeBry RW Douady C Amrine HM Madsen O de Jong WW Stanhope MJ 《Molecular biology and evolution》2001,18(2):132-143
Both mitochondrial and nuclear gene sequences have been employed in efforts to reconstruct deep-level phylogenetic relationships. A fundamental question in molecular systematics concerns the efficacy of different types of sequences in recovering clades at different taxonomic levels. We compared the performance of four mitochondrial data sets (cytochrome b, cytochrome oxidase II, NADH dehydrogenase subunit I, 12S rRNA-tRNA-16S rRNA) and eight nuclear data sets (exonic regions of alpha-2B adrenergic receptor, aquaporin, beta-casein, gamma-fibrinogen, interphotoreceptor retinoid binding protein, kappa-casein, protamine, von Willebrand Factor) in recovering deep-level mammalian clades. We employed parsimony and minimum-evolution with a variety of distance corrections for superimposed substitutions. In 32 different pairwise comparisons between these mitochondrial and nuclear data sets, we used the maximum set of overlapping taxa. In each case, the variable-length bootstrap was used to resample at the size of the smaller data set. The nuclear exons consistently performed better than mitochondrial protein and rRNA-tRNA coding genes on a per-residue basis in recovering benchmark clades. We also concatenated nuclear genes for overlapping taxa and made comparisons with concatenated mitochondrial protein-coding genes from complete mitochondrial genomes. The variable-length bootstrap was used to score the recovery of benchmark clades as a function of the number of resampled base pairs. In every case, the nuclear concatenations were more efficient than the mitochondrial concatenations in recovering benchmark clades. Among genes included in our study, the nuclear genes were much less affected by superimposed substitutions. Nuclear genes having appropriate rates of substitution should receive strong consideration in efforts to reconstruct deep-level phylogenetic relationships.
14.
We introduce and evaluate data analysis methods to interpret simultaneous measurement of multiple genomic features made on the same biological samples. Our tools use gene sets to provide an interpretable common scale for diverse genomic information. We show we can detect genetic effects, although they may act through different mechanisms in different samples, and show we can discover and validate important disease-related gene sets that would not be discovered by analyzing each data type individually.
15.
Background
Next-generation sequencing technologies are rapidly generating whole-genome datasets for an increasing number of organisms. However, phylogenetic reconstruction of genomic data remains difficult because de novo assembly for non-model genomes and multi-genome alignment are challenging.
Results
To greatly simplify the analysis, we present an Assembly and Alignment-Free (AAF) method (https://sourceforge.net/projects/aaf-phylogeny) that constructs phylogenies directly from unassembled genome sequence data, bypassing both genome assembly and alignment. Using mathematical calculations, models of sequence evolution, and simulated sequencing of published genomes, we address both evolutionary and sampling issues caused by direct reconstruction, including homoplasy, sequencing errors, and incomplete sequencing coverage. From these results, we calculate the statistical properties of the pairwise distances between genomes, allowing us to optimize parameter selection and perform bootstrapping. As a test case with real data, we successfully reconstructed the phylogeny of 12 mammals using raw sequencing reads. We also applied AAF to 21 tropical tree genome datasets with low coverage to demonstrate its effectiveness on non-model organisms.
Conclusion
Our AAF method opens up phylogenomics for species without an appropriate reference genome or high sequence coverage, and rapidly creates a phylogenetic framework for further analysis of genome structure and diversity among non-model organisms.
Electronic supplementary material
The online version of this article (doi:10.1186/s12864-015-1647-5) contains supplementary material, which is available to authorized users.
16.
17.
N. P. Barker C. Galley G. A. Verboom P. Mafa M. Gilbert H. P. Linder 《Plant Systematics and Evolution》2007,264(3-4):135-156
The grass subfamily Danthonioideae is one of the smaller in the family. We utilize DNA sequence data from three chloroplast regions (trnL, rpoC2 and rbcL) and one nuclear region (Internal Transcribed Spacer; ITS) both singly and in combination to elucidate the relationships of the genera in the subfamily. The topology retrieved by the ITS region is not congruent with that of the plastid data, but this conflict is not strongly supported. Nine well-supported clades are retrieved by all data sets. The relationships at the base of the subfamily are clearly established, comprising a series of three clades of Merxmuellera species. The earliest diverging clade probably does not belong in Danthonioideae. The other two clades are centered in the tropical African mountains and Cape mountains respectively. A clade of predominantly North and South American Danthonia species as well as D. archboldii from New Guinea is retrieved, but the African and Asian species of Danthonia are related to African species of Merxmuellera, thus rendering Danthonia polyphyletic. The relationships of the Danthonia clade remain equivocal, as do those of the two Cortaderia clades, the Pseudopentameris and Rytidosperma clades.
Paseka Mafa was tragically killed in a vehicle accident in July 2001. This paper includes information he collected during the course of his MSc in Systematics and Biodiversity Science at the University of Cape Town.
18.
19.
Molecular phylogeny of the hominoids: inferences from multiple independent DNA sequence data sets (total citations: 14, self-citations: 4, citations by others: 10)
Consensus on the evolutionary relationships of humans, chimpanzees, and gorillas has not been reached, despite the existence of a number of DNA sequence data sets relating to the phylogeny, partly because not all gene trees from these data sets agree. However, given the well-known phenomenon of gene tree-species tree mismatch, agreement among gene trees is not expected. A majority of gene trees from available DNA sequence data support one hypothesis, but is this evidence sufficient for statistical confidence in the majority hypothesis? All available DNA sequence data sets showing phylogenetic resolution among the hominoids are grouped according to genetic linkage of their corresponding genes to form independent data sets. Of the 14 independent data sets defined in this way, 11 support a human-chimpanzee clade, 2 support a chimpanzee-gorilla clade, and one supports a human-gorilla clade. The hypothesis of a trichotomous speciation event leading to Homo, Pan, and Gorilla can be firmly rejected on the basis of this data set distribution. The multiple-locus test (Wu 1991), which evaluates hypotheses using gene tree-species tree mismatch probabilities in a likelihood ratio test, favors the phylogeny with a Homo-Pan clade and rejects the other alternatives with a P value of 0.002. When the probabilities are modified to reflect effective population size differences among different types of genetic loci, the observed data set distribution is even more likely under the Homo-Pan clade hypothesis. Maximum-likelihood estimates for the time between successive hominoid divergences are in the range of 300,000-2,800,000 years, based on a reasonable range of estimates for long-term hominoid effective population size and for generation time. The implication of the multiple-locus test is that existing DNA sequence data sets provide overwhelming and sufficient support for a human-chimpanzee clade: no additional DNA data sets need to be generated for the purpose of estimating hominoid phylogeny. Because DNA hybridization evidence (Caccone and Powell 1989) also supports a Homo-Pan clade, the problem of hominoid phylogeny can be confidently considered solved.
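The rejection of a trichotomy from the 11/2/1 split of data sets can be illustrated with a simple calculation. The sketch below is not the multiple-locus test (Wu 1991) named in the abstract. It assumes, for illustration only, that under a trichotomy each of the three topologies is equally likely (probability 1/3) for any one data set and that the 14 data sets are independent, and it asks how probable it would be for one topology to be supported by 11 or more of them. The function name `binomial_tail` is hypothetical.

```python
from fractions import Fraction
from math import comb


def binomial_tail(n: int, k_min: int, p: Fraction) -> Fraction:
    """Exact P(X >= k_min) for X ~ Binomial(n, p)."""
    return sum(
        (comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(k_min, n + 1)),
        Fraction(0),
    )


# Counts taken from the abstract: 14 independent data sets, 11 of which
# support the human-chimpanzee clade.
N_DATASETS = 14
N_SUPPORTING = 11

# Assumption (illustrative null model, not from the paper): under a
# trichotomy each of the three topologies is equally likely per data set.
P_NULL = Fraction(1, 3)

# Probability that one pre-specified topology gets >= 11 of 14 data sets.
p_specific = binomial_tail(N_DATASETS, N_SUPPORTING, P_NULL)

# Probability that any one of the three topologies does so. The three
# events cannot co-occur (11 > 14 / 2), so their probabilities add.
p_any = 3 * p_specific

if __name__ == "__main__":
    print(f"P(specific topology >= 11/14) = {float(p_specific):.6f}")
    print(f"P(any topology >= 11/14)      = {float(p_any):.6f}")
```

Under these assumptions both probabilities fall well below 0.01, which agrees in direction with the abstract's conclusion; the published P value comes from a different, likelihood-ratio-based test.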
20.
In DNA microarray studies, gene-set analysis (GSA) has become the focus of gene expression data analysis. GSA utilizes the gene expression profiles of functionally related gene sets in Gene Ontology (GO) categories or a priori defined biological classes to assess the significance of gene sets associated with clinical outcomes or phenotypes. Many statistical approaches have been proposed to determine whether such functionally related gene sets are differentially expressed (enrichment and/or deletion) across phenotypes. However, little attention has been given to the discriminatory power of gene sets and classification of patients.