Similar literature
Found 20 similar documents; search took 15 ms.
1.

Background

Thanks to the large amount of signal contained in genome-wide sequence alignments, phylogenomic analyses are converging towards highly supported trees. However, high statistical support does not imply that the tree is accurate. Systematic errors, such as the Long Branch Attraction (LBA) artefact, can be misleading, in particular when the taxon sampling is poor, or the outgroup is distant. In an otherwise consistent probabilistic framework, systematic errors in genome-wide analyses can be traced back to model mis-specification problems, which suggests that better models of sequence evolution should be devised, that would be more robust to tree reconstruction artefacts, even under the most challenging conditions.

Methods

We focus on a well characterized LBA artefact analyzed in a previous phylogenomic study of the metazoan tree, in which two fast-evolving animal phyla, nematodes and platyhelminths, emerge either at the base of all other Bilateria, or within protostomes, depending on the outgroup. We use this artefactual result as a case study for comparing the robustness of two alternative models: a standard, site-homogeneous model, based on an empirical matrix of amino-acid replacement (WAG), and a site-heterogeneous mixture model (CAT). In parallel, we propose a posterior predictive test, allowing one to measure how well a model acknowledges sequence saturation.

Results

Adopting a Bayesian framework, we show that the LBA artefact observed under WAG disappears when the site-heterogeneous model CAT is used. Using cross-validation, we further demonstrate that CAT has a better statistical fit than WAG on this data set. Finally, using our statistical goodness-of-fit test, we show that CAT, but not WAG, correctly accounts for the overall level of saturation, and that this is due to a better estimation of site-specific amino-acid preferences.

Conclusion

The CAT model appears to be more robust than WAG against LBA artefacts, essentially because it correctly anticipates the high probability of convergences and reversions implied by the small effective size of the amino-acid alphabet at each site of the alignment. More generally, our results provide strong evidence that site-specificities in the substitution process need to be accounted for in order to obtain more reliable phylogenetic trees.
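
The abstract does not say which statistic its posterior predictive test uses. As a rough sketch only, assuming the statistic is the mean number of distinct amino acids observed per alignment column (a common way to summarise the "effective size of the amino-acid alphabet"), the comparison between observed and simulated data could look like this in Python. The function names and the treatment of gap characters are illustrative assumptions, not the authors' implementation.

```python
from statistics import mean

# Characters treated as missing data rather than amino acids (an assumption).
MISSING = set("-?Xx")


def site_diversity(alignment):
    """Return the mean number of distinct amino acids per alignment column."""
    if not alignment or len(alignment[0]) == 0:
        raise ValueError("alignment must contain at least one non-empty sequence")
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all sequences must have the same length")
    per_site = []
    for column in zip(*alignment):
        # Collect the distinct residues seen at this site, ignoring missing data.
        states = {aa.upper() for aa in column if aa not in MISSING}
        per_site.append(len(states))
    return mean(per_site)


def posterior_predictive_p(observed, simulated):
    """Fraction of simulated statistics that are at or above the observed one."""
    if not simulated:
        raise ValueError("at least one simulated value is required")
    return sum(s >= observed for s in simulated) / len(simulated)
```

In such a test the simulated values would come from alignments generated under the fitted model (WAG or CAT); a fraction close to 0 or 1 would suggest that the model does not reproduce the observed per-site diversity.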

2.
Reconstructing the phylogeny of sponges (Porifera) is one of the remaining challenges to resolve the metazoan Tree of Life and is a prerequisite for understanding early animal evolution. Molecular phylogenetic analyses for two of the three extant classes of the phylum, Demospongiae and Calcarea, are largely incongruent with traditional classifications, most likely because of a paucity of informative morphological characters and high levels of homoplasy. For the third class, Hexactinellida (glass sponges), predominantly deep-sea inhabitants with unusual morphology and biology, we present the first molecular phylogeny, along with a cladistic analysis of morphological characters. We collected 18S, 28S, and mitochondrial 16S ribosomal DNA sequences of 34 glass sponge species from 27 genera, 9 families, and 3 orders and conducted partitioned Bayesian analyses using RNA secondary structure-specific substitution models (paired-sites models) for stem regions. Bayes factor comparisons of different paired-sites models against each other and conventional (independent-sites) models revealed a significantly better fit of the former but, contrary to previous predictions, the least parameter-rich of the tested paired-sites models provided the best fit to our data. In contrast to Demospongiae and Calcarea, our rDNA phylogeny agrees well with the traditional classification and a previously proposed phylogenetic system, which we ascribe to a more informative morphology in Hexactinellida. We find high support for a close relationship of glass sponges and Demospongiae sensu stricto, though the latter may be paraphyletic with respect to Hexactinellida. Homoscleromorpha appears to be the sister group of Calcarea. Contrary to most previous findings from rDNA, we recover Porifera as monophyletic, although support for this clade is low under paired-sites models.

3.
Tetraodontiform fishes (e.g., triggerfishes, boxfishes, pufferfishes, and giant ocean sunfishes) have long been recognized as a monophyletic group. Morphological analyses have resulted in conflicting hypotheses of relationships among the tetraodontiform families. Molecular data from the single-copy nuclear gene RAG1 and from two mitochondrial ribosomal genes, 12S and 16S, were used to test these morphology-based hypotheses. Total evidence (RAG1+12S+16S), RAG1-only, and mitochondrial-only analyses were performed using both maximum parsimony and Bayesian criteria. Total evidence and RAG1-only analyses recover a monophyletic Tetraodontiformes. However, the relationships recovered within the order differ, and none completely conform to previous hypotheses. Analysis of mitochondrial data alone fails to recover a monophyletic Tetraodontiformes and therefore does not support any of the morphology-based topologies. The RAG1 data appear to give the best estimate of tetraodontiform phylogeny, resulting in many strongly supported nodes and showing a high degree of congruence between both parsimony and Bayesian analyses. All analyses recover every tetraodontiform family for which more than one representative is included as a strongly supported monophyletic group. Balistidae and Monacanthidae are recovered as sister groups with robust support in every analysis, and all analyses except the Bayesian analyses of the mitochondrial data alone recover a strongly supported sister-group relationship between Tetraodontidae and Diodontidae. Many of the intrafamilial relationships recovered from the molecular data presented here corroborate previous morphological hypotheses.

4.
An important issue in population ecology is to disentangle different density-dependent mechanisms that may limit or regulate animal populations. This goal is further complicated when studying long-lived species for which experimental approaches are not feasible, in which case density-dependence hypotheses are tested using long-term monitored populations. Here we respond to some criticisms and identify additional problems associated with these kinds of observational studies. Current caveats are related to the temporal and spatial scales covered by population monitoring data, which may call into question its suitability for density-dependence tests, and to statistical flaws such as the incorrect control for confounding variables, low statistical power, the distribution of demographic variables, the interpretation of spurious correlations, and the often used stepwise series of univariate analyses. Generalised linear mixed models are recommended over other more traditional approaches, since they help to solve the above statistical problems and, more importantly, allow several hypotheses to be tested properly at the same time. Finally, several management actions aimed at recovering endangered species, such as supplementary feeding, might be considered as field experiments for further testing density-dependence hypotheses in long-lived study models. We expect these opportunities, together with the most adequate statistical tools now available, will help to improve our understanding of density-dependent effects in wild populations.

5.
A probabilistic generative model for GO enrichment analysis   Total citations: 1 (self-citations: 0; citations by others: 1)
The Gene Ontology (GO) is extensively used to analyze all types of high-throughput experiments. However, researchers still face several challenges when using GO and other functional annotation databases. One problem is the large number of multiple hypotheses that are being tested for each study. In addition, categories often overlap with both direct parents/descendants and other distant categories in the hierarchical structure. This makes it hard to determine if the identified significant categories represent different functional outcomes or rather a redundant view of the same biological processes. To overcome these problems we developed a generative probabilistic model which identifies a (small) subset of categories that, together, explain the selected gene set. Our model accommodates noise and errors in the selected gene set and GO. Using controlled GO data our method correctly recovered most of the selected categories, leading to dramatic improvements over current methods for GO analysis. When used with microarray expression data and ChIP-chip data from yeast and human our method was able to correctly identify both general and specific enriched categories which were overlooked by other methods.

6.
Genomics, 2020, 112(5): 2970-2977
Here we determined mitogenomes of three Bostrichiformia species. These data were combined with 51 previously sequenced Polyphaga mitogenomes to explore the higher-level relationships within Polyphaga by using four different mitogenomic datasets and three tree inference approaches. Among Polyphaga mitogenomes we observed heterogeneity in nucleotide composition and evolutionary rates, which may have affected phylogenetic inferences across the different mitogenomic datasets. Elateriformia, Cucujiformia, and Scarabaeiformia were each inferred to be monophyletic by all analyses, as was Bostrichiformia by most analyses based on two datasets with low heterogeneity. The large series Staphyliniformia was never recovered as monophyletic in our analyses. The Bayesian tree using a degenerated nucleotide dataset (P123_Degen) and a site-heterogeneous mixture model in PhyloBayes was supported as the best Polyphaga phylogeny: (Scirtiformia, (Elateriformia, ((Bostrichiformia, Cucujiformia), (Scarabaeiformia + Staphyliniformia)))). For Cucujiformia, the largest series, we inferred a superfamily-level phylogeny: ((Cleroidea, Coccinelloidea), (Tenebrionoidea, (Cucujoidea + Curculionoidea + Chrysomeloidea))).

7.
There is much interest in studying animal personalities but considerable debate as to how to define and evaluate them. We assessed the utility of one proposed framework while studying personality in terrestrial hermit crabs (Coenobita clypeatus). We recorded the latency of individuals to emerge from their shells over multiple trials in four unique manipulations. We used the specific testing situations within these manipulations to define two temperament categories (shyness-boldness and exploration-avoidance). Our results identified individual behavioral consistency (i.e., personality) across repeated trials of the same situations, within both categories. Additionally, we found correlations between behaviors across contexts (traits) that suggested that the crabs had behavioral syndromes. While we found some correlations between behaviors that are supposed to measure the same temperament trait, these correlations were not inevitable. Furthermore, a principal component analysis (PCA) of our data revealed new relationships between behaviors and provided the foundation for an alternate interpretation: measured behaviors may be situation-specific, and may not reflect general personality traits at all. These results suggest that more attention must be placed on how we infer personalities from standardized methods, and that we must be careful to not force our data to fit our frameworks.

8.
9.
Animal phylogeny and large-scale sequencing: progress and problems   Total citations: 2 (self-citations: 2; citations by others: 0)
Phylogenomics, the inference of phylogenetic trees using genome-scale data, is becoming the rule for resolving difficult parts of the tree of life. Its promise resides in the large amount of information available, which should eliminate stochastic error. However, systematic error, which is due to limitations of reconstruction methods, is becoming more apparent. We will illustrate, using animal phylogeny as a case study, the three most efficient approaches to avoid the pitfalls of phylogenomics: (1) using a dense taxon sampling, (2) using probabilistic methods with complex models of sequence evolution that more accurately detect multiple substitutions, and (3) removing the fastest evolving part of the data (e.g., species and positions). The analysis of a dataset of 55 animal species and 102 proteins (25712 amino acid positions) shows that standard site-homogeneous model inference is sensitive to long-branch attraction artifact, whereas the site-heterogeneous CAT model is less so. The latter model correctly locates three very fast evolving species, the appendicularian tunicate Oikopleura, the acoel Convoluta and the myxozoan Buddenbrockia. Overall, the resulting tree is in excellent agreement with the new animal phylogeny, confirming that "simple" organisms like platyhelminths and nematodes are not necessarily of basal emergence. This further emphasizes the importance of secondary simplification in animals, and for organismal evolution in general.

10.
Genome-scale data sets result in an enhanced resolution of the phylogenetic inference by reducing stochastic errors. However, there is also an increase of systematic errors due to model violations, which can lead to erroneous phylogenies. Here, we explore the impact of systematic errors on the resolution of the eukaryotic phylogeny using a data set of 143 nuclear-encoded proteins from 37 species. The initial observation was that, despite the impressive amount of data, some branches had no significant statistical support. To demonstrate that this lack of resolution is due to a mutual annihilation of phylogenetic and nonphylogenetic signals, we created a series of data sets with slightly different taxon sampling. As expected, these data sets yielded strongly supported but mutually exclusive trees, thus confirming the presence of conflicting phylogenetic and nonphylogenetic signals in the original data set. To decide on the correct tree, we applied several methods expected to reduce the impact of some kinds of systematic error. Briefly, we show that (i) removing fast-evolving positions, (ii) recoding amino acids into functional categories, and (iii) using a site-heterogeneous mixture model (CAT) are three effective means of increasing the ratio of phylogenetic to nonphylogenetic signal. Finally, our results allow us to formulate guidelines for detecting and overcoming phylogenetic artefacts in genome-scale phylogenetic analyses.

11.
Recent hypotheses argue that phylogenetic relatedness should predict both the niche differences that stabilise coexistence and the average fitness differences that drive competitive dominance. These still largely untested predictions complicate Darwin's hypothesis that more closely related species less easily coexist, and challenge the use of community phylogenetic patterns to infer competition. We field parameterised models of competitor dynamics with pairs of 18 California annual plant species, and then related species' niche and fitness differences to their phylogenetic distance. Stabilising niche differences were unrelated to phylogenetic distance, while species' average fitness showed phylogenetic structure. This meant that more distant relatives had greater competitive asymmetry, which should favour the coexistence of close relatives. Nonetheless, coexistence proved unrelated to phylogeny, due in part to increasing variance in fitness differences with phylogenetic distance, a previously overlooked property of such relationships. Together, these findings question the expectation that distant relatives should more readily coexist.

12.
13.
Neuroptera (lacewings) and allied orders Megaloptera (dobsonflies, alderflies) and Raphidioptera (snakeflies) are predatory insects and together make up the clade Neuropterida. The higher‐level relationships within Neuropterida have historically been widely disputed with multiple competing hypotheses. Moreover, the evolution of important biological innovations among various Neuropterida families, such as the origin, timing and direction of transitions between aquatic and terrestrial habitats of larvae, remains poorly understood. To investigate the origin and diversification of lacewings and their allies, we undertook phylogenetic analyses of mitochondrial genomes of all families of Neuropterida using Bayesian inference, maximum likelihood and maximum parsimony methods. We present a robust, fully resolved phylogeny and divergence time estimation for Neuropterida with strong statistical support for almost all nodes. Mitochondrial sequence data are typified by significant compositional heterogeneity across lineages, and parsimony and models assuming homogeneous rates did not recover Neuroptera as monophyletic. Only a model accounting for compositional heterogeneity (i.e. CAT‐GTR) recovered all orders of Neuropterida as monophyletic. Significant findings of the mitogenomic phylogeny include recovering Raphidioptera as sister to Megaloptera plus Neuroptera. The sister family of all other lacewings are the dusty‐wings (Coniopterygidae), rather than Nevrorthidae. Nevrorthidae are instead returned to their traditional position as the sister group of the spongilla‐flies (Sisyridae) and closely related to Osmylidae. Our divergence time analysis indicates that the Mesozoic was indeed a ‘golden age’ for lacewings, with most families of Neuropterida diverging during the Triassic and Jurassic and all extant families present by the Early Cretaceous. Based on ancestral character state reconstructions of larval habitat we evaluate competing hypotheses regarding the life style of early neuropteridan larvae as either aquatic or terrestrial.

14.
Determining the relationships among the major groups of cellular life is important for understanding the evolution of biological diversity, but is difficult given the enormous time spans involved. In the textbook ‘three domains’ tree based on informational genes, eukaryotes and Archaea share a common ancestor to the exclusion of Bacteria. However, some phylogenetic analyses of the same data have placed eukaryotes within the Archaea, as the nearest relatives of different archaeal lineages. We compared the support for these competing hypotheses using sophisticated phylogenetic methods and an improved sampling of archaeal biodiversity. We also employed both new and existing tests of phylogenetic congruence to explore the level of uncertainty and conflict in the data. Our analyses suggested that much of the observed incongruence is weakly supported or associated with poorly fitting evolutionary models. All of our phylogenetic analyses, whether on small subunit and large subunit ribosomal RNA or concatenated protein-coding genes, recovered a monophyletic group containing eukaryotes and the TACK archaeal superphylum comprising the Thaumarchaeota, Aigarchaeota, Crenarchaeota and Korarchaeota. Hence, while our results provide no support for the iconic three-domain tree of life, they are consistent with an extended eocyte hypothesis whereby vital components of the eukaryotic nuclear lineage originated from within the archaeal radiation.

15.
Different frameworks have been proposed for using molecular data in systematic revisions, but there is ongoing debate on their applicability, merits and shortcomings. In this paper we examine the fit between morphological and molecular data in the systematic revision of Paroaria, a group of conspicuous songbirds endemic to South America. We delimited species based on examination of >600 specimens, and developed distance-gap, and distance- and character-based coalescent simulations to test species limits with molecular data. The morphological and molecular data collected were then analyzed using parsimony, maximum likelihood, and Bayesian phylogenetics. The simulations were better at evaluating the new species limits than using genetic distances. Species diversity within Paroaria had been underestimated by 60%, and the revised genus comprises eight species. Phylogenetic analyses consistently recovered a congruent topology for the most recently derived species in the genus, but the most basal divergences were not resolved with these data. The systematic and phylogenetic hypotheses developed here are relevant to both setting conservation priorities and understanding the biogeography of South America.

16.
Bergmann's rule and the mammal fauna of northern North America   Total citations: 6 (self-citations: 0; citations by others: 6)
The observation that "on the whole… larger species live farther north and the smaller ones farther south" was first published by Carl Bergmann in 1847. However, why animal body mass might show such spatial variation, and indeed whether it is a general feature of animal assemblages, is currently unclear. We discuss reasons for this uncertainty, and use our conclusions to direct an analysis of Bergmann's rule in the mammals in northern North America, in the communities of species occupying areas that were covered by ice at the last glacial maximum. First, we test for the existence of Bergmann's rule in this assemblage, and investigate whether small- and large-bodied species show different spatial patterns of body size variation. We then attempt to explain the spatial variation in terms of environmental variation, and evaluate the adequacy of our analyses to account for the spatial pattern using the residuals arising from our environmental models. Finally, we use the results of these models to test predictions of different hypotheses proposed to account for Bergmann's rule. Bergmann's rule is strongly supported. Both small- and large-bodied species exhibit the rule. Our environmental models account for most of the spatial variation in mean, minimum and maximum body mass in this assemblage. Our results falsify predictions of hypotheses relating to migration ability and random colonisation and diversification, but support predictions of hypotheses relating to both heat conservation and starvation resistance.

17.
Aim

To demonstrate that parsimony analysis of endemicity (PAE) is not analogous to a cladistic biogeographical analysis.

Location

We used six data sets from previously published studies from around the world.

Methods

In order to test the efficiency of PAE in recovering historical relationships among areas, we performed an empirical comparison of nodes recovered with PAE, primary Brooks parsimony analysis (BPA), and an event‐based method using three models (maximum codivergence, reconciled trees, and the default model of the treefitter program) for six data sets. We measured the performance of PAE in recovering historical area relationships by counting the number and examining the content of nodes recovered by PAE and by historical methods. The dispersal/vicariance ratio was calculated to assess the prevalence of dispersal or vicariance in each reconstruction and its relationship to the performance of PAE.

Results

Our results show that PAE recovers an average of 17.25% of historical nodes. PAE and BPA tend to provide similar results; however, in relation to the event‐based models, PAE performance was poor under all the tested scenarios. Although in some cases PAE reconstructions are more resolved than historical reconstructions, this does not necessarily mean that PAE produces more informative answers. These additional nodes correspond to unsupported statements that are based solely on the distributional data of taxa and not on their phylogenetic history. In other words, these nodes were not found by the historical methods, which take phylogenetics into account. The number of historical nodes recovered using PAE was in general negatively correlated with the dispersal/vicariance ratio.

Main conclusions

Our results show that PAE is unable to recover historical patterns and therefore does not fit into the current paradigm of historical biogeography. These findings raise doubts regarding conclusions derived from biogeographical studies that interpret PAE trees as area cladograms. We acknowledge that PAE aims to describe but does not explain the current distribution of organisms. It is therefore a useful tool in other biogeographical or ecological analyses for exploring the distribution of taxa or for establishing hypotheses of primary homology between areas.

18.
Recent interest in the diagnoses of, and relationships between, lineages of the alligator snapping turtle (Macrochelys temminckii) has revealed conflicting patterns of molecular variation across the taxon's range. This study uses geometric morphometric techniques to test molecular hypotheses. We analyse alligator snapping turtle cranial variation amongst populations (i.e. drainages) with the hypothesis that populations of turtles recovered as monophyletic by previous molecular studies are more similar to each other in cranial shape. Dorsal, lateral and ventral cranial shape analyses corroborate the uniqueness of populations recovered by molecular genetic hypotheses. Additionally, analyses reveal near equal separation between drainages that were assigned to monophyletic clades by previous phylogenetic studies. These results reveal the potential for more independent lineages that have yet to be diagnosed, and unique cranial shapes are described for our three most heavily sampled drainages.

19.
Aim

Assessing the relative vulnerability of species within an assemblage to extinction is crucial for conservation planning at the regional scale. Here, we quantify relative vulnerability to extinction, in terms of both resistance and resilience to environmental change, in an assemblage of tropical rainforest vertebrates.

Location

Wet Tropics Bioregion, north Queensland, Australia.

Methods

We collated data on 163 vertebrates that occur in the Australian Wet Tropics, including 24 frogs, 33 reptiles, 19 mammals and 87 birds. We used the ‘seven forms of rarity’ model to assess relative vulnerability or resistance to environmental change. We then develop a new analogous eight‐celled model to assess relative resilience, or potential to recover from environmental perturbation, based on reproductive output, potential for dispersal and climatic niche marginality.

Results

In the rarity model, our assemblage had more species very vulnerable and very resistant than expected by chance. There was a more even distribution of species over the categories in the resilience model. The three traits included in each model were not independent of each other; species that were widespread were also habitat generalists, while species with narrow geographical ranges tended to be locally abundant. In the resilience model, species with low reproductive output had a narrow climatic niche and also a low capacity to disperse. Frogs were the most vulnerable taxonomic group overall. The model categories were compared to current IUCN category of listed species, and the product of the two models was best correlated with IUCN listings.

Main conclusions

The models presented here offer an objective way to predict the resistance of a species to environmental change, and its capacity to recover from disturbance. The new resilience model has similar advantages to the rarity model, in that it uses simple information and is therefore useful for examining patterns in assemblages with many poorly known species.
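
An eight-celled model of this kind can be read as three yes/no traits crossed with each other, giving 2 × 2 × 2 = 8 cells. The Python sketch below only illustrates that bookkeeping. The trait names, the yes/no scoring and the example species are hypothetical, since the abstract does not state how each trait was scored.

```python
from collections import Counter

# Hypothetical trait names: the abstract names the three traits
# but not how each one is turned into a yes/no score.
RESILIENCE_TRAITS = (
    "high_reproductive_output",
    "high_dispersal",
    "central_climatic_niche",
)


def cell_label(flags, trait_names=RESILIENCE_TRAITS):
    """Return a readable label for one of the 2**3 = 8 cells."""
    if len(flags) != len(trait_names):
        raise ValueError("one yes/no score per trait is required")
    return ", ".join(
        f"{name}={'yes' if flag else 'no'}"
        for name, flag in zip(trait_names, flags)
    )


def tally_cells(species_scores):
    """Count how many species fall into each cell."""
    return Counter(tuple(bool(f) for f in flags) for flags in species_scores.values())


# Made-up scores for illustration only; these are not data from the study.
example = {
    "species_a": (True, True, True),
    "species_b": (False, False, False),
    "species_c": (False, False, False),
}
counts = tally_cells(example)
```

Comparing such counts with the numbers expected if the traits were independent is one way to ask whether some cells hold more species than chance would predict.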

20.
Major depression is a relatively common psychiatric disorder that can be quite debilitating. Family, twin, and adoption studies indicate that unipolar depression has both genetic and environmental components. Early age at onset and recurrent episodes in the proband each increase the familiality of the illness. To investigate the potential genetic underpinnings of the disease, we have performed a complex segregation analysis on 832 individuals from 50 multigenerational families ascertained through a proband with early-onset recurrent unipolar major depression. The analysis was conducted by use of regressive models, to test a variety of hypotheses to explain the familial aggregation of recurrent unipolar depression. Analyses were conducted under two alternative definitions of affection status for the relatives of probands: (1) "narrow," in which relatives were assumed to be affected only if they were diagnosed with recurrent unipolar depression; and (2) "broad," in which relatives were assumed to be affected if diagnosed with any major affective illness. Under the narrow-definition assumption, the model that best explains these family data is a transmitted (although non-Mendelian) recessive major effect with significant residual parental effects on affection status. Under the broad-definition assumption, the best-fitting model is a Mendelian codominant major locus with significant residual parental and spousal effects.
