Similar literature
 20 similar articles found
1.
Variations of nucleotidic composition affect phylogenetic inference conducted under stationary models of evolution. In particular, they may cause unrelated taxa sharing similar base composition to be grouped together in the resulting phylogeny. To address this problem, we developed a nonstationary and nonhomogeneous model accounting for compositional biases. Unlike previous nonstationary models, which are branchwise, that is, assume that base composition only changes at the nodes of the tree, in our model, the process of compositional drift is totally uncoupled from the speciation events. In addition, the total number of events of compositional drift distributed across the tree is directly inferred from the data. We implemented the method in a Bayesian framework, relying on Markov Chain Monte Carlo algorithms, and applied it to several nucleotidic data sets. In most cases, the stationarity assumption was rejected in favor of our nonstationary model. In addition, we show that our method is able to resolve a well-known artifact. By Bayes factor evaluation, we compared our model with 2 previously developed nonstationary models. We show that the coupling between speciations and compositional shifts inherent to branchwise models may lead to an overparameterization, resulting in a lesser fit. In some cases, this leads to incorrect conclusions, concerning the nature of the compositional biases. In contrast, our compound model more flexibly adapts its effective number of parameters to the data sets under investigation. Altogether, our results show that accounting for nonstationary sequence evolution may require more elaborate and more flexible models than those currently used.

2.
3.
Statistical studies of gene populations on the purine/pyrimidine alphabet have shown that the mean occurrence probability of the i-motif YRY(N)iYRY (R=purine, Y=pyrimidine, N=R or Y) is not uniform as i varies over the range [1,99], but presents a maximum at i=6 in the following populations: protein coding genes of eukaryotes, prokaryotes, chloroplasts and mitochondria, and also viral introns, ribosomal RNA genes and transfer RNA genes (Arquès and Michel, 1987b, J. theor. Biol. 128, 457–461). From the “universality” of this observation, we suggested that the oligonucleotide YRY(N)6 is a primitive one and that it has a central function in DNA sequence evolution (Arquès and Michel, 1987b, J. theor. Biol. 128, 457–461). Following this idea, we introduce a concept of a model of DNA sequence evolution which will be validated according to a scheme presented in three parts. In the first part, using the latest version of the gene database, the YRY(N)6YRY preferential occurrence (maximum at i=6) is confirmed for the populations mentioned above and is extended to some newly analysed populations: chloroplast introns, chloroplast 5′ regions, mitochondrial 5′ regions and small nuclear RNA genes. On the other hand, the YRY(N)6YRY preferential occurrence and periodicities are used in order to classify 18 gene populations. In the second part, we will demonstrate that several statistical features characterizing different gene populations (in particular the YRY(N)6YRY preferential occurrence and the periodicities) can be retrieved from a simple Markov model based on the mixing of the two oligonucleotides YRY(N)6 and YRY(N)3 and based on the percentages of RYR and YRY in the unspecified trinucleotides (N)3 of YRY(N)6 and YRY(N)3. Several properties are identified and prove in particular that the oligonucleotide mixing is an independent process and that several different features are functions of a unique parameter. In the third part, the return of the model to the reality shows a strong correlation between reality and simulation concerning the presence of large alternating purine/pyrimidine stretches and of periodicities. It also contributes to a greater understanding of biological reality, e.g. the presence or the absence of large alternating purine/pyrimidine stretches can be explained as being a simple consequence of the mixing of two particular oligonucleotides. Finally, we believe that such an approach is the first step toward a unified model of DNA sequence evolution allowing the molecular understanding of both the origin of life and the actual biological reality.
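The occurrence statistic described in this abstract can be illustrated with a short sketch. It assumes the simplest possible estimator (the fraction of start positions at which YRY(N)iYRY matches in a single sequence); the function names `to_ry` and `motif_frequency` are hypothetical, and the authors' actual averaging over gene populations is not reproduced here.

```python
PURINES = set("AG")
PYRIMIDINES = set("CTU")


def to_ry(seq):
    """Map a nucleotide string onto the purine/pyrimidine alphabet (R/Y)."""
    out = []
    for base in seq.upper():
        if base in PURINES:
            out.append("R")
        elif base in PYRIMIDINES:
            out.append("Y")
        else:
            out.append("N")  # ambiguous symbol; it can never match YRY
    return "".join(out)


def motif_frequency(seq, i):
    """Fraction of start positions at which YRY(N)_i YRY occurs in seq."""
    ry = to_ry(seq)
    span = 3 + i + 3  # YRY, then i unspecified bases, then YRY
    positions = len(ry) - span + 1
    if positions <= 0:
        return 0.0
    hits = sum(
        1
        for p in range(positions)
        if ry[p:p + 3] == "YRY" and ry[p + 3 + i:p + span] == "YRY"
    )
    return hits / positions
```

Calling `motif_frequency(seq, i)` for every i from 1 to 99 and plotting the values would show whether a given sequence has its maximum near i=6; averaging over many genes, as in the abstract, is left to the caller.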

4.
Propagule pressure is intuitively a key factor in biological invasions: increased availability of propagules increases the chances of establishment, persistence, naturalization, and invasion. The role of propagule pressure relative to disturbance and various environmental factors is, however, difficult to quantify. We explored the relative importance of factors driving invasions using detailed data on the distribution and percentage cover of alien tree species on South Africa's Agulhas Plain (2,160 km²). Classification trees based on geology, climate, land use, and topography adequately explained distribution but not abundance (canopy cover) of three widespread invasive species (Acacia cyclops, Acacia saligna, and Pinus pinaster). A semimechanistic model was then developed to quantify the roles of propagule pressure and environmental heterogeneity in structuring invasion patterns. The intensity of propagule pressure (approximated by the distance from putative invasion foci) was a much better predictor of canopy cover than any environmental factor that was considered. The influence of environmental factors was then assessed on the residuals of the first model to determine how propagule pressure interacts with environmental factors. The mediating effect of environmental factors was species specific. Models combining propagule pressure and environmental factors successfully predicted more than 70% of the variation in canopy cover for each species.

5.
Abundant representation of sharks in the fossil record makes this group a superb system in which to investigate rates and patterns of molecular evolution and to explore the strengths and weaknesses of phylogenetic inferences from molecular data. In this report, the molecular evolution of the cytochrome b gene in sharks is described and the information related to results from phylogenetic analysis of the data evaluated in the light of a phylogeny derived independently of the molecular data. Across divergent lineages of sharks there is evidence for significant substitution rate variation, departure from compositional equilibrium, and substantial homoplasy; nevertheless, the signal of evolutionary history is evident in patterns of shared transversions and amino acid replacements.

6.
Sampling properties of DNA sequence data in phylogenetic analysis
We inferred phylogenetic trees from individual genes and random samples of nucleotides from the mitochondrial genomes of 10 vertebrates and compared the results to those obtained by analyzing the whole genomes. Individual genes are poor samples in that they infrequently lead to the whole-genome tree. A large number of nucleotide sites is needed to exactly determine the whole-genome tree. A relatively small number of sites, however, often results in a tree close to the whole-genome tree. We found that blocks of contiguous sites were less likely to lead to the whole-genome tree than samples composed of sites drawn individually from throughout the genome. Samples of contiguous sites are not representative of the entire genome, a condition that violates a basic assumption of the bootstrap method as it is applied in phylogenetic studies.
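The two sampling schemes contrasted in this abstract (a contiguous block of sites versus sites drawn individually from across the genome) can be sketched as follows. This is only an illustration under stated assumptions: the alignment is taken to be a dictionary of equal-length strings, and the helper names `scattered_sites`, `contiguous_block` and `subsample` are hypothetical rather than taken from the study.

```python
import random


def scattered_sites(n_sites, k, rng):
    """Pick k distinct column indices drawn individually from the whole alignment."""
    return sorted(rng.sample(range(n_sites), k))


def contiguous_block(n_sites, k, rng):
    """Pick k adjacent column indices starting at a random position."""
    start = rng.randrange(n_sites - k + 1)
    return list(range(start, start + k))


def subsample(alignment, columns):
    """Keep only the chosen columns of every aligned sequence."""
    return {
        name: "".join(seq[c] for c in columns)
        for name, seq in alignment.items()
    }
```

A tree would then be inferred from each subsample with whatever phylogenetic method is in use and compared with the whole-genome tree; that step is outside the scope of this sketch.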

7.
Nonhomogeneous substitution models have been introduced for phylogenetic inference when the substitution process is nonstationary, for example, when sequence composition differs between lineages. Existing models can have many parameters, and it is then difficult and computationally expensive to learn the parameters and to select the optimal model complexity. We extend an existing nonhomogeneous substitution model by introducing a reversible jump Markov chain Monte Carlo method for efficient Bayesian inference of the model order along with other phylogenetic parameters of interest. We also introduce a new hierarchical prior which leads to more reasonable results when only a small number of lineages share a particular substitution process. The method is implemented in the PHASE software, which includes specialized substitution models for RNA genes with conserved secondary structure. We apply an RNA-specific nonhomogeneous model to a structure-based alignment of rRNA sequences spanning the entire tree of life. A previous study of the same genes from a similar set of species found robust evidence for a mesophilic last universal common ancestor (LUCA) by inference of the G+C composition at the root of the tree. In the present study, we find that the helical GC composition at the root is strongly dependent on the root position. With a bacterial rooting, we find that there is no longer strong support for either a mesophile or a thermophile LUCA, although a hyperthermophile LUCA remains unlikely. We discuss reasons why results using only RNA helices may differ from results using all aligned sites when applying nonhomogeneous models to RNA genes.

8.
We assessed the utility of eight DNA sequence markers (5.8S rDNA, 18S rDNA, 28S rDNA, ITS regions, long-wavelength opsin, elongation factor 1-alpha, cytochrome b, and cytochrome oxidase I) in reconstructing phylogenetic relationships at various levels of divergence in gallwasps (Hymenoptera: Cynipidae), using a set of eight exemplar taxa. We report sequence divergence values and saturation levels and compare phylogenetic results of these sequences analyzed both separately and combined to a well-corroborated morphological phylogeny. Likelihood ratio tests were used to find the best evolutionary model fitting each of the markers. The likelihood model best explaining the data is, for most loci, parameter rich, with strong A-T bias for mitochondrial loci and strong rate heterogeneity for the majority of loci. Our data suggest that 28S rDNA, elongation factor 1-alpha, and long-wavelength opsin may be useful markers for the resolution of cynipid and other insect within-family-level divergences (circa 50-100 million years old), whereas mitochondrial loci and ITS regions are most useful for lower-level phylogenetics. In contrast, the 18S rDNA marker is likely to be useful for the resolution of above-family-level relationships.

9.
Slipped-strand mispairing: a major mechanism for DNA sequence evolution
Simple repetitive DNA sequences are a widespread and abundant feature of genomic DNA. The following several features characterize such sequences: (1) they typically consist of a variety of repeated motifs of 1-10 bases--but may include much larger repeats as well; (2) larger repeat units often include shorter ones within them; (3) long polypyrimidine and poly-CA tracts are often found; and (4) tandem arrangements of closely related motifs are often found. We propose that slipped-strand mispairing events, in concert with unequal crossing-over, can readily account for all of these features. The frequent occurrence of long tandem repeats of particular motifs (polypyrimidine and poly-CA tracts) appears to result from nonrandom patterns of nucleotide substitution. We argue that the intrahelical process of slipped-strand mispairing is much more likely to be the major factor in the initial expansion of short repeated motifs and that, after initial expansion, simple tandem repeats may be predisposed to further expansion by unequal crossing-over or other interhelical events because of their propensity to mispair. Evidence is presented that single-base repeats (the shortest possible motifs) are represented by longer runs in mammalian introns than would be expected on a random basis, supporting the idea that SSM may be a ubiquitous force in the evolution of the eukaryotic genome. Simple repetitive sequences may therefore represent a natural ground state of DNA unselected for coding functions.

10.
Variation in 30 chloroplast DNAs, representing 22 wild and cultivated accessions in the genus Pisum, was analyzed by comparing fragment patterns produced by 16 restriction endonucleases. Three types of mutations were detected. First, an inversion of between 2.2 kilobase pairs (kb) and 5.2 kb distinguished a population of P. humile from all other Pisum accessions examined. Second, deletions and insertions of between 50 and 1200 base pairs produced small restriction fragment length variations in four regions of the 120-kb chloroplast genome. Two of these regions (one of which is located within the sequence that is inverted in P. humile) showed a high degree of size polymorphism, to the extent that size differences were detected between individuals from the same accession. Finally, a total of only 11 restriction site mutations were detected among the 165 restriction sites sampled in the 30 DNAs. Based on these results and previous data, we conclude that the chloroplast genome is evolving very slowly relative to nuclear and mitochondrial DNAs. The Pisum chloroplast DNA restriction site mutations define two major lineages: One includes all tested accessions of P. fulvum, which is known to be cytogenetically quite distinct from all other Pisum taxa. The second includes 12 of 13 cultivated lines of the garden pea (P. sativum) and a wild population of P. humile from northern Israel. These observations strongly reinforce an earlier conclusion that the cultivated pea was domesticated primarily from northern populations of P. humile. A 13th P. sativum cultivar has a chloroplast genome that is significantly different from those of the aforementioned lines and somewhat more similar to those of P. elatius and southern populations of P. humile. This observation indicates that secondary hybridization may have occurred during the domestication of the garden pea.

11.
Ribosomal DNA: molecular evolution and phylogenetic inference.
Ribosomal DNA (rDNA) sequences have been aligned and compared in a number of living organisms, and this approach has provided a wealth of information about phylogenetic relationships. Studies of rDNA sequences have been used to infer phylogenetic history across a very broad spectrum, from studies among the basal lineages of life to relationships among closely related species and populations. The reasons for the systematic versatility of rDNA include the numerous rates of evolution among different regions of rDNA (both among and within genes), the presence of many copies of most rDNA sequences per genome, and the pattern of concerted evolution that occurs among repeated copies. These features facilitate the analysis of rDNA by direct RNA sequencing, DNA sequencing (either by cloning or amplification), and restriction enzyme methodologies. Constraints imposed by secondary structure of rRNA and concerted evolution need to be considered in phylogenetic analyses, but these constraints do not appear to impede seriously the usefulness of rDNA. An analysis of aligned sequences of the four nuclear and two mitochondrial rRNA genes identified regions of these genes that are likely to be useful to address phylogenetic problems over a wide range of levels of divergence. In general, the small subunit nuclear sequences appear to be best for elucidating Precambrian divergences, the large subunit nuclear sequences for Paleozoic and Mesozoic divergences, and the organellar sequences of both subunits for Cenozoic divergences. Primer sequences were designed for use in amplifying the entire nuclear rDNA array in 15 sections by use of the polymerase chain reaction; these "universal" primers complement previously described primers for the mitochondrial rRNA genes. Pairs of primers can be selected in conjunction with the analysis of divergence of the rRNA genes to address systematic problems throughout the hierarchy of life.

12.
Most bioinformatics tools require specialized input formats for sequence comparison and analysis. This is particularly true for molecular phylogeny programs, which accept only certain formats. In addition, it is often necessary to eliminate highly similar sequences among the input, especially when the dataset is large. Moreover, most programs have restrictions upon the sequence name. Here we introduce SeqMaT, a Sequence Manipulation Tool. It has the following functions: data format conversion, sequence name coding and decoding, redundant and highly similar sequence removal, and data mining utilities. SeqMaT was developed using Java with two versions, web-based and standalone. A standalone program is convenient to manipulate a large number of sequences, while the web version will guarantee wide availability of the tool for researchers and practitioners throughout the Internet. AVAILABILITY: The database is available for free at http://glee.ist.unomaha.edu/seqmat.

13.
At present, the Tibetan Mastiff is the oldest and most ferocious dog in the world. However, the origin of the Tibetan Mastiff and its phylogenetic relationship with other large breed dogs such as Saint Bernard are unclear. In this study, the primers were designed according to the mitochondrial genome sequence of the domestic dog, and the 2,525 bp mitochondrial sequence, containing the whole sequence of Cytochrome b, tRNA-Thr, tRNA-Pro, and control region of the Tibetan Mastiff, was obtained. Using grey wolves and coyotes as outgroups, the Tibetan Mastiff and 12 breeds of domestic dogs were subjected to phylogenetic analysis. Tibetan Mastiff, domestic dog breeds, and grey wolves were clustered into a group and coyotes were clustered in a group separately. This indicated that the Tibetan Mastiff and the other domestic dogs originated from the grey wolf, and the Tibetan Mastiff belonged to Carnivora, Canidae, Canis, Canis lupus, Canis lupus familiaris in animal taxonomy. In domestic dogs, the middle and small breed dogs were clustered at first; German Sheepdog, Swedish Elkhound, and Black Russian Terrier were clustered into one group, and the Tibetan Mastiff, Old English Sheepdog, Leonberger, and Saint Bernard were clustered in another group. This confirmed the viewpoint that many of the famous large breed dogs worldwide such as Saint Bernard possibly had the blood lineage of the Tibetan Mastiff, based on the molecular data. According to the substitution rate, we concluded that the approximate divergence time between Tibetan Mastiff and grey wolf was 58,000 years before the present (YBP), and the approximate divergence time between other domestic dogs and grey wolf was 42,000 YBP, demonstrating that the time of origin of the Tibetan Mastiff was earlier than that of the other domestic dogs.

14.
15.
Life history evolution in Serpulimorph polychaetes: a phylogenetic analysis
The widely accepted hypothesis of plesiomorphy of planktotrophic, and apomorphy of lecithotrophic, larval development in marine invertebrates has been recently challenged as a result of phylogenetic analyses of various taxa. Here the evolution of planktotrophy and lecithotrophy in Serpulimorph polychaetes (families Serpulidae and Spirorbidae) was studied using a hypothesis of phylogenetic relationships in this group. A phylogenetic (parsimony) analysis of 36 characters (34 morphological, 2 developmental) was performed for 12 selected serpulid and 6 spirorbid species with known reproductive/developmental strategies. Four species of Sabellidae were used in the outgroup. The analysis yielded 4 equally parsimonious trees of 78 steps, with a consistency index (CI) of 0.654 (CI excluding uninformative characters is 0.625). Under the assumption of unweighted parsimony analysis, planktotrophic larvae are apomorphic and non-feeding brooded embryos are plesiomorphic in serpulimorph polychaetes. The estimated polarity of life history transitions may be strengthened by further studies demonstrating an absence of a unidirectional bias in planktotrophy-lecithotrophy transition in polychaetes.
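For readers unfamiliar with the consistency index quoted above, the sketch below shows the standard ensemble definition, CI = (minimum possible number of steps) / (observed number of steps). The value of 51 minimum steps used in the usage note is an assumption inferred from the reported figures (it is the only integer that gives 0.654 with 78 steps); it is not stated in the abstract, and `consistency_index` is a hypothetical helper name.

```python
def consistency_index(min_steps, observed_steps):
    """Ensemble consistency index: minimum possible steps / observed tree length."""
    if observed_steps <= 0:
        raise ValueError("observed_steps must be positive")
    if min_steps < 0 or min_steps > observed_steps:
        raise ValueError("min_steps must lie between 0 and observed_steps")
    return min_steps / observed_steps
```

Under the assumed minimum of 51 steps, `round(consistency_index(51, 78), 3)` reproduces the reported 0.654; a CI of 1.0 would indicate a tree with no homoplasy.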

16.
VOSTORG is a new, versatile package of programs for the inference and presentation of phylogenetic trees, as well as an efficient tool for nucleotide (nt) and amino acid (aa) sequence analysis (sequence input, verification, alignment, construction of consensus, etc.). On appropriately equipped systems, these data can be displayed on a video monitor or printed as required. They are implemented on IBM PC/XT/AT/PS-2 or compatible computers and hardware graphic support is recommended. The package is designed to be easily handled by occasional computer users and yet it is powerful enough for experienced professionals.

17.
Passerine birds comprise over half of avian diversity, but have proved difficult to classify. Despite a long history of work on this group, no comprehensive hypothesis of passerine family-level relationships was available until recent analyses of DNA-DNA hybridization data. Unfortunately, given the value of such a hypothesis in comparative studies of passerine ecology and behaviour, the DNA-hybridization results have not been well tested using independent data and analytical approaches. Therefore, we analysed nucleotide sequence variation at the nuclear RAG-1 and c-mos genes from 69 passerine taxa, including representatives of most currently recognized families. In contradiction to previous DNA-hybridization studies, our analyses suggest paraphyly of suboscine passerines because the suboscine New Zealand wren Acanthisitta was found to be sister to all other passerines. Additionally, we reconstructed the parvorder Corvida as a basal paraphyletic grade within the oscine passerines. Finally, we found strong evidence that several family-level taxa are misplaced in the hybridization results, including the Alaudidae, Irenidae, and Melanocharitidae. The hypothesis of relationships we present here suggests that the oscine passerines arose on the Australian continental plate while it was isolated by oceanic barriers and that a major northern radiation of oscines (i.e. the parvorder Passerida) originated subsequent to dispersal from the south.

18.
Voltage-sensitive cation-selective ion channels of the voltage-gated ion channel (VGC) superfamily were examined by a combination of sequence alignment and phylogenetic tree construction procedures. Segments of the alpha-subunits of K+-selective channels homologous to the structurally elucidated KcsA channel of Streptomyces lividans were multiply aligned, and this alignment provided the database for computer-assisted structural analyses and phylogenetic tree construction. Similar analyses were conducted with the four homologous repeats of the alpha-subunits from representative Ca2+- and Na+-selective channels, as well as with the ensemble of K+, Ca2+ and Na+ channels. In both the single subunit of the K+ channels and the individual repeats of the Ca2+ and Na+ channels, the analyses suggest the occurrence of at least two tandemly arranged modules corresponding to the predicted voltage-sensor domain and the pore domain. The phylogenetic analyses reveal strict clustering of segments according to cation-selectivity and repeat unit. We surmise that the pore module of the prokaryotic K+ channel was the primordial polypeptide upon which other modules were superimposed during evolution in order to generate phenotypic diversity. These observations may prove applicable to all members of the VGC family yet to be discovered throughout the prokaryotic and eukaryotic kingdoms.

19.
Savill NJ, Hoyle DC, Higgs PG. Genetics 2001, 157(1):399-411
We test models for the evolution of helical regions of RNA sequences, where the base pairing constraint leads to correlated compensatory substitutions occurring on either side of the pair. These models are of three types: 6-state models include only the four Watson-Crick pairs plus GU and UG; 7-state models include a single mismatch state that combines all of the 10 possible mismatches; 16-state models treat all mismatch states separately. We analyzed a set of eubacterial ribosomal RNA sequences with a well-established phylogenetic tree structure. For each model, the maximum-likelihood values of the parameters were obtained. The models were compared using the Akaike information criterion, the likelihood-ratio test, and Cox's test. With a high significance level, models that permit a nonzero rate of double substitutions performed better than those that assume zero double substitution rate. Some models assume symmetry between GC and CG, between AU and UA, and between GU and UG. Models that relaxed this symmetry assumption performed slightly better, but the tests did not all agree on the significance level. The most general time-reversible model significantly outperformed any of the simplifications. We consider the relative merits of all these models for molecular phylogenetics.
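The three state spaces described in this abstract, and two of the comparison statistics it names, can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the names `pair_state`, `aic` and `lrt_statistic` are hypothetical, the label "MM" for the pooled mismatch state is an arbitrary choice, and returning `None` for a mismatch under a 6-state model is an assumption about how a caller might want to handle pairs that model cannot represent.

```python
CANONICAL_PAIRS = ("AU", "UA", "GC", "CG", "GU", "UG")
BASES = "ACGU"


def pair_state(left, right, n_states):
    """Map the two bases of a helix position to a state of a 6-, 7- or 16-state model."""
    if n_states not in (6, 7, 16):
        raise ValueError("n_states must be 6, 7 or 16")
    pair = (left + right).upper().replace("T", "U")
    if len(pair) != 2 or any(b not in BASES for b in pair):
        raise ValueError(f"not a nucleotide pair: {pair!r}")
    if n_states == 16 or pair in CANONICAL_PAIRS:
        return pair  # 16-state models keep every pair distinct
    # Mismatch: pooled into one state (7-state) or not representable (6-state).
    return "MM" if n_states == 7 else None


def aic(log_likelihood, n_params):
    """Akaike information criterion, 2k - 2 ln L; lower values are preferred."""
    return 2 * n_params - 2 * log_likelihood


def lrt_statistic(lnl_simple, lnl_general):
    """Likelihood-ratio statistic 2 (ln L_general - ln L_simple) for nested models."""
    return 2 * (lnl_general - lnl_simple)
```

For nested models the likelihood-ratio statistic is usually compared against a chi-squared distribution whose degrees of freedom equal the difference in parameter counts; Cox's test, which the abstract also uses for non-nested comparisons, requires simulation and is not sketched here.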

20.
gm: a practical tool for automating DNA sequence analysis
The gm (gene modeler) program automates the identification of candidate genes in anonymous, genomic DNA sequence data. gm accepts sequence data, organism-specific consensus matrices and codon asymmetry tables, and a set of parameters as input; it returns a set of models describing the structures of candidate genes in the sequence and a corresponding set of predicted amino acid sequences as output. gm is implemented in C, and has been tested on Sun, VAX, Sequent, MIPS and Cray computers. It is capable of analyzing sequences of several kilobases containing multi-exon genes in >1 min execution time on a Sun 4/60. Received on December 4, 1989; accepted on February 28, 1990
