Found 20 similar articles (search time: 15 ms)
1.
Jon P. Anderson, Allen G. Rodrigo, Gerald H. Learn, Yang Wang, Hillard Weinstock, Marcia L. Kalish, Kenneth E. Robbins, Leroy Hood, James I. Mullins 《Journal of molecular evolution》2001,53(1):55-62
Phylogenetic analyses frequently rely on models of sequence evolution that detail nucleotide substitution rates, nucleotide
frequencies, and site-to-site rate heterogeneity. These models can influence hypothesis testing and can affect the accuracy
of phylogenetic inferences. Maximum likelihood methods of simultaneously constructing phylogenetic tree topologies and estimating
model parameters are computationally intensive, and are not feasible for sample sizes of 25 or greater using personal computers.
Techniques that initially construct a tree topology and then use this non-maximized topology to estimate ML substitution rates,
however, can quickly arrive at a model of sequence evolution. The accuracy of this two-step estimation technique was tested
using simulated data sets with known model parameters. The results showed that for a star-like topology, as is often seen
in human immunodeficiency virus type 1 (HIV-1) subtype B sequences, a random starting topology could produce nucleotide substitution
rates that were not statistically different from the true rates. Samples were isolated from 100 HIV-1 subtype B-infected individuals
from the United States, and a 620 nt region of the env gene was sequenced for each sample. The sequence data were used to obtain a model of sequence evolution specific
for HIV-1 subtype B env by estimating nucleotide substitution rates and site-to-site rate heterogeneity.
The method of estimating the model should provide users of large data sets with a way to quickly compute a model of sequence
evolution, while the nucleotide substitution model we identified should prove useful in the phylogenetic analysis of HIV-1
subtype B env sequences.
Received: 4 October 2000 / Accepted: 1 March 2001
2.
The study of rates of nucleotide substitution in RNA viruses is central to our understanding of their evolution. Herein we
report a comprehensive analysis of substitution rates in 50 RNA viruses using a recently developed maximum likelihood phylogenetic
method. This analysis revealed a significant relationship between genetic divergence and isolation time for an extensive array
of RNA viruses, although more rate variation was usually present among lineages than would be expected under the constraints
of a molecular clock. Despite the lack of a molecular clock, the range of statistically significant variation in overall substitution
rates was surprisingly narrow for those viruses where a significant relationship between genetic divergence and time was found,
as was the case when synonymous sites were considered alone, where the molecular clock was rejected less frequently. An analysis
of the ecological and genetic factors that might explain this rate variation revealed some evidence of significantly lower
substitution rates in vector-borne viruses, as well as a weak correlation between rate and genome length. Finally, a simulation
study revealed that our maximum likelihood estimates of substitution rates are valid, even if the molecular clock is rejected,
provided that sufficiently large data sets are analyzed.
Received: 23 February 2001 / Accepted: 3 July 2001
3.
Dorota Szczepanik, Paweł Mackiewicz, Maria Kowalczuk, Agnieszka Gierlik, Aleksandra Nowicka, Mirosław R. Dudek, Stanisław Cebrat 《Journal of molecular evolution》2001,52(5):426-433
One of the main causes of bacterial chromosome asymmetry is replication-associated mutational pressure. Different rates of
nucleotide substitution accumulation on leading and lagging strands implicate qualitative and quantitative differences in
the accumulation of mutations in protein coding sequences lying on different DNA strands. We show that the divergence rate
of orthologs situated on leading strands is lower than the divergence rate of those situated on lagging strands. The ratio
of the mutation accumulation rate for sequences lying on lagging strands to that of sequences lying on leading strands is
rather stable and time-independent. The divergence rate of sequences which changed their positions, with respect to the direction
of replication fork movement, is not stable—sequences which have recently changed their positions are the most prone to mutation
accumulation. This effect may influence estimations of evolutionary distances between species and the topology of phylogenetic
trees.
Received: 24 July 2000 / Accepted: 16 January 2001
4.
Molecular evolution of nitrate reductase genes (cited by: 9; self-citations: 0; citations by others: 9)
To understand the evolutionary mechanisms and relationships of nitrate reductases (NRs), the nucleotide sequences encoding
19 nitrate reductase (NR) genes from 16 species of fungi, algae, and higher plants were analyzed. The NR genes examined show
substantial sequence similarity, particularly within functional domains, and large variations in GC content at the third codon
position and intron number. The intron positions were different between the fungi and plants, but conserved within these groups.
The overall and nonsynonymous substitution rates among fungi, algae, and higher plants were estimated to be 4.33 × 10^−10 and 3.29 × 10^−10 substitutions per site per year, respectively. The three functional domains of NR genes evolved at about one-third of the rate of the N-terminal
and the two hinge regions connecting the functional domains. Relative rate tests suggested that the nonsynonymous substitution
rates were constant among different lineages, while the overall nucleotide substitution rates varied between some lineages.
The phylogenetic trees based on NR genes correspond well with the phylogeny of the organisms determined from systematics and
other molecular studies. Based on the nonsynonymous substitution rate, the divergence time of monocots and dicots was estimated
to be about 340 Myr when the fungi–plant or algae–higher plant divergence times were used as reference points and 191 Myr
when the rice–barley divergence time was used as a reference point. These two estimates are consistent with other estimates
of divergence times based on these reference points. The lack of consistency between these two values appears to be due to
the uncertainty of the reference times.
Received: 10 April 1995 / Accepted: 10 September 1995
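The dating arithmetic described in the abstract above can be sketched as follows. This is a minimal illustration, assuming a strict molecular clock in which the pairwise distance d between two lineages accumulates as d = 2rT. The function names and every numeric input are hypothetical and are not taken from the paper.

```python
def rate_from_calibration(distance, divergence_time_years):
    """Substitution rate (per site per year) implied by a calibration point.

    Assumes a strict clock: both lineages accumulate change, so d = 2 * r * T.
    """
    if divergence_time_years <= 0:
        raise ValueError("divergence time must be positive")
    return distance / (2.0 * divergence_time_years)


def divergence_time(distance, rate_per_site_per_year):
    """Divergence time (years) for a pairwise distance under the same clock."""
    if rate_per_site_per_year <= 0:
        raise ValueError("rate must be positive")
    return distance / (2.0 * rate_per_site_per_year)


# Illustrative numbers only (not values from the study):
# a reference pair with distance 0.66 assumed to have split 1.0e9 years ago.
calib_rate = rate_from_calibration(0.66, 1.0e9)

# Date a second split whose observed distance is 0.22.
estimated_years = divergence_time(0.22, calib_rate)
print(f"rate = {calib_rate:.3e} per site per year")
print(f"estimated divergence = {estimated_years / 1e6:.0f} Myr")
```

Because the estimated date scales linearly with the assumed calibration time, two reference points of differing reliability can yield quite different dates for the same split, which is the kind of discrepancy the abstract attributes to uncertainty in the reference times.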
5.
Mitochondrial DNA (mtDNA) sequences are widely used for inferring the phylogenetic relationships among species. Clearly,
the assumed model of nucleotide or amino acid substitution used should be as realistic as possible. Dependence among neighboring
nucleotides in a codon complicates modeling of nucleotide substitutions in protein-encoding genes. It seems preferable to
model amino acid substitution rather than nucleotide substitution. Therefore, we present a transition probability matrix of
the general reversible Markov model of amino acid substitution for mtDNA-encoded proteins. The matrix is estimated by the
maximum likelihood (ML) method from the complete sequence data of mtDNA from 20 vertebrate species. This matrix represents
the substitution pattern of the mtDNA-encoded proteins and shows some differences from the matrix estimated from the nuclear-encoded
proteins. The use of this matrix would be recommended in inferring trees from mtDNA-encoded protein sequences by the ML method.
Received: 3 May 1995 / Accepted: 31 October 1995
6.
David Posada 《Journal of molecular evolution》2001,52(5):434-444
Models of sequence evolution play an important role in molecular evolutionary studies. The use of inappropriate models of
evolution may bias the results of the analysis and lead to erroneous conclusions. Several procedures for selecting the best-fit
model of evolution for the data at hand have been proposed, like the likelihood ratio test (LRT) and the Akaike (AIC) and
Bayesian (BIC) information criteria. The relative performance of these model-selecting algorithms has not yet been studied
under a range of different model trees. In this study, the influence of branch length variation upon model selection is characterized.
This is done by simulating sequence alignments under a known model of nucleotide substitution, and recording how often this
true model is recovered by different model-fitting strategies. Results of this study agree with previous simulations and suggest
that model selection is reasonably accurate. However, different model selection methods showed distinct levels of accuracy.
Some LRT approaches showed better performance than the AIC or BIC information criteria. Within the LRTs, model selection is
affected by the complexity of the initial model selected for the comparisons, and only slightly by the order in which different
parameters are added to the model. A specific hierarchy of LRTs, which starts from a simple model of evolution, performed
overall better than other possible LRT hierarchies, or than the AIC or BIC.
Received: 2 October 2000 / Accepted: 4 January 2001
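The three model-selection criteria named in the abstract above can be illustrated with a short sketch. It uses the standard textbook definitions of AIC, BIC, and the likelihood ratio statistic. The log-likelihood values, the alignment length, and the choice of comparing a model with no free substitution parameters against one with a single extra parameter are assumptions made for illustration, not results from the study. Parameters shared by both models, such as branch lengths, are left out of the count here because they do not change the comparison.

```python
import math


def aic(log_lik, k):
    """Akaike information criterion: 2k - 2 ln L (lower is better)."""
    return 2 * k - 2 * log_lik


def bic(log_lik, k, n):
    """Bayesian information criterion: k ln n - 2 ln L (lower is better)."""
    return k * math.log(n) - 2 * log_lik


def lrt_statistic(log_lik_null, log_lik_alt):
    """Likelihood ratio statistic 2 (ln L_alt - ln L_null) for nested models."""
    return 2 * (log_lik_alt - log_lik_null)


def chi2_sf_df1(x):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom.

    Uses the exact identity P(X > x) = erfc(sqrt(x / 2)) for df = 1.
    """
    return math.erfc(math.sqrt(x / 2.0))


# Hypothetical fits of two nested models to the same alignment.
n_sites = 1000                      # assumed alignment length
lnl_simple, k_simple = -5230.4, 0   # simpler (null) model
lnl_rich, k_rich = -5198.7, 1       # one extra free parameter

stat = lrt_statistic(lnl_simple, lnl_rich)
print("LRT statistic:", stat, "p-value:", chi2_sf_df1(stat))
print("AIC:", aic(lnl_simple, k_simple), aic(lnl_rich, k_rich))
print("BIC:", bic(lnl_simple, k_simple, n_sites), bic(lnl_rich, k_rich, n_sites))
```

A hierarchy of LRTs, as discussed in the abstract, would repeat the pairwise test while adding one parameter at a time; the sketch shows only a single step of such a hierarchy.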
7.
Suzuki Y, Katayama K, Fukushi S, Kageyama T, Oya A, Okamura H, Tanaka Y, Mizokami M, Gojobori T 《Journal of molecular evolution》1999,48(4):383-389
With the aim of elucidating evolutionary features of GB virus C/hepatitis G virus (GBV-C/HGV), molecular evolutionary analyses
were conducted using the entire coding region of this virus. In particular, the rate of nucleotide substitution for this virus
was estimated to be less than 9.0 × 10^−6 per site per year, which was much slower than those for other RNA viruses. The phylogenetic tree reconstructed for GBV-C/HGV,
by using GB virus A (GBV-A) as outgroup, indicated that there were three major clusters (the HG, GB, and Asian types) in GBV-C/HGV,
and the divergence between the ancestor of GB- and Asian-type strains and that of HG-type strains first took place more than
7000–10,000 years ago. The slow evolutionary rate for GBV-C/HGV suggested that this virus cannot escape from the immune response
of the host by means of producing escape mutants, implying that it may have evolved other systems for persistent infection.
Received: 2 June 1998 / Accepted: 8 August 1998
8.
We document the phylogenetic behavior of the 18S rRNA molecule in 67 taxa from 28 metazoan phyla and assess the effects of
among-site rate variation on reconstructing phylogenies of the animal kingdom. This empirical assessment was undertaken to
clarify further the limits of resolution of the 18S rRNA gene as a phylogenetic marker and to address the question of whether
18S rRNA phylogenies can be used as a source of evidence to infer the reality of a Cambrian explosion. A notable degree of
among-site rate variation exists between different regions of the 18S rRNA molecule, as well as within all classes of secondary
structure. There is a significant negative correlation between inferred number of nucleotide substitutions and phylogenetic
information, as well as with the degree of substitutional saturation within the molecule. Base compositional differences both
within and between taxa exist and, in certain lineages, may be associated with long branches and phylogenetic position. Importantly,
excluding sites with different degrees of nucleotide substitution significantly influences the topology and degree of resolution
of maximum-parsimony phylogenies as well as neighbor-joining phylogenies (corrected and uncorrected for among-site rate variation)
reconstructed at the metazoan scale. Together, these data indicate that the 18S rRNA molecule is an unsuitable candidate for
reconstructing the evolutionary history of all metazoan phyla, and that the polytomies, i.e., unresolved nodes within 18S
rRNA phylogenies, cannot be used as a single or reliable source of evidence to support the hypothesis of a Cambrian explosion.
Received: 9 December 1997 / Accepted: 23 March 1998
9.
Aurora M. Nedelcu 《Journal of molecular evolution》2001,53(6):670-679
This study provides a phylogenetic/comparative approach to deciphering the processes underlying the evolution of plastid
rRNA genes in genomes under relaxed functional constraints. Nonphotosynthetic green algal taxa that belong to two distinct
classes, Chlorophyceae (Polytoma) and Trebouxiophyceae (Prototheca), were investigated. Similar to the situation described previously for plastid 16S rRNA genes in nonphotosynthetic land plants,
nucleotide substitution levels, extent of structural variations, and percentage AT values are increased in nonphotosynthetic
green algae compared to their closest photosynthetic relatives. However, the mutational processes appear to be different in
many respects. First, with the increase in AT content, more transversions are noted in Polytoma and holoparasite angiosperms, while more transitions characterize the evolution of the 16S rDNA sequences in Prototheca. Second, although structural variations do accumulate in both Polytoma and Prototheca (as well as holoparasitic plastid 16S rRNAs), insertions as large as 1.6 kb characterize the plastid 16S rRNA genes in the
former, whereas significantly smaller indels (not exceeding 24 bp) seem to be more prevalent in the latter group. The differences
in evolutionary rates and patterns within and between lineages might be due to mutations in replication/repair-related genes;
slipped-strand mispairing is likely the mechanism responsible for the expansion of insertions in Polytoma plastid 16S rRNA genes.
Received: 29 December 2000 / Accepted: 18 May 2001
10.
We investigated the occurrence of gene conversions between paralogous sequences of Salmoninae derived from ancestral tetraploidization
and their effect on the evolutionary history of DNA sequences. A microsatellite with long flanking regions (750 bp) including
both coding and noncoding sequences was analyzed. Microsatellite size polymorphism was used to detect the alleles of both
paralogous counterparts and infer linkage arrangement between loci. DNA sequencing of seven Salmoninae species revealed that
paralogous sequences were highly differentiated within species, especially for noncoding regions. Ten gene conversion events
between paralogous sequences were inferred. While these events appear to have homogenized regions of otherwise highly differentiated
paralogous sequences, they amplified the differentiation among orthologous sequences. Their effects were larger on coding
than on noncoding regions. As a consequence, noncoding sequences grouped by orthologous lineages in phylogenetic trees, whereas
coding regions grouped by taxa. Based upon these results, we present a model showing how gene conversion events may also result
in the PCR amplification of nonorthologous sequences in different taxa, with obvious complications for phylogenetic inferences,
comparative mapping, and population genetic studies.
Received: 11 October 2000 / Accepted: 18 September 2001
11.
Graziano Pesole, Luigi R. Ceci, Carmela Gissi, Cecilia Saccone, Carla Quagliariello 《Journal of molecular evolution》1996,43(5):447-452
We have analyzed the nad3-rps12 locus for eight angiosperms in order to compare the utility of mitochondrial DNA and edited mRNA sequences in phylogenetic
reconstruction. The two coding regions, containing from 25 to 35 editing sites in the various plants, have been concatenated
in order to increase the significance of the analysis. Differing from the corresponding chloroplast sequences, unedited mitochondrial
DNA sequences seem to evolve under a quasi-neutral substitution process which undifferentiates the nucleotide substitution
rates for the three codon positions. By using complete gene sequences (all codon positions) we found that genomic sequences
provide a classical angiosperm phylogenetic tree with a clear-cut grouping of monocotyledons and dicotyledons with Magnoliidae
at the basal branch of the tree. Conversely, owing to their low nucleotide substitution rates, edited mRNA sequences were
found not to be suitable for studying phylogenetic relationships among angiosperms.
Received: 24 January 1996 / Accepted: 5 June 1996
12.
We have investigated the phylogenetic relationships of monotremes and marsupials using nucleotide sequence data from the
neurotrophins; nerve growth factor (NGF), brain-derived neurotrophic factor (BDNF), and neurotrophin-3 (NT-3). The study included
species representing monotremes, Australasian marsupials and placentals, as well as species representing birds, reptiles,
and fish. PCR was used to amplify fragments encoding parts of the neurotrophin genes from echidna, platypus, and eight marsupials
from four different orders. Phylogenetic trees were generated using parsimony analysis, and support for the different tree
structures was evaluated by bootstrapping. The analysis was performed with NGF, BDNF, or NT-3 sequence data used individually
as well as with the three neurotrophins in a combined matrix, thereby simultaneously considering phylogenetic information
from three separate genes. The results showed that the monotreme neurotrophin sequences associate with either therian or bird
neurotrophin sequences, suggesting that the monotremes are not necessarily more closely related to therians than to birds. Furthermore,
the results confirmed the present classification of four Australasian marsupial orders based on morphological characters,
and suggested a phylogenetic relationship in which Dasyuromorphia is most closely related to Peramelemorphia, followed by Notoryctemorphia
and Diprotodontia. These studies show that sequence data from neurotrophins are well suited for phylogenetic analysis of mammals
and that neurotrophins can resolve basal relationships in the evolutionary tree.
Received: 27 January 1997 / Accepted: 20 March 1997
13.
14.
Analysis of DNA sequences of 132 introns and 140 exons from 42 pairs of orthologous genes of mouse and rat was used to compare
patterns of evolutionary change between introns and exons. The mean of the absolute difference in length (measured in base
pairs) between the two species was nearly five times as high in the case of introns as in the case of exons. The average rate
of nucleotide substitution in introns was very similar to the rate of synonymous substitution in exons, and both were about
three times the rate of substitution at nonsynonymous sites in exons. G+C content of introns and exons of the same gene were
correlated; but mean G+C content at the third positions of exons was significantly higher than that of introns or positions
1–2 of exons from the same gene. G+C content was conserved over evolutionary time, as indicated by strong correlations between
mouse and rat; but the change in G+C content was greatest at position 3 of exons, intermediate in introns, and lowest at positions
1–2 in exons.
Received: 23 December 1996 / Accepted: 1 April 1997
15.
A detailed analysis of the evolutionary history of hepatitis B virus (HBV) was undertaken using 39 mammalian hepadnaviruses
for which complete genome sequences were available, including representatives of all six human genotypes, as well as a large
sample of small S gene sequences. Phylogenetic trees of these data were ambiguous, supporting no single place of origin for
HBV, and depended heavily on the underlying model of DNA substitution. In some instances genotype F, predominant in the Americas,
was the first to diverge, suggesting that the virus arose in the New World. In other trees, however, sequences from genotype
B, prevalent in East Asia, were the most divergent. An attempt was also made to determine the rate of nucleotide substitution
in the C open reading frame and then to date the origin of HBV. However, no relationship between time and number of substitutions
was found in two independent data sets, indicating that a reliable molecular clock does not exist for these data. Both the
pattern and the rate of nucleotide substitution are therefore complex phenomena in HBV and hinder any attempt to reconstruct
the past spread of this virus.
Received: 5 December 1998 / Accepted: 23 February 1999
16.
Characteristics of Nucleotide Substitution in the Hepatitis C Virus Genome: Constraints on Sequence Change in Coding Regions at Both Ends of the Genome (cited by: 19; self-citations: 0; citations by others: 19)
Comparison of complete genome sequences for different variants of hepatitis C virus (HCV) reveals several different constraints
on sequence change. Synonymous changes are suppressed in coding regions at both 5′ and 3′ ends of the genome. No evidence
was found for the existence of alternative reading frames or for a lower mutation frequency in these regions. Instead, suppression
may be due to constraints imposed by RNA secondary structures identified within the core and NS5b genes. Nonsynonymous substitutions
are less frequent than synonymous ones except in the hypervariable region of E2 and, to a lesser extent, in E1, NS2, and NS5b.
Transitions are more frequent than transversions, particularly at the third position of codons where the bias is 16:1. In
addition, nucleotide substitutions may not occur symmetrically since there is a bias toward G or C at the third position of
codons, while T ↔ C transitions were twice as frequent as A ↔ G transitions. These different biases do not affect the phylogenetic
analysis of HCV variants but need to be taken into account in interpreting sequence change in longitudinal studies.
Received: 9 September 1996 / Accepted: 20 April 1997
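The transition and transversion counts discussed in the abstract above can be illustrated with a small sketch that classifies the differences between two aligned sequences. This is a naive pairwise count under stated assumptions: the sequences are already aligned, only A, C, G, and T are scored, and multiple hits at the same site are ignored. The function name and example sequences are hypothetical and are not the authors' method or data.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def count_ts_tv(seq1, seq2):
    """Count transitions and transversions between two aligned sequences.

    Sites with a gap or an ambiguous base in either sequence are skipped.
    Returns a (transitions, transversions) tuple.
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    transitions = 0
    transversions = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT" or a == b:
            continue
        pair = {a, b}
        # Purine<->purine or pyrimidine<->pyrimidine is a transition.
        if pair <= PURINES or pair <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    return transitions, transversions


# Made-up aligned fragments for illustration.
ts, tv = count_ts_tv("ACGTACGT", "GCTTACAT")
print("transitions:", ts, "transversions:", tv)
```

To examine a bias at the third codon position, as the abstract does, the same function could be applied to every third site of an in-frame coding alignment, for example `count_ts_tv(seq1[2::3], seq2[2::3])`.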
17.
Synonymous and nonsynonymous rate variation in nuclear genes of mammals (cited by: 34; self-citations: 6; citations by others: 28)
A maximum likelihood approach was used to estimate the synonymous and nonsynonymous substitution rates in 48 nuclear genes
from primates, artiodactyls, and rodents. A codon-substitution model was assumed, which accounts for the genetic code structure,
transition/transversion bias, and base frequency biases at codon positions. Likelihood ratio tests were applied to test the
constancy of nonsynonymous to synonymous rate ratios among branches (evolutionary lineages). It is found that at 22 of the
48 nuclear loci examined, the nonsynonymous/synonymous rate ratio varies significantly across branches of the tree. The result
provides strong evidence against a strictly neutral model of molecular evolution. Our likelihood estimates of synonymous and
nonsynonymous rates differ considerably from previous results obtained from approximate pairwise sequence comparisons. The
differences between the methods are explored by detailed analyses of data from several genes. Transition/transversion rate
bias and codon frequency biases are found to have significant effects on the estimation of synonymous and nonsynonymous rates,
and approximate methods do not adequately account for those factors. The likelihood approach is preferable, even for pairwise
sequence comparison, because more-realistic models about the mutation and substitution processes can be incorporated in the
analysis.
Received: 17 May 1997 / Accepted: 28 September 1997
18.
McClellan DA 《Journal of molecular evolution》2000,51(3):185-193
The codon-degeneracy model (CDM) predicts relative frequencies of substitution for any set of homologous protein-coding DNA
sequences based on patterns of nucleotide degeneracy, codon composition, and the assumption of selective neutrality. However,
at present, the CDM is reliant on outside estimates of transition bias. A new method by which the power of the CDM can be
used to find a synonymous transition bias that is optimal for any given phylogenetic tree topology is presented. An example
is illustrated that utilizes optimized transition biases to generate CDM GF-scores for every possible phylogenetic tree for
pocket gophers of the genus Orthogeomys. The resulting distribution of CDM GF-scores is compared and contrasted with the results of maximum parsimony and maximum
likelihood methods. Although convergence on a single tree topology by the CDM and another method indicates greater support
for that particular tree, the value of CDM GF-score as the sole optimality criterion for phylogeny reconstruction remains
to be determined. It is clear, however, that the a priori estimation of an optimum transition bias from codon composition
has a direct application to differentiating between alternative trees.
Received: 13 October 1999 / Accepted: 28 April 2000
19.
Nucleotide Substitution Rate of Mammalian Mitochondrial Genomes (cited by: 22; self-citations: 0; citations by others: 22)
We present here for the first time a comprehensive study based on the analysis of closely related organisms to provide an
accurate determination of the nucleotide substitution rate in mammalian mitochondrial genomes. This study examines the evolutionary
pattern of the different functional mtDNA regions as accurately as possible on the grounds of available data, revealing some
important "genomic laws." The main conclusions can be summarized as follows. (1) High intragenomic variability in the evolutionary
dynamic of mtDNA was found. The substitution rate is strongly dependent on the region considered, and slow- and fast-evolving
regions can be identified. Nonsynonymous sites, the D-loop central domain, and tRNA and rRNA genes evolve much more slowly
than synonymous sites and the two peripheral D-loop region domains. The synonymous rate is fairly uniform over the genome,
whereas the rate of nonsynonymous sites depends on functional constraints and therefore differs considerably between genes.
(2) The commonly accepted statement that mtDNA evolves more rapidly than nuclear DNA is valid only for some regions, thus
it should be referred to specific mitochondrial components. In particular, nonsynonymous sites show comparable rates in mitochondrial
and nuclear genes; synonymous sites and small rRNA evolve about 20 times more rapidly and tRNAs about 100 times more rapidly
in mitochondria than in their nuclear counterpart. (3) A species-specific evolution is particularly evident in the D-loop
region. As the divergence times of the organism pairs under consideration are known with sufficient accuracy, absolute nucleotide
substitution rates are also provided.
Received: 11 May 1998 / Accepted: 2 September 1998
20.
The Evolutionary History of Prosaposin: Two Successive Tandem-Duplication Events Gave Rise to the Four Saposin Domains in Vertebrates (cited by: 1; self-citations: 0; citations by others: 1)
Einat Hazkani-Covo, Neta Altman, Mia Horowitz, Dan Graur 《Journal of molecular evolution》2002,54(1):30-34
Prosaposin is a multifunctional protein encoded by a single-copy gene. It contains four saposin domains (A, B, C, and D)
occurring as tandem repeats connected by linker sequences. Because the saposin domains are similar to one another, it is deduced
that they were created by sequential duplications of an ancestral domain. There are two types of evolutionary scenarios that
may explain the creation of the four-domain gene: (1) two rounds of tandem internal gene duplication and (2) three rounds
of duplications. An evolutionary and phylogenetic analysis of saposin DNA and amino acid sequences from human, mouse, rat,
chicken, and zebrafish indicates that the first evolutionary scenario is the most likely. Accordingly, an ancestral saposin-unit
duplication produced a two-domain gene, which, subsequently, underwent a second complete tandem duplication to give rise to
the present four-domain structure of the prosaposin gene.
Received: 8 February 2001 / Accepted: 29 June 2001