Found 20 similar articles (search time: 609 ms)
1.
McClellan DA 《Journal of molecular evolution》2000,51(2):131-140
Mitochondrial genetic codons can be categorized by four patterns of nucleotide-site degeneracy based on varying combinations
of twofold- or nondegenerate sites at first codon positions and twofold- or fourfold-degenerate sites at third codon positions.
Herein, a model of molecular evolution is introduced that uses these patterns to calculate expected substitution frequencies
for each codon position and substitution type relative to overall number of synonymous or nonsynonymous substitutions. Regions
of the pocket gopher cytochrome oxidase subunit I (COI) and cytochrome b (cyt-b) genes are analyzed using this model. Chi-square distributions are used to produce relative goodness-of-fit (GF) scores for
measuring the difference between substitution frequencies predicted by the codon-degeneracy model (CDM), and frequencies inferred
using a well-supported phylogenetic tree of closely related species. The GF scores for expected and observed synonymous (GFsyn = 0.429, p = 0.807) and nonsynonymous (GFns = 2.309, p = 0.679) substitution frequencies resulted in a failure to reject the CDM as a null hypothesis for the molecular evolution
of COI and cyt-b in pocket gophers. Alternative tree topologies and calculations of transition bias for these data result in higher GF scores.
Received: 25 March 1999 / Accepted: 17 September 1999
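
The goodness-of-fit scores quoted in the abstract above are chi-square statistics with associated p-values. The sketch below shows how such a score and its upper-tail probability could be computed with only the Python standard library. The function names are hypothetical, and the degrees of freedom (2 for the synonymous test, 4 for the nonsynonymous test) are assumptions: the abstract does not state them, and they were chosen because they are consistent with the reported p-values.

```python
import math


def chi_square_stat(observed, expected):
    """Pearson chi-square statistic: sum of (O - E)^2 / E over categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def chi_square_sf_even_df(x, df):
    """Upper-tail probability P(X >= x) for a chi-square variable with even df.

    For df = 2k the tail has the closed form
    exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!,
    so no external statistics library is needed.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form requires a positive even df")
    half = x / 2.0
    terms = (half ** i / math.factorial(i) for i in range(df // 2))
    return math.exp(-half) * sum(terms)


# Assumed degrees of freedom (not given in the abstract): 2 and 4.
p_syn = chi_square_sf_even_df(0.429, 2)
p_ns = chi_square_sf_even_df(2.309, 4)
print(round(p_syn, 3), round(p_ns, 3))
```

Under those assumed degrees of freedom the two tail probabilities round to the reported values of 0.807 and 0.679. A large p-value means the observed frequencies are close to the expected ones, which is why the abstract speaks of failing to reject the model.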
2.
To study the evolution of mtDNA and the intergeneric relationships of New World Jays (Aves: Corvidae), we sequenced the entire
mitochondrial DNA control region (CR) from 21 species representing all genera of New World jays, an Old World jay, crows,
and a magpie. Using maximum likelihood methods, we found that both the transition/transversion ratio (κ) and among site rate
variation (α) were higher in flanking domains I and II than in the conserved central domain and that the frequency of indels
was highest in domain II. Estimates of κ and α were much more influenced by the density of taxon sampling than by alternative
optimal tree topologies. We implemented a successive approximation method incorporating these parameters into phylogenetic
analysis. In addition we compared our study in detail to a previous study using cytochrome b and morphology to examine the effect of taxon sampling, evolutionary rates of genes, and combined data on tree resolution.
We found that the particular weighting scheme used had no effect on tree topology and little effect on tree robustness. Taxon
sampling had a significant effect on tree robustness but little effect on the topology of the best tree. The CR data set differed
nonsignificantly from the tree derived from the cytochrome b/morphological data set primarily in the placement of the genus Gymnorhinus, which is near the base of the CR tree. However, contrary to conventional taxonomy, the CR data set suggested that blue and
black jays (Cyanocorax sensu lato) might be paraphyletic and that the brown jay Psilorhinus (=Cyanocorax) morio is the sister group to magpie jays (Calocitta), a phylogenetic hypothesis that is likely as parsimonious with regard to nonmolecular characters as monophyly of Cyanocorax. The CR tree also suggests that the common ancestor of NWJs was likely a cooperative breeder. Consistent with recent systematic
theory, our data suggest that DNA sequences with high substitution rates such as the CR may nonetheless be useful in reconstructing
relatively deep phylogenetic nodes in avian groups.
Received: 10 November 1999 / Accepted: 16 March 2000
3.
Mitochondrial Genes Collectively Suggest the Paraphyly of Crustacea with Respect to Insecta (cited by 9: 0 self-citations, 9 by others)
Erik García-Machado Malgorzata Pempera Nicole Dennebouy Mario Oliva-Suarez Jean-Claude Mounolou Monique Monnerot 《Journal of molecular evolution》1999,49(1):142-149
Complete sequences of seven protein coding genes from Penaeus notialis mitochondrial DNA were compared in base composition and codon usage with homologous genes from Artemia franciscana and four insects. The crustacean genes are significantly less A + T-rich than their counterparts in insects and the pattern
of codon usage (ratio of G + C-rich versus A + T-rich codons) is less biased. A phylogenetic analysis using amino acid sequences
of the seven corresponding polypeptides supports a sister-taxon status for mollusks–annelid and arthropods. Furthermore, a
distance matrix-based tree and two most-parsimonious trees both suggest that crustaceans are paraphyletic with respect to
insects. This is also supported by the inclusion of Panulirus argus COII (complete) and COI and COIII (partial) sequence data. From analysis of single and combined genes to infer phylogenies,
it is observed that topologies obtained from single genes are not well supported in most cases and notably differ from that
of the tree based on all seven genes.
Received: 25 August 1998 / Accepted: 8 March 1999
4.
Jon P. Anderson Allen G. Rodrigo Gerald H. Learn Yang Wang Hillard Weinstock Marcia L. Kalish Kenneth E. Robbins Leroy Hood James I. Mullins 《Journal of molecular evolution》2001,53(1):55-62
Phylogenetic analyses frequently rely on models of sequence evolution that detail nucleotide substitution rates, nucleotide
frequencies, and site-to-site rate heterogeneity. These models can influence hypothesis testing and can affect the accuracy
of phylogenetic inferences. Maximum likelihood methods of simultaneously constructing phylogenetic tree topologies and estimating
model parameters are computationally intensive, and are not feasible for sample sizes of 25 or greater using personal computers.
Techniques that initially construct a tree topology and then use this non-maximized topology to estimate ML substitution rates,
however, can quickly arrive at a model of sequence evolution. The accuracy of this two-step estimation technique was tested
using simulated data sets with known model parameters. The results showed that for a star-like topology, as is often seen
in human immunodeficiency virus type 1 (HIV-1) subtype B sequences, a random starting topology could produce nucleotide substitution
rates that were not statistically different than the true rates. Samples were isolated from 100 HIV-1 subtype B infected individuals
from the United States and a 620 nt region of the env gene was sequenced for each sample. The sequence data were used to obtain a substitution model of sequence evolution specific
for HIV-1 subtype B env by estimating nucleotide substitution rates and the site-to-site heterogeneity in 100 individuals from the United States.
The method of estimating the model should provide users of large data sets with a way to quickly compute a model of sequence
evolution, while the nucleotide substitution model we identified should prove useful in the phylogenetic analysis of HIV-1
subtype B env sequences.
Received: 4 October 2000 / Accepted: 1 March 2001
5.
A new, model-based method was devised to locate nucleotide changes in a given phylogenetic tree. For each site, the posterior
probability of any possible change in each branch of the tree is computed. This probabilistic method is a valuable alternative
to the maximum parsimony method when base composition is skewed (i.e., different from 25% A, 25% C, 25% G, 25% T): computer
simulations showed that parsimony misses more rare → common than common → rare changes, resulting in biased inferred change
matrices, whereas the new method appeared unbiased. The probabilistic method was applied to the analysis of the mutation and
substitution processes in the mitochondrial control region of mouse. Distinct change patterns were found at the polymorphism
(within species) and divergence (between species) levels, rejecting the hypothesis of a neutral evolution of base composition
in mitochondrial DNA.
Received: 15 March 1999 / Accepted: 7 October 1999
6.
Synonymous and nonsynonymous rate variation in nuclear genes of mammals (cited by 34: 6 self-citations, 28 by others)
A maximum likelihood approach was used to estimate the synonymous and nonsynonymous substitution rates in 48 nuclear genes
from primates, artiodactyls, and rodents. A codon-substitution model was assumed, which accounts for the genetic code structure,
transition/transversion bias, and base frequency biases at codon positions. Likelihood ratio tests were applied to test the
constancy of nonsynonymous to synonymous rate ratios among branches (evolutionary lineages). It is found that at 22 of the
48 nuclear loci examined, the nonsynonymous/synonymous rate ratio varies significantly across branches of the tree. The result
provides strong evidence against a strictly neutral model of molecular evolution. Our likelihood estimates of synonymous and
nonsynonymous rates differ considerably from previous results obtained from approximate pairwise sequence comparisons. The
differences between the methods are explored by detailed analyses of data from several genes. Transition/transversion rate
bias and codon frequency biases are found to have significant effects on the estimation of synonymous and nonsynonymous rates,
and approximate methods do not adequately account for those factors. The likelihood approach is preferable, even for pairwise
sequence comparison, because more-realistic models about the mutation and substitution processes can be incorporated in the
analysis.
Received: 17 May 1997 / Accepted: 28 September 1997
7.
Selection on the codon bias of chloroplast and cyanelle genes in different plant and algal lineages (cited by 8: 0 self-citations, 8 by others)
Brian R. Morton 《Journal of molecular evolution》1998,46(4):449-459
In the plant chloroplast genome the codon usage of the highly expressed psbA gene is unique and is adapted to the tRNA population, probably due to selection for translation efficiency. In this study
the role of selection on codon usage in each of the fully sequenced chloroplast genomes, in addition to Chlamydomonas reinhardtii, is investigated by measuring adaptation to this pattern of codon usage. A method is developed which tests selection on each
gene individually by constructing sequences with the same amino acid composition as the gene and randomly assigning codons
based on the nucleotide composition of noncoding regions of that genome. The codon bias of the actual gene is then compared
to a distribution of random sequences. The data indicate that within the algae selection is strong in Cyanophora paradoxa, affecting a majority of genes, of intermediate intensity in Odontella sinensis, and weaker in Porphyra purpurea and Euglena gracilis. In the plants, selection is found to be quite weak in Pinus thunbergii and the angiosperms but there is evidence that an intermediate level of selection exists in the liverwort Marchantia polymorpha. The role of selection is then further investigated in two comparative studies. It is shown that average relative codon bias
is correlated with expression level and that, despite saturation levels of substitution, there is a strong correlation among
the algae genomes in the degree of codon bias of homologous genes. All of these data indicate that selection for translation
efficiency plays a significant role in determining the codon bias of chloroplast genes but that it acts with different intensities
in different lineages. In general it is stronger in the algae than the higher plants, but within the algae Euglena is found to have several unusual features which are noted. The factors that might be responsible for this variation in intensity
among the various genomes are discussed.
Received: 6 June 1997 / Accepted: 24 July 1997
8.
Noboru Sueoka 《Journal of molecular evolution》1999,49(1):49-62
The relative contribution of mutation and selection to the G+C content of DNA was analyzed in bacterial species having widely
different G+C contents. The analysis used two methods that were developed previously. The first method was to plot the average
G+C content of a set of nucleotides against the G+C content of the third codon position for each gene. This method was used
to present the G+C distribution of the third codon position and to assess the relative neutrality of a set of nucleotides
to that of the G+C content of the third codon position. The second method was to plot the intrastrand bias of the third codon
position from Parity Rule 2 (PR2), where A=T and G=C. It was found that whereas intragenomic distributions of the DNA G+C content of these bacteria are narrow in the majority
of species, in some species the G+C content of the minor class of genes distributes over wider ranges than the major class
of genes. On the other hand, ubiquitous PR2 biases are amino acid specific and independent of the G+C content of DNA, so that
when averaged over the amino acids, the biases are small and not correlated with the DNA G+C content. Therefore, translation
coupled PR2-biases are unlikely to explain the wide range of G+C contents among different species. Considering all data available,
it was concluded that the amino acid-specific PR2 bias has only a minor effect, if any, on the average G+C content. In addition,
PR2 bias patterns of different species show phylogenetic relationships, and the pattern can be used as a taxal fingerprint.
Received: 5 November 1998 / Accepted: 1 March 1999
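
The two quantities that the abstract above plots, G+C content at the third codon position and the intrastrand (PR2) bias at that position, can be illustrated with a short sketch. This is not the author's code: the function names are hypothetical, the input is assumed to be an in-frame coding sequence, and expressing the PR2 bias as the pair A3/(A3+T3) and G3/(G3+C3) is an assumption about how the bias is summarized. Under Parity Rule 2 (A=T and G=C) both ratios would equal 0.5.

```python
from collections import Counter


def third_position_counts(cds):
    """Count bases at third codon positions of an in-frame coding sequence."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    return Counter(cds[i] for i in range(2, usable, 3))


def gc3(cds):
    """Fraction of G or C among unambiguous third-position bases."""
    counts = third_position_counts(cds)
    total = sum(counts[base] for base in "ACGT")
    return (counts["G"] + counts["C"]) / total if total else float("nan")


def pr2_bias(cds):
    """Return (A3/(A3+T3), G3/(G3+C3)); both are 0.5 when PR2 holds."""
    counts = third_position_counts(cds)
    at = counts["A"] + counts["T"]
    gc = counts["G"] + counts["C"]
    a_fraction = counts["A"] / at if at else float("nan")
    g_fraction = counts["G"] / gc if gc else float("nan")
    return a_fraction, g_fraction


# Toy sequence (hypothetical): codons ATG GCC AAA TTT GGC.
toy = "ATGGCCAAATTTGGC"
print(gc3(toy), pr2_bias(toy))
```

Running the functions per gene and plotting the results against each other would give the kind of per-genome scatter the abstract describes; the toy sequence here is only for checking the arithmetic.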
9.
Jenkins GM Pagel M Gould EA de A Zanotto PM Holmes EC 《Journal of molecular evolution》2001,52(4):383-390
The extent to which base composition and codon usage vary among RNA viruses, and the possible causes of this bias, is undetermined
in most cases. A maximum-likelihood statistical method was used to test whether base composition and codon usage bias covary
with arthropod association in the genus Flavivirus, a major source of disease in humans and animals. Flaviviruses are transmitted by mosquitoes, by ticks, or directly between
vertebrate hosts. Those viruses associated with ticks were found to have a significantly lower G+C content than non-vector-borne
flaviviruses and this difference was present throughout the genome at all amino acids and codon positions. In contrast, mosquito-borne
viruses had an intermediate G+C content which was not significantly different from those of the other two groups. In addition,
biases in dinucleotide and codon usage that were independent of base composition were detected in all flaviviruses, but these
did not covary with arthropod association. However, the overall effect of these biases was slight, suggesting only weak selection
at synonymous sites. A preliminary analysis of base composition, codon usage, and vector specificity in other RNA virus families
also revealed a possible association between base composition and vector specificity, although with biases different from
those seen in the Flavivirus genus.
Received: 29 August 2000 / Accepted: 19 December 2000
10.
Graziano Pesole Luigi R. Ceci Carmela Gissi Cecilia Saccone Carla Quagliariello 《Journal of molecular evolution》1996,43(5):447-452
We have analyzed the nad3-rps12 locus for eight angiosperms in order to compare the utility of mitochondrial DNA and edited mRNA sequences in phylogenetic
reconstruction. The two coding regions, containing from 25 to 35 editing sites in the various plants, have been concatenated
in order to increase the significance of the analysis. Differing from the corresponding chloroplast sequences, unedited mitochondrial
DNA sequences seem to evolve under a quasi-neutral substitution process which undifferentiates the nucleotide substitution
rates for the three codon positions. By using complete gene sequences (all codon positions) we found that genomic sequences
provide a classical angiosperm phylogenetic tree with a clear-cut grouping of monocotyledons and dicotyledons with Magnoliidae
at the basal branch of the tree. Conversely, owing to their low nucleotide substitution rates, edited mRNA sequences were
found not to be suitable for studying phylogenetic relationships among angiosperms.
Received: 24 January 1996 / Accepted: 5 June 1996
11.
A model of nucleotide substitution that allows the transition/transversion rate bias to vary across sites was constructed.
We examined the fit of this model using likelihood-ratio tests by analyzing 13 protein coding genes and 1 pseudogene. Likelihood-ratio
testing indicated that a model that allows variation in the transition/transversion rate bias across sites provided a significant
improvement in fit for most protein coding genes but not for the pseudogene. When the analysis was repeated with parameters
estimated separately for first, second, and third codon positions, strong heterogeneity was uncovered for the first and second
codon positions; the variation in the transition/transversion rate was generally weaker at the third codon position. The transition
rate bias and branch lengths are underestimated when variation in the transition/transversion rate was not accommodated, suggesting
that it may be important to accommodate variation in the pattern of nucleotide substitution for accurate estimation of evolutionary
parameters.
Received: 4 November 1997 / Accepted: 19 May 1998
12.
We compared the codon usage of sequences of transposable elements (TEs) with that of host genes from the species Drosophila melanogaster, Arabidopsis thaliana, Caenorhabditis elegans, Saccharomyces cerevisiae, and Homo sapiens. Factorial correspondence analysis showed that, regardless of the base composition of the genome, the TEs differed from the
genes of their host species by their AT-richness. In all species, the percentage of A + T on the third codon position of the
TEs was higher than that on the first codon position and lower than that in the noncoding DNA of the genomes. This indicates
that the codon choice is not simply the outcome of mutational bias but is also subject to selection constraints. A tendency
toward higher A + T on the third position than on the first position was also found in the host genes of A. thaliana, C. elegans, and S. cerevisiae but not in those of D. melanogaster and H. sapiens. This strongly suggests that the AT choice is a host-independent characteristic common to all TEs. The codon usage of TEs
generally appeared to be different from the mean of the host genes. In the AT-rich genomes of Arabidopsis thaliana, Caenorhabditis elegans, and Saccharomyces cerevisiae, the codon usage bias of TEs was similar to that of weakly expressed genes. In the GC-rich genome of D. melanogaster, however, the bias in codon usage of the TEs clearly differed from that of weakly expressed genes. These findings suggest
that selection acts on TEs and that TEs may display specific behavior within the host genomes.
Received: 2 May 2001 / Accepted: 29 October 2001
13.
J. Hinrich G.v.d. Schulenburg Ulrike Englisch J.-Wolfgang Wägele 《Journal of molecular evolution》1999,48(1):2-12
A comparison of ribosomal internal transcribed spacer 1 (ITS1) elements of digenetic trematodes (Platyhelminthes) including
unidentified digeneans isolated from Cyathura carinata (Crustacea: Isopoda) revealed DNA sequence similarities at more than half of the spacer at its 3′ end. Primary sequence similarity
was shown to be associated with secondary structure conservation, which suggested that similarity is due to identity by descent
and not chance. Using an analysis of apomorphies, the sequence data were shown to produce a distinct phylogenetic signal.
This was confirmed by the consistency of results of different tree reconstruction methods such as distance approaches, maximum
parsimony, and maximum likelihood. Morphological evidence additionally supported the phylogenetic tree based on ITS1 data
and the inferred phylogenetic position of the unidentified digeneans of C. carinata met the expectations from known trematode life-cycle patterns. Although ribosomal ITS1 elements are generally believed to
be too variable for phylogenetic analysis above the species or genus level, the overall consistency of the results of this
study strongly suggests that this is not the case in digenetic trematodes. Here, 3′ end ITS1 sequence data seem to provide
a valuable tool for elucidating phylogenetic relationships of a broad range of phylogenetically distinct taxa.
Received: 20 October 1997 / Accepted: 24 March 1998
14.
Statistical and biochemical studies of the genetic code have found evidence of nonrandom patterns in the distribution of
codon assignments. It has, for example, been shown that the code minimizes the effects of point mutation or mistranslation:
erroneous codons are either synonymous or code for an amino acid with chemical properties very similar to those of the one
that would have been present had the error not occurred. This work has suggested that the second base of codons is less efficient
in this respect, by about three orders of magnitude, than the first and third bases. These results are based on the assumption
that all forms of error at all bases are equally likely. We extend this work to investigate (1) the effect of weighting transition
errors differently from transversion errors and (2) the effect of weighting each base differently, depending on reported mistranslation
biases. We find that if the bias affects all codon positions equally, as might be expected were the code adapted to a mutational
environment with transition/transversion bias, then any reasonable transition/transversion bias increases the relative efficiency
of the second base by an order of magnitude. In addition, if we employ weightings to allow for biases in translation, then
only 1 in every million random alternative codes generated is more efficient than the natural code. We thus conclude not only
that the natural genetic code is extremely efficient at minimizing the effects of errors, but also that its structure reflects
biases in these errors, as might be expected were the code the product of selection.
Received: 25 July 1997 / Accepted: 9 January 1998
15.
In bacteria, synonymous codon usage can be considerably affected by base composition at neighboring sites. Such context-dependent
biases may be caused by either selection against specific nucleotide motifs or context-dependent mutation biases. Here we
consider the evolutionary conservation of context-dependent codon bias across 11 completely sequenced bacterial genomes. In
particular, we focus on two contextual biases previously identified in Escherichia coli; the avoidance of out-of-frame stop codons and AGG motifs. By identifying homologues of E. coli genes, we also investigate the effect of gene expression level in Haemophilus influenzae and Mycoplasma genitalium. We find that while context-dependent codon biases are widespread in bacteria, few are conserved across all species considered.
Avoidance of out-of-frame stop codons does not apply to all stop codons or amino acids in E. coli, does not hold for different species, does not increase with gene expression level, and is not relaxed in Mycoplasma spp., in which the canonical stop codon, TGA, is recognized as tryptophan. Avoidance of AGG motifs shows some evolutionary
conservation and increases with gene expression level in E. coli, suggestive of the action of selection, but the cause of the bias differs between species. These results demonstrate that
strong context-dependent forces, both selective and mutational, operate on synonymous codon usage but that these differ considerably
between genomes.
Received: 6 May 1999 / Accepted: 29 October 1999
16.
The Old World sparrows (genus Passer) phylogeography and their relative abundance of nuclear mtDNA pseudogenes (cited by 2: 0 self-citations, 2 by others)
Allende LM Rubio I Ruíz-Del-Valle V Guillén J Martínez-Laso J Lowy E Varela P Zamora J Arnaiz-Villena A 《Journal of molecular evolution》2001,53(2):144-154
The phylogenetic relationships of genus Passer (Old World sparrows) have been studied with species covering their complete world living range. Mitochondrial (mt) cyt b
genes and pseudogenes have been analyzed, the latter being strikingly abundant in genus Passer compared with other studied songbirds. The significance of these Passer pseudogenes is presently unclear. The mechanisms by which mt cyt b genes become pseudogenes after nuclear translocation are
discussed together with their mode of evolution, i.e., transition/transversion mitochondrial ratio is decreased in the nucleus,
as is the constraint for variability at the three codon positions. However, the skewed base composition according to codon
position (at the 1st position the percentages of the four bases are very similar; at the 2nd position the percentages of A and G are lower and that of T is higher; and the 3rd position has lower percentages of G and T and is very rich in A and C) is maintained in the translocated nuclear pseudogenes.
Different nuclear internal mechanisms and/or selective pressures must exist for explaining this nuclear/mitochondrial differential
DNA base evolutive variability. Also, the phylogenetic usefulness of pseudogenes for defining relationships between closely
related lineages is stressed. The analyses suggest that the primitive genus Passer species comes from Africa, the Cape sparrow
being the oldest: P. hispaniolensis italiae is more likely conspecific to P. domesticus than to P. hispaniolensis. Also, Passer species are not included within weavers or Estrildinae or Emberizinae, as previously suggested. European and American Emberizinae sparrows are closely related to each other and seem to be the earliest species that radiated among the studied songbirds
(all in the Miocene Epoch).
Received: 29 November 2000 / Accepted: 22 March 2001
17.
The Nonrandom Location of Synonymous Codons Suggests That Reading Frame-Independent Forces Have Patterned Codon Preferences (cited by 6: 0 self-citations, 6 by others)
Biased codon usage is common in eukaryotic and prokaryotic genes. Evidence from Escherichia, Saccharomyces, and Drosophila indicates that it favors translational efficiency and accuracy. However, to date no functional advantages have been identified
in the codon–anticodon interactions involving the most frequently used (preferred) codons. Here we present evidence that forces
not related to the individual codon–anticodon interaction may be involved in determining which synonymous codons are preferred
or avoided. We show that the "off-frame" trinucleotide motif preferences inferrable from Drosophila coding regions are often in the same direction as Drosophila's "in-frame" codon preferences, i.e., its codon usage. The off-frame preferences were inferred from the nonrandomness of
the location of confamilial synonymous codons along coding regions—a pattern often described as a context dependence of nucleotide
choice at synonymous positions or as codon-pair bias. We relied on randomizations of the location of confamilial codons that
do not alter, and cannot be influenced by, the encoded amino acid sequences, codon usage, or base composition of the genes
examined. The statistically significant congruency of in-frame and off-frame trinucleotide preferences suggests that the same
kind of reading-frame-independent force(s) may also influence synonymous codon choice. These forces may have produced biases
in codon usage that then led to the evolution of the translational advantages of these motifs as preferred codons. Under this
scenario, tRNA pool size differences between preferred and nonpreferred codons initially were evolved to track the default
overrepresentation of codons with preferred motifs. The motif preference hypothesis can explain the structuring of codon preferences
and the similarities in the codon usages of distantly related organisms.
Received: 10 November 1998 / Accepted: 23 February 1999
18.
Grishin NV 《Journal of molecular evolution》1999,48(3):264-273
The reliable reconstruction of tree topology from a set of homologous sequences is one of the main goals in the study of
molecular evolution. If consistent estimators of distances from a multiple sequence alignment are known, the distance method
is attractive because the tree reconstruction is consistent. To obtain a distance estimate d, the observed proportion of differences p (p-distance) is usually "corrected" for multiple and back substitutions by means of a functional relationship d = f(p). In this paper the conditions under which this correction of p-distances will not alter the selection of the tree topology are specified. When these conditions are not fulfilled the selection
of the tree topology may depend on the correction function applied. A novel method which includes estimates of distances not
only between sequence pairs, but between triplets, quadruplets, etc., is proposed to strengthen the proper selection of correction
function and tree topology. A "super" tree that includes all tree topologies as special cases is introduced.
Received: 17 February 1998 / Accepted: 20 July 1998
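
The abstract above refers to correcting an observed p-distance through some function d = f(p) without naming a specific one. As an illustration only, the sketch below uses the Jukes-Cantor (JC69) correction, d = -(3/4) ln(1 - 4p/3), which is one well-known choice of f for nucleotide data; it is not claimed to be the correction studied in the paper. The function names and the gap-skipping rule are assumptions.

```python
import math


def p_distance(seq_a, seq_b):
    """Proportion of differing sites between two aligned sequences.

    Columns with a gap ('-') in either sequence are skipped (an assumption).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(a, b) for a, b in zip(seq_a.upper(), seq_b.upper())
             if a != "-" and b != "-"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(a != b for a, b in pairs) / len(pairs)


def jukes_cantor(p):
    """JC69 correction d = -(3/4) * ln(1 - 4p/3); undefined for p >= 0.75."""
    if not 0.0 <= p < 0.75:
        raise ValueError("JC69 distance is defined only for 0 <= p < 0.75")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


# Toy alignment (hypothetical): one difference in eight sites.
p = p_distance("ACGTACGT", "ACGTACGA")
print(p, jukes_cantor(p))
```

The JC69 function is strictly increasing in p, so applying the same correction to every pair keeps the rank order of the pairwise distances; the corrected values themselves grow faster than p as sequences diverge.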
19.
The cytochrome b gene as a phylogenetic marker: the limits of resolution for analyzing relationships among cichlid fishes (cited by 12: 0 self-citations, 12 by others)
The mitochondrial cytochrome b (cyt-b) gene is widely used in systematic studies to resolve divergences at many taxonomic levels. The present study focuses mainly
on the utility of cyt-b as a molecular marker for inferring phylogenetic relationship at various levels within the fish family Cichlidae. A total
of 78 taxa were used in the present analysis, representing all the major groups in the family Cichlidae (72 taxa) and other
families from the suborders Labroidei and Percoidei. Gene trees obtained from cyt-b are compared to a published total evidence tree derived from previous studies. Minimum evolution trees based on cyt-b data resulted in topologies congruent with all previous analyses. Parsimony analyses downweighting transitions relative to
transversions (ts1:tv4) or excluding transitions at third codon positions resulted in more robust bootstrap support for recognized
clades than unweighted parsimony. Relative rate tests detected significantly long branches for some taxa (LB taxa) which were
composed mainly of dwarf Neotropical cichlids. An improvement of the phylogenetic signal, as shown by the four-cluster likelihood
mapping analysis, and higher bootstrap values were obtained by excluding LB taxa. Despite some limitations of cyt-b as a phylogenetic marker, this gene either alone or in combination with other data sets yields a tree that is in agreement
with the well-established phylogeny of cichlid fish.
Received: 11 October 2000 / Accepted: 26 February 2001
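
The weighting schemes mentioned in the abstract above (transitions weighted 1 against transversions weighted 4, or transitions at third codon positions excluded) can be illustrated on a single pair of aligned sequences. The sketch below only scores pairwise differences; it is not a parsimony tree search and does not reproduce the study's analysis. All function and parameter names are hypothetical, and the alignment is assumed to start at the first codon position.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def is_transition(a, b):
    """True for purine<->purine or pyrimidine<->pyrimidine changes."""
    return a != b and ({a, b} <= PURINES or {a, b} <= PYRIMIDINES)


def weighted_differences(seq_a, seq_b, ts_weight=1, tv_weight=4,
                         drop_third_position_ts=False):
    """Sum substitution weights over an in-frame pairwise alignment.

    Sites with characters other than A, C, G, T are skipped. When
    drop_third_position_ts is True, transitions at third codon positions
    contribute nothing, mirroring the exclusion scheme in the abstract.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    total = 0
    for i, (a, b) in enumerate(zip(seq_a.upper(), seq_b.upper())):
        if a not in "ACGT" or b not in "ACGT" or a == b:
            continue
        if is_transition(a, b):
            if drop_third_position_ts and i % 3 == 2:
                continue  # excluded: transition at a third codon position
            total += ts_weight
        else:
            total += tv_weight
    return total


# Toy pair (hypothetical): one third-position transition, one transversion.
print(weighted_differences("ATGACA", "ATAACC"))
print(weighted_differences("ATGACA", "ATAACC", drop_third_position_ts=True))
```

Downweighting or dropping transitions reduces the influence of the fastest-saturating changes, which is the usual motivation for such schemes when third-position transitions are suspected of carrying more noise than signal.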
20.
It is now well-established that compositional bias in DNA sequences can adversely affect phylogenetic analysis based on those
sequences. Phylogenetic analyses based on protein sequences are generally considered to be more reliable than those derived
from the corresponding DNA sequences because it is believed that the use of encoded protein sequences circumvents the problems
caused by nucleotide compositional biases in the DNA sequences. There exists, however, a correlation between AT/GC bias at
the nucleotide level and content of AT- and GC-rich codons and their corresponding amino acids. Consequently, protein sequences
can also be affected secondarily by nucleotide compositional bias. Here, we report that DNA bias not only may affect phylogenetic
analysis based on DNA sequences, but also drives a protein bias which may affect analyses based on protein sequences. We present
a striking example where common phylogenetic tools fail to recover the correct tree from complete animal mitochondrial protein-coding
sequences. The data set is very extensive, containing several thousand sites per sequence, and the incorrect phylogenetic
trees are statistically very well supported. Additionally, neither the use of the LogDet/paralinear transform nor removal
of positions in the protein alignment with AT- or GC-rich codons allowed recovery of the correct tree. Two taxa with a large
compositional bias continually group together in these analyses, despite a lack of close biological relatedness. We conclude
that even protein-based phylogenetic trees may be misleading, and we advise caution in phylogenetic reconstruction using protein
sequences, especially those that are compositionally biased.
Received: 19 February 1998 / Accepted: 28 August 1998