20 similar articles found.
1.
Soullier S, Jay P, Poulat F, Vanacker JM, Berta P, Laudet V. Journal of Molecular Evolution 1999, 48(5):517-527
From a database containing the published HMG protein sequences, we constructed an alignment of the HMG box functional domain
based on sequence identity. Due to the large number of sequences (more than 250) and the short size of this domain, several
data sets were used. This analysis reveals that the HMG box superfamily can be separated into two clearly defined subfamilies:
(i) the SOX/MATA/TCF family, which clusters proteins able to bind to specific DNA sequences; and (ii) the HMG/UBF family,
which clusters members that bind nonspecifically to DNA. The appearance and diversification of these subfamilies largely
predate the split between the yeast and the metazoan lineages. Particular emphasis was placed on the analysis of the SOX subfamily.
For the first time our analysis clearly identified the SOX subfamily as structured in six groups of genes named SOX5/6, SRY,
SOX2/3, SOX14, SOX4/22, and SOX9/18. The validity of these gene clusters is confirmed by their functional characteristics
and their sequences outside the HMG box. In sharp contrast, there are only a few robust branching patterns inside the UBF/HMG
family, probably because this family diversified much earlier than the SOX family.
The only consistent groups that can be detected by our analysis are HMG box 1, vertebrate HMG box 2, insect SSRP, and plant
HMG. The various UBF boxes cannot be clustered together and their diversification appears to be extremely ancient, probably
before the appearance of metazoans.
Received: 20 July 1998 / Accepted: 19 October 1998
2.
Wallis M. Journal of Molecular Evolution 2000, 50(5):465-473
Previous studies have shown that pituitary growth hormone displays an episodic pattern of evolution, with a slow underlying
evolutionary rate and occasional sustained bursts of rapid change. The present study establishes that pituitary prolactin
shows a similar pattern. During much of tetrapod evolution the sequence of prolactin has been strongly conserved, showing
a slow basal rate of change (approx. 0.27 × 10⁻⁹ substitutions/amino acid site/year). This rate has increased substantially (∼12- to 38-fold) on at least four occasions during
eutherian evolution, during the evolution of primates, artiodactyls, rodents, and elephants. That these increases are real
and not a consequence of inadvertent comparison of paralogous genes is shown (for at least the first three groups) by the
fact that they are confined to mature protein coding sequence and not apparent in sequences coding for signal peptides or
when synonymous substitutions are examined. Sequences of teleost prolactins differ markedly from those of tetrapods and lungfish,
but during the course of teleost evolution the rate of change of prolactin has been less variable than that of growth hormone.
It is concluded that the evolutionary pattern seen for prolactin shows long periods of near-stasis interrupted by occasional
bursts of rapid change, resembling the pattern seen for growth hormone in general but not in detail. The most likely basis
for these bursts appears to be adaptive evolution though the biological changes involved are relatively small.
Received: 31 August 1999 / Accepted: 9 February 2000
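As a rough illustration of the rates quoted in this abstract, the sketch below multiplies the basal rate by the reported fold increases. It assumes the basal rate is 0.27 × 10⁻⁹ substitutions per amino acid site per year (reading the printed exponent as −9); the function and variable names are illustrative only and do not come from the paper.

```python
# Basal prolactin rate quoted in the abstract (substitutions / amino acid site / year).
# Assumption: the exponent is -9; the scraped text shows it flattened to "109".
BASAL_RATE = 0.27e-9

# Approximate fold increases reported for the episodes of rapid change.
FOLD_LOW, FOLD_HIGH = 12, 38


def burst_rate(basal: float, fold: float) -> float:
    """Return the rate during a burst, given a basal rate and a fold increase."""
    return basal * fold


low = burst_rate(BASAL_RATE, FOLD_LOW)
high = burst_rate(BASAL_RATE, FOLD_HIGH)
print(f"burst rate range: {low:.2e} to {high:.2e} substitutions/site/year")
```

Under these assumptions the rapid phases would correspond to rates on the order of 10⁻⁹ to 10⁻⁸ substitutions per site per year.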
3.
Zoltán Szabó, Simona A. Levi-Minzi, Angela M. Christiano, Carole Struminger, Mark Stoneking, Mark A. Batzer, Charles D. Boyd. Journal of Molecular Evolution 1999, 49(5):664-671
Previous evidence has demonstrated the absence of exons 34 and 35 within the 3′ end of the human tropoelastin (ELN) gene.
These exons encode conserved polypeptide domains within tropoelastin and are found in the ELN gene in vertebrate species ranging
from chickens to rats to cows. We have analyzed the ELN gene in a variety of primate species to determine whether the absence
of exons 34 and 35 in humans either is due to allelic variation within the human population or is a general characteristic
of the Primates order. An analysis of the 3′ end of the ELN gene in several nonhuman primates and in 546 chromosomes from
humans of varying ethnic background demonstrated a sequential loss of exons 34 and 35 during primate evolution. The loss of
exon 35 occurred at least 35–45 million years ago, when Catarrhines diverged from Platyrrhines (New World monkeys). Exon 34 loss, in contrast, occurred only about 6–8 million years ago, when Homo separated from the common ancestor shared with chimpanzees and gorillas. Loss of both exons was probably facilitated by Alu-mediated
recombination events and possibly conferred a functional evolutionary advantage in elastic tissue.
Received: 6 July 1998 / Accepted: 18 February 1999
4.
The Molecular Evolution of the Vertebrate Trypsinogens
We expand the already large number of known trypsinogen nucleotide and amino acid sequences by presenting additional trypsinogen sequences from the tunicate (Boltenia villosa), the lamprey (Petromyzon marinus), the pufferfish (Fugu rubripes), and the frog (Xenopus laevis). The current array of known trypsinogen sequences now spans the entire vertebrate phylogeny. Phylogenetic analysis is made difficult by the presence of multiple isozymes within species and rates of evolution that vary highly between both species and isozymes. We nevertheless present a Fitch-Margoliash phylogeny constructed from pairwise distances. We employ this phylogeny as a vehicle for speculation on the evolution of the trypsinogen gene family as well as the general modes of evolution of multigene families. Unique attributes of the lamprey and tunicate trypsinogens are noted.
Received: 12 July 1997
5.
Cristian Cañestro, Ricard Albalat, Lars Hjelmqvist, Laura Godoy, Hans Jörnvall, Roser Gonzàlez-Duarte. Journal of Molecular Evolution 2002, 54(1):81-89
The alcohol dehydrogenase (ADH) family has evolved into at least eight ADH classes during vertebrate evolution. We have characterized
three prevertebrate forms of the parent enzyme of this family, including one from a urochordate (Ciona intestinalis) and two from cephalochordates (Branchiostoma floridae and Branchiostoma lanceolatum). An evolutionary analysis of the family was performed by gathering data from protein and gene structures, exon–intron distribution,
and functional features through chordate lines. Our data strongly support that the ADH family expansion occurred 500 million
years ago, after the cephalochordate/vertebrate split, probably in the gnathostome subphylum line of the vertebrates. Evolutionary
rates differ between the ancestral, ADH3 (glutathione-dependent formaldehyde dehydrogenase), and the emerging forms, including
the classical alcohol dehydrogenase, ADH1, which has an evolutionary rate 3.6-fold that of the ADH3 form. Phylogenetic analysis
and chromosomal mapping of the vertebrate Adh gene cluster suggest that family expansion took place by tandem duplications, probably concurrent with the extensive isoform
burst observed before the fish/tetrapod split, rather than through the large-scale genome duplications also postulated in
early vertebrate evolution. The absence of multifunctionality in lower chordate ADHs and the structures compared argue in
favor of the acquisition of new functions in vertebrate ADH classes. Finally, comparison between B. floridae and B. lanceolatum Adhs provides the first estimate for a cephalochordate speciation, 190 million years ago, probably concomitant with the beginning
of the drifting of major land masses from Pangea.
Received: 10 April 2001 / Accepted: 23 May 2001
6.
Glutamine synthetase type I (GSI) genes have previously been described only in prokaryotes except that the fungus Emericella nidulans contains a gene (fluG) which encodes a protein with a large N-terminal domain linked to a C-terminal GSI-like domain. Eukaryotes generally contain
the type II (GSII) genes which have been shown to occur also in some prokaryotes. The question of whether GSI and GSII genes
are orthologues or paralogues remains a point of controversy. In this article we show that GSI-like genes are widespread in
higher plants and have characterized one of the genes from the legume Medicago truncatula. This gene is part of a small gene family and is expressed in many organs of the plant. It encodes a protein similar in size
and with between 36 and 46% amino acid sequence similarity to prokaryotic GS proteins used in the analyses, whereas it is
larger and with less than 25% similarity to GSII proteins, including those from the same plant species. Phylogenetic analyses
suggest that this protein is most similar to putative proteins encoded by expressed sequence tags of other higher plant species
(including dicots and a monocot) and forms a cluster with FluG as the most divergent of the GSI sequences. The discovery of
GSI-like genes in higher plants supports the paralogous evolution of GSI and GSII genes, which has implications for the use
of GS in molecular studies on evolution.
Received: 4 May 1999 / Accepted: 17 September 1999
7.
The chaetognaths are an extraordinarily homogeneous phylum of animals at the morphological level, with a bauplan that can
be traced back to the Cambrian. Despite the attention of zoologists for over two centuries, there is little agreement on classification
within the phylum. We have used a molecular biological approach to investigate the phylogeny of extant chaetognaths. A rapidly
evolving expansion segment toward the 5′ end of 28S ribosomal DNA (rDNA) was amplified using the polymerase chain reaction
(PCR), cloned, and sequenced from 26 chaetognath samples representing 18 species. An unusual finding was the presence of two
distinct classes of 28S rDNA gene in chaetognaths; our analyses suggest these arose by a gene (or gene cluster) duplication
in a common ancestor of extant chaetognaths. The two classes of chaetognath 28S rDNA have been subject to different rates
of molecular evolution; we present evidence that both are expressed and functional. In phylogenetic reconstructions, the two
classes of 28S rDNA yield trees that root each other; these clearly demonstrate that the Aphragmophora and Phragmophora are
natural groups. Within the Aphragmophora, we find good support for the groupings denoted Solidosagitta, Parasagitta, and Pseudosagitta. The relationships between several well-supported groups within the Aphragmophora are uncertain; we suggest this reflects
rapid, recent radiation during chaetognath evolution.
Received: 19 March 1996 / Accepted: 5 August 1996
8.
A fractal renewal point process (FRPP) is used to model molecular evolution in agreement with the relationship between the
variance and the mean numbers of nonsynonymous and synonymous substitutions in mammals. Like other episodic models such as
the doubly stochastic Poisson process, this model accounts for the large variances observed in amino acid substitution rates,
but unlike certain other episodic models, it also accounts for the increase in the index of dispersion with the mean number
of substitutions in Ohta's (1995) data. We find that this correlation is significant for nonsynonymous substitutions at the
1% level and for synonymous substitutions at the 10% level, even after removing lineage effects and when using Bulmer's (1989)
unbiased estimator of the index of dispersion. This model is simpler than most other overdispersed models of evolution in
the sense that it is fully specified by a single interevent probability distribution. Interpretations in terms of chaotic
dynamics and in terms of chance and selection are discussed.
Received: 12 January 1998 / Accepted: 19 May 1998
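The index of dispersion discussed in this abstract is the ratio of the variance to the mean of the number of substitutions across lineages; for a simple Poisson process it is expected to equal 1, and values above 1 indicate overdispersion. The sketch below computes the plain variance-to-mean ratio on made-up counts. It is not Bulmer's (1989) unbiased estimator and it is not the fractal renewal point process itself; the function name and the toy data are assumptions for illustration.

```python
from statistics import mean, variance


def index_of_dispersion(counts):
    """Variance-to-mean ratio of substitution counts across lineages.

    Uses the sample variance (n - 1 denominator). This is the naive ratio,
    not a bias-corrected estimator.
    """
    m = mean(counts)
    if m == 0:
        raise ValueError("mean is zero; the index of dispersion is undefined")
    return variance(counts) / m


# Hypothetical substitution counts for one protein in five lineages.
toy_counts = [2, 4, 4, 6, 9]
r = index_of_dispersion(toy_counts)
print(f"index of dispersion: {r:.2f}")
```

A value above 1 for real data would be in line with the overdispersed clock that episodic models such as the FRPP are meant to explain.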
9.
Calmodulin is a calcium-binding EF-hand protein that is an activator of many enzymes as well as ion pumps and channels. Due
to its multiple targets and its central role in the cell, understanding the evolutionary history of calmodulin genes should
provide insights into the origin of genetic complexity in eukaryotes. We have previously isolated and characterized a calmodulin
gene from the early-diverging chordate Branchiostoma lanceolatum (CaM1). In this paper, we report the existence of a second calmodulin gene (CaM2) as well as two CaM-like genomic fragments (CaML-2, CaML-3) in B. lanceolatum and a CaM2 and three CaM-like genes (CaML-1, CaML-2, CaML-3) in B. floridae. The CaM-like genes were isolated using low-stringency PCR. Surprisingly, the nucleotide sequences of the B. lanceolatum CaM1 and CaM2 cDNAs differ by 19.3%. Moreover, the CaM2 protein differs at two positions from the amino acid sequence of CaM1; the latter
is identical to calmodulins in Drosophila melanogaster, the mollusc Aplysia californica, and the tunicate Halocynthia roretzi. The two B. lanceolatum CaM-like genes are more closely related to the CaM2 than to the CaM1 gene. This relationship is supported by the phylogenetic analyses and the identical exon/intron organization of these three
genes, a relationship unique among animal CaM sequences. These data demonstrate the existence of a CaM multigene family in the cephalochordate Branchiostoma, which may have evolved independently from the multigene family in vertebrates.
Received: 2 November 1999 / Accepted: 25 April 2000
10.
The aldo-keto reductase enzymes comprise a functionally diverse gene family that catalyzes the NADPH-dependent reduction
of a variety of carbonyl compounds. The protein sequences of 45 members of this family were aligned and phylogenetic trees
were deduced from this alignment using the neighbor-joining and Fitch algorithms. The branching order of these trees indicates
that the vertebrate enzymes cluster in three groups, which have a monophyletic origin distinct from the bacterial, plant,
and invertebrate enzymes. A high level of conservation was observed between the vertebrate hydroxysteroid dehydrogenase enzymes,
prostaglandin F synthase, and ρ-crystallin of Xenopus laevis. We infer from the phylogenetic analysis that prostaglandin F synthase may represent a recent recruit to the eicosanoid biosynthetic
pathway from the hydroxysteroid dehydrogenase pathway and furthermore that, in the context of gene recruitment, Xenopus laevis ρ-crystallin may represent a shared gene.
Received: 26 August 1996 / Accepted: 5 June 1997
11.
Albert Jeltsch. Journal of Molecular Evolution 1999, 49(1):161-164
Circular permutations of genes during molecular evolution often are regarded as elusive, although a simple model can explain
these rearrangements. The model assumes that first a gene duplication of the precursor gene occurs in such a way that both
genes become fused in frame, leading to a tandem protein. After generation of a new start codon within the 5′ part of the
tandem gene and a stop at an equivalent position in the 3′ part of the gene, a protein is encoded that represents a perfect
circular permutation of the precursor gene product. The model is illustrated here by the molecular evolution of adenine-N6 DNA methyltransferases. β- and γ-type enzymes of this family can be interconverted by a single circular permutation event.
Interestingly, tandem proteins, proposed as evolutionary intermediates during circular permutation, can be directly observed
in the case of adenine methyltransferases, because some enzymes belonging to type IIS, like the FokI methyltransferase, are built up by two fused enzymes, both of which are active independently of each other. The mechanism
for circular permutation illustrated here is very easy and applicable to every protein. Thus, circular permutation can be
regarded as a normal process in molecular evolution and a changed order of conserved amino acid motifs should not be interpreted
to argue against divergent evolution.
Received: 17 November 1998 / Accepted: 19 February 1999
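The duplication model described in this abstract can be sketched at the level of an ordered list of sequence motifs: fuse two copies in frame, then open a new start inside the first copy and place a stop at the equivalent position in the second copy. The snippet below is a toy illustration only; the function name is hypothetical, each letter stands for one conserved motif, and nothing here models real codons or the methyltransferase sequences.

```python
def circular_permutation_via_tandem(protein: str, new_start: int) -> str:
    """Return the circular permutation produced by the tandem-duplication model.

    protein   -- precursor sequence (toy example: one letter per motif)
    new_start -- index in the first copy where the new start arises
    """
    if not 0 <= new_start < len(protein):
        raise ValueError("new_start must fall inside the first copy")
    # Step 1: in-frame fusion of the duplicated gene gives a tandem protein.
    tandem = protein + protein
    # Step 2: a new start at `new_start` and a stop at the equivalent position
    # in the second copy leave a product of the original length.
    return tandem[new_start:new_start + len(protein)]


precursor = "ABCDEFGH"
permuted = circular_permutation_via_tandem(precursor, 3)
print(precursor, "->", permuted)
```

The product contains every motif of the precursor exactly once, with the order rotated, which is why a changed motif order by itself need not argue against divergent evolution.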
12.
Madern D. Journal of Molecular Evolution 2002, 54(6):825-840
The NAD(P)-dependent malate (L-MalDH) and NAD-dependent lactate (L-LDH) dehydrogenases form a large super-family that has been characterized
in organisms belonging to the three domains of life. In the first part of this study, the group of [LDH-like] L-MalDH, which
are malate dehydrogenases resembling lactate dehydrogenase, were analyzed and clearly defined with respect to the other enzymes.
In the second part, the phylogenetic relationships of the whole super-family were presented by taking into account the [LDH-like]
L-MalDH. The inferred tree unambiguously shows that two ancestral gene duplications, and not one as generally thought, are
needed to explain both the distribution into two enzymatic functions and the observation of three main groups within the super-family:
L-LDH, [LDH-like] L-MalDH, and dimeric L-MalDH. In addition, various cases of functional changes within each group were observed
and analyzed. The direction of evolution was found to always be polarized: from enzymes with a high stringency of substrate
recognition to enzymes with a broad substrate specificity. A specific phyletic distribution of the L-LDH, [LDH-like] L-MalDH,
and dimeric L-MalDH over the Archaeal, Bacterial, and Eukaryal domains was observed. This was analyzed in the light of biochemical,
structural, and genomic data available for the L-LDH, [LDH-like] L-MalDH, and dimeric L-MalDH. This analysis led to the elaboration
of a refined evolutionary scenario of the super-family, in which the selection of L-LDH and the fate of L-MalDH during mitochondrial
genesis are presented.
13.
In translation, separate aminoacyl-tRNA synthetases attach the 20 different amino acids to their cognate tRNAs, with the
exception of glutamine. Eukaryotes and some bacteria employ a specific glutaminyl-tRNA synthetase (GlnRS) which other Bacteria,
the Archaea (archaebacteria), and organelles apparently lack. Instead, tRNAGln is initially acylated with glutamate by glutamyl-tRNA synthetase (GluRS), then the glutamate moiety is transamidated to glutamine.
Lamour et al. [(1994) Proc Natl Acad Sci USA 91:8670–8674] suggested that an early duplication of the GluRS gene in eukaryotes
gave rise to the gene for GlnRS—a copy of which was subsequently transferred to proteobacteria. However, questions remain
about the occurrence of GlnRS genes among the Eucarya (eukaryotes) outside of the "crown" taxa (animals, fungi, and plants),
the distribution of GlnRS genes in the Bacteria, and their evolutionary relationships to genes from the Archaea. Here, we
show that GlnRS occurs in the most deeply branching eukaryotes and that putative GluRS genes from the Archaea are more closely
related to GlnRS and GluRS genes of the Eucarya than to those of Bacteria. There is still no evidence for the existence of
GlnRS in the Archaea. We propose that the last common ancestor to contemporary cells, or cenancestor, used transamidation
to synthesize Gln-tRNAGln and that both the Bacteria and the Archaea retained this pathway, while eukaryotes developed a specific GlnRS gene through
the duplication of an existing GluRS gene. In the Bacteria, GlnRS genes have been identified in a total of 10 species from
three highly diverse taxonomic groups: Thermus/Deinococcus, Proteobacteria γ/β subdivision, and Bacteroides/Cytophaga/Flexibacter.
Although all bacterial GlnRS form a monophyletic group, the broad phyletic distribution of this tRNA synthetase suggests that
multiple gene transfers from eukaryotes to bacteria occurred shortly after the Archaea–eukaryote divergence.
14.
Multiple phospholipase A2 (PLA2) isoenzymes found in a single snake venom induce a variety of pharmacological effects. These multiple forms are formed by gene duplication and accelerated evolution of exons. We examined the amino acid sequences of 127 snake venom PLA2 enzymes and their homologues to study in which location most natural substitutions occur. Our data show that hot spots of amino acid substitutions in this group of proteins occur mostly on the surface. A logistic model correlating the substitution rates of each amino acid residue with their surface accessibility indicates that the probability of natural substitutions occurring in the fully exposed residue is 2.6–3.5 times greater than that of substitutions occurring in buried residues. These surface substitutions play a significant role in the evolution of new PLA2 isoenzymes by altering the specificity of targeting to various tissues or cells, resulting in distinct pharmacological effects. Thus natural substitutions in PLA2 enzymes, in contrast to popular belief, are not random substitutions but appear to be directed toward modifying the molecular surface.
Received: 11 May 1998 / Accepted: 29 June 1998
15.
Michael Wallis. Journal of Molecular Evolution 2001, 53(1):10-18
Pituitary growth hormone (GH) and prolactin have been shown previously to display a pattern of evolution in which episodes of rapid change are imposed on a low underlying basal rate (near-stasis). This study was designed to explore whether a similar pattern is seen in the evolution of other protein hormones in mammals. Seven protein hormones were examined (with the common α-subunit of the glycoprotein hormones providing an additional polypeptide for analysis)—those for which sequences from at least four eutherian orders are available with a suitable non-eutherian outgroup. Six of these (GH, prolactin, insulin, parathyroid hormone, glycoprotein hormone α-subunit, and luteinizing hormone β-subunit) showed markedly variable evolutionary rates in each case with a pattern of a slow basal rate and bursts of rapid change, the precise positions of the bursts varying from protein to protein. Two protein hormones (follicle-stimulating hormone β-subunit and thyroid-stimulating hormone β-subunit) showed no significant rate variation. Based on the sequences currently available, and pooling data from all eight proteins, the phase of slow basal change occupied about 85% of the sampled evolutionary time, but most evolutionary change (about 62% of the substitutions accepted) occurred during the episodes of rapid change. It is concluded that, in mammals at least, a pattern of prolonged periods of near-stasis with occasional episodes of rapid change provides a better model of evolutionary change for protein hormones than the one of constant evolutionary rates that is commonly favored. The mechanisms underlying this episodic evolution are not yet clear, and it may be that they vary from one group to another; in some cases, positive selection appears to underlie bursts of rapid change. Where gene duplication is associated with a period of accelerated evolution this often occurs at the end rather than the beginning of the episode. 
To what extent the type of pattern seen for protein hormones can be extended to other proteins remains to be established.
Received: 10 October 2000 / Accepted: 18 December 2000
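The two pooled percentages in this abstract (about 85% of sampled time in the slow phase, about 62% of accepted substitutions in the rapid phase) imply an average rate ratio between the two phases. The sketch below works that ratio out; it assumes the percentages can be treated as exact fractions of one pooled total, which the abstract states only approximately, and the variable names are illustrative.

```python
# Fractions quoted in the abstract (both are approximate in the source).
TIME_SLOW = 0.85   # share of sampled evolutionary time in the slow basal phase
SUBS_RAPID = 0.62  # share of accepted substitutions in the rapid episodes

time_rapid = 1 - TIME_SLOW
subs_slow = 1 - SUBS_RAPID

# Relative rate = share of substitutions per share of time, for each phase.
rate_rapid = SUBS_RAPID / time_rapid
rate_slow = subs_slow / TIME_SLOW

ratio = rate_rapid / rate_slow
print(f"average rapid-phase rate is about {ratio:.1f} times the basal rate")
```

On these pooled figures the rapid episodes run, on average, roughly nine times faster than the basal phase; individual proteins and lineages may differ considerably.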
16.
The pairs of nitrogen fixation genes nifDK and nifEN encode the α and β subunits of nitrogenase and the two subunits of the NifNE protein complex, involved in the biosynthesis
of the FeMo cofactor, respectively. Comparative analysis of the amino acid sequences of the four NifD, NifK, NifE, and NifN
in several archaeal and bacterial diazotrophs showed extensive sequence similarity between them, suggesting that their encoding
genes constitute a novel paralogous gene family. We propose a two-step model to reconstruct the possible evolutionary history
of the four genes. Accordingly, an ancestor gene gave rise, by an in-tandem paralogous duplication event followed by divergence,
to an ancestral bicistronic operon; the latter, in turn, underwent a paralogous operon duplication event followed by evolutionary
divergence leading to the ancestors of the present-day nifDK and nifEN operons. Both these paralogous duplication events very likely predated the appearance of the last universal common ancestor.
The possible role of the ancestral gene and operon in nitrogen fixation is also discussed.
Received: 21 June 1999 / Accepted: 1 March 2000
17.
Weinreich DM. Journal of Molecular Evolution 2001, 52(1):40-50
A higher rate of molecular evolution in rodents than in primates at synonymous sites and, to a lesser extent, at amino acid
replacement sites has been reported previously for most nuclear genes examined. Thus in these genes the average ratio of amino
acid replacement to synonymous substitution rates in rodents is lower than in primates, an observation at odds with the neutral
model of molecular evolution. Under Ohta's mildly deleterious model of molecular evolution, these observations are seen as
the consequence of the combined effects of a shorter generation time (driving a higher mutation rate) and a larger effective
population size (resulting in more effective selection against mildly deleterious mutations) in rodents. The present study
reports the results of a maximum-likelihood analysis of the ratio of amino acid replacements to synonymous substitutions for
genes encoded in mitochondrial DNA (mtDNA) in these two lineages. A similar pattern is observed: in rodents this ratio is
significantly lower than in primates, again consistent only with the mildly deleterious model. Interestingly the lineage-specific
difference is much more pronounced in mtDNA-encoded than in nuclear-encoded proteins, an observation which is shown to run
counter to expectation under Ohta's model. Finally, accepting certain fossil divergence dates, the lineage-specific difference
in amino acid replacement-to-synonymous substitution ratio in mtDNA can be partitioned and is found to be entirely the consequence
of a higher mutation rate in rodents. This conclusion is consistent with a replication-dependent model of mutation in mtDNA.
Received: 24 September 1999 / Accepted: 18 September 2000
18.
We have investigated the phylogenetic relationships of monotremes and marsupials using nucleotide sequence data from the
neurotrophins: nerve growth factor (NGF), brain-derived neurotrophic factor (BDNF), and neurotrophin-3 (NT-3). The study included
species representing monotremes, Australasian marsupials and placentals, as well as species representing birds, reptiles,
and fish. PCR was used to amplify fragments encoding parts of the neurotrophin genes from echidna, platypus, and eight marsupials
from four different orders. Phylogenetic trees were generated using parsimony analysis, and support for the different tree
structures was evaluated by bootstrapping. The analysis was performed with NGF, BDNF, or NT-3 sequence data used individually
as well as with the three neurotrophins in a combined matrix, thereby simultaneously considering phylogenetic information
from three separate genes. The results showed that the monotreme neurotrophin sequences associate with either therian or bird
neurotrophin sequences and suggest that the monotremes are not necessarily more closely related to therians than to birds. Furthermore,
the results confirmed the present classification of four Australasian marsupial orders based on morphological characters,
and suggested a phylogenetic relationship where Dasyuromorphia is most closely related to Peramelemorphia followed by Notoryctemorphia
and Diprotodontia. These studies show that sequence data from neurotrophins are well suited for phylogenetic analysis of mammals
and that neurotrophins can resolve basal relationships in the evolutionary tree.
Received: 27 January 1997 / Accepted: 20 March 1997
19.
Drosophila ananassae is known to produce numerous alpha-amylase variants. We have cloned seven different Amy genes in an African strain homozygous for the AMY1,2,3,4 electrophoretic pattern. These genes are organized as two main clusters:
the first one contains three intronless copies on the 2L chromosome arm, two of which are tandemly arranged. The other cluster,
on the 3L arm, contains two intron-bearing copies. The amylase variants AMY1 and AMY2 have been assigned to the intronless
cluster, and AMY3 and AMY4 to the second one. The divergence of coding sequences between clusters is moderate (6.1% in amino
acids), but the flanking regions are very different, which could explain their differential regulation. Within each cluster,
coding and noncoding regions are conserved. Two very divergent genes were also cloned, both on chromosome 3L, but very distant
from each other and from the other genes. One is the Amyrel homologue (41% divergent); the second, Amyc1 (21.6% divergent), is unknown outside the D. ananassae subgroup. These two genes have unknown functions.
Received: 30 May 2000 / Accepted: 17 July 2000
20.
We have examined the length distribution of perfect dimer repeats, where perfect means uninterrupted by any other base, using data from GenBank on primates and rodents. Virtually no lengths greater than 30 repeats are found, except for rodent AG repeats, which extend to 35. Comparable numbers of long AC and AG repeats suggest that they have not been selected for special functions or DNA structures. We have compared the data with predictions of two models: (1) a Bernoulli Model in which bases are assumed equally likely and distributed at random and (2) an Unbiased Random Walk Model (URWM) in which repeats are permitted to change length by plus or minus one unit, with equal probabilities, and in which base substitutions are allowed to destroy long perfect repeats, producing two shorter perfect repeats. The source of repeats is assumed to be from single base substitutions from neighboring sequences, i.e., those differing from the perfect repeat by a single base. Mutation rates either independent of repeat length or proportional to length were considered. An upper limit to the lengths L ≈ 30 is assumed and isolated dimers are assumed unable to expand, so that there are absorbing barriers to the random walk at lengths 1 and L + 1, and a steady state of lengths is reached. With these assumptions and estimated values for the rates of length mutation and base substitution, reasonable agreement is found with the data for lengths > 5 repeats. Shorter repeats, of lengths ≤ 3, are in general agreement with the Bernoulli Model. By reducing the rate of length mutations for n ≤ 5, it is possible to obtain reasonable agreement with the full range of data. For these reduced rates, the times between length mutations become comparable to those suggested for a bottleneck in the evolution of Homo sapiens, which may be the reason for low heterozygosity of short repeats.
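The length-change part of the Unbiased Random Walk Model in this abstract can be sketched as a ±1 walk with absorbing barriers at lengths 1 and L + 1. The code below is a deliberately reduced illustration: it follows a single repeat, uses length-independent steps, and leaves out the base substitutions that split repeats and the source term that creates new ones. The function names, the starting length, and the seed are assumptions, not values from the paper.

```python
import random


def walk_until_absorbed(start: int, max_len: int, rng: random.Random):
    """Follow one repeat whose length changes by +1 or -1 with equal probability.

    The walk stops at the absorbing barriers 1 and max_len + 1.
    Returns (final_length, number_of_steps).
    """
    lower, upper = 1, max_len + 1
    if not lower < start < upper:
        raise ValueError("start must lie strictly between the barriers")
    n, steps = start, 0
    while lower < n < upper:
        n += rng.choice((-1, 1))
        steps += 1
    return n, steps


def prob_hit_upper(start: int, max_len: int) -> float:
    """Probability that an unbiased walk from `start` reaches max_len + 1
    before 1 (the standard gambler's-ruin result)."""
    return (start - 1) / max_len


rng = random.Random(0)  # fixed seed so that the run is repeatable
final_length, steps = walk_until_absorbed(start=16, max_len=30, rng=rng)
print(f"absorbed at length {final_length} after {steps} steps")
print(f"P(reach upper barrier from 16) = {prob_hit_upper(16, 30):.2f}")
```

A fuller treatment would track a population of repeats and add the substitution and source processes so that a steady-state length distribution can form, as the abstract describes.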