The Artemia hemoglobin is a dimer comprising two nine-domain covalent polymers in quaternary association. Each polymer is encoded by a gene representing nine successive globin domains which have different sequences and are presumed to have been copied originally from a single-domain gene. Two different polymers exist as the result of a complete duplication of the nine-domain gene, allowing the formation of either homodimers or the heterodimer. The total population size of 18 domains comprising nine corresponding pairs, coupled with the probability that they reflect several hundred million years of evolution in the same lineage, provides a unique model in which the process of gene multiplication can be analyzed. The outcome has important implications for the reliability of local molecular clocks. The two polymers differ from each other at 11.7% of amino acid sites; however, when corresponding individual domains are compared between polymers, amino acid substitution varies 2.7-fold from lowest to highest. This variation is not obvious at the DNA level: domain pair identity values vary 1.3-fold. Identity values are, however, uncorrected for multiple substitutions, and both silent and nonsilent changes are pooled. Therefore, to determine the variability in relative substitution rates at the DNA level, we have used the method of Li (1993, J Mol Evol 36:96–99) to determine estimates of nonsynonymous (K_A) and synonymous (K_S) substitutions per site for the nine pairs of domains. As expected, the overall level of silent substitutions (K_S of 56.9%) far exceeded nonsilent substitutions (K_A of 6.7%); however, for corresponding domain pairs, K_A varies 2.3-fold and K_S 1.7-fold. The large discrepancies reflected in the expressed protein have accrued within a single lineage, and the implication is that divergence dates of different genera based on amino acid sequences, even with well-studied proteins of reasonable size, can be wrong by a factor well in excess of 2. Received: 4 June 1997 / Accepted: 17 December 1997

Nonrandomness in the intron and exon phase distributions in a sample of 305 human genes has been found and analyzed. It was shown that exon duplications had a significant effect on the exon phase nonrandomness. All of the nonrandomness is probably due to the combined processes of exon duplication and shuffling. A quantitative estimate of exon duplications in the human genome and their influence on the intron and exon phase distributions has been made. According to our estimate, duplicated exons constitute at least 6% of all exons in the human genome. Generalizing the particular case of exon duplication to the more common event of exon shuffling, we modeled and analyzed the influence of exon shuffling on intron phase distribution. Received: 28 March 1997 / Accepted: 9 July 1997
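
The idea of testing intron phases for nonrandomness can be illustrated with a minimal sketch. It assumes a simple null model in which the three phases (0, 1, 2) are equally likely; the abstract does not state the authors' null model, and the function names and input data below are hypothetical.

```python
def phase_counts(intron_offsets):
    """Count introns by phase.

    Each offset is the number of coding nucleotides that precede the intron,
    so offset % 3 gives the phase (0 = between codons, 1 or 2 = inside a codon).
    """
    counts = [0, 0, 0]
    for offset in intron_offsets:
        counts[offset % 3] += 1
    return counts


def chi_square_uniform(counts):
    """Chi-square goodness-of-fit statistic against equal expected counts.

    Assumption: a uniform null model, chosen here only for illustration.
    """
    total = sum(counts)
    expected = total / len(counts)
    return sum((obs - expected) ** 2 / expected for obs in counts)


# Hypothetical offsets, not data from the study.
counts = phase_counts([3, 6, 10, 14, 9])
statistic = chi_square_uniform(counts)
# With three phases there are 2 degrees of freedom; a statistic above 5.991
# rejects the uniform null at the 5% level.
print(counts, statistic)
```

A more realistic null model would account for codon position preferences and gene length, which this sketch ignores.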

We have determined the genomic structure of an integrin β-subunit gene from the coral, Acropora millepora. The coding region of the gene contains 26 introns, spaced relatively uniformly, and this is significantly more than have been found in any integrin β-subunit genes from higher animals. Twenty-five of the 26 coral introns are also found in a β-subunit gene from at least one other phylum, indicating that the coral introns are ancestral. While there are some suggestions of intron gain or sliding, the predominant theme seen in the homologues from higher animals is extensive intron loss. The coral baseline allows one to infer that a number of introns found in only one phylum of higher animals result from frequent intron loss, as opposed to the seemingly more parsimonious alternative of isolated intron gain. The patterns of intron loss confirm results from protein sequences that most of the vertebrate genes, with the exception of β4, belong to one of two β subunit families. The similarity of the patterns within each of the β1,2,7 and β3,5,6,8 groups indicates that these gene structures have been very stable since early vertebrate evolution. Intron loss has been more extensive in the invertebrate genes, and obvious patterns have yet to emerge in this more limited data set. Received: 5 March 2001 / Accepted: 17 May 2001

The intron positions of ten different protein families were examined to determine the statistical likelihood that spliceosomal introns are the result of random insertion events into previously intronless genes, on the one hand, or the result of random loss from common ancestral introns, on the other. The number of expected matches for the alternative scenarios was calculated for a binomial distribution by considering currently observed introns relative to all possible locations for insertion or loss. Introns occurring at approximately the same location (hereafter called a "match") were tallied for each of the paired proteins. Matches were identified by their positions in the multiple alignment and were defined as any two introns occurring within a window of 11 possible nucleotide positions, thereby allowing for possible alignment errors and "intron sliding." Matches were tallied from the raw data and compared with the expected number of matches for the two different scenarios. The results suggest that the distribution of introns in genes encoding proteins is due to random insertion and not random loss. Received: 8 September 1996 / Accepted: 24 January 1997
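
The expected-match calculation under random insertion can be sketched as follows. This is a simplified illustration, not the authors' procedure: it assumes introns in one gene land independently and uniformly over the aligned positions, and that the 11-position windows around the other gene's introns do not overlap. All function names and the example numbers are hypothetical.

```python
from math import comb


def match_probability(n_introns_a, alignment_length, window=11):
    """Chance that one randomly placed intron falls within `window`
    positions of any of gene A's introns (windows assumed not to overlap)."""
    return min(1.0, n_introns_a * window / alignment_length)


def expected_matches(n_introns_a, n_introns_b, alignment_length, window=11):
    """Mean of the binomial distribution: trials times per-trial probability."""
    p = match_probability(n_introns_a, alignment_length, window)
    return n_introns_b * p


def prob_at_least(k_observed, n_trials, p):
    """Binomial upper tail: probability of k_observed or more matches."""
    return sum(
        comb(n_trials, k) * p**k * (1 - p) ** (n_trials - k)
        for k in range(k_observed, n_trials + 1)
    )


# Hypothetical pair: 5 introns in gene A, 4 in gene B, 1100 aligned positions.
p = match_probability(5, 1100)
print(expected_matches(5, 4, 1100), prob_at_least(1, 4, p))
```

Comparing the observed number of matches against this upper tail indicates whether shared positions are more frequent than random insertion alone would produce.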

Large dsDNA-containing chlorella viruses encode a pyrimidine dimer-specific glycosylase (PDG) that initiates repair of UV-induced pyrimidine dimers. The PDG enzyme is a homologue of the bacteriophage T4-encoded endonuclease V. The pdg gene was cloned and sequenced from 42 chlorella viruses isolated over a 12-year period from diverse geographic regions. Surprisingly, the pdg gene from 15 of these 42 viruses contains a 98-nucleotide intron that is 100% conserved among the viruses, and another 4 viruses contain an 81-nucleotide intron, in the same position, that is nearly 100% identical (one virus differed by one base). In contrast, the nucleotides in the pdg coding regions (exons) from the intron-containing viruses are 84 to 100% identical. The introns in the pdg gene have 5′-AG/GTATGT and 3′-TTGCAG/AA splice site sequences, which are characteristic of nuclear-located, spliceosomal processed pre-mRNA introns. The 100% identity of the 98-nucleotide intron sequence in the 15 viruses and the near-perfect identity of an 81-nucleotide intron sequence in another 4 viruses imply strong selective pressure to maintain the DNA sequence of the intron when it is in the pdg gene. However, the ability of intron-plus and intron-minus viruses to repair UV-damaged DNA in the dark was nearly identical. These findings contradict the widely accepted dogma that intron sequences are more variable than exon sequences. Received: 13 May 1999 / Accepted: 20 August 1999

The Peperomia polybotrya coxI gene intron is the only currently reported group I intron in a vascular plant mitochondrial genome and it likely originated by horizontal transfer from a fungal donor. We provide a clearer picture of the horizontal transfer and a portrayal of the evolution of the group I intron since it was gained by the Peperomia mitochondrial genome. The intron was transferred recently in terms of plant evolution, being restricted to the single genus Peperomia among the order Piperales. Additional support is presented for the suggestion that a recombination/repair mechanism was used by the intron for integration into the Peperomia mitochondrial genome, as a perfect 1:1 correspondence exists between the intron's presence in a species and the presence of divergent nucleotide markers flanking the intron insertion site. Sequencing of coxI introns from additional Peperomia species revealed that several mutations have occurred in the intron since the horizontal transfer, but sequence alterations have not caused frameshifts or created stop codons in the intronic open reading frame. In addition, two coxI pseudogenes in Peperomia cubensis were discovered that lack a large region of coxI exon 2 and contain a truncated version of the group I intron that likely cannot be spliced out. Received: 29 May 1997 / Accepted: 1 November 1997

To study sex differences in mutation rate in primates, we sequenced the third introns of the AMGX and AMGY genes from humans, orangutans, and squirrel monkeys and estimated that the male-to-female ratio of mutation rate is α = 5.14 with a 95% confidence interval of (2.42, 16.6). Combining this data set with the data sets from ZFX/ZFY and SMCX/SMCY introns, we obtained an estimate of α = 5.06 with the 95% confidence interval reduced to (3.24, 8.79). The α value is significantly higher in higher primates than in rodents. Received: 19 August 1996 / Accepted: 22 November 1996
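
The abstract does not say how α was computed from the intron sequences. A minimal sketch, assuming the standard relation for X- and Y-linked sequences (the Y chromosome is always in males, while an X chromosome spends two-thirds of its time in females), gives Y/X = 3α / (2 + α) for the ratio of Y-linked to X-linked substitution rates. The function names below are hypothetical.

```python
def yx_ratio_from_alpha(alpha):
    """Expected ratio of Y-linked to X-linked substitution rates
    for a given male-to-female mutation rate ratio alpha."""
    return 3.0 * alpha / (2.0 + alpha)


def alpha_from_yx_ratio(ratio):
    """Invert Y/X = 3*alpha / (2 + alpha) to recover alpha.

    The ratio must lie strictly between 0 and 3; it approaches 3 as
    alpha grows without bound.
    """
    if not 0.0 < ratio < 3.0:
        raise ValueError("Y/X ratio must be between 0 and 3 (exclusive)")
    return 2.0 * ratio / (3.0 - ratio)


# Round trip using the point estimate reported in the abstract.
ratio = yx_ratio_from_alpha(5.14)
print(ratio, alpha_from_yx_ratio(ratio))
```

Because the Y/X ratio saturates at 3, small errors in a ratio above 2 translate into large changes in α, which is consistent with the wide upper confidence limit for the single-gene estimate.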

P elements of two different subfamilies designated as M- and O-type are thought to have invaded host species in the Drosophila obscura group via horizontal transmission from external sources. Sequence comparisons with P elements isolated from other species suggested that the horizontal invasion by the O-type must have been a rather recent event, whereas the M-type invasion should have occurred in the more distant past. To trace the phylogenetic history of O-type elements, additional taxa were screened for the presence of O- and M-type elements using type-specific PCR primers. The phylogeny deduced from the sequence data of a 927-bp section (14 taxa) indicates that O-type elements have undergone longer periods of regular vertical transmission in the lineages of the saltans and willistoni groups of Drosophila. However, starting from a species of the D. willistoni group they were transmitted horizontally into other lineages. First the lineage of the D. affinis subgroup was infected, and finally, in a more recent wave of horizontal spread, species of three different genera were invaded by O-type elements from the D. affinis lineage: Scaptomyza, Lordiphosa, and the sibling species D. bifasciata/D. imaii of the Drosophila obscura subgroup. The O-type elements isolated from these taxa are almost identical (sequence divergence <1%). In contrast, no such striking similarities are observed among M-type elements. Nevertheless, the sequence phylogeny of M-type elements is also not in accordance with the phylogeny of their host species, suggesting earlier horizontal transfer events. The results imply that P elements cross species barriers more frequently than previously thought but require a particular genomic environment and thus seem to be confined to a rather narrow spectrum of host species. Consequently, different P element types acquired by successive horizontal transmission events often coexist within the same genome. Received: 15 May 2000 / Accepted: 19 July 2000

The morphologically uniform species Gonium pectorale is a colonial green flagellate of worldwide distribution. The affinities of 25 isolates from 18 sites on five continents were assessed by both DNA sequence comparisons and sexual compatibility. Complete sequences were obtained (i) for the internal transcribed spacer ITS-1 and ITS-2 regions of ribosomal DNA and (ii) for each of three single-copy spliceosomal introns, two in a small G protein and one in the actin gene. ITS sequences appeared to homogenize sufficiently rapidly to behave as a single copy gene. Intron sequence differences between isolates in this species reached nucleotide substitution saturation, while ITS sequences did not. Parsimony and evolutionary distance analysis of the two types of DNA data gave essentially the same tree conformation. By all these criteria, the group of G. pectorale isolates fell into two main clades, A and B. Clade A, with isolates from four continents, was comprised of four subclades of quite closely related isolates, plus one strain of ambiguous affinity. Clade B was comprised of two subclades represented by South African and South American isolates, respectively; thus, only subclades of clade B showed geographical localization. With respect to mating, all isolates except one homothallic strain and one apparently sterile strain fell into either one or the other of two mating types. Pairings in all possible combinations revealed that isolates from the same site formed abundant zygotes, which germinated to produce new, sexually active organisms. Zygotes were also formed in many pairings of other combinations, including crosses of clade A with clade B organisms, but none of the latter produced viable germlings. The ability to mate and produce viable progeny that were themselves capable of sexual reproduction was restricted to members of subclades established on the basis of DNA sequence similarities. Thus, the grades of difference in both nuclear intron sequences and rDNA ITS sequences paralleled those observed in the sexual analysis. Received: 9 March 1998 / Accepted: 1 June 1998

Phylogenetic relationships among the NBS-LRR (nucleotide binding site–leucine-rich repeat) resistance gene homologues (RGHs) from 30 genera and nine families were evaluated relative to phylogenies for these taxa. More than 800 NBS-LRR RGHs were analyzed, primarily from Fabaceae, Brassicaceae, Poaceae, and Solanaceae species, but also from representatives of other angiosperm and gymnosperm families. Parsimony, maximum likelihood, and distance methods were used to classify these RGHs relative to previously observed gene subfamilies as well as within more closely related sequence clades. Grouping sequences using a distance cutoff of 250 PAM units (point accepted mutations per 100 residues) identified at least five ancient sequence clades with representatives from several plant families: the previously observed TIR gene subfamily and a minimum of four deep splits within the non-TIR gene subfamily. The deep splits in the non-TIR subfamily are also reflected in comparisons of amino acid substitution rates in various species and in ratios of nonsynonymous-to-synonymous nucleotide substitution rates (K_A/K_S values) in Arabidopsis thaliana. Lower K_A/K_S values in the TIR than the non-TIR sequences suggest greater functional constraints in the TIR subfamily. At least three of the five identified ancient clades appear to predate the angiosperm–gymnosperm radiation. Monocot sequences are absent from the TIR subfamily, as observed in previous studies. In both subfamilies, clades with sequences separated by approximately 150 PAM units are family but not genus specific, providing a rough measure of minimum dates for the first diversification event within these clades. Within any one clade, particular taxa may be dramatically over- or underrepresented, suggesting preferential expansions or losses of certain RGH types within particular taxa and suggesting that no one species will provide models for all major sequence types in other taxa. Received: 13 June 2001 / Accepted: 22 October 2001

The extracellular hemoglobins of cladocerans derive from the aggregation of 12 two-domain globin subunits that are apparently encoded by four genes. This study establishes that at least some of these genes occur as a tandem array in both Daphnia magna and Daphnia exilis. The genes share a uniform structure; a bridge intron separates two globin domains which each include three exons and two introns. Introns are small, averaging just 77 bp, but a longer sequence (2.2–3.2 kb) separates adjacent globin genes. A survey of structural diversity in globin genes from other daphniids revealed three independent cases of intron loss, but exon lengths were identical, excepting a 3-bp insertion in exon 5 of Simocephalus. Heterogeneity in the extent of nucleotide divergence was marked among exons, largely as a result of the pronounced diversification of the terminal exon. This variation reflected, in part, varying exposure to concerted evolution. Conversion events were frequent in exons 1–4 but were absent from exons 5 and 6. Because of this difference, the results of phylogenetic analyses were strongly affected by the sequences employed in this construction. Phylogenies based on total nucleotide divergence in exons 1–4 revealed affinities among all genes isolated from a single species, reflecting the impact of gene conversion events. In contrast, phylogenies based on total nucleotide divergence in exons 5 and 6 revealed affinities among orthologous genes from different taxa. Received: 8 March 1999 / Accepted: 14 July 1999

K-Cl cotransport is abnormally active in erythrocytes containing positively charged hemoglobins such as Hb S (SS: β6 Glu → Val) or Hb C (CC: β6 Glu → Lys). The relatively younger age of erythrocytes in these diseases cannot completely account for the increased K-Cl cotransport activity. It has been suggested that these positively charged hemoglobins may interact with the K-Cl cotransport system or one of its regulators and induce changes in its functional activity. We report here data on the volume- and pH-dependence of K-Cl cotransport in ghosts obtained from normal and sickle erythrocytes, and on the effect of addition of either Hb A or Hb S before resealing. In erythrocyte ghosts prepared with the gel column method to contain minimal amounts of Hb (white ghosts, WG), K-Cl cotransport has similar magnitude in normal and sickle erythrocytes, is not inhibited by alkaline pH, and is volume-independent. Addition of low concentrations of Hb A to WG from normal erythrocytes decreases the magnitude of K-Cl cotransport and restores its volume dependency, but not its pH sensitivity. Addition of Hb S to WG from either normal or sickle erythrocytes restores the volume-dependent component of K-Cl cotransport and increases the magnitude of flux mediated by this transporter. Thus, Hb A and Hb S seem to affect the functional properties of K-Cl cotransport in different ways. Received: 29 May 1998 / Revised: 3 November 1998

A correlation was shown between the length of introns and the codon usage of the coding sequences of the corresponding genes, which in some cases can be related to the level of gene expression. The correlation is positive in unicellular organisms, i.e., genes with longer introns show higher codon usage bias. It is most pronounced in baker's yeast, where it is definitely related to the level of gene expression: genes with a higher level of expression have longer introns. The correlation is inverted in multicellular organisms as compared to unicellular ones. Some organisms, however, do not show the correlation. Its presence or absence does not seem to be related to the GC percent of the coding sequences. Received: 7 December 1999 / Accepted: 10 May 2000

Cryptomonads, small biflagellate algae, contain four different genomes. In addition to the nucleus, mitochondrion, and chloroplast, there is a fourth DNA-containing organelle, the nucleomorph. Nucleomorphs result from the successive reduction of the nucleus of an engulfed phototrophic eukaryotic endosymbiont by a secondary eukaryotic host cell. By sequencing the chloroplast genome and the nucleomorph chromosomes, we identified a groEL homologue in the genome of the chloroplast and a related cpn60 in one of the nucleomorph chromosomes. The nucleomorph-encoded Cpn60 and the chloroplast-encoded GroEL correspond in each case to one of the two divergent GroEL homologues in the cyanobacterium Synechocystis sp. PCC6803. The coexistence of divergent groEL/cpn60 genes in different genomes in one cell offers insights into gene transfer from evolving chloroplasts to cell nuclei and convergent gene evolution in chlorophyll a/b versus chlorophyll a/c/phycobilin eukaryotic lineages. Received: 24 April 1998 / Accepted: 12 June 1998

Vertebrate and many invertebrate globin genes have a three-exon/two-intron organization, with introns in highly conserved positions. According to the "intron early" hypothesis, introns are the vestigial segments which flank previously independent coding sequences, thus providing evidence for the assembly of the ancient proteins by "exon shuffling." In this paper, we report the analysis of the genes of the bivalve mollusk Scapharca inaequivalvis tetrameric hemoglobin (HbII), which support this hypothesis, at least for the hemoglobin genes. We show the existence of "minigenes" in the IIA and IIB globin genes, spanning part of the first and second introns, "in frame" with the heme-binding domain coded by the second exon. Further support for the exon shuffling hypothesis can be found in the degree of identity of the "new" translated sequences with those flanking the central protein domain of some invertebrate hemoglobins. Received: 31 July 1997 / Accepted: 12 December 1997

Variability profiles measured over a set of aligned sequences can be used to estimate evolutionary freedom to vary. Differences in variability profiles between clades can be used to identify shifts in function at the molecular level. We demonstrate such a shift between the alpha and beta subunits of hemoglobin. We also show that the variability profiles for myoglobin are different between whales and primates and speculate that the differences between the two clades may reflect a shift associated with the novel oxygen storage demands in the lineage leading to whales. We discuss the relationship between sequence variability and "evolutionary opportunity" and explore the utility of Maynard Smith's multidimensional evolutionary opportunity space metaphor for exploring functional constraints, genetic redundancy, and the context dependency of the genotype-phenotype map. This work has implications for quantitatively defining and comparing protein function. Supplementary data is available from bioinfo.mbb.yale.edu/align. Received: 16 September 1999 / Accepted: 19 May 2000

Photosynthetic eukaryotes can, according to features of their chloroplasts, be divided into two major groups: the red and the green lineage of plastid evolution. To extend the knowledge about the evolution of the red lineage we have sequenced and analyzed the chloroplast genome (cp-genome) of Cyanidium caldarium RK1, a unicellular red alga (AF022186). The analysis revealed that this genome shows several unusual structural features, such as a hypothetical hairpin structure in a gene-free region and absence of large repeat units. We provide evidence that this structural organization of the cp-genome of C. caldarium may be that of the most ancient cp-genome so far described. We also compared the cp-genome of C. caldarium to the other known cp-genomes of the red lineage. The cp-genome of C. caldarium cannot be readily aligned with that of Porphyra purpurea, a multicellular red alga, or Guillardia theta due to a displacement of a region of the cp-genome. The phylogenetic tree reveals that the secondary endosymbiosis, through which G. theta evolved, took place after the separation of the ancestors of C. caldarium and P. purpurea. We found several genes unique to the cp-genome of C. caldarium. Five of them seem to be involved in the building of bacterial cell envelopes and may be responsible for the thermotolerance of the chloroplast of this alga. Two additional genes may play a role in stabilizing the photosynthetic machinery against salt stress and detoxification of the chloroplast. Thus, these genes may be unique to the cp-genome of C. caldarium and may be required for the endurance of the extreme living conditions of this alga. Received: 3 June 2000 / Accepted: 18 July 2000

The trypsin family of serine proteases is one of the most studied protein families, with a wealth of amino acid sequence information available in public databases. Since trypsin-like enzymes are widely distributed in living organisms in nature, likely evolutionary scenarios have been proposed. A novel methodology for Fourier transformation of biological sequences (FOTOBIS) is presented. The methodology is well suited for the identification of the size and extent of short repeats in protein sequences. In the present paper the trypsin family of enzymes is analyzed with FOTOBIS and strong evidence for tandem gene duplication is found. A likely evolutionary path for the development of present-day trypsins involved an intrinsic extensive tandem gene duplication of a small DNA fragment of 15–18 nucleotides, corresponding to five or six amino acids. This ancestral trypsin gene was subsequently duplicated, leading to the earliest version of a full-sized trypsin, from which the contemporary trypsins have developed. Received: 22 November 1997 / Accepted: 26 January 1998
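
The abstract gives no details of how FOTOBIS is implemented. The general idea of using a Fourier transform to detect short repeats can nevertheless be sketched with a generic indicator-sequence approach: each residue type is turned into a 0/1 signal, and the summed spectral power is evaluated at a candidate repeat period. The function names and the artificial test sequence are hypothetical, and this should not be read as the published method.

```python
import cmath


def periodicity_power(seq, period):
    """Summed Fourier power of all residue indicator signals at 1/period."""
    total = 0.0
    for symbol in set(seq):
        # Sum unit phasors at the positions where this residue occurs.
        coeff = sum(
            cmath.exp(-2j * cmath.pi * i / period)
            for i, residue in enumerate(seq)
            if residue == symbol
        )
        total += abs(coeff) ** 2
    return total


def scan_periods(seq, min_period=2, max_period=10):
    """Power at each integer candidate period in the given range."""
    return {p: periodicity_power(seq, p) for p in range(min_period, max_period + 1)}


# Artificial sequence: a six-residue motif repeated eight times.
sequence = "AAAGGS" * 8
powers = scan_periods(sequence)
best = max(powers, key=powers.get)
print(best, powers[best])
```

For real sequences, harmonics of the true period (for example, period 3 for a six-residue repeat) can carry as much power as the fundamental or more, so peaks should be interpreted together rather than by taking the single maximum.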

The nucleotide sequence of the 18S rDNA coding gene in the parasitic ascomycete fungus Isaria japonica contains a group I intron with a length of 379 nucleotides. The identification of the DNA sequence as a group I intron is based on its sequence homology to other fungal group I introns. The intron contains the highly conserved sequence elements P, Q, R, and S found in other group I introns. Surprisingly, the intron sequence of I. japonica is more similar to that of Ustilago maydis than to the one found in Sclerotinia sclerotiorum. This is in contrast to the sequence identity found in the neighboring rDNA, and it suggests a horizontal transfer of group I intron sequences. Received: 19 September 1997 / Accepted: 10 September 1998

