Similar literature
 20 similar articles found (search time: 31 ms)
1.
The entire set of six closely related Drosophila actin genes was isolated using recombinant DNA methodology, and the structures of the respective coding regions were characterized by gene mapping techniques and by nucleotide sequencing of selected portions. Structural comparisons of these genes have resulted in several unexpected findings. Most striking is the nonconservation of the positions of intervening sequences within the protein-encoding regions of these genes. One of the Drosophila actin genes, DmA4, is split within a glycine codon at position 13; none of the remaining five genes is interrupted in the analogous position. Another gene, DmA6, is split within a glycine codon at position 307; at least two of the Drosophila actin genes are not split in the analogous position. Additionally, none of the Drosophila actin genes is split within codon four, where the yeast actin gene is interrupted. The six Drosophila actin genes encode several different proteins, but the amino acid sequence of each is similar to that of vertebrate cytoplasmic actins. None of the genes encodes a protein comparable in primary sequence to vertebrate skeletal muscle actin. Surprisingly, in each of these derived actin amino acid sequences the initiator methionine is directly followed by a cysteine residue, which in turn precedes the string of three acidic amino acids characteristic of the amino termini of mature vertebrate cytoplasmic actins. We discuss these findings in the context of actin gene evolution and function.

2.
The nucleotide sequence of the rat cytoplasmic beta-actin gene.   Cited by: 120 (self-citations: 23, citations by others: 97)
U Nudel, R Zakut, M Shani, S Neuman, Z Levy, D Yaffe. Nucleic Acids Research 1983, 11(6):1759-1771
The nucleotide sequence of the rat beta-actin gene was determined. The gene codes for a protein identical to the bovine beta-actin. It has a large intron in the 5' untranslated region 6 nucleotides upstream from the initiator ATG, and 4 introns in the coding region at codons specifying amino acids 41/42, 121/122, 267, and 327/328. Unlike the skeletal muscle actin gene and many other actin genes, the beta-actin gene lacks the codon for Cys between the initiator ATG and the codon for the N-terminal amino acid of the mature protein. The usage of synonymous codons in the beta-actin gene is nonrandom, and is similar to that in the rat skeletal muscle and other vertebrate actin genes, but differs from the codon usage in yeast and soybean actin genes.
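
As an illustration of how synonymous codon usage like that described above might be tallied, here is a minimal Python sketch. It is not the authors' method; the function name `codon_counts` and the short example sequence are hypothetical, and the input is assumed to be an in-frame coding sequence with introns already removed.

```python
from collections import Counter


def codon_counts(cds: str) -> Counter:
    """Count each codon in an in-frame coding sequence (hypothetical helper)."""
    cds = cds.upper()
    # Ignore any trailing bases that do not complete a codon.
    usable = len(cds) - len(cds) % 3
    return Counter(cds[i:i + 3] for i in range(0, usable, 3))


# Toy sequence (not real actin data): two GGT codons and one GGC codon, then a stop.
counts = codon_counts("GGTGGCGGTTAA")
```

Comparing such counts among the synonymous codons of one amino acid (here GGT versus GGC for glycine) is one simple way to see whether usage is skewed.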

3.
We have compared the partial nucleotide and derived amino acid sequences of a phaseolin seed storage protein gene of Phaseolus vulgaris (1) and a conglycinin storage protein gene of Glycine max (2). Although these proteins are not antigenically related to one another, the architecture of the genes is similar throughout the sequences compared here. Intervening sequences interrupt the same amino acid positions in both genes. Within the 28% of the G. max gene and the 38% of the P. vulgaris gene represented in this comparison, 73% of the nucleotides in the coding and intervening sequences are identical, excluding the insertions and deletions. The nucleotide mismatches found in the coding sequences are distributed throughout the three codon positions with little bias towards the third codon position. In addition to the single nucleotide differences, six insertions or deletions, ranging from three to twenty-seven nucleotides in length, occur in this portion of the coding region and these are partially responsible for the molecular weight differences of the conglycinin α′-subunit and the phaseolin subunit.

4.
The nucleotide sequence of the chick α-actin gene reveals that the gene comprises 7 exons separated by six very short intervening sequences (IVS). The first IVS interrupts the 73 nucleotide 5' untranslated segment between nucleotides 61 and 62. The remaining IVS interrupt the translated region at codons 41/42, 150, 204, 267, and 327/328. The 272 nucleotide 3' untranslated segment is not interrupted by IVS. The amino acid sequence derived from the nucleotide sequence is identical to the published sequence for chick α-actin except for the presence of a met-cys dipeptide at the amino-terminus. The IVS positions in the chick α-actin gene are identical to those of the rat α-actin gene. While there is partial coincidence of the IVS in the α-actin genes with the vertebrate β-actin genes and 2 sea urchin actin genes, there is no coincidence with actin genes from any other source except soybean, where one IVS position is shared. This discordance in IVS positions makes the actin gene family unique among the eucaryotic genes analyzed to date.

5.
Abstract— Amino acid encoding genes contain character state information that may be useful for phylogenetic analysis on at least two levels. The nucleotide sequence and the translated amino acid sequences have both been employed separately as character states for cladistic studies of various taxa, including studies of the genealogy of genes in multigene families. In essence, amino acid sequences and nucleic acid sequences are two different ways of character coding the information in a gene. Silent positions in the nucleotide sequence (first or third positions in codons that can accrue change without changing the identity of the amino acid that the triplet codes for) may accrue change relatively rapidly and become saturated, losing the pattern of historical divergence. On the other hand, non-silent nucleotide alterations and their accompanying amino acid changes may evolve too slowly to reveal relationships among closely related taxa. In general, the dynamics of sequence change in silent and non-silent positions in protein coding genes result in homoplasy and lack of resolution, respectively. We suggest that the combination of nucleic acid and the translated amino acid coded character states into the same data matrix for phylogenetic analysis addresses some of the problems caused by the rapid change of silent nucleotide positions and overall slow rate of change of non-silent nucleotide positions and slowly changing amino acid positions. One major theoretical problem with this approach is the apparent non-independence of the two sources of characters. However, there are at least three possible outcomes when comparing protein coding nucleic acid sequences with their translated amino acids in a phylogenetic context on a codon by codon basis. First, the two character sets for a codon may be entirely congruent with respect to the information they convey about the relationships of a certain set of taxa. 
Second, one character set may display no information concerning a phylogenetic hypothesis while the other character set may impart information to a hypothesis. These two possibilities are cases of non-independence; however, we argue that congruence in such cases can be thought of as increasing the weight of the particular phylogenetic hypothesis that is supported by those characters. In the third case, the two sources of character information for a particular codon may be entirely incongruent with respect to phylogenetic hypotheses concerning the taxa examined. In this last case the two character sets are independent in that information from neither can predict the character states of the other. Examples of these possibilities are discussed and the general applicability of combining these two sources of information for protein coding genes is presented using sequences from the homeobox region of 46 homeobox genes from Drosophila melanogaster to develop a hypothesis of genealogical relationship of these genes in this large multigene family.
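
The idea of placing nucleotide and translated amino acid characters in the same data matrix can be sketched roughly as follows. This is only an illustrative Python sketch, not the procedure used in the abstract: the names `translate` and `combined_matrix` are hypothetical, the standard genetic code is assumed, the input sequences are assumed to be aligned in frame, and any codon that is not a plain triplet of A, C, G, T (for example one containing a gap) is scored as "X".

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds: str) -> str:
    """Translate an in-frame sequence; unrecognized codons become 'X' (assumption)."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    return "".join(CODON_TABLE.get(cds[i:i + 3], "X") for i in range(0, usable, 3))


def combined_matrix(aligned_cds: dict) -> dict:
    """Map each taxon to its nucleotide characters followed by its amino acid characters."""
    return {
        name: list(seq.upper()) + list(translate(seq))
        for name, seq in aligned_cds.items()
    }


# Toy alignment (hypothetical data): the two sequences differ only at a silent third position.
matrix = combined_matrix({"geneA": "ATGTGT", "geneB": "ATGTGC"})
```

In this toy case the nucleotide characters distinguish the two sequences while the amino acid characters do not, which corresponds to the second outcome described above, where one character set is informative and the other is not.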

6.
Investigations into mechanisms by which cytosine methylation may be genetically controlled have led to the identification of single nucleotide polymorphisms within the coding region of DNMT2 that are conserved in different ethnic groups. The DNMT2 I allele includes a G at nucleotide position 104 of exon 2 and a C at position 50 of exon 4. The alternative allele, DNMT2 II, includes an A and T, respectively, at these positions. G was never found in the absence of C and vice versa and A was never found in the absence of T and vice versa. The gene products of DNMT2 I and DNMT2 II differ by the inclusion of a histidine or tyrosine residue at the position specified by codon 101. This amino acid substitution alters the amino acid composition of a conserved methylating enzyme motif shown to be involved in S-adenosylmethionine binding in M.HhaI, a bacterial methyltransferase that is almost identical to DNMT2 in size and structure. Demonstration of strong linkage disequilibrium between the nucleotide substitutions associated with each DNMT2 allele provides valuable tools for the investigation of molecular genetic mechanisms of evolution and speciation.

7.
8.
The gene coding for the ATCGAT specific BanIII DNA methyltransferase (M-BanIII) of Bacillus aneurinolyticus was cloned and its nucleotides sequenced. The coding region was assigned on the nucleotide sequence on the basis of the N-terminal amino acid sequence and molecular weight of the enzyme. The M-BanIII gene coded for a protein of 580 amino acid residues (MW 66,344). Comparison with other methylases indicated that the M-BanIII sequence contained a segment of tetra-amino acids, NPPY, characteristic of N6-adenine methylases. In addition, some homologous regions were found in the sequences of type II adenine methylases PaeR7I(CTCGAG), TaqI(TCGA) and PstI(CTGCAG), containing TCGA within the recognition sequences.

9.
10.
The relative efficiencies of different protein-coding genes of the mitochondrial genome and different tree-building methods in recovering a known vertebrate phylogeny (two whale species, cow, rat, mouse, opossum, chicken, frog, and three bony fish species) were evaluated. The tree-building methods examined were the neighbor joining (NJ), minimum evolution (ME), maximum parsimony (MP), and maximum likelihood (ML), and both nucleotide sequences and deduced amino acid sequences were analyzed. Generally speaking, amino acid sequences were better than nucleotide sequences in obtaining the true tree (topology) or trees close to the true tree. However, when only first and second codon positions data were used, nucleotide sequences produced reasonably good trees. Among the 13 genes examined, Nd5 produced the true tree in all tree-building methods or algorithms for both amino acid and nucleotide sequence data. Genes Cytb and Nd4 also produced the correct tree in most tree-building algorithms when amino acid sequence data were used. By contrast, Co2, Nd1, and Nd4l showed a poor performance. In general, large genes produced better results, and when the entire set of genes was used, all tree-building methods generated the true tree. In each tree-building method, several distance measures or algorithms were used, but all these distance measures or algorithms produced essentially the same results. The ME method, in which many different topologies are examined, was no better than the NJ method, which generates a single final tree. Similarly, an ML method, in which many topologies are examined, was no better than the ML star decomposition algorithm that generates a single final tree. In ML the best substitution model chosen by using the Akaike information criterion produced no better results than simpler substitution models. These results question the utility of the currently used optimization principles in phylogenetic construction. 
Relatively simple methods such as the NJ and ML star decomposition algorithms seem to produce as good results as those obtained by more sophisticated methods. The efficiencies of the NJ, ME, MP, and ML methods in obtaining the correct tree were nearly the same when amino acid sequence data were used. The most important factor in constructing reliable phylogenetic trees seems to be the number of amino acids or nucleotides used.

11.
A cDNA encoding human class III (chi ADH5) alcohol dehydrogenase was isolated, sequenced and used to comparatively map this unusual ADH. In their coding sequences, the three major ADH classes were approximately equisimilar, class II and III ADHs sharing the highest sequence identity (67%). A class III-like ADH was mapped to mouse chromosome 3, site of the ADH gene complex, and synteny of ADH5 with four other ADH loci on human chromosome 4 was confirmed. The nearly full-length 1613 nucleotide cDNA contained 433 nucleotides of 3' nontranslated sequence and two possible initiation sites for translation. A protein of 374 amino acid residues could be synthesized using the potential initiation codon at nucleotide 59. However, use of the likely initiation codon at nucleotide 5 would produce a protein of 392 residues with 19 additional N-terminal residues as compared to the known protein sequence. The derived protein sequence also differs at residue 166, where Tyr is found. This difference, due to a single base substitution, could result from cloning artifact, polymorphism, or two expressed class III ADH genes.

12.
C Grabau, J E Cronan Jr. Nucleic Acids Research 1986, 14(13):5449-5460
The entire nucleotide sequence of the poxB (pyruvate oxidase) gene of Escherichia coli K-12 has been determined by the dideoxynucleotide (Sanger) sequencing of fragments of the gene cloned into a phage M13 vector. The gene is 1716 nucleotides in length and has an open reading frame which encodes a protein of Mr 62,018. This open reading frame was shown to encode pyruvate oxidase by alignment of the amino acid sequences deduced for the amino and carboxy termini and several internal segments of the mature protein with sequences obtained by amino acid sequence analysis. The deduced amino acid sequence of the oxidase was not unusually rich in hydrophobic sequences despite the peripheral membrane location and lipid binding properties of the protein. The codon usage of the oxidase gene was typical of a moderately expressed protein. The deduced amino acid sequence shares homology with the large subunits of the acetohydroxy acid synthase isozymes I, II, and III, encoded by the ilvB, ilvG, and ilvI genes of E. coli.

13.
The complete nucleotide sequence of a genomic clone encoding the mouse skeletal alpha-actin gene has been determined. This single-copy gene codes for a protein identical in primary sequence to the rabbit skeletal alpha-actin. It has a large intron in the 5'-untranslated region 12 nucleotides upstream from the initiator ATG and five small introns in the coding region at codons specifying amino acids 41/42, 150, 204, 267, and 327/328. These intron positions are identical to those for the corresponding genes of chickens and rats. Similar to other skeletal alpha-actin genes, the nucleotide sequence codes for two amino acids, Met-Cys, preceding the known N-terminal Asp of the mature protein. Comparison of the nucleotide sequences of rat, mouse, chicken, and human skeletal muscle alpha-actin genes reveals conserved sequences (some not previously noted) outside of the protein-coding region. Furthermore, several inverted repeat sequences, partially within these conserved regions, have been identified. These sequences are not present in the vertebrate cytoskeletal beta-actin genes. The strong conservation of the inverted repeat sequences suggests that they may have a role in the tissue-specific expression of skeletal alpha-actin genes.

14.
The evolution of yellow fever virus over 67 years was investigated by comparing the nucleotide sequences of the envelope (E) protein genes of 20 viruses isolated in Africa, the Caribbean, and South America. Uniformly weighted parsimony algorithm analysis defined two major evolutionary yellow fever virus lineages designated E genotypes I and II. E genotype I contained viruses isolated from East and Central Africa. E genotype II viruses were divided into two sublineages: IIA viruses from West Africa and IIB viruses from America, except for a 1979 virus isolated from Trinidad (TRINID79A). Unique signature patterns were identified at 111 nucleotide and 12 amino acid positions within the yellow fever virus E gene by signature pattern analysis. Yellow fever viruses from East and Central Africa contained unique signatures at 60 nucleotide and five amino acid positions, those from West Africa contained unique signatures at 25 nucleotide and two amino acid positions, and viruses from America contained such signatures at 30 nucleotide and five amino acid positions in the E gene. The dissemination of yellow fever viruses from Africa to the Americas is supported by the close genetic relatedness of genotype IIA and IIB viruses and genetic evidence of a possible second introduction of yellow fever virus from West Africa, as illustrated by the TRINID79A virus isolate. The E protein genes of American IIB yellow fever viruses had higher frequencies of amino acid substitutions than did genes of yellow fever viruses of genotypes I and IIA on the basis of comparisons with a consensus amino acid sequence for the yellow fever E gene. The great variation in the E proteins of American yellow fever virus probably results from positive selection imposed by virus interaction with different species of mosquitoes or nonhuman primates in the Americas.

15.
16.
Summary: We have analyzed the correlation that exists between the GC levels of third and first or second codon positions for about 1400 human coding sequences. The linear relationship that was found indicates that the large differences in GC level of third codon positions of human genes are paralleled by smaller differences in GC levels of first and second codon positions. Whereas third codon position differences correspond to very large differences in codon usage within the human genome, the first and second codon position differences correspond to smaller, yet very remarkable, differences in the amino acid composition of encoded proteins. Because GC levels of codon positions are linearly correlated with the GC levels of the isochores harboring the corresponding genes, both codon usage and amino acid composition are different for proteins encoded by genes located in isochores of different GC levels. Furthermore, we have also shown that a linear relationship with a unity slope and a correlation coefficient of 0.77 exists between GC levels of introns and exons from the 238 human genes currently available for this analysis. Introns are, however, about 5% lower in GC, on average, than exons from the same genes.
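
The per-position GC levels discussed above could be computed along these lines. This is a minimal Python sketch under stated assumptions, not the authors' pipeline: the function name `gc_by_codon_position` is hypothetical, the input is assumed to be an in-frame coding sequence, and ambiguous bases are simply counted as non-GC.

```python
def gc_by_codon_position(cds: str) -> tuple:
    """Return the GC fraction at codon positions 1, 2 and 3 (hypothetical helper)."""
    cds = cds.upper()
    # Ignore any trailing bases that do not complete a codon.
    usable = len(cds) - len(cds) % 3
    if usable == 0:
        raise ValueError("sequence is shorter than one codon")
    fractions = []
    for offset in range(3):
        # Every third base starting at this offset belongs to one codon position.
        bases = cds[offset:usable:3]
        fractions.append(sum(base in "GC" for base in bases) / len(bases))
    return tuple(fractions)


# Toy sequence (not real data): codons ATG, GCC, GCG.
gc1, gc2, gc3 = gc_by_codon_position("ATGGCCGCG")
```

Collecting `gc3` against `gc1` or `gc2` over many coding sequences would give the kind of paired values from which a linear relationship like the one reported could be examined.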

17.
Human spermidine synthase: cloning and primary structure   Cited by: 1 (self-citations: 0, citations by others: 1)
Using a synthetic deoxyoligonucleotide mixture constructed for a tryptic peptide of the bovine enzyme as a probe, cDNA coding for the full-length subunit of spermidine synthase was isolated from a human decidual cDNA library constructed on phage lambda gt11. After subcloning into the Eco RI site of pBR322 and propagation, both strands of the insert were sequenced using a shotgun strategy. Starting from the first start codon, which was immediately preceded by a GC-rich region including four overlapping CCGCC consensus sequences, an open reading frame for a 302-amino-acid polypeptide was resolved. This peptide had an Mr of 33,827, started with methionine, and ended with serine. The identity of the isolated cDNA was confirmed by comparison of the deduced amino acid sequence with resolved sequences of the tryptic peptides of bovine spermidine synthase. The coding strand of the cDNA revealed no special regulatory or ribosome-binding signals within 82 nucleotides preceding the start codon and no polyadenylation signal within 247 nucleotides following the stop codon. The coding region, containing a 13-nucleotide repeat close to the 5' end, was longer than, and very different from, that of the bacterial counterpart. This region seems to be of retroviral origin and shows marked homology with sequences found in a variety of human, mammalian, avian, and viral genes and mRNAs. By computer analysis, the first 200 nucleotides of the 5' end of the coding strand appear able to form a very stable secondary structure with a free energy change of -157.6 kcal/mole.(ABSTRACT TRUNCATED AT 250 WORDS)

18.
Previously we reported the amino acid sequences of 4 well-defined sarcoplasmic, high-affinity Ca(2+)-binding proteins in the protochordate amphioxus, Branchiostoma lanceolatum [1]. Here we report on the complete amino acid sequence determination of 3 additional minor isoforms. The seven isoforms differ from each other in 9 positions of a contiguous 17-residue-long segment (positions 20-36) and can be classified into an alpha lineage (ASCP I, III and IV) and a beta lineage (ASCP II, V, VI and VII).

19.
The nucleotide sequence running from the genetic left end of bacteriophage T7 DNA to within the coding sequence of gene 4 is given, except for the internal coding sequence for the gene 1 protein, which has been determined elsewhere. The sequence presented contains nucleotides 1 to 3342 and 5654 to 12,100 of the approximately 40,000 base-pairs of T7 DNA. This sequence includes: the three strong early promoters and the termination site for Escherichia coli RNA polymerase; eight promoter sites for T7 RNA polymerase; six RNAase III cleavage sites; the primary origin of replication of T7 DNA; the complete coding sequences for 13 previously known T7 proteins, including the anti-restriction protein, protein kinase, DNA ligase, the gene 2 inhibitor of E. coli RNA polymerase, single-strand DNA binding protein, the gene 3 endonuclease, and lysozyme (which is actually an N-acetylmuramyl-L-alanine amidase); the complete coding sequences for eight potential new T7-coded proteins; and two apparently independent initiation sites that produce overlapping polypeptide chains of gene 4 primase. More than 86% of the first 12,100 base-pairs of T7 DNA appear to be devoted to specifying amino acid sequences for T7 proteins, and the arrangement of coding sequences and other genetic elements is very efficient. There is little overlap between coding sequences for different proteins, but junctions between adjacent coding sequences are typically close, the termination codon for one protein often overlapping the initiation codon for the next. For almost half of the potential T7 proteins, the sequence in the messenger RNA that can interact with 16 S ribosomal RNA in initiation of protein synthesis is part of the coding sequence for the preceding protein. The longest non-coding region, about 900 base-pairs, is at the left end of the DNA. The right half of this region contains the strong early promoters for E. coli RNA polymerase and the first RNAase III cleavage site. 
The left end contains the terminal repetition (nucleotides 1 to 160), followed by a striking array of repeated sequences (nucleotides 175 to 340) that might have some role in packaging the DNA into phage particles, and an A · T-rich region (nucleotides 356 to 492) that contains a promoter for T7 RNA polymerase, and which might function as a replication origin.

20.
We have determined the nucleotide sequence of the 5' untranslated region and the sequence encoding the signal peptide for mRNAs of the chick alpha 1 type I and alpha 1 type III collagen. These sequences were obtained by synthesizing the corresponding cDNAs using as primers either a synthetic oligonucleotide to prime alpha 1 type I cDNA or a DNA fragment isolated from a genomic clone coding for alpha 1 type III collagen to prime the cognate cDNA. Both primers were selected so that the resulting cDNAs would be short and would contain sequence information for the 5' untranslated region and the signal peptide of the proteins. The nucleotide sequences of these cDNAs were compared with the corresponding sequence of alpha 2 type I collagen. In each mRNA the 5' untranslated segment is approximately 130 nucleotides and contains two or more AUG triplets preceding the AUG which serves as a translation initiation codon. A sequence of about 50 nucleotides surrounding the translation initiation codon is remarkably conserved in all three mRNAs, whereas the sequences preceding and following this segment diverge markedly. This homologous sequence contains an almost identical inverted repeat sequence which could form a stable stem-loop structure. The initiation codon and the AUG which precedes it are found at the same place within this symmetrical sequence and the distance between them is invariant. The rest of the conserved sequence shows a less perfect symmetry. This conserved sequence has not been found in other genes. Our data suggest that these three and perhaps other collagen genes contain an identical regulatory signal that may play a role in determining the level of expression of these genes by modulating translational efficiency.
