Similar articles
 Found 20 similar articles (search time: 15 ms)
1.
Code domains in tandem repetitive DNA sequence structures   Total citations: 6, self-citations: 0, citations by others: 6
Peter Vogt. Chromosoma, 1992, 101(10): 585-589
Traditionally, many people doing research in molecular biology attribute coding properties to a given DNA sequence if this sequence contains an open reading frame for translation into a sequence of amino acids. This protein coding capability of DNA was detected about 30 years ago. The underlying genetic code is highly conserved and present in every biological species studied so far. Today, it is obvious that DNA has a much larger coding potential for other important tasks. Apart from coding for specific RNA molecules such as rRNA, snRNA and tRNA molecules, specific structural and sequence patterns of the DNA chain itself express distinct codes for the regulation and expression of its genetic activity. A chromatin code has been defined for phasing of the histone-octamer protein complex in the nucleosome. A translation frame code has been shown to exist that determines correct triplet counting at the ribosome during protein synthesis. A loop code seems to organize the single stranded interaction of the nascent RNA chain with proteins during the splicing process, and a splicing code phases successive 5' and 3' splicing sites. Most of these DNA codes are not exclusively based on the primary DNA sequence itself, but also seem to include specific features of the corresponding higher order structures. Based on the view that these various DNA codes are genetically instructive for specific molecular interactions or processes, important in the nucleus during interphase and during cell division, the coding capability of tandem repetitive DNA sequences has recently been reconsidered.

2.
Nitrosomonas europaea, a chemolithotrophic bacterium, was found to contain two copies of the gene coding for the presumed active site polypeptide of ammonia monooxygenase, the 32-kDa acetylene-binding polypeptide. One copy of this gene was cloned, and its complete nucleotide sequence is presented. Immediately downstream of this gene, in the same operon, is the gene for a 40-kDa polypeptide that copurifies with the ammonia monooxygenase acetylene-binding polypeptide. The sequence of the first 692 nucleotides of this structural gene, coding for about two-thirds of the protein, is presented. These sequences are the first sequences of protein-encoding genes from an ammonia-oxidizing autotrophic nitrifying bacterium. The two protein sequences are not homologous with the sequences of any other monooxygenase. From radioactive labelling of ammonia monooxygenase with [14C]acetylene it was determined that there are 23 nmol of ammonia monooxygenase per g of cells. The kcat of ammonia monooxygenase for NH3 in vivo was calculated to be 20 s⁻¹.

3.
The glycan code of glycoproteins can be conceptually defined at the molecular level by the sequence of well characterized glycans attached to evolutionarily predetermined amino acids along the polypeptide chain. Functional consequences of protein glycosylation are numerous, and include a hierarchy of properties from general physicochemical characteristics such as solubility, stability and protection of the polypeptide from the environment up to specific glycan interactions. Definition of the glycan code for glycoproteins has so far been hampered by the lack of chemically defined glycoprotein glycoforms, which have proved extremely difficult to purify from natural sources and whose total chemical synthesis has hitherto been possible only for very small molecular species. This review summarizes the recent progress in chemical and chemoenzymatic synthesis of complex glycans and their protein conjugates. Progress in our understanding of the ways in which a particular glycoprotein glycoform gives rise to a unique set of functional properties is now having far reaching implications for the biotechnology of important glycodrugs such as therapeutic monoclonal antibodies, glycoprotein hormones, carbohydrate conjugates used for vaccination and other practically important protein–carbohydrate conjugates.

4.
We analyse here the definition of the gene in order to distinguish, on the basis of modern insight in molecular biology, what the gene is coding for, namely a specific polypeptide, and how its expression is realized and controlled. Before the coding role of the DNA was discovered, a gene was identified with a specific phenotypic trait, from Mendel through Morgan up to Benzer. Subsequently, however, molecular biologists ventured to define a gene at the level of the DNA sequence in terms of coding. As is becoming ever more evident, the relations between information stored at DNA level and functional products are very intricate, and the regulatory aspects are as important and essential as the information coding for products. This approach led, thus, to a conceptual hybrid that confused coding, regulation and functional aspects. In this essay, we develop a definition of the gene that once again starts from the functional aspect. A cellular function can be represented by a polypeptide or an RNA. In the case of the polypeptide, its biochemical identity is determined by the mRNA prior to translation, and that is where we locate the gene. The steps from specific, but possibly separated sequence fragments at DNA level to that final mRNA then can be analysed in terms of regulation. For that purpose, we coin the new term “genon”. In that manner, we can clearly separate product and regulative information while keeping the fundamental relation between coding and function without the need to introduce a conceptual hybrid. In mRNA, the program regulating the expression of a gene is superimposed onto and added to the coding sequence in cis - we call it the genon. The complementary external control of a given mRNA by trans-acting factors is incorporated in its transgenon. A consequence of this definition is that, in eukaryotes, the gene is, in most cases, not yet present at DNA level. 
Rather, it is assembled by RNA processing, including differential splicing, from various pieces, as steered by the genon. It emerges finally as an uninterrupted nucleic acid sequence at mRNA level just prior to translation, in faithful correspondence with the amino acid sequence to be produced as a polypeptide. After translation, the genon has fulfilled its role and expires. The distinction between the protein coding information as materialised in the final polypeptide and the processing information represented by the genon allows us to set up a new information theoretic scheme. The standard sequence information determined by the genetic code expresses the relation between coding sequence and product. Backward analysis asks from which coding region in the DNA a given polypeptide originates. The (more interesting) forward analysis asks in how many polypeptides of how many different types a given DNA segment is expressed. This concerns the control of the expression process for which we have introduced the genon concept. Thus, the information theoretic analysis can capture the complementary aspects of coding and regulation, of gene and genon.

5.
Cloned DNAs encoding four different proteins have been isolated from recombinant cDNA libraries constructed with Glycine max seed mRNAs. Two cloned DNAs code for the alpha and alpha'-subunits of the 7S seed storage protein (conglycinin). The other cloned cDNAs code for proteins which are synthesized in vitro as 68,000 d., 60,000 d. or 53,000 d. polypeptides. Hybrid selection experiments indicate that, under low stringency hybridization conditions, all four cDNAs hybridize with mRNAs for the alpha and alpha'-subunits and the 68,000 d., 60,000 d. and 53,000 d. in vitro translation products. Within three of the mRNAs, there is a conserved sequence of 155 nucleotides which is responsible for this hybridization. The conserved nucleotides in the alpha and alpha'-subunit cDNAs and the 68,000 d. polypeptide cDNAs span both coding and noncoding sequences. The differences in the coding nucleotides outside the conserved region are extensive. This suggests that selective pressure to maintain the 155 conserved nucleotides has been influenced by the structure of the seed mRNA. RNA blot hybridizations demonstrate that mRNA encoding the other major subunit (beta) of the 7S seed storage protein also shares sequence homology with the conserved 155 nucleotide sequence of the alpha and alpha'-subunit mRNAs, but not with other coding sequences.

6.
Genetic code redundancy allows most amino acids to be encoded by multiple codons that are non-randomly distributed along coding sequences. An accepted theory explaining the biological significance of such non-uniform codon selection is that codons are translated at different speeds. Thus, varying codon placement along a message may confer variable rates of polypeptide emergence from the ribosome, which may influence the capacity to fold toward the native state. Previous studies report conflicting results regarding whether certain codons correlate with particular structural or folding properties of the encoded protein. This is partly due to different criteria traditionally utilized for predicting translation speeds of codons, including their usage frequencies and the concentration of tRNA species capable of decoding them, which do not always correlate. Here, we developed a metric to predict organism-specific relative translation rates of codons based on the availability of tRNA decoding mechanisms: Watson-Crick, non-Watson-Crick or both types of interactions. We determine translation rates of messages by pulse-chase analyses in living Escherichia coli cells and show that sequence engineering based on these concepts predictably modulates translation rates in a manner that is superior to codon usage frequency, which occur during the elongation phase, and significantly impacts folding of the encoded polypeptide. Finally, we demonstrate that sequence harmonization based on expression host tRNA pools, designed to mimic ribosome movement of the original organism, can significantly increase the folding of the encoded polypeptide. These results illuminate how genetic code degeneracy may function to specify properties beyond amino acid encoding, including folding.
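The redundancy this abstract starts from can be illustrated with the standard genetic code itself. The following is a minimal Python sketch, not the authors' translation-rate metric: it only counts how many codons encode each amino acid under the standard code (NCBI translation table 1). The names CODON_TABLE, degeneracy and synonymous_codons are illustrative and do not come from the article.

```python
from collections import Counter
from itertools import product

# Standard genetic code (NCBI translation table 1), written in the
# conventional T, C, A, G order for each of the three codon positions.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"  # first base T
    "LLLLPPPPHHQQRRRR"  # first base C
    "IIIMTTTTNNKKSSRR"  # first base A
    "VVVVAAAADDEEGGGG"  # first base G
)

# Map each of the 64 codons to a one-letter amino acid ('*' marks a stop).
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def degeneracy():
    """Return the number of codons that encode each amino acid (or stop)."""
    return Counter(CODON_TABLE.values())


def synonymous_codons(amino_acid):
    """Return the sorted list of codons encoding one amino acid."""
    return sorted(c for c, aa in CODON_TABLE.items() if aa == amino_acid)


if __name__ == "__main__":
    for aa, count in sorted(degeneracy().items()):
        print(aa, count, " ".join(synonymous_codons(aa)))
```

Only methionine and tryptophan have a single codon in this table; every other amino acid has at least two, which is what leaves room for the non-uniform codon choice discussed in the abstract.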

7.
8.
The cDNA coding for a glutelin-2 protein from maize endosperm has been cloned and the complete amino acid sequence of the protein derived for the first time. An immature maize endosperm cDNA bank was screened for the expression of a beta-lactamase:glutelin-2 (G2) fusion polypeptide by using antibodies against the purified 28 kd G2 protein. A clone corresponding to the 28 kd G2 protein was sequenced and the primary structure of this protein was derived. Five regions can be defined in the protein sequence: an 11 residue N-terminal part, a repeated region formed by eight units of the sequence Pro-Pro-Pro-Val-His-Leu, an alternating Pro-X stretch 21 residues long, a Cys rich domain and a C-terminal part rich in Gln. The protein sequence is preceded by 19 residues which have the characteristics of the signal peptide found in secreted proteins. Unlike zeins, the main maize storage proteins, 28 kd glutelin-2 has several homologous sequences in common with other cereal storage proteins.

9.
From a human fetal liver cDNA library, a cDNA clone (lambda HFL33) containing the entire coding region for a form of cytochrome P-450 related to P-450 HFLa was obtained. The clone was 1,971 bp long and had an open reading frame of 1,509 nucleotides coding for a 503 amino acid polypeptide. The nucleotide and the deduced amino acid sequences of lambda HFL33 were very similar to but clearly distinct from those of NF25 and HLp cDNAs, which code for forms of cytochrome P-450 in adult human liver. The deduced N-terminal amino acid sequence of the HFL33 protein was identical to that of P-450 HFLa.

10.
The complete nucleotide sequence of human papillomavirus type 1a (7811 nucleotides) has been established. The overall organization of the viral genome is different from that of other related papovaviruses (SV40, BKV, polyoma). Firstly, genetic information seems to be coded by one strand. Secondly, no significant homology is found with SV40 or polyoma coding sequence for either DNA or deduced protein sequences. The relatedness of human and bovine papillomaviruses is revealed by a conserved coding sequence in the two species. Two regions can be defined on the viral genome: the putative early region contains two large open reading frames of 1446 and 966 nucleotides, together with several split ones, and corresponds to the transforming part of the bovine papillomavirus type 1 genome, and the remaining sequences, which include two open reading frames likely to encode structural polypeptide(s). The DNA sequence is analysed and putative signals for regulation of gene expression, and homologies with the Alu family of human ubiquitous repeats and the SV40 72-bp repeat are outlined.

11.
A complementary DNA (cDNA) clone coding for transcobalamin II (TCII) has been isolated from a human umbilical vein endothelial cell cDNA library. The cDNA is 1.9 kb and includes the nucleotide sequence which encodes the NH2-terminal 19 amino acids of human TCII. The size of the cDNA is sufficient to code for the entire protein and also contains the nucleotide sequence coding for a 24 amino acid leader peptide and a long untranslated 3' region. The availability of this cDNA will provide the opportunity to characterize genetic disorders of TCII.

12.
13.
The cDNA for the human rhodanese (thiosulfate: cyanide sulfurtransferase, EC 2.8.1.1), a nuclearly encoded protein of the mitochondrial matrix, was isolated from a human fetal liver cDNA library. Nucleotide sequence revealed an open reading frame coding for a polypeptide of 295 amino acids, which presented a 57% and 58% identity with the bovine and avian rhodanese, respectively. The analysis of the 5'-ends of the coding region gave no evidence for the presence of a cleavable signal sequence as found in other mitochondrial proteins. A comparison with two available amino acid sequences (cow and chicken) showed that sequence similarity is not restricted to the alpha-helix and beta-structure motifs which are remarkably superimposable in the two halves of bovine rhodanese, but extends to adjacent regions.

14.
Genomic and cDNA clones that code for a protein with structural and biochemical properties similar to the receptor protein kinases from animals were obtained from Arabidopsis. Structural features of the predicted polypeptide include an amino-terminal membrane targeting signal sequence, a region containing blocks of leucine-rich repeat elements, a single putative membrane spanning domain, and a characteristic serine/threonine-specific protein kinase domain. The gene coding for this receptor-like transmembrane kinase was designated TMK1. Portions of the TMK1 gene were expressed in Escherichia coli, and antibodies were raised against the recombinant polypeptides. These antibodies immunodecorated a 120-kD polypeptide present in crude extracts and membrane preparations. The immunodetectable band was present in extracts from leaf, stem, root, and floral tissues. The kinase domain of TMK1 was expressed as a fusion protein in E. coli, and the purified fusion protein was found capable of autophosphorylation on serine and threonine residues. The possible role of the TMK1 gene product in transmembrane signaling is discussed.

15.
Cloning and expression of a human muscle phosphofructokinase cDNA   Total citations: 10, self-citations: 0, citations by others: 10
The nucleotide sequence of a 2.86-kb cDNA clone containing the complete human muscle phosphofructokinase (PFK) protein-coding region was determined. It comprises 76 bp of 5'-untranslated sequence, 2340 bp encoding human muscle PFK polypeptide, and 399 bp of 3'-untranslated sequence plus a poly(A) tract. A retroviral vector was utilized to express the product of this coding sequence in mouse fibroblasts. The PFK-coding cDNA was shown to code for an enzymatically active polypeptide by immunoprecipitation analysis and DEAE-Sephadex A-25 chromatography.
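The segment lengths reported in this abstract can be cross-checked with simple arithmetic. The sketch below is illustrative only: the variable names are invented here, 2.86 kb is treated as roughly 2860 bp, and the abstract does not say whether the 2340 bp figure includes the stop codon.

```python
# Segment lengths in base pairs, as reported in the abstract.
FIVE_PRIME_UTR = 76
CODING_REGION = 2340
THREE_PRIME_UTR = 399

# Assumption: "2.86 kb" is a rounded value, taken here as 2860 bp.
CLONE_LENGTH_APPROX = 2860


def summarize_clone():
    """Add up the reported segments and check the coding region's frame."""
    accounted = FIVE_PRIME_UTR + CODING_REGION + THREE_PRIME_UTR
    codons, frame_remainder = divmod(CODING_REGION, 3)
    return {
        "accounted_bp": accounted,
        "codons": codons,
        "frame_remainder": frame_remainder,
        "unassigned_bp_approx": CLONE_LENGTH_APPROX - accounted,
    }


if __name__ == "__main__":
    for key, value in summarize_clone().items():
        print(f"{key}: {value}")
```

The three named segments add up to 2815 bp, and the coding region divides evenly into 780 triplets. The roughly 45 bp left over would be consistent with the poly(A) tract, although that figure is only as precise as the rounded clone length.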

16.
Evolvability of biopolymers is based on molecular coding. The molecular coding is represented by biopolymer function vs monomeric sequence relationship, that is, a proper fitness landscape on the sequence space. On the other hand, molecular coding is mostly realized by monomeric sequence vs biopolymer structure relationship. We suggest the evolution of evolvability based on flexible or multiplex coding originating from flexible or polymorphic conformation of evolving biopolymers. We report a finding supporting that the amino acid landscape of the standard genetic code for an amino acid property which is more important to the protein function gives higher value of an evolvability measure. We developed a promising molecular construct which realized genotype-phenotype linking in order to study in vitro protein evolution and to clarify the above-mentioned protein evolvability.

17.
The complete nucleotide sequence of RNA beta from the type strain of barley stripe mosaic virus (BSMV) has been determined. The sequence is 3289 nucleotides in length and contains four open reading frames (ORFs) which code for proteins of Mr 22,147 (ORF1), Mr 58,098 (ORF2), Mr 17,378 (ORF3), and Mr 14,119 (ORF4). The predicted N-terminal amino acid sequence of the polypeptide encoded by the ORF nearest the 5'-end of the RNA (ORF1) is identical (after the initiator methionine) to the published N-terminal amino acid sequence of BSMV coat protein for 29 of the first 30 amino acids. ORF2 occupies the central portion of the coding region of RNA beta and ORF3 is located at the 3'-end. The ORF4 sequence overlaps the 3'-region of ORF2 and the 5'-region of ORF3 and differs in codon usage from the other three RNA beta ORFs. The coding region of RNA beta is followed by a poly(A) tract and a 238 nucleotide tRNA-like structure which are common to all three BSMV genomic RNAs.

18.
19.
The nucleotide sequence of an almost complete, double-stranded cDNA of chicken Very Low Density Lipoprotein II mRNA, carried in recombinant plasmid pVLDLII 3.33 (Wieringa et al., 1979, 7: 2147-2163) is presented. A stretch of 318 nucleotides codes for the pre-VLDLII polypeptide, which consists of a 24-amino-acid signal and an 82-amino-acid secreted protein. The coding stretch is flanked by 57 nucleotides in the 5'-leader sequence of the mRNA, and 258 nucleotides in the 3'-non-coding region. Hypothetical self-complementary structures of parts of the mRNA are presented.

20.
The whole nucleotide sequence of pT3.2I, the smallest plasmid of the acidophilic bacterium Thiobacillus T3.2, has been determined. pT3.2I is 15,390 bp long with a 53.7% GC content. Different regions can be defined in it: one 2569-bp putative insertion sequence similar to other insertion sequences of some Agrobacterium Ti plasmids; and a longer sequence, which occurs in two almost identical copies, differing only in a 1-bp deletion (6406 and 6405 bp). Several open reading frames and some smaller sequences were found in this duplicated region: ORFA and ORFG, encoding a putative polyol dehydrogenase and a putative RepA replication protein, respectively, an 83-bp sequence which could code for an antisense RNA, and a 36-bp region highly homologous to ori sequences of ColE2- and ColE3-related plasmids. Another putative gene, ORFH, is only present in the longer copy of this region (it is deleted in the short copy) and might encode a 90-amino-acid polypeptide which could act as a second replication protein, RepB. Based on sequence comparisons, pT3.2I can be related to plasmids in the pColE2-CA42 incB incompatibility group.
