Similar articles
 20 similar articles found (search time: 15 ms)
1.
Ren Zhang M.D. 《Amino acids》1997,12(2):167-177
Summary. Based on the genetic code and a simple theorem on the geometrical properties of the regular tetrahedron, each amino acid is mapped onto a unique point in a 3-dimensional tetrahedral space. The distribution of the 20 mapping points for the 20 amino acids is studied in detail. It is found that the mapping points for the hydrophobic and hydrophilic amino acids are distributed in distinct regions of the 3-dimensional space. A plane that satisfactorily separates the two kinds of points has been calculated using Fisher's algorithm. It is shown that the codons coding for the hydrophobic amino acids are composed predominantly of the keto bases, i.e., G and T, whereas the codons coding for the hydrophilic amino acids are composed predominantly of the amino bases, i.e., A and C. The biological implications of the mapping points and the separating plane are discussed in some detail.
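The keto/amino observation in this abstract can be explored with a small sketch. It does not reproduce the paper's tetrahedral mapping or the Fisher separating plane; it only counts keto bases (G, T) per codon using the standard genetic code. The names `CODON_TABLE` and `keto_fraction` are illustrative assumptions, not taken from the paper.

```python
from itertools import product

# Standard genetic code, DNA alphabet, bases ordered T, C, A, G
# (the third codon position varies fastest).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

KETO = set("GT")  # keto bases; A and C are the amino bases


def keto_fraction(amino_acid):
    """Mean fraction of keto bases (G, T) over all codons of one amino acid."""
    codons = [c for c, aa in CODON_TABLE.items() if aa == amino_acid]
    keto = sum(base in KETO for codon in codons for base in codon)
    return keto / (3 * len(codons))
```

For instance, phenylalanine (codons TTT and TTC) gives 5/6, while lysine (AAA and AAG) gives 1/6, in line with the trend the abstract describes for hydrophobic versus hydrophilic residues.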

2.
P C Simons  J D Satterlee 《Biochemistry》1989,28(21):8525-8530
The three major monomer hemoglobins from Glycera dibranchiata erythrocytes isolated in this laboratory were sequenced from their N-termini. A stretch of amino acid sequence identity was used to determine the sequence of a mixed oligodeoxynucleotide that would be complementary to all 12 possible mRNA sequences coding for the amino acids. A cDNA library was constructed by using poly(A+) RNA from G. dibranchiata erythrocytes, the library was probed with the oligonucleotide, and the longest positive inserts found were subcloned into a sequencing plasmid and then sequenced. The first one was 745 bases long, containing 85 bases of 5'-untranslated RNA, an open reading frame of 444 bases coding for 148 amino acids, and a 3'-untranslated region of 216 bases. The predicted amino acid sequence matches the first 25 amino acids of G. dibranchiata monomer globin component IV. The sequence contains an N-terminal methionine plus 18 other mostly conservative sequence changes compared to the published sequence of Imamura et al. (1972), which appears from our partial sequencing to be monomer globin component II. We confirm the presence of leucine in the E7 position, which is histidine in most myoglobins and hemoglobins.
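The "12 possible mRNA sequences" figure is the degeneracy of the peptide stretch used to design the mixed probe: the product of the number of codons for each residue. A minimal sketch follows, assuming the standard genetic code. The abstract does not give the actual peptide, so the example peptide in the usage note is purely hypothetical, as is the function name `probe_degeneracy`.

```python
from collections import Counter
from itertools import product
from math import prod

# Standard genetic code, bases ordered T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Number of synonymous codons for each one-letter amino acid code.
CODONS_PER_AA = Counter(CODON_TABLE.values())


def probe_degeneracy(peptide):
    """Number of distinct coding sequences that could encode the peptide."""
    return prod(CODONS_PER_AA[aa] for aa in peptide)
```

As a hypothetical illustration, `probe_degeneracy("KDI")` is 2 x 2 x 3 = 12, so a mixed oligonucleotide covering that tripeptide would need 12 sequence variants. Stretches rich in Met and Trp (one codon each) keep the mixture small, which is why such stretches are usually favoured for probe design.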

3.
Structure-based prediction of DNA target sites by regulatory proteins   Total citations: 15, self-citations: 0, citations by others: 15  
Kono H  Sarai A 《Proteins》1999,35(1):114-131
Regulatory proteins play a critical role in controlling complex spatial and temporal patterns of gene expression in higher organisms by recognizing multiple DNA sequences and regulating multiple target genes. Increasing amounts of structural data on protein-DNA complexes provide clues to the mechanism of target recognition by regulatory proteins. Analyses of the propensities of base-amino acid interactions observed in those structural data show that there is no one-to-one correspondence in the interaction, but clear preferences exist. On the other hand, analysis of the spatial distribution of amino acids around bases shows that even amino acids with a strong base preference, such as Arg with G, are distributed over a wide space around bases. Thus, amino acids with many different geometries can form a similar type of interaction with bases. The redundancy and structural flexibility in the interaction suggest that there are no simple rules in sequence recognition, and that its prediction is not straightforward. However, the spatial distributions of amino acids around bases indicate that the structural data could be used to derive empirical interaction potentials between amino acids and bases. Such information extracted from structural databases has been successfully used to predict amino acid sequences that fold into particular protein structures. We surmised that the structures of protein-DNA complexes could be used to predict DNA target sites for regulatory proteins, because determining DNA sequences that bind to a particular protein structure should be similar to finding amino acid sequences that fold into a particular structure. Here we demonstrate that the structural data can be used to predict DNA target sequences for regulatory proteins. Pairwise potentials that determine the interaction between bases and amino acids were empirically derived from the structural data. These potentials were then used to examine the compatibility between DNA sequences and the protein-DNA complex structure in a combinatorial "threading" procedure. We applied this strategy to the structures of protein-DNA complexes to predict DNA binding sites recognized by regulatory proteins. To test the applicability of this method in target-site prediction, we examined the effects of cognate and noncognate binding, cooperative binding, and DNA deformation on the binding specificity, predicted binding sites in real promoters, and compared the predictions with experimental data. These results show that target binding sites for several regulatory proteins are successfully predicted, and our data suggest that this method can serve as a powerful tool for predicting multiple target sites and target genes for regulatory proteins.
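The combinatorial "threading" idea can be illustrated with a toy sketch: every candidate DNA sequence is placed onto a fixed set of protein-DNA contacts and scored by summing pairwise base-amino acid potentials. Everything below is an assumption for illustration only. The potential values are arbitrary placeholders, not the empirically derived potentials of the paper, and the contact list, residue names, and function names are hypothetical.

```python
from itertools import product

# Hypothetical placeholder potentials in arbitrary units (lower = more favourable).
# These are NOT values from the paper.
TOY_POTENTIAL = {
    ("ARG", "G"): -1.0,
    ("ASN", "A"): -0.8,
    ("GLN", "A"): -0.6,
}

# Hypothetical contacts: (DNA position, contacting residue type).
TOY_CONTACTS = [(0, "ARG"), (1, "ASN"), (2, "GLN")]


def thread_energy(dna, contacts, potential):
    """Sum pairwise potentials for one DNA sequence placed on the contacts."""
    return sum(potential.get((res, dna[pos]), 0.0) for pos, res in contacts)


def rank_sequences(length, contacts, potential):
    """Score every DNA sequence of the given length, most favourable first."""
    seqs = ("".join(p) for p in product("ACGT", repeat=length))
    return sorted(seqs, key=lambda s: thread_energy(s, contacts, potential))
```

With these toy inputs, `rank_sequences(3, TOY_CONTACTS, TOY_POTENTIAL)[0]` is `"GAA"`. Exhaustive enumeration grows as 4^n, so this brute-force form is only practical for short sites; a real application would also need distance-dependent potentials and a statistical measure of specificity, which this sketch omits.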

4.
An evolutionary scheme is postulated in which the bases enter the genetic code in a definite temporal sequence and the correlated amino acids are assigned definite functions in the evolving system. The scheme requires a singlet code (guanine coding for glycine) evolving into a doublet code (guanine-cytosine doublet coding for gly (GG), ala (GC), arg (CG), pro (CC)). The doublet code evolves into a triplet code. Polymerization of nucleotides is thought to have been by block polymerization rather than by a template mechanism. The proteins formed at first were simple structural peptides. No direct nucleotide-amino acid stereochemical interaction was required. Rather, an adaptor-type indirect mechanism is thought to have been functioning since the origin.

5.
H Hartman 《Origins of life》1975,6(3):423-427
An evolutionary scheme is postulated in which the bases enter the genetic code in a definite temporal sequence and the correlated amino acids are assigned definite functions in the evolving system. The scheme requires a singlet code (guanine coding for glycine) evolving into a doublet code (guanine-cytosine doublet coding for gly (GG), ala (GC), arg (CG), pro (CC)). The doublet code evolves into a triplet code. Polymerization of nucleotides is thought to have been by block polymerization rather than by a template mechanism. The proteins formed at first were simple structural peptides. No direct nucleotide-amino acid stereochemical interaction was required. Rather, an adaptor-type indirect mechanism is thought to have been functioning since the origin.

6.
A progene hypothesis has been proposed earlier to explain the mechanism of origin of the self-reproducing genetic system. Progenes (precursors of the genetic system) are mixed anhydrides of an amino acid and a deoxyribotrinucleotide at the 3'-gamma-terminal phosphate (NpNpNppp-AA); they are produced from dinucleotides (NpNp) and 3'-gamma-aminoacylnucleotidylates (Nppp-AA) as a result of specific interaction between the amino acid and the dinucleotide. The postulated mechanism of progene formation accounts for the selection of substances, including chirality, for the origin of the genetic code, and for the mechanisms of formation, self-reproduction and evolution of the simplest genetic system ("gene-polypeptide"). A stereochemical analysis of the progene formation mechanism has allowed us to support the main statements of the hypothesis that relate to the origin of the genetic code and to the selection of substances. Atomic groups that could be responsible for the specificity of interaction between dinucleotides and amino acids in progene formation have been identified. Stereochemical evidence for the physicochemical basis of the origin of the existing genetic code has been produced: 1) a special role of the second nucleotide of the codon in amino acid coding by the progene hypothesis principle is demonstrated; 2) an advantage of T over U in such coding is demonstrated; 3) for 16 amino acids out of 20, agreement has been obtained between the optimal dinucleotide revealed by the stereochemical analysis and the codon dinucleotides; 4) an explanation for the mechanism of third-nucleotide selection is offered. A reconstruction of the prebiotic code based on these results indicates that the code contains 32 codons and is statistical and group-wise. It encodes 7 groups of isofunctional amino acids. There are 3 overlapping groups of non-polar amino acids: 1) medium-size hydrophobic amino acids (chiefly Val, n-Val and a-But), 2) small and medium-size non-polar amino acids (chiefly Ala, Val, n-Val, a-But and Gly), and 3) small non-polar amino acids (Gly, Ala, a-But). There are 4 groups of polar amino acids: 1) hydroxy plus dicarbonic (Asp, Glu, Ser and Thr), 2) dicarbonic (Asp and Glu), 3) hydroxy (Ser and Thr), and 4) basic (Arg and Lys). The code includes about 20 amino acids, among which are 15-17 canonical and a few common non-canonical ones. The prebiotic code explains many properties of the existing genetic code and is capable of evolving into the latter by way of a gradual replacement of the physicochemical coding mechanism by the enzymatic coding mechanism.

7.
The thermostable neutral protease gene nprT of Bacillus stearothermophilus was sequenced. The DNA sequence revealed only one large open reading frame, composed of 1,644 bases and coding for 548 amino acid residues. A Shine-Dalgarno sequence was found 9 bases upstream from the translation start site (ATG), and the deduced amino acid sequence contained a signal sequence in its amino-terminal region. The sequence of the first 14 amino acids of purified extracellular protease completely matched that deduced from the DNA sequence starting at GTC (Val), 687 bases (229 amino acids) downstream from ATG. This suggests that the protease is translated as a longer polypeptide. The amino acid sequence of the extracellular form of this protease (319 amino acids) was highly homologous to that of the thermostable neutral protease from Bacillus thermoproteolyticus but less homologous to the thermolabile neutral protease from Bacillus subtilis. A promoter region determined by S1 nuclease mapping (TTTTCC for the -35 region and TATTTT for the -10 region) was different from the conserved promoter sequences recognized by the known sigma factors in bacilli. However, it was very homologous to the promoter sequence of the spo0B gene from B. subtilis. The guanine-plus-cytosine content of the coding region of the nprT gene was 58 mol%, while that of the third letter of the codons was much higher (72 mol%).
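The length arithmetic and the two G+C statistics reported in this abstract are easy to reproduce for any coding sequence. The sketch below checks the reported lengths against each other and shows how overall GC content and third-codon-position GC content (often called GC3) could be computed. The function names are illustrative assumptions, and no nprT sequence data are included here.

```python
# Lengths reported in the abstract.
ORF_BASES = 1644          # open reading frame length in bases
PRECURSOR_RESIDUES = 548  # amino acids encoded by the ORF
LEADER_RESIDUES = 229     # residues upstream of the mature N-terminal Val

# Consistency checks: 3 bases per codon; mature enzyme = precursor - leader.
assert ORF_BASES == 3 * PRECURSOR_RESIDUES
MATURE_RESIDUES = PRECURSOR_RESIDUES - LEADER_RESIDUES  # 319, as reported


def gc_content(seq):
    """Fraction of G or C bases in a nucleotide sequence."""
    seq = seq.upper()
    return sum(base in "GC" for base in seq) / len(seq)


def gc3_content(cds):
    """GC fraction at the third position of each codon of an in-frame CDS."""
    return gc_content(cds[2::3])
```

A GC3 value well above the overall GC content, as reported here (72 versus 58 mol%), is a common signature of codon usage bias, because third positions are the least constrained by the encoded amino acid sequence.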

8.
Human liver cDNA coding for protein C has been synthesized, cloned and sequenced. The abundance of protein C message is approximately 0.02% of total mRNA. Three overlapping clones contain 1,798 nucleotides of contiguous sequence, which approximates the size of the protein's mRNA, based upon Northern hybridization. The cDNA sequence consists of 73 5'-noncoding bases, coding sequence for a 461 amino acid nascent polypeptide precursor, a TAA termination codon, 296 3'-noncoding bases, and a 38 base polyadenylation segment. The nascent protein consists of a 33 amino acid "signal", a 9 amino acid propeptide, a 155 amino acid "light" chain, a Lys-Arg connecting dipeptide, and a 262 amino acid "heavy" chain. Human protein C and Factor IX and X precursors possess about one third identical amino acids (59% in the gamma-carboxyglutamate domain), including two forty-six amino acid segments homologous to epidermal growth factor. Human protein C also has similar homology with prothrombin in the "leader", gamma-carboxyglutamate and serine protease domains, but lacks the two "kringle" domains found in prothrombin.

9.
We report here the complete genomic sequence of the Chilean human isolate of Andes virus CHI-7913. The S, M, and L genome segment sequences of this isolate are 1,802, 3,641 and 6,466 bases in length, with an overall GC content of 38.7%. These genome segments code for a nucleocapsid protein of 428 amino acids, a glycoprotein precursor protein of 1,138 amino acids and an RNA-dependent RNA polymerase of 2,152 amino acids. In addition, the genome also has other ORFs coding for putative proteins of 34 to 103 amino acids. The encoded proteins have greater than 98% overall similarity with the proteins of Andes virus isolates AH-1 and Chile R123. Among other sequenced hantaviruses, CHI-7913 is most closely related to Sin Nombre virus, with an overall protein similarity of 92%. The characteristics of the encoded proteins of this isolate, such as hydrophobic domains, glycosylation sites, and conserved amino acid motifs shared with other hantaviruses and other members of the Bunyaviridae family, are identified and discussed.

10.
Nucleotide sequences were determined for cloned cDNAs encoding more than half of the pro alpha 2 chain of type I procollagen from man. Comparisons with previously published data on homologous cDNAs from chick embryos made it possible to examine evolution of the gene in two species which have diverged for 250-300 million years. The amino acid sequence of the alpha-chain domain supported previous indications that there is a strong selective pressure to maintain glycine as every third amino acid and to maintain a prescribed distribution of charged amino acids. However, there is little apparent selective pressure on other amino acids. The amino acid sequence of the C-propeptide domain showed less divergence than the alpha-chain domain. The 5' end or N terminus of the human C-propeptide, however, contained an insert of 12 bases coding for 4 amino acids not found in the chick C-propeptide. About 100 amino acid residues from the N terminus, two residues found in the chick sequence were missing from the human. In the second half of the C-propeptide, there was complete conservation of a 37 amino acid sequence and conservation of 50 out of 51 amino acids in the same region, an observation which suggested that the region serves some special purpose such as directing the association of one pro alpha 2(I) C-propeptide with two pro alpha 1(I) C-propeptides so as to produce the heteropolymeric structure of type I procollagen. In addition, comparison of human and chick DNAs for pro alpha 2(I) revealed three different classes of conservation of nucleotide sequence which have no apparent effect on the structure of the protein: a preference for U in the third base position of codons for glycine, proline, and alanine; a high degree of nucleotide conservation in the 51 amino acid highly conserved region of the C-propeptide; and a high degree of nucleotide conservation in the 3'-noncoding region. These three classes of nucleotide conservation may reflect unusual features of collagen genes, such as their high GC content or their highly repetitive coding sequences.

11.
The 'proliferating cell nuclear antigen' (PCNA), also known as cyclin, appears at the G1/S boundary in the cell cycle. Because of its possible relationship with cell proliferation, PCNA/cyclin has been receiving attention. PCNA/cyclin is a non-histone acidic nuclear protein with an apparent mol. wt of 33,000-36,000. The amino acid composition and the sequence of the first 25 amino acids of rabbit PCNA/cyclin are known. Using an oligonucleotide probe corresponding to the sequence of the first five amino acids, a cDNA clone for PCNA/cyclin was isolated from a rat thymocyte cDNA library. The cDNA (1195 bases) contains an open reading frame of 813 nucleotides coding for 261 amino acids. The 3'-non-coding region is 312 nucleotides long and contains three putative polyadenylation signals. The mol. wt of rat PCNA/cyclin was calculated to be 28,748. The deduced amino acid sequence and composition of rat PCNA/cyclin are in excellent agreement with the published data. Using the cDNA probe, two species of mRNA (1.1 and 0.98 kb) were detected in rat thymocyte RNA. Southern blot analysis of total human genomic DNA suggests that there is a single gene coding for PCNA/cyclin. The deduced amino acid sequence of rat PCNA/cyclin is similar to that of the herpes simplex virus type-1 DNA binding protein.

12.
Two-step photochemical decomposition of aromatic amino acids under picosecond laser UV-irradiation was investigated. These results were compared with the photochemical stability of nucleic acid bases. Using the known ratio between the nucleic acid bases and aromatic amino acids in native bacteriophages lambda and phi X174 it was shown that picosecond laser UV-inactivation of viruses occurred due to the photodegradation of nucleic acid.

13.
The laminin B2 chain has a multidomain structure homologous to the B1 chain   Total citations: 31, self-citations: 0, citations by others: 31  
Laminin (Mr = 850,000) is a large basement membrane-specific glycoprotein composed of three chains: A, B1, and B2. Previously, we have reported the primary structure of the B1 chain of mouse laminin deduced from sequencing cDNA clones (Sasaki M., Kato, S., Kohno, K., Martin, G. R., and Yamada, Y. (1987) Proc. Natl. Acad. Sci. U.S.A. 84, 935-939). Here we report the isolation of overlapping cDNA clones spanning 7642 bases which encode the entire B2 chain. The nucleotide sequence of the clones contains an open reading frame of 4821 bases coding for a protein of 1607 amino acids including 33 amino acids of a presumptive signal peptide. The mRNA for the B2 chain contains 2.5 kilobases of 3'-untranslated region. The deduced amino acid sequence indicates that the B2 chain consists of six distinct domains, including two domains with alpha-helical, coiled-coil structures, two domains with cysteine-rich homologous repeats, and two globular domains. These structural features of the B2 chain are similar to those of the B1 chain. In addition, the amino acid sequences of the B2 and B1 chains demonstrate considerable homology, suggesting that the genes for these two chains arose from a common ancestor.

14.
D W Chung  E W Davie 《Biochemistry》1984,23(18):4232-4236
cDNAs and the genomic DNA coding for the gamma and gamma' chains of human fibrinogen have been isolated and characterized by sequence analysis. The cDNAs coding for the gamma and gamma' chains share a common nucleotide sequence coding for the first 407 amino acid residues in each polypeptide chain. The predominant gamma chain contains an additional four amino acids on its carboxyl-terminal end (residues 408-411). These four amino acids, together with the 3' noncoding sequences, are encoded by the tenth exon. Removal of the ninth intervening sequence following the processing and polyadenylation reactions yields a mature mRNA coding for the predominant gamma chain. The less prevalent gamma' chain contains 20 amino acids at its carboxyl-terminal end (residues 408-427). These 20 amino acids are encoded by the immediate 5' end of the ninth intervening sequence. This results from an occasional processing and polyadenylation reaction that occurs within the region normally constituting the ninth intervening sequence. Accordingly, the gene for the gamma chain of human fibrinogen gives rise to two mRNAs that differ in sequence on their 3' ends. These mRNAs code for polypeptide chains with different carboxyl-terminal sequences. Both of these polypeptides are incorporated into the fibrinogen molecule present in plasma.

15.
16.
H Zuber 《Biophysical chemistry》1988,29(1-2):171-179
Comparison of the primary structures of thermophilic, mesophilic and psychrophilic lactate dehydrogenase (LDH) reveals a multitude of temperature-related amino acid substitutions. Among the substitutions, amino acid residues occurring preferentially in thermophilic or in mesophilic (psychrophilic) LDH were found. On this basis, amino acid residues could be classified in an order from typical thermophilic (thermostabilizing) to typical mesophilic (thermolabilizing, increasing dynamics of the enzyme molecule) residues. The temperature-dependent ratio between thermostabilizing and thermolabilizing amino acid residues forms the basis for the specific structural and functional properties of thermophilic or mesophilic LDH. It is interesting that there appears to be a relationship between this order from thermophilic to mesophilic amino acid residues and the type of bases coding for these individual residues in the translation step of protein biosynthesis. Temperature-related amino acid substitutions are based on temperature-related base substitutions. A possible mechanism of temperature adaptation of LDH through alternative selection of thermophilic and mesophilic amino acid residues at the level of tRNA (anticodon)-mRNA (codon) interactions is discussed. These temperature-adaptation processes are evolutionary events in which the evolution and structure of the genetic code are involved.

17.
Recombinant clones expressing antigenic determinants of the 18-kDa protein antigen from Mycobacterium leprae recognized by the L5 monoclonal antibody were isolated from a lambda gt11 expression library and their nucleotide sequences determined. All clones expressed the M. leprae-specific determinant as part of a large fusion protein with Escherichia coli beta-galactosidase. The deduced amino acid sequence of the coding region indicated that all the lambda gt11 recombinant clones contained an incomplete M. leprae gene sequence representing the carboxy-terminal two-thirds (111 amino acids) of the 18-kDa gene and coding for a peptide of m.w. 12,432. Subsequent isolation and sequencing of a 3.2 kb BamHI-PstI DNA fragment from a genomic M. leprae cosmid library permitted the deduction of the complete 148 amino acid sequence with a predicted m.w. of 16,607. A second open reading frame 560 bases downstream from the 18-kDa coding sequence was found to code for a putative protein of 137 amino acids (m.w. = 15,196). Neither this nor the 18-kDa amino acid sequence displayed any significant homologies with any proteins in the GENBANK, EMBL, or NBRF databases. Crude lysates from recombinant lambda gt11 clones expressing part of the 18-kDa protein have been reported to stimulate the proliferation of some M. leprae-specific helper T cell clones. Thus, it is significant that the complete 18-kDa sequence contains five short peptides predicted to be possible helper T cell antigenic epitopes based on their propensity to form amphipathic helices. Although three of these occur within the 111 amino acid carboxy-terminal peptide expressed by lambda gt11 clones, the most highly amphipathic peptide is found in the amino-terminal region not present in the lambda gt11 recombinants.

18.
The laws governing the degeneracy of the genetic code are discussed below. Of fundamental importance in this context is the classification of the amino acids into groups on the basis of the physicochemical behaviour of their residues. From this, it is possible to formulate arithmetic relationships between the number of amino acids in the same group and the number of coding triplets. It is found that the degeneracy of the genetic code obeys certain laws, the reasons for this being related to the number and the qualitative properties of the amino acids and triplets. The fact that the three bases of a coding triplet have different priorities must also be a critical factor.
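The arithmetic relationships between amino acids and coding triplets start from a simple tabulation of how many codons each amino acid has. The following sketch groups the 20 amino acids of the standard genetic code by degeneracy. It does not implement the physicochemical grouping discussed in the article, and the function name `degeneracy_classes` is an assumption for illustration.

```python
from collections import Counter
from itertools import product

# Standard genetic code, bases ordered T, C, A, G; '*' marks stop codons.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def degeneracy_classes():
    """Map each degeneracy (codons per amino acid) to its sorted amino acids."""
    per_aa = Counter(aa for aa in CODON_TABLE.values() if aa != "*")
    classes = {}
    for aa, n_codons in per_aa.items():
        classes.setdefault(n_codons, []).append(aa)
    return {n: sorted(members) for n, members in sorted(classes.items())}
```

The result has five classes: two amino acids with one codon (M, W), nine with two, one with three (I), five with four, and three with six (L, R, S), which together account for the 61 sense codons.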

19.
Molecular cloning and sequence analysis of salmon growth hormone cDNA   Total citations: 8, self-citations: 0, citations by others: 8  
宋诗铎  丘才良 《遗传学报》1992,19(4):308-315
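A cDNA library was constructed from pituitary poly(A)+ RNA of Pacific chinook salmon (Oncorhynchus tshawytscha). Two oligodeoxynucleotide probes were synthesized according to the partial amino acid sequence of salmon growth hormone (sGH); they are complementary to the sequences encoding amino acids 1-7 and 166-172, respectively. Screening the cDNA library with these probes yielded a complete sGH cDNA clone. The cDNA sequence was determined; it includes a coding sequence for 210 amino acids, comprising a signal peptide of 22 amino acids and a mature GH sequence of 188 amino acids. The clone also includes the 5' and 3' untranslated regions, which are 72 and 438 base pairs long, respectively. Comparison with chum salmon shows that the nucleotide and amino acid sequence homologies are 97% and 99%, respectively.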

20.
On the basis of the previous article (Morchio and Traverso [1999]), we discuss the possible interactions between the first protein fragments, developed in the hydrophobic hydrocarbon layer that would have covered the surface of the primitive seas, and the nitrogenous bases, particularly the pyrimidines, which would have found in this hydrophobic layer conditions favourable to their prebiotic synthesis. Under the physicochemical laws, the only ones at work at that time, these interactions would presumably have led to the linkage of various bases and thus to the construction of the first nucleic acid chains (most likely RNA). Interestingly, this result would have been obtained by inserting two more bases between those hydrogen-bonded to the amino acids, and this might have been the basis for the future "triplets". These interactions might have been particularly significant because of two important consequences: the birth of a rough genetic code and the start of cooperative interactions between bases and amino acids, which would have made the growth of both protein and nucleic acid fragments easier and faster. We conclude that the development of the genetic code was neither a "frozen accident" nor an occurrence directed by any information flow.
