Similar articles
 Found 20 similar articles (search time: 15 ms)
1.
Protein structure and neutral theory of evolution   Total citations: 2, self-citations: 0, citations by others: 2
The neutral theory of evolution is extended to the origin of protein molecules. Arguments are presented which suggest that the amino acid sequences of many globular proteins mainly represent "memorized" random sequences, while biological evolution reduces to the "editing" of these random sequences. Physical requirements for a functional globular protein are formulated, and it is shown that many of these requirements do not involve strategic selection of amino acid sequences during biological evolution but are also inherent in typical random sequences. In particular, it is shown that random sequences of polar and nonpolar amino acid residues can form alpha-helices and beta-strands with lengths and arrangements along the chain similar to those in real globular proteins. These alpha- and beta-regions in random sequences can form three-dimensional folding patterns that are also similar to those in proteins. Arguments are presented suggesting that even the tight packing of side groups inside the protein core does not require very strong biological selection of amino acid sequences either. Thus many structural features of real proteins can also exist in random sequences, and biological selection is needed mainly for the creation of the active sites of proteins and for their stability under physiological conditions.

2.
Distributions of α and β regions in globular proteins among clusters containing different numbers of adjacent α-helices, adjacent β regions, and “overlapping” βαβ units are considered. It is shown that these distributions do not differ greatly from what can be expected for random distributions of α and β regions along protein chains. In particular, it is shown that the amounts of relatively long α, β and βαβ clusters (which provide the basis for the conventional classification of globular proteins or domains into α, β, or α/β types) in random sequences also do not differ very much from those in real globular proteins. It follows that the possibility of structural classification of globular proteins (domains) does not imply the existence of a correlation in protein primary structure. This possibility exists even in random sequences of amino acid residues and therefore may not be the result of biological evolution.

3.
The paper reveals the types of amino acid sequences of polypeptide chain regions of globular proteins that form a regular (α or β) or irregular conformation in the native globule. The study takes into account general “architectural” principles of the packing of polypeptide chains in globular proteins and considers the interactions of proteins with water molecules. An a priori theory is developed which permits the identification, in good agreement with experiment, of α-helical and β-structural regions in globular proteins from their primary structure.

4.
It is widely believed that the unique primary structure of a given protein is necessary for its folding into a certain three-dimensional structure as well as for its functioning, and is a result of directed selection in the course of biological evolution. The present paper provides arguments in favour of an alternative point of view, according to which typical three-dimensional structures of globular proteins are characteristic even of random sequences of amino acid residues. It may therefore be possible that primary structures of proteins are mainly examples of random amino acid sequences slightly edited in the course of biological evolution to impart to them some additional (functional) meaning.

5.
Evidence from a number of studies indicates that protein folding is dictated not only by factors stabilizing the native state, but also by potentially independent factors that create folding pathways. How natural selection might cope simultaneously with two independent factors was addressed in this study within the framework of the "Lim-model" of protein folding, which postulates that the early stages of folding of all globular proteins, regardless of their native structure, are directed at least in part by the potential to form amphiphilic α-helices. For this purpose, the amphiphilic α-helical potential in randomly ordered amino acid sequences and the conservation in phylogeny of amphiphilic α-helical potential within various proteins were assessed. These analyses revealed that amphiphilic α-helical potential is a common occurrence in random sequences, and that within a given protein it is present but not conserved in phylogeny. The results suggest that the rapid formation of molten globules, and the variable behavior of those globules depending on the protein, may be a fundamental property of polymers of naturally occurring amino acids more so than a trait that must be derived or maintained by natural selection. Further, the results point toward the utility of randomly occurring processes in protein function and evolution, and suggest that the formation of efficient pathways that determine early processes in protein folding, unlike the formation of stable, native protein structure, does not present a substantial hurdle during the evolution of amino acid sequences.

6.
The solid state secondary structure of myoglobin, RNase A, concanavalin A (Con A), poly(L-lysine), and two linear heterooligomeric peptides was examined by both far-UV CD spectroscopy and IR spectroscopy. The proteins associated from water solution on glass and mica surfaces into noncrystalline, amorphous films, as judged by transmission electron microscopy of carbon-platinum replicas of surface and cross-fractured layers. The association into the solid state induced insignificant changes in the amide CD spectra of all-α-helical myoglobin, decreased the molar ellipticity of the α/β RNase A, and increased the molar ellipticity of all-β Con A, with no change in the positions of the bands' maxima. High-temperature exposure of the films induced permanent changes in the conformation of all proteins, resulting in less α-helix and more β-sheet structure. The results suggest that the protein α-helices are less stable in films and that the secondary structure may rearrange into β-sheets at high temperature. Two heterooligomeric peptides and poly(L-lysine), all in solution at neutral pH with “random coil” conformation, formed films with variable degrees of their secondary structure in β-sheets or β-turns. The result corresponded to the protein-derived Chou-Fasman amino acid propensities, and depended on both the temperature and the solvent used. The IR and CD spectra correlations of the peptides in the solid state indicate that the CD spectrum of a “random” structure in films differs from that of a random coil in solution. Formic acid treatment transformed the secondary structure of the protein and peptide films into stable α-helix or β-sheet conformations. The results indicate that the proteins aggregate into a noncrystalline, glass-like state with preserved secondary structure. The solid state secondary structure may undergo further irreversible transformations induced by heat or solvent. © 1993 John Wiley & Sons, Inc.

7.
It has been suggested (Doolittle et al., 1977) that portions of the α-, β- and γ-chains of fibrinogen form a coiled-coil rope of α-helices and that this rope connects globular domains of the molecule. A fast Fourier transform analysis of the relevant amino acid sequences has shown that there is a significant 3.5-residue period in the linear disposition of the apolar residues in all three chains. This periodicity is characteristic of amino acid sequences of α-fibrous proteins, such as α-tropomyosin and α-keratin, where the tertiary structure is closely related to a coiled-coil of α-helices. However, a detailed study of the fibrinogen sequences shows that the structure is likely to contain several regions which do not have a simple secondary structure. The detailed conformation of the postulated rodlike region of fibrinogen is therefore complex and may approximate a coiled-coil only over relatively short lengths. An important question to emerge from this analysis is whether correct positioning of apolar residues in a pseudo-repeating heptad is sufficiently important to override the low α-helix-favouring potential of other residues in the heptad.
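The periodicity analysis described in item 7 can be illustrated with a minimal sketch. It evaluates the Fourier power of a binary apolar-residue indicator directly at one chosen period, instead of running a full FFT as the abstract does. The function name `periodicity_power`, the `APOLAR` residue set, and the toy heptad sequence are all assumptions made for illustration; none of them come from the abstract.

```python
import cmath

# Assumed apolar residue set, for illustration only (not from the abstract).
APOLAR = frozenset("VLIMFW")


def periodicity_power(seq, period, apolar=APOLAR):
    """Fourier power of the apolar indicator of `seq` at a given residue period.

    The indicator is 1 for apolar residues and 0 otherwise; its mean is
    subtracted so that composition alone does not contribute to the power.
    """
    x = [1.0 if residue in apolar else 0.0 for residue in seq]
    n = len(x)
    if n == 0:
        raise ValueError("sequence must not be empty")
    mean = sum(x) / n
    total = sum(
        (value - mean) * cmath.exp(-2j * cmath.pi * k / period)
        for k, value in enumerate(x)
    )
    return abs(total) ** 2 / n


# Toy heptad repeat with apolar residues at positions 1 and 4 of each heptad.
toy_heptad = "LKKLKKK" * 10

power_3_5 = periodicity_power(toy_heptad, 3.5)
power_5_0 = periodicity_power(toy_heptad, 5.0)
print(power_3_5 > power_5_0)
```

On this toy repeat the power at a 3.5-residue period clearly exceeds the power at an unrelated period such as 5.0, which is the kind of signal the abstract reports for the fibrinogen chains.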

8.
A theoretical study has shown that the occurrence of various structural elements in stable folds of random copolymers depends exponentially on the element's own energy. A similar dependence of occurrence on energy is observed in globular proteins, from the level of amino acid conformations to the level of overall architectures. Thus, the structural features stabilized by many random sequences are typical of globular proteins, while the features rarely observed in proteins are those which are stabilized by only a minor part of the random sequences. © 1995 Wiley-Liss, Inc.

9.
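Bioinformatics analysis of plant chalcone isomerase   Total citations: 2, self-citations: 0, citations by others: 2
Chen Keke, Wu Xue. 《生物信息学》 (Bioinformatics), 2009, 7(3): 163-167
Chalcone isomerase (CHI) is one of the key enzymes in the flavonoid biosynthesis pathway. A systematic bioinformatic analysis of the gene and its encoded protein lays a foundation for further research. Using the nucleic acid and amino acid sequences of CHI genes registered in the NCBI database, and focusing on grape CHI, this paper predicts and infers composition, hydrophobicity/hydrophilicity, post-translational modification, and protein secondary and tertiary structure. The results show that grape CHI has no pronounced hydrophilic or hydrophobic regions; its secondary structure consists mainly of α-helices, random coil, and β-sheets, with β-turns scattered along the whole peptide chain; β3a–β3f together with α1–α7 form the core of the tertiary structure; the protein contains a CHI domain; and it is highly conserved in higher-order structure, active sites, and other respects.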

10.
Abstract

The Protein Data Bank (PDB) is the preeminent source of protein structural information. PDB contains over 32,500 experimentally determined 3-D structures solved using X-ray crystallography or nuclear magnetic resonance spectroscopy. Intrinsically disordered regions fail to form a fixed 3-D structure under physiological conditions. In this study, we compare the amino-acid sequences of proteins whose structures are determined by X-ray crystallography with the corresponding sequences from the Swiss-Prot database. The analyzed dataset includes 16,370 structures, which represent 18,101 PDB chains and 5,434 different proteins from 910 different organisms (2,793 eukaryotic, 2,109 bacterial, 288 viral, and 244 archaeal). In this dataset, on average, each Swiss-Prot protein is represented by 7 PDB chains, with 76% of the crystallized regions being represented by more than one structure. Intriguingly, the complete sequences of only ~7% of proteins are observed in the corresponding PDB structures, and only ~25% of the total dataset have >95% of their lengths observed in the corresponding PDB structures. This suggests that the vast majority of PDB proteins are shorter than their corresponding Swiss-Prot sequences and/or contain numerous residues that are not observed in maps of electron density. To determine the prevalence of disordered regions in PDB, the residues in the Swiss-Prot sequences were grouped into four general categories, “Observed” (which correspond to structured regions), “Not observed” (regions with missing electron density, potentially disordered), “Uncharacterized,” and “Ambiguous,” depending on their appearance in the corresponding PDB entries. This non-redundant set of residues can be viewed as a ‘fragment’ or empirical domain database that contains a set of experimentally determined structured regions or domains and a set of experimentally verified disordered regions or domains.
We studied the propensities and properties of residues in these four categories and analyzed their relations to the predictions of disorder using several algorithms. “Not observed,” “Ambiguous,” and “Uncharacterized” regions were shown to possess the amino acid compositional biases typical of intrinsically disordered proteins. The application of four different disorder predictors (PONDR® VL-XT, VL3-BA, VSL1P, and IUPred) revealed that the vast majority of residues in the “Observed” dataset are ordered, and that the “Not observed” regions are mostly disordered. The “Uncharacterized” regions possess some tendency toward order, whereas the predictions for the short “Ambiguous” regions are really ambiguous. Long “Ambiguous” regions (>70 amino acid residues) are mostly predicted to be ordered, suggesting that they are likely to be “wobbly” domains.

Overall, we showed that completely ordered proteins are not highly abundant in PDB and many PDB sequences have disordered regions. In fact, in the analyzed dataset ~10% of the PDB proteins contain regions of consecutive missing or ambiguous residues longer than 30 amino acids, and ~40% of the proteins possess short regions (≥10 and <30 amino acids long) of missing and ambiguous residues.

11.
In this paper we present a new residue contact potential derived by statistical analysis of protein crystal structures. This gives mean hydrophobic and pairwise contact energies as a function of residue type and distance interval. To test the accuracy of this potential we generate model structures by “threading” different sequences through backbone folding motifs found in the structural data base. We find that conformational energies calculated by summing contact potentials show perfect specificity in matching the correct sequences with each globular folding motif in a 161-protein data set. They also identify correct models with the core folding motifs of hemerythrin and immunoglobulin McPC603 V1-domain, among millions of alternatives possible when we align subsequences with α-helices and β-strands, and allow for variation in the lengths of intervening loops. We suggest that contact potentials reflect important constraints on nonbonded interaction in native proteins, and that “threading” may be useful for structure prediction by recognition of folding motif. © 1993 Wiley-Liss, Inc.

12.
With the explosive growth of protein sequences entering protein data banks in the post-genomic era, there is strong demand for automated methods that rapidly and effectively identify protein–protein binding sites (PPBSs) from sequence information alone. To address this problem, we proposed a predictor called iPPBS-PseAAC, in which each amino acid residue site of the proteins concerned was treated as a 15-tuple peptide segment generated by sliding a window along the protein chains with its center aligned with the target residue. The working peptide segment is further formulated by a general form of pseudo amino acid composition via the following procedures: (1) it is converted into a numerical series via the physicochemical properties of amino acids; (2) the numerical series is subsequently converted into a 20-D feature vector by means of the stationary wavelet transform technique. Formed by many individual “Random Forest” classifiers, the operation engine that runs the prediction is a two-layer ensemble classifier, with the 1st layer voting out the best training data set from many bootstrap systems and the 2nd layer voting out the most relevant one from seven physicochemical properties. Cross-validation tests indicate that the new predictor is very promising, meaning that many important key features, which are deeply hidden in complicated protein sequences, can be extracted via the wavelet transform approach, quite consistent with the fact that many important biological functions of proteins can be elucidated through their low-frequency internal motions. The web server of iPPBS-PseAAC is accessible at http://www.jci-bioinfo.cn/iPPBS-PseAAC, by which users can easily acquire their desired results without the need to follow the complicated mathematical equations involved.
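The window step in item 12 (one 15-residue segment per site, centred on the target residue) can be sketched as follows. The abstract does not say how chain ends are handled, so padding with a dummy residue `X` is an assumption, as is the function name `sliding_windows`. The sketch covers only segment extraction, not the pseudo amino acid composition, the wavelet transform, or the classifier.

```python
def sliding_windows(seq, width=15, pad="X"):
    """Return one window per residue of `seq`, centred on that residue.

    `width` must be odd so that a unique centre exists. Chain ends are
    padded with `pad`; this padding scheme is an assumption, not a detail
    taken from the abstract.
    """
    if width < 1 or width % 2 == 0:
        raise ValueError("width must be a positive odd integer")
    half = width // 2
    padded = pad * half + seq + pad * half
    return [padded[i:i + width] for i in range(len(seq))]


# Toy usage with a short sequence and a narrow window for readability.
toy_seq = "ACDEFGHIKL"
toy_windows = sliding_windows(toy_seq, width=5)
print(toy_windows[0], toy_windows[2])
```

Each returned segment has the target residue at its central index, so the list lines up one-to-one with the residue positions of the input chain.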

13.
For a successful analysis of the relation between amino acid sequence and protein structure, an unambiguous and physically meaningful definition of secondary structure is essential. We have developed a set of simple and physically motivated criteria for secondary structure, programmed as a pattern-recognition process of hydrogen-bonded and geometrical features extracted from x-ray coordinates. Cooperative secondary structure is recognized as repeats of the elementary hydrogen-bonding patterns “turn” and “bridge.” Repeating turns are “helices,” repeating bridges are “ladders,” connected ladders are “sheets.” Geometric structure is defined in terms of the concepts torsion and curvature of differential geometry. Local chain “chirality” is the torsional handedness of four consecutive Cα positions and is positive for right-handed helices and negative for ideal twisted β-sheets. Curved pieces are defined as “bends.” Solvent “exposure” is given as the number of water molecules in possible contact with a residue. The end result is a compilation of the primary structure, including SS bonds, secondary structure, and solvent exposure of 62 different globular proteins. The presentation is in linear form: strip graphs for an overall view and strip tables for the details of each of 10,925 residues. The dictionary is also available in computer-readable form for protein structure prediction work.
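The “chirality” definition in item 13 (torsional handedness of four consecutive Cα positions) can be sketched as the sign of an ordinary dihedral angle. The code below uses the standard atan2 form of the dihedral; the function names and the idealized helix parameters (radius 2.3 Å, rise 1.5 Å, 100° per residue) are illustrative assumptions and are not taken from the abstract.

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def dihedral_degrees(p0, p1, p2, p3):
    """Signed dihedral angle (degrees) defined by four points."""
    b1, b2, b3 = _sub(p1, p0), _sub(p2, p1), _sub(p3, p2)
    n1, n2 = _cross(b1, b2), _cross(b2, b3)
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))


def ideal_helix(n, handedness=1, radius=2.3, rise=1.5, turn_deg=100.0):
    """Idealized C-alpha trace of a helix (assumed, textbook-like parameters)."""
    points = []
    for i in range(n):
        theta = math.radians(handedness * turn_deg * i)
        points.append((radius * math.cos(theta), radius * math.sin(theta), rise * i))
    return points


right = dihedral_degrees(*ideal_helix(4, handedness=1))
left = dihedral_degrees(*ideal_helix(4, handedness=-1))
print(right > 0, left < 0)
```

For the idealized right-handed trace the angle comes out positive (about 50°), and the mirror-image trace gives the same magnitude with the opposite sign, matching the sign convention stated in the abstract.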

14.
The 3D structures of α-crystallin, a major eye lens protein, and related small heat shock proteins are unresolved. It has been assumed that α-crystallin is primarily a β-sheet globular protein similar to γ-crystallin (Siezen and Argos, Biochim. Biophys. Acta, 1983, 748, 56–67) containing sequence repeats in its two domains (Wistow, FEBS Lett. 1985, 181, 1–6). Positional flexibility of amino acid residues and far-UV circular dichroism spectroscopy were used to investigate structural relationships among these proteins. The utility of flexibility plots for predicting protein structure is demonstrated by the excellent correlation of these plots with the known 3D X-ray structures of β/γ-crystallins. Similar analyses of α-crystallin subunits, αA and αB, and human heat shock protein 27 show that the C-terminal domains and connecting segments of these proteins are very similar, while the N-terminal domains have significant structural differences. Unlike β/γ-crystallins, both Hsp27 and α-crystallin subunits are asymmetrical, with highly flexible C-terminal domains. Flexibility is considered essential for protein functional activity. Therefore, the C-terminal region may play an active role in α-crystallin and small heat shock protein function. Differences in flexibility profiles and the secondary structure distribution in α-crystallin estimated by three recent/updated algorithms from far-UV CD spectra support our predicted 3D structure and the concept that α-crystallin and members of the β/γ-superfamily are structurally dissimilar.

15.
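Protein structure analysis based on amino acid characteristic sequences   Total citations: 2, self-citations: 1, citations by others: 2
Based on the nucleotide composition of the amino acids in protein sequences and the associated characteristic information, this paper introduces the additional concept of σ-sequences and related sequences and discusses their primary and secondary features. These can serve as a method for qualitative and quantitative comparison of proteins, used to judge the degree of homology and similarity among the species concerned. Then, for selected all-α-helix, all-β-sheet, and αβ-class sequences, the concepts of σ-, τ-, and στ-sequences are used to derive the corresponding amino acid characteristic sequences of the proteins. Numerical characterizations of 18 protein sequences from the three classes are also computed, plotted, and analyzed.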

16.
Abstract

The number of naturally occurring primary sequences of proteins is an infinitesimally small subset of the possible number of primary sequences that can be synthesized using 20 amino acids. Prevailing views ascribe this to slow and incremental mutational/selection evolutionary mechanisms. However, considering the large number of avenues available, in the form of the diversity of emerging/evolving and/or disappearing living systems, for exploring the primary sequence space over the evolutionary time scale of ~3.5 billion years, this remains a conjecture. Therefore, to investigate primary sequence space limitations, we carried out a systematic study to find primary sequences absent in nature. We report the discovery of the smallest peptide sequence, “Cysteine-Glutamine-Tryptophan-Tryptophan,” that is not found in over half a million curated protein sequences in the Uniprot (Swiss-Prot) database. Additionally, we report a library of 83,605 pentapeptides that are not found in any of the known protein sequences. Compositional analyses of these absent primary sequences yield a remarkably strong power relationship between the percentage occurrence of individual amino acids in all known protein sequences and their respective frequency of occurrence in the absent peptides, regardless of their specific position in the sequences. If random evolutionary mechanisms were responsible for limitations to the primary sequence space, then one would not expect any relationship between compositions of available and absent primary sequences. Thus, we conclusively show that stoichiometric constraints on amino acids limit the primary sequence space of proteins in nature. We discuss the possibly profound implications of our findings in both evolutionary and synthetic biology.

Communicated by Ramaswamy H. Sarma
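The search for absent peptides in item 16 amounts to enumerating all k-residue strings over the 20 standard amino acids and removing those seen in a sequence collection. The sketch below shows that idea on a toy input; the function name `absent_peptides` and the toy sequence are assumptions, and no claim is made that this matches the authors' actual pipeline or database handling.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard one-letter codes


def absent_peptides(sequences, k):
    """Return all k-mers over the 20 standard amino acids not found in `sequences`."""
    if k < 1:
        raise ValueError("k must be at least 1")
    seen = set()
    for seq in sequences:
        for i in range(len(seq) - k + 1):
            seen.add(seq[i:i + k])
    candidates = ("".join(p) for p in product(AMINO_ACIDS, repeat=k))
    return [peptide for peptide in candidates if peptide not in seen]


# Toy "database": a single sequence containing each amino acid once.
toy_db = ["ACDEFGHIKLMNPQRSTVWY"]
absent_dipeptides = absent_peptides(toy_db, 2)
print(len(absent_dipeptides))
```

The toy sequence contains 19 distinct dipeptides, so 381 of the 400 possible dipeptides are reported absent. For k = 4 or 5 the candidate space grows to 160,000 or 3,200,000 strings, which is still small enough to enumerate this way.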

17.
In extending a previous paper (TIMA Part 1, Wassermann, 1982), “template induced molecular assembly” (TIMA) is further explored. It is suggested that TIMA could first have evolved proteins without coevolution of mRNA-like systems, in the absence of tRNAs. Some of these early proteins could, by self-assembly, have built up ribosomes. Ribosomes jointly with amino acids could have served as assembly templates for the TIMA-based evolution of tRNAs. Once tRNAs had evolved, TIMA could have participated, via a modified Mekler (1967) mechanism, in the evolution of new proteins and the coevolution of corresponding mRNA-like strands. TIMA also requires gene duplications and/or random mutations of DNA, to produce partial matching by duplicated and/or randomized DNA sequences of TIMA-generated cDNA which is complementary to the mRNA-like strands. The cDNA could then become incorporated by crossover into the position of the partially matching DNA sequences of, say, duplicate genes in genomes of germ-line cells. Since one requires only partial matching between duplicate (and/or randomly generated) DNA and non-randomly, TIMA-generated cDNA, TIMA theory avoids the need to assume (as in the Baldwin effect) that complete genes were randomly evolved. While rejecting crude Lamarckism, TIMA equally avoids the assumption that genes evolved only by combined random events, gene duplications, and adaptive selection. The resulting theory explains typical pseudo-exogenous adaptations via TIMA. Darwinian selection, now in the guise of “molecular selection” (and favourable environmental adaptive selection where present), combined with TIMA could account for Waddington's “genetic assimilation”, thereby conceding Lamarck's notion that the environment can help to model heredity (while rejecting crude Lamarckism).

18.
19.
Summary. We examine in this paper one of the expected consequences of the hypothesis that modern proteins evolved from random heteropeptide sequences. Specifically, we investigate the lengthwise distributions of amino acids in a set of 1,789 protein sequences with little sequence identity using the run test statistic (r_o) of Mood (1940, Ann. Math. Stat. 11, 367–392). The probability density of r_o for a collection of random sequences has mean = 0 and variance = 1 [the N(0,1) distribution] and can be used to measure the tendency of amino acids of a given type to cluster together in a sequence relative to that of a random sequence. We implement the run test using binary representations of protein sequences in which the amino acids of interest are assigned a value of 1 and all others a value of 0. We consider individual amino acids and sets of various combinations of them based upon hydrophobicity (4 sets), charge (3 sets), volume (4 sets), and secondary structure propensity (3 sets). We find that any sequence chosen randomly has a 90% or greater chance of having a lengthwise distribution of amino acids that is indistinguishable from the random expectation regardless of amino acid type. We regard this as strong support for the random-origin hypothesis. However, we do observe significant deviations from the random expectation, as might be expected after billions of years of evolution. Two important global trends are found: (1) Amino acids with a strong α-helix propensity show a strong tendency to cluster whereas those with β-sheet or reverse-turn propensity do not. (2) Clustered rather than evenly distributed patterns tend to be preferred by the individual amino acids and this is particularly so for methionine. Finally, we consider the problem of reconciling the random nature of protein sequences with structurally meaningful periodic “patterns” that can be detected by sliding-window, autocorrelation, and Fourier analyses.
Two examples, rhodopsin and bacteriorhodopsin, show that such patterns are a natural feature of random sequences.
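The binary run test described in item 19 can be sketched with the usual large-sample runs statistic: count the runs in the 0/1 encoding and standardize by the mean and variance expected under random ordering. This is the textbook Wald-Wolfowitz form and is assumed here to correspond to the abstract's r_o; the exact definition and sign convention in Mood (1940) may differ. The function name `runs_z` is hypothetical.

```python
import math


def runs_z(seq, residues):
    """Standardized runs statistic for the residues of interest in `seq`.

    Residues in `residues` are encoded as 1, all others as 0. A negative
    value means fewer runs than expected under random ordering (clustering);
    a positive value means more runs than expected (even spacing).
    """
    bits = [1 if r in residues else 0 for r in seq]
    n = len(bits)
    n1 = sum(bits)
    n0 = n - n1
    if n1 == 0 or n0 == 0:
        raise ValueError("both classes must occur at least once")
    runs = 1 + sum(1 for a, b in zip(bits, bits[1:]) if a != b)
    expected = 2.0 * n1 * n0 / n + 1.0
    variance = 2.0 * n1 * n0 * (2.0 * n1 * n0 - n) / (n * n * (n - 1.0))
    if variance <= 0.0:
        raise ValueError("variance is zero for this composition")
    return (runs - expected) / math.sqrt(variance)


# Toy sequences: fully clustered versus strictly alternating methionines.
z_clustered = runs_z("MMMMMAAAAA", {"M"})
z_alternating = runs_z("MAMAMAMAMA", {"M"})
print(round(z_clustered, 3), round(z_alternating, 3))
```

For these toy inputs the clustered sequence gives about -2.683 and the alternating one about +2.683. The normal approximation is only a rough guide for sequences this short.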

20.
Proteins consist of structural units such as globular domains, secondary structures, and modules. Modules were originally defined by partitioning a globular domain into compact regions, each of which is a contiguous polypeptide segment having a compact conformation. Since modules show close correlations with the intron positions of genes, they are regarded as primordial polypeptide pieces encoded by exons and shuffled, yielding new combinations of them in early biological evolution. Do modules maintain their native conformations in solution when they are excised at their boundaries? To answer this question, we have synthesized modules of barnase, one of the bacterial RNases, and studied the solution structures of modules M2 (amino acid residues 24–52) and M3 (52–73) by 2D NMR. Some local secondary structures, an α-helix and β-turns in M2 and β-turns in M3, were observed in the modules at positions similar to those in intact barnase, but the overall state seems to be a mixture of random and native conformations. The present result shows that the excised modules have a propensity to form secondary structures similar to those of intact barnase. © 1993 Wiley-Liss, Inc.
