Similar articles
 20 similar articles found (search time: 15 ms)
1.
C A Orengo  N P Brown  W R Taylor 《Proteins》1992,14(2):139-167
A fast method is described for searching and analyzing the protein structure databank. It uses secondary structure followed by residue matching to compare protein structures and is developed from a previous structural alignment method based on dynamic programming. Linear representations of secondary structures are derived and their features compared to identify equivalent elements in two proteins. The secondary structure alignment then constrains the residue alignment, which compares only residues within aligned secondary structures and with similar buried areas and torsional angles. The initial secondary structure alignment improves accuracy and provides a means of filtering out unrelated proteins before the slower residue alignment stage. It is possible to search or sort the protein structure databank very quickly using just secondary structure comparisons. A search through 720 structures with a probe protein of 10 secondary structures required 1.7 CPU hours on a Sun 4/280. Alternatively, combined secondary structure and residue alignments, with a cutoff on the secondary structure score to remove pairs of unrelated proteins from further analysis, took 10.1 CPU hours. The method was applied in searches on different classes of proteins and to cluster a subset of the databank into structurally related groups. Relationships were consistent with known families of protein structure.

2.
A computer model to dynamically simulate protein folding: studies with crambin   Total citations: 12 (self-citations: 0; citations by others: 12)
C Wilson  S Doniach 《Proteins》1989,6(2):193-209
The current work describes a simplified representation of protein structure with uses in the simulation of protein folding. The model assumes that a protein can be represented by a freely rotating rigid chain with a single atom approximating the effect of each side chain. Potentials describing the attraction or repulsion between different types of amino acids are determined directly from the distribution of amino acids in the database of known protein structures. The optimization technique of simulated annealing has been used to dynamically sample the conformations available to this simple model, allowing the protein to evolve from an extended, random coil into a compact globular structure. Many characteristics expected of true proteins, such as the sequence-dependent formation of secondary structure, the partitioning of hydrophobic residues, and specific disulfide pairing, are reproduced by the simulation, suggesting the model may accurately simulate the folding process.

3.
A new approach is introduced for analyzing and ultimately predicting protein structures, defined at the level of C alpha coordinates. We analyze hexamers (oligopeptides of six amino acid residues) and show that their structure tends to concentrate in specific clusters rather than vary continuously. Thus, we can use a limited set of standard structural building blocks taken from these clusters as representatives of the repertoire of observed hexamers. We demonstrate that protein structures can be approximated by concatenating such building blocks. We have identified about 100 building blocks by applying clustering algorithms, and have shown that they can "replace" about 76% of all hexamers in well-refined known proteins with an error of less than 1 Å, and can be joined together to cover 99% of the residues. After replacing each hexamer by a standard building block with similar conformation, we can approximately reconstruct the actual structure by smoothly joining the overlapping building blocks into a full protein. The reconstructed structures show, in most cases, high resemblance to the original structure, although using a limited number of building blocks and local criteria of concatenating them is not likely to produce a very precise global match. Since these building blocks reflect, in many cases, some sequence dependency, it may be possible to use the results of this study as a basis for a protein structure prediction procedure.

4.
Fogel GB  Fogel DB 《Bio Systems》2011,104(1):57-62
The behaviors of individuals and species are often explained in terms of evolutionarily stable strategies (ESSs). The analysis of ESSs determines which, if any, combinations of behaviors cannot be invaded by alternative strategies. Two assumptions required to generate an ESS (i.e., an infinite population and payoffs described only on the average) do not hold under natural conditions. Previous experiments indicated that under more realistic conditions of finite populations and stochastic payoffs, populations may evolve in trajectories that are unrelated to an ESS, even in very simple games. The simulations offered here extend earlier research by employing truncation selection with random parental selection in a hawk-dove game. Payoffs are determined in pairwise contests using either the expected outcome, or the result of a random variable. In each case, however, the mean fraction of hawks over many generations and across many independent trials does not conform to the expected ESS. Implications of these results and philosophical underpinnings of ESS theory are offered.
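As a point of reference for the "expected ESS" mentioned in this abstract, the following minimal sketch computes the mixed ESS of the textbook hawk-dove game. The payoff matrix (resource value `v`, fight cost `c`, with `v < c`) and all function names are assumptions made for illustration; the abstract does not state the payoffs used in the simulations.

```python
from fractions import Fraction


def hawk_dove_payoffs(v, c):
    """Textbook hawk-dove payoffs to the row player (assumed, not from the abstract)."""
    return {
        ("H", "H"): Fraction(v - c, 2),  # hawks split the resource and pay the fight cost
        ("H", "D"): Fraction(v),         # hawk takes everything from a dove
        ("D", "H"): Fraction(0),         # dove retreats and gets nothing
        ("D", "D"): Fraction(v, 2),      # doves share the resource
    }


def expected_payoff(strategy, p_hawk, payoffs):
    """Mean payoff of a pure strategy against a population with hawk fraction p_hawk."""
    return p_hawk * payoffs[(strategy, "H")] + (1 - p_hawk) * payoffs[(strategy, "D")]


def mixed_ess_hawk_fraction(v, c):
    """Hawk fraction at which hawks and doves earn equal expected payoffs (requires 0 < v < c)."""
    if not 0 < v < c:
        raise ValueError("a mixed ESS requires 0 < v < c")
    return Fraction(v, c)


payoffs = hawk_dove_payoffs(10, 50)
p_star = mixed_ess_hawk_fraction(10, 50)
print(p_star, expected_payoff("H", p_star, payoffs), expected_payoff("D", p_star, payoffs))
```

This is the infinite-population, mean-payoff prediction; the abstract reports that finite populations with stochastic payoffs need not settle at this fraction.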

5.
Fan H  Mark AE 《Proteins》2003,53(1):111-120
The relative stability of protein structures determined by either X-ray crystallography or nuclear magnetic resonance (NMR) spectroscopy has been investigated by using molecular dynamics simulation techniques. Published structures of 34 proteins containing between 50 and 100 residues have been evaluated. The proteins selected represent a mixture of secondary structure types including all alpha, all beta, and alpha/beta. The proteins selected do not contain cysteine-cysteine bridges. In addition, any crystallographic waters, metal ions, cofactors, or bound ligands were removed before the systems were simulated. The stability of the structures was evaluated by simulating, under identical conditions, each of the proteins for at least 5 ns in explicit solvent. It is found that not only do NMR-derived structures have, on average, higher internal strain than structures determined by X-ray crystallography but that a significant proportion of the structures are unstable and rapidly diverge in simulations.

6.
It is well known that for any evolutionary game there may be more than one evolutionarily stable strategy (ESS). In general, the more ESSs there are, the more difficult it is to work out how the population will behave (unless there are no ESSs at all). If a matrix game has an ESS which allows all possible pure strategies to be played, referred to as an internal ESS, then no other ESS can exist. In fact, the number of ESSs possible is highly dependent upon how many of the pure strategies each allow to be played, their support size. It is shown that if alpha is the ratio of the mean support size to the number of pure strategies n, then as n tends to infinity the greatest number of ESSs can be represented by a continuous function f(alpha) with useful regularity properties, and bounds are found for both f(alpha) and the value alpha(*), where it attains its maximum. Thus we can obtain a limit on the complexity of any particular system as a function of its mean support size.

7.
Structure comparisons of all representative proteins have been done. Employing the relative root mean square deviation (RMSD) from native enables the assessment of the statistical significance of structure alignments of different lengths in terms of a Z-score. Two conclusions emerge: first, proteins with their native fold can be distinguished by their Z-score. Second and somewhat surprising, all small proteins up to 100 residues in length have significant structure alignments to other proteins in a different secondary structure and fold class; i.e. 24.0% of them have 60% coverage by a template protein with a RMSD below 3.5 Å and 6.0% have 70% coverage. If the restriction that we align proteins only having different secondary structure types is removed, then in a representative benchmark set of proteins of 200 residues or smaller, 93% can be aligned to a single template structure (with average sequence identity of 9.8%), with a RMSD less than 4 Å, and 79% average coverage. In this sense, the current Protein Data Bank (PDB) is almost a covering set of small protein structures. The length of the aligned region (relative to the whole protein length) does not differ among the top hit proteins, indicating that protein structure space is highly dense. For larger proteins, non-related proteins can cover a significant portion of the structure. Moreover, these top hit proteins are aligned to different parts of the target protein, so that almost the entire molecule can be covered when combined. The number of proteins required to cover a target protein is very small, e.g. the top ten hit proteins can give 90% coverage below a RMSD of 3.5 Å for proteins up to 320 residues long. These results give a new view of the nature of protein structure space, and its implications for protein structure prediction are discussed.

8.
S Miyazawa  R L Jernigan 《Proteins》1999,36(3):347-356
Short-range interactions for secondary structures of proteins are evaluated as potentials of mean force from the observed frequencies of secondary structures in known protein structures which are assumed to have an equilibrium distribution with the Boltzmann factor of secondary structure energies. A secondary conformation at each residue position in a protein is described by a tripeptide, including one nearest neighbor on each side. The secondary structure potentials are approximated as additive contributions from neighboring residues along the sequence. These are part of an empirical potential to provide a crude estimate of protein conformational energy at a residue level. Unlike previous works, interactions are decoupled into intrinsic potentials of residues, potentials of backbone-backbone interactions, and of side chain-backbone interactions. Also interactions are decoupled into one-body, two-body, and higher order interactions between peptide backbone and side chain and between backbones. These decouplings are essential to correctly evaluate the total secondary structure energy of a protein structure without overcounting interactions. Each interaction potential is evaluated separately by taking account of the correlation in the amino acid order of protein sequences. Interactions among side chains are neglected, because of the relatively limited number of protein structures. Proteins 1999;36:347-356. Published 1999 Wiley-Liss, Inc.
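The general idea of turning observed frequencies into a potential of mean force under a Boltzmann assumption can be sketched as below. This is only the generic inverse-Boltzmann relation, e = -ln(f_observed / f_reference) in units of kT; it does not reproduce the decoupling into intrinsic, backbone-backbone, and side chain-backbone terms described in the abstract. The function name, the choice of reference distribution, and the example counts are hypothetical.

```python
import math


def potential_of_mean_force(observed_counts, reference_counts):
    """Return energies in kT units: e(state) = -ln(f_observed / f_reference).

    Both arguments map a state label (e.g. 'H', 'E', 'C') to a count.
    Every state must have a nonzero count in both dictionaries.
    """
    total_observed = sum(observed_counts.values())
    total_reference = sum(reference_counts.values())
    energies = {}
    for state, count in observed_counts.items():
        f_observed = count / total_observed
        f_reference = reference_counts[state] / total_reference
        energies[state] = -math.log(f_observed / f_reference)
    return energies


# Hypothetical counts for one residue type versus a uniform reference.
observed = {"H": 60, "E": 20, "C": 20}
reference = {"H": 100, "E": 100, "C": 100}
print(potential_of_mean_force(observed, reference))
```

States that are over-represented relative to the reference get negative (favorable) energies, and under-represented states get positive ones.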

9.
Loops are regions of nonrepetitive conformation connecting regular secondary structures. We identified 2,024 loops of one to eight residues in length, with acceptable main-chain bond lengths and peptide bond angles, from a database of 223 protein and protein-domain structures. Each loop is characterized by its sequence, main-chain conformation, and relative disposition of its bounding secondary structures as described by the separation between the tips of their axes and the angle between them. Loops, grouped according to their length and type of their bounding secondary structures, were superposed and clustered into 161 conformational classes, corresponding to 63% of all loops. Of these, 109 (51% of the loops) were populated by at least four nonhomologous loops or four loops sharing a low sequence identity. Another 52 classes, including 12% of the loops, were populated by at least three loops of low sequence similarity from three or fewer nonhomologous groups. Loop class suprafamilies resulting from variations in the termini of secondary structures are discussed in this article. Most previously described loop conformations were found among the classes. New classes included a 2:4 type IV hairpin, a helix-capping loop, and a loop that mediates dinucleotide-binding. The relative disposition of bounding secondary structures varies among loop classes, with some classes such as beta-hairpins being very restrictive. For each class, sequence preferences as key residues were identified; those found more frequently at these conserved positions than in proteins in general were Gly, Asp, Pro, Phe, and Cys. Most of these residues are involved in stabilizing loop conformation, often through a positive phi conformation or secondary structure capping. Identification of helix-capping residues and beta-breakers among the highly conserved positions supported our decision to group loops according to their bounding secondary structures. Several of the identified loop classes were associated with specific functions, and all of the member loops had the same function; key residues were conserved for this purpose, as is the case for the parvalbumin-like calcium-binding loops. A significant number, but not all, of the member loops of other loop classes had the same function, as is the case for the helix-turn-helix DNA-binding loops. This article provides a systematic and coherent conformational classification of loops, covering a broad range of lengths and all four combinations of bounding secondary structure types, and supplies a useful basis for modelling of loop conformations where the bounding secondary structures are known or reliably predicted.

10.
Correct splice site recognition is critical in pre-mRNA splicing. We find that almost all of a diverse panel of exonic splicing silencer (ESS) elements alter splice site choice when placed between competing sites, consistently inhibiting use of intron-proximal 5' and 3' splice sites. Supporting a general role for ESSs in splice site definition, we found that ESSs are both abundant and highly conserved between alternative splice site pairs and that mutation of ESSs located between natural alternative splice site pairs consistently shifted splicing toward the intron-proximal site. Some exonic splicing enhancers (ESEs) promoted use of intron-proximal 5' splice sites, and tethering of hnRNP A1 and SF2/ASF proteins between competing splice sites mimicked the effects of ESS and ESE elements, respectively. Further, we observed that specific subsets of ESSs had distinct effects on a multifunctional intron retention reporter and that one of these subsets is likely preferred for regulation of endogenous intron retention events. Together, our findings provide a comprehensive picture of the functions of ESSs in the control of diverse types of splicing decisions.

11.
A new, efficient method for the assembly of protein tertiary structure from known, loosely encoded secondary structure restraints and sparse information about exact side chain contacts is proposed and evaluated. The method is based on a new, very simple method for the reduced modeling of protein structure and dynamics, where the protein is described as a lattice chain connecting side chain centers of mass rather than Cαs. The model has implicit built-in multibody correlations that simulate short- and long-range packing preferences, hydrogen bonding cooperativity and a mean force potential describing hydrophobic interactions. Due to the simplicity of the protein representation and definition of the model force field, the Monte Carlo algorithm is at least an order of magnitude faster than previously published Monte Carlo algorithms for structure assembly. In contrast to existing algorithms, the new method requires a smaller number of tertiary restraints for successful fold assembly; on average, one for every seven residues as compared to one for every four residues. For example, for smaller proteins such as the B domain of protein G, the resulting structures have a coordinate root mean square deviation (cRMSD), which is about 3 Å from the experimental structure; for myoglobin, structures whose backbone cRMSD is 4.3 Å are produced, and for a 247-residue TIM barrel, the cRMSD of the resulting folds is about 6 Å. As would be expected, increasing the number of tertiary restraints improves the accuracy of the assembled structures. The reliability and robustness of the new method should enable its routine application in model building protocols based on various (very sparse) experimentally derived structural restraints. Proteins 32:475–494, 1998. © 1998 Wiley-Liss, Inc.

12.
Accurately predicted protein secondary structure provides useful information for selecting targets, analyzing protein function, and predicting higher-dimensional structure. Existing research shows that more data and a more refined search give better prediction. We analyze the relation between the prediction accuracy and another crucial factor, the protein size. Empirical tests performed with two secondary structure predictors on a large set of high-resolution, non-redundant proteins show that the average accuracies for small proteins (<100 residues) equal 73% and 54% for alpha-helices and beta-strands, respectively. The alpha-helix/beta-strand accuracies for very large proteins (>300 residues) equal 77%/68%, respectively. Similarly, the tests with three secondary structure content predictors show that the prediction errors for the small/very large proteins equal 0.13/0.09 and 0.09/0.06 for alpha-helix and beta-strand content, respectively. Our tests confirm that the secondary structure/content predictions for the very large proteins are of statistically significantly better quality than predictions for the small proteins. This is in contrast with the tertiary structure predictions in which higher accuracy is obtained for smaller proteins.

13.
Amino acid propensities for secondary structures have been used since the 1970s, when Chou and Fasman evaluated them within datasets of a few tens of proteins and developed a method to predict secondary structure of proteins, still in use despite prediction methods having evolved to very different approaches and higher reliability. Propensity for secondary structures represents an intrinsic property of amino acids, and it is used for generating new algorithms and prediction methods; therefore our work has been aimed at investigating which protein dataset is best for evaluating the amino acid propensities: larger but not homogeneous sets, or smaller but homogeneous sets, i.e., all-alpha, all-beta, alpha-beta proteins. As a first analysis, we evaluated amino acid propensities for helix, beta-strand, and coil in more than 2000 proteins from the PDBselect dataset. With these propensities, secondary structure predictions performed with a method very similar to that of Chou and Fasman gave us results better than the original one, based on propensities derived from the few tens of X-ray protein structures available in the 1970s. In a refined analysis, we subdivided the PDBselect dataset of proteins into three secondary structural classes, i.e., all-alpha, all-beta, and alpha-beta proteins. For each class, the amino acid propensities for helix, beta-strand, and coil have been calculated and used to predict secondary structure elements for proteins belonging to the same class by using resubstitution and jackknife tests. This second round of predictions further improved the results of the first round. Therefore, amino acid propensities for secondary structures became more reliable depending on the degree of homogeneity of the protein dataset used to evaluate them. Indeed, our results also indicate that all algorithms using propensities for secondary structure can still be improved to obtain better predictive results.
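A propensity in the Chou-Fasman sense is commonly defined as the fraction of a residue type found in a given state divided by the fraction of all residues found in that state. The sketch below computes that ratio from aligned sequence and secondary-structure strings. The function name, the one-letter state codes, and the toy input are assumptions for illustration; the abstract does not specify how the PDBselect data were encoded.

```python
from collections import Counter


def secondary_structure_propensities(sequences, structures):
    """Return {(residue, state): propensity} from paired sequence/structure strings.

    propensity = (n(residue, state) / n(residue)) / (n(state) / n(total))
    """
    pair_counts = Counter()
    residue_counts = Counter()
    state_counts = Counter()
    for sequence, structure in zip(sequences, structures):
        if len(sequence) != len(structure):
            raise ValueError("sequence and structure strings must have equal length")
        for residue, state in zip(sequence, structure):
            pair_counts[(residue, state)] += 1
            residue_counts[residue] += 1
            state_counts[state] += 1
    total = sum(residue_counts.values())
    return {
        (residue, state): (pair_counts[(residue, state)] / residue_counts[residue])
        / (state_counts[state] / total)
        for residue in residue_counts
        for state in state_counts
    }


# Toy input: one four-residue chain, H = helix, C = coil.
propensities = secondary_structure_propensities(["AAGG"], ["HHHC"])
print(propensities[("A", "H")], propensities[("G", "C")])
```

Values above 1 mean the residue is enriched in that state. Computing the table separately for all-alpha, all-beta, and alpha-beta subsets would correspond to the class-specific propensities the abstract describes.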

14.
Dynamic structures of globular proteins are studied on the basis of correlative movements of residues around their native conformations, which are computed by means of the normal mode analysis. To describe the dynamic structures of a protein, the core regions moving with strong positive or negative correlations to other regions of the polypeptide chain are detected from the correlation maps of the movements of residues. Such core regions are different, according to the definition, from the regions defined from a geometrical point of view, such as secondary structures, domains, modules, and so on. The core regions are actually detected for four proteins, myoglobin, Bence-Jones protein, flavodoxin, and hen egg-white lysozyme, with different folding types from each other. The results show that some of them coincide with the secondary structures, domains, or modules, but others do not. Then, the dynamic structure of each protein is discussed in terms of the dynamic cores detected, as compared with the secondary structures, domains, and modules.

15.
Intrinsically disordered proteins (IDPs)/protein regions (IDPRs) lack unique three-dimensional structure at the level of secondary and/or tertiary structure and are represented as an ensemble of interchanging conformations. To investigate the role of presence/absence of secondary structures in promoting intrinsic disorder in proteins, a comparative sequence analysis of IDPs, IDPRs and proteins with minimal secondary structures (less than 5%) is required. A sequence analysis reveals that proteins with minimal secondary structure content have high mean net positive charge, low mean net hydrophobicity and low sequence complexity. Interestingly, analysis of the relative local electrostatic interactions reveals that an increase in the relative repulsive interactions between amino acids separated by three or four residues leads to either loss of secondary structure or intrinsic disorder. IDPRs show an increase in both local negative-negative and positive-positive repulsive interactions. While IDPs show a marked increase in the local negative-negative interactions, proteins with minimal secondary structure show an increase in the local positive-positive interactions. IDPs and IDPRs are enriched in D, E and Q residues, while proteins with minimal secondary structure are depleted of these residues. Proteins with minimal secondary structures have higher content of G and C, while IDPs and IDPRs are depleted of these residues. These results confirm that proteins with minimal secondary structure have a distinctly different propensity for charge, hydrophobicity, specific amino acids and local electrostatic interactions as compared to IDPs/IDPRs. Thus we conclude that lack of secondary structure may be a necessary but not a sufficient condition for intrinsic disorder in proteins.

16.
Finding structural similarities in distantly related proteins can reveal functional relationships that cannot be identified using sequence comparison. Given two proteins A and B and a threshold ε, we develop an algorithm, TRiplet-based Iterative ALignment (TRIAL), for computing the transformation of B that maximizes the number of aligned residues such that the root mean square deviation (RMSD) of the alignment is at most ε. Our algorithm is designed with the specific goal of effectively handling proteins with low similarity in primary structure, where existing algorithms perform particularly poorly. Experiments show that our method outperforms existing methods. TRIAL alignment brings the secondary structures of distantly related proteins to similar orientations. It also finds a larger number of secondary structure matches at lower RMSD values and increased overall alignment lengths. Its classification accuracy is up to 63 percent better than other methods, including CE and DALI. TRIAL successfully aligns 83 percent of the residues from the smaller protein in reasonable time while other methods align only 29 to 65 percent of the residues for the same set of proteins.
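The constraint in this abstract, that the RMSD of the aligned residues must not exceed a threshold ε, can be illustrated with a plain RMSD check. This minimal sketch assumes the two coordinate lists are already paired residue by residue and already superposed; it does not implement the TRIAL search for the transformation, and the function names are hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """Root mean square deviation between two equal-length lists of (x, y, z) points."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared_sum = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        squared_sum += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(squared_sum / len(coords_a))


def alignment_within_threshold(coords_a, coords_b, epsilon):
    """True if the paired, superposed coordinates satisfy RMSD <= epsilon."""
    return rmsd(coords_a, coords_b) <= epsilon


# Two residues, each displaced by 1 unit along z.
a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
b = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
print(rmsd(a, b), alignment_within_threshold(a, b, 3.5))
```

An alignment method of the kind described would search over transformations of B and over residue pairings, keeping the largest pairing for which this check still passes.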

17.
Folding type-specific secondary structure propensities of 20 naturally occurring amino acids have been derived from α-helical, β-sheet, α/β, and α+β proteins of known structures. These data show that each residue type of amino acids has intrinsic propensities in different regions of secondary structures for different folding types of proteins. Each of the folding types shows markedly different rank ordering, indicating folding type-specific effects on the secondary structure propensities of amino acids. Rigorous statistical tests have been made to validate the folding type-specific effects. It should be noted that α and β proteins have relatively small α-helices and β-strands forming propensities respectively compared with those of α+β and α/β proteins. This may suggest that, with more complex architectures than α and β proteins, α+β and α/β proteins require larger propensities to distinguish from interacting α-helices and β-strands. Our finding of folding type-specific secondary structure propensities suggests that sequence space accessible to each folding type may have differing features. Differing sequence space features might be constrained by topological requirement for each of the folding types. Almost all strong β-sheet forming residues are hydrophobic in character regardless of folding types, thus suggesting the hydrophobicities of side chains as a key determinant of β-sheet structures. In contrast, conformational entropy of side chains is a major determinant of the helical propensities of amino acids, although other interactions such as hydrophobicities and charged interactions cannot be neglected. These results will be helpful to protein design, class-based secondary structure prediction, and protein folding. © 1998 John Wiley & Sons, Inc. Biopoly 45: 35–49, 1998

18.
C Sander  R Schneider 《Proteins》1991,9(1):56-68
The database of known protein three-dimensional structures can be significantly increased by the use of sequence homology, based on the following observations. (1) The database of known sequences, currently at more than 12,000 proteins, is two orders of magnitude larger than the database of known structures. (2) The currently most powerful method of predicting protein structures is model building by homology. (3) Structural homology can be inferred from the level of sequence similarity. (4) The threshold of sequence similarity sufficient for structural homology depends strongly on the length of the alignment. Here, we first quantify the relation between sequence similarity, structure similarity, and alignment length by an exhaustive survey of alignments between proteins of known structure and report a homology threshold curve as a function of alignment length. We then produce a database of homology-derived secondary structure of proteins (HSSP) by aligning to each protein of known structure all sequences deemed homologous on the basis of the threshold curve. For each known protein structure, the derived database contains the aligned sequences, secondary structure, sequence variability, and sequence profile. Tertiary structures of the aligned sequences are implied, but not modeled explicitly. The database effectively increases the number of known protein structures by a factor of five to more than 1800. The results may be useful in assessing the structural significance of matches in sequence database searches, in deriving preferences and patterns for structure prediction, in elucidating the structural role of conserved residues, and in modeling three-dimensional detail by homology.

19.
Protein function is intimately linked to protein structure and dynamics, yet experimentally determined structures frequently omit regions within a protein due to indeterminate data, which is often due to protein dynamics. We propose that atomistic molecular dynamics simulations provide a diverse sampling of biologically relevant structures for these missing segments (and beyond) to improve structural modeling and structure prediction. Here we make use of the Dynameomics data warehouse, which contains simulations of representatives of essentially all known protein folds. We developed novel computational methods to efficiently identify, rank and retrieve small peptide structures, or fragments, from this database. We also created a novel data model to analyze and compare large repositories of structural data, such as contained within the Protein Data Bank and the Dynameomics data warehouse. Our evaluation compares these structural repositories for improving loop predictions and analyzes the utility of our methods and models. Using a standard set of loop structures, containing 510 loops, 30 for each loop length from 4 to 20 residues, we find that the inclusion of Dynameomics structures in fragment‐based methods improves the quality of the loop predictions without being dependent on sequence homology. Depending on loop length, ~25–75% of the best predictions came from the Dynameomics set, resulting in lower main chain root‐mean‐square deviations for all fragment lengths using the combined fragment library. We also provide specific cases where Dynameomics fragments provide better predictions for NMR loop structures than fragments from crystal structures. Online access to these fragment libraries is available at http://www.dynameomics.org/fragments.

20.
Vertebrate fibrinogen is a complex multidomained protein, the structure of which has been inferred mainly from electron microscopy and amino acid sequence studies. Among its most prominent features are two terminal globules, moieties that are mostly composed of the carboxyl-terminal two-thirds of the beta and gamma chains. Sequences homologous to the latter segments are found in several other animal proteins, always as the carboxyl-terminal contributions. An alignment of 15 amino acid sequences from various fibrinogens and related proteins has been used to make judgments about secondary structure. The nature of amino acids at each position in the alignment was used to distinguish alpha helices and beta structure on the one hand from loops and turns on the other, and the resulting assignments compared with predictions of secondary structure by other methods. Additionally, constraints imposed by the locations of cystines, carbohydrate attachment residues, and proteinase-sensitive points provided further insights into the general organization of the postulated secondary structures. Other ancillary data, including the effects of bound calcium and the locations of labeled or variant residues, were also considered. An intriguing similarity to a portion of the recently reported structure of a calcium-dependent lectin is noted.
