Similar literature
Found 20 similar documents (search time: 0 ms)
1.
It has been shown that the portions of a protein likely to form contacts can be predicted by means of an average distance map (ADM), as can regular structures (α-helices and β-turns) defined as short-range compact regions (Kikuchi et al., 1988a,c). In this paper, we analyze the occurrence of those portions and short-range compact regions on ADMs for various proteins with respect to their folding types. We have found that each folding type shows a characteristic distribution of such parts on ADMs. We also discuss the possibility of predicting the folding types of proteins from ADMs.

2.
Protein structure prediction is based mainly on the modeling of proteins by homology to known structures; this knowledge-based approach is the most promising method to date. Although it is used in the whole area of protein research, no general rules concerning the quality and applicability of concepts and procedures used in homology modeling have been put forward yet. Therefore, the main goal of the present work is to provide tools for the assessment of accuracy of modeling at a given level of sequence homology. A large set of known structures from different conformational and functional classes, but with various degrees of homology, was selected. Pairwise structure superpositions were performed. Starting with the definition of the structurally conserved regions and determination of topologically correct sequence alignments, we correlated geometrical properties with sequence homology (defined by the 250 PAM Dayhoff Matrix) and identity. It is shown that both the topological differences of the protein backbones and the relative positions of corresponding side chains diverge with decreasing sequence identity. Below 50% identity, the deviation in regions that are structurally not conserved continually increases, thus implying that with decreasing sequence identity modeling has to take into account more and more structurally diverging loop regions that are difficult to predict. © 1993 Wiley-Liss, Inc.

3.
A method for analyzing differences in the folding mechanisms of proteins in the same family is presented. Using only information from the amino acid sequences, contact maps derived from the interresidue average distances are employed. These maps, referred to as average distance maps (ADM), are applied to the folding of c-type lysozymes. The results reveal that the ADMs of these lysozymes reflect the differences in the detailed folding mechanisms. Further possible applications of the present method are also discussed.

4.
Rohl CA, Strauss CE, Chivian D, Baker D. Proteins 2004, 55(3):656-677
A major limitation of current comparative modeling methods is the accuracy with which regions that are structurally divergent from homologues of known structure can be modeled. Because structural differences between homologous proteins are responsible for variations in protein function and specificity, the ability to model these differences has important functional consequences. Although existing methods can provide reasonably accurate models of short loop regions, modeling longer structurally divergent regions is an unsolved problem. Here we describe a method based on the de novo structure prediction algorithm, Rosetta, for predicting conformations of structurally divergent regions in comparative models. Initial conformations for short segments are selected from the protein structure database, whereas longer segments are built up by using three- and nine-residue fragments drawn from the database and combined by using the Rosetta algorithm. A gap closure term in the potential in combination with modified Newton's method for gradient descent minimization is used to ensure continuity of the peptide backbone. Conformations of variable regions are refined in the context of a fixed template structure using Monte Carlo minimization together with rapid repacking of side-chains to iteratively optimize backbone torsion angles and side-chain rotamers. For short loops, mean accuracies of 0.69, 1.45, and 3.62 Å are obtained for 4, 8, and 12 residue loops, respectively. In addition, the method can provide reasonable models of conformations of longer protein segments: predicted conformations of 3 Å root-mean-square deviation or better were obtained for 5 of 10 examples of segments ranging from 13 to 34 residues. In combination with a sequence alignment algorithm, this method generates complete, ungapped models of protein structures, including regions both similar to and divergent from a homologous structure.
This combined method was used to make predictions for 28 protein domains in the Critical Assessment of Protein Structure 4 (CASP 4) and 59 domains in CASP 5, where the method ranked highly among comparative modeling and fold recognition methods. Model accuracy in these blind predictions is dominated by alignment quality, but in the context of accurate alignments, long protein segments can be accurately modeled. Notably, the method correctly predicted the local structure of a 39-residue insertion into a TIM barrel in CASP 5 target T0186.

5.
The degree of similarity of two protein three-dimensional structures is usually measured with the root-mean-square distance between equivalent atom pairs. Such a similarity measure depends on the dimension of the proteins, that is, on the number of equivalent atom pairs. The present communication presents a simple procedure to make the root-mean-square distances between pairs of three-dimensional structures independent of their dimensions. This normalization may be useful in evolutionary and fold classification studies as well as in simple comparisons between different structural models.
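
The abstract does not give the normalization itself, so the following is only a minimal sketch of the unnormalized quantity it starts from: the root-mean-square distance over N equivalent atom pairs. It assumes the two coordinate sets are already superposed and listed in matching order; the function name `rmsd` is illustrative and not taken from the paper.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square distance between equivalent atom pairs.

    Assumes both structures are already superposed and that the i-th
    atom in coords_a is equivalent to the i-th atom in coords_b.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        # Squared Euclidean distance for one equivalent atom pair.
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    # Dividing by the number of pairs is what makes the raw value
    # depend on the size of the compared structures.
    return math.sqrt(total / len(coords_a))
```

Because the sum is divided by the number of pairs, values computed for alignments of very different sizes are not directly comparable, which is the problem the abstract's normalization is meant to address.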

6.
A novel method has been developed for acquiring the correct alignment of a query sequence against remotely homologous proteins by extracting structural information from profiles of multiple structure alignment. A systematic search algorithm combined with a group of score functions based on sequence information and structural information has been introduced in this procedure. A limited number of top solutions (15,000) with high scores were selected as candidates for further examination. On a test-set comprising 301 proteins from 75 protein families with sequence identity less than 30%, the proportion of proteins with completely correct alignment as first candidate was improved to 39.8% by our method, whereas the typical performance of existing sequence-based alignment methods was only between 16.1% and 22.7%. Furthermore, multiple candidates for possible alignment were provided in our approach, which dramatically increased the possibility of finding correct alignment, such that completely correct alignments were found amongst the top-ranked 1000 candidates in 88.3% of the proteins. With the assistance of a sequence database, completely correct alignment solutions were achieved amongst the top 1000 candidates in 94.3% of the proteins. From such a limited number of candidates, it would become possible to identify more correct alignment using a more time-consuming but more powerful method with more detailed structural information, such as side-chain packing and energy minimization, etc. The results indicate that the novel alignment strategy could be helpful for extending the application of highly reliable methods for fold identification and homology modeling to a huge number of homologous proteins of low sequence similarity. Details of the methods, together with the results and implications for future development are presented.

7.
Detection of homologous proteins by an intermediate sequence search   Total citations: 2, self-citations: 0, citations by others: 2
We developed a variant of the intermediate sequence search method (ISS(new)) for detection and alignment of weakly similar pairs of protein sequences. ISS(new) relates two query sequences by an intermediate sequence that is potentially homologous to both queries. The improvement was achieved by a more robust overlap score for a match between the queries through an intermediate. The approach was benchmarked on a data set of 2369 sequences of known structure with insignificant sequence similarity to each other (BLAST E-value larger than 0.001); 2050 of these sequences had a related structure in the set. ISS(new) performed significantly better than both PSI-BLAST and a previously described intermediate sequence search method. PSI-BLAST could not detect correct homologs for 1619 of the 2369 sequences. In contrast, ISS(new) assigned a correct homolog as the top hit for 121 of these 1619 sequences, while incorrectly assigning homologs for only nine targets; it did not assign homologs for the remainder of the sequences. By estimate, ISS(new) may be able to assign the folds of domains in approximately 29,000 of the approximately 500,000 sequences unassigned by PSI-BLAST, with 90% specificity (1 - false positives fraction). In addition, we show that the 15 alignments with the most significant BLAST E-values include the nearly best alignments constructed by ISS(new).

8.
A new self-correcting distance geometry method for predicting the three-dimensional structure of small globular proteins was assessed with a test set of 8 helical proteins. With the knowledge of the amino acid sequence and the helical segments, our completely automated method calculated the correct backbone topology of six proteins. The accuracy of the predicted structures ranged from 2.3 Å to 3.1 Å for the helical segments compared to the experimentally determined structures. For two proteins, the predicted constraints were not restrictive enough to yield a conclusive prediction. The method can be applied to all small globular proteins, provided the secondary structure is known from NMR analysis or can be predicted with high reliability.

9.
A new method to detect remote relationships between protein sequences and known three-dimensional structures based on direct energy calculations and without reliance on statistics has been developed. The likelihood of a residue to occupy a given position on the structural template was represented by an estimate of the stabilization free energy made after explicit prediction of the substituted side chain conformation. The profile matrix derived from these energy values and modified by increasing the residue self-exchange values successfully predicted compatibility of heat-shock protein and globin sequences with the three-dimensional structures of actin and phycocyanin, respectively, from a full protein sequence databank search. The high sensitivity of the method makes it a unique tool for predicting the three-dimensional fold for the rapidly growing number of protein sequences. © 1994 Wiley-Liss, Inc.

10.
The goal of this work is to characterize structurally ambivalent fragments in proteins. We have searched the Protein Data Bank and identified all structurally ambivalent peptides (SAPs) of length five or greater that exist in two different backbone conformations. The SAPs were classified in five distinct categories based on their structure. We propose a novel index that provides a quantitative measure of conformational variability of a sequence fragment. It measures the context-dependent width of the distribution of (phi, psi) dihedral angles associated with each amino acid type. This index was used to analyze the local structural propensity of both SAPs and the sequence fragments contiguous to them. We also analyzed type-specific amino acid composition, solvent accessibility, and overall structural properties of SAPs and their sequence context. We show that each type of SAP has an unusual, type-specific amino acid composition and, as a result, simultaneous intrinsic preferences for two distinct types of backbone conformation. All types of SAPs have lower sequence complexity than average. Fragments that adopt helical conformation in one protein and sheet conformation in another have the lowest sequence complexity and are sampled from a relatively limited repertoire of possible residue combinations. A statistically significant difference between two distinct conformations of the same SAP is observed not only in the overall structural properties of proteins harboring the SAP but also in the properties of its flanking regions and in the pattern of solvent accessibility. These results have implications for protein design and structure prediction.

11.
Using only data on sequence, a method of computing a low-resolution tertiary structure of a protein is described. The steps are: (a) Estimate the distances of individual residues from the centroid of the molecule, using data on hydrophobicity and additional geometrical constraints. (b) Using these distances, construct a two-valued matrix whose elements, the distances between residues, are greater or less than R, the radius of the molecule. (c) Optimize to obtain a three-dimensional structure. This procedure requires modest computing facilities and is applicable to proteins with 164 residues and presumably more. It produces structures with r (correlation between inter-residue distances in the computed and native structures) between 0.5 and 0.7. Furthermore, correct inference of two or three long-range contacts suffices to yield structures with r values of 0.8–0.9. Because segments forming parallel or antiparallel folding structures intersect the radius vector at similar angles, from centroidal point distances it is possible to infer some of these long-range contacts by an elaboration of the procedure used to construct the input matrix. A criterion is also described which can be used to determine the quality of a proposed input matrix even when the native structure is not known.
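
The quality measure r quoted in this abstract can be illustrated with a short sketch. It assumes r is the ordinary Pearson correlation taken over all residue pairs, with one coordinate per residue; the abstract does not spell out these details, and the function names are illustrative.

```python
import math
from itertools import combinations


def pairwise_distances(coords):
    """All inter-residue distances, in a fixed pair order."""
    return [math.dist(a, b) for a, b in combinations(coords, 2)]


def interresidue_distance_r(computed, native):
    """Pearson correlation between inter-residue distances of two structures.

    Assumption: one representative point per residue, with residues
    listed in the same order in both structures.
    """
    x = pairwise_distances(computed)
    y = pairwise_distances(native)
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("structures must have the same number (>= 3) of residues")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)
```

Since only distances enter, the measure needs no superposition of the two structures, and a uniformly rescaled copy of the native structure still scores r = 1.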

12.
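In studies of DNA sequence similarity, the commonly used dynamic programming algorithms lack a theoretical basis for the gap penalty function, which makes the choice subjective and leads to differing results. This paper proposes a DNA sequence similarity measure based on the DTW (Dynamic Time Warping) distance that addresses this problem. DNA sequences are converted into time series through a graphical representation, and the DTW distance is then computed to measure sequence similarity and characterize the properties of the sequences, yielding a method for comparing DNA sequence similarity. The method was used to compare and analyze the similarity of seven Buthus martensii Karsch neurotoxin gene sequences, which verified the validity and accuracy of the measure.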
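
A minimal sketch of the DTW distance described above follows. It uses the textbook dynamic-programming recurrence with an absolute-difference local cost. The `encode` mapping from bases to numbers is a hypothetical stand-in for the paper's graphical representation, which the abstract does not specify.

```python
def dtw_distance(s, t):
    """Classic dynamic time warping distance between two numeric series."""
    n, m = len(s), len(t)
    inf = float("inf")
    # prev[j] holds the best cumulative cost of aligning s[:i-1] with t[:j].
    prev = [inf] * (m + 1)
    prev[0] = 0.0
    for i in range(1, n + 1):
        cur = [inf] * (m + 1)
        for j in range(1, m + 1):
            cost = abs(s[i - 1] - t[j - 1])
            # Extend the cheapest of: stretch s, stretch t, or advance both.
            cur[j] = cost + min(prev[j], cur[j - 1], prev[j - 1])
        prev = cur
    return prev[m]


# Hypothetical base-to-number mapping, used here only for illustration.
BASE_VALUE = {"A": 1.0, "C": 2.0, "G": 3.0, "T": 4.0}


def encode(dna):
    """Turn a DNA string into a numeric series (illustrative encoding)."""
    return [BASE_VALUE[base] for base in dna.upper()]
```

Note that no gap penalty appears anywhere: a repeated base is absorbed by warping, so `dtw_distance(encode("ACGT"), encode("ACGGT"))` is zero under this encoding.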

13.
We proposed a fast and unsupervised clustering method, minimum span clustering (MSC), for analyzing the sequence–structure–function relationship of biological networks, and demonstrated its validity in clustering the sequence/structure similarity networks (SSN) of 682 membrane protein (MP) chains. The MSC clustering of MPs based on their sequence information was found to be consistent with their tertiary structures and functions. For the largest seven clusters predicted by MSC, the consistency in chain function within the same cluster is found to be 100%. From analyzing the edge distribution of SSN for MPs, we found a characteristic threshold distance for the boundary between clusters, over which SSN of MPs could be properly clustered by an unsupervised sparsification of the network distance matrix. The clustering results of MPs from both MSC and the unsupervised sparsification methods are consistent with each other, and have high intracluster similarity and low intercluster similarity in sequence, structure, and function. Our study showed a strong sequence–structure–function relationship of MPs. We discussed evidence of convergent evolution of MPs and suggested applications in finding structural similarities and predicting biological functions of MP chains based on their sequence information. Proteins 2015; 83:1450–1461. © 2015 Wiley Periodicals, Inc.

14.
Simple flexible programs (TREEMOMENT and PILEUPMOMENT) are described for depicting the average amphipathicity (hydrophobic moment) along multiply aligned sequences of a family of evolutionarily related proteins. The programs are applicable to any number of aligned sequences and can be set for any desired angle corresponding to a residue repeat unit in a protein secondary structural element, such as 100° per residue for an alpha-helix or 180° per residue for a beta-strand. These programs can be used to identify amphipathic regions common to the members of a protein family. The use of these programs is exemplified by showing that some families of integral membrane transport proteins (i.e. permeases of the bacterial phosphotransferase system (PTS) and the anion exchangers of animals) exhibit strikingly amphipathic alpha-helical structures immediately preceding the first hydrophobic transmembrane segment of their membrane-embedded domain(s). Other families, such as the major facilitator superfamily of uniporters, symporters and antiporters, do not exhibit this structural feature. The amphipathic structures in PTS permeases have been implicated in membrane insertion during biogenesis.
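
The idea of an average hydrophobic moment along an alignment can be sketched as follows. This is not the TREEMOMENT or PILEUPMOMENT code. It assumes the Kyte-Doolittle hydropathy scale, column-wise averaging that skips gap characters, and a sliding window of 11 residues, none of which is stated in the abstract.

```python
import math

# Kyte-Doolittle hydropathy values (the choice of scale is an assumption).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5, "Q": -3.5,
    "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5, "L": 3.8, "K": -3.9,
    "M": 1.9, "F": 2.8, "P": -1.6, "S": -0.8, "T": -0.7, "W": -0.9,
    "Y": -1.3, "V": 4.2,
}


def column_means(alignment):
    """Mean hydropathy per alignment column, ignoring gaps and unknowns."""
    means = []
    for col in range(len(alignment[0])):
        vals = [KD[seq[col]] for seq in alignment if seq[col] in KD]
        means.append(sum(vals) / len(vals) if vals else 0.0)
    return means


def hydrophobic_moment(values, angle_deg):
    """Per-residue hydrophobic moment for a given repeat angle."""
    delta = math.radians(angle_deg)
    s = sum(h * math.sin(n * delta) for n, h in enumerate(values))
    c = sum(h * math.cos(n * delta) for n, h in enumerate(values))
    return math.hypot(s, c) / len(values)


def moment_profile(alignment, angle_deg=100.0, window=11):
    """Sliding-window moment along equal-length aligned sequences."""
    means = column_means(alignment)
    return [
        hydrophobic_moment(means[i:i + window], angle_deg)
        for i in range(len(means) - window + 1)
    ]
```

Setting `angle_deg=100.0` probes alpha-helical periodicity and `angle_deg=180.0` probes beta-strand periodicity, mirroring the angles named in the abstract.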

15.
16.
Under the assumption of equivalent heat capacity values, the differential free energy of stability for a pair of proteins midway between their thermal unfolding transition temperatures is shown to be independent of ΔC_p up to its cubic term in ΔT_m. For model calculations reflecting the nearly 30 °C difference in T_m for the adenylate kinases from the arctic bacterium Bacillus globisporus and the thermophilic bacterium Geobacillus stearothermophilus, the resultant error in estimating ΔΔG by the formula 0.5 [ΔS_1(T_m1) + ΔS_2(T_m2)] ΔT_m is less than 1%. Combined with the analogous thermal unfolding data for the adenylate kinase from Escherichia coli, these three homologous proteins exhibit T_m and ΔS(T_m) values consistent with differential entropy and enthalpy contributions of equal magnitude. When entropy-enthalpy compensation holds for the differential free energy of stability, the incremental changes in T_m values are shown to be proportionate to the changes in free energy.
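
The estimate quoted in this abstract can be checked numerically with a small sketch. It assumes the standard constant-heat-capacity expression for the unfolding free energy and a shared heat capacity change for both proteins. All numerical inputs in the usage note are hypothetical and are not data from the paper.

```python
import math


def unfolding_free_energy(temp, tm, ds_m, dcp):
    """Unfolding free energy at temp for a protein melting at tm.

    Assumes a temperature-independent heat capacity change dcp and
    uses dH(tm) = tm * dS(tm), which holds at the transition midpoint.
    """
    return ds_m * (tm - temp) + dcp * ((temp - tm) - temp * math.log(temp / tm))


def ddg_exact(tm1, ds1, tm2, ds2, dcp):
    """Stability difference evaluated midway between the two Tm values."""
    t_mid = 0.5 * (tm1 + tm2)
    return (unfolding_free_energy(t_mid, tm2, ds2, dcp)
            - unfolding_free_energy(t_mid, tm1, ds1, dcp))


def ddg_approx(tm1, ds1, tm2, ds2):
    """The abstract's estimate: 0.5 * [dS1(Tm1) + dS2(Tm2)] * dTm."""
    return 0.5 * (ds1 + ds2) * (tm2 - tm1)
```

For example, with hypothetical values `tm1=320.0`, `tm2=350.0` (kelvin), `ds1=1.5`, `ds2=1.6` and `dcp=10.0` (kJ/mol/K), comparing `ddg_exact(...)` with `ddg_approx(...)` shows the two agreeing to well under 1%, and exactly when `dcp` is zero.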

17.
MIF proteins are not glutathione transferase homologs.   Total citations: 2, self-citations: 1, citations by others: 1
Although macrophage migration inhibitory factor (MIF) proteins conjugate glutathione, sequence analysis does not support their homology to other glutathione transferases. Glutathione transferases are not detected with MIF proteins in searches of protein sequence databases, and MIF proteins do not share significant sequence similarity with glutathione transferases. Homology cannot be demonstrated by multiple sequence alignment or evolutionary tree construction; such methods assume that the proteins being analyzed are homologous.

18.
Summary: A set of simple equations is derived which gives the relationship between the observed amino acid differences per 100 codons and the evolutionary distance per 100 codons using Holmquist's stochastic model of molecular evolution. Contribution No. 910 from the National Institute of Genetics, Mishima, Shizuoka-ken 411, Japan.

19.
Comparative docking is based on experimentally determined structures of protein-protein complexes (templates), following the paradigm that proteins with similar sequences and/or structures form similar complexes. Modeling utilizing structure similarity of target monomers to template complexes significantly expands structural coverage of the interactome. Template-based docking by structure alignment can be performed for the entire structures or by aligning targets to the bound interfaces of the experimentally determined complexes. Systematic benchmarking of docking protocols based on full and interface structure alignment showed that both protocols perform similarly, with top 1 docking success rate 26%. However, in terms of the models' quality, the interface-based docking performed marginally better. The interface-based docking is preferable when one would suspect a significant conformational change in the full protein structure upon binding, for example, a rearrangement of the domains in multidomain proteins. Importantly, if the same structure is selected as the top template by both full and interface alignment, the docking success rate increases 2-fold for both top 1 and top 10 predictions. Matching structural annotations of the target and template proteins for template detection, as a computationally less expensive alternative to structural alignment, did not improve the docking performance. Sophisticated remote sequence homology detection added templates to the pool of those identified by structure-based alignment, suggesting that for practical docking, the combination of the structure alignment protocols and the remote sequence homology detection may be useful in order to avoid potential flaws in generation of the structural templates library.

20.
Left-handed polyproline II (PPII) helices commonly occur in globular proteins in segments of 4-8 residues. This paper analyzes the structural conservation of PPII-helices in 3 protein families: serine proteinases, aspartic proteinases, and immunoglobulin constant domains. Calculations of the number of conserved segments based on structural alignment of homologous molecules yielded similar results for the PPII-helices, the alpha-helices, and the beta-strands. The PPII-helices are consistently conserved at the level of 100-80% in the proteins with sequence identity above 20% and RMS deviation of structure alignments below 3.0 Å. The most structurally important PPII segments are conserved below this level of sequence identity. These results suggest that the PPII-helices, in addition to the other 2 secondary structure classes, should be identified as part of structurally conserved regions in proteins. This is supported by similar values for the local RMS deviations of the aligned segments for the structural classes of PPII-helices, alpha-helices, and beta-strands. The PPII-helices are shown to participate in supersecondary elements such as PPII-helix/alpha-helix. The conservation of PPII-helices depends on the conservation of a supersecondary element as a whole. PPII-helices also form links, possibly flexible, in the interdomain regions. The role of the PPII-helices in model building by homology is 2-fold; they serve as additional conserved elements in the structure allowing improvement of the accuracy of a model and provide correct chain geometry for modeling of the segments equivalenced to them in a target sequence. The improvement in model building is demonstrated in 2 test studies.
