Similar literature
 20 similar articles found
1.
2.
Automated methods have been developed to determine the preferred packing arrangement between interacting protein groups. A suite of FORTRAN programs, SIRIUS, is described for calculating and analysing the geometries of interacting protein groups using crystallographically derived atomic co-ordinates. The programs involved in calculating the geometries search for interacting pairs of protein groups using a distance criterion, and then calculate the spatial disposition and orientation of the pair. The second set of programs is devoted to analysis. This involves calculating the observed and expected distributions of the angles and assessing the statistical significance of the difference between the two. A database of the geometries of the 400 combinations of side-chain to side-chain interaction has been created. The approach used in analysing the geometrical information is illustrated here with specific examples of interactions between side-chains, peptide groups and particular types of atom. At the side-chain level, an analysis of aromatic-amino interactions, and the interactions of peptide carbonyl groups with arginine residues is presented. At the atomic level the analyses include the spatial disposition of oxygen atoms around tyrosine residues, and the frequency and type of contact between carbon, nitrogen and oxygen atoms. This information is currently being applied to the modelling of protein interactions.

3.
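Based on the hypothesis that protein-protein interactions arise from the joint action of multiple domains, a new method for predicting protein-protein interactions is proposed. A support vector machine is used to analyse the physicochemical properties of the amino acids in the sequences of domain-combination pairs, yielding sequence feature values, while statistical analysis provides frequency feature values. The two kinds of features are then fused to estimate the likelihood that a given pair of domain combinations interacts, and this estimate is used to predict protein-protein interactions. The method can predict interactions between all domain combinations and performs well in predicting protein-protein interactions.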

4.
The folding process of sea hare myoglobin was simulated by the island model, which does not rely on sequence homologies or statistical inference from databases of known structures. Sea hare myoglobin has low sequence homology (28%), but high structural similarity, with sperm whale myoglobin, which had already been simulated by the island model. Their structural similarity is shown physicochemically from the distribution of hydrophobic-residue pairs, that is, the key pairs for packing of the secondary structures. Irrespective of sequence homology, the secondary structures can be packed into the tertiary structure through the hydrophobic interactions among the amino acid pairs responsible for the local structure formation. The results on the two species of myoglobins indicate that, in contrast to other prediction methods, the island model is applicable to any type of protein without extra information other than the distribution of hydrophobic-residue pairs and the positions of the secondary structures. Consequently, the present results provide another verification of the validity of the island model for elucidating the mechanisms of protein folding and predicting protein structures.

5.
Predicting protein structure from primary sequence is one of the ultimate challenges in computational biology. Given the large amount of available sequence data, the analysis of co-evolution, i.e., statistical dependency, between columns in multiple alignments of protein domain sequences remains one of the most promising avenues for predicting residues that are contacting in the structure. A key impediment to this approach is that strong statistical dependencies are also observed for many residue pairs that are distal in the structure. Using a comprehensive analysis of protein domains with available three-dimensional structures we show that co-evolving contacts very commonly form chains that percolate through the protein structure, inducing indirect statistical dependencies between many distal pairs of residues. We characterize the distributions of length and spatial distance traveled by these co-evolving contact chains and show that they explain a large fraction of observed statistical dependencies between structurally distal pairs. We adapt a recently developed Bayesian network model into a rigorous procedure for disentangling direct from indirect statistical dependencies, and we demonstrate that this method not only successfully accomplishes this task, but also allows contacts with weak statistical dependency to be detected. To illustrate how additional information can be incorporated into our method, we incorporate a phylogenetic correction, and we develop an informative prior that takes into account that the probability for a pair of residues to contact depends strongly on their primary-sequence distance and the amount of conservation that the corresponding columns in the multiple alignment exhibit. We show that our model including these extensions dramatically improves the accuracy of contact prediction from multiple sequence alignments.

6.
One of the major contributors to protein structure is the formation of disulphide bonds between selected pairs of cysteines in the oxidized state. Prediction of such disulphide bridges from sequence is challenging because the number of possible cysteine pairings grows as the number of cysteines in a protein increases. Here, we describe an SVM (support vector machine) model for the prediction of cystine connectivity in a protein sequence with and without a priori knowledge of the bonding state of the cysteines. We make use of a new encoding scheme based on physico-chemical properties and statistical features (probability of occurrence of each amino acid residue in different secondary structure states along with PSI-BLAST profiles). We evaluate our method on the SPX dataset (an extended dataset of SP39 (Swiss-Prot 39) and SP41 (Swiss-Prot 41) with known disulphide information from the PDB) and compare our results with the recursive neural network model described for the same dataset.

7.
Regularities in the primary structure of proteins   Cited by: 3 (self-citations: 0; citations by others: 3)
In this paper the latest protein database consisting of more than a million amino acids is analyzed to characterize the short range regularities in the primary structure. The amino acid distributions along the polypeptide chain and among the proteins have been studied first. Their influence on the amino acid pair statistics was taken into account. We are primarily interested in the distances of the covalent structure, where the amino acid pair frequencies show non-random character. The amino acid pairs separated by at least 20 residues in the covalent structure exhibit an exact Gaussian distribution. We found that there is a range of non-random pairing in the covalent structure. We conclude that the pair preference characters are different for each of the 20 x 20 amino acid pairs. The range of the non-random pairing varies from pair to pair, and in most cases it does not extend beyond the 9th neighbour. The preferences of a certain pair in a certain position cannot be derived from the character of that pair in another position. The preference values of 400 amino acid pairs are listed for pairs up to the 9th neighbour position. Some fields of potential application of these data have also been discussed.
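As a rough illustration of the kind of pair statistic described above, the following Python sketch counts ordered residue pairs at a fixed separation k and divides each count by the number expected from the residue frequencies at the two positions. The function name `pair_preference` and the observed/expected ratio are assumptions made for illustration; the abstract does not specify how the published preference values were computed.

```python
from collections import Counter

def pair_preference(sequences, k):
    """Observed/expected ratio for each ordered pair (a, b) with b located k residues after a.

    Hypothetical helper: the expected count assumes the two positions are
    independent, using the residue frequencies seen at each position.
    """
    observed = Counter()   # counts of (a, b) pairs at separation k
    first = Counter()      # counts of residues in the first position
    second = Counter()     # counts of residues in the second position
    total = 0
    for seq in sequences:
        for i in range(len(seq) - k):
            a, b = seq[i], seq[i + k]
            observed[(a, b)] += 1
            first[a] += 1
            second[b] += 1
            total += 1
    # expected = (first[a] / total) * (second[b] / total) * total
    return {
        pair: count * total / (first[pair[0]] * second[pair[1]])
        for pair, count in observed.items()
    }

if __name__ == "__main__":
    ratios = pair_preference(["ABABAB"], k=1)
    print(ratios)
```

A ratio above 1 would indicate a pair that occurs more often than composition alone predicts at that separation; repeating the call for k = 1 to 9 mirrors the neighbour range discussed in the abstract.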

8.
Protein Explorer: easy yet powerful macromolecular visualization   Cited by: 21 (self-citations: 0; citations by others: 21)
Protein Explorer (PE, http://www.proteinexplorer.org) enables students, educators and other nonspecialists to visualize macromolecular structures easily. It also offers several advanced capabilities useful to protein structure specialists. Great attention has been given to making PE easy to use. Explanations, color keys and troubleshooting information are displayed automatically. There are also 'Frequently Asked Questions', a one-hour 'Quick-Tour', an alphabetical 'Help/Index/Glossary', and a detailed 'Tutorial'; all making PE much easier to use than either Chime or RasMol. Moreover, it is much more powerful; in addition to basic macromolecular visualization capabilities common to most similar programs, it offers one-click visualization of interfaces between moieties ('contacts'), cation-pi interactions and salt bridges, as well as easy-to-use routines to visualize regions of conservation in three-dimensional protein structures based on multiple sequence alignments.

9.
Comparison of the primary structures of pancreatic colipases from man, pig, horse and rat shows a high degree of homology between proteins. Fifty-two out of the 95 residues of the polypeptide are identical. All colipases contain 10 half-cystines which are located at invariant positions. The secondary structure of colipases has been predicted from the sequence using the statistical method of Chou and Fasman and the method of Gibrat, Garnier and Robson based on information theory. Predictions indicate that colipases have a low content of alpha-helix and beta-strand structure. The two segments at positions 7-10 and 56-59, assumed to be part of the lipid binding domain, have predicted beta-sheet conformation and should be in close spatial vicinity to each other in the proteins. Four beta-turns are predicted in all colipases at positions 3-6, 46-49, 61-64, and 81-84. They might contribute, with the five disulfide bridges, to a tight packing of the protein molecule. Surface residues and major sequential antigenic determinants of mammalian colipases have been predicted using methods based either on hydrophilicity/hydropathy scales or amino acid mutability. From these studies, it appears that colipases exhibit large conformational homologies. In the absence of data on the tertiary structure of colipase, predictive methods, together with physico-chemical and immunological studies, provide valuable information on the conformation of the protein in relation to the topology of residues involved in the functional and antigenic sites.

10.
The electrostatic free energy contribution of an ion pair in a protein depends on two factors, geometrical orientation of the side-chain charged groups with respect to each other and the structural context of the ion pair in the protein. Conformers in NMR ensembles enable studies of the relationship between geometry and electrostatic strengths of ion pairs, because the protein structural contexts are highly similar across different conformers. We have studied this relationship using a dataset of 22 unique ion pairs in 14 NMR conformer ensembles for 11 nonhomologous proteins. In different NMR conformers, the ion pairs are classified as salt bridges, nitrogen-oxygen (N-O) bridges and longer-range ion pairs on the basis of geometrical criteria. In salt bridges, centroids of the side-chain charged groups and at least a pair of side-chain nitrogen and oxygen atoms of the ion-pairing residues are within a 4 Å distance. In N-O bridges, at least a pair of the side-chain nitrogen and oxygen atoms of the ion-pairing residues are within 4 Å distance, but the distance between the side-chain charged group centroids is greater than 4 Å. In the longer-range ion pairs, the side-chain charged group centroids as well as the side-chain nitrogen and oxygen atoms are more than 4 Å apart. Continuum electrostatic calculations indicate that most of the ion pairs have stabilizing electrostatic contributions when their side-chain charged group centroids are within 5 Å distance. Hence, most (approximately 92%) of the salt bridges and a majority (68%) of the N-O bridges are stabilizing. Most (approximately 89%) of the destabilizing ion pairs are the longer-range ion pairs. In the NMR conformer ensembles, the electrostatic interaction between side-chain charged groups of the ion-pairing residues is the strongest for salt bridges, considerably weaker for N-O bridges, and the weakest for longer-range ion pairs. These results suggest empirical rules for stabilizing electrostatic interactions in proteins.
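The three geometric classes defined in this abstract can be sketched in a few lines of Python. This is only an illustrative reading of the stated 4 Å criteria: the function names, the plain-tuple coordinate format, the treatment of "within 4 Å" as "less than or equal to 4 Å", and the "unclassified" label for the one combination the abstract does not name are all assumptions.

```python
import math

CUTOFF = 4.0  # distance threshold in Å, as quoted in the abstract

def centroid(points):
    """Arithmetic mean of a list of (x, y, z) coordinates."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(3))

def classify_ion_pair(pos_group, neg_group, nitrogens, oxygens, cutoff=CUTOFF):
    """Classify an ion pair from side-chain coordinates (hypothetical helper).

    pos_group / neg_group: atoms of the two charged groups (for the centroids).
    nitrogens / oxygens: side-chain N and O atoms of the two residues.
    """
    centroid_dist = math.dist(centroid(pos_group), centroid(neg_group))
    min_no_dist = min(math.dist(n, o) for n in nitrogens for o in oxygens)
    no_close = min_no_dist <= cutoff
    if centroid_dist <= cutoff and no_close:
        return "salt bridge"
    if no_close:
        return "N-O bridge"             # N-O contact, centroids farther apart
    if centroid_dist > cutoff:
        return "longer-range ion pair"  # neither criterion satisfied
    return "unclassified"               # case not described in the abstract

if __name__ == "__main__":
    print(classify_ion_pair([(0, 0, 0)], [(3, 0, 0)], [(0, 0, 0)], [(3, 0, 0)]))
```

Running the classifier over every conformer of an ensemble would show how a single ion pair moves between classes as the geometry fluctuates.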

11.
Gap junctions form intercellular channels that mediate metabolic and electrical signaling between neighboring cells in a tissue. Lack of an atomic resolution structure of the gap junction has made it difficult to identify interactions that stabilize its transmembrane domain. Using a recently computed model of this domain, which specifies the locations of each amino acid, we postulated the existence of several interactions and tested them experimentally. We introduced mutations within the transmembrane domain of the gap junction-forming protein connexin that were previously implicated in genetic diseases and that apparently destabilized the gap junction, as evidenced here by the absence of the protein from the sites of cell-cell apposition. The model structure helped identify positions on adjacent helices where second-site mutations restored membrane localization, revealing possible interactions between residue pairs. We thus identified two putative salt bridges and one pair involved in packing interactions in which one disease-causing mutation suppressed the effects of another. These results seem to reveal some of the physical forces that underlie the structural stability of the gap junction transmembrane domain and suggest that abrogation of such interactions brings about some of the effects of disease-causing mutations.

12.
13.
Based on the analysis of a large number of protein sequences (Cserzo M., Simon I., 1989), the structure of the amino acid nearest neighbour pair whose occurrence has a maximal positive deviation from the mean statistical value is shown to correspond in most cases to the code of the amino acid codon roots. It particularly reveals amino acid pairs in the n and n+5 positions of polypeptide chains. Amino acids belonging to the A/U family contribute mostly to the folding of peptide chains.

14.
Zhou Y, Zhou YS, He F, Song J, Zhang Z. Molecular BioSystems, 2012, 8(5):1396-1404
Deciphering functional interactions between proteins is one of the great challenges in biology. Sequence-based homology-free encoding schemes have been increasingly applied to develop promising protein-protein interaction (PPI) predictors by means of statistical or machine learning methods. Here we analyze the relationship between codon pair usage and PPIs in yeast. We show that codon pair usage of interacting protein pairs differs significantly from random expectation. This motivates the development of a novel approach for predicting PPIs, termed CCPPI, with codon pair frequency difference as input to a Support Vector Machine predictor. 10-fold cross-validation tests based on yeast PPI datasets with balanced positive-to-negative ratios indicate that CCPPI performs better than other sequence-based encoding schemes. Moreover, it ranks the best when tested on an unbalanced large-scale dataset. Although CCPPI is subject to high false positive rates like many PPI predictors, statistical analyses of the predicted true positives confirm that the success of CCPPI is partly ascribed to its capability to capture proteomic co-expression and functional similarities between interacting protein pairs. Our findings suggest that codon pairs of interacting protein pairs evolve in a coordinated manner and consequently they provide additional information beyond amino acid-based encoding schemes. CCPPI has been made freely available at: http://protein.cau.edu.cn/ccppi.

15.
A method for comparison of protein sequences based on their primary and secondary structure is described. Protein sequences are annotated with predicted secondary structures (using a modified Chou and Fasman method). Two-lettered code sequences are generated (Xx, where X is the amino acid and x is its annotated secondary structure). Sequences are compared with a dynamic programming method (STRALIGN) that includes a similarity matrix for both the amino acids and secondary structures. The similarity value for each paired two-lettered code is a linear combination of similarity values for the paired amino acids and their annotated secondary structures. The method has been applied to eight globin proteins (28 pairs) for which the X-ray structure is known. For protein pairs with high primary sequence similarity (greater than 45%), STRALIGN alignment is identical to that obtained by a dynamic programming method using only primary sequence information. However, alignment of protein pairs with lower primary sequence similarity improves significantly with the addition of secondary structure annotation. Alignment of the pair with the least primary sequence similarity of 16% was improved from 0 to 37% 'correct' alignment using this method. In addition, STRALIGN was successfully applied to seven pairs of distantly related cytochrome c proteins, and three pairs of distantly related picornavirus proteins.
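The scoring idea in this abstract, a linear combination of amino acid and secondary-structure similarity fed into dynamic programming, can be sketched as follows. This is not the STRALIGN program: the global Needleman-Wunsch recurrence, the linear gap penalty, the weights `w_aa` and `w_ss`, and the identity similarity functions are all assumptions chosen to keep the sketch short.

```python
def pair_score(a, b, aa_sim, ss_sim, w_aa=1.0, w_ss=1.0):
    """Score two two-letter codes such as 'Ah' (residue A, state h).

    The weights are hypothetical; the abstract only states that the
    similarity is a linear combination of the two component similarities.
    """
    return w_aa * aa_sim(a[0], b[0]) + w_ss * ss_sim(a[1], b[1])

def align_score(seq1, seq2, aa_sim, ss_sim, gap=-2.0, w_aa=1.0, w_ss=1.0):
    """Global alignment score (Needleman-Wunsch) with a linear gap penalty."""
    m = len(seq2)
    prev = [j * gap for j in range(m + 1)]       # row for the empty prefix of seq1
    for i in range(1, len(seq1) + 1):
        cur = [i * gap] + [0.0] * m
        for j in range(1, m + 1):
            match = prev[j - 1] + pair_score(
                seq1[i - 1], seq2[j - 1], aa_sim, ss_sim, w_aa, w_ss)
            cur[j] = max(match, prev[j] + gap, cur[j - 1] + gap)
        prev = cur
    return prev[m]

def identity(x, y):
    """Toy similarity: 1 for identical symbols, 0 otherwise."""
    return 1.0 if x == y else 0.0

if __name__ == "__main__":
    s1 = ["Ah", "Gc", "Le"]
    s2 = ["Ah", "Le"]
    print(align_score(s1, s2, identity, identity))
```

In practice the identity functions would be replaced by lookups into an amino acid substitution matrix and a secondary-structure similarity matrix, and the weights tuned so that structure annotation helps most when sequence similarity is low.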

16.
Collagen fibrils are among the most abundant protein polymers in living organisms. While the longitudinal packing of molecules in the fibrils has been agreed on for many years, there is continuous disagreement over the lateral packing. In this work, we describe a set of computer graphics programs that can be used to visualize fibril packing and simulate fibril growth. An example of a model and simulation is presented.

17.
A knowledge-based approach to the modelling of enzyme-peptide inhibitor complexes is described. Given the structure of an enzyme, and knowledge of its binding site, the method seeks to predict the binding geometry of a peptide ligand. This novel method involves using examples of side-chain packing derived from proteins of known three-dimensional structure to define possible packing arrangements of a peptide inhibitor group to its binding site. A suite of programs, GEMINI, was written and used to predict the packing of pairs of amino acid groups from three inhibitors complexed to their enzymes for which the X-ray structures were available. These included the Phe group of the inhibitor H142 bound to endothiapepsin, the Leu group of CLT complexed to thermolysin and the C-terminus of Gly-L-Tyr bound to carboxypeptidase A. A detailed comparison of the modelled and observed inhibitor coordinates was made. This approach may be extended to modelling other types of protein interactions.

18.
Kumar S, Nussinov R. Proteins, 2001, 43(4):433-454
This report investigates the effect of systemic protein conformational flexibility on the contribution of ion pairs to protein stability. Toward this goal, we use all NMR conformer ensembles in the Protein Data Bank (1) that contain at least 40 conformers, (2) whose functional form is monomeric, (3) that are nonredundant, and (4) that are large enough. We find 11 proteins adhering to these criteria. Within these proteins, we identify 22 ion pairs that are close enough to be classified as salt bridges. These are identified in the high-resolution crystal structures of the respective proteins or in the minimized average structures (if the crystal structures are unavailable) or, if both are unavailable, in the "most representative" conformer of each of the ensembles. We next calculate the electrostatic contribution of each such ion pair in each of the conformers in the ensembles. This results in a comprehensive study of 1,201 ion pairs, which allows us to look for consistent trends in their electrostatic contributions to protein stability in large sets of conformers. We find that the contributions of ion pairs vary considerably among the conformers of each protein. The vast majority of the ion pairs interconvert between being stabilizing and destabilizing to the structure at least once in the ensembles. These fluctuations reflect the variabilities in the location of the ion pairing residues and in the geometric orientation of these residues, both with respect to each other, and with respect to other charged groups in the remainder of the protein. The higher crystallographic B-factors for the respective side-chains are consistent with these fluctuations. The major conclusion from this study is that salt bridges observed in crystal structures may break, and new salt bridges may be formed. Hence, the overall stabilizing (or destabilizing) contribution of an ion pair is conformer population dependent.

19.
To understand how protein segments are inserted and deleted during divergent evolution, a set of pairwise alignments containing exactly one gap, and therefore arising from the first insertion-deletion (indel) event in the time separating the homologs, was examined. The alignments showed that "structure breaking" amino acids (PGDNS) were preferred within and flanking gapped regions, as were two residues with hydrophilic side-chains (QE) that frequently occur at the surface of protein folds. Conversely, hydrophobic residues (FMILYVW) occur infrequently within and flanking the gapped region. These preferences are modestly different in protein pairs separated by an episode of adaptive evolution than in pairs diverging under strong functional constraints. Surprisingly, regions near an indel have not evolved more rapidly than the sequence pair overall, showing no evidence that an indel event must be compensated by local amino acid replacement. The gap lengths are best approximated by a Zipfian distribution, with the probability of a gap of length L decreasing as a function of L^(-1.8). These features are largely independent of the length of the gap and the extent of divergence (measured by both silent and non-silent sequence changes) separating the two proteins. Surprisingly, amino acid repeats were discovered in more than a third of the polypeptide segments in and around the gap. These correspond to repeats in the DNA sequence. This suggests that a signature of the mechanism by which indels occur in the DNA sequence remains in the encoded protein sequences. These data suggest specific tools to score gap placement in an alignment. They also suggest tools that distinguish true indels from gaps created by mistaken gene finding, including under-predicted and over-predicted introns. By providing mechanisms to identify errors, the tools will enhance the value of genome sequence databases in support of integrated paleogenomics strategies used to extract functional information in a post-genomic environment.
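To make the Zipfian gap-length statement concrete, the short Python sketch below turns the reported exponent of 1.8 into a normalized probability table and a log-probability gap score. Truncating the distribution at a maximum gap length and using the negative log probability as a penalty are assumptions for illustration; the abstract does not describe the scoring tools in detail.

```python
import math

def gap_length_probabilities(max_len, exponent=1.8):
    """P(L) proportional to L**(-exponent) for L = 1..max_len, normalized to sum to 1.

    max_len is a hypothetical truncation point needed to normalize a finite table.
    """
    weights = [length ** -exponent for length in range(1, max_len + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def gap_penalty(length, probabilities):
    """Negative log probability of a gap of the given length (an assumed scoring rule)."""
    return -math.log(probabilities[length - 1])

if __name__ == "__main__":
    probs = gap_length_probabilities(max_len=50)
    for length in (1, 2, 5, 10):
        print(length, round(probs[length - 1], 4), round(gap_penalty(length, probs), 3))
```

Under this form, the penalty grows with the logarithm of the gap length rather than linearly, so long gaps are penalized less steeply than under an affine gap model.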

20.
Here we report an orientation-dependent statistical all-atom potential derived from side-chain packing, named OPUS-PSP. It features a basis set of 19 rigid-body blocks extracted from the chemical structures of all 20 amino acid residues. The potential is generated from the orientation-specific packing statistics of pairs of those blocks in a non-redundant structural database. The purpose of such an approach is to capture the essential elements of orientation dependence in molecular packing interactions. Tests of OPUS-PSP on commonly used decoy sets demonstrate that it significantly outperforms most of the existing knowledge-based potentials in terms of both its ability to recognize native structures and consistency in achieving high Z-scores across decoy sets. As OPUS-PSP excludes interactions among main-chain atoms, its success highlights the crucial importance of side-chain packing in forming native protein structures. Moreover, OPUS-PSP does not explicitly include solvation terms, and thus the potential should perform well when the solvation effect is difficult to determine, such as in membrane proteins. Overall, OPUS-PSP is a generally applicable potential for protein structure modeling, especially for handling side-chain conformations, one of the most difficult steps in high-accuracy protein structure prediction and refinement.
