Similar literature
 Found 20 similar articles (search time: 31 ms)
1.
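Detailed knowledge of a protein's tertiary structure helps in understanding its biological function. With progress in plant genome research, more than 50 plant metallothionein-like (MT-L) genes have been identified. To date, however, only a few MT-L proteins have been purified and no structures have been reported, so a method for analyzing the structural features of these proteins is needed. In this study, distance constraints between atoms in the CXC and CXXC motifs and in the metal-sulfur cluster were derived from known structural data for mammalian MT, and a distance geometry algorithm was used to compute possible conformations of the target proteins. Statistical analysis was then used to select structures with significantly smaller target function values and low conformational energy as the predicted structures of the cysteine-rich regions, yielding a structure prediction method suited to the cysteine-rich regions of plant MT-L proteins. The method correctly predicted the known structure of blue crab MT, which indicates that it is feasible. The method was also used to predict the structure of the cysteine-rich regions of a rapeseed MT-L protein.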

2.
Three-dimensional structures of membrane proteins from genomic sequencing   Total citations: 1 (self-citations: 0; citations by others: 1)
Hopf TA, Colwell LJ, Sheridan R, Rost B, Sander C, Marks DS. Cell 2012, 149(7): 1607-1621
We show that amino acid covariation in proteins, extracted from the evolutionary sequence record, can be used to fold transmembrane proteins. We use this technique to predict previously unknown 3D structures for 11 transmembrane proteins (with up to 14 helices) from their sequences alone. The prediction method (EVfold_membrane) applies a maximum entropy approach to infer evolutionary covariation in pairs of sequence positions within a protein family and then generates all-atom models with the derived pairwise distance constraints. We benchmark the approach with blinded de novo computation of known transmembrane protein structures from 23 families, demonstrating unprecedented accuracy of the method for large transmembrane proteins. We show how the method can predict oligomerization, functional sites, and conformational changes in transmembrane proteins. With the rapid rise in large-scale sequencing, more accurate and more comprehensive information on evolutionary constraints can be decoded from genetic variation, greatly expanding the repertoire of transmembrane proteins amenable to modeling by this method.

3.
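Modeling the structure of the cysteine-rich regions of plant metallothionein-like proteins   Total citations: 1 (self-citations: 0; citations by others: 1)
Detailed knowledge of a protein's tertiary structure helps in understanding its biological function. With progress in plant genome research, more than 50 plant metallothionein-like (MT-L) genes have been identified. To date, however, only a few MT-L proteins have been purified and no structures have been reported, so a method for analyzing the structural features of these proteins is needed. In this study, distance constraints between atoms in the CXC and CXXC motifs and in the metal-sulfur cluster were derived from known structural data for mammalian MT, and a distance geometry algorithm was used to compute possible conformations of the target proteins. Statistical analysis was then used to select structures with significantly smaller target function values and low conformational energy as the predicted structures of the cysteine-rich regions, yielding a structure prediction method suited to the cysteine-rich regions of plant MT-L proteins. The method correctly predicted the known structure of blue crab MT, which indicates that it is feasible. The method was also used to predict the structure of the cysteine-rich regions of a rapeseed MT-L protein.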

4.
Correlations of amino acids in proteins   Total citations: 2 (self-citations: 0; citations by others: 2)
Du Q, Wei D, Chou KC. Peptides 2003, 24(12): 1863-1869
A correlation analysis among 20 amino acids is performed for four protein structural classes (α, β, α/β, and α+β) in a total of 204 proteins. The correlation relationships among amino acids can be classified into the following four types: (1) strong positive correlation, (2) strong negative correlation, (3) weak correlation, and (4) no correlation. The correlation relationships are different for different proteins and are correlated with the features of their structural classes. The amino acids with the weak correlation relationship can be treated as the independent basis functions for the space where proteins are defined. The amino acids with large correlation coefficients are linearly correlated with each other and are not independent. The strong correlation among amino acids reflects their mutually constrained relationship, as exhibited by their relevant structural features. The information obtained through the correlation analysis is used for predicting protein structural classes, and a better prediction quality is obtained than with the simple geometric distance methods that do not take the correlation effects into account.

5.
We have developed a new methodology that determines protein structures using small-angle X-ray scattering (SAXS) data. The current bottlenecks in determining the protein structures require a new strategy using the simple design of an experiment, and SAXS is suitable for this purpose in spite of its low information content. First we demonstrated that SAXS constraints work additively to NMR-derived information in calculating structures. Next, structure calculations for nine proteins taking different folds were performed using the SAXS constraints combined with the NMR-derived distance restraints for local geometry such as secondary structures or those for tertiary structure. The results show that the SAXS constraints complemented the tertiary-structural information for all the proteins, and that accuracy of the structures thus obtained with SAXS constraints and local geometrical restraints ranged from 1.85 to 4.33 Å. Based on these results, we were able to construct a coarse-grained protein model at amino acid residue resolution.

6.
Coordinated amino acid changes in homologous protein families   Total citations: 4 (self-citations: 0; citations by others: 4)
In the tobamovirus coat protein family, amino acid residues at some spatially close positions are found to be substituted in a coordinated manner [Altschuh et al. (1987) J. Mol. Biol., 193, 693]. Therefore, these positions show an identical pattern of amino acid substitutions when amino acid sequences of these homologous proteins are aligned. Based on this principle, coordinated substitutions have been searched for in three additional protein families: serine proteases, cysteine proteases and the haemoglobins. Coordinated changes have been found in all three protein families mostly within structurally constrained regions. This method works with a varying degree of success depending on the function of the proteins, the range of sequence similarities and the number of sequences considered. By relaxing the criteria for residue selection, the method was adapted to cover a broader range of protein families and to study regions of the proteins having weaker structural constraints. The information derived by these methods provides a general guide for engineering of a large variety of proteins to analyse structure-function relationships.
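The "identical pattern of amino acid substitutions" idea in the abstract above can be illustrated with a small Python sketch. It is only an illustration under stated assumptions, not the authors' procedure: the function names, the pattern encoding, and the toy alignment are all hypothetical, and the sketch requires an exact pattern match, which is stricter than the relaxed criteria the abstract mentions.

```python
from collections import defaultdict

def substitution_pattern(column):
    """Encode a column as the partition of sequences it induces.

    Each residue is replaced by the index of its first occurrence, so two
    columns get the same pattern when they change in the same sequences,
    regardless of which residues are involved.
    """
    first_seen = {}
    return tuple(first_seen.setdefault(res, len(first_seen)) for res in column)

def coordinated_positions(alignment):
    """Group variable alignment columns that share a substitution pattern."""
    groups = defaultdict(list)
    for pos, column in enumerate(zip(*alignment)):
        if len(set(column)) > 1:  # skip fully conserved columns
            groups[substitution_pattern(column)].append(pos)
    return [positions for positions in groups.values() if len(positions) > 1]

# Toy alignment (hypothetical sequences, not taken from the paper).
toy_alignment = ["AKLE", "AKLE", "GRLD", "GRLD"]
coordinated = coordinated_positions(toy_alignment)
```

In the toy alignment, columns 0, 1 and 3 change together between the first two and the last two sequences, while column 2 is conserved and is skipped.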

7.
One of the goals of molecular bioinformatics is decoding amino acid sequences to extract information on the principles of protein folding. However, this is difficult to perform with standard bioinformatics techniques such as multiple sequence alignment. Thus, we propose a technique based on inter-residue average distance statistics to make predictions regarding the protein folding mechanisms of amino acid sequences. Our method involves constructing a kind of predicted contact map called an Average Distance Map (ADM) based on average distance statistics to pinpoint regions of possible folding nuclei for proteins. Only information on the amino acid sequence of a given protein is required for the present method. In this article, we summarize the results of studies using our method to analyze how specific protein sequences affect folding properties. In particular, we present studies on proteins in the phage lysozyme, globin, fatty acid binding protein-like, and cupredoxin-like fold families. In the present review, we characterize the 3D architectures of these proteins through the properties of the protein ADMs. Furthermore, we combine the information on the conserved residues within the regions predicted by the ADMs with our results obtained so far. Such information may help identify the folding characteristics of each protein. We discuss this possibility in the present review.

8.
"Pseudo-structures" of the 20 common amino acid residues are introduced for use in protein spatial structure determinations, which rely on the use of intramolecular proton-proton distance constraints determined by nuclear Overhauser effects as input for distance geometry calculations. The proposed structures satisfy requirements for the initial structural interpretation of the nuclear magnetic resonance data that arise from the absence of stereospecific assignments and/or limited spectral resolution for certain resonance lines. The pseudo-atoms used as reference points for the experimental distance constraints can be used in conjunction with the real amino acid structures representing the van der Waals' constraints on the spatial molecular structure, or with simplified models in order to reduce the computing time for the distance geometry calculations.

9.
The construction of a realistic theoretical model of proteins is essential for improving the computational simulations of their structural and functional aspects. Modeling proteins as a network of non-covalent connections between the atoms of amino acid residues has yielded valuable insights into these macromolecules. The energy-related properties of protein structures are known to be very important in molecular dynamics. However, these same properties have been neglected when the protein structures are modeled as networks of atoms and amino acid residues. A new approach for the construction of protein models based on a network of atoms is presented. This method, based on interatomic interaction, takes into account the energy and geometric aspects of the protein structures that were not employed before, such as atomic occlusion inside the protein, the use of solvation, protein modeling and analysis, and the use of energy potentials to estimate the energies of interatomic non-covalent contacts. As a result, we achieved a more realistic network model of proteins. This model has the virtue of being more robust in the face of different unknown variables that usually are arbitrarily estimated. We were able to determine the most connected residues of all the proteins studied, so that we are now in a better position to study their structural role.

10.
Chemical cross-linking is an attractive technique for the study of the structure of protein complexes due to its low sample consumption and short analysis time. Furthermore, distance constraints obtained from the identification of cross-linked peptides by MS can be used to construct and validate protein models. If a sufficient number of distance constraints are obtained, then determining the secondary structure of a protein can allow inference of the protein's fold. In this work, we show how the distance constraints obtained from cross-linking experiments can identify secondary structures within the protein sequence. Molecular modeling of alpha helices and beta sheets reveals that each secondary structure presents different cross-linking possibilities due to the topological distances between reactive residues. Cross-linking experiments performed with amine reactive cross-linkers with model alpha helix containing proteins corroborated the molecular modeling predictions. The cross-linking patterns established here can be extended to other cross-linkers with known lengths for the determination of secondary structures in proteins.

11.
It is known that the backbone conformation of a protein can be reproduced with precision once a correct contact map (two-dimensional representation showing residue pairs in contact) is given as geometrical constraints. There is, however, no way to infer the correct contact map for a protein of unknown structure. We started with one-dimensional constraints using the quantity N14 (the number of neighboring residues within the radius of 14 Å). Since the plot of N14 along a chain shows a good correlation with the corresponding amino acid sequence, the N14 profile obtained from the X-ray structure is predictable from the sequence. Construction of backbone conformations under a given N14 profile was carried out in the following two steps: (1) a contact map from the N14 profile was produced by taking the product of N14 values of every two residues; (2) backbone conformations were generated by applying the distance geometry technique to distance constraints given by the contact map. If present, disulfide bonds in a protein, as well as the secondary structure, were treated as additional constraints, and both cases with or without the additional information were examined. The method was tested for 11 proteins of known structure, and the results indicated that the reproduced conformation was fairly good, using an X-ray structure for comparison, for small proteins less than 80 residues long. The basic assumption and effectiveness of the present method were compared with those of previous studies employing the geometrical constraint approach. It has become clear that the specific, one-dimensional information (e.g., N14 profile) is more effective than nonspecific, two-dimensional constraints, such as average interresidue distances between particular types of amino acids. © 1993 Wiley-Liss, Inc.
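Step (1) of the abstract above, going from an N14 profile to a contact map via products of N14 values, can be sketched in Python as follows. This is a minimal sketch under assumptions: the abstract does not say how the products are converted into contacts, so the thresholding rule, the choice not to count a residue as its own neighbor, the function names, and the coordinates are all hypothetical.

```python
import math

def n14_profile(coords, radius=14.0):
    """Count, for each residue, the other residues within `radius` angstroms."""
    return [
        sum(1 for j, q in enumerate(coords) if j != i and math.dist(p, q) <= radius)
        for i, p in enumerate(coords)
    ]

def product_contact_map(profile, threshold):
    """Mark a pair as in contact when the product of its N14 values reaches `threshold`.

    The thresholding rule is an assumption made for this sketch.
    """
    n = len(profile)
    return [
        [i != j and profile[i] * profile[j] >= threshold for j in range(n)]
        for i in range(n)
    ]

# Hypothetical C-alpha coordinates: five residues on a line, 5 angstroms apart.
coords = [(5.0 * k, 0.0, 0.0) for k in range(5)]
profile = n14_profile(coords)
contacts = product_contact_map(profile, threshold=12)
```

In the method described, the resulting map would then supply the distance constraints for the distance geometry step, which is not sketched here.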

12.
Homology detection and protein structure prediction are central themes in bioinformatics. Establishing relationships between protein sequences, or predicting their structure by sequence comparison methods, is limited when sequence similarity is low. Recent works demonstrate that the use of profiles improves homology detection and protein structure prediction. Profiles can be inferred from protein multiple alignments using different approaches. The "Conservatism-of-Conservatism" is an effective profile analysis method to identify structural features between proteins having the same fold but no detectable sequence similarity. The information obtained from protein multiple alignments varies according to the amino acid classification employed to calculate the profile. In this work, we calculated entropy profiles from PSI-BLAST-derived multiple alignments and used different amino acid classifications summarizing almost 500 different attributes. These entropy profiles were converted into pseudocodes which were compared using the FASTA program with an ad-hoc matrix. We tested the performance of our method to identify relationships between proteins with similar fold using a nonredundant subset of sequences having less than 40% identity. We then compared our results, using Coverage Versus Error per query curves, to those obtained by methods like PSI-BLAST, COMPASS and HHSEARCH. Our method, named HIP (Homology Identification with Profiles), presented higher accuracy detecting relationships between proteins with the same fold. The use of different amino acid classifications reflecting a large number of amino acid attributes improved the recognition of distantly related folds. We propose the use of pseudocodes representing profile information as a fast and powerful tool for homology detection, fold assignment and analysis of evolutionary information enclosed in protein profiles.
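The notion of an entropy profile computed under an amino acid classification, as mentioned in the abstract above, can be illustrated with a short Python sketch. It is not the HIP implementation: the three-class grouping, the treatment of gaps, the function names, and the toy alignment are assumptions chosen only to show how a classification changes the per-column entropy.

```python
import math
from collections import Counter

# Hypothetical three-class grouping used only for illustration.
CLASSES = {
    **dict.fromkeys("AVLIMFWC", "h"),  # hydrophobic
    **dict.fromkeys("STNQGPYH", "p"),  # polar
    **dict.fromkeys("DEKR", "c"),      # charged
}

def column_entropy(column, classes=CLASSES):
    """Shannon entropy in bits of one column after mapping residues to classes.

    Characters outside the classification (such as gaps) are ignored.
    """
    labels = [classes[res] for res in column if res in classes]
    if not labels:
        return 0.0
    total = len(labels)
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def entropy_profile(alignment, classes=CLASSES):
    """Per-column class entropy for a list of equal-length aligned sequences."""
    return [column_entropy(column, classes) for column in zip(*alignment)]

# Toy alignment (hypothetical sequences).
toy_alignment = ["AKS", "VRD", "LKT", "IEN"]
profile = entropy_profile(toy_alignment)
```

The first two toy columns vary at the residue level but stay within one class, so their class entropy is zero; only the third column mixes classes.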

13.
The sequences of four-alpha-helical bundle proteins are characterized by a pattern of hydrophilic and hydrophobic amino acids which is repeated every seven residues. At each position of the heptad repeat there are specific constraints on the amino acid properties which result from the topology of the tertiary motif. These constraints give rise to patterns of amino acid distribution which are distinct from those of other proteins. The distributions in each of the heptad positions have been determined by a statistical analysis of structural and sequence data derived from seven families of aligned protein sequences. The constitution of each position is dominated by a very small number of different amino acids, with the core positions consisting overwhelmingly of Leu and Ala. The positional preferences of the individual amino acids can be generally interpreted in terms of residue properties and topological constraints. The potential for four-alpha-helix bundle folding is reflected primarily in the pattern of residue occurrence in the heptad and not in the overall amino acid composition of the protein. Possible applications of this analysis in structure predictions, sequence alignments and in the rational design and engineering of four-alpha-helical bundle proteins are discussed.

14.
Miyazawa S. PLoS ONE 2011, 6(3): e17244

Background

Empirical substitution matrices represent the average tendencies of substitutions over various protein families by sacrificing gene-level resolution. We develop a codon-based model, in which mutational tendencies of codon, a genetic code, and the strength of selective constraints against amino acid replacements can be tailored to a given gene. First, selective constraints averaged over proteins are estimated by maximizing the likelihood of each 1-PAM matrix of empirical amino acid (JTT, WAG, and LG) and codon (KHG) substitution matrices. Then, selective constraints specific to given proteins are approximated as a linear function of those estimated from the empirical substitution matrices.

Results

Akaike information criterion (AIC) values indicate that a model allowing multiple nucleotide changes fits the empirical substitution matrices significantly better. Also, the ML estimates of transition-transversion bias obtained from these empirical matrices are not so large as previously estimated. The selective constraints are characteristic of proteins rather than species. However, their relative strengths among amino acid pairs can be approximated not to depend very much on protein families but amino acid pairs, because the present model, in which selective constraints are approximated to be a linear function of those estimated from the JTT/WAG/LG/KHG matrices, can provide a good fit to other empirical substitution matrices including cpREV for chloroplast proteins and mtREV for vertebrate mitochondrial proteins.

Conclusions/Significance

The present codon-based model with the ML estimates of selective constraints and with adjustable mutation rates of nucleotide would be useful as a simple substitution model in ML and Bayesian inferences of molecular phylogenetic trees, and enables us to obtain biologically meaningful information at both nucleotide and amino acid levels from codon and protein sequences.

15.
Here we perform a systematic exploration of the use of distance constraints derived from small angle X-ray scattering (SAXS) measurements to filter candidate protein structures for the purpose of protein structure prediction. This is an intrinsically more complex task than that of applying distance constraints derived from NMR data where the identity of the pair of amino acid residues subject to a given distance constraint is known. SAXS, on the other hand, yields a histogram of pair distances (pair distribution function), but the identities of the pairs contributing to a given bin of the histogram are not known. Our study is based on an extension of the Levitt-Hinds coarse grained approach to ab initio protein structure prediction to generate a candidate set of C(alpha) backbones. In spite of the lack of specific residue information inherent in the SAXS data, our study shows that the implementation of a SAXS filter is capable of effectively purifying the set of native structure candidates and thus provides a substantial improvement in the reliability of protein structure prediction. We test the quality of our predicted C(alpha) backbones by doing structural homology searches against the Dali domain library, and find that the results are very encouraging. In spite of the lack of local structural details and limited modeling accuracy at the C(alpha) backbone level, we find that useful information about fold classification can be extracted from this procedure. This approach thus provides a way to use a SAXS data based structure prediction algorithm to generate potential structural homologies in cases where lack of sequence homology prevents identification of candidate folds for a given protein. Thus our approach has the potential to help in determination of the biological function of a protein based on structural homology instead of sequence homology.
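The filtering idea in the abstract above, comparing candidates through a histogram of pair distances in which residue identities are lost, can be sketched in Python. This is only an illustrative sketch: the bin width, the squared-difference score, the function names, and the two toy C(alpha) traces are assumptions, and a real SAXS pair distribution function would come from scattering data rather than from a reference structure.

```python
import math
from itertools import combinations

def pair_distance_histogram(coords, bin_width=2.0, n_bins=10):
    """Normalised histogram of all pairwise distances; pair identities are discarded."""
    counts = [0] * n_bins
    for p, q in combinations(coords, 2):
        b = int(math.dist(p, q) // bin_width)
        if b < n_bins:
            counts[b] += 1
    total = sum(counts) or 1
    return [c / total for c in counts]

def histogram_mismatch(candidate_hist, target_hist):
    """Sum of squared bin differences; smaller means closer to the target."""
    return sum((a - b) ** 2 for a, b in zip(candidate_hist, target_hist))

def saxs_filter(candidates, target_hist, keep):
    """Keep the `keep` candidates whose histograms best match the target."""
    ranked = sorted(
        candidates,
        key=lambda c: histogram_mismatch(pair_distance_histogram(c), target_hist),
    )
    return ranked[:keep]

# Hypothetical C-alpha traces: a compact square and an extended line.
compact = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (3.8, 3.8, 0.0), (0.0, 3.8, 0.0)]
extended = [(3.8 * k, 0.0, 0.0) for k in range(4)]
target = pair_distance_histogram(compact)
best = saxs_filter([extended, compact], target, keep=1)
```

Because only the distance distribution is compared, two different folds with similar histograms would score alike, which is the ambiguity the abstract contrasts with NMR-derived constraints.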

16.
Due to advances in molecular biology the DNA sequences of structural genes coding for proteins are often known before a protein is characterized or even isolated. The function of a protein whose amino acid sequence has been deduced from a DNA sequence may not even be known. This has created greater interest in the development of methods to predict the tertiary structures of proteins. The a priori prediction of a protein's structure from its amino acid sequence is not yet possible. However, since proteins with similar amino acid sequences are observed to have similar three-dimensional structures, it is possible to use an analogy with a protein of known structure to draw some conclusions about the structure and properties of an uncharacterized protein. The process of predicting the tertiary structure of a protein relies very much upon computer modeling and analysis of the structure. The prediction of the structure of the bacteriophage 434 cro repressor is used as an example illustrating current procedures.

17.
What are the major forces governing protein evolution? A common view is that proteins with strong structural and functional requirements evolve more slowly than proteins with weak constraints, because a stringent negative selection pressure limits the number of substitutions. In contrast, Graur claimed that the substitution rate of a protein is mainly determined by its amino acid composition and the changeabilities of amino acids. In this paper, however, we found that the relative changeabilities of amino acids in mammalian proteins are different for transmembranal and nontransmembranal segments, which have very distinct structural requirements. This indicates that the changeability of a given residue is influenced by the structural and functional context. We also reexamined the relationship between substitution rate and amino acid composition. Indeed, the two kinds of segments exhibit contrasting amino acid compositions: transmembranal regions are made up mainly of hydrophobic residues (a total frequency of approximately 60%) and are very poor in polar amino acids (<5%), whereas nontransmembranal segments have frequencies of 30% and 22%, respectively. Interestingly, we found that within a given integral membrane protein, nontransmembranal segments accumulate, on average, twice as many substitutions as transmembranal regions. However, regression analyses showed that the variability in amino acid frequencies among proteins cannot explain more than 30% of the variability in substitution rate for the transmembranal and nontransmembranal data sets. Furthermore, transmembranal and nontransmembranal segments evolving at the same rate in different proteins have different compositions, and the compositions of slowly evolving and rapidly evolving segments of the same type are similar. 
From these observations, we conclude that the rate of protein evolution is only weakly affected by amino acid composition but is mostly determined by the strength of functional requirements or selective constraints.

18.
We present a two-step approach to modeling the transmembrane spanning helical bundles of integral membrane proteins using only sparse distance constraints, such as those derived from chemical cross-linking, dipolar EPR and FRET experiments. In Step 1, using an algorithm we developed, the conformational space of membrane protein folds matching a set of distance constraints is explored to provide initial structures for local conformational searches. In Step 2, these structures are refined against a custom penalty function that incorporates both measures derived from statistical analysis of solved membrane protein structures and distance constraints obtained from experiments. We begin by describing the statistical analysis of the solved membrane protein structures from which the theoretical portion of the penalty function was derived. We then describe the penalty function, and, using a set of six test cases, demonstrate that it is capable of distinguishing helical bundles that are close to the native bundle from those that are far from the native bundle. Finally, using a set of only 27 distance constraints extracted from the literature, we show that our method successfully recovers the structure of dark-adapted rhodopsin to within 3.2 Å of the crystal structure.

19.
Fares MA, Travers SA. Genetics 2006, 173(1): 9-23
Protein evolution depends on intramolecular coevolutionary networks whose complexity is proportional to the underlying functional and structural interactions among sites. Here we present a novel approach that vastly improves the sensitivity of previous methods for detecting coevolution through a weighted comparison of divergence between amino acid sites. The analysis of the HIV-1 Gag protein detected convergent adaptive coevolutionary events responsible for the selective variability emerging between subtypes. Coevolution analysis and functional data for heat-shock proteins, Hsp90 and GroEL, highlight that almost all detected coevolving sites are functionally or structurally important. The results support previous suggestions pinpointing the complex interdomain functional interactions within these proteins and we propose new amino acid sites as important for interdomain functional communication. Three-dimensional information sheds light on the functional and structural constraints governing the coevolution between sites. Our covariation analyses propose two types of coevolving sites in agreement with previous reports: pairs of sites spatially proximal, where compensatory mutations could maintain the local structure stability, and clusters of distant sites located in functional domains, suggesting a functional dependency between them. All sites detected under adaptive evolution in these proteins belong to coevolution groups, further underlining the importance of testing for coevolution in selective constraints analyses.

20.
We present an approach to predicting protein structural class that uses amino acid composition and hydrophobic pattern frequency information as input to two types of neural networks: (1) a three-layer back-propagation network and (2) a learning vector quantization network. The results of these methods are compared to those obtained from a modified Euclidean statistical clustering algorithm. The protein sequence data used to drive these algorithms consist of the normalized frequency of up to 20 amino acid types and six hydrophobic amino acid patterns. From these frequency values the structural class predictions for each protein (all-alpha, all-beta, or alpha-beta classes) are derived. Examples consisting of 64 previously classified proteins were randomly divided into multiple training (56 proteins) and test (8 proteins) sets. The best performing algorithm on the test sets was the learning vector quantization network using 17 inputs, obtaining a prediction accuracy of 80.2%. The Matthews correlation coefficients are statistically significant for all algorithms and all structural classes. The differences between algorithms are in general not statistically significant. These results show that information exists in protein primary sequences that is easily obtainable and useful for the prediction of protein structural class by neural networks as well as by standard statistical clustering algorithms.

Copyright © Beijing Qinyun Technology Development Co., Ltd.  京ICP备09084417号