Found 20 similar documents (search time: 0 ms)
1.
Database of homology-derived protein structures and the structural meaning of sequence alignment Cited by: 85 (self-citations: 0, citations by others: 85)
The database of known protein three-dimensional structures can be significantly increased by the use of sequence homology, based on the following observations. (1) The database of known sequences, currently at more than 12,000 proteins, is two orders of magnitude larger than the database of known structures. (2) The currently most powerful method of predicting protein structures is model building by homology. (3) Structural homology can be inferred from the level of sequence similarity. (4) The threshold of sequence similarity sufficient for structural homology depends strongly on the length of the alignment. Here, we first quantify the relation between sequence similarity, structure similarity, and alignment length by an exhaustive survey of alignments between proteins of known structure and report a homology threshold curve as a function of alignment length. We then produce a database of homology-derived secondary structure of proteins (HSSP) by aligning to each protein of known structure all sequences deemed homologous on the basis of the threshold curve. For each known protein structure, the derived database contains the aligned sequences, secondary structure, sequence variability, and sequence profile. Tertiary structures of the aligned sequences are implied, but not modeled explicitly. The database effectively increases the number of known protein structures by a factor of five to more than 1800. The results may be useful in assessing the structural significance of matches in sequence database searches, in deriving preferences and patterns for structure prediction, in elucidating the structural role of conserved residues, and in modeling three-dimensional detail by homology.
2.
Costantini S Colonna G Facchiano AM 《Biochemical and biophysical research communications》2006,342(2):441-451
Amino acid propensities for secondary structures have been used since the 1970s, when Chou and Fasman evaluated them within datasets of a few tens of proteins and developed a method to predict the secondary structure of proteins, still in use despite prediction methods having evolved to very different approaches and higher reliability. Propensity for secondary structures represents an intrinsic property of amino acids, and it is used for generating new algorithms and prediction methods. Our work therefore aimed to investigate which protein dataset is best for evaluating the amino acid propensities: larger but not homogeneous sets, or smaller but homogeneous sets, i.e., all-alpha, all-beta, alpha-beta proteins. As a first analysis, we evaluated amino acid propensities for helix, beta-strand, and coil in more than 2000 proteins from the PDBselect dataset. With these propensities, secondary structure predictions performed with a method very similar to that of Chou and Fasman gave us results better than the original one, based on propensities derived from the few tens of X-ray protein structures available in the 1970s. In a refined analysis, we subdivided the PDBselect dataset of proteins into three secondary structural classes, i.e., all-alpha, all-beta, and alpha-beta proteins. For each class, the amino acid propensities for helix, beta-strand, and coil have been calculated and used to predict secondary structure elements for proteins belonging to the same class by using resubstitution and jackknife tests. This second round of predictions further improved the results of the first round. Therefore, amino acid propensities for secondary structures became more reliable depending on the degree of homogeneity of the protein dataset used to evaluate them. Indeed, our results also indicate that all algorithms using propensities for secondary structure can still be improved to obtain better predictive results.
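Illustrative note: the abstract above does not give a formula, so the following is only a minimal sketch of how a Chou-Fasman-style propensity is commonly computed, assuming the usual definition (the fraction of a residue type found in a state, divided by the overall fraction of residues in that state). The function name `propensities` and the one-letter state codes `H`, `E`, `C` are assumptions for illustration, not taken from the paper.

```python
from collections import Counter

def propensities(sequences, structures, states="HEC"):
    """Return {(residue, state): propensity} from aligned sequence/state strings.

    Assumed definition: P(aa, s) = (n(aa, s) / n(aa)) / (n(s) / N),
    where N is the total number of residues in the dataset.
    """
    aa_counts = Counter()      # n(aa)
    state_counts = Counter()   # n(s)
    pair_counts = Counter()    # n(aa, s)
    for seq, ss in zip(sequences, structures):
        if len(seq) != len(ss):
            raise ValueError("sequence and structure strings must have equal length")
        for aa, s in zip(seq, ss):
            aa_counts[aa] += 1
            state_counts[s] += 1
            pair_counts[(aa, s)] += 1

    total = sum(aa_counts.values())
    table = {}
    for aa, n_aa in aa_counts.items():
        for s in states:
            n_s = state_counts[s]
            if n_s == 0:
                continue  # state never observed: propensity is undefined, so skip it
            table[(aa, s)] = (pair_counts[(aa, s)] / n_aa) / (n_s / total)
    return table
```

A value above 1 means the residue is over-represented in that state relative to the dataset average. Computing the table separately on each structural class (all-alpha, all-beta, alpha-beta) would correspond to the class-specific propensities the abstract describes.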
3.
Heteronuclear three-dimensional NMR spectroscopy of a partially denatured protein: The A-state of human ubiquitin Cited by: 6 (self-citations: 0, citations by others: 6)
Summary: Human ubiquitin is a 76-residue protein that serves as a protein degradation signal when conjugated to another protein. Ubiquitin has been shown to exist in at least three states: native (N-state), unfolded (U-state), and, when dissolved in 60% methanol:40% water at pH 2.0, partially folded (A-state). If the A-state represents an intermediate in the folding pathway of ubiquitin, comparison of the known structure of the N-state with that of the A-state may lead to an understanding of the folding pathway. Insights into the structural basis for ubiquitin's role in protein degradation may also be obtained. To this end we determined the secondary structure of the A-state using heteronuclear three-dimensional NMR spectroscopy of uniformly 15N-enriched ubiquitin. Sequence-specific 1H and 15N resonance assignments were made for more than 90% of the residues in the A-state. The assignments were made by concerted analysis of three-dimensional 1H-15N NOESY-HMQC and TOCSY-HMQC data sets. Because of 1H chemical shift degeneracies, the increased resolution provided by the 15N dimension was critical. Analysis of short- and long-range NOEs indicated that only the first two strands of β-sheet, comprising residues 2–17, remain in the A-state, compared to five strands in the N-state. NOEs indicative of an α-helix, comprising residues 25–33, were also identified. These residues were also helical in the N-state. In the N-state, residues in this helix were in contact with residues from the first two strands of β-sheet. It is likely, therefore, that residues 1–33 comprise a folded domain in the A-state of ubiquitin. On the basis of 1H chemical shifts and weak short-range NOEs, residues 34–76 do not adopt a rigid secondary structure but favor a helical conformation. This observation may be related to the helix-inducing effects of the methanol present.
The secondary structure presented here differs from and is more thorough than that determined previously by two-dimensional 1H methods [Harding et al. (1991) Biochemistry, 30, 3120–3128].
4.
A novel automated method for the optimal placement of polar hydrogens in a protein structure is presented. The algorithm adds initially, to a protein data bank file of the protein, nonrotatable hydrogens such as peptide backbone hydrogens according to geometric considerations. Then, water protons and polar side chain protons of lysine, serine, threonine, tyrosine, aspartic acid, glutamic acid, and the C and N termini of a protein are added according to energy considerations. A unique stochastic approach has been developed to overcome a combinatorial explosion in the search for the lowest energy structure. First, the system is divided into ensembles. Each ensemble is treated separately: N conformations are sampled at random, their energies computed, whereas common components of high-energy combinations are gathered on one hand, and low-energy combinations on the other. Components that yield only high-energy conformations and do not contribute to any low energies are excluded. This is reiterated while the total amount of combinations is decreased along the iterative process. When the total number of combinations is lower than a user defined threshold, all remaining combinations are evaluated by exhaustive search. Energy evaluations use nonbonding energy expressions alone. The program was tested on five high-resolution crystal structures: bovine pancreatic trypsin inhibitor (Brookhaven Protein Data Bank file 5PTI), RNase-A (5RSA), trypsin (1NTP), and carbon monoxymyoglobin (2MB5), for which neutron diffraction structures are available, as well as phosphate binding protein (1IXH) for which very high resolution X-ray crystallography was used. The low RMS values prove the efficiency of this algorithm as a tool for positioning protons in proteins. It may be used for other biological structures.
5.
6.
A fast method of comparing protein structures Cited by: 1 (self-citations: 0, citations by others: 1)
M R Murthy 《FEBS letters》1984,168(1):97-102
Comparative studies on protein structures form an integral part of protein crystallography. Here, a fast method of comparing protein structures is presented. Protein structures are represented as a set of secondary structural elements. The method also provides information regarding preferred packing arrangements and evolutionary dynamics of secondary structural elements. This information is not easily obtained from previous methods. In contrast to those methods, the present one can be used only for proteins with some secondary structure. The method is illustrated with globin folds, cytochromes and dehydrogenases as examples.
7.
We describe a method to identify protein domain boundaries from sequence information alone based on the assumption that hydrophobic residues cluster together in space. SnapDRAGON is a suite of programs developed to predict domain boundaries based on the consistency observed in a set of alternative ab initio three-dimensional (3D) models generated for a given protein multiple sequence alignment. This is achieved by running a distance geometry-based folding technique in conjunction with a 3D-domain assignment algorithm. The overall accuracy of our method in predicting the number of domains for a non-redundant data set of 414 multiple alignments, representing 185 single and 231 multiple-domain proteins, is 72.4 %. Using domain linker regions observed in the tertiary structures associated with each query alignment as the standard of truth, inter-domain boundary positions are delineated with an accuracy of 63.9 % for proteins comprising continuous domains only, and 35.4 % for proteins with discontinuous domains. Overall, domain boundaries are delineated with an accuracy of 51.8 %. The prediction accuracy values are independent of the pair-wise sequence similarities within each of the alignments. These results demonstrate the capability of our method to delineate domains in protein sequences associated with a wide variety of structural domain organisation.
8.
Brian R. Ginn 《Journal of theoretical biology》2010,265(4):554-564
The offspring of closely related parents often suffer from inbreeding depression, sometimes resulting in a slower growth rate for inbred offspring relative to non-inbred offspring. Previous research has shown that some of the slower growth rate of inbred organisms can be attributed to the inbred organisms’ increased levels of protein turnover. This paper attempts to show that the higher levels of protein turnover among inbred organisms can be attributed to accumulations of misfolded and aggregated proteins that require degradation by the inbred organisms’ protein quality control systems. The accumulation of misfolded and aggregated proteins within inbred organisms is the result of more negative free energies of folding for proteins encoded at homozygous gene loci and higher concentrations of potentially aggregating non-native protein species within the cell. The theory presented here makes several quantitative predictions that suggest a connection between protein misfolding/aggregation and polyploidy that can be tested by future research.
9.
10.
We discuss the potential for inert biopolymers existing in cells to play a role in regulating the macromolecular crowding effect via their ability to undergo shape changing structural transitions. We have explored this possibility by the use of theory and experiment. The theoretical component utilized Monte-Carlo based simulations to examine the folding of a hypothetical protein in a concentrated environment of hard spheres which are themselves capable of reversible expansion and contraction. The experimental component of the study involved examination of the effect of different sized crowding agents on the thermally induced denaturation of cytochrome c [in phosphate buffered saline solution containing 1.0M guanidinium hydrochloride at pH 7.0]. On the basis of our findings we suggest that in a crowded solution environment the presence of a non-reactive polymer capable of reversible expansion/contraction via folding and unfolding may alter the excluded volume component of the solution. This ability would confer on the non-reactive polymer a novel role in influencing other processes in solution affected by macromolecular crowding.
11.
Zbilut JP Chua GH Krishnan A Bossa C Colafranceschi M Giuliani A 《FEBS letters》2006,580(20):4861-4864
Some research has suggested that patches of six constitute an important amino acid window length in proteins for conveying information. We present database evidence that supports this conjecture, as well as additional recurrence-based data that characterization and quantification of these words affect the folding/aggregation features of proteins. Other indirect evidence is presented and discussed.
12.
Structural features of protein folding nuclei Cited by: 1 (self-citations: 0, citations by others: 1)
A crucial event of protein folding is the formation of a folding nucleus. We demonstrate the presence of a considerable coincidence between the location of folding nuclei and the location of so-called "root structural motifs", which have unique overall folds and handedness. In the case of proteins with a single root structural motif, the involvement in the formation of a folding nucleus is on average significantly higher for amino acid residues that are in root structural motifs, compared to residues in other parts of the protein. The tests carried out revealed that the observed difference is statistically reliable. Thus, a structural feature that corresponds to the protein folding nucleus is now found.
13.
The properties of hemoproteins strictly depend on the type and orientation of axial ligands. Here, the orientations of axially coordinated His in bis-His complexes and the heme geometry in the Protein Data Bank have been analyzed. The effect of the bis-histidyl formation on the heme cavity of Antarctic fish hemoglobins has also been evaluated. The results show that the protein matrix exerts a major effect on the conformation of axially ligated histidines: the imidazoles in bis-His complexes occupy a preferred relative orientation in globins and in model systems, whereas they adopt a variety of relative orientations in other hemoproteins. The bis-histidyl adducts affect the heme geometry, inducing larger distortions from planarity with respect to other ligands. These deviations are larger in bis-His multiheme cytochromes than in globins. In Antarctic fish hemoglobins the bis-histidyl adduct preferentially adopts a distorted coordination and the formation of the bis-His complex induces a slight but significant modification in the shape, area and volume of the heme cavity.
14.
Secondary structure-based profiles: use of structure-conserving scoring tables in searching protein sequence databases for structural similarities Cited by: 11 (self-citations: 0, citations by others: 11)
The profile method, for detecting distantly related proteins by sequence comparison, has been extended to incorporate secondary structure information from known X-ray structures. The sequence of a known structure is aligned to sequences of other members of a given folding class. From the known structure, the secondary structure (alpha-helix, beta-strand or "other") is assigned to each position of the aligned sequences. As in the standard profile method, a position-dependent scoring table, termed a profile, is calculated from the aligned sequences. However, rather than using the standard Dayhoff mutation table in calculating the profile, we use distinct amino acid mutation tables for residues in alpha-helices, beta-strands or other secondary structures to calculate the profile. In addition, we also distinguish between internal and external residues. With this new secondary structure-based profile method, we created a profile for eight-stranded, antiparallel beta barrels of the insecticyanin folding class. It is based on the sequences of retinol-binding protein, insecticyanin and beta-lactoglobulin. Scanning the sequence database with this profile, it was possible to detect the sequence of avidin. The structure of streptavidin is known, and it appears to be distantly related to the antiparallel beta barrels. Also detected is the sequence of complement component C8, which we therefore predict to be a member of this folding class.
15.
We evaluated the occurrence frequency of i-peptides in the protein sequences of two datasets, which include proteins with a sequence similarity lower than 25% and 40%, respectively. We worked out a new structural class prediction algorithm using the most frequent i-peptides (with i=2, 3, 4), which characterize the four structural classes. The best results were obtained with the tri-peptides, which capture much more structural information from sequences than the di-peptides. Compared to the other methods similarly founded on peptide occurrence frequencies, our method achieves the best prediction accuracy. We also compared it with methods founded on more sophisticated computational approaches.
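Illustrative note: the abstract above does not specify how occurrence frequencies are computed, so the following is a minimal sketch assuming that an i-peptide is any window of i consecutive residues and that frequencies are normalised by the number of windows in the sequence. The function name `ipeptide_frequencies` is hypothetical.

```python
from collections import Counter

def ipeptide_frequencies(sequence, i):
    """Return {i-peptide: relative frequency} over all overlapping windows.

    Assumption: windows overlap (step of one residue); the paper may count differently.
    """
    if i < 1:
        raise ValueError("i must be a positive integer")
    # Every run of i consecutive residues; empty when the sequence is shorter than i.
    windows = [sequence[k:k + i] for k in range(len(sequence) - i + 1)]
    if not windows:
        return {}
    counts = Counter(windows)
    return {peptide: n / len(windows) for peptide, n in counts.items()}
```

For example, `ipeptide_frequencies("AAAB", 2)` sees the windows `AA`, `AA`, `AB`. Frequency tables like this, computed for i = 2, 3, 4 on each structural class, would be one plausible input to a class predictor of the kind described.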
16.
R. Balabsubramanian G. Raghunathan 《International journal of biological macromolecules》1982,4(6):377-378
The distribution of regular secondary structures, viz. α-helices and β-strands, along the length of over 70 proteins whose secondary structural details have been reported, has been analysed. The occurrence of these regular structures tends to be a maximum at the N- and C-termini. Our analysis suggests that both these free ends could possibly serve as nucleating centers for secondary structures and could play an important role in the folding of proteins.
17.
A fuzzy cluster method is presented to recognize protein domains. This algorithm can identify domains globally. A protein
structure set was used to test the algorithm. Among 219 proteins, 66.7% yielded results that agreed with the reference definitions,
30.6% showed minor differences, and only 2.7% (six proteins) showed major differences with the reference. The new method is
more than 20 times faster than previous algorithms.
Received: 9 November 1998 / Revised version: 20 December 1999 / Accepted: 20 December 1999
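Illustrative note: the abstract above does not say which fuzzy clustering scheme is used. As a stand-in, the sketch below shows the membership step of the standard fuzzy c-means algorithm, in which each point (for instance a residue's Cα coordinate) receives a graded membership in every cluster centre instead of a hard assignment. The function name `fuzzy_memberships`, the fuzzifier default `m=2.0`, and the use of plain Euclidean distance are all assumptions, not details from the paper.

```python
import math

def fuzzy_memberships(points, centers, m=2.0):
    """Fuzzy c-means membership update: u_ij = 1 / sum_k (d_ij / d_ik)^(2/(m-1)).

    points, centers: sequences of coordinate tuples; m > 1 is the fuzzifier.
    Returns one row per point; each row sums to 1 across the centres.
    """
    if m <= 1.0:
        raise ValueError("fuzzifier m must be greater than 1")
    exponent = 2.0 / (m - 1.0)
    memberships = []
    for p in points:
        dists = [math.dist(p, c) for c in centers]
        if any(d == 0.0 for d in dists):
            # Point coincides with one or more centres: share full membership among them.
            hits = [1.0 if d == 0.0 else 0.0 for d in dists]
            memberships.append([h / sum(hits) for h in hits])
            continue
        row = [1.0 / sum((d_i / d_k) ** exponent for d_k in dists) for d_i in dists]
        memberships.append(row)
    return memberships
```

In a full fuzzy c-means loop this step alternates with recomputing the centres as membership-weighted means until the memberships stop changing. Residues whose memberships are split between two centres would be natural candidates for domain boundary regions.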
18.
Jahandideh S Abdolmaleki P Jahandideh M Hayatshahi SH 《Journal of theoretical biology》2007,244(2):275-281
Due to the increasing gap between structure-determined and sequenced proteins, prediction of protein structural classes has been an important problem. It is very important to use efficient sequential parameters for developing class predictors because of the close sequence-structure relationship. The multinomial logistic regression model was used for the first time to evaluate the contribution of sequence parameters in determining the protein structural class. An in-house program generated parameters including single amino acid and all dipeptide composition frequencies. Then, the most effective parameters were selected by a multinomial logistic regression. Selected variables in the multinomial logistic model were Valine among single amino acid composition frequencies and Ala-Gly, Cys-Arg, Asp-Cys, Glu-Tyr, Gly-Glu, His-Tyr, Lys-Lys, Leu-Asp, Leu-Arg, Pro-Cys, Gln-Met, Gln-Thr, Ser-Trp, Val-Asn and Trp-Asn among dipeptide composition frequencies. Also a neural network model was constructed and fed by the parameters selected by multinomial logistic regression to build a hybrid predictor. In this study, self-consistency and jackknife tests on a database constructed by Zhou [1998. An intriguing controversy over protein structural class prediction. J. Protein Chem. 17(8), 729-738] containing 498 proteins are used to verify the performance of this hybrid method, and are compared with some prior works. The results showed that our two-stage hybrid model approach is very promising and may play a complementary role to the existing powerful approaches.
19.
Anna Rutkowska Nandhakishore Rajagopalan Peter Schmieder Hartmut Oschkinat 《FEBS letters》2009,583(14):2407-359
Here we present a method to purify large amounts of highly pure and stably arrested ribosome-nascent chain complexes (RNCs) from Escherichia coli cells. It relies on the combined use of translation-arrest sequences to generate nascent polypeptides of specified length and subsequent tag purification of the RNCs. Moreover, we adapted this method for the in vivo production of RNCs with specific isotope labeling of the nascent chains for nuclear magnetic resonance (NMR) studies. This method therefore opens possibilities for a wide range of biochemical and structural studies exploring conformations of nascent chains during the early steps of protein folding and targeting.
20.
Burd HJ 《Biomechanics and modeling in mechanobiology》2009,8(3):217-231
Published data on the mechanical performance of the human lens capsule when tested under uniaxial and biaxial conditions are
reviewed. It is concluded that two simple phenomenological constitutive models (namely a linear elastic model and a Fung-type
hyperelastic model) are unable to provide satisfactory representations of the mechanical behaviour of the capsule for both
of these loading conditions. The possibility of resolving these difficulties using a structural constitutive model for the
capsule, of a form that is inspired by the network of collagen IV filaments that exist within the lens capsule, is explored.
The model is implemented within a rectangular periodic cell. Prescribed stretches are imposed on the periodic cell and the
network is allowed to deform in a non-affine manner. The performance of the constitutive model correlates well with previously
published test data. One possible application of the model is in the development of a multi-scale analysis of the mechanics
of the human lens capsule.