Similar literature
20 similar documents found (search time: 15 ms)
1.
Protein topology representations such as residue contact maps are an important intermediate step towards ab initio prediction of protein structure, but the problem of predicting reliable contact maps is far from solved. One of the main pitfalls of existing contact map predictors is that they generally predict unphysical maps, i.e. maps that cannot be embedded into three-dimensional structures or, at best, violate a number of basic constraints observed in real protein structures, such as the maximum number of contacts for a residue. Here, we focus on the problem of learning to predict more "physical" contact maps. We do so by first predicting contact maps through a traditional system (XXStout), and then filtering these maps by an ensemble of artificial neural networks. The filter receives as input not only the bare predicted map, but also a number of global or long-range features extracted from it. In a rigorous cross-validation test, we show that the filter greatly improves the predicted maps it receives. CASP7 results, on which we report here, corroborate this finding. Importantly, since the approach we present here is fully modular, it may be beneficial to any other ab initio contact map predictor.

2.
To improve the prediction accuracy in the regime where template alignment quality is poor, an updated version of TASSER_2.0, namely TASSER_WT, was developed. TASSER_WT incorporates more accurate contact restraints from a new method, COMBCON. COMBCON uses confidence-weighted contacts from PROSPECTOR_3.5, the latest version, PROSPECTOR_4, and a new local structural fragment-based threading algorithm, STITCH, implemented in two variants depending on expected fragment prediction accuracy. TASSER_WT is tested on 622 Hard proteins, the most difficult targets (incorrect alignments and/or templates and incorrect side-chain contact restraints) in a comprehensive benchmark of 2591 nonhomologous, single domain proteins ≤200 residues that cover the PDB at 35% pairwise sequence identity. For 454 of 622 Hard targets, COMBCON provides contact restraints with higher accuracy and number of contacts per residue. TASSER_WT models improve as contact coverage with confidence weight ≥3 (Fwt≥3cov) increases. When Fwt≥3cov > 1.0 and > 0.4, the average root mean-square deviation of TASSER_WT (TASSER_2.0) models is 4.11 Å (6.72 Å) and 5.03 Å (6.40 Å), respectively. Regarding a structure prediction as successful when a model has a TM-score to the native structure ≥0.4, when Fwt≥3cov > 1.0 and > 0.4, the success rate of TASSER_WT (TASSER_2.0) is 98.8% (76.2%) and 93.7% (81.1%), respectively.

3.
Prediction of topological representations of proteins that are geometric invariants can contribute towards the solution of fundamental open problems in structural genomics, such as folding. In this paper we focus on coarse-grained protein contact maps, a representation that describes the spatial neighborhood relation between secondary structure elements such as helices, beta sheets, and random coils. Our methodology is based on searching the graph space. The search algorithm is guided by an adaptive evaluation function computed by a specialized noncausal recursive connectionist architecture. The neural network is trained using candidate graphs generated during examples of successful searches. Our results demonstrate the viability of the approach for predicting coarse contact maps.

4.
An algorithm for searching restriction maps (cited by 1: 0 self-citations, 1 citation by others)
This paper presents an algorithm that searches a DNA restriction enzyme map for regions that approximately match a shorter 'probe' map. Both the map and the probe consist of a sequence of address-enzyme pairs denoting restriction sites, and the algorithm penalizes a potential match for undetected or missing sites and for discrepancies in the distance between adjacent sites. The algorithm was designed specifically for comparing relatively short DNA sequences with a long restriction map, a problem that will become increasingly common as large physical maps are generated. The algorithm has been used to extract information from a restriction map of the entire Escherichia coli genome. Received on October 28, 1989; accepted on February 2, 1990
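As a rough illustration of the idea of matching a probe against a longer map of address-enzyme pairs, the sketch below slides the probe along the map and penalizes discrepancies in the distances between adjacent sites. It is a deliberately simplified assumption, not the published algorithm: it requires the enzymes to match one-to-one and does not handle undetected or missing sites, which the abstract says the real algorithm penalizes. The function name `search_map`, the threshold `max_penalty`, and all site positions are hypothetical.

```python
def search_map(long_map, probe, max_penalty):
    """Return (start_index, penalty) for each window of long_map that matches probe.

    Both inputs are lists of (address, enzyme) pairs sorted by address.
    Simplification: sites are paired one-to-one, so missing sites are not modeled.
    """
    hits = []
    m = len(probe)
    for start in range(len(long_map) - m + 1):
        window = long_map[start:start + m]
        # Require the same enzyme at every paired site.
        if any(w[1] != p[1] for w, p in zip(window, probe)):
            continue
        # Penalty: summed absolute difference of adjacent-site distances.
        penalty = sum(
            abs((window[i + 1][0] - window[i][0]) - (probe[i + 1][0] - probe[i][0]))
            for i in range(m - 1)
        )
        if penalty <= max_penalty:
            hits.append((start, penalty))
    return hits


# Illustrative, made-up data (addresses in base pairs).
long_map = [(100, "EcoRI"), (450, "BamHI"), (900, "EcoRI"),
            (1300, "HindIII"), (1760, "EcoRI"), (2200, "HindIII")]
probe = [(0, "EcoRI"), (410, "HindIII"), (860, "EcoRI")]

print(search_map(long_map, probe, max_penalty=50))  # [(2, 20)]
```

A fuller treatment would replace the one-to-one pairing with an alignment that can skip sites at a cost, which is where the penalties for undetected or missing sites would enter.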

5.
This document outlines the use of an algorithm to filter out impossible crystal-packing arrangements based on steric considerations. Within an exhaustive grid search frame, the space sample is reduced by analysis of spherical areas where atom pairs from different rigid units might clash. This technique finds areas in the state space where the global energy minimum might lie. The minimum can then be found by the usual methods of molecular modeling restricted to these particular areas. Only a tiny fraction of atom pair distances need to be tested, on average roughly one per state of the model space. For example, a crystal of three rigid molecules, each containing 12 atoms, has 3×12×12=432 atom pairs just in one unit cell, but our method needs to test on average 1 to 4 atom pairs per state. Using modern computers, about 10^12 to 10^15 models can be tested within several hours or days. For example, a crystal model with six rotational degrees of freedom (two rigid molecules in the unit cell), each with step 3°, can be tested in a few hours on a 1-GHz x86 processor-based machine. The method presented here has been implemented in the SUPRAMOL program.
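The counts quoted in this abstract can be checked with a few lines of arithmetic. The sketch below assumes that the 432 pairs are pairs of atoms taken from different molecules, and that each rotational degree of freedom spans a full 360° (an assumption; the abstract does not state the angular range). The function names are hypothetical.

```python
from math import comb


def intermolecular_atom_pairs(n_molecules: int, atoms_per_molecule: int) -> int:
    """Atom pairs whose two atoms belong to different rigid molecules."""
    return comb(n_molecules, 2) * atoms_per_molecule ** 2


def grid_states(degrees_of_freedom: int, step_deg: float, span_deg: float = 360.0) -> int:
    """Number of grid points when every degree of freedom is sampled at step_deg."""
    steps = round(span_deg / step_deg)
    return steps ** degrees_of_freedom


pairs = intermolecular_atom_pairs(3, 12)
states = grid_states(6, 3.0)
print(pairs)   # 432
print(states)  # 2985984000000
```

Under these assumptions, six rotational degrees of freedom at a 3° step give about 3×10^12 states, which sits at the lower end of the range of models the abstract says can be tested.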

6.
The prediction of the protein tertiary structure from solely its residue sequence (the so-called Protein Folding Problem) is one of the most challenging problems in Structural Bioinformatics. We focus on the protein residue contact map. When this map is assigned it is possible to reconstruct the 3D structure of the protein backbone. The general problem of recovering a set of 3D coordinates consistent with some given contact map is known as a unit-disk-graph realization problem and it has been recently proven to be NP-Hard. In this paper we describe a heuristic method (COMAR) that is able to reconstruct with an unprecedented rate (3-15 seconds) a 3D model that exactly matches the target contact map of a protein. Working with a non-redundant set of 1760 proteins, we find that the scoring efficiency of finding a 3D model very close to the protein native structure depends on the threshold value adopted to compute the protein residue contact map. Contact maps whose threshold values range from 10 to 18 Ångstroms allow reconstructing 3D models that are very similar to the protein's native structure.
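The forward direction of this problem, computing a contact map from known coordinates at a chosen threshold, is simple and may help make the role of the threshold concrete. The sketch below is an illustration only; it is not COMAR, which solves the much harder inverse problem. The function name `contact_map` and the toy coordinates (four points spaced 3.8 Å apart on a line) are assumptions.

```python
from math import dist


def contact_map(coords, threshold):
    """Binary contact map: 1 where two distinct residues lie within threshold (in Å)."""
    n = len(coords)
    return [
        [1 if i != j and dist(coords[i], coords[j]) <= threshold else 0
         for j in range(n)]
        for i in range(n)
    ]


# Toy "backbone": four points on a straight line, 3.8 Å apart.
coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]

for row in contact_map(coords, threshold=8.0):
    print(row)
# [0, 1, 1, 0]
# [1, 0, 1, 1]
# [1, 1, 0, 1]
# [0, 1, 1, 0]
```

Raising the threshold adds contacts (at 12 Å every pair in this toy example is in contact), so the map encodes more distance constraints, which is consistent with the abstract's finding that reconstruction quality depends on the threshold.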

7.
In recent years, small-world behavior has been extensively described for proteins, when they are represented by the undirected graph defined by the inter-residue protein contacts. By adopting this representation it was possible to compute the average clustering coefficient (C) and characteristic path length (L) of protein structures, and their values were found to be similar to those of graphs characterized by small-world topology. In this comment, we analyze a large set of non-redundant protein structures (1753) and show that by randomly mimicking the protein collapse, the covalent structure of the protein chain significantly contributes to the small-world behavior of the inter-residue contact graphs. When protein graphs are generated, imposing constraints similar to those induced by the backbone connectivity, their characteristic path lengths and clustering coefficients are indistinguishable from those computed using the real contact maps, showing that L and C values cannot be used for 'protein fingerprinting'. Moreover, we verified that these results are independent of the selected protein representations, residue composition and protein secondary structures.
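For readers unfamiliar with the two quantities, the sketch below computes the average clustering coefficient C and the characteristic path length L of a small undirected graph stored as an adjacency dictionary. It uses the standard textbook definitions (nodes with fewer than two neighbors contribute 0 to C; L averages shortest-path lengths over connected pairs), which are assumed here and may differ in detail from the conventions used in the paper. The function names and the toy graph are hypothetical.

```python
from collections import deque


def clustering_coefficient(adj):
    """Average over nodes of (links among neighbors) / (possible links among neighbors)."""
    total = 0.0
    for nbrs in adj.values():
        nb = list(nbrs)
        k = len(nb)
        if k < 2:
            continue  # such nodes contribute 0
        links = sum(1 for a in range(k) for b in range(a + 1, k) if nb[b] in adj[nb[a]])
        total += 2.0 * links / (k * (k - 1))
    return total / len(adj)


def characteristic_path_length(adj):
    """Mean shortest-path length over all connected ordered pairs (breadth-first search)."""
    total, pairs = 0, 0
    for source in adj:
        seen = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in seen:
                    seen[v] = seen[u] + 1
                    queue.append(v)
        total += sum(seen.values())
        pairs += len(seen) - 1
    return total / pairs if pairs else 0.0


# Toy contact graph: a triangle (0, 1, 2) with a pendant node 3 attached to node 2.
adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
C = clustering_coefficient(adj)      # 7/12
L = characteristic_path_length(adj)  # 4/3
print(round(C, 4), round(L, 4))
```

In a protein contact graph the nodes would be residues and the edges the contacts, for example the nonzero entries of a contact map.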

8.
9.
10.
11.
Reconstructing protein structure based on contact maps leads to two types of models: properly oriented models and mirror models. This is due to the fact that contact maps do not include information on protein chirality. Therefore, both types of model orientations share the same contact map and are geometrically allowed. In this work, we verified the hypothesis that some of the energy terms calculated by PyRosetta could be useful to distinguish between properly oriented and mirror models. We studied 440 models of all-alpha protein domains reconstructed manually from their contact maps, where 50 % of the models were properly oriented and 50 % had mirror orientation. We showed that dihedral angles and energy terms, based on the probability of specific geometrical arrangement of the residues, differed significantly for properly oriented and mirror models.
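The claim that a model and its mirror image share the same contact map follows from the fact that reflection preserves all pairwise distances. The sketch below illustrates this on four made-up points and shows that a signed volume (a scalar triple product, used here as a stand-in for a chirality-sensitive quantity such as a dihedral angle) changes sign under reflection. It does not use PyRosetta; all names and coordinates are assumptions.

```python
from math import dist


def mirror(coords):
    """Reflect every point through the plane x = 0."""
    return [(-x, y, z) for x, y, z in coords]


def distance_matrix(coords):
    """All pairwise Euclidean distances."""
    return [[dist(a, b) for b in coords] for a in coords]


def signed_volume(p0, p1, p2, p3):
    """Scalar triple product (p1-p0) . ((p2-p0) x (p3-p0)); its sign encodes handedness."""
    a = [p1[i] - p0[i] for i in range(3)]
    b = [p2[i] - p0[i] for i in range(3)]
    c = [p3[i] - p0[i] for i in range(3)]
    cross = (b[1] * c[2] - b[2] * c[1],
             b[2] * c[0] - b[0] * c[2],
             b[0] * c[1] - b[1] * c[0])
    return a[0] * cross[0] + a[1] * cross[1] + a[2] * cross[2]


model = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
mirrored = mirror(model)

same_distances = distance_matrix(model) == distance_matrix(mirrored)
print(same_distances)            # True
print(signed_volume(*model))     # 1.0
print(signed_volume(*mirrored))  # -1.0
```

Because every distance is unchanged, any threshold-based contact map is identical for the two models, so only quantities that are sensitive to handedness can tell them apart.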

12.
The excluded volume occupied by protein side-chains and the requirement of high packing density in the protein interior should severely limit the number of side-chain conformations compatible with a given native backbone. To examine the relationship between side-chain geometry and side-chain packing, we use an all-atom Monte Carlo simulation to sample the large space of side-chain conformations. We study three models of excluded volume and use umbrella sampling to effectively explore the entire space. We find that while excluded volume constraints reduce the size of conformational space by many orders of magnitude, the number of allowed conformations is still large. An average repacked conformation has 20 % of its chi angles in a non-native state, a marked reduction from the expected 67 % in the absence of excluded volume. Interestingly, well-packed conformations with up to 50 % non-native chi angles exist. The repacked conformations have native packing density as measured by a standard Voronoi procedure. Entropy is distributed non-uniformly over positions, and we partially explain the observed distribution using rotamer probabilities derived from the Protein Data Bank database. In several cases, native rotamers that occur infrequently in the database are seen with high probability in our simulation, indicating that sequence-specific excluded volume interactions can stabilize rotamers that are rare for a given backbone. In spite of our finding that 65 % of the native rotamers and 85 % of chi(1) angles can be predicted correctly on the basis of excluded volume only, 95 % of positions can accommodate more than one rotamer in simulation. We estimate that, in order to quench the side-chain entropy observed in the presence of excluded volume interactions, other interactions (hydrophobic, polar, electrostatic) must provide an additional stabilization of at least 0.6 kT per residue in order to single out the native state.

13.
MOTIVATION: Local structure segments (LSSs) are small structural units shared by unrelated proteins. They are extensively used in protein structure comparison, and predicted LSSs (PLSSs) are used very successfully in ab initio folding simulations. However, predicted or real LSSs are rarely exploited by protein sequence comparison programs that are based on position-by-position alignments. RESULTS: We developed a SEgment Alignment algorithm (SEA) to compare proteins described as a collection of predicted local structure segments (PLSSs), which is equivalent to an unweighted graph (network). Any specific structure, real or predicted, corresponds to a specific path in this network. SEA then uses a network matching approach to find the two most similar paths in networks representing two proteins. SEA explores the uncertainty and diversity of predicted local structure information to search for a globally optimal solution. It simultaneously solves two related problems: the alignment of two proteins and the local structure prediction for each of them. On a benchmark of protein pairs with low sequence similarity, we show that application of the SEA algorithm improves alignment quality as compared to FFAS profile-profile alignment, and in some cases SEA alignments can match the structural alignments, a feat previously impossible for any sequence-based alignment method.

14.
The derivation and characterization of a neuroattenuated reassortant clone (RFC 25/B.5) of California serogroup bunyavirus was described previously (M. J. Endres, A. Valsamakis, F. Gonzalez-Scarano, and N. Nathanson, J. Virol. 64:1927-1933, 1990). To map the RNA segment responsible for this attenuation, a panel of reassortants was constructed between the attenuated clone B.5 (genotype TLL) and a virulent clone (B1-1a) of reciprocal genotype (LTT). Parent viruses and clones representing all of the six possible reassortants were examined for neurovirulence by intracerebral injection in adult mice. Reassortants bearing the large RNA segment from the virulent parent were almost as virulent as the virulent parent virus, while reassortants bearing the large RNA segment from the avirulent parent virus exhibited low or intermediate virulence. These results indicate that the large RNA segment is the major determinant of neuroattenuation of clone B.5. In addition to its neuroattenuation, clone B.5 was temperature sensitive and exhibited an altered plaque morphology. These phenotypes also segregated with the large RNA segment. The importance of the large RNA segment (which encodes the viral polymerase) in neurovirulence contrasts with prior studies which indicate that the ability to cause lethal encephalitis after peripheral injection of suckling mice (neuroinvasiveness) is primarily determined by the middle-sized RNA segment, which encodes the viral glycoproteins.

15.

Background

Conformational flexibility creates errors in the comparison of protein structures. Even small changes in backbone or sidechain conformation can radically alter the shape of ligand binding cavities. These changes can cause structure comparison programs to overlook functionally related proteins with remote evolutionary similarities, and cause others to incorrectly conclude that closely related proteins have different binding preferences, when their specificities are actually similar. Towards the latter effort, this paper applies protein structure prediction algorithms to enhance the classification of homologous proteins according to their binding preferences, despite radical conformational differences.

Methods

Specifically, structure prediction algorithms can be used to "remodel" existing structures against the same template. This process can return proteins in very different conformations to similar, objectively comparable states. Operating on close homologs exploits the accuracy of structure predictions on closely related proteins, but structure prediction is often a nondeterministic process. Identical inputs can generate subtly different models with very different binding cavities that make structure comparison difficult. We present a first method to mitigate such errors, called "medial remodeling", that examines a large number of predicted structures to eliminate extreme models of the same binding cavity.

Results

Our results, on the enolase and tyrosine kinase superfamilies, demonstrate that remodeling can enable proteins in very different conformations to be returned to states that can be objectively compared. Structures that would have been erroneously classified as having different binding preferences were often correctly classified after remodeling, while structures that would have been correctly classified as having different binding preferences almost always remained distinct. The enolase superfamily, which exhibited less sequential diversity than the tyrosine kinase superfamily, was classified more accurately after remodeling than the tyrosine kinases. Medial remodeling reduced errors from models with unusual perturbations that distort the shape of the binding site, enhancing classification accuracy.

Conclusions

This paper demonstrates that protein structure prediction can compensate for conformational variety in the comparison of protein-ligand binding sites. While protein structure prediction introduces new uncertainties into the structure comparison problem, our results indicate that unusual models can be ignored through an analysis of many models, using techniques like medial remodeling. These results point to applications of protein structure comparison that extend beyond existing crystal structures.

16.
In this retrospective study we analysed changes of the ST segment in patients with arterial hypertension using multi-lead body surface mapping of the electric heart field, as the ST segment often shows non-specific changes and is influenced by many different conditions. We constructed isointegral maps (IIM) of chosen intervals (the first 35 ms, the first 80 ms, and the whole ST segment) in 42 patients with arterial hypertension (with and without left ventricular hypertrophy) and in the control group involving 23 healthy persons. We analysed the position and values of map extrema. Spatial distribution of voltage integrals was similar in the control group and in the "pure" hypertensives. Patients with left ventricular hypertrophy exhibited shifts of the integral minima. Contrary to our expectations, the highest extrema values were found in the control group and not in the left ventricular hypertrophy group. The extrema values were similar in all hypertensives, with or without left ventricular hypertrophy. Differences could be explained neither by the influence of age, nor by the body habitus.

17.
Genome annotation in differently evolved organisms presents challenges because the lack of sequence-based homology limits the ability to determine the function of putative coding regions. To provide an alternative to annotation by sequence homology, we developed a method that takes advantage of unusual trypanosomatid biology and skews in nucleotide composition between coding regions and upstream regions to rank putative open reading frames based on the likelihood of coding. The method is 93% accurate when tested on known genes. We have applied our method to the full complement of open reading frames on Chromosome I of Trypanosoma brucei, and we can predict with high confidence that 226 putative coding regions are likely to be functional. Methods such as the one described here for discriminating true coding regions are critical for genome annotation when other sources of evidence for function are limited.

18.
The aim of our work was to study the opposite polarity of the PQ segment to the P wave body surface potential maps in different groups of patients. We constructed isointegral maps (IIM) in 26 healthy controls (C), 16 hypertensives (HT), 26 patients with arterial hypertension and left ventricular hypertrophy (LVH) and 15 patients with myocardial infarction (MI). We analyzed values and positions of map extrema and compared the polarity of maps using the correlation coefficient. The IIM P maxima appeared mainly over the precordium, the minima mainly in the right subclavicular area. The highest maxima were in the MI group, being significantly higher than in the HT and LVH groups. No differences concerning any values of other extrema were significant. The IIM PQ maxima were distributed over the upper half of the chest; the minima mainly over the middle sternum. A statistically significant opposite polarity between the IIM P and IIM PQ was found in 80 % of cases. The opposite polarity of the P wave and the PQ segment was proved in isointegral body surface maps. The extrema occurred in areas not examined by the standard chest leads. This has to be considered for diagnostic purposes.

19.
The prediction of protein side-chain conformation is central for understanding protein functions. Side-chain packing is a sub-problem of protein folding and its computational complexity has been shown to be NP-hard. We investigated the capabilities of a hybrid (genetic algorithm/simulated annealing) technique for side-chain packing and for the generation of an ensemble of low energy side-chain conformations. Our method first relies on obtaining a near-optimal low energy protein conformation by optimizing its amino-acid side-chains. Upon convergence, the genetic algorithm is allowed to undergo forward and “backward” evolution by alternating selection pressures between minimal and higher energy setpoints. We show that this technique is very efficient for obtaining distributions of solutions centered at any desired energy from the minimum. We outline the general concepts of our evolutionary sampling methodology using three different alternating selective pressure schemes. Quality of the method was assessed by using it for protein pK(a) prediction.

20.