Similar articles
 Found 20 similar articles (search time: 15 ms)
1.
We introduce a novel contact prediction method that achieves high prediction accuracy by combining evolutionary and physicochemical information about native contacts. We obtain evolutionary information from multiple-sequence alignments and physicochemical information from predicted ab initio protein structures. These structures represent low-energy states in an energy landscape and thus capture the physicochemical information encoded in the energy function. Such low-energy structures are likely to contain native contacts, even if their overall fold is not native. To differentiate native from non-native contacts in those structures, we develop a graph-based representation of the structural context of contacts. We then use this representation to train a support vector machine classifier to identify the most likely native contacts in otherwise non-native structures. The resulting contact predictions are highly accurate. Because we combine two sources of information, evolutionary and physicochemical, we maintain prediction accuracy even when only a few sequence homologs are present. We show that the predicted contacts help to improve ab initio structure prediction. A web service is available at http://compbio.robotics.tu-berlin.de/epc-map/.

2.
3.
4.
Proteins have many functions, and predicting these is still one of the major challenges in theoretical biophysics and bioinformatics. Foremost amongst these functions is the need to fold correctly, thereby allowing the other genetically dictated tasks that the protein has to carry out to proceed efficiently. In this work, some earlier algorithms for predicting protein domain folds are revisited and compared with more recently developed methods. In dealing with intractable problems such as fold prediction, when different algorithms converge onto the same result there is every reason to take all algorithms into account so that a consensus result can be reached. In this work it is shown that applying different algorithms to protein structure prediction leads to results that do not converge as such but rather collude in a striking and useful way that has never been considered before.

5.
Despite recent progress in proteomics, most protein complexes are still unknown. Identification of these complexes will help us understand cellular regulatory mechanisms and support the development of new drugs. It is therefore important to establish detailed information about the composition and abundance of protein complexes, but existing algorithms can only give qualitative predictions. Herein, we propose a new approach based on stochastic simulations of protein complex formation that integrates multi-source data (such as protein abundances, domain-domain interactions and functional annotations) to predict alternative forms of protein complexes together with their abundances. This method, called SiComPre (Simulation based Complex Prediction), achieves better qualitative prediction of yeast and human protein complexes than existing methods and is the first to predict protein complex abundances. Furthermore, we show that SiComPre can be used to predict complexome changes upon drug treatment, using bortezomib as an example. SiComPre is the first method to produce quantitative predictions on the abundance of molecular complexes while performing the best qualitative predictions. As new data on tissue-specific protein complexes become available, SiComPre will be able to predict qualitative and quantitative differences in the complexome in various tissue types and under various conditions.

6.
Previously proposed methods for protein secondary structure prediction from multiple sequence alignments do not efficiently extract the evolutionary information that these alignments contain. The predictions of these methods are less accurate than they could be, because of their failure to consider explicitly the phylogenetic tree that relates aligned protein sequences. As an alternative, we present a hidden Markov model approach to secondary structure prediction that more fully uses the evolutionary information contained in protein sequence alignments. A representative example is presented, and three experiments are performed that illustrate how the appropriate representation of evolutionary relatedness can improve inferences. We explain why similar improvement can be expected in other secondary structure prediction methods and indeed any comparative sequence analysis method.

7.
Recent work has shown that the accuracy of ab initio structure prediction can be significantly improved by integrating evolutionary information in the form of intra-protein residue-residue contacts. Following this seminal result, much effort has been put into the improvement of contact predictions. However, there is also a substantial need to develop structure prediction protocols tailored to the type of restraints gained from contact predictions. Here, we present a structure prediction protocol that combines evolutionary information with the resolution-adapted structural recombination approach of Rosetta, called RASREC. Compared to the classic Rosetta ab initio protocol, RASREC achieves improved sampling, better convergence and higher robustness against incorrect distance restraints, making it the ideal sampling strategy for the stated problem. To demonstrate the accuracy of our protocol, we tested the approach on a diverse set of 28 globular proteins. Our method is able to converge for 26 out of the 28 targets and improves the average TM-score of the entire benchmark set from 0.55 to 0.72 when compared to the top ranked models obtained by the EVFold web server using identical contact predictions. Using a smaller benchmark, we furthermore show that the prediction accuracy of our method is only slightly reduced when the contact prediction accuracy is comparatively low. This observation is of special interest for protein sequences that only have a limited number of homologs.
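
For context on the TM-score values quoted above, the standard definition of the metric (due to Zhang and Skolnick) can be written as below. This is the general definition, not a detail taken from the abstract, and the symbols $L_{\mathrm{target}}$, $L_{\mathrm{ali}}$, $d_i$ and $d_0$ are notation introduced here for illustration.

```latex
\[
\mathrm{TM\text{-}score}
  = \max\left[
      \frac{1}{L_{\mathrm{target}}}
      \sum_{i=1}^{L_{\mathrm{ali}}}
      \frac{1}{1 + \left( d_i / d_0(L_{\mathrm{target}}) \right)^{2}}
    \right],
\qquad
d_0(L) = 1.24\,\sqrt[3]{L - 15} - 1.8
\]
```

Here $d_i$ is the distance between the $i$-th pair of aligned residues after superposition, $L_{\mathrm{target}}$ is the length of the target protein, $L_{\mathrm{ali}}$ is the number of aligned residues, and the maximum is taken over all superpositions. Scores lie in $(0, 1]$, and values above roughly 0.5 are commonly read as indicating the same fold, which gives some sense of the reported change from 0.55 to 0.72.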

8.
Since many proteins express their functional activity by interacting with other proteins and forming protein complexes, it is very useful to identify sets of proteins that form complexes. For that purpose, many methods for predicting protein complexes from protein-protein interactions have been developed, such as MCL, MCODE, RNSC, PCP, RRW, and NWE. These methods have dealt only with complexes of size more than three, because they are often based on some measure of subgraph density. However, heterodimeric protein complexes, which consist of two distinct proteins, account for a large fraction of known complexes according to several comprehensive databases. In this paper, we propose several feature space mappings from protein-protein interaction data, in which each interaction is weighted based on reliability. Furthermore, we make use of prior knowledge on protein domains to develop feature space mappings, a domain composition kernel and its combination kernel with our proposed features. We perform ten-fold cross-validation computational experiments. The results suggest that our proposed kernel considerably outperforms the naive Bayes-based method, which is the best existing method for predicting heterodimeric protein complexes.

9.
The task of extracting the maximal amount of information from a biological network, for example predicting the function of a protein from a protein-protein interaction (PPI) network, has drawn much attention from researchers. It is well known that biological networks consist of modules/communities, sets of nodes that are more densely inter-connected among themselves than with the rest of the network. However, practical applications of the community information have been rather limited. For protein function prediction on a network, it has been shown that none of the existing community-based protein function prediction methods outperform a simple neighbor-based method. Recently, we have shown that proper utilization of a highly optimal modularity community structure for protein function prediction can outperform neighbor-assisted methods. In this study, we propose two function prediction approaches on bipartite networks that consider the community structure information as well as the neighbor information from the network: 1) a simple screening method and 2) a random forest based method. We demonstrate that our community-assisted methods outperform neighbor-assisted methods and that the random forest method yields the best performance. In addition, we show that using the optimal community structure information is essential for more accurate function prediction for the protein-complex bipartite network of Saccharomyces cerevisiae. Community detection can be carried out either by using a modified modularity that deals with the original bipartite network or by first projecting the network into a single-mode network (i.e., a PPI network) and then applying community detection to the reduced network. We find that the projection leads to a significant loss of information. Since our prediction methods rely only on the network topology, they can be applied to various fields where an efficient network-based analysis is required.

10.
Hybrid global optimization methods attempt to combine the beneficial features of two or more algorithms, and can be powerful methods for solving challenging nonconvex optimization problems. In this paper, novel classes of hybrid global optimization methods, termed alternating hybrids, are introduced for application as a tool in treating the peptide and protein structure prediction problems. In particular, these new optimization methods take the form of hybrids between a deterministic global optimization algorithm, the αBB, and a stochastically based method, conformational space annealing (CSA). The αBB method, as a theoretically proven global optimization approach, exhibits consistency, as it guarantees convergence to the global minimum for twice-continuously differentiable constrained nonlinear programming problems, but can benefit from computationally related enhancements. On the other hand, the independent CSA algorithm is highly efficient, though the method lacks theoretical guarantees of convergence. Furthermore, both the αBB method and the CSA method are found to identify ensembles of low-energy conformers, an important feature for determining the true free energy minimum of the system. The proposed hybrid methods combine the desirable features of efficiency and consistency, thus enabling the accurate prediction of the structures of larger peptides. Computational studies for met-enkephalin and melittin, employing sequential and parallel computing frameworks, demonstrate the promise for these proposed hybrid methods.
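
The convergence guarantee mentioned above for twice-continuously differentiable problems rests on the convex underestimator used by αBB. As a brief reminder of that standard construction (general background on αBB, not a formula taken from this abstract; the notation $x^{L}$, $x^{U}$ and $\alpha_i$ is introduced here), a nonconvex function $f$ on the box $[x^{L}, x^{U}]$ is bounded from below by:

```latex
\[
\mathcal{L}(x) = f(x) + \sum_{i=1}^{n} \alpha_i \,\bigl(x_i^{L} - x_i\bigr)\bigl(x_i^{U} - x_i\bigr),
\qquad
\alpha \;\ge\; \max\Bigl\{ 0,\; -\tfrac{1}{2} \min_{i,\; x \in [x^{L},\, x^{U}]} \lambda_i(x) \Bigr\}
\]
```

where $\lambda_i(x)$ are the eigenvalues of the Hessian of $f$ and the bound on the right is stated for a single uniform $\alpha = \alpha_1 = \dots = \alpha_n$. Each added term is non-positive inside the box, so $\mathcal{L}(x) \le f(x)$, and a sufficiently large $\alpha$ makes $\mathcal{L}$ convex. Branch and bound over successively smaller boxes then tightens the lower bound toward the global minimum.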

11.
12.
Internal protein dynamics is essential for biological function. During evolution, protein divergence is functionally constrained: properties more relevant for function vary more slowly than less important properties. Thus, if protein dynamics is relevant for function, it should be evolutionarily conserved. In contrast with the well-studied evolution of protein structure, the evolutionary divergence of protein dynamics has not been addressed systematically before, apart from a few case studies. X-ray diffraction analysis gives information not only on protein structure but also on B-factors, which characterize the flexibility that results from protein dynamics. Here we study the evolutionary divergence of protein backbone dynamics by comparing the Cα flexibility (B-factor) profiles for a large dataset of homologous proteins classified into families and superfamilies. We show that Cα flexibility profiles diverge slowly, so that they are conserved at family and superfamily levels, even for pairs of proteins with nonsignificant sequence similarity. We also analyze and discuss the correlations among the divergences of flexibility, sequence, and structure. Electronic Supplementary Material is available for this article and is accessible to authorised users. [Reviewing Editor: Dr. David Pollock]

13.
Maximizing light capture by light-harvesting pigment optimization represents an attractive but challenging strategy to improve photosynthetic efficiency. Here, we report that loss of a previously uncharacterized gene, HIGH PHOTOSYNTHETIC EFFICIENCY1 (HPE1), optimizes light-harvesting pigments, leading to improved photosynthetic efficiency and biomass production. Arabidopsis (Arabidopsis thaliana) hpe1 mutants show faster electron transport and increased contents of carbohydrates. HPE1 encodes a chloroplast protein containing an RNA recognition motif that directly associates with and regulates the splicing of target RNAs of plastid genes. HPE1 also interacts with other plastid RNA-splicing factors, including CAF1 and OTP51, which share common targets with HPE1. Deficiency of HPE1 alters the expression of nucleus-encoded chlorophyll-related genes, probably through plastid-to-nucleus signaling, causing decreased total content of chlorophyll (a+b) in a limited range but increased chlorophyll a/b ratio. Interestingly, this adjustment of light-harvesting pigment reduces antenna size, improves light capture, decreases energy loss, mitigates photodamage, and enhances photosynthetic quantum yield during photosynthesis. Our findings suggest a novel strategy to optimize light-harvesting pigments that improves photosynthetic efficiency and biomass production in higher plants.
The tremendous increase in world population and environmental deterioration pose serious challenges to agricultural production and food security (Ray et al., 2013). To meet this challenge, crops with high yield potential need to be developed (Long et al., 2015). However, the yield traits that have played key roles during the green revolution have had their potential nearly exhausted; thus, new strategies are needed. Photosynthesis, the unique biological process responsible for the conversion of light energy to chemical forms, is the ultimate basis of crop yield (Zhu et al., 2010). 
Theoretically, enhancing photosynthetic efficiency should be an excellent strategy to increase crop yield. However, the improvement of photosynthetic efficiency has played only a minor role in the remarkable crop productivity improvement achieved in the last half-century (Zhu et al., 2010; Ort et al., 2015).
In the light reactions of photosynthesis, light energy is used by chlorophyll and associated pigments, water is split, and electron transport on the chloroplast membrane reduces NADP, resulting in a proton gradient that powers the phosphorylation of ADP. NADPH and ATP power the Calvin cycle, which assimilates and reduces carbon dioxide to carbohydrate (Ort et al., 2015). Strategies to improve photosynthesis mainly include the optimization of light capture, light energy conversion in the light reaction, and carbon capture and conversion in the dark reaction (Ort et al., 2015). Previous research focused mainly on the optimization of dark reactions through the improvement of carbon capture and conversion to directly increase biomass (Miyagawa et al., 2001; Kebeish et al., 2007; Lin et al., 2014; Ort et al., 2015). However, less effort has been spent to optimize light capture and light energy conversion in the light reactions to improve the whole photosynthetic efficiency (Ort et al., 2015).
Maximizing light capture by the adjustment of antenna size can optimize light capture and light energy conversion, but it is difficult to achieve (Blankenship and Chen, 2013). Antennas in photosynthetic systems typically consist of pigments specifically bound to membrane-associated proteins. These antenna pigment-protein complexes closely associate with the reaction center complexes and deliver absorbed energy to the reaction centers, where some of the energy originally in the photon is captured by electron-transfer processes (Blankenship, 2002; Green and Parson, 2003). 
However, light saturation could take place at intensities much lower than would be expected if every chlorophyll was able to carry out photosynthesis by itself (Blankenship, 2002). The light saturation problem also has been addressed from the antenna perspective, and many efforts are under way to truncate the antenna system in photosynthetic microorganisms. A smaller antenna associated with each reaction center will, in principle, also shift the light-response curve, so that light saturation sets in at higher intensities, thereby reducing excess light and increasing productive light. While the concept of increased efficiency due to reduced antenna size is simple, this goal has not yet been reached (Blankenship and Chen, 2013). In green algae, the reduction of light-harvesting pigments by decreasing the expression of the chlorophyll a oxygenase gene, which is responsible for the synthesis of chlorophyll b via the oxidation of chlorophyll a (Czarnecki and Grimm, 2012), led to efficient photosynthesis due to the balance between captured light and photochemical reactions (Perrine et al., 2012). However, there is still no success in higher plants.
In this study, we performed a large-scale genetic screen using the model organism Arabidopsis (Arabidopsis thaliana) and identified two independent alleles of an uncharacterized gene that we named HIGH PHOTOSYNTHETIC EFFICIENCY1 (HPE1), whose mutation confers improved photosynthetic efficiency by optimizing light-harvesting pigment. A deficiency of HPE1 shows higher light reaction activity of photosynthesis, more efficient carbon fixation, and significantly increased biomass production. Interestingly, HPE1 encodes a chloroplast protein containing an RNA recognition motif and regulates the splicing of RNAs of plastid genes by directly associating with target RNAs. 
HPE1 mutation results in a splicing deficiency of plastid genes that may alter the expression of chlorophyll-related genes, probably through plastid-to-nucleus signaling. Altered expression of chlorophyll-related genes changes the content of light-harvesting pigments and optimizes the light-harvesting system. Our characterization of HPE1 mutants suggests a novel strategy to optimize light harvesting and improve photosynthetic efficiency in higher plants.

14.
Monoterpenes are liquid hydrocarbons with applications ranging from flavor and fragrance to replacement jet fuel. Their toxicity, however, presents a major challenge for microbial synthesis. Here we evolved limonene-tolerant Saccharomyces cerevisiae strains and sequenced six strains across the 200-generation evolutionary time course. Mutations were found in the tricalbin proteins Tcb2p and Tcb3p. Genomic reconstruction in the parent strain showed that truncation of a single protein (tTcb3p1-989), but not its complete deletion, was sufficient to recover the evolved phenotype, improving limonene fitness 9-fold. tTcb3p1-989 increased tolerance toward two other monoterpenes (β-pinene and myrcene) 11- and 8-fold, respectively, and tolerance toward the biojet fuel blend AMJ-700t (10% cymene, 50% limonene, 40% farnesene) 4-fold. tTcb3p1-989 is the first example of successful engineering of phase tolerance and creates opportunities for production of the highly toxic C10 alkenes in yeast.

15.
Advances in mass spectrometry (MS) have encouraged interest in its deployment in urine biomarker studies, but success has been limited. Urine exosomes have been proposed as an ideal source of biomarkers for renal disease. However, the abundant urinary protein, uromodulin, cofractionates with exosomes during isolation and represents a practical contaminant that limits MS sensitivity. Uromodulin depletion has been attempted but is labor- and time-intensive and may remove important protein biomarkers. We describe the application of an exclusion list (ExL) of uromodulin-related peptide ions, coupled with high-sensitivity mass spectrometric analysis, to increase the depth of coverage of the urinary exosomal proteome. Urine exosomal protein samples from healthy volunteers were subjected to tandem MS, and abundant uromodulin peptides were identified. Samples were run a second time while excluding these uromodulin peptides from fragmentation, to allow identification of peptides from lower-abundance proteins. Uromodulin exclusion was performed in addition to dynamic exclusion. Results from these two procedures revealed 222 distinct proteins from conventional analysis, compared with 254 proteins after uromodulin exclusion, of which 188 were common to both methods. By unmasking a previously unidentified protein set, adding the ExL increased overall protein identifications by 29.7% to a total of 288 proteins. A fixed ExL, used in combination with conventional methods, effectively increases the depth of urinary exosomal proteins identified by MS, reducing the need for uromodulin depletion.
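
The figures reported above are mutually consistent, which can be checked with a few lines of arithmetic. The sketch below is only a worked check of the reported counts: the variable and function names are illustrative, and it assumes that the 29.7% increase is measured relative to the 222 proteins from conventional analysis.

```python
# Counts reported in the abstract above.
conventional = 222      # proteins identified by conventional analysis
with_exclusion = 254    # proteins identified after uromodulin exclusion
common = 188            # proteins identified by both procedures


def combined_identifications(a: int, b: int, shared: int) -> int:
    """Size of the union of two protein sets (inclusion-exclusion)."""
    return a + b - shared


# Total distinct proteins across both runs.
total = combined_identifications(conventional, with_exclusion, common)

# Proteins seen only after uromodulin exclusion (the "unmasked" set).
newly_unmasked = with_exclusion - common

# Relative gain over conventional analysis alone (assumed baseline).
gain_percent = 100 * (total - conventional) / conventional

print(total, newly_unmasked, round(gain_percent, 1))
```

Under that assumption the union is 222 + 254 - 188 = 288 proteins, of which 66 appear only with the exclusion list, and 66 / 222 is about 29.7%.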

16.
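Yang Ziheng (杨子恒), Acta Genetica Sinica (《遗传学报》), 1994, 21(3): 198-200
This paper examines the shortcomings of the methods currently used to estimate evolutionary distances between homologous protein sequences and proposes several new formulas that account for the evident variation in substitution rates among amino acid sites. In addition, a maximum likelihood estimation method is proposed that takes into account the different substitution probabilities between amino acids. The formulas are compared computationally, and recommendations are given for their use in practice.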
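
To illustrate why rate variation among sites matters for distance estimates, the sketch below compares two standard textbook corrections: the Poisson correction, which assumes equal rates at all sites, and the gamma distance, which assumes rates follow a gamma distribution with shape parameter `alpha`. These are well-known general formulas and are not necessarily the ones proposed in the paper; the function names and the example sequences are hypothetical.

```python
import math


def p_distance(seq_a: str, seq_b: str) -> float:
    """Proportion of differing sites between two aligned sequences.

    Columns containing a gap ("-") in either sequence are ignored.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(a, b) for a, b in zip(seq_a, seq_b) if a != "-" and b != "-"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(a != b for a, b in pairs) / len(pairs)


def poisson_distance(p: float) -> float:
    """Poisson-corrected distance, assuming equal rates across sites."""
    return -math.log(1.0 - p)


def gamma_distance(p: float, alpha: float) -> float:
    """Gamma distance, assuming gamma-distributed rates across sites.

    Smaller alpha means stronger rate variation and a larger correction.
    """
    return alpha * ((1.0 - p) ** (-1.0 / alpha) - 1.0)


# Hypothetical aligned fragments differing at 1 of 4 sites.
p = p_distance("ACDE", "ACDF")
print(p, poisson_distance(p), gamma_distance(p, alpha=1.0))
```

For the same observed proportion of differences, the gamma distance is larger than the Poisson distance, and the two converge as `alpha` grows. Ignoring rate variation therefore tends to underestimate the true distance.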

17.
Four basic stages of the evolution of protein structure are described, based on recent work of the authors aimed specifically at reconstructing the earliest events in protein evolution. According to this reconstruction, the initial stage of short peptides comprising, probably, only a few amino acid residues was followed by the formation of closed loops of 25–30 residues, which corresponds to the polymer-statistically optimal ring closure size for mixed polypeptide chains. The next stage involved fusion of relatively small linear genes and formation of protein structures consisting of several closed loops of a nearly standard size, with 4–6 loops (100–200 amino acid residues) in a typical protein fold. The last, modern stage began with combinatorial fusion of the presumably circular 300–600 bp DNA units and, accordingly, formation of multidomain proteins.

18.
One of the most widely accepted ideas related to the evolutionary rates of proteins is that functionally important residues or regions evolve more slowly than other regions; a reasonable consequence is that proteins with a higher density of functionally important sites should evolve more slowly. Oddly, the role of functional importance, mainly measured by essentiality, in determining evolutionary rate has been challenged in recent studies. Several variables other than protein essentiality, such as expression level, gene compactness, and protein–protein interactions, have been suggested to affect protein evolutionary rate. In the present review, we try to refine the concept of functional importance of a gene, and consider three factors (functional importance, expression level, and gene compactness) as independent determinants of the evolutionary rate of a protein, based not only on their known correlation with evolutionary rate but also on a reasonable mechanistic model. We suggest a framework based on these mechanistic models to correctly interpret the correlations between evolutionary rates and the various variables as well as the interrelationships among the variables.

19.
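Template-based and template-free protein structure prediction are the two approaches to predicting the three-dimensional structure of proteins computationally; the former is widely used because it is fast and comparatively accurate. Template-based structure prediction searches for experimentally determined structures whose sequences are similar to that of the target protein, uses them as templates, and then builds a structural model of the target sequence. This article reviews in detail the steps and key stages of template-based structure prediction, and the factors that affect structure prediction...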

20.
Conformational diversity of the native state plays a central role in modulating protein function. The selection paradigm holds that different ligands shift the conformational equilibrium by binding to the highest-affinity conformers. The intramolecular vibrational dynamics associated with each conformation should guarantee conformational transitions, which, given their importance, could be associated with evolutionarily conserved traits. Normal mode analysis, based on a coarse-grained model of the protein, can provide the information required to explore these features. Herein, we present a novel procedure to identify key positions sustaining the conformational diversity associated with ligand binding. The method is applied to an adequately refined dataset of 188 paired protein structures in their bound and unbound forms. First, the normal modes most involved in the conformational change are selected according to their overlap with the structural distortions introduced by ligand binding. The subspace defined by these modes is used to analyze the effect of simulated point mutations on preserving the conformational diversity of the protein. We find a negative correlation between the effects of mutations on these normal mode subspaces associated with ligand binding and position-specific evolutionary conservations obtained from multiple sequence-structure alignments. Positions whose mutations are found to alter these subspaces the most are defined as key positions, that is, dynamically important residues that mediate the ligand-binding conformational change. These positions are shown to be evolutionarily conserved, mostly buried aliphatic residues localized in regular structural regions of the protein such as β-sheets and α-helices.
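
The mode-selection step described above relies on the overlap between a normal mode and the displacement between the unbound and bound structures. A minimal sketch of that idea follows, assuming the commonly used definition of overlap as the absolute cosine between the two vectors. The function names and the toy vectors are hypothetical, and the abstract does not state the exact selection criterion, so the simple ranking here is an assumption.

```python
import math
from typing import List, Sequence, Tuple


def overlap(mode: Sequence[float], displacement: Sequence[float]) -> float:
    """Absolute cosine between a normal mode and a displacement vector.

    Returns a value in [0, 1]; 1 means the mode points exactly along the
    observed conformational change.
    """
    if len(mode) != len(displacement):
        raise ValueError("vectors must have the same length")
    dot = sum(m * d for m, d in zip(mode, displacement))
    norm = math.sqrt(sum(m * m for m in mode)) * math.sqrt(
        sum(d * d for d in displacement)
    )
    if norm == 0.0:
        raise ValueError("zero-length vector")
    return abs(dot) / norm


def rank_modes(
    modes: Sequence[Sequence[float]], displacement: Sequence[float]
) -> List[Tuple[float, int]]:
    """Return (overlap, mode index) pairs, highest overlap first."""
    scores = [(overlap(m, displacement), i) for i, m in enumerate(modes)]
    return sorted(scores, reverse=True)


# Toy example: three orthonormal "modes" and a displacement in the x-y plane.
modes = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
displacement = [3.0, 4.0, 0.0]
ranked = rank_modes(modes, displacement)
print(ranked)
```

In a real analysis the vectors would have 3N components for N coarse-grained sites, and the displacement would be computed after superposing the bound and unbound structures. The highest-ranked modes would then span the subspace used for the mutation analysis.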
