20 similar articles found (search time: 15 ms)
The prediction of the secondary structure of a protein from its amino acid sequence is an important step towards the prediction of its three-dimensional structure. However, the accuracy of ab initio secondary structure prediction from sequence is currently about 80%, which is still far from satisfactory. In this study, we proposed a novel method that uses the binomial distribution to optimize tetrapeptide structural words, and the increment of diversity with quadratic discriminant analysis to predict three-state protein secondary structure. A benchmark dataset of 2,640 proteins with sequence identity of less than 25% was used to train and test the proposed method. The results indicate that an overall accuracy of 87.8% was achieved in secondary structure prediction using ten-fold cross-validation. Moreover, the accuracy of predicted secondary structures ranges from 84 to 89% at the residue level. These results suggest that the feature selection technique can detect the optimized tetrapeptide structural words that affect the accuracy of predicted secondary structures.

De novo protein design offers a unique means to test and advance our understanding of how proteins fold. However, most current design methods are native structure eccentric and folding kinetics has rarely been considered in the design process. Here, we show that a de novo designed mini-protein DS119, which folds into a βαβ structure, exhibits unusually slow and concentration-dependent folding kinetics. For example, the folding time for 50 μM of DS119 was estimated to be ∼2 s. Stopped-flow fluorescence resonance energy transfer experiments further suggested that its folding was likely facilitated by a transient dimerization process. Taken together, these results highlight the need for consideration of the entire folding energy landscape in de novo protein design and provide evidence suggesting nonnative interactions can play a key role in protein folding.

Through systematic studies of lattice Monte Carlo simulations of the folding of designed heteropolymers, we have identified a hierarchy of specific elementary phenomena which control the way single domain proteins fold: a) formation of a few local elementary structures, b) creation of the (post-critical) folding nucleus through the assembly of the local elementary structures, c) relaxation of the remaining amino acids to the native conformation. These results, which are consistent with two-state kinetics of the folding of small, single domain proteins, where the local elementary structures and the folding nucleus can be viewed as hidden intermediates along the reaction pathway, provide the basis for a strategy to read the tertiary structure of a protein from its amino acid sequence.

Fragment-based approaches are the current standard for de novo protein structure prediction. These approaches rely on accurate and reliable fragment libraries to generate good structural models. In this work, we describe a novel method for structure fragment library generation and its application in fragment-based de novo protein structure prediction. The importance of correct testing procedures in assessing the quality of fragment libraries is demonstrated, in particular the exclusion of homologs of the target from the libraries to correctly simulate a de novo protein structure prediction scenario, something which surprisingly is not always done. We demonstrate that fragments presenting different predominant predicted secondary structures should be treated differently during the fragment library generation step and that exhaustive and random search strategies should both be used. This information was used to develop a novel method, Flib. On a validation set of 41 structurally diverse proteins, Flib libraries present both higher precision and higher coverage than two state-of-the-art methods, NNMake and HHFrag. Flib also achieves better precision and coverage on the set of 275 protein domains used in the two previous experiments of the Critical Assessment of Structure Prediction (CASP9 and CASP10). We compared Flib libraries against NNMake libraries in a structure prediction context. Of the 13 cases in which a correct answer was generated, Flib models were more accurate than NNMake models for 10. Flib is available for download at: http://www.stats.ox.ac.uk/research/proteins/resources.

P. James, J. Halladay, E. A. Craig. Genetics, 1996, 144(4): 1425-1436
The two-hybrid system is a powerful technique for detecting protein-protein interactions that utilizes the well-developed molecular genetics of the yeast Saccharomyces cerevisiae. However, the full potential of this technique has not been realized due to limitations imposed by the components available for use in the system. These limitations include unwieldy plasmid vectors, incomplete or poorly designed two-hybrid libraries, and host strains that result in the selection of large numbers of false positives. We have used a novel multienzyme approach to generate a set of highly representative genomic libraries from S. cerevisiae. In addition, a unique host strain was created that contains three easily assayed reporter genes, each under the control of a different inducible promoter. This host strain is extremely sensitive to weak interactions and eliminates nearly all false positives using simple plate assays. Improved vectors were also constructed that simplify the construction of the gene fusions necessary for the two-hybrid system. Our analysis indicates that the libraries and host strain provide significant improvements in both the number of interacting clones identified and the efficiency of two-hybrid selections.

石鸥燕, 杨晶, 杨惠云, 田心. 《现代生物医学进展》 (Progress in Modern Biomedicine), 2007, 7(11): 1723-1724, 1706
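Predicting protein secondary structure is a crucial step towards understanding the spatial structure of proteins. This article proposes a simple secondary structure prediction method that uses majority voting to combine the outputs of three existing, well-performing secondary structure prediction methods into a consensus prediction. The method was tested on 57 proteins with less than 30% similarity, randomly selected from PDB entries whose structures were determined in the preceding two years. Its Q3 accuracy was 1.12 to 2.29% higher than that of the three individual methods, and the correlation coefficients and SOV accuracy improved correspondingly. All accuracy measures were also higher than those of Jpred, a secondary structure prediction program that likewise uses a consensus approach. Although the principle is simple, the method needs no additional parameters, is computationally cheap and is easy to implement; the most important prerequisite is that the component methods be among the most accurate protein secondary structure prediction methods currently available.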
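
The majority-voting idea in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `consensus_prediction` and `q3_accuracy`, the three-letter alphabet H/E/C, and the rule that a three-way tie falls back to coil are all assumptions made here for illustration, since the abstract does not say how ties are resolved.

```python
from collections import Counter


def consensus_prediction(predictions, tie_break="C"):
    """Combine equal-length per-residue predictions (strings over H/E/C) by majority vote.

    tie_break is a hypothetical rule: when no state has a strict plurality,
    the residue is assigned to this state (coil by default).
    """
    if not predictions:
        raise ValueError("need at least one prediction")
    length = len(predictions[0])
    if any(len(p) != length for p in predictions):
        raise ValueError("all predictions must have the same length")

    consensus = []
    for states in zip(*predictions):
        ranked = Counter(states).most_common()
        top_state, top_count = ranked[0]
        # A tie for first place (e.g. a 1-1-1 split among three predictors)
        # has no majority, so fall back to the assumed tie-break state.
        if len(ranked) > 1 and ranked[1][1] == top_count:
            top_state = tie_break
        consensus.append(top_state)
    return "".join(consensus)


def q3_accuracy(predicted, observed):
    """Fraction of residues whose predicted three-state label matches the observed one."""
    if len(predicted) != len(observed) or not observed:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return matches / len(observed)


if __name__ == "__main__":
    # Toy predictions from three hypothetical predictors for a 5-residue protein.
    votes = ["HHHEC", "HHEEC", "HCECC"]
    combined = consensus_prediction(votes)
    print(combined)
    print(q3_accuracy(combined, "HHEEE"))
```

With an odd number of predictors and three states, ties only occur when all predictors disagree, so the tie-break rule affects few residues; a real consensus method might instead weight each predictor by its confidence score.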

Computational de novo protein structure prediction is limited to small proteins of simple topology. The present work explores an approach to extend beyond the current limitations through assembling protein topologies from idealized α-helices and β-strands. The algorithm performs a Monte Carlo Metropolis simulated annealing folding simulation. It optimizes a knowledge-based potential that analyzes radius of gyration, β-strand pairing, secondary structure element (SSE) packing, amino acid pair distance, amino acid environment, contact order, secondary structure prediction agreement and loop closure. Discontinuation of the protein chain favors sampling of non-local contacts and thereby creation of complex protein topologies. The folding simulation is accelerated through exclusion of flexible loop regions further reducing the size of the conformational search space. The algorithm is benchmarked on 66 proteins with lengths between 83 and 293 amino acids. For 61 out of these proteins, the best SSE-only models obtained have an RMSD100 below 8.0 Å and recover more than 20% of the native contacts. The algorithm assembles protein topologies with up to 215 residues and a relative contact order of 0.46. The method is tailored to be used in conjunction with low-resolution or sparse experimental data sets which often provide restraints for regions of defined secondary structure.
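
The Monte Carlo Metropolis simulated annealing scheme named in this abstract can be sketched generically as follows. This is only an illustration of the acceptance rule and a cooling loop on a toy one-dimensional problem: the names `metropolis_accept` and `anneal`, the geometric cooling schedule and all parameter values are assumptions, and the toy energy function stands in for the knowledge-based potential, which the abstract does not specify in detail.

```python
import math
import random


def metropolis_accept(delta_energy, temperature, rng=random):
    """Metropolis criterion: always accept improvements; accept a worse
    move with probability exp(-delta_energy / temperature)."""
    if delta_energy <= 0:
        return True
    if temperature <= 0:
        return False
    return rng.random() < math.exp(-delta_energy / temperature)


def anneal(initial_state, energy, propose,
           t_start=10.0, t_end=0.01, steps=5000, seed=0):
    """Minimise `energy` by simulated annealing with geometric cooling.

    `propose(state, rng)` returns a perturbed copy of the state. The
    schedule and defaults are illustrative assumptions, not values from
    the abstract.
    """
    rng = random.Random(seed)
    state, current = initial_state, energy(initial_state)
    best_state, best = state, current
    for step in range(steps):
        # Temperature decays geometrically from t_start to t_end.
        fraction = step / max(steps - 1, 1)
        temperature = t_start * (t_end / t_start) ** fraction
        candidate = propose(state, rng)
        candidate_energy = energy(candidate)
        if metropolis_accept(candidate_energy - current, temperature, rng):
            state, current = candidate, candidate_energy
            if current < best:
                best_state, best = state, current
    return best_state, best


if __name__ == "__main__":
    # Toy problem: integer random walk minimising (x - 3)^2.
    result = anneal(0, lambda x: (x - 3) ** 2,
                    lambda x, rng: x + rng.choice((-1, 1)))
    print(result)
```

In a folding simulation of the kind described, the state would be a placement of secondary structure elements and `propose` would perturb one element, but the acceptance logic stays the same.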

Determining the primary structure (i.e., amino acid sequence) of a protein has become cheaper, faster, and more accurate. Higher order protein structure provides insight into a protein’s function in the cell. Understanding a protein’s secondary structure is a first step towards this goal. Therefore, a number of computational prediction methods have been developed to predict secondary structure from just the primary amino acid sequence. The most successful methods use machine learning approaches that are quite accurate, but do not directly incorporate structural information. As a step towards improving secondary structure prediction given the primary structure, we propose a Bayesian model based on the knob-socket model of protein packing in secondary structure. The method considers the packing influence of residues on the secondary structure determination, including those packed close in space but distant in sequence. By assessing our method on two test sets, we show how incorporating multiple sequence alignment data, as PSIPRED does, provides balance and improves the accuracy of the predictions. Software implementing the methods is provided as a web application and a stand-alone implementation.

Community structure detection has proven to be important in revealing the underlying properties of complex networks. The standard problem, where a partition of disjoint communities is sought, has been continually adapted to offer more realistic models of interactions in these systems. Here, a two-step procedure is outlined for exploring the concept of overlapping communities. First, a hard partition is detected by employing existing methodologies. We then propose a novel mixed-integer nonlinear programming (MINLP) model, known as OverMod, which transforms disjoint communities into overlapping ones. The procedure is evaluated through its application to protein-protein interaction (PPI) networks of the rat, E. coli, yeast and human organisms. Connector nodes of hard partitions exhibit topological and functional properties indicative of their suitability as candidates for multiple module membership. OverMod identifies two types of connector nodes, inter- and intra-connector, each with its own particular characteristics pertaining to its topological and functional role in the organisation of the network. Inter-connector proteins are shown to be highly conserved proteins participating in pathways that control essential cellular processes, such as proliferation, differentiation and apoptosis, and their differences from intra-connectors are highlighted. Many of these proteins are shown to possess multiple roles of distinct nature through their participation in different network modules, setting them apart from proteins that are simply ‘hubs’, i.e. proteins with many interaction partners but with a more specific biochemical role.

Asparagine residues in proteins undergo spontaneous deamidation, a post-translational modification that may act as a molecular clock for the regulation of protein function and turnover. Asparagine deamidation is modulated by protein local sequence, secondary structure and hydrogen bonding. We present NGOME, an algorithm able to predict non-enzymatic deamidation of internal asparagine residues in proteins in the absence of structural data, using sequence-based predictions of secondary structure and intrinsic disorder. Compared to previous algorithms, NGOME does not require three-dimensional structures yet yields better predictions than available sequence-only methods. Four case studies of specific proteins show how NGOME may help the user identify deamidation-prone asparagine residues, often related to protein gain of function, protein degradation or protein misfolding in pathological processes. A fifth case study applies NGOME at a proteomic scale and unveils a correlation between asparagine deamidation and protein degradation in yeast. NGOME is freely available as a webserver at the National EMBnet node Argentina, URL: http://www.embnet.qb.fcen.uba.ar/ in the subpage “Protein and nucleic acid structure and sequence analysis”.

Prediction of the Secondary Structure of Myelin Basic Protein (total citations: 14; self-citations: 10; citations by others: 4)
An investigation into the probable secondary structure of the myelin basic protein was carried out by the application of three procedures currently in use to predict the secondary structures of proteins from knowledge of their amino acid sequences. In order to increase the accuracy of the predictions, the amino acid substitutions that occur in the basic protein from different species were incorporated into the predictive algorithms. It was possible to locate regions of probable alpha-helix, beta-structure, beta-turn, and unordered conformation (coil) in the protein. One of the predictive methods introduces a bias into the algorithm to maximize or minimize the amounts of alpha-helix and/or beta-structure present; this made it possible to assess how conditions such as pH and protein concentration or the presence of anionic amphiphilic molecules could influence the protein's secondary structure. The predictions made by the three methods were in reasonably good agreement with one another. They were consistent with experimental data, provided that the stabilizing or destabilizing effects of the environment were taken into account. According to the predictions, the extent of possible alpha-helix and beta-structure formation in the protein is severely restricted by the low frequency and extensive scattering of hydrophobic residues, along with a high frequency and extensive scattering of residues that favor the formation of beta-turns and coils. Neither prolyl residues nor cationic residues per se are responsible for the low content of alpha-helix predicted in the protein. The principal ordered conformation predicted is the beta-turn. Many of the predicted beta-turns overlap extensively, involving in some cases up to 10 residues. In some of these structures it is possible for the peptide backbone to oscillate in a sinusoidal manner, generating a flat, pleated sheetlike structure. Cationic residues located in these structures would appear to be ideally oriented for interaction with lipid phosphate groups located at the cytoplasmic surface of the myelin membrane. An analysis of possible and probable conformations that the triproline sequence could assume questions the popular notion that this sequence produces a hairpin turn in the basic protein.

A random hexapeptide library, cloned in bacteriophage, was used to select affinity peptides using nickel-nitrilotriacetic acid (Ni-NTA) columns. The screening protocol succeeded in isolating peptides that share common features and, in most cases, common amino acid sequences (e.g. WHHHPH, AQHHHH). In Ni-NTA chromatography, the fusion phage carrying the selected peptides exhibited more homogeneous elution behavior (i.e. elution in one peak) than the most commonly used His6 peptide (elution in multiple peaks).

A key question in evolutionary genomics is how populations navigate the adaptive landscape in the presence of epistasis, or interactions among loci. This problem can be directly addressed by studying the evolution of RNA secondary structures, for which there is constraint to maintain pairing between Watson-Crick (WC) sites. Replacement of a nucleotide at one site of a WC pair reduces fitness by disrupting binding, which can be restored via a compensatory replacement at the interacting site. Here, I present the first genome-scale analysis of epistasis on the RNA secondary structure of human immunodeficiency virus type 1 (HIV-1). Comparison of polymorphism frequencies at ancestrally conserved sites reveals that selection against replacements is ∼2.7 times stronger at WC than at non-WC sites, such that nearly 50% of constraint can be attributed to epistasis. However, almost all epistatic constraint is due to selection against conversions of WC pairs to unpaired (UP) nucleotides, whereas conversions to GU wobbles are only slightly deleterious. This disparity is also evident in pairs with second-site compensatory replacements; conversions from UP nucleotides to WC pairs increase median fitness by ∼4.2%, whereas conversions from GU wobbles to WC pairs only increase median fitness by ∼0.3%. Moreover, second-site replacements that convert UP nucleotides to GU wobbles also increase median fitness by ∼4%, indicating that such replacements are nearly as compensatory as those that restore WC pairing. Thus, WC peaks of the HIV-1 epistatic adaptive landscape are connected by high GU ridges, enabling the viral population to rapidly explore distant peaks without traversing deep UP valleys.

Remarkable advances in DNA sequencing technology have created a need for de novo genome assembly methods tailored to work with the new sequencing data types. Many such methods have been published in recent years, but assembling raw sequence data to obtain a draft genome has remained a complex, multi-step process, involving several stages of sequence data cleaning, error correction, assembly, and quality control. Successful application of these steps usually requires intimate knowledge of a diverse set of algorithms and software. We present an assembly pipeline called A5 (Andrew And Aaron's Awesome Assembly pipeline) that simplifies the entire genome assembly process by automating these stages and by integrating several previously published algorithms with new algorithms for quality control and automated assembly parameter selection. We demonstrate that A5 can produce assemblies of quality comparable to a leading assembly algorithm, SOAPdenovo, without any prior knowledge of the particular genome being assembled and without the extensive parameter tuning required by the other assembly algorithm. In particular, the assemblies produced by A5 exhibit 50% or more reduction in broken protein coding sequences relative to SOAPdenovo assemblies. The A5 pipeline can also assemble Illumina sequence data from libraries constructed by the Nextera (transposon-catalyzed) protocol, which have markedly different characteristics from mechanically sheared libraries. Finally, A5 has modest compute requirements, and can assemble a typical bacterial genome on current desktop or laptop computer hardware in under two hours, depending on depth of coverage.

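Secondary structure formation: the framework model of the initiation of protein folding (total citations: 8; self-citations: 1; citations by others: 7)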
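The framework model holds that the formation of secondary structure is the structural basis of the initiation of protein folding. This article introduces the solution conformations of homologous protein fragments, the methods used to study these conformations, and the de novo design of polypeptide secondary structures. It also reviews how these results have been applied to theoretical models of the initiation of folding, together with the latest research progress on the initiation of protein folding.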

The rise in alternative respiratory capacity upon aging of potato (Solanum tuberosum) tuber slices is correlated with changes in mitochondrial membrane protein composition and a requirement for cytoplasmic protein synthesis. However, the lack of an antibody specific to the alternative oxidase has, until recently, prevented examination of the alternative oxidase protein(s) itself. We have employed a monoclonal antibody raised against the Sauromatum guttatum alternative oxidase to investigate developmental changes in the alternative pathway of aging potato slice mitochondria and to characterize the potato alternative oxidase by one- and two-dimensional gel electrophoresis. The relative levels of a 36 kilodalton protein parallel the rise in alternative path capacity. A plausible interpretation is that this alternative oxidase protein is synthesized de novo during aging of potato slices.


In this paper, we propose a nongraphical representation for protein secondary structures. By counting the frequency of occurrence of all possible four-tuples (i.e., four-letter words) of a protein secondary structure sequence, we construct a set of 3 × 3 matrices for the corresponding protein secondary structure sequence. Furthermore, the leading eigenvalues of these matrices are computed and considered as invariants for the protein secondary structure sequences. To illustrate the utility of our approach, we apply it to a set of real data to distinguish protein structural classes. The result indicates that it can be used to complement the classification of protein secondary structures.
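
One way to read the construction in this abstract is sketched below in Python. The abstract does not say how the 81 possible four-letter words over a three-state alphabet are arranged into 3 × 3 matrices, so the layout used here (the first two letters select one of nine matrices, the third and fourth letters select the row and column) is purely an assumption, as are the alphabet H/E/C and the function names. The leading eigenvalue is approximated by power iteration rather than by whatever method the authors used.

```python
from itertools import product

STATES = "HEC"  # assumed three-state alphabet: helix, strand, coil


def four_tuple_matrices(ss_sequence):
    """Arrange overlapping 4-tuple frequencies into nine 3x3 matrices.

    Assumed layout: letters 1-2 pick the matrix, letter 3 the row,
    letter 4 the column. Frequencies are normalised to sum to 1.
    """
    index = {state: i for i, state in enumerate(STATES)}
    total = len(ss_sequence) - 3
    if total <= 0:
        raise ValueError("sequence must contain at least four residues")
    matrices = {a + b: [[0.0] * 3 for _ in range(3)]
                for a, b in product(STATES, repeat=2)}
    for start in range(total):
        a, b, c, d = ss_sequence[start:start + 4]
        matrices[a + b][index[c]][index[d]] += 1.0 / total
    return matrices


def leading_eigenvalue(matrix, iterations=500):
    """Approximate the largest eigenvalue of a non-negative square matrix.

    Power iteration is run on (matrix + identity): the shift moves every
    eigenvalue up by exactly 1 and prevents oscillation for matrices with
    periodic structure, so 1 is subtracted at the end.
    """
    size = len(matrix)
    shifted = [[matrix[i][j] + (1.0 if i == j else 0.0) for j in range(size)]
               for i in range(size)]
    vector = [1.0] * size
    estimate = 1.0
    for _ in range(iterations):
        image = [sum(shifted[i][j] * vector[j] for j in range(size))
                 for i in range(size)]
        estimate = max(image)  # vector is scaled so that its largest entry is 1
        vector = [value / estimate for value in image]
    return estimate - 1.0


if __name__ == "__main__":
    mats = four_tuple_matrices("HHHHCCEEEECCHHHH")
    invariants = {key: leading_eigenvalue(m) for key, m in mats.items()}
    print(invariants)
```

The resulting nine numbers form a fixed-length descriptor of a secondary structure string that could be fed to a classifier of structural classes; whether this matches the authors' descriptor depends on the layout assumption stated above.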

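To address the low accuracy of traditional methods in protein secondary structure classification, a convolutional neural network image classification algorithm based on the grey wolf optimizer is introduced. First, the parameters of the convolutional neural network model that need to be optimized are selected, and the number of iterations, the number of wolves, the search boundaries and the spatial dimension of the grey wolf optimizer are initialized. Second, the individual fitness function of the parameters being optimized is computed, the individuals are ranked by fitness, the best, second-best and third-best solutions found so far are determined, and the positions of the wolves are updated. Finally, ...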
