Similar literature
 20 similar documents found (search time: 15 ms)
1.
Structures of homologous proteins are usually conserved during evolution, as are critical active site residues. This is the case for actin and tubulin, the two most important cytoskeleton proteins in eukaryotes. Actins and their related proteins (Arps) constitute a large superfamily, whereas the tubulin family has fewer members. Unaligned sequences of these two protein families were analysed by searching for short groups of family-specific amino acid residues, which we call motifs, and by counting the number of residues from one motif to the next. For each sequence, the set of motif-to-motif residue counts forms a subfamily-specific pattern (landmark pattern) that allows actin and tubulin superfamily members to be identified and sorted into subfamilies. The differences between the patterns of individual subfamilies are due to inserts and deletions (indels). Inserts appear to have arisen at an early stage in eukaryote evolution, as suggested by the small but consistent kingdom-dependent differences found within many Arp subfamilies and in γ-tubulins. Inserts tend to be in surface loops, where they can influence subfamily-specific function without disturbing the core structure of the protein. The relatively few indels found for tubulins have positions similar to established results, whereas we find many previously unreported indel positions and lengths for the metazoan Arps.
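The motif-to-motif counting described in this abstract can be illustrated with a minimal Python sketch. The motifs and sequences below are hypothetical toy strings, not real actin or tubulin data, and the function name `landmark_pattern` is an assumption made for illustration; the authors' actual motif definitions and search procedure are not given in the abstract.

```python
def landmark_pattern(sequence, motifs):
    """Return residue counts from the start of each motif to the start of the next.

    Motifs are searched in order along the unaligned sequence. If any motif
    is missing, no pattern can be formed and None is returned.
    """
    positions = []
    search_from = 0
    for motif in motifs:
        idx = sequence.find(motif, search_from)
        if idx == -1:
            return None
        positions.append(idx)
        search_from = idx + len(motif)
    return [b - a for a, b in zip(positions, positions[1:])]


# Hypothetical motifs and toy sequences (assumptions for illustration only).
MOTIFS = ["DNGSG", "GQK", "LTE"]
reference = "MADNGSGAAAAGQKCCLTEV"
with_insert = "MADNGSGAAAAPPPGQKCCLTEV"  # same sequence with a 3-residue insert

print(landmark_pattern(reference, MOTIFS))    # [9, 5]
print(landmark_pattern(with_insert, MOTIFS))  # [12, 5]
```

In this toy case the insert changes only the first count (9 to 12), which is the kind of difference that, according to the abstract, distinguishes one subfamily pattern from another.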

2.

Background  

The functional selection and three-dimensional structural constraints of proteins in nature often relate to the retention of significant sequence similarity between proteins of similar fold and function despite poor sequence identity. Organization of structure-based sequence alignments for distantly related proteins provides a map of the conserved and critical regions of the protein universe that is useful for the analysis of folding principles, for the evolutionary unification of protein families and for maximizing the information return from experimental structure determination. The Protein Alignment organised as Structural Superfamily (PASS2) database represents continuously updated structural alignments for evolutionarily related, sequentially distant proteins.

3.
Protein Structural Interactome map (PSIMAP) is a global interaction map that describes domain-domain and protein-protein interaction information for known Protein Data Bank structures. It calculates the Euclidean distance to determine interactions between possible pairs of structural domains in proteins. PSIbase is a database and file server for protein structural interaction information calculated by the PSIMAP algorithm. PSIbase also provides an easy-to-use protein domain assignment module, interaction navigation and visual tools. Users can retrieve possible interaction partners of their proteins of interest if a significant homology assignment is made with their query sequences. AVAILABILITY: http://psimap.org and http://psibase.kaist.ac.kr/
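A distance-based interaction test of the kind this abstract describes might look like the following Python sketch. The abstract does not state the cutoff or the contact criterion, so the 5.0 distance cutoff, the minimum of five contacts, the function names and the toy coordinates are all assumptions for illustration, not PSIMAP's documented parameters.

```python
import math


def count_contacts(domain_a, domain_b, cutoff):
    """Count atom pairs (one atom from each domain) no farther apart than cutoff."""
    return sum(
        1
        for atom_a in domain_a
        for atom_b in domain_b
        if math.dist(atom_a, atom_b) <= cutoff
    )


def domains_interact(domain_a, domain_b, cutoff=5.0, min_contacts=5):
    """Call two domains interacting if enough atom pairs are within the cutoff.

    Both default values are illustrative assumptions.
    """
    return count_contacts(domain_a, domain_b, cutoff) >= min_contacts


# Toy (x, y, z) coordinates, not taken from any PDB entry.
domain_a = [(0, 0, 0), (1, 0, 0), (2, 0, 0)]
domain_b = [(0, 3, 0), (1, 3, 0), (30, 0, 0)]
far_domain = [(50, 0, 0)]

print(count_contacts(domain_a, domain_b, 5.0))  # 6
print(domains_interact(domain_a, domain_b))     # True
print(domains_interact(domain_a, far_domain))   # False
```

The all-pairs loop is quadratic in the number of atoms; a spatial index would be the usual choice for whole-database scans, but it is left out here to keep the distance criterion visible.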

4.

Background  

Mass spectrometry based peptide mass fingerprints (PMFs) offer a fast, efficient, and robust method for protein identification. A protein is digested (usually by trypsin) and its mass spectrum is compared to simulated spectra for protein sequences in a database. However, existing tools for analyzing PMFs often suffer from missing or heuristic analysis of the significance of search results and insufficient handling of missing and additional peaks.
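The digest-and-compare step described above can be sketched in Python as follows. This is a simplified illustration, not the method of the paper: it assumes the common trypsin rule (cleave after K or R unless the next residue is P), no missed cleavages, neutral monoisotopic masses rather than charged-ion m/z values, and a naive matched-peak count as the score. The function names and the 0.2 tolerance are likewise assumptions.

```python
# Monoisotopic residue masses in daltons.
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # one water is added per free peptide


def tryptic_digest(protein):
    """Cleave after K or R, except when the next residue is P (assumed rule)."""
    peptides, start = [], 0
    for i, residue in enumerate(protein):
        next_residue = protein[i + 1] if i + 1 < len(protein) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(protein[start:i + 1])
            start = i + 1
    if start < len(protein):
        peptides.append(protein[start:])
    return peptides


def peptide_mass(peptide):
    """Neutral monoisotopic mass of a peptide."""
    return sum(MONO[r] for r in peptide) + WATER


def matched_peaks(observed, protein, tolerance=0.2):
    """Count observed masses that lie within tolerance of any theoretical peptide."""
    theoretical = [peptide_mass(p) for p in tryptic_digest(protein)]
    return sum(
        any(abs(obs - t) <= tolerance for t in theoretical) for obs in observed
    )


print(tryptic_digest("MKAARPLKGG"))                 # ['MK', 'AARPLK', 'GG']
print(matched_peaks([132.05, 500.0], "MKAARPLKGG"))  # 1
```

A raw match count like this is exactly the kind of heuristic score the abstract criticises: it says nothing about statistical significance and treats missing and additional peaks only implicitly.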

5.

Background  

Several methods are currently available for the comparison of protein structures. These methods have been analysed regarding their performance in the identification of structurally/evolutionarily related proteins, but so far there has been less focus on the objective comparison between the alignments produced by different methods.

6.
7.
The evolution of proteins is one of the fundamental processes that has delivered the diversity and complexity of life we see around ourselves today. While we tend to define protein evolution in terms of sequence level mutations, insertions and deletions, it is hard to translate these processes to a more complete picture incorporating a polypeptide's structure and function. By considering how protein structures change over time we can gain an entirely new appreciation of their long-term evolutionary dynamics. In this work we seek to identify how populations of proteins at different stages of evolution explore their possible structure space. We apply an annotation of superfamily age to this space and explore the relationship between these ages and a diverse set of properties pertaining to a superfamily's sequence, structure and function. We note several marked differences between the populations of newly evolved and ancient structures, such as in their length distributions, secondary structure content and tertiary packing arrangements. In particular, many of these differences suggest a less elaborate structure for newly evolved superfamilies when compared with their ancient counterparts. We show that the structural preferences we report are not a residual effect of a more fundamental relationship with function. Furthermore, we demonstrate the robustness of our results, using significant variation in the algorithm used to estimate the ages. We present these age estimates as a useful tool to analyse protein populations. In particular, we apply this in a comparison of domains containing greek key or jelly roll motifs.

8.
A technique for prediction of protein membrane topology (intra- and extracellular sidedness) has been developed. Membrane-spanning segments are first predicted using an algorithm based upon multiply aligned amino acid sequences. The compositional differences in the protein segments exposed at each side of the membrane are then investigated. The ratios are calculated for Asn, Asp, Gly, Phe, Pro, Trp, Tyr, and Val, mostly found on the extracellular side, and for Ala, Arg, Cys, and Lys, mostly occurring on the intracellular side. The consensus over these 12 residue distributions is used for sidedness prediction. The method was developed with a set of 42 protein families, of which all but one were correctly predicted with the new algorithm. This represents an improvement over previous techniques. The new method, applied to a set of 12 membrane protein families different from the test set and with recently determined topologies, performed well, with 11 of 12 sidedness assignments agreeing with experimental results. The method has also been applied to several membrane protein families for which the topology has yet to be determined. An electronic prediction service is available at the E-mail address tmap@embl-heidelberg.de and on WWW via http://www.emblheidelberg.de.
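The consensus idea in this abstract can be sketched in Python as a simple vote over the twelve residue types. The abstract does not specify how the ratios are combined, so the one-vote-per-residue majority rule below is an assumption, as are the function names and the toy loop segments; only the two residue lists come from the abstract.

```python
OUTSIDE_RESIDUES = set("NDGFPWYV")  # Asn, Asp, Gly, Phe, Pro, Trp, Tyr, Val
INSIDE_RESIDUES = set("ARCK")       # Ala, Arg, Cys, Lys


def frequency(residue, segments):
    """Fraction of all residues in the given segments that equal `residue`."""
    text = "".join(segments)
    return text.count(residue) / len(text) if text else 0.0


def predict_outside(side_a, side_b):
    """Return 'A' or 'B' for the side predicted to be extracellular, None on a tie.

    side_a and side_b are lists of non-membrane segments exposed on the two
    sides of the membrane. Each of the 12 residue types casts one vote
    (assumed majority rule).
    """
    votes_a = votes_b = 0
    for residue in OUTSIDE_RESIDUES | INSIDE_RESIDUES:
        freq_a = frequency(residue, side_a)
        freq_b = frequency(residue, side_b)
        if freq_a == freq_b:
            continue  # no information from this residue type
        enriched_on_a = freq_a > freq_b
        # An outside-type residue enriched on A, or an inside-type residue
        # enriched on B, both count as a vote for A being extracellular.
        if enriched_on_a == (residue in OUTSIDE_RESIDUES):
            votes_a += 1
        else:
            votes_b += 1
    if votes_a == votes_b:
        return None
    return "A" if votes_a > votes_b else "B"


# Toy loop segments, not from any real membrane protein.
loops_a = ["NDGWY", "FPV"]
loops_b = ["AKRC", "KKA"]
print(predict_outside(loops_a, loops_b))  # A
print(predict_outside(loops_b, loops_a))  # B
```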

9.
10.
Predicting protein function from structure remains an active area of interest, particularly for the structural genomics initiatives where a substantial number of structures are initially solved with little or no functional characterisation. Although global structure comparison methods can be used to transfer functional annotations, the relationship between fold and function is complex, particularly in functionally diverse superfamilies that have evolved through different secondary structure embellishments to a common structural core. The majority of prediction algorithms employ local templates built on known or predicted functional residues. Here, we present a novel method (FLORA) that automatically generates structural motifs associated with different functional sub-families (FSGs) within functionally diverse domain superfamilies. Templates are created purely on the basis of their specificity for a given FSG, and the method makes no prior prediction of functional sites, nor assumes specific physico-chemical properties of residues. FLORA is able to accurately discriminate between homologous domains with different functions and substantially outperforms (a 2–3 fold increase in coverage at low error rates) popular structure comparison methods and a leading function prediction method. We benchmark FLORA on a large data set of enzyme superfamilies from all three major protein classes (α, β, αβ) and demonstrate the functional relevance of the motifs it identifies. We also provide novel predictions of enzymatic activity for a large number of structures solved by the Protein Structure Initiative. Overall, we show that FLORA is able to effectively detect functionally similar protein domain structures by purely using patterns of structural conservation of all residues.

11.
The dramatic increase in heterogeneous types of biological data, in particular the abundance of new protein sequences, requires fast and user-friendly methods for organizing this information in a way that enables functional inference. The most widely used strategy to link sequence or structure to function, homology-based function prediction, relies on the fundamental assumption that sequence or structural similarity implies functional similarity. New tools that extend this approach are still urgently needed to associate sequence data with biological information in ways that accommodate the real complexity of the problem, while being accessible to experimental as well as computational biologists. To address this, we have examined the application of sequence similarity networks for visualizing functional trends across protein superfamilies from the context of sequence similarity. Using three large groups of homologous proteins of varying types of structural and functional diversity (GPCRs and kinases from humans, and the crotonase superfamily of enzymes), we show that overlaying networks with orthogonal information is a powerful approach for observing functional themes and revealing outliers. In comparison to other primary methods, networks provide both a good representation of group-wise sequence similarity relationships and a strong visual and quantitative correlation with phylogenetic trees, while enabling analysis and visualization of much larger sets of sequences than trees or multiple sequence alignments can easily accommodate. We also define important limitations and caveats in the application of these networks. As a broadly accessible and effective tool for the exploration of protein superfamilies, sequence similarity networks show great potential for generating testable hypotheses about protein structure-function relationships.
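A sequence similarity network of the kind discussed here is a thresholded graph: sequences are nodes, and an edge joins any pair whose similarity passes a cutoff. The Python sketch below is a toy illustration under stated assumptions: it scores similarity as the fraction of identical positions between equal-length strings, whereas real networks are typically built from pairwise alignment scores. The sequences, names and 0.8 threshold are hypothetical.

```python
from itertools import combinations


def identity(a, b):
    """Fraction of identical positions between two equal-length sequences."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def similarity_network(sequences, threshold):
    """Build an adjacency map with an edge wherever identity >= threshold."""
    graph = {name: set() for name in sequences}
    for (n1, s1), (n2, s2) in combinations(sequences.items(), 2):
        if identity(s1, s2) >= threshold:
            graph[n1].add(n2)
            graph[n2].add(n1)
    return graph


def clusters(graph):
    """Return connected components, each as a sorted list of node names."""
    seen, result = set(), []
    for start in graph:
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(graph[node] - component)
        seen |= component
        result.append(sorted(component))
    return result


# Hypothetical toy sequences (not real GPCR, kinase or crotonase data).
toy = {
    "a1": "ACDEFGHIKL",
    "a2": "ACDEFGHIKV",
    "b1": "WWYYPPNNQQ",
    "b2": "WWYYPPNNQE",
    "c1": "MMMMMMMMMM",
}
network = similarity_network(toy, threshold=0.8)
print(clusters(network))  # [['a1', 'a2'], ['b1', 'b2'], ['c1']]
```

The singleton `c1` shows how an outlier stands apart at a given threshold; lowering the threshold merges clusters, which is one reason the choice of cutoff is among the caveats such networks carry.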

12.
Predicting protein structure from primary sequence is one of the ultimate challenges in computational biology. Given the large amount of available sequence data, the analysis of co-evolution, i.e., statistical dependency, between columns in multiple alignments of protein domain sequences remains one of the most promising avenues for predicting residues that are contacting in the structure. A key impediment to this approach is that strong statistical dependencies are also observed for many residue pairs that are distal in the structure. Using a comprehensive analysis of protein domains with available three-dimensional structures we show that co-evolving contacts very commonly form chains that percolate through the protein structure, inducing indirect statistical dependencies between many distal pairs of residues. We characterize the distributions of length and spatial distance traveled by these co-evolving contact chains and show that they explain a large fraction of observed statistical dependencies between structurally distal pairs. We adapt a recently developed Bayesian network model into a rigorous procedure for disentangling direct from indirect statistical dependencies, and we demonstrate that this method not only successfully accomplishes this task, but also allows contacts with weak statistical dependency to be detected. To illustrate how additional information can be incorporated into our method, we incorporate a phylogenetic correction, and we develop an informative prior that takes into account that the probability for a pair of residues to contact depends strongly on their primary-sequence distance and the amount of conservation that the corresponding columns in the multiple alignment exhibit. We show that our model including these extensions dramatically improves the accuracy of contact prediction from multiple sequence alignments.
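The "statistical dependency between columns" mentioned above is often quantified with mutual information. The Python sketch below computes plain mutual information between two alignment columns; this is a swapped-in baseline for illustration, not the Bayesian network model of the abstract, and it makes no attempt to separate direct from indirect dependencies. The function name and toy columns are assumptions.

```python
import math
from collections import Counter


def mutual_information(column_i, column_j):
    """Mutual information (in nats) between two equal-length alignment columns."""
    n = len(column_i)
    pair_counts = Counter(zip(column_i, column_j))
    counts_i = Counter(column_i)
    counts_j = Counter(column_j)
    mi = 0.0
    for (a, b), count in pair_counts.items():
        p_ab = count / n
        # p_ab / (p_a * p_b), written with raw counts to avoid extra divisions
        mi += p_ab * math.log(count * n / (counts_i[a] * counts_j[b]))
    return mi


# Toy columns: each string holds one column read down four aligned sequences.
print(mutual_information("AACC", "GGTT"))  # perfectly coupled: ln 2
print(mutual_information("AACC", "GTGT"))  # independent: 0.0
```

Because raw mutual information is also high for pairs linked only through a chain of intermediate contacts, scores like this one show the impediment the abstract describes: strong dependency between residues that are distal in the structure.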

13.
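A classification method and classification database for protein fold types   Total citations: 1; self-citations: 0; citations by others: 1
李晓琴, 仁文科, 刘岳, 徐海松, 乔辉. 《生物信息学》, 2010, 8(3): 245-247, 253
The study of protein folding principles is a major frontier topic in the life sciences, and fold classification is the foundation of protein folding research. At present, protein fold types are classified largely by experts, and the classifications differ from one database to another, so a protein fold type database built on unified principles is urgently needed. Starting from the proteins in the ASTRAL-1.65 database with sequence homology below 25% and resolution below 2.5, and by observing protein spatial structures and analysing the characteristics of fold types, this paper proposes a fold type classification method that is centred on the protein folding core, takes the topological invariance of protein structure as its principle, and is based on the composition, connectivity and spatial arrangement of the regular structural segments of the folding core. A low-similarity protein fold classification database, LIFCA, containing 259 protein fold types, was established. The database will lay a foundation for further protein fold modelling and data mining, protein fold recognition, and studies of the evolution of protein fold structures.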

14.
The Yeast Protein Database (YPD) is a database for the proteins of the budding yeast, Saccharomyces cerevisiae. YPD is the first annotated database for the complete proteome of any organism. Now that the complete genome sequence of yeast is available, YPD contains entries for each of the characterized proteins and for each of the uncharacterized proteins predicted from the sequence. Contained in YPD are the calculated properties of each protein such as molecular weight and isoelectric point, experimentally determined properties such as subcellular localization and post-translational modifications, and extensive annotations from the yeast literature. YPD contains 25 000 lines of textual annotation that describe the known functions, mutant phenotypes, interactions, and other properties for the approximately 6000 proteins in the yeast proteome. The information in YPD is updated daily, and it is available on the World Wide Web at http://www.proteome.com/YPDhome.html .

15.
Detecting similarities between ligand binding sites in the absence of global homology between target proteins has been recognized as one of the critical components of modern drug discovery. Local binding site alignments can be constructed using sequence order-independent techniques, however, to achieve a high accuracy, many current algorithms for binding site comparison require high-quality experimental protein structures, preferably in the bound conformational state. This, in turn, complicates proteome scale applications, where only various quality structure models are available for the majority of gene products. To improve the state-of-the-art, we developed eMatchSite, a new method for constructing sequence order-independent alignments of ligand binding sites in protein models. Large-scale benchmarking calculations using adenine-binding pockets in crystal structures demonstrate that eMatchSite generates accurate alignments for almost three times more protein pairs than SOIPPA. More importantly, eMatchSite offers a high tolerance to structural distortions in ligand binding regions in protein models. For example, the percentage of correctly aligned pairs of adenine-binding sites in weakly homologous protein models is only 4–9% lower than those aligned using crystal structures. This represents a significant improvement over other algorithms, e.g. the performance of eMatchSite in recognizing similar binding sites is 6% and 13% higher than that of SiteEngine using high- and moderate-quality protein models, respectively. Constructing biologically correct alignments using predicted ligand binding sites in protein models opens up the possibility to investigate drug-protein interaction networks for complete proteomes with prospective systems-level applications in polypharmacology and rational drug repositioning. eMatchSite is freely available to the academic community as a web-server and a stand-alone software distribution at http://www.brylinski.org/ematchsite.
This is a PLOS Computational Biology Software Article

16.
Whole genome sequencing of the free-living nematode Caenorhabditis elegans is a prominent achievement in genomics and has uncovered an enormous number of known and unknown gene products. Characterizing and linking all gene products is the next challenge for biology. Genome-wide research on C. elegans is already progressing, and the fruits of these efforts are accessible through the internet. To link sequence to function, proteomic research has been applied to provide comprehensive information on the worm proteins. In addition to 2-dimensional gel electrophoresis for visualization of the proteome, recent advances in liquid chromatography (LC)-based technologies have allowed the large-scale analysis of proteins and are at the cutting edge of high-throughput analysis of focused proteomes.

17.
Both the forward and inverse problems of electrocardiography rely on the precise modelling of the anatomic and electrical properties of the thoracic tissues. This, in turn, requires good knowledge of the electrical anisotropy as well as conductivity inhomogeneity of the heart, lungs and the rest of the thorax. Cardiac electrical anisotropy is related to its microstructure (fibre length, density and orientation). We hereby present detailed three-dimensional (3D) meshes of the thorax and heart, using image data from contiguous 2D magnetic resonance (MR) imaging slices as well as a realistic 3D cardiac fibre orientation model that derives its data from high-resolution ex vivo human heart MR images and from histology specimens of heart tissue. Using specific software, we integrated the 3D thorax and heart meshes into a single mesh that addresses the related modelling requirements for the solution of the forward and inverse problems of electrocardiography.

18.
CK2 is probably the most pleiotropic Ser/Thr protein kinase, with hundreds of endogenous substrates already known, which are implicated in a variety of cellular functions. At variance with most protein kinases, whose activity is turned on only in response to specific stimuli and whose genetic alterations often underlie pathological situations, CK2 is not susceptible to tight regulation and there are no mutations known to affect its constitutive activity. Nevertheless, an abnormally high level of CK2 is invariably found in tumours, and solid arguments have accumulated suggesting that CK2 plays a global pro-survival function, which under special circumstances creates a cellular environment particularly favourable to the development and potentiation of the tumour phenotype. Therefore any strategy aimed at attenuating CK2 activity may represent a "master key" for the treatment of different neoplastic diseases. Pending clarification of the epigenetic mechanisms promoting the rise of CK2 in cells predisposed to develop a tumour phenotype, a useful pharmacological aid can come from the improvement of a number of fairly potent and selective CK2 inhibitors already available.

19.
ProADD, a database for protein aggregation diseases, was developed to organize the data under a single platform and give researchers easy access. It integrates diseases caused by protein aggregation and the proteins involved in each of these diseases. The database helps classify the proteins involved in protein aggregation diseases on the basis of sequence and structural analysis. The proteins can also be analysed to mine patterns prevailing among the aggregating proteins.

Availability

http://bicmku.in/ProADD

20.