1.

Automated selection of positions determining functional specificity of proteins by comparative analysis of orthologous groups in protein families

Kalinina OV Mironov AA Gelfand MS Rakhmaninova AB 《Protein science : a publication of the Protein Society》2004,13(2):443-456

2.

Using multiple sequence correlation analysis to characterize functionally important protein regions

Saraf MC Moore GL Maranas CD 《Protein engineering》2003,16(6):397-406

Protein co-evolution under structural and functional constraints necessitates the preservation of important interactions. Identifying functionally important regions poses many obstacles in protein engineering efforts. In this paper, we present a bioinformatics-inspired approach (residue correlation analysis, RCA) for predicting functionally important domains from protein family sequence data. RCA comprises two major steps: (i) identifying pairs of residue positions that mutate in a coordinated manner, and (ii) using these results to identify protein regions that interact with an uncommonly high number of other residues. We hypothesize that strongly correlated pairs result not only from contacting pairs, but also from residues that participate in conformational changes involved during catalysis or important interactions necessary for retaining functionality. The results show that highly mobile loops that assist in ligand association/dissociation tend to exhibit high correlation. RCA results exhibit good agreement with the findings of experimental and molecular dynamics studies for the three protein families that are analyzed: (i) DHFR (dihydrofolate reductase), (ii) cyclophilin, and (iii) formyl-transferase. Specifically, the specificity (percentage of correct predictions) in all three cases is substantially higher than that obtained by entropic measures or contacting residue pairs. In addition, we use our approach in a predictive fashion to identify important regions of a transmembrane amino acid transporter protein for which there is limited structural and functional information available.
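
The two RCA steps described in the abstract above can be illustrated with a minimal sketch. The abstract does not say which correlation measure RCA uses, so mutual information between alignment columns is swapped in here as a stand-in; the function names, the toy alignment, and the threshold are all hypothetical.

```python
from collections import Counter
from itertools import combinations
from math import log2


def column(alignment, i):
    """Return the residues found at alignment position i."""
    return [seq[i] for seq in alignment]


def mutual_information(col_a, col_b):
    """Mutual information (in bits) between two alignment columns.

    Assumption: this is a stand-in correlation measure, not the one RCA uses.
    """
    n = len(col_a)
    pa, pb = Counter(col_a), Counter(col_b)
    pab = Counter(zip(col_a, col_b))
    return sum(
        (c / n) * log2((c / n) / ((pa[a] / n) * (pb[b] / n)))
        for (a, b), c in pab.items()
    )


def correlated_partner_counts(alignment, threshold):
    """Step (i): flag column pairs whose score reaches the threshold.
    Step (ii): count, per position, how many correlated partners it has."""
    length = len(alignment[0])
    counts = [0] * length
    for i, j in combinations(range(length), 2):
        score = mutual_information(column(alignment, i), column(alignment, j))
        if score >= threshold:
            counts[i] += 1
            counts[j] += 1
    return counts


# Toy gap-free alignment: positions 0 and 1 change together, 2 is invariant,
# and 3 varies independently of the others.
toy_alignment = ["AKLD", "AKLE", "GRLD", "GRLE"]
partner_counts = correlated_partner_counts(toy_alignment, threshold=0.9)
```

Positions with many correlated partners would then be the candidates for functionally important regions; a real analysis would also need gap handling and a significance test, which this sketch omits.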

3.

Prediction of functional specificity determinants from protein sequences using log-likelihood ratios

Pei J Cai W Kinch LN Grishin NV 《Bioinformatics (Oxford, England)》2006,22(2):164-171

MOTIVATION: A number of methods have been developed to predict functional specificity determinants in protein families based on sequence information. Most of these methods rely on pre-defined functional subgroups. Manual subgroup definition is difficult because of the limited number of experimentally characterized subfamilies with differing specificity, while automatic subgroup partitioning using computational tools is a non-trivial task and does not always yield ideal results. RESULTS: We propose a new approach, SPEL (specificity positions by evolutionary likelihood), to detect positions that are likely to be functional specificity determinants. SPEL, which does not require subgroup definition, takes a multiple sequence alignment of a protein family as the only input, and assigns a P-value to every position in the alignment. Positions with low P-values are likely to be important for functional specificity. An evolutionary tree is reconstructed during the calculation, and P-value estimation is based on a random model that involves evolutionary simulations. Evolutionary log-likelihood is chosen as a measure of amino acid distribution at a position. To illustrate the performance of the method, we carried out a detailed analysis of two protein families (LacI/PurR and G protein alpha subunit), and compared our method with two existing methods (evolutionary trace and mutual-information-based). All three methods were also compared on a set of protein families with known ligand-bound structures. AVAILABILITY: SPEL is freely available for non-commercial use. Its pre-compiled versions for several platforms and alignments used in this work are available at ftp://iole.swmed.edu/pub/SPEL/

4.

Common and specific amino acid residues in the prokaryotic polypeptide release factors RF1 and RF2: possible functional implications

Oparina NJ Kalinina OV Gelfand MS Kisselev LL 《Nucleic acids research》2005,33(16):5226-5234

Termination of protein synthesis is promoted in ribosomes by proper stop codon discrimination by class 1 polypeptide release factors (RFs). A large set of prokaryotic RFs differing in stop codon specificity, RF1 for UAG and UAA, and RF2 for UGA and UAA, was analyzed by means of a recently developed computational method allowing identification of the specificity-determining positions (SDPs) in families composed of proteins with similar but not identical function. Fifteen SDPs were identified within the RF1/2 superdomain II/IV known to be implicated in stop codon decoding. Three of these SDPs had particularly high scores. Five residues invariant for RF1 and RF2 [invariant amino acid residues (IRs)] were spatially clustered with the highest-scoring SDPs that in turn were located in two zones within the SDP/IR area. Zone 1 (domain II) included PxT and SPF motifs identified earlier by others as 'discriminator tripeptides'. We suggest that IRs in this zone take part in the recognition of U, the first base of all stop codons. Zone 2 (domain IV) possessed two SDPs with the highest scores not identified earlier. Presumably, they also take part in stop codon binding and discrimination. Elucidation of potential functional role(s) of the newly identified SDP/IR zones requires further experiments.

5.

In silico discovery of enzyme-substrate specificity-determining residue clusters

Yu GX Park BH Chandramohan P Munavalli R Geist A Samatova NF 《Journal of molecular biology》2005,352(5):1105-1117

The binding between an enzyme and its substrate is highly specific, despite the fact that many different enzymes show significant sequence and structure similarity. There must be, then, substrate specificity-determining residues that enable different enzymes to recognize their unique substrates. We reason that a coordinated, not independent, action of both conserved and non-conserved residues determines enzymatic activity and specificity. Here, we present a surface patch ranking (SPR) method for in silico discovery of substrate specificity-determining residue clusters by exploring both sequence conservation and correlated mutations. As case studies we apply SPR to several highly homologous enzymatic protein pairs, such as guanylyl versus adenylyl cyclases, lactate versus malate dehydrogenases, and trypsin versus chymotrypsin. Without using experimental data, we predict several single and multi-residue clusters that are consistent with previous mutagenesis experimental results. Most single-residue clusters are directly involved in enzyme-substrate interactions, whereas multi-residue clusters are vital for domain-domain and regulator-enzyme interactions, indicating their complementary role in specificity determination. These results demonstrate that SPR may help the selection of target residues for mutagenesis experiments and, thus, focus rational drug design, protein engineering, and functional annotation to the relevant regions of a protein.

6.

Prediction of amino acid positions specific for functional groups in a protein family based on local sequence similarity

Dmitry A. Karasev Alexander V. Veselovsky Nina Yu. Oparina Dmitry A. Filimonov Boris N. Sobolev 《Journal of molecular recognition : JMR》2016,29(4):159-169

7.

Computational approaches to predict protein functional families and functional sites

《Current opinion in structural biology》2021

Understanding the mechanisms of protein function is indispensable for many biological applications, such as protein engineering and drug design. However, experimental annotations are sparse, and therefore, theoretical strategies are needed to fill the gap. Here, we present the latest developments in building functional subclassifications of protein superfamilies and using evolutionary conservation to detect functional determinants, for example, catalytic-, binding- and specificity-determining residues important for delineating the functional families. We also briefly review other features exploited for functional site detection and new machine learning strategies for combining multiple features.

8.

Experimental identification of specificity determinants in the domain linker of a LacI/GalR protein: bioinformatics-based predictions generate true positives and false negatives

Meinhardt S Swint-Kruse L 《Proteins》2008,73(4):941-957

9.

Text mining improves prediction of protein functional sites

Verspoor KM Cohn JD Ravikumar KE Wall ME 《PloS one》2012,7(2):e32171

We present an approach that integrates protein structure analysis and text mining for protein functional site prediction, called LEAP-FS (Literature Enhanced Automated Prediction of Functional Sites). The structure analysis was carried out using Dynamics Perturbation Analysis (DPA), which predicts functional sites at control points where interactions greatly perturb protein vibrations. The text mining extracts mentions of residues in the literature, and predicts that residues mentioned are functionally important. We assessed the significance of each of these methods by analyzing their performance in finding known functional sites (specifically, small-molecule binding sites and catalytic sites) in about 100,000 publicly available protein structures. The DPA predictions recapitulated many of the functional site annotations and preferentially recovered binding sites annotated as biologically relevant vs. those annotated as potentially spurious. The text-based predictions were also substantially supported by the functional site annotations: compared to other residues, residues mentioned in text were roughly six times more likely to be found in a functional site. The overlap of predictions with annotations improved when the text-based and structure-based methods agreed. Our analysis also yielded new high-quality predictions of many functional site residues that were not catalogued in the curated data sources we inspected. We conclude that both DPA and text mining independently provide valuable high-throughput protein functional site predictions, and that integrating the two methods using LEAP-FS further improves the quality of these predictions.

10.

Automatic methods for predicting functionally important residues (total citations: 9; self-citations: 0; citations by others: 9)

del Sol A del Sol Mesa A Pazos F Valencia A 《Journal of molecular biology》2003,326(4):1289-1302

Sequence analysis is often the first guide for the prediction of residues in a protein family that may have functional significance. A few methods have been proposed which use the division of protein families into subfamilies in the search for those positions that could have some functional significance for the whole family, but at the same time which exhibit the specificity of each subfamily ("Tree-determinant residues"). However, there are still many unsolved questions like the best division of a protein family into subfamilies, or the accurate detection of sequence variation patterns characteristic of different subfamilies. Here we present a systematic study in a significant number of protein families, testing the statistical meaning of the Tree-determinant residues predicted by three different methods that represent the range of available approaches. The first method takes as a starting point a phylogenetic representation of a protein family and, following the principle of Relative Entropy from Information Theory, automatically searches for the optimal division of the family into subfamilies. The second method looks for positions whose mutational behavior is reminiscent of the mutational behavior of the full-length proteins, by directly comparing the corresponding distance matrices. The third method is an automation of the analysis of distribution of sequences and amino acid positions in the corresponding multidimensional spaces using a vector-based principal component analysis. These three methods have been tested on two non-redundant lists of protein families: one composed of proteins that bind a variety of ligand groups, and the other composed of proteins with annotated functionally relevant sites. In most cases, the residues predicted by the three methods show a clear tendency to be close to bound ligands of biological relevance and to those amino acids described as participants in key aspects of protein function. These three automatic methods provide a wide range of possibilities for biologists to analyze their families of interest, in a similar way to the one presented here for the family of proteins related to ras-p21.
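
The idea behind the second method in the abstract above (comparing a position's distance matrix with the full-length distance matrix) can be sketched roughly as follows. This is only an illustration under stated assumptions: a 0/1 mismatch per position, Hamming distance for whole sequences, and Pearson correlation as the comparison are all choices made here, not details taken from the paper, and every name is hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either input has no variance."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def position_vs_family_scores(alignment):
    """Score each position by how well its pairwise mismatches track the
    pairwise distances between the full-length sequences."""
    pairs = list(combinations(range(len(alignment)), 2))
    # Full-length distance for every sequence pair (Hamming, an assumption).
    full = [
        sum(a != b for a, b in zip(alignment[i], alignment[j]))
        for i, j in pairs
    ]
    scores = []
    for pos in range(len(alignment[0])):
        # Per-position distance: 1 if the two residues differ, else 0.
        local = [int(alignment[i][pos] != alignment[j][pos]) for i, j in pairs]
        scores.append(pearson(local, full))
    return scores


# Toy gap-free alignment with two subfamilies (first two vs last two rows).
toy_alignment = ["ADKLM", "ADKLV", "GERLM", "GERLV"]
scores = position_vs_family_scores(toy_alignment)
```

In this toy case the positions that separate the two subfamilies score highest, the invariant position scores zero, and the position that varies within subfamilies scores negatively.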

11.

Computational method for prediction of protein functional sites using specificity determinants

Kalinina OV Rassel RB Rakhmaninova AB Gel'fand MS 《Molekuliarnaia biologiia》2007,41(1):151-162

The currently available data on protein sequences largely exceed the experimental capabilities to annotate their function. So annotation in silico, i.e. using computational methods, becomes increasingly important. This annotation is inevitably a prediction, but it can be an important starting point for further experimental studies. Here we present a method for prediction of protein functional sites, SDPsite, based on the identification of protein specificity determinants. Taking as an input a protein sequence alignment and a phylogenetic tree, the algorithm predicts conserved positions and specificity determinants, maps them onto the protein's 3D structure, and searches for clusters of the predicted positions. Comparison of the obtained predictions with experimental data and data on performance of several other methods for prediction of functional sites reveals that SDPsite agrees well with the experiment and outperforms most of the previously available methods. SDPsite is publicly available under http://bioinf.fbb.msu.ru/SDPsite.

12.

Using orthologous and paralogous proteins to identify specificity-determining residues in bacterial transcription factors

Mirny LA Gelfand MS 《Journal of molecular biology》2002,321(1):7-20

13.

Isofunctional Protein Subfamily Detection Using Data Integration and Spectral Clustering

Elisa Boari de Lima Wagner Meira Júnior Raquel Cardoso de Melo-Minardi 《PLoS computational biology》2016,12(6)

As increasingly more genomes are sequenced, the vast majority of proteins may only be annotated computationally, given experimental investigation is extremely costly. This highlights the need for computational methods to determine protein functions quickly and reliably. We believe dividing a protein family into subtypes which share specific functions uncommon to the whole family reduces the function annotation problem’s complexity. Hence, this work’s purpose is to detect isofunctional subfamilies inside a family of unknown function, while identifying differentiating residues. Similarity between protein pairs according to various properties is interpreted as functional similarity evidence. Data are integrated using genetic programming and provided to a spectral clustering algorithm, which creates clusters of similar proteins. The proposed framework was applied to well-known protein families and to a family of unknown function, then compared to ASMC. Results showed our fully automated technique obtained better clusters than ASMC for two families, besides equivalent results for two others, including one whose clusters were manually defined. Clusters produced by our framework showed great correspondence with the known subfamilies, besides being more contrasting than those produced by ASMC. Additionally, for the families whose specificity determining positions are known, such residues were among those our technique considered most important to differentiate a given group. When run with the crotonase and enolase SFLD superfamilies, the results showed great agreement with this gold-standard. Best results consistently involved multiple data types, thus confirming our hypothesis that similarities according to different knowledge domains may be used as functional similarity evidence. Our main contributions are the proposed strategy for selecting and integrating data types, along with the ability to work with noisy and incomplete data; domain knowledge usage for detecting subfamilies in a family with different specificities, thus reducing the complexity of the experimental function characterization problem; and the identification of residues responsible for specificity.

14.

Deciphering the protein-DNA code of bacterial winged helix-turn-helix transcription factors

Adam P. Joyce James J. Havranek 《Quantitative Biology》2018,6(1):68

15.

Multi-RELIEF: a method to recognize specificity determining residues from multiple sequence alignments using a Machine-Learning approach for feature weighting

Ye K Feenstra KA Heringa J Ijzerman AP Marchiori E 《Bioinformatics (Oxford, England)》2008,24(1):18-25

MOTIVATION: Identification of residues that account for protein function specificity is crucial, not only for understanding the nature of functional specificity, but also for protein engineering experiments aimed at switching the specificity of an enzyme, regulator or transporter. Available algorithms generally use multiple sequence alignments to identify residue positions conserved within subfamilies but divergent in between. However, many biological examples show a much subtler picture than simple intra-group conservation versus inter-group divergence. RESULTS: We present multi-RELIEF, a novel approach for identifying specificity residues that is based on RELIEF, a state-of-the-art Machine-Learning technique for feature weighting. It estimates the expected 'local' functional specificity of residues from an alignment divided in multiple classes. Optionally, 3D structure information is exploited by increasing the weight of residues that have high-weight neighbors. Using ROC curves over a large body of experimental reference data, we show that (a) multi-RELIEF identifies specificity residues for the seven test sets used, (b) incorporating structural information improves prediction for specificity of interaction with small molecules and (c) comparison of multi-RELIEF with four other state-of-the-art algorithms indicates its robustness and best overall performance. AVAILABILITY: A web-server implementation of multi-RELIEF is available at www.ibi.vu.nl/programs/multirelief. Matlab source code of the algorithm and data sets are available on request for academic use.
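
For readers unfamiliar with RELIEF, the underlying feature-weighting idea can be sketched on a toy two-class alignment. This is the basic RELIEF update (nearest hit versus nearest miss), not multi-RELIEF itself; it visits every sequence instead of sampling, uses Hamming distance, and assumes gap-free sequences with at least two members per class. All names are hypothetical.

```python
def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def relief_weights(seqs, labels):
    """Basic two-class RELIEF over alignment columns.

    For each sequence, a position gains weight when it differs from the
    nearest sequence of the other class (nearest miss) and loses weight
    when it differs from the nearest sequence of its own class (nearest hit).
    """
    length = len(seqs[0])
    weights = [0.0] * length
    for idx, (seq, label) in enumerate(zip(seqs, labels)):
        hits = [s for k, (s, l) in enumerate(zip(seqs, labels))
                if l == label and k != idx]
        misses = [s for s, l in zip(seqs, labels) if l != label]
        nearest_hit = min(hits, key=lambda s: hamming(seq, s))
        nearest_miss = min(misses, key=lambda s: hamming(seq, s))
        for pos in range(length):
            weights[pos] += (seq[pos] != nearest_miss[pos]) - (seq[pos] != nearest_hit[pos])
    # Average over all visited sequences.
    return [w / len(seqs) for w in weights]


# Toy alignment: position 1 separates the classes, position 2 varies
# within each class, positions 0 and 3 are invariant.
seqs = ["ADKL", "ADRL", "AEKL", "AERL"]
labels = ["x", "x", "y", "y"]
weights = relief_weights(seqs, labels)
```

The class-separating position receives the highest weight and the position that varies only within classes receives a negative one, which is the behavior one would want from a specificity score.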

16.

Analysis and prediction of functional sub-types from protein sequence alignments (total citations: 15; self-citations: 0; citations by others: 15)

Hannenhalli SS Russell RB 《Journal of molecular biology》2000,303(1):61-76

The increasing number and diversity of protein sequence families requires new methods to define and predict details regarding function. Here, we present a method for analysis and prediction of functional sub-types from multiple protein sequence alignments. Given an alignment and set of proteins grouped into sub-types according to some definition of function, such as enzymatic specificity, the method identifies positions that are indicative of functional differences by comparison of sub-type specific sequence profiles, and analysis of positional entropy in the alignment. Alignment positions with significantly high positional relative entropy correlate with those known to be involved in defining sub-types for nucleotidyl cyclases, protein kinases, lactate/malate dehydrogenases and trypsin-like serine proteases. We highlight new positions for these proteins that suggest additional experiments to elucidate the basis of specificity. The method is also able to predict sub-type for unclassified sequences. We assess several variations on a prediction method, and compare them to simple sequence comparisons. For assessment, we remove close homologues to the sequence for which a prediction is to be made (by a sequence identity above a threshold). This simulates situations where a protein is known to belong to a protein family, but is not a close relative of another protein of known sub-type. Considering the four families above, and a sequence identity threshold of 30%, our best method gives an accuracy of 96% compared to 80% obtained for sequence similarity and 74% for BLAST. We describe the derivation of a set of sub-type groupings derived from an automated parsing of alignments from PFAM and the SWISSPROT database, and use this to perform a large-scale assessment. The best method gives an average accuracy of 94% compared to 68% for sequence similarity and 79% for BLAST. We discuss implications for experimental design, genome annotation and the prediction of protein function and protein intra-residue distances.
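
The comparison of sub-type specific profiles by positional relative entropy, as described in the abstract above, might look roughly like the sketch below. It assumes gap-free sequences, add-one pseudocounts, and a one-directional Kullback-Leibler divergence between the sub-type and the remaining sequences; none of these choices is confirmed by the abstract, and the names are hypothetical.

```python
from collections import Counter
from math import log2

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def frequencies(residues, pseudocount=1.0):
    """Amino acid frequencies at one position, smoothed with pseudocounts
    (an assumption made here so that no frequency is zero)."""
    counts = Counter(residues)
    total = len(residues) + pseudocount * len(AMINO_ACIDS)
    return {aa: (counts[aa] + pseudocount) / total for aa in AMINO_ACIDS}


def relative_entropy(p, q):
    """Kullback-Leibler divergence D(p || q) in bits."""
    return sum(p[aa] * log2(p[aa] / q[aa]) for aa in AMINO_ACIDS)


def positional_relative_entropy(subtype_seqs, other_seqs):
    """Relative entropy of the sub-type profile against the rest,
    computed independently for every alignment position."""
    scores = []
    for i in range(len(subtype_seqs[0])):
        p = frequencies([s[i] for s in subtype_seqs])
        q = frequencies([s[i] for s in other_seqs])
        scores.append(relative_entropy(p, q))
    return scores


# Toy alignment: only position 1 differs between the two sub-types.
subtype_a = ["ADK", "ADR", "ADK"]
subtype_b = ["AEK", "AER", "AEK"]
scores = positional_relative_entropy(subtype_a, subtype_b)
```

Positions where the two profiles coincide score zero, so ranking positions by this score brings the sub-type-discriminating position to the top.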

17.

LTHREADER: prediction of extracellular ligand-receptor interactions in cytokines using localized threading

Pulim V Bienkowska J Berger B 《Protein science : a publication of the Protein Society》2008,17(2):279-292

Identification of extracellular ligand-receptor interactions is important for drug design and the treatment of diseases. Difficulties in detecting these interactions using high-throughput experimental techniques motivate the development of computational prediction methods. We propose a novel threading algorithm, LTHREADER, which generates accurate local sequence-structure interface alignments and integrates various statistical scores and experimental binding data to predict interactions within ligand-receptor families. LTHREADER uses a profile of secondary structure and solvent accessibility predictions with residue contact maps to guide and constrain alignments. Using a decision tree classifier and low-throughput experimental data for training, it combines information inferred from statistical interaction potentials, energy functions, correlated mutations, and conserved residue pairs to predict interactions. We apply our method to cytokines, which play a central role in the development of many diseases including cancer and inflammatory and autoimmune disorders. We tested our approach on two representative families from different structural classes (all-alpha and all-beta proteins) of cytokines. In comparison with the state-of-the-art threader RAPTOR, LTHREADER generates on average 20% more accurate alignments of interacting residues. Furthermore, in cross-validation tests, LTHREADER correctly predicts experimentally confirmed interactions for a common binding mode within the 4-helical long-chain cytokine family with 75% sensitivity and 86% specificity with 40% gain in sensitivity compared to RAPTOR. For the TNF-like family our method achieves 70% sensitivity with 55% specificity with 70% gain in sensitivity. LTHREADER combines information from multiple complex templates when such data are available. When only one solved structure is available, a localized PSI-BLAST approach also outperforms standard threading methods with 25%-50% improvements in sensitivity.

18.

Binding site graphs: a new graph theoretical framework for prediction of transcription factor binding sites

Reddy TE DeLisi C Shakhnovich BE 《PLoS computational biology》2007,3(5):e90

19.

An evolutionarily conserved network of amino acids mediates gating in voltage-dependent potassium channels

Fleishman SJ Yifrach O Ben-Tal N 《Journal of molecular biology》2004,340(2):307-318

A novel sequence-analysis technique for detecting correlated amino acid positions in intermediate-size protein families (50-100 sequences) was developed, and applied to study voltage-dependent gating of potassium channels. Most contemporary methods for detecting amino acid correlations within proteins use very large sets of data, typically comprising hundreds or thousands of evolutionarily related sequences, to overcome the relatively low signal-to-noise ratio in the analysis of co-variations between pairs of amino acid positions. Such methods are impractical for voltage-gated potassium (Kv) channels and for many other protein families that have not yet been sequenced to that extent. Here, we used a phylogenetic reconstruction of paralogous Kv channels to follow the evolutionary history of every pair of amino acid positions within this family, thus increasing detection accuracy of correlated amino acids relative to contemporary methods. In addition, we used a bootstrapping procedure to eliminate correlations that were statistically insignificant. These and other measures allowed us to increase the method's sensitivity, and opened the way to reliable identification of correlated positions even in intermediate-size protein families. Principal-component analysis applied to the set of correlated amino acid positions in Kv channels detected a network of inter-correlated residues, a large fraction of which were identified as gating-sensitive upon mutation. Mapping the network of correlated residues onto the 3D structure of the Kv channel from Aeropyrum pernix disclosed correlations between residues in the voltage-sensor paddle and the pore region, including regions that are involved in the gating transition. We discuss these findings with respect to the evolutionary constraints acting on the channel's various domains. The software is available on our website.

20.

Computational design, construction, and characterization of a set of specificity determining residues in protein-protein interactions

Nagao C Izako N Soga S Khan SH Kawabata S Shirai H Mizuguchi K 《Proteins》2012,80(10):2426-2436

Proteins interact with different partners to perform different functions and it is important to elucidate the determinants of partner specificity in protein complex formation. Although methods for detecting specificity determining positions have been developed previously, direct experimental evidence for these amino acid residues is scarce, and the lack of information has prevented further computational studies. In this article, we constructed a dataset that is likely to exhibit specificity in protein complex formation, based on available crystal structures and several intuitive ideas about interaction profiles and functional subclasses. We then defined a “structure‐based specificity determining position (sbSDP)” as a set of equivalent residues in a protein family showing a large variation in their interaction energy with different partners. We investigated sequence and structural features of sbSDPs and demonstrated that their amino acid propensities significantly differed from those of other interacting residues and that the importance of many of these residues for determining specificity had been verified experimentally.