Similar articles
20 similar articles found (search time: 0 ms)
1.
2.
We have mined the evolutionary record for the large family of intracellular lipid-binding proteins (iLBPs) by calculating the statistical coupling of residue variations in a multiple sequence alignment using methods developed by Ranganathan and coworkers (Lockless and Ranganathan, Science 1999;286:295-299). The 213 sequences analyzed have a wide range of ligand-binding functions as well as highly divergent phylogenetic origins, assuring broad sampling of sequence space. Emerging from this analysis were two major clusters of coupled residues, which, when mapped onto the structure of a representative iLBP under study in our laboratory, cellular retinoic acid-binding protein I, are largely contiguous and provide useful points of comparison to available data for the folding of this protein. One cluster comprises a predominantly hydrophobic core away from the ligand-binding site and likely represents key structural information for the iLBP fold. The other cluster includes the portal region where ligand enters its binding site, regions of the ligand-binding cavity, and the region where the 10-stranded beta-barrel characteristic of this family closes (between strands 1' and 10). Linkages between these two clusters suggest that evolutionary pressures on this family constrain structural and functional sequence information in an interdependent fashion. The necessity of the structure to wrap around a hydrophobic ligand confounds the typical sequestration of hydrophobic side chains. Additionally, ligand entry and exit require these structures to have a capacity for specific conformational change during binding and release. We conclude that an essential and structurally apparent separation of local and global sequence information is conserved throughout the iLBP family.
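
The idea of measuring how residue variation at one alignment position depends on another can be illustrated with a much simpler quantity than the one used in the abstract above. The sketch below computes the mutual information between two columns of a multiple sequence alignment. This is a stand-in chosen for brevity, not the statistical coupling method of Lockless and Ranganathan; the function names and the toy alignment are hypothetical.

```python
import math
from collections import Counter


def column(msa, i):
    """Return the residues found at alignment position i, one per sequence."""
    return [seq[i] for seq in msa]


def mutual_information(msa, i, j):
    """Mutual information (in bits) between alignment columns i and j.

    A simplified proxy for residue coupling; NOT the perturbation-based
    statistical coupling analysis cited in the abstract.
    """
    n = len(msa)
    counts_i = Counter(column(msa, i))
    counts_j = Counter(column(msa, j))
    counts_ij = Counter(zip(column(msa, i), column(msa, j)))
    mi = 0.0
    for (a, b), c in counts_ij.items():
        p_ab = c / n
        mi += p_ab * math.log2(p_ab / ((counts_i[a] / n) * (counts_j[b] / n)))
    return mi


# Hypothetical toy alignment: columns 0 and 1 co-vary, column 2 is invariant.
toy_msa = ["AKL", "AKL", "GEL", "GEL"]
coupled = mutual_information(toy_msa, 0, 1)
uncoupled = mutual_information(toy_msa, 0, 2)
```

In this toy case the co-varying pair scores higher than the pair involving the invariant column. A real analysis would also need sequence weighting and a correction for phylogenetic background, which this sketch omits.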

3.
Hydrophobic cluster analysis (HCA) [15] is a very efficient method to analyse and compare protein sequences. Despite its effectiveness, this method is not widely used because it relies in part on the experience and training of the user. In this article, detailed guidelines as to the use of HCA are presented and include discussions on: the definition of the hydrophobic clusters and their relationships with secondary and tertiary structures; the length of the clusters; the amino acid classification used for HCA; the HCA plot programs; and the working strategies. Various procedures for the analysis of a single sequence are presented: structural segmentation, structural domains and secondary structure evaluation. Like most sequence analysis methods, HCA is more efficient when several homologous sequences are compared. Procedures for the detection and alignment of distantly related proteins by HCA are described through several published examples along with 2 previously unreported cases: the beta-glucosidase from Ruminococcus albus is clearly related to the beta-glucosidases from Clostridium thermocellum and Hansenula anomala although they display a reverse organization of their constitutive domains; the alignment of the sequence of human GTPase activating protein with that of the Crk oncogene is presented. Finally, the pertinence of HCA in the identification of important residues for structure/function as well as in the preparation of homology modelling is discussed.

4.
5.
The Berkeley Phylogenomics Group presents PhyloFacts, a structural phylogenomic encyclopedia containing almost 10,000 'books' for protein families and domains, with pre-calculated structural, functional and evolutionary analyses. PhyloFacts enables biologists to avoid the systematic errors associated with function prediction by homology through the integration of a variety of experimental data and bioinformatics methods in an evolutionary framework. Users can submit sequences for classification to families and functional subfamilies. PhyloFacts is available as a worldwide web resource from .

6.
7.
Membrane proteins are involved in various critical biological processes, and studying membrane proteins represents a major challenge in protein biochemistry. As shown by both structural and functional studies, the membrane environment plays an essential role for membrane proteins. In vitro studies are reliant on the successful reconstitution of membrane proteins. This review describes the interaction between detergents and lipids that aids the understanding of the reconstitution processes. Then the techniques of detergent removal and a few useful techniques to refine the formed proteoliposomes are reviewed. Finally, the applications of reconstitution techniques to study membrane proteins involved in Ca2+ signaling are summarized.

8.
MOTIVATION: Increase the discriminatory power of PROSITE profiles to facilitate function determination and provide biologically relevant information about domains detected by profiles for the annotation of proteins. SUMMARY: We have created a new database, ProRule, which contains additional information about PROSITE profiles. ProRule contains notably the position of structurally and/or functionally critical amino acids, as well as the condition they must fulfill to play their biological role. These supplementary data should help function determination and annotation of the UniProt Swiss-Prot knowledgebase. ProRule also contains information about the domain detected by the profile in the Swiss-Prot line format. Hence, ProRule can be used to make Swiss-Prot annotation more homogeneous and consistent. The format of ProRule can be extended to provide information about combination of domains. AVAILABILITY: ProRule can be accessed through ScanProsite at http://www.expasy.org/tools/scanprosite. A file containing the rules will be made available under the PROSITE copyright conditions on our ftp site (ftp://www.expasy.org/databases/prosite/) by the next PROSITE release.

9.
Relatively few protein structures are known, compared to the enormous amount of sequence data produced in the sequencing of different genomes, and relatively few protein complexes are deposited in the PDB with respect to the great amount of interaction data coming from high-throughput experiments (two-hybrid or affinity purification of protein complexes and mass spectrometry). Nevertheless, we can rely on computational techniques for the extraction of high-quality and information-rich data from the known structures and for their spreading in the protein sequence space. We describe here the ongoing research projects in our group: we analyse the protein complexes stored in the PDB and, for each complex involving one domain belonging to a family of interaction domains for which some interaction data are available, we can calculate its probability of interaction with any protein sequence. We analyse the structures of proteins encoding a function specified in a PROSITE pattern, which exhibits relatively low selectivity and specificity, and build extended patterns. To this aim, we consider residues that are well-conserved in the structure, even if their conservation cannot easily be recognized in the sequence alignment of the proteins holding the function. We also analyse protein surface regions and, through the annotation of the solvent-exposed residues, we annotate protein surface patches via a structural comparison performed with stringent parameters and independently of the residue order in the sequence. Local surface comparison may also help in identifying new sequence patterns, which could not be highlighted with other sequence-based methods.

10.
11.
We present a model of amino acid sequence evolution based on a hidden Markov model that extends to transmembrane proteins previous methods that incorporate protein structural information into phylogenetics. Our model aims to give a better understanding of processes of molecular evolution, to extract structural information from multiple alignments of transmembrane sequences, and to use such information to improve phylogenetic analyses. This should be of value in phylogenetic studies of transmembrane proteins: for example, mitochondrial proteins have acquired a special importance in phylogenetics and are mostly transmembrane proteins. The improvement in fit to example data sets of our new model relative to less complex models of amino acid sequence evolution is statistically tested. To further illustrate the potential utility of our method, phylogeny estimation is performed on primate CCR5 receptor sequences, sequences of L and M subunits of the light reaction center in purple bacteria, guinea pig sequences with respect to lagomorph and rodent sequences of calcitonin receptor and K-substance receptor, and cetacean sequences of cytochrome b.

12.
13.
MOTIVATION: Recognizing proteins that have similar tertiary structure is the key step of template-based protein structure prediction methods. Traditionally, a variety of alignment methods are used to identify similar folds, based on sequence similarity and sequence-structure compatibility. Although these methods are complementary, their integration has not been thoroughly exploited. Statistical machine learning methods provide tools for integrating multiple features, but so far these methods have been used primarily for protein and fold classification, rather than addressing the retrieval problem of fold recognition: finding a proper template for a given query protein. RESULTS: Here we present a two-stage approach to fold recognition that combines machine learning and information retrieval. First, we use alignment methods to derive pairwise similarity features for query-template protein pairs. We also use global profile-profile alignments in combination with predicted secondary structure, relative solvent accessibility, contact map and beta-strand pairing to extract pairwise structural compatibility features. Second, we apply support vector machines to these features to predict the structural relevance (i.e. in the same fold or not) of the query-template pairs. For each query, the continuous relevance scores are used to rank the templates. The FOLDpro approach is modular, scalable and effective. Compared with 11 other fold recognition methods, FOLDpro yields the best results in almost all standard categories on a comprehensive benchmark dataset. Using predictions of the top-ranked template, the sensitivity is approximately 85%, 56%, and 27% at the family, superfamily and fold levels, respectively. Using the 5 top-ranked templates, the sensitivity increases to 90%, 70%, and 48%.
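
The ranking step and the top-1 versus top-5 evaluation described above can be sketched as follows. This is a minimal illustration, assuming that sensitivity means the fraction of queries for which at least one structurally relevant template appears among the k highest-scoring templates; the abstract does not spell out the definition. The function names, template identifiers and scores are hypothetical, and the scores stand in for the output of a trained classifier.

```python
def rank_templates(scores):
    """Return template ids sorted by descending relevance score."""
    return [t for t, _ in sorted(scores.items(), key=lambda kv: kv[1], reverse=True)]


def top_k_sensitivity(queries, k):
    """Fraction of queries with at least one relevant template in the top k.

    `queries` is a list of (scores, relevant) pairs, where `scores` maps
    template id -> relevance score and `relevant` is the set of correct ids.
    This definition of sensitivity is an assumption made for the sketch.
    """
    hits = 0
    for scores, relevant in queries:
        top = rank_templates(scores)[:k]
        if any(t in relevant for t in top):
            hits += 1
    return hits / len(queries)


# Hypothetical scores for two queries against three templates.
toy_queries = [
    ({"t1": 0.9, "t2": 0.4, "t3": 0.1}, {"t1"}),
    ({"t1": 0.2, "t2": 0.7, "t3": 0.6}, {"t3"}),
]
top1 = top_k_sensitivity(toy_queries, 1)
top2 = top_k_sensitivity(toy_queries, 2)
```

Widening the cut-off can only keep or raise this measure, which matches the direction of the top-1 and top-5 figures quoted in the abstract.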

14.
ZRANB2 was identified originally in a differential display experiment on 2-day and 10-day primary cultures of rat juxtaglomerular cells. During prolonged culture it was found to undergo down-regulation in concert with renin, the archetypical constituent of these cells. ZRANB2 has two zinc fingers that form a novel fold and show striking homology to Ran-binding protein domains. Human ZRANB2 mRNA is alternatively spliced to give two variants with different 3' ends. ZRANB2 has homologues across a range of species, the N-terminal end being particularly conserved. ZRANB2 is present in the nucleus of human cells. It binds to mRNA, as well as the essential splicing factors U1-70K and U2AF(35) and the novel splicing component SFRS17A (formerly known as XE7). ZRANB2 is one of 20 genes up-regulated in grade III ovarian serous papillary carcinoma. Here, we review current knowledge surrounding ZRANB2.

15.
Surfactant proteins B and C (SP-B and SP-C), together with phospholipids, are important constituents of pulmonary surfactant and of preparations used for treatment of respiratory distress syndrome (RDS). SP-B belongs to the saposin family of homologous proteins, which include other lipid-interacting proteins, like the membranolytic NK-lysin. SP-B, in contrast to other saposins, is hydrophobic and a disulfide-linked dimer, and its mechanism of action is not known. A model of the three-dimensional structure of one SP-B subunit was generated from the structure of monomeric NK-lysin determined by nuclear magnetic resonance, and the SP-B dimer was formed by joining two subunits via the intersubunit disulfide bond Cys48-Cys48'. After energy minimization, intersubunit hydrogen bonds/ion pairs were formed between the strictly conserved residues Glu51 and Arg52, which creates a central non-polar region located in between two clusters of positively charged residues. The structural features support a function of SP-B in cross-linking of lipid membranes. Mixtures of phospholipids, an SP-C analogue and polymyxin B (which cross-links lipid vesicles but is structurally unrelated to SP-B) exhibit in vitro surface activity which is indistinguishable from that of analogous mixtures containing SP-B instead of polymyxin B. This suggests an avenue for identification of SP-B analogues that can be used in synthetic surfactants for treatment of RDS.

16.
McDermott J, Samudrala R. Trends in Biotechnology 2004, 22(2):60-2; discussion 62-3
Experimentally derived genome-wide protein interaction networks have been useful in the elucidation of functional information that is not evident from examining individual proteins, but determination of these networks is complex and time consuming. To address this problem, several computational methods for predicting protein networks in novel genomes have been developed. A recent publication by Date and Marcotte describes the use of phylogenetic profiling for elucidating novel pathways in proteomes that have not been experimentally characterized. This method, in combination with other computational methods for generating protein-interaction networks, might help identify novel functional pathways and enhance functional annotation of individual proteins.
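
Phylogenetic profiling rests on a simple representation: each protein is described by a vector recording in which genomes a homologue is present, and proteins with similar vectors are candidates for functional linkage. The sketch below illustrates that representation under stated assumptions. The Jaccard similarity and the 0.8 threshold are choices made for this example only and are not taken from Date and Marcotte; the protein names and profiles are hypothetical.

```python
def jaccard(p, q):
    """Jaccard similarity of two binary presence/absence profiles."""
    both = sum(1 for a, b in zip(p, q) if a and b)
    either = sum(1 for a, b in zip(p, q) if a or b)
    return both / either if either else 0.0


def linked_pairs(profiles, threshold):
    """Return protein pairs whose profile similarity meets the threshold."""
    names = sorted(profiles)
    pairs = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if jaccard(profiles[a], profiles[b]) >= threshold:
                pairs.append((a, b))
    return pairs


# Hypothetical profiles: one entry per genome, 1 = homologue present.
toy_profiles = {
    "protA": [1, 1, 0, 1, 0],
    "protB": [1, 1, 0, 1, 0],
    "protC": [0, 0, 1, 0, 1],
}
candidates = linked_pairs(toy_profiles, 0.8)
```

Here only protA and protB share a pattern of presence and absence, so they form the single candidate pair. Published methods typically score profile similarity with a statistical measure rather than a fixed threshold.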

17.
The ability to predict local structural features of a protein from the primary sequence is of paramount importance for unraveling its function in the absence of experimental structural information. Two main factors affect the utility of potential prediction tools: their accuracy must enable extraction of reliable structural information on the proteins of interest, and their runtime must be low to keep pace with sequencing data being generated at a constantly increasing speed. Here, we present NetSurfP-2.0, a novel tool that can predict the most important local structural features with unprecedented accuracy and runtime. NetSurfP-2.0 is sequence-based and uses an architecture composed of convolutional and long short-term memory neural networks trained on solved protein structures. Using a single integrated model, NetSurfP-2.0 predicts solvent accessibility, secondary structure, structural disorder, and backbone dihedral angles for each residue of the input sequences. We assessed the accuracy of NetSurfP-2.0 on several independent test datasets and found it to consistently produce state-of-the-art predictions for each of its output features. We observe a correlation of 80% between predictions and experimental data for solvent accessibility, and a precision of 85% on secondary structure 3-class predictions. In addition to improved accuracy, the processing time has been optimized to allow predicting more than 1000 proteins in less than 2 hours, and complete proteomes in less than 1 day.

18.

Background  

With the current technological advances in high-throughput biology, the necessity to develop tools that help to analyse the massive amount of data being generated is evident. A powerful method of inspecting large-scale data sets is gene set enrichment analysis (GSEA), and investigation of protein structural features can guide determining the function of individual genes. However, a convenient tool that combines these two features to aid in high-throughput data analysis has not been developed yet. In order to fill this niche, we developed the user-friendly, web-based application, PhenoFam.

19.
The functional domain composition is introduced to predict the structural class of a protein or domain according to the following classification: all-α, all-β, α/β, α+β, μ (multi-domain), σ (small protein), and ρ (peptide). The advantage of doing so is that both the sequence-order-related features and the function-related features are naturally incorporated in the predictor. As a demonstration, the jackknife cross-validation test was performed on a dataset that consists of proteins and domains with less than 20% sequence identity to each other in order to remove any homology bias. The overall success rate thus obtained was 98%. In contrast to this, the corresponding rates obtained by the simple geometry approaches based on the amino acid composition were only 36-39%. This indicates that using the functional domain composition to represent the sample of a protein for statistical prediction is very promising, and that the functional type of a domain is closely correlated with its structural class.
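
The jackknife test mentioned above leaves each sample out in turn, predicts its class from all remaining samples, and reports the fraction predicted correctly. The sketch below shows that procedure with a 1-nearest-neighbour classifier, which is a stand-in chosen for brevity; the abstract does not state which predictor was used. The feature vectors, labels and function names are hypothetical.

```python
def nearest_label(train, x):
    """Label of the training sample closest to x (squared Euclidean distance)."""
    best = min(train, key=lambda item: sum((a - b) ** 2 for a, b in zip(item[0], x)))
    return best[1]


def jackknife_accuracy(samples):
    """Leave-one-out success rate over a list of (features, label) pairs."""
    correct = 0
    for i, (x, y) in enumerate(samples):
        rest = samples[:i] + samples[i + 1:]  # every sample except the i-th
        if nearest_label(rest, x) == y:
            correct += 1
    return correct / len(samples)


# Hypothetical two-dimensional feature vectors for two structural classes.
toy_samples = [
    ((0.0, 0.1), "all-alpha"),
    ((0.1, 0.0), "all-alpha"),
    ((1.0, 0.9), "all-beta"),
    ((0.9, 1.0), "all-beta"),
]
accuracy = jackknife_accuracy(toy_samples)
```

Because every sample is tested exactly once against a model that never saw it, the result does not depend on a random split, which is one reason the jackknife is often preferred for small benchmark sets.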

20.
The knowledge collated from the known protein structures has revealed that proteins are usually folded into four structural classes: all-α, all-β, α/β and α + β. A number of methods have been proposed to predict a protein's structural class from its primary structure; however, it has been observed that these methods fail or perform poorly in the cases of distantly related sequences. In this paper, we propose a new method for protein structural class prediction using a dataset of low-homology (twilight-zone) protein sequences. Since protein structural class prediction is a typical classification problem, we have developed a Support Vector Machine (SVM)-based method for protein structural class prediction that uses features derived from the predicted secondary structure and predicted burial information of amino acid residues. The examination of different individual features as well as feature combinations revealed that the combination of secondary structural content, secondary structural and solvent accessibility state frequencies of amino acids gave rise to the best leave-one-out cross-validation accuracy of ~81%, which is comparable to the best accuracy reported in the literature so far.
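
One of the feature types named above, secondary structural content, can be sketched as the fraction of residues assigned to each predicted state. The example assumes a three-state string using the common H (helix), E (strand) and C (coil) alphabet; the abstract does not specify the encoding, so the alphabet, the function name and the sample string are assumptions.

```python
def secondary_structure_content(ss):
    """Fractions of helix (H), strand (E) and coil (C) in a predicted string.

    Assumes a three-state H/E/C alphabet, which is an assumption of this sketch.
    """
    if not ss:
        raise ValueError("empty secondary structure string")
    n = len(ss)
    return {state: ss.count(state) / n for state in "HEC"}


# Hypothetical per-residue prediction for an eight-residue fragment.
content = secondary_structure_content("HHHHEECC")
```

A vector of such fractions, combined with the other frequencies mentioned in the abstract, would form the input to a classifier such as an SVM.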

