Similar articles
20 similar articles found; search took 31 ms
1.
Wen Z, Li M, Li Y, Guo Y, Wang K. Amino acids, 2007, 32(2): 277-283
As an important transmembrane protein family in eukaryotes, G-protein coupled receptors (GPCRs) play a significant role in cellular signal transduction and are important targets for drug design. However, it is very difficult to resolve their tertiary structure by X-ray crystallography. In this study, we have developed a Delaunay model, which constructs a series of simplexes with latent variables to classify the families of GPCRs and projects unknown sequences to principal component space (PC-space) to predict their topology. Computational results show that, for the classification of GPCRs, the method achieves accuracies of 91.0 and 87.6% for Class A, more than 80% for the other three classes in differentiating GPCRs from non-GPCRs, and 70% for discriminating between the four major classes of GPCR, respectively. When recognizing the structure of GPCRs, all the N-terminals of sequences can be determined correctly. The maximum accuracy of predicting transmembrane segments is achieved in the 7th transmembrane segment of Rhodopsin, which is 99.4%, and the average error is 2.1 amino acids, the lowest among all segment predictions. This method could provide structural information on a novel GPCR as a tool for experiments and for other algorithms that predict GPCR structure. Academic users should send their request for the MATLAB program for classifying GPCRs and predicting their topology to liml@scu.edu.cn.

2.
G-protein coupled receptors (GPCRs) represent one of the most important classes of drug targets for the pharmaceutical industry and play important roles in cellular signal transduction. Predicting the coupling specificity of GPCRs to G-proteins is vital for further understanding the mechanism of signal transduction and the function of the receptors within a cell, which can provide new clues for pharmaceutical research and development. In this study, the features of amino acid compositions and physicochemical properties of the full-length GPCR sequences have been analyzed and extracted. Based on these features, classifiers have been developed to predict the coupling specificity of GPCRs to G-proteins using support vector machines. The testing results show that this method could obtain better prediction accuracy.

3.
We have developed an alignment-independent method for classification of G-protein coupled receptors (GPCRs) according to the principal chemical properties of their amino acid sequences. The method relies on a multivariate approach where the primary amino acid sequences are translated into vectors based on the principal physicochemical properties of the amino acids, and the data are transformed into a uniform matrix by applying a modified auto cross-covariance transform. The application of principal component analysis to a data set of 929 class A GPCRs showed a clear separation of the major classes of GPCRs. The application of partial least squares projection to latent structures created a highly valid model (cross-validated correlation coefficient, Q² = 0.895) that gave unambiguous classification of the GPCRs in the training set according to their ligand binding class. The model was further validated by external prediction of 535 novel GPCRs not included in the training set. Of the latter, only 14 sequences, confined to rapidly expanding GPCR classes, were mispredicted. Moreover, 90 orphan GPCRs out of 165 were tentatively assigned to a GPCR ligand binding class. The alignment-independent method could be used to assess the importance of the principal chemical properties of every single amino acid in the protein sequences for their contributions in explaining GPCR family membership. It was then revealed that all amino acids in the unaligned sequences contributed to the classifications, albeit to varying extents; the most important amino acids were those that could also be determined to be conserved by using traditional alignment-based methods.
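
To make the idea of an alignment-independent, fixed-length encoding concrete, here is a minimal sketch of a plain auto cross-covariance (ACC) transform. It is not the authors' code: the abstract does not describe their modification, so the normalisation by (n - lag), the function name `acc_transform`, and the two-property toy descriptor table are all assumptions made for illustration.

```python
from typing import Dict, List, Sequence


def acc_transform(seq: str,
                  table: Dict[str, Sequence[float]],
                  max_lag: int) -> List[float]:
    """Plain auto cross-covariance over per-residue descriptors.

    For every lag and every ordered pair of properties (j, k) it averages
    prop_j(residue i) * prop_k(residue i + lag) over the sequence.
    The output length is max_lag * n_props**2, whatever the sequence length.
    """
    n = len(seq)
    if n <= max_lag:
        raise ValueError("sequence must be longer than max_lag")
    n_props = len(next(iter(table.values())))
    # One column of descriptor values per property.
    cols = [[table[aa][p] for aa in seq] for p in range(n_props)]
    features = []
    for lag in range(1, max_lag + 1):
        for j in range(n_props):
            for k in range(n_props):
                total = sum(cols[j][i] * cols[k][i + lag] for i in range(n - lag))
                features.append(total / (n - lag))
    return features


# Hypothetical descriptor table with two made-up properties per residue.
TOY_TABLE = {"A": (1.0, 0.0), "C": (0.0, 1.0)}

short_vec = acc_transform("ACAC", TOY_TABLE, max_lag=1)
long_vec = acc_transform("ACCAACCA", TOY_TABLE, max_lag=1)
```

Both vectors have four entries even though the sequences differ in length, which is what lets unaligned sequences be stacked into one uniform matrix for PCA or PLS.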

4.
Li Z, Zhou X, Dai Z, Zou X. Amino acids, 2012, 43(2): 793-804
The coupling between G protein-coupled receptors (GPCRs) and guanine nucleotide-binding proteins (G proteins) regulates various signal transductions from extracellular space into the cell. However, the coupling mechanism between GPCRs and G proteins is still unknown, and experimental determination of their coupling specificity and function is both expensive and time-consuming. Therefore, it is important to develop a theoretical method to predict the coupling specificity between GPCRs and G proteins as well as their function using their primary sequences. In this study, a novel four-layer predictor (GPCRsG_CWTIT) based on support vector machine (SVM), continuous wavelet transform (CWT) and information theory (IT) is developed to classify G proteins and predict the coupling specificity between GPCRs and G proteins. SVM is used for construction of models. CWT and IT are used to characterize the primary structure of the protein. The performance of GPCRsG_CWTIT is evaluated with cross-validation tests on various working datasets. The overall accuracy for the G proteins at the levels of class and family is 98.23 and 85.42%, respectively. The accuracy of the coupling specificity prediction varies from 74.60 to 94.30%. These results indicate that the proposed predictor is an effective and feasible tool to predict the coupling specificity between GPCRs and G proteins as well as their functions using only the full protein sequence. The establishment of such an accurate prediction method will facilitate drug discovery by improving the ability to identify and predict protein-protein interactions. GPCRsG_CWTIT and the dataset can be acquired freely on request from the authors.

5.
Although the sequence information on G-protein coupled receptors (GPCRs) continues to grow, many GPCRs remain orphaned (i.e. ligand specificity unknown) or poorly characterized with little structural information available, so an automated and reliable method is badly needed to facilitate the identification of novel receptors. In this study, a method of fast Fourier transform-based support vector machine has been developed for predicting GPCR subfamilies according to protein hydrophobicity. In classifying Class B, C, D and F subfamilies, the method achieved an overall Matthews correlation coefficient and accuracy of 0.95 and 93.3%, respectively, when evaluated using the jackknife test. The method achieved an accuracy of 100% on the Class B independent dataset. The results show that this method can classify GPCR subfamilies as well as their functional classification with high accuracy. A web server implementing the prediction is available at http://chem.scu.edu.cn/blast/Pred-GPCR.

6.
Understanding the coupling specificity between G protein-coupled receptors (GPCRs) and specific classes of G proteins is important for further elucidation of receptor functions within a cell. Increasing information on GPCR sequences and the G protein family would facilitate prediction of the coupling properties of GPCRs. In this study, we describe a novel approach for predicting the coupling specificity between GPCRs and G proteins. This method uses not only GPCR sequences but also the functional knowledge generated by natural language processing, and can achieve 92.2% prediction accuracy by using the C4.5 algorithm. Furthermore, rules related to GPCR-G protein coupling are generated. The combination of sequence analysis and text mining improves the prediction accuracy for GPCR-G protein coupling specificity, and also provides clues for understanding GPCR signaling.

7.
A computational system for the prediction and classification of human G-protein coupled receptors (GPCRs) has been developed based on the support vector machine (SVM) method and protein sequence information. The feature vectors used to develop the SVM prediction models consist of statistically significant features selected from single amino acid, dipeptide, and tripeptide compositions of protein sequences. Furthermore, the difference in length distribution between GPCRs and non-GPCRs has also been exploited to improve the prediction performance. The testing results with annotated human protein sequences demonstrate that this system achieves good performance for both prediction and classification of human GPCRs.
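
The composition features mentioned above can be sketched as follows. This is only an illustration under stated assumptions: the names `kmer_composition` and `feature_vector` are hypothetical, the statistical feature selection and the SVM itself are left out, and the example stops at dipeptides (passing `k=3` would give the 8,000 tripeptide frequencies).

```python
from itertools import product
from typing import List

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def kmer_composition(seq: str, k: int) -> List[float]:
    """Relative frequency of every possible k-mer, in a fixed alphabetical order.

    Windows containing non-standard letters (e.g. X) are skipped.
    """
    kmers = ["".join(p) for p in product(AMINO_ACIDS, repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    total = 0
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        if window in counts:
            counts[window] += 1
            total += 1
    if total == 0:
        return [0.0] * len(kmers)
    return [counts[m] / total for m in kmers]


def feature_vector(seq: str) -> List[float]:
    """Amino acid composition + dipeptide composition + sequence length."""
    return kmer_composition(seq, 1) + kmer_composition(seq, 2) + [float(len(seq))]


example = feature_vector("MKTAYIAK")  # 20 + 400 + 1 values
```

Appending the raw length is one simple way to expose the length difference between GPCRs and non-GPCRs to a classifier; how the original system used it is not stated in the abstract.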

8.
Huang JH, Cao DS, Yan J, Xu QS, Hu QN, Liang YZ. Biochimie, 2012, 94(8): 1697-1704
As the most frequent drug target, G protein-coupled receptors (GPCRs) are a large family of seven-transmembrane receptors that sense molecules outside the cell and activate signal transduction pathways inside it. The activity and lifetime of activated receptors are regulated by receptor phosphorylation. Therefore, investigating the exact positions of phosphorylation sites in GPCR sequences could provide useful clues for drug design and other biotechnology applications. Experimental identification of phosphorylation sites is expensive and laborious. Hence, there is significant interest in the development of computational methods for reliable prediction of phosphorylation sites from amino acid sequences. In this article, we present a simple and effective method to recognize phosphorylation sites of human GPCRs by combining amino acid hydrophobicity and support vector machine. The prediction accuracy, sensitivity, specificity, Matthews correlation coefficient and area under the curve values for phosphoserine, phosphothreonine, and phosphotyrosine were 0.964, 0.790, 0.999, 0.866, 0.941; 0.954, 0.800, 0.985, 0.828, 0.958; and 0.976, 0.820, 0.993, 0.861, 0.959, respectively. The establishment of such a fast and accurate prediction method will speed up the identification of GPCR phosphorylation sites and so facilitate drug discovery.
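
For readers unfamiliar with the metrics quoted above, the sketch below computes accuracy, sensitivity, specificity and the Matthews correlation coefficient from a binary confusion matrix using their standard definitions. The function name and the example counts are made up for illustration and are not taken from the article; the area under the curve needs ranked scores rather than counts and is not covered here.

```python
import math
from typing import Dict


def binary_metrics(tp: int, fp: int, tn: int, fn: int) -> Dict[str, float]:
    """Standard binary classification metrics from confusion-matrix counts."""
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0  # true positive rate
    specificity = tn / (tn + fp) if (tn + fp) else 0.0  # true negative rate
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # MCC is conventionally reported as 0 when the denominator vanishes.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "mcc": mcc,
    }


# Made-up counts: 10 true sites (8 found), 10 non-sites (9 rejected).
demo = binary_metrics(tp=8, fp=1, tn=9, fn=2)
```

Reporting MCC alongside accuracy matters for phosphorylation-site data because non-sites usually far outnumber sites, so accuracy alone can look high while sensitivity stays modest, as in the figures quoted above.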

9.
Being the largest family of cell surface receptors, G-protein-coupled receptors (GPCRs) are among the most frequent drug targets. The functions of many GPCRs are unknown, and it is both time-consuming and expensive to determine their ligands and signaling pathways by experimental methods. It is of great practical significance to develop an automated and reliable method for classification of GPCRs. In this study, a novel method based on the concept of Chou’s pseudo amino acid composition has been developed for predicting and recognizing GPCRs. The discrete wavelet transform was used to extract feature vectors from the hydrophobicity scales of amino acids to construct the pseudo amino acid (PseAA) composition for training a support vector machine. The prediction accuracies of the current method among the major families of GPCRs, subfamilies of class A, and types of amine receptors were 99.72%, 97.64%, and 99.20%, respectively, showing 9.4% to 18.0% improvement over other existing methods and indicating that the proposed method is a useful automated tool for identifying GPCRs.

10.
G-protein coupled receptors (GPCRs) belong to the largest superfamily of membrane proteins, which is biologically important and functionally diverse. GPCRs retain a characteristic membrane topology of seven alpha helices with three intracellular loops, three extracellular loops, and flanking N' and C' terminal residues. Subtle differences do exist among these sequences (clusters) in the helix boundaries (TM-domain), loop lengths, and sequence features such as conserved motifs. They also exist in the substituting amino acid patterns and their physicochemical properties, at both the intra-genomic and inter-genomic level. In the current study, we employ prediction of helix boundaries and scores derived from amino acid substitution exchange matrices to identify the conserved amino acid residues (motifs) as consensus in an aligned set of homologous GPCR sequences. Co-clustered GPCRs from human and other genomes, organized as 32 clusters, were employed to study the amino acid conservation patterns and species-specific or cluster-specific motifs. Critical analysis of sequence composition and properties provides clues to connect functional relevance within and across genomes for practical applications such as the design of mutations and the understanding of disease-causing genetic abnormalities.

11.
Multiple sequence alignments become biologically meaningful only if conserved and functionally important residues and preserved secondary structural elements can be identified at equivalent positions. This is particularly important for transmembrane proteins like G-protein coupled receptors (GPCRs) with seven transmembrane helices. TM-MOTIF is a software package and an effective alignment viewer to identify and display conserved motifs and amino acid substitutions (AAS) at each position of the aligned set of homologous sequences of GPCRs. The key feature of the package is to display the predicted membrane topology for seven transmembrane helices in seven colours (VIBGYOR colouring scheme) and to map the identified motifs on their respective helices/loop regions. It is an interactive package which lets the user submit a query or a pre-aligned set of GPCR sequences to align with a reference sequence, like rhodopsin, whose structure has been solved experimentally. It also makes it possible to identify the nearest homologue from the available inbuilt GPCR or Olfactory Receptor cluster dataset whose association is already known for its receptor type. AVAILABILITY: The database is available for free at mini@ncbs.res.in.

12.
MOTIVATION: An understanding of the coupling between a G-protein coupled receptor (GPCR) and a specific class of heterotrimeric GTP-binding proteins (G-proteins) is vital for further comprehending the function of the receptor within a cell. However, predicting G-protein coupling based on the amino acid sequence of a receptor has been a daunting task. While experimental data for G-protein coupling exist, published models that rely on sequence-based prediction are few. In this study, we have developed a Naive Bayes model to successfully predict G-protein coupling specificity by training over 80 GPCRs with known coupling. Each intracellular domain of GPCRs was treated as a discrete random variable, conditionally independent of one another. In order to determine the conditional probability distributions of these variables, ClustalW-generated phylogenetic trees were used as an approximation for the clustering of the intracellular domain sequences. The sampling of an intracellular domain sequence was achieved by identifying the cluster containing the homologue with the highest sequence similarity. RESULTS: Out of 55 GPCRs validated, the model yielded a correct classification rate of 72%. Our model also predicted multiple G-protein coupling for most of the GPCRs in the validation set. The Bayesian approach in this work offers an alternative to the experimental approach in order to answer the biological problem of GPCR/G-protein coupling selectivity. AVAILABILITY: Academic users should send their request for the Perl program for calculating likelihood probabilities to jack.cao@astrazeneca.com. SUPPLEMENTARY INFORMATION: The materials can be viewed at http://www.astrazeneca-montreal.com/AZRDM_info/supporting_info.pdf.

13.

Background  

G-protein coupled receptors (GPCRs) comprise the largest group of eukaryotic cell surface receptors and are of great pharmacological interest. A broad range of native ligands interact with and activate GPCRs, leading to signal transduction within cells. Most of these responses are mediated through the interaction of GPCRs with heterotrimeric GTP-binding proteins (G-proteins). Due to the information explosion in biological sequence databases, the development of software algorithms that could predict properties of GPCRs is important. Experimental data reported in the literature suggest that heterotrimeric G-proteins interact with parts of the activated receptor at the transmembrane helix-intracellular loop interface. Utilizing this information and membrane topology information, we have developed an intensive exploratory approach to generate a refined library of statistical models (Hidden Markov Models) that predict the coupling preference of GPCRs to heterotrimeric G-proteins. The method predicts the coupling preferences of GPCRs to Gs, Gi/o and Gq/11, but not G12/13 subfamilies.

14.
Olfactory receptors (ORs) are a large family of proteins involved in the recognition and discrimination of numerous odorants. These receptors belong to the G-protein coupled receptor (GPCR) hyperfamily, for which little structural data are available. In this study we predict the binding site residues of OR proteins by analyzing a set of 1441 OR protein sequences from mouse and human. The central insight utilized is that functional contact residues would be conserved among pairs of orthologous receptors, but considerably less conserved among paralogous pairs. Using judiciously selected subsets of 218 ortholog pairs and 518 paralog pairs, we have identified 22 sequence positions that are both highly conserved among the putative orthologs and variable among paralogs. These residues are disposed on transmembrane helices 2 to 7, and on the second extracellular loop of the receptor. Strikingly, although the prediction makes no assumption about the location of the binding site, these amino acid positions are clustered around a pocket in a structural homology model of ORs, mostly facing the inner lumen. We propose that the identified positions constitute the odorant binding site. This conclusion is supported by the observation that all but one of the predicted binding site residues correspond to ligand-contact positions in other rhodopsin-like GPCRs.

15.
The group of 2502 transmembrane (TM) protein sequences with seven TM segments (7-tms) registered in SWISS-PROT 46.0 contains 2200 G-protein-coupled receptors (GPCRs), indicating that GPCR candidates can be detected with a reliability of 87.9% in the eukaryotic genomes merely by correctly predicting the number of TM segments as 7-tms. The predictive accuracies of TM topology-prediction methods proposed so far are not as high as expected; even the best method, HMMTOP 2.0, can only achieve a capture rate of 7-tms sequences of 77.6%. It is necessary to improve this performance as much as possible, even if by only a few percentage points, in order to identify as many novel GPCR candidate genes as possible among the increasing number of newly sequenced genomes. In this study, we propose a simple but useful prediction method for detecting as many 7-tms TM protein sequences as GPCR candidates in eukaryotic genomes as possible. This is achieved by employing a two-step prediction procedure. The first step involves collecting 7-tms sequences by the best prediction method (HMMTOP 2.0), and the second involves picking up the remaining 7-tms sequences by the second-best method (TMHMM 2.0). By this procedure, the capture rate of 7-tms TM protein sequences in SWISS-PROT can be improved considerably from 77.6% to 84.5%, and the number of GPCR candidate sequences predicted as 7-tms in the human genome (Build 35) is increased from 790 (by HMMTOP 2.0) to 903. These 790 and 903 candidate sequences include, respectively, 587 and 636 of the known human GPCRs of the 717 registered in SWISS-PROT 46.0, demonstrating that the proposed combinatorial method is effective in detecting GPCR candidate genes in eukaryotic genomes.
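
The two-step procedure described above amounts to taking the union of the sequences that either predictor calls 7-tms, with the second predictor consulted only for sequences the first one misses. The sketch below shows that control flow and re-derives the quoted percentages. The predictors are passed in as plain callables; the toy lookups stand in for HMMTOP 2.0 and TMHMM 2.0, which are not invoked here, and all names are hypothetical.

```python
from typing import Callable, Dict, List


def two_step_capture(sequences: Dict[str, str],
                     primary: Callable[[str], int],
                     secondary: Callable[[str], int],
                     target_tm: int = 7) -> List[str]:
    """Return IDs predicted to have target_tm TM segments by either method.

    `or` short-circuits, so the secondary predictor only runs on
    sequences that the primary predictor did not capture.
    """
    captured = []
    for seq_id, seq in sequences.items():
        if primary(seq) == target_tm or secondary(seq) == target_tm:
            captured.append(seq_id)
    return captured


def percent(part: int, whole: int) -> float:
    """Percentage rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


# Toy stand-ins: each "predictor" is a lookup of a made-up TM-segment count.
toy_sequences = {"p1": "SEQ1", "p2": "SEQ2", "p3": "SEQ3"}
toy_primary = {"SEQ1": 7, "SEQ2": 6, "SEQ3": 5}.get
toy_secondary = {"SEQ1": 7, "SEQ2": 7, "SEQ3": 6}.get
captured = two_step_capture(toy_sequences, toy_primary, toy_secondary)

# Figures quoted in the abstract.
reliability = percent(2200, 2502)      # GPCRs among 7-tms entries
known_step_one = percent(587, 717)     # known human GPCRs found by step one
known_two_step = percent(636, 717)     # known human GPCRs found by both steps
```

On the abstract's own numbers, adding the second step raises the share of the 717 known human GPCRs among the candidates from about 81.9% to about 88.7%.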

16.
The amino acid sequences of 369 human nonolfactory G-protein-coupled receptors (GPCRs) have been aligned at the seven transmembrane domain (TM) and used to extract the nature of 30 critical residues that are presumed, from the X-ray structure of bovine rhodopsin bound to retinal, to line the TM binding cavity of ground-state receptors. Interestingly, the clustering of human GPCRs from these 30 residues mirrors the recently described phylogenetic tree of full-sequence human GPCRs (Fredriksson et al., Mol Pharmacol 2003;63:1256-1272) with few exceptions. A TM cavity could be found for all investigated GPCRs with physicochemical properties matching that of their cognate ligands. The current approach allows a very fast comparison of most human GPCRs from the focused perspective of the predicted TM cavity and makes it easy to detect key residues that drive ligand selectivity or promiscuity.

17.
Filamentous fungi respond to hundreds of nutritional, chemical and environmental signals that affect expression of primary metabolism and biosynthesis of secondary metabolites. These signals are sensed at the membrane level by G protein coupled receptors (GPCRs). GPCRs usually contain seven transmembrane domains, an external amino terminal fragment that interacts with the ligand, and an internal carboxy terminal end interacting with the intracellular G protein. There is a great variety of GPCRs in filamentous fungi involved in sensing of sugars, amino acids, cellulose, cell-wall components, sex pheromones, oxylipins, calcium ions and other ligands. Mechanisms of signal transduction at the membrane level by GPCRs are discussed, including the internalization and compartmentalisation of these sensor proteins. We have identified and analysed the GPCRs in the genome of Penicillium chrysogenum and compared them with GPCRs of several other filamentous fungi. We have found 66 putative GPCRs, representatives of twelve of the fourteen previously proposed classes of GPCRs, depending on the ligand recognized by these proteins. A staggering forty-two putative members of the new GPCR class XIV, the so-called Pth11 sensors of cellulosic material as reported for Neurospora crassa and some other fungi, were identified. Several GPCRs sensing sex pheromones, known in yeast and in several fungi, were also identified in P. chrysogenum, confirming the recent unravelling of the hidden sexual capacity of this species. Other sensing mechanisms do not involve GPCRs, including the two-component systems (HKRR), the HOG signalling system and the PalH mediated pH transduction sensor. GPCR sensor proteins transmit their signals by interacting with intracellular heterotrimeric G proteins, which are well known in several fungi, including P. chrysogenum. These G proteins are inactive in the GDP-containing heterotrimeric state and become active by nucleotide exchange, allowing the separation of the heterotrimeric protein into the active Gα subunit and the Gβγ dimer. The conversion of GTP to GDP is mediated by the endogenous GTPase activity of the G proteins. Downstream of the ligand interaction, the activated Gα protein and also the Gβ/Gγ dimer transduce the signals through at least three different cascades: adenylate cyclase/cAMP, MAPK kinase, and phospholipase C mediated pathways.

18.
The metabotropic glutamate receptors (mGluRs) have been predicted to have a classical seven transmembrane domain structure similar to that seen for members of the G-protein-coupled receptor (GPCR) superfamily. However, the mGluRs (and other members of the family C GPCRs) show no sequence homology to the rhodopsin-like GPCRs, for which this seven transmembrane domain structure has been experimentally confirmed. Furthermore, several transmembrane domain prediction algorithms suggest that the mGluRs have a topology that is distinct from these receptors. In the present study, we set out to test whether mGluR5 has seven true transmembrane domains. Using a variety of approaches in both prokaryotic and eukaryotic systems, our data provide strong support for the proposed seven transmembrane domain model of mGluR5. We propose that this membrane topology can be extended to all members of the family C GPCRs.

19.
G-protein coupled receptors (GPCRs) are a class of seven-helix transmembrane proteins that have been used in bioinformatics as targets to facilitate drug discovery for human diseases. Although thousands of GPCR sequences have been collected, the ligand specificity of many GPCRs is still unknown and only one crystal structure of the rhodopsin-like family has been solved. Therefore, identifying GPCR types from sequence data alone has become an important research issue. In this study, a novel technique for identifying GPCR types based on the weighted Levenshtein distance between two receptor sequences and the nearest neighbor method (NNM) is introduced, which can deal directly with receptor sequences of different lengths. In our experiments for classifying four classes (acetylcholine, adrenoceptor, dopamine, and serotonin) of the rhodopsin-like family of GPCRs, the error rates from the leave-one-out procedure and the leave-half-out procedure were 0.62% and 1.24%, respectively. These results are superior to those of the covariant discriminant algorithm, the support vector machine method, and the NNM with Euclidean distance.
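
A weighted Levenshtein distance combined with a nearest-neighbour rule can be sketched in a few lines. The abstract does not give the weights the authors used, so the cost scheme below (0 for identical residues, 0.5 within a residue group, 1 otherwise, and 1 per insertion or deletion) and the residue grouping are purely hypothetical, as are the function names.

```python
from typing import Callable, List, Tuple

# Hypothetical residue groups used only to illustrate a weighted substitution cost.
TOY_GROUPS = ["KRH", "DE", "AVLIMFWYC", "STNQGP"]


def toy_sub_cost(x: str, y: str) -> float:
    """0 for identical residues, 0.5 within a group, 1 across groups."""
    if x == y:
        return 0.0
    for group in TOY_GROUPS:
        if x in group and y in group:
            return 0.5
    return 1.0


def weighted_levenshtein(a: str, b: str,
                         sub_cost: Callable[[str, str], float],
                         indel_cost: float = 1.0) -> float:
    """Edit distance by dynamic programming, keeping only two rows in memory."""
    prev = [j * indel_cost for j in range(len(b) + 1)]
    for i, ca in enumerate(a, 1):
        curr = [i * indel_cost]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + indel_cost,             # delete ca
                            curr[j - 1] + indel_cost,         # insert cb
                            prev[j - 1] + sub_cost(ca, cb)))  # substitute
        prev = curr
    return prev[-1]


def nearest_neighbor_label(query: str,
                           training: List[Tuple[str, str]],
                           sub_cost: Callable[[str, str], float]) -> str:
    """Label of the training sequence closest to the query (1-NN)."""
    best = min(training, key=lambda item: weighted_levenshtein(query, item[0], sub_cost))
    return best[1]


toy_training = [("AKKA", "amine"), ("GGGG", "other")]
predicted = nearest_neighbor_label("ARKA", toy_training, toy_sub_cost)
```

Because the distance is defined on the raw strings, no alignment or fixed-length encoding is needed, which is why sequences of different lengths can be compared directly. The cost is quadratic in sequence length per comparison.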

20.
Naveed M, Khan A, Khan AU. Amino acids, 2012, 42(5): 1809-1823
G protein-coupled receptors (GPCRs) are transmembrane proteins that transduce signals from extracellular ligands to intracellular G proteins. Automatic classification of GPCRs can provide important information for the development of novel drugs in the pharmaceutical industry. In this paper, we propose an evolutionary approach, GPCR-MPredictor, which combines individual classifiers for predicting GPCRs. GPCR-MPredictor is a web predictor that can efficiently predict GPCRs at five levels. The first level determines whether a protein sequence is a GPCR or a non-GPCR. If the predicted sequence is a GPCR, then it is further classified at the family, subfamily, sub-subfamily, and subtype levels. In this work, our aim is to analyze the discriminative power of different feature extraction and classification strategies for GPCR prediction and then to use an evolutionary ensemble approach for enhanced prediction performance. Features are extracted using amino acid composition, pseudo amino acid composition, and dipeptide composition of protein sequences. Different classification approaches, such as k-nearest neighbor (KNN), support vector machine (SVM), probabilistic neural networks (PNN), J48, Adaboost, and Naive Bayes, have been used to classify GPCRs. The proposed hierarchical GA-based ensemble classifier exploits the prediction results of SVM, KNN, PNN, and J48 at each level. The GA-based ensemble yields an accuracy of 99.75, 92.45, 87.80, 83.57, and 96.17% at the five levels, on the first dataset. We further perform predictions on a dataset consisting of 8,000 GPCRs at the family, subfamily, and sub-subfamily level, and on two other datasets of 365 and 167 GPCRs at the second and fourth levels, respectively. In comparison with the existing methods, the results demonstrate the effectiveness of our proposed GPCR-MPredictor in classifying GPCR families. It is accessible at .
