Similar articles
 20 similar articles found (search time: 31 ms)
1.
2.
Identification of potential viral-host protein interactions is a vital and useful approach towards development of new drugs targeting those interactions. Computational tools are increasingly being used to predict viral-host interactions. Recently, a database containing records of experimentally validated interactions between a set of HIV-1 proteins and a set of human proteins has been published. The problem of predicting new interactions based on this database is usually posed as a classification problem. However, posing the problem as classification suffers from the lack of biologically validated negative interactions. Therefore, it would be beneficial to use the existing database for predicting new viral-host interactions without the need for negative samples. Motivated by this, in this article, the HIV-1-human protein interaction database has been analyzed using association rule mining. The main objective is to identify a set of association rules both among the HIV-1 proteins and among the human proteins, and to use these rules for predicting new interactions. To this end, a novel association rule mining technique based on biclustering has been proposed for discovering frequent closed itemsets, followed by the association rules, from the adjacency matrix of the HIV-1-human interaction network. Novel HIV-1-human interactions have been predicted based on the discovered association rules and tested for biological significance. For validation of the predicted new interactions, gene ontology-based and pathway-based studies have been performed. These studies show that the human proteins which are predicted to interact with a particular viral protein share many common biological activities. Moreover, a literature survey has been used for validation, identifying some predicted interactions that are already validated experimentally but not present in the database. Comparison with other prediction methods is also discussed.
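
The support and confidence measures that underlie association rule mining can be illustrated with a minimal sketch. This is not the biclustering-based method of the abstract; it only shows how a rule among viral proteins could be scored from an interaction table. The protein names (`H1`, `V1`, ...) and the interactions are invented toy data, and the function names are hypothetical.

```python
# Toy interaction table (made-up data): each human protein maps to the
# set of viral proteins it interacts with, i.e. one row of an adjacency matrix.
toy_interactions = {
    "H1": {"V1", "V2"},
    "H2": {"V1", "V2", "V3"},
    "H3": {"V1"},
    "H4": {"V3"},
}

def support(transactions, itemset):
    """Fraction of rows (human proteins) that contain every item in itemset."""
    itemset = set(itemset)
    hits = sum(1 for items in transactions.values() if itemset <= items)
    return hits / len(transactions)

def confidence(transactions, antecedent, consequent):
    """Confidence of the rule antecedent -> consequent.

    Assumes the antecedent occurs at least once; otherwise this divides by zero.
    """
    joint = support(transactions, set(antecedent) | set(consequent))
    return joint / support(transactions, antecedent)

rule_support = support(toy_interactions, {"V1", "V2"})
rule_confidence = confidence(toy_interactions, {"V1"}, {"V2"})
```

In this toy table, a high-confidence rule `V1 -> V2` would suggest that a human protein known to interact with `V1` but not yet with `V2` is a candidate for a new predicted interaction, which is the general idea of rule-based prediction without negative samples.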

3.
Protein interaction networks have become a tool to study biological processes, either for predicting molecular functions or for designing new drugs to regulate the main biological interactions. Furthermore, such networks are known to be organized in sub-networks of proteins contributing to the same cellular function. However, protein function prediction is not accurate, and each protein has traditionally been assigned to only one function by the network formalism. By considering the network of physical interactions between yeast proteins together with a manual, single functional classification scheme, we introduce a method able to reveal important information on protein function at both the micro- and macro-scale. In particular, inspecting the properties of oscillatory dynamics on top of the protein interaction network leads to the identification of misclassification problems in protein function assignments, as well as to the correct identification of protein functions. We also demonstrate that our approach can give a network representation of the meta-organization of biological processes by unraveling the interactions between different functional classes.

4.
Elucidation of the interaction of proteins with different molecules is of significance in the understanding of cellular processes. Computational methods have been developed for the prediction of protein-protein interactions, but insufficient attention has been paid to the prediction of protein-RNA interactions, which play central roles in regulating gene expression and certain RNA-mediated enzymatic processes. This work explored the use of a machine learning method, support vector machines (SVM), for the prediction of RNA-binding proteins directly from their primary sequence. Based on the knowledge of known RNA-binding and non-RNA-binding proteins, an SVM system was trained to recognize RNA-binding proteins. A total of 4011 RNA-binding and 9781 non-RNA-binding proteins were used to train and test the SVM classification system, and an independent set of 447 RNA-binding and 4881 non-RNA-binding proteins was used to evaluate the classification accuracy. Testing results using this independent evaluation set show a prediction accuracy of 94.1%, 79.3%, and 94.1% for rRNA-, mRNA-, and tRNA-binding proteins, and 98.7%, 96.5%, and 99.9% for non-rRNA-, non-mRNA-, and non-tRNA-binding proteins, respectively. The SVM classification system was further tested on a small class of snRNA-binding proteins with only 60 available sequences. The prediction accuracy is 40.0% and 99.9% for snRNA-binding and non-snRNA-binding proteins, respectively, indicating a need for a sufficient number of proteins to train SVM. The SVM classification systems trained in this work were added to our Web-based protein functional classification software SVMProt, at http://jing.cz3.nus.edu.sg/cgi-bin/svmprot.cgi. Our study suggests the potential of SVM as a useful tool for facilitating the prediction of protein-RNA interactions.

5.
Structural classification of membrane proteins is still in its infancy due to the relative paucity of available three‐dimensional structures compared with soluble proteins. However, recent technological advances in protein structure determination have led to a significant increase in experimentally known membrane protein folds, warranting exploration of the structural universe of membrane proteins. Here, a new and completely membrane protein specific structural classification system is introduced that classifies α‐helical membrane proteins according to common helix architectures. Each membrane protein is represented by a helix interaction graph depicting transmembrane helices with their pairwise interactions resulting from individual residue contacts. Subsequently, proteins are clustered according to similarities among these helix interaction graphs using a newly developed structural similarity score called HISS. As HISS scores explicitly disregard structural properties of loop regions, they are more suitable to capture conserved transmembrane helix bundle architectures than other structural similarity scores. Importantly, we are able to show that a classification approach based on helix interaction similarity closely resembles conventional structural classification databases such as SCOP and CATH, implying that helix interactions are one of the major determinants of α‐helical membrane protein folds. Furthermore, the classification of all currently available membrane protein structures into 20 recurrent helix architectures and 15 singleton proteins demonstrates not only an impressive variability of membrane helix bundles but also the conservation of common helix interaction patterns among proteins with distinctly different sequences. Proteins 2010. © 2010 Wiley‐Liss, Inc.

6.
Iterative cluster analysis of protein interaction data   (Cited by: 3; self-citations: 0; citations by others: 3)
MOTIVATION: Generation of fast tools for hierarchical clustering to be applied when distances among elements of a set are constrained, causing frequent distance ties, as happens in protein interaction data. RESULTS: We present in this work the program UVCLUSTER, which iteratively explores distance datasets using hierarchical clustering. Once the user selects a group of proteins, UVCLUSTER converts the set of primary distances among them (i.e. the minimum number of steps, or interactions, required to connect two proteins) into secondary distances that measure the strength of the connection between each pair of proteins when the interactions for all the proteins in the group are considered. We show that this novel strategy has advantages over conventional clustering methods to explore protein-protein interaction data. UVCLUSTER easily incorporates the information of the largest available interaction datasets to generate comprehensive primary distance tables. The versatility, simplicity of use and high speed of UVCLUSTER on standard personal computers suggest that it can be a benchmark analytical tool for interactome data analysis. AVAILABILITY: The program is available upon request from the authors, free for academic users. Additional information available at http://www.uv.es/genomica/UVCLUSTER.
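
The "primary distance" defined in this abstract (the minimum number of interactions needed to connect two proteins) is a shortest-path length in an unweighted graph, which a breadth-first search computes directly. The sketch below is an illustration under that reading only; it is not UVCLUSTER code, it does not compute secondary distances, and the function name and toy edge list are hypothetical.

```python
from collections import deque

def primary_distances(edges, source):
    """Breadth-first search from source over an undirected interaction list.

    Returns the minimum number of interactions from source to every
    reachable protein. Unreachable proteins are absent from the result.
    """
    neighbors = {}
    for a, b in edges:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nxt in neighbors.get(node, ()):
            if nxt not in dist:  # first visit is the shortest path in BFS
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return dist

# Toy interaction list (made-up protein names).
toy_edges = [("A", "B"), ("B", "C"), ("C", "D"), ("A", "E")]
distances = primary_distances(toy_edges, "A")
```

Because these distances are small integers, many protein pairs share the same value, which is the source of the frequent distance ties the abstract mentions.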

7.
We report a theoretical investigation on the different stabilities of two plastocyanins. The first one belongs to the thermophilic cyanobacterium Phormidium laminosum and the second one belongs to its mesophilic relative Synechocystis sp. These proteins share the same topology and secondary-structure elements; however, the melting temperatures of their oxidised species differ by approximately 15 K. Long-time-scale molecular dynamics simulations, performed at different temperatures, show that the thermophilic protein optimises a set of intramolecular interactions (interstrand hydrogen bonding, salt bridging and hydrophobic clustering) within the region that comprises the strands β5 and β6, loop L5 and the helix. This region exhibits most of the differences in the primary sequence between the two proteins and, in addition, it is involved in the interaction with known physiological partners. Further work is in progress to unveil the specific structural features responsible for the different thermal stability of the two proteins.

8.
Self-organization of tree form: a model for complex social systems   (Cited by: 1; self-citations: 0; citations by others: 1)

9.
Predicting protein function is one of the most challenging problems of the post-genomic era. The development of experimental methods for genome-scale analysis of molecular interaction networks has provided new approaches to inferring protein function. In this paper we introduce a new graph-based semi-supervised classification algorithm, Sequential Linear Neighborhood Propagation (SLNP), which addresses the problem of classifying partially labeled protein interaction networks. The proposed SLNP first constructs a sequence of node sets according to their shortest distance to the labeled nodes, and then predicts the function of the unlabeled proteins, starting from the set closest to the labeled ones, using Linear Neighborhood Propagation. Its performance is assessed on the Saccharomyces cerevisiae PPI network data sets, with good results compared with three current state-of-the-art algorithms, especially in settings where only a small fraction of the proteins are labeled.
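
The first step described in this abstract, grouping unlabeled nodes by their shortest distance to the labeled nodes, can be sketched with a multi-source breadth-first search. This covers only the ordering step, under the assumption of an unweighted, undirected network; the Linear Neighborhood Propagation step itself is not shown, and the function name and toy network are hypothetical.

```python
from collections import deque

def distance_layers(adjacency, labeled):
    """Group unlabeled nodes by shortest distance to any labeled node.

    Returns a list of sets: layer 0 holds nodes at distance 1, the next
    layer nodes at distance 2, and so on. Nodes that cannot be reached
    from a labeled node are left out.
    """
    dist = {node: 0 for node in labeled}
    queue = deque(labeled)  # start the search from all labeled nodes at once
    while queue:
        node = queue.popleft()
        for nxt in adjacency.get(node, ()):
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    layers = {}
    for node, d in dist.items():
        if d > 0:
            layers.setdefault(d, set()).add(node)
    return [layers[d] for d in sorted(layers)]

# Toy network (made-up protein names); only P1 carries a function label.
toy_adjacency = {
    "P1": ["P2", "P3"],
    "P2": ["P1", "P4"],
    "P3": ["P1"],
    "P4": ["P2", "P5"],
    "P5": ["P4"],
}
layers = distance_layers(toy_adjacency, ["P1"])
```

A sequential scheme would then predict labels for the first layer, treat those nodes as labeled, and move outward layer by layer.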

10.
Idiopathic pulmonary fibrosis (IPF), characterized by irreversible scarring and progressive destruction of the lung tissue, is one of the most common types of idiopathic interstitial pneumonia worldwide. However, there are no reliable candidates for curative therapies. Hence, elucidation of the mechanisms of IPF genesis and exploration of potential biomarkers and prognostic indicators are essential for accurate diagnosis and treatment of IPF. Recently, efficient microarray and bioinformatics analyses have promoted an understanding of the molecular mechanisms of disease occurrence and development, which is necessary to explore genetic alterations and identify potential diagnostic biomarkers. However, high false-positive rates have been observed in results based on single microarray datasets. In the current study, we performed a comprehensive analysis of the differential expression, biological functions, and interactions of IPF-related genes. Three publicly available microarray datasets including 54 IPF samples and 34 normal samples were integrated by performing gene set enrichment analysis and analyzing differentially expressed genes (DEGs). Our results identified 350 DEGs genetically associated with IPF. Gene ontology analyses revealed that the changes in the modules were mostly enriched in the positive regulation of smooth muscle cell proliferation, positive regulation of inflammatory responses, and the extracellular space. Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment analysis of DEGs revealed that IPF involves the TNF signaling pathway, NOD-like receptor signaling pathway, and PPAR signaling pathway. To identify key genes related to IPF in the protein-protein interaction network, the 20 hub genes with the highest scores were selected. Our results provide a framework for developing new pathological molecular networks related to specific diseases in silico.

11.
12.
BACKGROUND: Identification of protein structural cores requires isolation of sets of proteins that all share the same subset of structural motifs. In the context of an ever-growing number of available 3D protein structures, standard and automatic clustering algorithms require adaptations so as to allow for efficient identification of such sets of proteins. RESULTS: A pair of 3D structures is stated as similar or not according to the local similarities of their matching substructures in a structural alignment. This binary relation can be represented in a graph of similarities where a node represents a 3D protein structure and an edge states that two 3D protein structures are similar. Therefore, the classification of proteins into structural families can be viewed as a graph clustering task. Unfortunately, because such a graph encodes only pairwise similarity information, clustering algorithms may group in the same cluster a subset of 3D structures that do not share a common substructure. To overcome this drawback, we first define a ternary similarity on a triple of 3D structures as a constraint to be satisfied by the graph of similarities. Such a ternary constraint takes into account similarities between pairwise alignments, so as to ensure that the three protein structures involved do have some common substructure. We propose a modification algorithm that eliminates edges from the original graph of similarities and outputs a reduced graph in which no ternary constraints are violated. Our proposal is first to build a graph of similarities, then to reduce the graph according to the modification algorithm, and finally to apply a standard graph clustering algorithm to the reduced graph. We applied this method to ASTRAL-40 non-redundant protein domains, identifying significant pairwise similarities with Yakusa, a program devised for rapid 3D structure alignments.
CONCLUSIONS: We show that filtering similarities prior to a standard graph-based clustering process by applying ternary similarity constraints i) improves the separation of proteins of different classes and consequently ii) improves the classification quality of standard graph-based clustering algorithms according to the reference classification SCOP.

13.
One of the underlying principles in drug discovery is that a biologically active compound is complementary in shape and molecular recognition features to its receptor. This principle implies that molecules binding to the same receptor may share some common features. Here, we have investigated whether electrostatic similarity can be used for the discovery of small molecule protein-protein interaction inhibitors (SMPPIIs). We have developed a method that can be used to evaluate the similarity of electrostatic potentials between small molecules and known protein ligands. This method was implemented in a software package called EleKit. Analyses of all available (at the time of research) SMPPII structures indicate that SMPPIIs bear some similarities of electrostatic potential with the ligand proteins of the same receptor. This is especially true for the more polar SMPPIIs. Retrospective analysis of several successful SMPPIIs has shown the applicability of EleKit in the design of new SMPPIIs.

14.
Cytoprophet is a software tool that allows prediction and visualization of protein and domain interaction networks. It is implemented as a plug-in of Cytoscape, an open source software framework for analysis and visualization of molecular networks. Cytoprophet implements three algorithms that predict new potential physical interactions using the domain composition of proteins and experimental assays. The algorithms for protein and domain interaction inference are maximum likelihood estimation (MLE) using expectation maximization (EM), the maximum specificity set cover (MSSC) approach, and the sum-product algorithm (SPA). After accepting an input set of proteins with Uniprot ID/Accession numbers and a selected prediction algorithm, Cytoprophet draws a network of potential interactions with probability scores and GO distances as edge attributes. A network of domain interactions between the domains of the initial protein list can also be generated. Cytoprophet was designed to take advantage of the visual capabilities of Cytoscape and be simple to use. An example of inference in a signaling network of myxobacterium Myxococcus xanthus is presented and available at Cytoprophet's website. AVAILABILITY: http://cytoprophet.cse.nd.edu.

15.
In this paper, the structure and evolution of the protein interaction network of the yeast Saccharomyces cerevisiae are analyzed. The network is viewed as a graph whose nodes correspond to proteins. Two proteins are connected by an edge if they interact. The network resembles a random graph in that it consists of many small subnets (groups of proteins that interact with each other but do not interact with any other protein) and one large connected subnet comprising more than half of all interacting proteins. The number of interactions per protein appears to follow a power law distribution. Within approximately 200 Myr after a duplication, the products of duplicate genes become almost equally likely to (1) have common protein interaction partners and (2) be part of the same subnetwork as two proteins chosen at random from within the network. This indicates that the persistence of redundant interaction partners is the exception rather than the rule. After gene duplication, the likelihood that an interaction gets lost exceeds 2.2 × 10⁻³/Myr. New interactions are estimated to evolve at a rate that is approximately three orders of magnitude smaller. Every 300 Myr, as many as half of all interactions may be replaced by new interactions.
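
The "subnets" in this abstract correspond to the connected components of the interaction graph. A minimal sketch of how one might find them and measure the share of proteins in the largest one follows; it assumes an undirected edge list, and the function name and toy edges are hypothetical rather than taken from the paper.

```python
def connected_components(edges):
    """Return the connected components of an undirected graph, largest first."""
    adjacency = {}
    for a, b in edges:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    seen = set()
    components = []
    for start in adjacency:
        if start in seen:
            continue
        # Depth-first traversal collects everything reachable from start.
        stack = [start]
        seen.add(start)
        component = set()
        while stack:
            node = stack.pop()
            component.add(node)
            for nxt in adjacency[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        components.append(component)
    return sorted(components, key=len, reverse=True)

# Toy network (made-up protein names): one four-protein subnet, two pairs.
toy_edges = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "A"),
             ("E", "F"), ("G", "H")]
components = connected_components(toy_edges)
largest_share = len(components[0]) / sum(len(c) for c in components)
```

The number of interactions per protein mentioned in the abstract is simply the size of each node's neighbor set in the same adjacency structure.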

16.
The Biomolecular Interaction Network Database (BIND: http://bind.ca) archives biomolecular interaction, complex and pathway information. A web-based system is available to query, view and submit records. BIND continues to grow with the addition of individual submissions as well as interaction data from the PDB and a number of large-scale interaction and complex mapping experiments using yeast two-hybrid, mass spectrometry, genetic interactions and phage display. We have developed a new graphical analysis tool that provides users with a view of the domain composition of proteins in interaction and complex records to help relate functional domains to protein interactions. An interaction network clustering tool has also been developed to help focus on regions of interest. Continued input from users has helped further mature the BIND data specification, which now includes the ability to store detailed information about genetic interactions. The BIND data specification is available as ASN.1 and XML DTD.

17.
Dynamic protein-protein interactions are essential in all cellular and developmental processes. Protein-fragment complementation assays allow such protein-protein interactions to be investigated in vivo. In contrast to other protein-fragment complementation assays, the split-luciferase (split-LUC) complementation approach facilitates dynamic and quantitative in vivo analysis of protein interactions, as the restoration of luciferase activity upon interaction of the investigated proteins is reversible. Here, we describe the development of a floated-leaf luciferase complementation imaging (FLuCI) assay that enables rapid and quantitative in vivo analyses of protein interactions in leaf discs floating on a luciferin infiltration solution after transient expression of split-LUC-labelled interacting proteins in Nicotiana benthamiana. We generated a set of eight Gateway-compatible split-LUC destination vectors, enabling fast and almost fail-safe cloning of candidate proteins to the LUC termini in all possible constellations. We demonstrate their functionality by visualizing the well-established homodimerization of the 14-3-3 regulator proteins. Quantitative interaction analyses of the molybdenum co-factor biosynthesis proteins CNX6 and CNX7 show that the luciferase-based protein-fragment complementation assay allows direct real-time monitoring of absolute values of protein complex assembly. Furthermore, the split-LUC assay is established as a valuable tool to investigate the dynamics of protein interactions by monitoring the disassembly of actin filaments in planta. The new Gateway-compatible split-LUC destination vector system, in combination with the FLuCI assay, provides a useful means to facilitate quantitative analyses of interactions between large numbers of proteins constituting interaction networks in plant cells.

18.
A new method has been developed to detect functional relationships among proteins independent of a given sequence or fold homology. It is based on the idea that protein function is intimately related to the recognition and subsequent response to the binding of a substrate or an endogenous ligand in a well-characterized binding pocket. Thus, recognition of similar ligands, supposedly linked to similar function, requires conserved recognition features exposed in terms of common physicochemical interaction properties via the functional groups of the residues flanking a particular binding cavity. Following a technique commonly used in the comparison of small molecule ligands, generic pseudocenters coding for possible interaction properties were assigned for a large sample set of cavities extracted from the entire PDB and stored in the database Cavbase. Using a particular query cavity, a series of related cavities of decreasing similarity is detected based on a clique detection algorithm. The detected similarity is ranked according to property-based surface patches shared by the different clique solutions. The approach either retrieves protein cavities accommodating the same (e.g. co-factors) or closely related ligands, or it extracts proteins exhibiting similar function in terms of a related catalytic mechanism. Finally, the new method has strong potential to suggest alternative molecular skeletons in de novo design. The retrieval of molecular building blocks accommodated in a particular sub-pocket that shares similarity with the pocket in a protein studied by drug design can inspire the discovery of novel ligands.

19.
Intrinsically disordered proteins (IDPs) exist without a stable tertiary structure in isolation. These proteins are often involved in molecular recognition processes via their disordered binding regions, which can recognize partner molecules by undergoing a coupled folding and binding process. The specific properties of disordered binding regions give way to specific, yet transient interactions that enable IDPs to play central roles in signaling pathways and act as hubs of protein interaction networks. An alternative model of protein-protein interactions with largely overlapping functional properties is offered by the concept of linear interaction motifs. This approach focuses on distilling a short consensus sequence pattern from proteins with a common interaction partner. These motifs often reside in disordered regions and are considered to mediate the interaction roughly independently of the rest of the protein. Although a connection between linear motifs and disordered binding regions has been established through common examples, the complementary nature of the two concepts has yet to be fully explored. In many cases the sequence-based definition of linear motifs and the structural context-based definition of disordered binding regions describe two aspects of the same phenomenon. To gain insight into the connection between the two models, prediction methods were utilized. We combined the regular expression-based prediction of linear motifs with the disordered binding region prediction method ANCHOR, each specialized for one of the two models, to get the best of both worlds. The thorough analysis of the overlap of the two methods offers a bioinformatics tool for more efficient binding site prediction that can serve a wide range of practical applications. At the same time it can also shed light on the theoretical connection between the two co-existing interaction models.

20.
Li M, Liu J, Ran X, Fang M, Shi J, Qin H, Goh JM, Song J. Biophysical Journal 2006, 91(11): 4201-4209
Many proteins expressed in Escherichia coli cells form inclusion bodies that are neither refoldable nor soluble in buffers. Very surprisingly, we recently discovered that all 11 buffer-insoluble protein fragments/domains we have, with a great diversity of cellular function, location, and molecular size, could be easily solubilized in salt-free water. The circular dichroism (CD) and NMR characterization led to classification of these proteins into three groups: group 1, with no secondary structure by CD and with narrowly dispersed but sharp ¹H-¹⁵N heteronuclear single quantum correlation (HSQC) peaks; group 2, with secondary structure by CD but with HSQC peaks broadened and, consequently, only a small set of peaks detectable; and group 3, with secondary structure by CD and also well-separated HSQC peaks. Intriguingly, we failed to find any protein with a tight tertiary packing. Therefore, we propose that buffer-insoluble proteins may lack the intrinsic ability to reach and/or maintain a well-packed conformation, and thus are trapped in partially folded states with many hydrophobic side chains exposed to the bulk solvent. As such, a very low ionic strength is sufficient to screen out intrinsic repulsive interactions and, consequently, allow the hydrophobic clustering/aggregation to occur. Marvelously enough, it appears that in pure water, proteins have the potential to manifest their full spectrum of structural states by utilizing intrinsic repulsive interactions to suppress the attractive hydrophobic clustering. Our discovery not only gives a novel insight into the properties of insoluble proteins, but also sheds the first light that we know of on previously unknown regimes associated with proteins.
