1.
Background
Recent computational techniques have facilitated analyzing genome-wide protein-protein interaction data for several model organisms. Various graph-clustering algorithms have been applied to protein interaction networks on the genomic scale for predicting the entire set of potential protein complexes. In particular, the density-based clustering algorithms which are able to generate overlapping clusters, i.e. the clusters sharing a set of nodes, are well-suited to protein complex detection because each protein could be a member of multiple complexes. However, their accuracy is still limited because of complex overlap patterns of their output clusters.
Results
We present a systematic approach of refining the overlapping clusters identified from protein interaction networks. We have designed novel metrics to assess cluster overlaps: overlap coverage and overlapping consistency. We then propose an overlap refinement algorithm. It takes as input the clusters produced by existing density-based graph-clustering methods and generates a set of refined clusters by parameterizing the metrics. To evaluate protein complex prediction accuracy, we used the f-measure by comparing each refined cluster to known protein complexes. The experimental results with the yeast protein-protein interaction data sets from BioGRID and DIP demonstrate that accuracy on protein complex prediction has increased significantly after refining cluster overlaps.
Conclusions
The effectiveness of the proposed cluster overlap refinement approach for protein complex detection has been validated in this study. Analyzing overlaps of the clusters from protein interaction networks is a crucial task for understanding of functional roles of proteins and topological characteristics of the functional systems.
2.
NetAlign is a web-based tool designed to enable comparative analysis of protein interaction networks (PINs). NetAlign compares a query PIN with a target PIN by combining interaction topology and sequence similarity to identify conserved network substructures (CoNSs), which may derive from a common ancestor and disclose conserved topological organization of interactions in evolution. To exemplify the application of NetAlign, we perform two genome-scale comparisons with (1) the Escherichia coli PIN against the Helicobacter pylori PIN and (2) the Saccharomyces cerevisiae PIN against the Caenorhabditis elegans PIN. Many of the identified CoNSs correspond to known complexes; therefore, cross-species PIN comparison provides a way for discovery of conserved modules. In addition, based on the species-to-species differences in CoNSs, we reformulate the problems of protein-protein interaction (PPI) prediction and species divergence from a network perspective. AVAILABILITY: http://www1.ustc.edu.cn/lab/pcrystal/NetAlign.
3.
Markillie LM Lin CT Adkins JN Auberry DL Hill EA Hooker BS Moore PA Moore RJ Shi L Wiley HS Kery V 《Journal of proteome research》2005,4(2):268-274
Most current methods for purification and identification of protein complexes use endogenous expression of affinity-tagged bait, tandem affinity tag purification of protein complexes followed by specific elution of complexes from beads, and gel separation and in-gel digestion prior to mass spectrometric analysis of protein interactors. We propose a single affinity tag in vitro pull-down assay with denaturing elution, trypsin digestion in organic solvent, and LC-ESI MS/MS protein identification using SEQUEST analysis. Our method is simple and easy to scale up and automate, making it suitable for high-throughput mapping of protein interaction networks and functional proteomics.
4.
An automated method for finding molecular complexes in large protein interaction networks (total citations: 12; self-citations: 0; citations by others: 12)
Background
Recent advances in proteomics technologies such as two-hybrid, phage display and mass spectrometry have enabled us to create a detailed map of biomolecular interaction networks. Initial mapping efforts have already produced a wealth of data. As the size of the interaction set increases, databases and computational methods will be required to store, visualize and analyze the information in order to effectively aid in knowledge discovery.
5.
6.
Cytoprophet is a software tool that allows prediction and visualization of protein and domain interaction networks. It is implemented as a plug-in of Cytoscape, an open source software framework for analysis and visualization of molecular networks. Cytoprophet implements three algorithms that predict new potential physical interactions using the domain composition of proteins and experimental assays. The algorithms for protein and domain interaction inference include maximum likelihood estimation (MLE) using expectation maximization (EM); the set cover approach maximum specificity set cover (MSSC) and the sum-product algorithm (SPA). After accepting an input set of proteins with Uniprot ID/Accession numbers and a selected prediction algorithm, Cytoprophet draws a network of potential interactions with probability scores and GO distances as edge attributes. A network of domain interactions between the domains of the initial protein list can also be generated. Cytoprophet was designed to take advantage of the visual capabilities of Cytoscape and be simple to use. An example of inference in a signaling network of myxobacterium Myxococcus xanthus is presented and available at Cytoprophet's website. AVAILABILITY: http://cytoprophet.cse.nd.edu.
7.
In this paper, we describe an algorithm which can be used to generate the collection of networks, in order of increasing size, that are compatible with a list of chemical reactions and that satisfy a constraint. Our algorithm has been encoded and the software, called Netscan, can be freely downloaded from ftp://ftp.stat.ubc.ca/pub/riffraff/Netscanfiles, along with a manual, for general scientific use. Our algorithm may require pre-processing to ensure that the quantities it acts on are physically relevant and because it outputs sets of reactions, which we call canonical networks, that must be elaborated into full networks.
8.
With the advent of large-scale protein interaction studies, there is much debate about data quality. Can different noise levels in the measurements be assessed by analyzing network structure? Because proteomic regulation is inherently co-operative, modular and redundant, it is inherently compressible when represented as a network. Here we propose that network compression can be used to compare false positive and false negative noise levels in protein interaction networks. We validate this hypothesis by first confirming the detrimental effect of false positives and false negatives. Second, we show that gold standard networks are more compressible. Third, we show that compressibility correlates with co-expression, co-localization, and shared function. Fourth, we also observe correlation with better protein tagging methods, physiological expression in contrast to over-expression of tagged proteins, and smart pooling approaches for yeast two-hybrid screens. Overall, this new measure is a proxy for both sensitivity and specificity and gives complementary information to standard measures such as average degree and clustering coefficients.
9.
In this work, we introduce a novel network synthesis model that can generate families of evolutionarily related synthetic protein-protein interaction (PPI) networks. Given an ancestral network, the proposed model generates the network family according to a hypothetical phylogenetic tree, where the descendant networks are obtained through duplication and divergence of their ancestors, followed by network growth using network evolution models. We demonstrate that this network synthesis model can effectively create synthetic networks whose internal and cross-network properties closely resemble those of real PPI networks. The proposed model can serve as an effective framework for generating comprehensive benchmark datasets that can be used for reliable performance assessment of comparative network analysis algorithms. Using this model, we constructed a large-scale network alignment benchmark, called NAPAbench, and evaluated the performance of several representative network alignment algorithms. Our analysis clearly shows the relative performance of the leading network algorithms, with their respective advantages and disadvantages. The algorithm and source code of the network synthesis model and the network alignment benchmark NAPAbench are publicly available at http://www.ece.tamu.edu/bjyoon/NAPAbench/.
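The duplication-and-divergence idea mentioned in this abstract can be illustrated with a minimal sketch. This is not the NAPAbench procedure; it is a generic duplication-divergence growth step, and the function names, the `keep_prob` parameter, and the triangle seed network are all assumptions made for illustration.

```python
import random


def duplicate_and_diverge(adj, rng, keep_prob=0.4):
    """Add one node by copying a random node, then dropping some copied edges.

    adj: dict mapping node -> set of neighbours (undirected, nodes labelled 0..n-1).
    keep_prob: hypothetical probability that the copy keeps each inherited edge.
    """
    parent = rng.choice(sorted(adj))
    child = len(adj)  # next free integer label
    adj[child] = set()
    for neighbour in sorted(adj[parent]):
        # Divergence: each inherited interaction survives with probability keep_prob.
        if rng.random() < keep_prob:
            adj[child].add(neighbour)
            adj[neighbour].add(child)
    return child


def grow_network(seed_edges, target_size, seed=0, keep_prob=0.4):
    """Grow an undirected network from seed_edges until it has target_size nodes."""
    rng = random.Random(seed)
    adj = {}
    for u, v in seed_edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    while len(adj) < target_size:
        duplicate_and_diverge(adj, rng, keep_prob)
    return adj


# Assumed seed network: a triangle on nodes 0, 1, 2.
network = grow_network([(0, 1), (1, 2), (0, 2)], target_size=50)
```

A duplicated node may lose all of its inherited edges and end up isolated; a real generator would likely handle that case explicitly.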
10.
11.
MOTIVATION: Graph drawing algorithms are often used for visualizing relational information, but a naive implementation of a graph drawing algorithm encounters real difficulties when drawing large-scale graphs such as protein interaction networks. RESULTS: We have developed a new, extremely fast layout algorithm for visualizing large-scale protein interaction networks in the three-dimensional space. The algorithm (1) first finds a layout of connected components of an entire network, (2) finds a global layout of nodes with respect to pivot nodes within a connected component and (3) refines the local layout of each connected component by first relocating midnodes with respect to their cutvertices and direct neighbors of the cutvertices and then by relocating all nodes with respect to their neighbors within distance 2. Advantages of this algorithm over classical graph drawing methods include: (1) it is an order of magnitude faster, (2) it can directly visualize data from protein interaction databases and (3) it provides several abstraction and comparison operations for effectively analyzing large-scale protein interaction networks. AVAILABILITY: http://wilab.inha.ac.kr/interviewer/
12.
The ligand interaction scan (LIScan) method is a general procedure for engineering small molecule ligand-regulated forms of a protein that is complementary to other 'reverse' genetic and chemical-genetic methods for drug-target validation. It involves insertional mutagenesis by a chemical-genetic 'switch', comprising a genetically encoded peptide module that binds with high affinity to a small-molecule ligand. We demonstrated the method with TEM-1 beta-lactamase, using a tetracysteine hexapeptide insert and a biarsenical fluorescein ligand (FlAsH).
13.
RRW: repeated random walks on genome-scale protein networks for local cluster discovery (total citations: 1; self-citations: 0; citations by others: 1)
Background
We propose an efficient and biologically sensitive algorithm based on repeated random walks (RRW) for discovering functional modules, e.g., complexes and pathways, within large-scale protein networks. Compared to existing cluster identification techniques, RRW implicitly makes use of network topology, edge weights, and long range interactions between proteins.
14.
Iterative cluster analysis of protein interaction data (total citations: 3; self-citations: 0; citations by others: 3)
MOTIVATION: Generation of fast tools of hierarchical clustering to be applied when distances among elements of a set are constrained, causing frequent distance ties, as happens in protein interaction data. RESULTS: We present in this work the program UVCLUSTER, which iteratively explores distance datasets using hierarchical clustering. Once the user selects a group of proteins, UVCLUSTER converts the set of primary distances among them (i.e. the minimum number of steps, or interactions, required to connect two proteins) into secondary distances that measure the strength of the connection between each pair of proteins when the interactions for all the proteins in the group are considered. We show that this novel strategy has advantages over conventional clustering methods to explore protein-protein interaction data. UVCLUSTER easily incorporates the information of the largest available interaction datasets to generate comprehensive primary distance tables. The versatility, simplicity of use and high speed of UVCLUSTER on standard personal computers suggest that it can be a benchmark analytical tool for interactome data analysis. AVAILABILITY: The program is available upon request from the authors, free for academic users. Additional information available at http://www.uv.es/genomica/UVCLUSTER.
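The "primary distance" described here (the minimum number of interactions needed to connect two proteins) is a shortest-path length in an unweighted graph, which a breadth-first search can compute. The sketch below is not UVCLUSTER's code; the function name `primary_distances` and the toy protein labels are hypothetical, and the conversion to secondary distances is not covered.

```python
from collections import deque


def primary_distances(interactions):
    """Return {protein: {other: minimum number of interactions separating them}}."""
    adj = {}
    for a, b in interactions:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)

    table = {}
    for source in adj:
        # Breadth-first search visits proteins in order of increasing step count.
        dist = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for neighbour in adj[node]:
                if neighbour not in dist:
                    dist[neighbour] = dist[node] + 1
                    queue.append(neighbour)
        table[source] = dist
    return table


# Hypothetical toy interaction list; the protein names are placeholders.
toy = [("A", "B"), ("B", "C"), ("C", "D"), ("A", "E")]
distances = primary_distances(toy)
```

Because these distances are small integers, many protein pairs share the same value, which is the kind of frequent distance tie the abstract refers to. Proteins in different connected components simply do not appear in each other's rows.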
15.
Construction of reliable protein-protein interaction networks with a new interaction generality measure (total citations: 3; self-citations: 0; citations by others: 3)
MOTIVATION: Recent screening techniques have made large amounts of protein-protein interaction data available, from which biologically important information such as the function of uncharacterized proteins, the existence of novel protein complexes, and novel signal-transduction pathways can be discovered. However, experimental data on protein interactions contain many false positives, making these discoveries difficult. Therefore computational methods of assessing the reliability of each candidate protein-protein interaction are urgently needed. RESULTS: We developed a new 'interaction generality' measure (IG2) to assess the reliability of protein-protein interactions using only the topological properties of their interaction-network structure. Using yeast protein-protein interaction data, we showed that reliable protein-protein interactions had significantly lower IG2 values than less-reliable interactions, suggesting that IG2 values can be used to evaluate and filter interaction data to enable the construction of reliable protein-protein interaction networks.
16.
Bork P 《Bioinformatics (Oxford, England)》2002,18(Z2):S64
Recent advances in proteomics and computational biology have led to a flood of protein interaction data and resulting interaction networks (e.g. (Gavin et al., 2002)). Here I first analyse the status and quality of parts lists (genes and proteins), then comparatively assess large-scale protein interaction data (von Mering et al., 2002) and finally try to identify biologically meaningful units (e.g. pathways, cellular processes) within interaction networks that are derived from the conservation of gene neighborhood (Snel et al., 2002). Possible extensions of gene neighborhood analysis to eukaryotes (von Mering and Bork, 2002) will be discussed.
17.
Luo Feng; Yang Yunfeng; Chen Chin-Fu; Chang Roger; Zhou Jizhong; Scheuermann Richard H. 《Bioinformatics (Oxford, England)》2007,23(7):916
Bioinformatics (2007) 23(2),
18.
M Syvanen 《Journal of theoretical biology》1985,112(2):333-343
It has been established that genes can be transferred and expressed among procaryotes of different species. I am hypothesizing--and there is mounting evidence for this conclusion--that genes are transferred and expressed among all species, and that such exchange is facilitated by, and can help account for, the existence of the biological unities, from the uniform genetic code to the cross-species similarity of the stages of embryological development. If this idea is correct, the uniformity of the genetic code would allow organisms to decipher and use genes transposed from chromosomes of foreign species, and the shared sequence of embryological development within each phylum would allow the organism to integrate these genes, particularly when the genes affect complex morphological traits. The cross-species gene transfer model could help explain many observations which have puzzled evolutionists, such as rapid bursts in evolution and the widespread occurrence of parallelism in the fossil record.
19.
Modular organization of protein interaction networks (total citations: 6; self-citations: 0; citations by others: 6)
Luo F Yang Y Chen CF Chang R Zhou J Scheuermann RH 《Bioinformatics (Oxford, England)》2007,23(2):207-214
MOTIVATION: Accumulating evidence suggests that biological systems are composed of interacting, separable, functional modules. Identifying these modules is essential to understand the organization of biological systems. RESULT: In this paper, we present a framework to identify modules within biological networks. In this approach, the concept of degree is extended from the single vertex to the sub-graph, and a formal definition of module in a network is used. A new agglomerative algorithm was developed to identify modules from the network by combining the new module definition with the relative edge order generated by the Girvan-Newman (G-N) algorithm. A JAVA program, MoNet, was developed to implement the algorithm. Applying MoNet to the yeast core protein interaction network from the database of interacting proteins (DIP) identified 86 simple modules with sizes larger than three proteins. The modules obtained are significantly enriched in proteins with related biological process Gene Ontology terms. A comparison between the MoNet modules and modules defined by Radicchi et al. (2004) indicates that MoNet modules show stronger co-clustering of related genes and are more robust to ties in betweenness values. Further, the MoNet output retains the adjacent relationships between modules and allows the construction of an interaction web of modules providing insight regarding the relationships between different functional modules. Thus, MoNet provides an objective approach to understand the organization and interactions of biological processes in cellular systems. AVAILABILITY: MoNet is available upon request from the authors.
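One way to read "the concept of degree is extended from the single vertex to the sub-graph" is to count the edges inside a node set and the edges that cross its boundary. The sketch below illustrates that reading only. The abstract does not state the exact module definition, so the rule used in `looks_like_module` (more internal than boundary edges) is an assumption, and all names and the toy edge list are hypothetical rather than taken from MoNet.

```python
def subgraph_degrees(edges, members):
    """Count edges inside a node set and edges crossing its boundary."""
    members = set(members)
    inside = 0
    crossing = 0
    for u, v in edges:
        if u in members and v in members:
            inside += 1
        elif u in members or v in members:
            crossing += 1
    return inside, crossing


def looks_like_module(edges, members):
    # Assumed criterion for illustration: more internal than boundary edges.
    inside, crossing = subgraph_degrees(edges, members)
    return inside > crossing


# Hypothetical toy network: a triangle A-B-C with a tail C-D-E.
toy_edges = [("A", "B"), ("B", "C"), ("A", "C"), ("C", "D"), ("D", "E")]
triangle = subgraph_degrees(toy_edges, {"A", "B", "C"})
pair = subgraph_degrees(toy_edges, {"C", "D"})
```

Under the assumed rule, the triangle {A, B, C} qualifies (three internal edges, one boundary edge), while the pair {C, D} does not (one internal edge, three boundary edges).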
20.
Schächter V 《BioTechniques》2002,(Z1):16-8, 20-4, 26-7
We survey recent techniques for construction and prediction of large-scale protein interaction networks, focusing on computational processing steps. Special emphasis is placed on critical assessment of data completeness and reliability of the various approaches. Once built, protein interaction networks can be used for functional annotation or to generate higher-level biological hypotheses on pathways.