Comparison of topological clustering within protein networks using edge metrics that evaluate full sequence, full structure, and active site microenvironment similarity

Authors:	Janelle B Leuthaeuser, Stacy T Knutson, Kiran Kumar, Patricia C Babbitt, Jacquelyn S Fetrow

Affiliation:	1. Department of Molecular Genetics and Genomics, Wake Forest University, Winston‐Salem, North Carolina; 2. Departments of Computer Science and Physics, Wake Forest University, Winston‐Salem, North Carolina; 3. Department of Bioengineering and Therapeutic Sciences, Institute for Quantitative Biosciences, University of California San Francisco, San Francisco, California; 4. Department of Pharmaceutical Chemistry, Institute for Quantitative Biosciences, University of California San Francisco, San Francisco, California; 5. Office of the Provost, Maryland Hall 202, University of Richmond, Virginia

Abstract:	The development of accurate protein function annotation methods has emerged as a major unsolved biological problem. Protein similarity networks, one approach to function annotation via annotation transfer, group proteins into similarity-based clusters. An underlying assumption is that the edge metric used to identify such clusters correlates with functional information. In this contribution, this assumption is evaluated by observing topologies in similarity networks using three different edge metrics: sequence (BLAST), structure (TM-Align), and active site similarity (active site profiling, implemented in DASP). Network topologies for four well-studied protein superfamilies (enolase, peroxiredoxin (Prx), glutathione transferase (GST), and crotonase) were compared with curated functional hierarchies and structure. As expected, network topology differs depending on the edge metric; comparison of topologies provides valuable information on structure/function relationships. Subnetworks based on active site similarity correlate with known functional hierarchies at a single edge threshold more often than sequence- or structure-based networks. Sequence- and structure-based networks are useful for identifying sequence and domain similarities and differences; therefore, it is important to consider the clustering goal before deciding on an appropriate edge metric. Further, conserved active site residues identified in enolase and GST active site subnetworks correspond with published functionally important residues. Extension of this analysis yields predictions of functionally determinant residues for GST subgroups. These results support the hypothesis that active site similarity-based networks reveal clusters that share functional details, and they lay the foundation for capturing functionally relevant hierarchies using an approach that is both automatable and capable of delivering greater precision in function annotation than current similarity-based methods.

Keywords:	active site profiling; similarity-based clustering; network-based clustering; protein similarity network analysis; Structure-Function Linkage Database (SFLD); protein function annotation; function annotation transfer
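
The abstract refers to clusters that appear in a similarity network "at a single edge threshold". As a rough illustration of that idea only, and not the authors' pipeline, the sketch below keeps every edge whose score meets a threshold and reports the resulting connected components as clusters. The protein names, the scores, and the function name `cluster_at_threshold` are all hypothetical, and the code assumes that a higher score means greater similarity (a metric where lower is better, such as a BLAST E-value, would need the comparison reversed or the score transformed first).

```python
from collections import defaultdict


def cluster_at_threshold(nodes, scored_edges, threshold):
    """Group nodes into connected components, keeping only edges
    whose similarity score is >= threshold (higher = more similar)."""
    parent = {n: n for n in nodes}

    def find(x):
        # Union-find root lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b, score in scored_edges:
        if score >= threshold:
            parent[find(a)] = find(b)

    groups = defaultdict(set)
    for n in nodes:
        groups[find(n)].add(n)
    # Sort members and clusters so the result is deterministic.
    return sorted((sorted(g) for g in groups.values()), key=lambda g: g[0])


# Hypothetical proteins and made-up pairwise similarity scores.
proteins = ["P1", "P2", "P3", "P4", "P5"]
edges = [
    ("P1", "P2", 0.9),
    ("P2", "P3", 0.6),
    ("P3", "P4", 0.3),
    ("P4", "P5", 0.8),
]

loose = cluster_at_threshold(proteins, edges, 0.5)
strict = cluster_at_threshold(proteins, edges, 0.7)
print(loose)
print(strict)
```

Raising the threshold removes weaker edges, so a cluster can split into smaller ones. Comparing how clusters split under different edge metrics is the kind of topology comparison the abstract describes.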
