

In Silico Analysis of Phosphoproteome Data Suggests a Rich-get-richer Process of Phosphosite Accumulation over Evolution
Authors: Nozomu Yachie, Rintaro Saito, Junichi Sugahara, Masaru Tomita, and Yasushi Ishihama
Institution: 3. Institute for Advanced Biosciences, Keio University, 403-1, Daihoji, Tsuruoka, Yamagata 997-0017, Japan; 4. Systems Biology Program, Graduate School of Media and Governance, Keio University, Endo 5322, Fujisawa, Kanagawa 252-8520, Japan; 5. Faculty of Environment and Information Studies, Keio University, Endo 5322, Fujisawa, Kanagawa 252-8520, Japan, and
Abstract: Recent phosphoproteome analyses using mass spectrometry-based technologies have provided new insights into the extensive presence of protein phosphorylation in various species and have raised the interesting question of how this protein modification was gained evolutionarily on such a large scale. We investigated this issue by using human and mouse phosphoproteome data. We initially found that phosphoproteins followed a power-law distribution with regard to their number of phosphosites: most of the proteins included only a few phosphosites, but some included dozens of phosphosites. The power-law distribution, unlike more commonly observed distributions such as normal and log-normal distributions, is considered by the field of complex systems science to be produced by a specific rich-get-richer process called preferential attachment growth. Therefore, we explored the factors that may have promoted the rich-get-richer process during phosphosite evolution. We conducted a bioinformatics analysis to evaluate the relationship of amino acid sequences of phosphoproteins with the positions of phosphosites and found an overconcentration of phosphosites in specific regions of protein surfaces and implications that in many phosphoproteins these clusters of phosphosites are activated simultaneously. Multiple phosphosites concentrated in limited spaces on phosphoprotein surfaces may therefore function biologically as cooperative modules that are resistant to selective pressures during phosphoprotein evolution. We therefore proposed a hypothetical model by which the modularization of multiple phosphosites has been resistant to natural selection and has driven the rich-get-richer process of the evolutionary growth of phosphosite numbers.

Protein phosphorylation is an important and ubiquitous post-translational modification that regulates a variety of biological processes in various organisms (1–4).
Reversible phosphorylations of serine, threonine, and tyrosine residues are critical steps in the control of signal transduction pathways (1–4). Recent advances in MS-based technologies and phosphopeptide enrichment methods have allowed high-throughput, large-scale in vivo phosphosite mapping for a wide variety of organisms such as human (5–8), mouse (9), yeast (10–12), fly (13,14), bacteria (15,16), and plants (17–19). Moreover, information on several hundred to several thousand phosphosites from each study has been gathered in public databases such as Phospho.ELM (20), PhosphoSitePlus, PHOSIDA (21), and UniProt (22). However, the total number of phosphosites and most of their biological functions are still unknown. Similarly, only about 10% of the estimated 500–600 human kinases target known phosphosite consensus sequences within their substrate proteins (23). Although the tyrosine phosphoproteome in Arabidopsis was recently published (24), the corresponding tyrosine kinases have not been identified because of the lack of known consensus sequences activated by tyrosine kinases.

Computational data-mining approaches have been required to extract information from the large amount of accumulated phosphosite data obtained from experimental approaches. These approaches have also been used to add more meaningful information about each of the phosphosites to understand the proteome-wide protein phosphorylation in various organisms. One of the most useful strategies of computational data mining is to identify phosphorylated sequence motifs by extracting consensus sequences from the sets of amino acid sequences clustered around phosphorylated residues (25).
A number of kinases and their corresponding recognition substrate motifs have been successfully identified by the experimental approach of incubating each target kinase with a combinatorial substrate peptide library and ATP, and these data are registered in various databases, including the Human Protein Reference Database (HPRD) (26). With this knowledge of documented kinases and their related sequence motifs, we can use computational biology techniques to discover additional phosphorylated motifs in the numerous substrates shown in phosphoproteomics studies to be biologically phosphorylated. This has allowed us to reconstruct the kinome on a large scale (27–29).

A comparative study of phosphoproteome data in multiple species has revealed that a wide range of phosphoproteins are well conserved relative to non-phosphoproteins, and similarly many phosphoserine (Ser(P)), phosphothreonine (Thr(P)), and phosphotyrosine (Tyr(P)) phosphosites are well conserved compared with non-phosphorylated sites (21). Under natural selection, the emergence of phosphoproteins and the gain and loss of phosphosites should have changed the regulation of many intracellular systems, such as kinetic pathways, subcellular protein localization, and protein interactions and stabilization. This triggered our interest in the evolution of phosphoproteins and their phosphosites.

In this study, we combined statistical physics and computational biology to investigate the role of selective pressure in the evolution of phosphoproteins and to create a model of the evolutionary gain of phosphosites. First, using the human and mouse phosphoproteome data registered in public databases, we discovered that the number of phosphosites in each phosphoprotein follows a power-law distribution, which has been shown in complex systems science and statistical physics to emerge through a specific rich-get-richer process called preferential attachment growth (30–32).
We therefore hypothesized that phosphoproteins may have evolved through a rich-get-richer process, gaining new phosphosites according to a probability density proportional to their current number of phosphosites. Starting from this hypothesis, we then explored how this particular evolutionary pattern may have arisen during natural selection and suggested that sets of phosphosites localized in limited spaces on protein surfaces may function as cooperative modules that are resistant to selective pressures. Therefore, to explain phosphosite evolution, we proposed a model in which the evolutionary gain of phosphosites follows a rich-get-richer process and evolution is promoted by the development of cooperative functional modules on protein surfaces.
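
The rich-get-richer hypothesis above can be illustrated with a small simulation. The following is a minimal sketch of a generic Yule-Simon style preferential attachment process, not the authors' model or analysis: at each step either a new protein with one phosphosite appears, or an existing protein gains a phosphosite with probability proportional to its current phosphosite count. The function names, the `p_new` parameter, and all default values are assumptions chosen for illustration only.

```python
import random
from collections import Counter


def simulate_rich_get_richer(n_steps=50000, p_new=0.3, seed=0):
    """Simulate a Yule-Simon style gain of phosphosites (illustrative only).

    n_steps and p_new are hypothetical parameters, not values from the paper.
    Returns a list where counts[i] is the number of phosphosites on protein i.
    """
    rng = random.Random(seed)
    counts = [1]  # start with a single protein carrying one phosphosite
    # The urn holds one entry per phosphosite, labelled with its protein index.
    # Drawing uniformly from the urn therefore selects a protein with
    # probability proportional to its current number of phosphosites.
    urn = [0]
    for _ in range(n_steps):
        if rng.random() < p_new:
            # A new phosphoprotein enters with its first phosphosite.
            counts.append(1)
            urn.append(len(counts) - 1)
        else:
            # An existing protein gains a site (preferential attachment).
            protein = rng.choice(urn)
            counts[protein] += 1
            urn.append(protein)
    return counts


def site_count_histogram(counts):
    """Map each phosphosite count k to the number of proteins with k sites."""
    return dict(sorted(Counter(counts).items()))


if __name__ == "__main__":
    hist = site_count_histogram(simulate_rich_get_richer())
    # Print the head of the distribution; plotting hist on log-log axes
    # is the usual way to inspect the heavy tail.
    for k in list(hist)[:10]:
        print(k, hist[k])
```

In this sketch most simulated proteins end up with only one or two phosphosites while a few accumulate very many, which is the qualitative pattern the text describes. The inflow of new proteins matters: with a fixed set of proteins, proportional sampling alone would not be expected to produce a power-law tail.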
This article is indexed in ScienceDirect and other databases.