Similar articles (20 found)
1.

Background

A goal of systems biology is to analyze large-scale molecular networks including gene expressions and protein-protein interactions, revealing the relationships between network structures and their biological functions. Dividing a protein-protein interaction (PPI) network into naturally grouped parts is an essential way to investigate the relationship between topology of networks and their functions. However, clear modular decomposition is often hard due to the heterogeneous or scale-free properties of PPI networks.

Methodology/Principal Findings

To address this problem, we propose a diffusion model-based spectral clustering algorithm that solves for the cluster structure of PPI networks analytically by treating it as a random-walk problem in a diffusion process on the network. To cope with the heterogeneity of the networks, a power factor is introduced that adjusts the diffusion matrix by weighting the transition (adjacency) matrix according to a node degree matrix. The algorithm is named adjustable diffusion matrix-based spectral clustering (ADMSC). To demonstrate the feasibility of ADMSC, we apply it to the decomposition of a yeast PPI network and identify biologically significant clusters of approximately equal size. Compared with other established algorithms, ADMSC facilitates clear and fast decomposition of PPI networks.

Conclusions/Significance

ADMSC introduces a power factor that adjusts the diffusion matrix to the heterogeneity of PPI networks. It effectively partitions PPI networks into biologically significant clusters of almost equal size, while being very fast, robust and appealingly simple.
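
The abstract does not give the exact form of the adjusted diffusion matrix, so the following is only a minimal sketch of the general idea, not ADMSC itself. It assumes the adjustment takes the form M = D^(-beta) A D^(-beta), where A is the adjacency matrix, D the degree matrix and `beta` a hypothetical power factor, and it splits the network in two by the sign of the second eigenvector of M. All function names are illustrative.

```python
import math


def adjusted_matrix(adj, beta):
    """Return D^-beta * A * D^-beta for a symmetric adjacency matrix (assumed form)."""
    degree = [sum(row) for row in adj]
    scale = [d ** -beta if d > 0 else 0.0 for d in degree]
    n = len(adj)
    return [[scale[i] * adj[i][j] * scale[j] for j in range(n)] for i in range(n)]


def _normalize(v):
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def _remove_components(v, basis):
    """Project v onto the orthogonal complement of the unit vectors in basis."""
    for u in basis:
        dot = sum(a * b for a, b in zip(v, u))
        v = [a - dot * b for a, b in zip(v, u)]
    return v


def _power_iteration(mat, start, orthogonal_to=(), steps=500):
    # Shifting by a Gershgorin bound makes every eigenvalue non-negative,
    # so the iteration converges to the algebraically largest one.
    shift = max(sum(abs(x) for x in row) for row in mat)
    v = list(start)
    for _ in range(steps):
        v = _remove_components(v, orthogonal_to)
        v = [sum(m * x for m, x in zip(row, v)) + shift * vi
             for row, vi in zip(mat, v)]
        v = _normalize(v)
    return _remove_components(v, orthogonal_to)


def spectral_bipartition(adj, beta=0.5):
    """Label each node 0 or 1 by the sign of the second eigenvector."""
    mat = adjusted_matrix(adj, beta)
    n = len(adj)
    first = _normalize(_power_iteration(mat, [1.0] * n))
    second = _power_iteration(mat, [float(i + 1) for i in range(n)],
                              orthogonal_to=[first])
    return [0 if x >= 0 else 1 for x in second]
```

With `beta = 0.5` the matrix reduces to the usual symmetrically normalized adjacency matrix; other values weight high-degree hubs more or less strongly, which is the kind of knob the abstract describes.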

2.
Understanding complex networks of protein-protein interactions (PPIs) is one of the foremost challenges of the post-genomic era. Owing to recent advances in experimental biotechnology, including yeast two-hybrid (Y2H), tandem affinity purification (TAP) and other high-throughput methods for PPI detection, huge amounts of PPI network data are becoming available. Of major concern, however, are the levels of noise and incompleteness. For example, for Y2H screens, it is thought that the false positive rate could be as high as 64%, and the false negative rate may range from 43% to 71%. TAP experiments are believed to have comparable levels of noise. We present a novel technique to assess the confidence levels of interactions in PPI networks obtained from experimental studies. We use it to predict new interactions and thus to guide future biological experiments. This technique is the first to utilize geometric graphs, currently the best-fitting network model for PPI networks. Our approach achieves a specificity of 85% and a sensitivity of 90%. We use it to assign confidence scores to physical protein-protein interactions in the human PPI network downloaded from BioGRID. Using our approach, we predict 251 interactions in the human PPI network, a statistically significant fraction of which correspond to protein pairs sharing common GO terms. Moreover, we validate a statistically significant portion of our predicted interactions in the HPRD database and the newer release of BioGRID. The data and Matlab code implementing the methods are freely available from the web site: http://www.kuchaev.com/Denoising.

3.
Essential proteins are those that are indispensable to cellular survival and development. Existing methods for essential protein identification generally rely on knock-out experiments and/or the relative density of their interactions (edges) with other proteins in a protein-protein interaction (PPI) network. Here, we present a computational method, called EW, that first ranks protein-protein interactions by their Edge Weights, then identifies sub-PPI-networks consisting only of the highly ranked edges and predicts their proteins to be essential. We have applied this method to publicly available PPI data on Saccharomyces cerevisiae (yeast) and Escherichia coli (E. coli) for essential protein identification, and demonstrated that EW achieves better performance than the state-of-the-art methods in terms of the precision-recall and jackknife measures. The protein-protein interactions ranked highly by our prediction tend to be biologically significant in both the yeast and E. coli PPI networks. Further analyses on yeast and E. coli PPI networks systematically perturbed by randomly deleting edges demonstrate that the proposed method is robust and that the top-ranked edges tend to be more strongly associated with known essential proteins than the low-ranked edges.

4.
Protein-protein complex formation involves removal of water from the interface region. Surface regions with a small free energy penalty for water removal or desolvation may correspond to preferred interaction sites. A method to calculate the electrostatic free energy of placing a neutral low-dielectric probe at various protein surface positions has been designed and applied to characterize putative interaction sites. Based on solutions of the finite-difference Poisson equation, this method also includes long-range electrostatic contributions and the shape of the protein-solvent boundary, in contrast to accessible-surface-area-based solvation energies. Calculations on a large set of proteins indicate that in many cases (>90%), the known binding site overlaps with one of the six regions of lowest electrostatic desolvation penalty (overlap with the lowest desolvation region for 48% of proteins). Since the onset of electrostatic desolvation occurs even before direct protein-protein contact formation, it may help guide proteins toward the binding region in the final stage of complex formation. Interestingly, the probe desolvation properties associated with residue types were found to depend to some degree on whether the residue was outside of or part of a binding site. The probe desolvation penalty was on average smaller if the residue was part of a binding site than at other surface locations. Applications to several antigen-antibody complexes demonstrated that the approach might be useful not only to predict protein interaction sites in general but also to map potential antigenic epitopes on protein surfaces.

5.
6.
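Advances in enrichment analysis methods for gene expression profiles   Cited by: 1 (self-citations: 0, citations by others: 1)
Microarray technology lies at the core of the biotechnology revolution: it allows researchers to monitor the expression levels of thousands of genes simultaneously and has been widely applied in medical research. How to mine useful information from massive gene expression data and interpret it biologically is a major challenge in gene expression profile analysis. Researchers have proposed a variety of gene set-based enrichment analysis methods, which are broadly divided here into two categories: bottom-up methods and top-down methods. The former first perform single-gene analysis and then annotate and analyze gene sets according to biological domain knowledge. This approach is widely used, and its results are easier to interpret than those of single-gene analysis. The latter first group genes according to biological domain knowledge and then analyze the differential expression patterns of the genes. This approach not only improves the interpretability of the conclusions but also achieves dimensionality reduction.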
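
As an illustration of the bottom-up style described above, one common choice for the gene-set step is a hypergeometric over-representation test applied to the list of genes selected by single-gene analysis. The review does not single out this test, so it is an assumption here, and the function and parameter names are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(total, in_set, selected, overlap):
    """P(X >= overlap) when `selected` genes are drawn without replacement
    from `total` genes, of which `in_set` belong to the gene set."""
    upper = min(in_set, selected)
    tail = sum(
        comb(in_set, i) * comb(total - in_set, selected - i)
        for i in range(overlap, upper + 1)
    )
    return tail / comb(total, selected)
```

A small p-value suggests that the gene set is over-represented among the selected genes; in practice the p-values of many gene sets would still need a multiple-testing correction.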

7.
8.
9.
Cellular functions are based on the complex interplay of proteins; the structure and dynamics of these protein-protein interaction (PPI) networks are therefore the key to a functional understanding of cells. In recent years, large-scale PPI networks of several model organisms have been investigated. A number of theoretical models have been developed to explain both the network formation and the current structure. Models based on duplication and divergence of genes are favored, as they most closely represent the biological foundation of network evolution. However, studies are often based on simulated instead of empirical data, or they cover only single organisms. Methodological improvements now allow the analysis of PPI networks of multiple organisms simultaneously as well as the direct modeling of ancestral networks. This provides the opportunity to challenge existing assumptions on network evolution. We utilized present-day PPI networks from integrated datasets of seven model organisms and developed a theoretical and bioinformatic framework for studying the evolutionary dynamics of PPI networks. A novel filtering approach using percolation analysis was developed to remove low-confidence interactions based on topological constraints. We then reconstructed the ancient PPI networks of different ancestors, for which the ancestral proteomes, as well as the ancestral interactions, were inferred. Ancestral proteins were reconstructed using orthologous groups on different evolutionary levels. A stochastic approach, using the duplication-divergence model, was developed for estimating the probabilities of ancient interactions from today's PPI networks. The growth rates for nodes, edges, sizes and modularities of the networks indicate multiplicative growth and are consistent with the results from independent static analysis.
Our results support the duplication-divergence model of evolution and indicate fractality and multiplicative growth as general properties of the PPI network structure and dynamics.
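
For readers unfamiliar with the duplication-divergence model, the following is a minimal forward simulation of its generic form: a random protein is duplicated, and the copy keeps each of the parent's interactions with a fixed probability. It is not the authors' stochastic reconstruction of ancestral networks; `retain_prob` and the two-node starting network are assumptions made for illustration.

```python
import random


def duplication_divergence(n_nodes, retain_prob, seed=0):
    """Grow an undirected network by repeated duplication and divergence.

    Returns a dict mapping each node to the set of its neighbors.
    """
    rng = random.Random(seed)
    neighbors = {0: {1}, 1: {0}}  # assumed seed network: one interacting pair
    while len(neighbors) < n_nodes:
        parent = rng.choice(sorted(neighbors))
        child = len(neighbors)
        # Divergence: the copy keeps each parental interaction independently.
        kept = {nb for nb in sorted(neighbors[parent])
                if rng.random() < retain_prob}
        neighbors[child] = kept
        for nb in kept:
            neighbors[nb].add(child)
    return neighbors
```

Calling `duplication_divergence(1000, 0.4)` yields a network whose node and edge counts can be tracked over time to compare growth behavior against empirical data.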

10.
In addition to their biological function, protein complexes reduce the exposure of the constituent proteins to the risk of undesired oligomerization by reducing the concentration of the free monomeric state. We interpret this reduced risk as a stabilization of the functional state of the protein. We estimate that protein-protein interactions can account for of additional stabilization; a substantial contribution to intrinsic stability. We hypothesize that proteins in the interaction network act as evolutionary capacitors that allow their binding partners to explore regions of sequence space that correspond to less stable proteins. In the interaction network of baker's yeast, we find that, statistically, proteins that receive higher energetic benefits from the interaction network are more likely to misfold. A simplified fitness landscape wherein the fitness of an organism is inversely proportional to the total concentration of unfolded proteins provides an evolutionary justification for the proposed trends. We conclude by outlining clear biophysical experiments to test our predictions.

11.
Cholinergic neurons of the nucleus basalis (NB) are selectively vulnerable in Alzheimer's disease (AD), yet the molecular mechanisms associated with their dysfunction remain unknown. We used single cell RNA amplification and custom array technology to examine the expression of functional classes of mRNAs found in anterior NB neurons from normal aged and AD subjects. mRNAs encoding neurotrophin receptors, synaptic proteins, protein phosphatases, and amyloid-related proteins were evaluated. We found that trkB and trkC mRNAs were selectively down-regulated in NB neurons, whereas p75NTR mRNA levels remained stable in end-stage AD. TrkA mRNA was reduced by approximately 28%, but the reduction did not reach statistical significance. There was a down-regulation of synaptophysin, synaptotagmin, and protein phosphatases PP1 and PP1 mRNAs in AD. In contrast, we found a selective up-regulation of cathepsin D mRNA in NB neurons in AD brain. Thus, anterior NB neurons undergo selective alterations in gene expression in AD. These results may provide clues to the molecular pathogenesis of NB neuronal degeneration during AD.

12.
Mutations in the angiogenic factor angiogenin (ANG), which is thought to have a neuroprotective function, have been identified in patients with both familial and sporadic amyotrophic lateral sclerosis (ALS). Parkinsonism has been noted in kindreds with ANG mutations, and variants in the ANG gene have been found to associate with PD in two Caucasian populations. We therefore hypothesized that mutations in ANG may also contribute to idiopathic Parkinson's disease (PD). We sequenced the ANG gene in a total of 1498 participants comprising 750 PD patients and 748 age- and gender-matched controls from Taiwan. We identified one novel synonymous substitution, c.C100T (p.L10L), in the heterozygous state in a single PD patient; it was not observed in controls. The clinical phenotypes and [99mTc]-TRODAT-SPECT images of the p.L10L carrier were similar to those seen in idiopathic PD. In addition, we identified one common variant, c.T330G (p.G110G, rs11701), which was previously reported to associate with PD risk in Caucasians. However, the frequency of the TG/GG genotype was comparable between PD cases and controls (odds ratio: 0.85, 95% confidence interval: 0.29–2.55, P = 0.78). Our results do not support the ANG rs11701 variant as a genetic risk factor for PD in our population. We conclude that mutations in ANG are not a common cause of idiopathic PD.

13.
Escherichia coli-mycobacterium shuttle vectors are important tools for gene expression and gene replacement in mycobacteria. However, most of the currently available vectors are limited in their use because they lack extended multiple cloning sites (MCSs) and a convenient way to append an epitope tag(s) to the cloned open reading frames (ORFs). Here we report a new series of vectors that allow for the constitutive and regulatable expression of proteins, appended with peptide tag sequences at their N and C termini, respectively. The applicability of these vectors is demonstrated by the constitutive and induced expression of the Mycobacterium tuberculosis pknK gene, coding for protein kinase K, a serine-threonine protein kinase. Furthermore, a suicide plasmid with an expanded MCS for creating gene replacements, a plasmid for chromosomal integrations at the commonly used L5 attB site, and a hypoxia-responsive vector for expression of a gene(s) under hypoxic conditions that mimic latency have also been created. Additionally, we have created a vector for the coexpression of two proteins controlled by two independent promoters, with each protein fused to a different tag. The shuttle vectors developed in the present study are excellent tools for the analysis of gene function in mycobacteria and are a valuable addition to the existing repertoire of vectors for mycobacterial research.

14.
As pharmacodynamic drug-drug interactions (PD DDIs) could lead to severe adverse effects in patients, it is important to identify potential PD DDIs in drug development. The signaling starting from drug targets is propagated through protein-protein interaction (PPI) networks. PD DDIs could occur by close interference on the same targets or within the same pathways as well as by distant interference through cross-talking pathways. However, most previous approaches have considered only close interference, by measuring distances between drug targets or comparing target neighbors. We have applied a random walk with restart algorithm to simulate signaling propagation from drug targets in order to capture the possibility of their distant interference. Cross-validation with DrugBank and Kyoto Encyclopedia of Genes and Genomes DRUG shows that the proposed method significantly outperforms the previous methods. We also provide a web service with which PD DDIs for drug pairs can be analyzed at http://biosoft.kaist.ac.kr/targetrw.
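
The random walk with restart mentioned above can be sketched as follows. This is a generic textbook formulation on an unweighted network, not the authors' implementation; the restart probability of 0.3 and the convergence tolerance are hypothetical defaults, and the scoring of drug pairs from the resulting profiles is left out.

```python
def random_walk_with_restart(adj, seeds, restart=0.3, tol=1e-10, max_iter=1000):
    """Return the stationary visiting probability of each node.

    adj   : symmetric 0/1 adjacency matrix (list of lists)
    seeds : set of node indices standing in for the drug targets
    """
    n = len(adj)
    degree = [sum(row) for row in adj]
    p0 = [1.0 / len(seeds) if i in seeds else 0.0 for i in range(n)]
    p = p0[:]
    for _ in range(max_iter):
        nxt = []
        for i in range(n):
            # Probability mass arriving at node i from each neighbor j.
            inflow = sum(adj[j][i] * p[j] / degree[j]
                         for j in range(n) if degree[j] > 0)
            nxt.append((1.0 - restart) * inflow + restart * p0[i])
        if sum(abs(a - b) for a, b in zip(nxt, p)) < tol:
            return nxt
        p = nxt
    return p
```

Running the walk once per drug gives one probability profile per drug; comparing two profiles (for example by their overlap) is one way to capture interference between targets that are not direct neighbors.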

15.
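Expression profile analysis of genes related to trophoblast invasion   Cited by: 1 (self-citations: 0, citations by others: 1)
Cytotrophoblasts and extravillous trophoblasts from normal pregnancies at 8 to 12 weeks of gestation were isolated and collected, total cellular RNA was extracted, and cRNA probes were prepared and hybridized to Affymetrix U133 Plus 2.0 gene chips to obtain gene expression profiles of normal cytotrophoblasts and extravillous trophoblasts. Computational analysis identified 1318 differentially expressed genes, of which 813 were up-regulated and 505 were down-regulated. All differentially expressed genes were functionally annotated according to the Gene Ontology classification. The results provide an experimental basis for studying the gene regulatory mechanisms of extravillous trophoblast invasion in early embryonic development.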

16.
17.
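RNA-Seq has become a powerful tool for transcriptomics research and is particularly valuable for screening differentially expressed genes in tumors. To further elucidate the molecular mechanisms of hepatocellular carcinoma (HCC), this study performed a bioinformatic analysis of an RNA-Seq dataset from GEO (GSE63863) comprising 12 pairs of HCC tissue samples. Statistical analysis with three software packages based on different algorithms (edgeR, DESeq2 and voom) yielded 976 differentially expressed genes (adj. p-value < 0.01 or FDR < 0.01, |logFC| ≥ 2), of which 422 (43.2%) were up-regulated and 554 (56.8%) were down-regulated. GO enrichment analysis showed that these genes are mainly involved in molecular functions such as ion binding and oxidoreductase activity and in biological processes such as oxidation-reduction and cell division; KEGG pathway analysis showed that they are mainly involved in pathways such as the cell cycle and retinol metabolism. STRING analysis showed that the proteins encoded by 654 of the genes interact with one another, and further MCODE analysis showed that the proteins encoded by 169 genes form four subnetworks whose hub genes are UBE2C, GNG4, TTR and FOS, respectively; aberrant expression of these genes may play an important role in the development and progression of HCC. These results provide a preliminary basis for further elucidating the molecular pathogenesis of HCC and for identifying novel biomarkers.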
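
The selection criteria quoted above (FDR < 0.01 and |logFC| ≥ 2) can be illustrated with a short sketch. edgeR, DESeq2 and voom each fit their own statistical models to produce p-values and fold changes; the code below covers only the final thresholding step and assumes those values are already available. The Benjamini-Hochberg adjustment is used as the FDR procedure, and the function names and input layout are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def select_de_genes(genes, fdr_cutoff=0.01, logfc_cutoff=2.0):
    """genes: list of (name, log2 fold change, raw p-value) tuples.

    Returns (up-regulated names, down-regulated names).
    """
    adjusted = benjamini_hochberg([p for _, _, p in genes])
    up, down = [], []
    for (name, logfc, _), fdr in zip(genes, adjusted):
        if fdr < fdr_cutoff and abs(logfc) >= logfc_cutoff:
            (up if logfc > 0 else down).append(name)
    return up, down
```

Requiring both a fold-change and an FDR threshold keeps genes that are statistically supported and also show a large effect, which is why the two counts of up- and down-regulated genes are reported separately.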

18.
Cellular functions are always performed by protein complexes. Many approaches have been proposed to identify protein complexes from protein-protein interaction (PPI) networks. Some approaches focus on detecting local dense subgraphs in PPI networks, which are regarded as protein-complex cores, and then identify protein complexes by including local neighbors. However, gene expression profiles at different time points or in different tissues show that proteins are dynamic. Identifying dynamic protein complexes is therefore important and meaningful. In this study, a novel core-attachment-based method named CO-DPC for detecting dynamic protein complexes is presented. First, CO-DPC selects active proteins according to gene expression profiles and the 3-sigma principle, and constructs dynamic PPI networks based on the co-expression principle and PPI networks. Second, CO-DPC detects local dense subgraphs as the cores of protein complexes and then attaches close neighbors of these cores to form protein complexes. To evaluate the method, it and the existing algorithms are applied to yeast PPI networks. The experimental results show that CO-DPC performs much better than the existing methods. In addition, the identified dynamic protein complexes can match very well and thus become more meaningful for future biological study.
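
The first step described above can be sketched roughly as follows. The abstract does not state the exact threshold formula, so this assumes a protein is active at a time point when its expression exceeds the mean of its own profile plus `k` standard deviations, and that an interaction is kept at a time point when both partners are active there. The names and the form of the rule are assumptions, not the CO-DPC implementation.

```python
from statistics import mean, pstdev


def active_time_points(expression, k=3.0):
    """Time indices at which expression exceeds mean + k * standard deviation."""
    threshold = mean(expression) + k * pstdev(expression)
    return {t for t, value in enumerate(expression) if value > threshold}


def dynamic_networks(edges, profiles, k=3.0):
    """Map each time index to the interactions whose two proteins are both active.

    edges    : list of (protein_a, protein_b) pairs from the static PPI network
    profiles : dict mapping protein name to its expression time series
    """
    active = {name: active_time_points(series, k)
              for name, series in profiles.items()}
    networks = {}
    for a, b in edges:
        for t in sorted(active.get(a, set()) & active.get(b, set())):
            networks.setdefault(t, []).append((a, b))
    return networks
```

With short time series a strict multiplier of 3 selects very few time points, so `k` is left as a parameter here; core detection and neighbor attachment would then run separately on each per-time-point network.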

19.
The quorum sensing (QS) system, as a well-functioning population-dependent gene switch, has been widely applied in many gene circuits in synthetic biology. In our work, an efficient cell density-controlled expression system (QS) was established by engineering the Vibrio fischeri luxI-luxR quorum sensing system. To achieve in vivo programmed gene expression, a synthetic binary regulation circuit (araQS) was constructed by assembling multiple genetic components, including the quorum quenching protein AiiA and the arabinose promoter ParaBAD, into the QS system. In vitro expression assays verified that the araQS system was initiated only in the absence of arabinose in the medium at a high cell density. In vivo expression assays confirmed that the araQS system presented an in vivo-triggered and cell density-dependent expression pattern. Furthermore, the araQS system was demonstrated to function well in different bacteria, indicating a wide range of possible bacterial hosts. To explore its potential applications in vivo, the araQS system was used to control the production of a heterologous protective antigen in an attenuated Edwardsiella tarda strain, which successfully evoked efficient immune protection in a fish model. This work suggests that the araQS system could program bacterial expression in vivo and might have potential uses, including, but not limited to, bacterial vector vaccines.

20.
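The subcellular localization of a protein is important information for studying its function. After synthesis, proteins are transported to specific organelles; only when delivered to the correct location can they take part in the cell's activities and function effectively. We combined encodings of conserved sequences and protein interaction data with the traditional amino acid composition encoding and used a support vector machine to predict protein subcellular localization. In eukaryotes, the accuracy over five rounds of cross-validation reached 91.8%, a significant improvement.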
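
The traditional amino acid composition encoding mentioned above can be sketched in a few lines. This shows only that one feature block; the conserved-sequence and interaction encodings and the support vector machine itself are not reproduced, and the function name and the handling of non-standard residues are assumptions.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def amino_acid_composition(sequence):
    """Return a 20-dimensional vector of residue frequencies.

    Characters outside the 20 standard amino acids are ignored (an assumption).
    """
    residues = [aa for aa in sequence.upper() if aa in AMINO_ACIDS]
    if not residues:
        return [0.0] * len(AMINO_ACIDS)
    return [residues.count(aa) / len(residues) for aa in AMINO_ACIDS]
```

Each protein then becomes a fixed-length numeric vector, to which further feature blocks can be appended before training a classifier.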
