Similar literature
 Found 20 similar articles (search time: 15 ms)
1.

Background

Understanding protein complexes is important for understanding the science of cellular organization and function. Many computational methods have been developed to identify protein complexes from experimentally obtained protein-protein interaction (PPI) networks. However, interaction information obtained experimentally can be unreliable and incomplete. Reconstructing these PPI networks with PPI evidence from other sources can improve protein complex identification.

Results

We combined PPI information from 6 different sources and obtained a reconstructed PPI network for yeast through machine learning. Some popular protein complex identification methods were then applied to detect yeast protein complexes using the new PPI network. Our evaluation indicates that protein complex identification algorithms using the reconstructed PPI network significantly outperform the same algorithms run on experimentally verified PPI networks.

Conclusions

We conclude that incorporating PPI information from other sources can improve the effectiveness of protein complex identification.

2.

Background

Our knowledge of global protein-protein interaction (PPI) networks in complex organisms such as humans is hindered by technical limitations of current methods.

Results

On the basis of short co-occurring polypeptide regions, we developed a tool called MP-PIPE capable of predicting a global human PPI network within 3 months. With a recall of 23% at a precision of 82.1%, we predicted 172,132 putative PPIs. We demonstrate the usefulness of these predictions through a range of experiments.
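The recall and precision figures quoted above follow the usual definitions for predicted interaction sets. The sketch below is only an illustration of those two measures on hypothetical toy pairs; the function name `precision_recall` and the data are assumptions, not part of MP-PIPE.

```python
def precision_recall(predicted, reference):
    """Return (precision, recall) of predicted pairs against a reference set."""
    # Interactions are undirected, so each pair is stored as a frozenset.
    pred = {frozenset(p) for p in predicted}
    ref = {frozenset(p) for p in reference}
    true_pos = len(pred & ref)
    precision = true_pos / len(pred) if pred else 0.0
    recall = true_pos / len(ref) if ref else 0.0
    return precision, recall


# Hypothetical toy data, not taken from the study.
predicted = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "E")]
reference = [("B", "A"), ("C", "B"), ("D", "C"),
             ("E", "F"), ("F", "G"), ("G", "H")]
precision, recall = precision_recall(predicted, reference)
```

In this toy case three of the four predictions are in the reference set, which contains six pairs in total.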

Conclusions

The speed and accuracy associated with MP-PIPE can make this a potential tool to study individual human PPI networks (from genomic sequences alone) for personalized medicine.

Electronic supplementary material

The online version of this article (doi:10.1186/s12859-014-0383-1) contains supplementary material, which is available to authorized users.

3.
Ou-Yang  Le  Yan  Hong  Zhang  Xiao-Fei 《BMC bioinformatics》2017,18(13):463-34

Background

The accurate identification of protein complexes is important for the understanding of cellular organization. Up to now, computational methods for protein complex detection have mostly focused on mining clusters from protein-protein interaction (PPI) networks. However, PPI data collected by high-throughput experimental techniques are known to be quite noisy. It is hard to achieve reliable prediction results by simply applying computational methods to PPI data. Behind protein interactions, there are protein domains that interact with each other. Therefore, based on domain-protein associations, the joint analysis of PPIs and domain-domain interactions (DDI) has the potential to obtain better performance in protein complex detection. As traditional computational methods are designed to detect protein complexes from a single PPI network, it is necessary to design a new algorithm that can effectively utilize the information inherent in multiple heterogeneous networks.

Results

In this paper, we introduce a novel multi-network clustering algorithm to detect protein complexes from multiple heterogeneous networks. Unlike existing protein complex identification algorithms that focus on the analysis of a single PPI network, our model can jointly exploit the information inherent in PPI and DDI data to achieve more reliable prediction results. Extensive experimental results on real-world data sets demonstrate that our method can predict protein complexes more accurately than other state-of-the-art protein complex identification algorithms.

Conclusions

In this work, we demonstrate that the joint analysis of PPI network and DDI network can help to improve the accuracy of protein complex detection.

4.

Background

Effectively predicting protein complexes not only helps to understand the structures and functions of proteins and their complexes, but also is useful for diagnosing disease and developing new drugs. Up to now, many methods have been developed to detect complexes by mining dense subgraphs from static protein-protein interaction (PPI) networks, while ignoring the value of other biological information and the dynamic properties of cellular systems.

Results

In this paper, based on our previous works CPredictor and CPredictor2.0, we present a new method, called CPredictor3.0, for predicting complexes from PPI networks with both gene expression data and protein functional annotations. This new method follows the viewpoint that proteins in the same complex should have roughly similar functions and be active at the same time and place in cellular systems. We first detect active proteins by using gene expression data of different time points and, separately, cluster proteins by using gene ontology (GO) functional annotations. Then, for each time point, we intersect pairs of sets, with one set corresponding to the active proteins generated from expression data and the other set corresponding to a protein cluster generated from functional annotations. Each resulting unique set indicates a cluster of proteins that have similar function(s) and are active at that time point. Following that, we map each cluster of active proteins of similar function onto a static PPI network and get a series of induced connected subgraphs. We treat these subgraphs as candidate complexes. Finally, by expanding and merging these candidate complexes, the predicted complexes are obtained. We evaluate CPredictor3.0 and compare it with a number of existing methods on several PPI networks and benchmark complex datasets. The experimental results show that CPredictor3.0 achieves the highest F1-measure, which indicates that CPredictor3.0 outperforms these existing methods overall.
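The intersection and mapping steps described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the CPredictor3.0 implementation: the function names, the `min_size` cut-off and the toy data are hypothetical, and the final expand-and-merge step is omitted.

```python
from collections import defaultdict


def connected_components(nodes, edges):
    """Connected components of the subgraph induced by `nodes`."""
    nodes = set(nodes)
    adj = defaultdict(set)
    for u, v in edges:
        if u in nodes and v in nodes:  # keep only edges inside the node set
            adj[u].add(v)
            adj[v].add(u)
    seen, components = set(), []
    for start in sorted(nodes):
        if start in seen:
            continue
        stack, comp = [start], set()
        while stack:  # depth-first traversal from `start`
            n = stack.pop()
            if n in comp:
                continue
            comp.add(n)
            stack.extend(adj[n] - comp)
        seen |= comp
        components.append(comp)
    return components


def candidate_complexes(active_by_time, functional_clusters, ppi_edges, min_size=2):
    """Intersect active sets with functional clusters, then map onto the PPI network."""
    candidates = set()  # a set removes duplicate candidates across time points
    for active in active_by_time.values():
        for cluster in functional_clusters:
            shared = set(active) & set(cluster)
            for comp in connected_components(shared, ppi_edges):
                if len(comp) >= min_size:  # min_size is an assumed cut-off
                    candidates.add(frozenset(comp))
    return candidates


# Hypothetical toy inputs.
active_by_time = {"t1": {"P1", "P2", "P3", "P5"}, "t2": {"P3", "P4", "P5"}}
functional_clusters = [{"P1", "P2", "P3"}, {"P4", "P5"}]
ppi_edges = [("P1", "P2"), ("P2", "P3"), ("P3", "P4"), ("P4", "P5")]
candidates = candidate_complexes(active_by_time, functional_clusters, ppi_edges)
```

Storing candidates as frozensets mirrors the "each resulting unique set" wording: the same protein group found at several time points is kept once.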

Conclusion

CPredictor3.0 can serve as a promising tool for protein complex prediction.

5.
Wang J  Xie D  Lin H  Yang Z  Zhang Y 《Proteome science》2012,10(Z1):S18

Background

Protein complexes are important in many biological processes, and various computational approaches have been developed to identify complexes from protein-protein interaction (PPI) networks. However, the high false-positive rate of PPIs makes this identification challenging.

Results

A protein semantic similarity measure is proposed in this study, based on the ontology structure of Gene Ontology (GO) terms and GO annotations, to estimate the reliability of interactions in PPI networks. Interaction pairs with low GO semantic similarity are removed from the network as unreliable interactions. Then, a cluster-expanding algorithm is used to detect complexes with a core-attachment structure on the filtered network. Our method is applied to three different yeast PPI networks. The effectiveness of our method is examined on two benchmark complex datasets. Experimental results show that our method performed better than other state-of-the-art approaches on most evaluation metrics.
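The filtering step can be illustrated with a small sketch. The abstract does not give the semantic similarity formula, so the code below substitutes a plain Jaccard index over annotation sets as a stand-in; the function names, the threshold value and the term labels are all assumptions.

```python
def jaccard(a, b):
    """Jaccard index of two annotation sets (a stand-in for GO semantic similarity)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def filter_interactions(edges, annotations, threshold=0.3):
    """Keep only interactions whose endpoints are sufficiently similar."""
    kept = []
    for u, v in edges:
        sim = jaccard(annotations.get(u, ()), annotations.get(v, ()))
        if sim >= threshold:  # the threshold value is an assumed parameter
            kept.append((u, v))
    return kept


# Hypothetical annotations; "term1" etc. are placeholders, not real GO identifiers.
annotations = {
    "A": {"term1", "term2"},
    "B": {"term1", "term2", "term3"},
    "C": {"term9"},
}
edges = [("A", "B"), ("B", "C")]
kept = filter_interactions(edges, annotations)
```

A measure that uses the ontology structure, as the study describes, would replace `jaccard` without changing the filtering loop.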

Conclusions

The method detects protein complexes from large-scale PPI networks by filtering interactions on GO semantic similarity. Removing interactions with low GO similarity significantly improves the performance of complex identification. The expanding strategy is also effective for identifying attachment proteins of complexes.

6.

Background

Studies of functional modules in a Protein-Protein Interaction (PPI) network contribute greatly to the understanding of biological mechanisms. With the development of computing science, computational approaches have played an important role in detecting functional modules.

Results

We present a new approach using multi-agent evolution for detection of functional modules in PPI networks. The proposed approach consists of two stages: the solution construction for agents in a population and the evolutionary process of computational agents in a lattice environment, where each agent corresponds to a candidate solution to the detection problem of functional modules in a PPI network. First, the approach utilizes a connection-based encoding scheme to model an agent, and employs a random-walk behavior that merges topological characteristics with functional information to construct a solution. Next, it applies several evolutionary operators, i.e., competition, crossover, and mutation, to realize information exchange among agents as well as solution evolution. Systematic experiments have been conducted on three benchmark testing sets of yeast networks. Experimental results show that the approach is more effective than several other existing algorithms.

Conclusions

The algorithm achieves outstanding recall, F-measure, sensitivity and accuracy while remaining competitive on other measures, so it can be applied to biological studies that require high accuracy.

7.
Yang P  Li X  Wu M  Kwoh CK  Ng SK 《PloS one》2011,6(7):e21502

Background

Phenotypically similar diseases have been found to be caused by functionally related genes, suggesting a modular organization of the genetic landscape of human diseases that mirrors the modularity observed in biological interaction networks. Protein complexes, as molecular machines that integrate multiple gene products to perform biological functions, express the underlying modular organization of protein-protein interaction networks. As such, protein complexes can be useful for interrogating the networks of phenome and interactome to elucidate gene-phenotype associations of diseases.

Methodology/Principal Findings

We proposed a technique called RWPCN (Random Walker on Protein Complex Network) for predicting and prioritizing disease genes. The basis of RWPCN is a protein complex network constructed using existing human protein complexes and protein interaction network. To prioritize candidate disease genes for the query disease phenotypes, we compute the associations between the protein complexes and the query phenotypes in their respective protein complex and phenotype networks. We tested RWPCN on predicting gene-phenotype associations using leave-one-out cross-validation; our method was observed to outperform existing approaches. We also applied RWPCN to predict novel disease genes for two representative diseases, namely, Breast Cancer and Diabetes.

Conclusions/Significance

Guilt-by-association prediction and prioritization of disease genes can be enhanced by fully exploiting the underlying modular organizations of both the disease phenome and the protein interactome. Our RWPCN uses a novel protein complex network as a basis for interrogating the human phenome-interactome network. As the protein complex network can capture the underlying modularity in the biological interaction networks better than simple protein interaction networks, RWPCN was found to be able to detect and prioritize disease genes better than traditional approaches that used only protein-phenotype associations.

8.

Background

The nature of dynamic traits with their phenotypic plasticity suggests that they are under the control of a dynamic genetic regulation. We employed a precision phenotyping platform to non-invasively assess biomass yield in a large mapping population of triticale at three developmental stages.

Results

Using multiple-line cross QTL mapping we identified QTL for each of these developmental stages which explained a considerable proportion of the genotypic variance. Some QTL were identified at each developmental stage and thus contribute to biomass yield throughout the studied developmental phases. Interestingly, we also observed QTL that were only identified for one or two of the developmental stages illustrating a temporal contribution of these QTL to the trait. In addition, epistatic QTL were detected and the epistatic interaction landscape was shown to dynamically change with developmental progression.

Conclusions

In summary, our results reveal the temporal dynamics of the genetic architecture underlying biomass accumulation in triticale and emphasize the need for a temporal assessment of dynamic traits.

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2164-15-458) contains supplementary material, which is available to authorized users.

9.

Background

Recently, large data sets of protein-protein interactions (PPI), which can be modeled as PPI networks, have been generated through high-throughput methods. Locally dense regions in PPI networks are very likely to be protein complexes. Since protein complexes play a key role in many biological processes, detecting protein complexes in PPI networks is one of the important tasks of the post-genomic era. However, PPI networks are often incomplete and noisy, which hinders the mining of protein complexes.

Results

We propose a new and effective algorithm based on robustness to detect overlapping clusters as protein complexes in PPI networks. To improve the accuracy of the resulting clusters, our algorithm tries to reduce the adverse effects of noise in PPI networks. Each new cluster begins from a seed and is expanded by adding qualified nodes from the cluster's neighbourhood. We also propose a new distance measure between a cluster K and a node among the neighbours of K. The performance of our algorithm is evaluated by applying it to two PPI networks, the Gavin network and the Database of Interacting Proteins (DIP). The results show that our algorithm is better than the Markov clustering algorithm (MCL), the Clique Percolation method (CPM) and the core-attachment based method (CoAch) in terms of F-measure, co-localization and Gene Ontology (GO) semantic similarity.
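Seed-based expansion of this kind can be sketched as a greedy loop. The abstract does not define the cluster-to-node distance, so the sketch below assumes a simple closeness score (the fraction of cluster members a candidate is linked to) and a hypothetical `min_fraction` cut-off; it illustrates the general scheme rather than the authors' algorithm.

```python
def expand_from_seed(seed, adj, min_fraction=0.5):
    """Greedily grow a cluster from `seed` by adding the closest neighbour."""
    cluster = {seed}
    while True:
        # Nodes adjacent to the cluster but not yet in it.
        neighbours = set().union(*(adj[n] for n in cluster)) - cluster
        best, best_score = None, 0.0
        for cand in sorted(neighbours):
            # Assumed closeness: share of cluster members linked to the candidate.
            score = len(adj[cand] & cluster) / len(cluster)
            if score > best_score:
                best, best_score = cand, score
        if best is None or best_score < min_fraction:
            return cluster
        cluster.add(best)


# Hypothetical toy network given as an adjacency mapping.
adj = {
    "A": {"B", "C"},
    "B": {"A", "C"},
    "C": {"A", "B", "D"},
    "D": {"C", "E"},
    "E": {"D"},
}
cluster_a = expand_from_seed("A", adj)
cluster_e = expand_from_seed("E", adj)
```

Because each seed is grown independently, the two toy clusters share node C, which is how overlapping clusters can arise in this kind of scheme.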

Conclusions

Our algorithm detects locally dense regions or clusters as protein complexes. The results show that protein complexes generated by our algorithm have better quality than those generated by some previous classic methods. Therefore, our algorithm is effective and useful.

10.

Background

Human T-cell leukemia viruses (HTLV) tend to induce some fatal human diseases like Adult T-cell Leukemia (ATL) by targeting human T lymphocytes. Identifying the protein-protein interactions (PPI) between HTLV viruses and Homo sapiens is an important approach to revealing the underlying mechanism of HTLV infection and host defence. At present, as biological experiments are labor-intensive and expensive, the identified part of the HTLV-human PPI networks is rather small. Although recent years have witnessed much progress in computational modeling for reconstructing pathogen-host PPI networks, data scarcity and data unavailability are two major challenges that remain to be effectively addressed. To our knowledge, no computational method for reconstructing proteome-wide HTLV-human PPI networks has been reported.

Results

In this work we develop a Multi-instance AdaBoost method to conduct homolog knowledge transfer for computationally reconstructing proteome-wide HTLV-human PPI networks. In this method, the homolog knowledge in the form of gene ontology (GO) is treated as auxiliary homolog instances to address the problems of data scarcity and data unavailability, while the potential negative knowledge transfer is automatically attenuated by AdaBoost instance reweighting. The cross-validation experiments show that the homolog knowledge transfer in the form of independent homolog instances can effectively enrich the feature information and substitute for the missing GO information. Moreover, the independent tests show that the method can validate 70.3% of the recently curated interactions, significantly exceeding the 2.1% recognition rate of the HT-Y2H experiment. We have used the method to reconstruct the proteome-wide HTLV-human PPI networks and conducted gene ontology based clustering of the predicted networks for further biomedical research. The gene ontology based clustering analysis of the predictions provides much biological insight into the pathogenesis of HTLV retroviruses.

Conclusions

The Multi-instance AdaBoost method can effectively address the problems of data scarcity and data unavailability in the reconstruction of proteome-wide HTLV-human PPI networks. The gene ontology based clustering analysis of the predictions reveals some important signaling pathways and biological modules that HTLV retroviruses are likely to target.

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2105-15-245) contains supplementary material, which is available to authorized users.

11.

Background

Studying protein complexes is very important for understanding biological processes, since it helps reveal the structure-functionality relationships in biological networks, and much attention has been paid to accurately predicting protein complexes from the increasing amount of protein-protein interaction (PPI) data. Most of the available algorithms are based on the assumption that dense subgraphs correspond to complexes, failing to take into account the inherent organization within protein complexes and the roles of edges. Thus, there is a critical need to investigate the possibility of discovering protein complexes using the topological information hidden in edges.

Results

To provide an investigation of the roles of edges in PPI networks, we show that the edges connecting less similar vertices in topology are more significant in maintaining the global connectivity, indicating the weak ties phenomenon in PPI networks. We further demonstrate that there is a negative relation between the weak tie strength and the topological similarity. By using the bridges, a reliable virtual network is constructed, in which each maximal clique corresponds to the core of a complex. By this notion, the detection of the protein complexes is transformed into a classic all-clique problem. A novel core-attachment based method is developed, which detects the cores and attachments, respectively. A comprehensive comparison among the existing algorithms and our algorithm has been made by comparing the predicted complexes against benchmark complexes.
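Once cores are identified with maximal cliques, enumerating them is a standard task. The sketch below uses the basic Bron-Kerbosch recursion (without pivoting) on a hypothetical toy graph; it does not reproduce the bridge-based construction of the virtual network or the attachment step, which the abstract does not spell out.

```python
def maximal_cliques(adj):
    """Enumerate all maximal cliques with the basic Bron-Kerbosch recursion."""
    cliques = []

    def expand(r, p, x):
        # r: current clique, p: candidates to extend it, x: already processed nodes.
        if not p and not x:
            cliques.append(frozenset(r))
            return
        for v in sorted(p):
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(adj), set())
    return cliques


# Hypothetical toy graph given as an adjacency mapping.
adj = {
    "A": {"B", "C"},
    "B": {"A", "C"},
    "C": {"A", "B", "D"},
    "D": {"C"},
}
cores = maximal_cliques(adj)
```

On dense networks the number of maximal cliques can grow quickly, so a pivoting variant would usually be preferred in practice.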

Conclusions

We proved that the weak tie effect exists in the PPI network and demonstrated that density is insufficient to characterize the topological structure of protein complexes. Furthermore, the experimental results on the yeast PPI network show that the proposed method outperforms the state-of-the-art algorithms. The analysis of the modules detected by the present algorithm suggests that most of these modules are biologically significant in the context of complexes, suggesting that the roles of edges are critical in discovering protein complexes.

12.

Background

Detecting protein complexes in protein-protein interaction (PPI) networks plays an important role in improving our understanding of the dynamics of cellular organisation. However, protein interaction data generated by high-throughput experiments such as yeast-two-hybrid (Y2H) and tandem affinity-purification/mass-spectrometry (TAP-MS) are characterised by the presence of a significant number of false positives and false negatives. In recent years there has been a growing trend to incorporate diverse domain knowledge to support large-scale analysis of PPI networks.

Methods

This paper presents a new algorithm that incorporates Gene Ontology (GO) based semantic similarities to detect protein complexes from PPI networks generated by TAP-MS. By taking co-complex relations in TAP-MS data into account, TAP-MS PPI networks are modelled as bipartite graphs, where bait proteins constitute one set of nodes and prey proteins the other. Similarities between pairs of bait proteins are computed by considering both the topological features and GO-driven semantic similarities. Bait proteins are then grouped into clusters based on their pair-wise similarities to produce a set of 'seed' clusters. An expansion process is applied to each 'seed' cluster to recruit prey proteins which are significantly associated with the same set of bait proteins. In this way, completely identified protein complexes are obtained.

Results

The proposed algorithm has been applied to real TAP-MS PPI networks. Fifteen quality measures have been employed to evaluate the quality of the generated protein complexes. Experimental results show that the proposed algorithm has greatly improved the accuracy of identifying complexes and outperformed several state-of-the-art clustering algorithms. Moreover, by incorporating semantic similarity, the proposed algorithm is more robust to noise in the networks.

13.

Background

Network inference deals with the reconstruction of molecular networks from experimental data. Given N molecular species, the challenge is to find the underlying network. Due to data limitations, this typically is an ill-posed problem, and requires the integration of prior biological knowledge or strong regularization. We here focus on the situation when time-resolved measurements of a system’s response after systematic perturbations are available.

Results

We present a novel method to infer signaling networks from time-course perturbation data. We utilize dynamic Bayesian networks with probabilistic Boolean threshold functions to describe protein activation. The model posterior distribution is analyzed using evolutionary MCMC sampling and subsequent clustering, resulting in probability distributions over alternative networks. We evaluate our method on simulated data, and study its performance with respect to data set size and levels of noise. We then use our method to study EGF-mediated signaling in the ERBB pathway.

Conclusions

Dynamic Probabilistic Threshold Networks is a new method to infer signaling networks from time-series perturbation data. It exploits the dynamic response of a system after external perturbation for network reconstruction. On simulated data, we show that the approach outperforms current state of the art methods. On the ERBB data, our approach recovers a significant fraction of the known interactions, and predicts novel mechanisms in the ERBB pathway.

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2105-15-250) contains supplementary material, which is available to authorized users.

14.

Background

One aspect in which RNA sequencing is more valuable than microarray-based methods is the ability to examine the allelic imbalance of the expression of a gene. This process is often a complex task that entails quality control, alignment, and the counting of reads over heterozygous single-nucleotide polymorphisms. Allelic imbalance analysis is subject to technical biases, due to differences in the sequences of the measured alleles. Flexible bioinformatics tools are needed to ease the workflow while retaining as much RNA sequencing information as possible throughout the analysis to detect and address the possible biases.

Results

We present AllelicImbalance, a software program that is designed to detect, manage, and visualize allelic imbalances comprehensively. The purpose of this software is to allow users to pose genetic questions in any RNA sequencing experiment quickly, enhancing the general utility of RNA sequencing. The visualization features can reveal notable, non-trivial allelic imbalance behavior over specific regions, such as exons.

Conclusions

The software provides a complete framework to perform allelic imbalance analyses of aligned RNA sequencing data, from detection to visualization, within the robust and versatile management class, ASEset.

Electronic supplementary material

The online version of this article (doi:10.1186/s12859-015-0620-2) contains supplementary material, which is available to authorized users.

15.

Background

A goal of systems biology is to analyze large-scale molecular networks including gene expressions and protein-protein interactions, revealing the relationships between network structures and their biological functions. Dividing a protein-protein interaction (PPI) network into naturally grouped parts is an essential way to investigate the relationship between topology of networks and their functions. However, clear modular decomposition is often hard due to the heterogeneous or scale-free properties of PPI networks.

Methodology/Principal Findings

To address this problem, we propose a diffusion model-based spectral clustering algorithm, which analytically solves the cluster structure of PPI networks as a problem of random walks in a diffusion process on the networks. To cope with the heterogeneity of the networks, a power factor is introduced to adjust the diffusion matrix by weighting the transition (adjacency) matrix according to a node degree matrix. This algorithm is named adjustable diffusion matrix-based spectral clustering (ADMSC). To demonstrate the feasibility of ADMSC, we apply it to the decomposition of a yeast PPI network, identifying biologically significant clusters of approximately equal size. Compared with other established algorithms, ADMSC facilitates clear and fast decomposition of PPI networks.

Conclusions/Significance

ADMSC is proposed by introducing the power factor that adjusts the diffusion matrix to the heterogeneity of the PPI networks. ADMSC effectively partitions PPI networks into biologically significant clusters with almost equal sizes, while being very fast, robust and appealingly simple.

16.

Background

Box C/D snoRNPs, which are typically composed of box C/D snoRNA and the four core protein components Nop1, Nop56, Nop58, and Snu13, play an essential role in the modification and processing of pre-ribosomal RNA. The highly conserved R2TP complex, comprising the proteins Rvb1, Rvb2, Tah1, and Pih1, has been shown to be required for box C/D snoRNP biogenesis and assembly; however, the molecular basis of R2TP chaperone-like activity is not yet known.

Results

Here, we describe an unexpected finding in which the activity of the R2TP complex is required for Nop58 protein stability and is controlled by the dynamic subcellular redistribution of the complex in response to growth conditions and nutrient availability. In growing cells, the complex localizes to the nucleus and interacts with box C/D snoRNPs. This interaction is significantly reduced in poorly growing cells as R2TP predominantly relocalizes to the cytoplasm. The R2TP-snoRNP interaction is mainly mediated by Pih1.

Conclusions

The R2TP complex exerts a novel regulation on box C/D snoRNP biogenesis that affects their assembly and consequently pre-rRNA maturation in response to different growth conditions.

Electronic supplementary material

The online version of this article (doi:10.1186/s13059-014-0404-4) contains supplementary material, which is available to authorized users.

17.

Background

Alzheimer’s disease (AD) is one of the leading genetically complex and heterogeneous disorders and is influenced by both genetic and environmental factors. The underlying risk factors remain largely unclear for this heterogeneous disorder. In recent years, high-throughput methodologies, such as genome-wide linkage analysis (GWL), genome-wide association (GWA) studies, and genome-wide expression profiling (GWE), have led to the identification of several candidate genes associated with AD. However, due to the lack of consistency among their findings, an integrative approach is warranted. Here, we have designed a rank-based gene prioritization approach involving convergent analysis of multi-dimensional data and protein-protein interaction (PPI) network modelling.

Results

Our approach integrates three different AD datasets (GWL, GWA and GWE) to identify overlapping candidate genes, which are ranked using a novel cumulative rank score (SR) based method and then prioritized using clusters derived from the PPI network. The SR for each gene is calculated by adding the ranks assigned to that gene, based on either p value or score, in the three datasets. This analysis yielded 108 plausible AD genes. Network modelling by creating a PPI network from the proteins encoded by these genes and their direct interactors resulted in a layered network of 640 proteins. Clustering of these proteins further helped us identify 6 significant clusters, with 7 proteins (EGFR, ACTB, CDC2, IRAK1, APOE, ABCA1 and AMPH) forming the central hub nodes. Functional annotation of the 108 genes revealed their roles in several biological activities, such as neurogenesis, regulation of MAP kinase activity, response to calcium ion, and endocytosis, paralleling AD-specific attributes. Finally, 3 potential biochemical biomarkers were found from the overlap of the 108 AD proteins with proteins from the CSF and plasma proteomes. EGFR and ACTB were found to be the two most significant AD risk genes.
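The cumulative rank score can be illustrated in a few lines. The sketch below assumes that each dataset is ranked on its own (ascending p value, or descending score) before summing ranks over the genes shared by all datasets, and it ignores ties; the function names and all values are hypothetical rather than taken from the study.

```python
def rank_by(values, reverse=False):
    """Assign rank 1, 2, ... to genes ordered by value (ties are not handled)."""
    ordered = sorted(values, key=values.get, reverse=reverse)
    return {gene: i + 1 for i, gene in enumerate(ordered)}


def cumulative_rank_score(*rankings):
    """Sum per-dataset ranks for the genes present in every dataset."""
    shared = set.intersection(*(set(r) for r in rankings))
    return {gene: sum(r[gene] for r in rankings) for gene in shared}


# Hypothetical toy inputs: p values for GWL and GWA, expression scores for GWE.
gwl = rank_by({"G1": 0.01, "G2": 0.03, "G3": 0.20})
gwa = rank_by({"G1": 0.05, "G2": 0.001, "G4": 0.30})
gwe = rank_by({"G1": 2.5, "G2": 1.0, "G3": 3.0}, reverse=True)
sr = cumulative_rank_score(gwl, gwa, gwe)
```

Only G1 and G2 appear in all three toy datasets, so only they receive a score.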

Conclusions

On the assumption that common genetic signals obtained from different methodological platforms might serve as more robust AD risk markers than candidates identified using a single-dimension approach, we demonstrated an integrated genomic convergence approach for disease candidate gene prioritization from heterogeneous data sources linked to AD.

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2164-15-199) contains supplementary material, which is available to authorized users.

18.

Background

The physical interactions between proteins constitute the basis of protein quaternary structures. They dominate many biological processes in living cells. Deciphering the structural features of interacting proteins is essential to understand their cellular functions. Similar to the space of protein tertiary structures in which discrete patterns are clearly observed on fold or sub-fold motif levels, it has been found that the space of protein quaternary structures is highly degenerate due to the packing of compact secondary structure elements at interfaces. Therefore, it is necessary to further decompose the protein quaternary structural space into a more local representation.

Results

Here we constructed an interface fragment pair library from the current structure database of protein complexes. After structure-based clustering, we found that more than 90% of these interface fragment pairs can be represented by a limited number of highly abundant motifs. These motifs were further used to guide complex assembly. A large-scale benchmark test shows that native-like binding is highly likely to be found in the structural ensemble of modeled protein complexes that were built through the library.

Conclusions

Our study therefore presents supporting evidence that the space of protein quaternary structures can be represented by the combination of a small set of secondary-structure-based packing motifs at binding interfaces. Finally, after future improvements such as adding sequence profiles, we expect this new library will be useful for predicting structures of unknown protein-protein interactions.

Electronic supplementary material

The online version of this article (doi:10.1186/s12859-014-0437-4) contains supplementary material, which is available to authorized users.

19.

Background

Protein-protein interactions (PPIs) play fundamental roles in nearly all biological processes. The systematic analysis of PPI networks can enable a great understanding of cellular organization, processes and function. In this paper, we investigate the problem of protein complex detection from noisy protein interaction data, i.e., finding the subsets of proteins that are closely coupled via protein interactions. However, protein complexes are likely to overlap and the interaction data are very noisy. It is a great challenge to effectively analyze the massive data for biologically meaningful protein complex detection.

Results

Many researchers approach this problem with traditional unsupervised graph clustering methods. Here, we take a different point of view, redefining the properties and features of protein complexes and designing a “semi-supervised” method to analyze the problem. In this paper, we use a neural network with a “semi-supervised” mechanism to detect protein complexes. By retraining the neural network model recursively, we find optimized parameters for the model, which allows us to detect protein complexes successfully. The comparison results show that our algorithm can identify protein complexes that are missed by other methods. We also show that our method achieves better precision and recall rates for the identified protein complexes than other existing methods. In addition, the proposed framework is easy to extend in the future.

Conclusions

Using a weighted network to represent the protein interaction network is more appropriate than using a traditional unweighted network. In addition, integrating biological features and topological features to represent protein complexes is more meaningful than using dense subgraphs. Last, the “semi-supervised” learning model is a promising model to detect protein complexes with more biological and topological features available.

20.

Background

Protein interaction networks (PINs) are known to be useful for detecting protein complexes. However, most available PINs are static and cannot reflect the dynamic changes in real networks. Some researchers have tried to construct dynamic networks by combining time-course (dynamic) gene expression data with PINs. However, gene expression arrays inevitably contain background noise, which could degrade the quality of dynamic networks. Therefore, contaminated gene expression data need to be filtered out before further data integration and analysis.

Results

First, we adopt a dynamic model-based method to filter noisy data from dynamic expression profiles. Then a new method is proposed for identifying active proteins from dynamic gene expression profiles. A protein is defined as active at a time point if the expression level of its corresponding gene at that time point is higher than a threshold, which is determined by a threshold function involving the standard variance. Furthermore, a noise-filtered active protein interaction network (NF-APIN) is constructed. To demonstrate the efficiency of our method, we detect protein complexes from the NF-APIN and compare them with those from other dynamic PINs.
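The notion of an active protein can be sketched as follows. The abstract does not give the threshold function, so this example assumes a common form, the mean plus k standard deviations of a gene's profile; the function names, the parameter `k` and the toy profiles are hypothetical, and the noise-filtering model itself is not shown.

```python
from statistics import mean, pstdev


def active_time_points(profile, k=1.0):
    """Time indices where expression exceeds mean + k * standard deviation."""
    # Assumed threshold form; the source does not specify the actual function.
    threshold = mean(profile) + k * pstdev(profile)
    return [t for t, level in enumerate(profile) if level > threshold]


def active_network(edges, profiles, t, k=1.0):
    """Interactions whose two endpoints are both active at time index t."""
    active = {p for p, prof in profiles.items() if t in active_time_points(prof, k)}
    return [(u, v) for u, v in edges if u in active and v in active]


# Hypothetical expression profiles over four time points.
profiles = {
    "A": [1.0, 1.0, 1.0, 5.0],
    "B": [0.0, 0.0, 0.0, 4.0],
    "C": [5.0, 1.0, 1.0, 1.0],
}
edges = [("A", "B"), ("B", "C")]
active_a = active_time_points(profiles["A"])
network_t3 = active_network(edges, profiles, t=3)
```

Building one such network per time point yields a series of snapshots on which a complex detection method could then be run.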

Conclusion

A dynamic model-based method can effectively filter out noise in dynamic gene expression data. Our method for computing a threshold to determine the active time points of noise-filtered genes can make the dynamic construction more accurate and provide a high-quality framework for network analysis, such as protein complex prediction.
