Similar literature
1.
Detecting protein complexes from protein interaction networks is a major task in the postgenome era. Previously developed computational algorithms for identifying complexes focus mainly on graph partitioning or dense-region finding. Most of these traditional algorithms cannot discover the overlapping complexes that exist in protein-protein interaction (PPI) networks. Although some density-based methods have been developed to identify overlapping complexes, they cannot discover complexes that include peripheral proteins. In this study, motivated by the recent successful application of generative network models to describing the generation process of PPI networks and to detecting communities in social networks, we develop a regularized sparse generative network model (RSGNM) for protein complex identification. RSGNM extends an existing generative network model by adding a process that generates propensities from an exponential distribution and by incorporating a Laplacian regularizer. Because the propensities are assumed to follow an exponential distribution, their estimators are sparse, which not only has a good biological interpretation but also helps to control the overlap rate among detected complexes. The Laplacian regularizer makes the estimators of the propensities smoother over the interaction network. Experimental results on three yeast PPI networks show that RSGNM outperforms six competing algorithms in terms of the quality of detected complexes. In addition, RSGNM can detect overlapping complexes and complexes that include peripheral proteins simultaneously. These results give new insight into the importance of generative network models in protein complex identification.

2.
Zhao Chengshuai, Qiu Yang, Zhou Shuang, Liu Shichao, Zhang Wen, Niu Yanqing. BMC Genomics, 2020, 21(13): 1-12
Background

Researchers have discovered that lncRNA-miRNA regulatory paradigms modulate gene expression patterns and drive major cellular processes. Identifying lncRNA-miRNA interactions (LMIs) is critical to revealing the mechanisms of biological processes and complex diseases. Because conventional wet-lab experiments are time-consuming, labor-intensive and costly, a few computational methods have been proposed to expedite the identification of lncRNA-miRNA interactions. However, little attention has been paid to fully exploiting the structural and topological information of the lncRNA-miRNA interaction network.

Results

In this paper, we propose novel lncRNA-miRNA interaction prediction methods that use graph embedding and ensemble learning. First, we calculate lncRNA-lncRNA sequence similarity and miRNA-miRNA sequence similarity, and combine them with the known lncRNA-miRNA interactions to construct a heterogeneous network. Second, we adopt several graph embedding methods to learn embedded representations of lncRNAs and miRNAs from the heterogeneous network, and construct ensemble models using two ensemble strategies. In the first strategy, we treat the individual graph-embedding-based models as base predictors and integrate their predictions, yielding a method named GEEL-PI. In the second, we construct a deep attention neural network (DANN) to integrate the various graph embeddings, yielding an ensemble method named GEEL-FI. The experimental results demonstrate that both GEEL-PI and GEEL-FI outperform other state-of-the-art methods. The effectiveness of the two ensemble strategies is validated by further experiments. Moreover, case studies show that GEEL-PI and GEEL-FI can find novel lncRNA-miRNA associations.

Conclusion

The study shows that methods based on graph embedding and ensemble learning are effective at integrating heterogeneous information derived from the lncRNA-miRNA interaction network and achieve better performance on the LMI prediction task. In conclusion, GEEL-PI and GEEL-FI are promising for lncRNA-miRNA interaction prediction.
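
As an illustration of the prediction-integration idea behind GEEL-PI, the sketch below combines the interaction scores of several base predictors. The abstract does not state how the predictions are integrated, so the unweighted mean, the function name `average_predictions`, and the layout of the score lists are all assumptions made for this example.

```python
def average_predictions(score_lists):
    """Average per-pair interaction scores across base predictors.

    score_lists: one list of scores per base predictor, all aligned to the
    same ordered list of candidate lncRNA-miRNA pairs (hypothetical layout).
    ASSUMPTION: an unweighted mean; the source does not specify the rule.
    """
    if not score_lists:
        raise ValueError("at least one base predictor is required")
    n_pairs = len(score_lists[0])
    if any(len(scores) != n_pairs for scores in score_lists):
        raise ValueError("all predictors must score the same pairs")
    n_models = len(score_lists)
    # zip(*...) groups the scores that different predictors gave to one pair.
    return [sum(pair_scores) / n_models for pair_scores in zip(*score_lists)]
```

A weighted mean or a learned combiner would fit the same interface; the mean is only the simplest choice.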


3.
Kinetochores maintain a mechanical grip on disassembling microtubule plus ends, possibly through a 16-member Dam1 ring that acts as a sliding clamp. It turns out, however, that a ring is not required for maintaining grip: individual Dam1 complexes in vitro can diffuse on the microtubule lattice and track shortening microtubule tips.

4.

Background

Identifying protein complexes is crucial to understanding the principles of cellular organization and functional mechanisms. Because much evidence indicates that subgraphs with high density or high modularity in a PPI network usually correspond to protein complexes, PPI-network-based methods for detecting protein complexes have focused on either the density or the modularity of subgraphs. However, dense subgraphs may have low modularity, and subgraphs with high modularity may have low density, so protein complexes may appear as subgraphs with low modularity or low density in the PPI network. Density-based methods have difficulty mining protein complexes with low density, and modularity-based methods have difficulty mining protein complexes with low modularity, so both kinds of method are limited in identifying protein complexes with varying density and modularity.

Results

To identify protein complexes with varying density and modularity, including those that have low density but high modularity and those that have low modularity but high density, we define a novel subgraph fitness, f_ρ, as f_ρ = (density)^ρ × (modularity)^(1−ρ), and propose a novel algorithm, named LF-PIN, that identifies protein complexes by expanding seed edges into subgraphs with locally maximal fitness. Experimental results for LF-PIN in S. cerevisiae show that, compared with setting the fitness equal to density (ρ = 1) or to modularity (ρ = 0), LF-PIN identifies known protein complexes more effectively when the fitness is determined by both density and modularity (0 < ρ < 1). Compared with seven competing protein complex detection methods (CMC, Core-Attachment, CPM, DPClus, HC-PIN, MCL, and NFC) in S. cerevisiae and E. coli, LF-PIN outperforms all seven in terms of matching known complexes and functional enrichment. Moreover, LF-PIN performs better in identifying protein complexes with low density or low modularity.

Conclusions

By considering both density and modularity, LF-PIN outperforms other protein complex detection methods that consider only density or only modularity, especially in identifying known protein complexes with low density or low modularity.
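
The fitness defined in the Results can be sketched in a few lines. The abstract gives only the form f_ρ = (density)^ρ × (modularity)^(1−ρ); the definitions of density and modularity used below, the function name `subgraph_fitness`, and the edge-list input are assumptions for illustration and may differ from those used in LF-PIN.

```python
def subgraph_fitness(edges, nodes, rho):
    """Return density**rho * modularity**(1 - rho) for the induced subgraph.

    edges: iterable of (u, v) pairs of an undirected, unweighted PPI network.
    nodes: the proteins that make up the candidate subgraph.
    rho:   weight in [0, 1]; rho = 1 gives density, rho = 0 gives modularity.
    """
    nodes = set(nodes)
    n = len(nodes)
    if n < 2:
        return 0.0
    edges = list(edges)
    # Edges with both endpoints inside the subgraph.
    internal = sum(1 for u, v in edges if u in nodes and v in nodes)
    # Edges with exactly one endpoint inside the subgraph.
    boundary = sum(1 for u, v in edges if (u in nodes) != (v in nodes))
    # ASSUMPTION: density = internal edges / maximum possible internal edges.
    density = 2 * internal / (n * (n - 1))
    # ASSUMPTION: modularity = share of incident edges that stay inside.
    total = internal + boundary
    modularity = internal / total if total else 0.0
    return density ** rho * modularity ** (1 - rho)
```

For a triangle a-b-c with one extra edge c-d, the subgraph {a, b, c} has density 1 and, under the assumed definition, modularity 3/4, so its fitness moves from 0.75 at ρ = 0 to 1 at ρ = 1.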

5.
MOTIVATION: Data on protein-protein interactions (PPIs) are increasing exponentially. To date, large-scale protein interaction networks are available for human and most model species. The arising challenge is to organize these networks into models of cellular machinery. As in other biological domains, a comparative approach provides a powerful basis for addressing this challenge. RESULTS: We develop a probabilistic model for protein complexes that are conserved across two species. The model describes the evolution of conserved protein complexes from an ancestral species by protein interaction attachment and detachment and gene duplication events. We apply our model to search for conserved protein complexes within the PPI networks of yeast and fly, which are the largest networks in public databases. We detect 150 conserved complexes that match well-known complexes in yeast and are coherent in their functional annotations both in yeast and in fly. In comparison with two previous approaches, our model yields higher specificity and sensitivity levels in protein complex detection. AVAILABILITY: The program is available upon request.

6.
The type of myocardial ischemia can be revealed by the ST segment of the electrocardiogram (ECG). Effective measurement and analysis of the ST segment, together with calculation of its displacement and shape change, can help doctors diagnose coronary heart disease and myocardial ischemia, especially asymptomatic myocardial ischemia. Measuring and classifying the ECG ST segment is therefore a very important subject in clinical practice. In this paper, we introduce a computerized method for automatically identifying the shape of the ECG ST segment using a radial basis function neural network based on an adaptive fuzzy system, which performs better than other methods. It helps to analyze the cause of the ST segment change and to locate the myocardial ischemia, and is useful for clinical diagnosis. Translated from Acta Biophysica Sinica, 2005, 21(6): 443–448

7.
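Convolutional neural networks can automate species identification from images of the anatomical features of tree-ring samples. In this study, we built an image set of tree-ring anatomical features and used four convolutional neural network models (LeNet, AlexNet, GoogLeNet, and VGGNet) to achieve automated, accurate identification of tree species from tree-ring cross-sections. We then determined the identification accuracy of each model, clarified which species were confused with one another during automatic identification, and examined the differences among the results of the models. The results showed that the convolutional neural network models trained for species identification in this study were reasonably reliable. Among the four models, GoogLeNet had the highest identification accuracy (96.7%) and LeNet the lowest (66.4%). The models were consistent in their results for the selected species: accuracy was highest for Mongolian oak (Quercus mongolica), for which AlexNet reached 100%, and lowest for Khingan fir (Abies nephrolepis). Species with similar structures were also confused with one another. Identification accuracy at the family and genus levels was higher than at the species level. Broad-leaved species were easy to distinguish because of their marked structural differences, and accuracy was higher for broad-leaved than for coniferous species. Overall, the convolutional neural networks captured deep information in tree-ring features, achieved accurate species identification, and provide a fast and convenient method for automated preliminary screening and identification of tree species.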

9.
Despite the increasing number of published protein structures, and the fact that each protein's function relies on its three-dimensional structure, there is limited access to automatic programs for identifying critical residues from protein structure, compared with those based on protein sequence. Here we present a new algorithm based on network analysis applied exclusively to protein structures to identify critical residues. Our results show that this method identifies residues critical for protein function with high reliability and improves on automatic sequence-based approaches and previous network-based approaches. The reliability of the method depends on the conformational diversity screened for the protein of interest. We have designed a web site to give access to this software at http://bis.ifc.unam.mx/jamming/. In summary, a new method is presented that relates residues critical for protein function to the most traversed residues in networks derived from protein structures. A unique feature of the method is the inclusion of the conformational diversity of proteins in the prediction, thus reproducing a basic feature of the structure/function relationship of proteins.

12.
The greatest challenge to structural biologists in the post-genomic era is to decipher both stable and transient interactions of protein complexes. In response to this challenge, significant advances in mass spectrometry have been made in the past two years. With the inception of novel approaches targeted at defining interaction partners, stoichiometry and topology, as well as monitoring real-time reactions, mass spectrometry is well placed to contribute to the understanding of dynamic macromolecular complexes.

13.
DNA microarrays have been used extensively to identify cell cycle regulated genes in yeast; however, the overlap in the genes identified is surprisingly small. We show that certain protein features can be used to distinguish cell cycle regulated genes from other genes with high confidence (features include protein phosphorylation, glycosylation, subcellular location and instability/degradation). We demonstrate that co-expressed, periodic genes encode proteins which share combinations of features, and provide an overview of the proteome dynamics during the cycle. A large set of novel putative cell cycle regulated proteins was identified, many of which have no known function.

14.
Plant protein-protein interaction networks have not been identified by large-scale experiments. To better understand the protein interactions in rice, the Predicted Rice Interactome Network (PRIN; http://bis.zju.edu.cn/prin/) presents 76,585 predicted interactions involving 5,049 rice proteins. After mapping genomic features of rice (GO annotation, subcellular localization prediction, and gene expression), we found that a well-annotated and biologically significant network is rich enough to capture many significant functional linkages within higher-order biological systems, such as pathways and biological processes. Furthermore, we took MADS-box domain-containing proteins and circadian rhythm signaling pathways as examples to demonstrate that functional protein complexes and biological pathways could be effectively expanded in our predicted network. The expanded molecular network in PRIN has considerably improved the capability of these analyses to integrate existing knowledge and provide novel insights into the function and coordination of genes and gene networks.

15.
A simple method for the definition of protein structural domains is described that requires only alpha-carbon coordinate data. The basic method, which encodes no specific aspects of protein structure, captures the essence of most domains but does not give high enough priority to the integrity of beta-sheet structure. This aspect was encouraged both by a bias toward attaining intact beta-sheets and also as an acceptance condition on the final result. The method has only one variable parameter, reflecting the granularity level of the domains, and an attempt was made to set this level automatically for each protein based on the best agreement attained between the domains predicted on the native structure and a set of smoothed coordinates. While not perfect, this feature allowed some tightly packed domains to be separated that would have remained undivided had the best fixed granularity level been used. The quality of the results was high and, when compared with a large collection of accepted domain definitions, only a few could be said to be clearly incorrect. The simplicity of the method allowed its easy extension to the simultaneous definition of domains across related structures in a way that does not involve loss of detail through averaging the structures. This was found to be a useful approach to reconciling differences among structural family members. The method is fast, taking less than 1 s per 100 residues for medium-sized proteins.

16.
In the last two years, because of advances in protein separation and mass spectrometry, top-down mass spectrometry moved from analyzing single proteins to analyzing complex samples and identifying hundreds and even thousands of proteins. However, computational tools for database search of top-down spectra against protein databases are still in their infancy. We describe MS-Align+, a fast algorithm for top-down protein identification based on spectral alignment that enables searches for unexpected post-translational modifications. We also propose a method for evaluating statistical significance of top-down protein identifications and further benchmark various software tools on two top-down data sets from Saccharomyces cerevisiae and Salmonella typhimurium. We demonstrate that MS-Align+ significantly increases the number of identified spectra as compared with MASCOT and OMSSA on both data sets. Although MS-Align+ and ProSightPC have similar performance on the Salmonella typhimurium data set, MS-Align+ outperforms ProSightPC on the (more complex) Saccharomyces cerevisiae data set.

17.
Synfire chains, sequences of pools linked by feedforward connections, support the propagation of precisely timed spike sequences, or synfire waves. An important question remains: how can synfire chains be embedded efficiently in cortical architecture? We present a model of synfire chain embedding in a cortical-scale recurrent network using conductance-based synapses, balanced chains, and variable transmission delays. The network attains substantially higher embedding capacities than previous spiking neuron models and allows all its connections to be used for embedding. The number of waves in the model is regulated by recurrent background noise. We computationally explore the embedding capacity limit, and use a mean field analysis to describe the equilibrium state. Simulations confirm the mean field analysis over broad ranges of pool sizes and connectivity levels; the number of pools embedded in the system trades off against the firing rate and the number of waves. An optimal inhibition level balances the conflicting requirements of stable synfire propagation and limited response to background noise. A simplified analysis shows that the present conductance-based synapses achieve higher contrast between the responses to synfire input and background noise compared to current-based synapses, while regulation of wave numbers is traced to the use of variable transmission delays.

18.
The subcellular location of a protein is closely related to its function. Identifying the location of a given protein is an essential step in investigating related problems. Traditional experimental methods can determine location reliably, but their limitations, such as high cost and low efficiency, are evident. Computational methods provide an alternative means of addressing these problems. Most previous methods extract features from protein sequences or structures to build prediction models. In this study, we use two types of features and combine them to construct the model. The first feature type is extracted from a protein-protein interaction network and captures the relationship between the encoded protein and other proteins. The second type is obtained from gene ontology and biological pathways and indicates the known functions of the encoded protein. These features are analyzed using several feature selection methods. The final optimal features are used to build the model, with a recurrent neural network as the classification algorithm. This model performs well, with a Matthews correlation coefficient of 0.844. A decision tree is used as a rule-learning classifier to extract decision rules. Although the performance of the decision rules is poor, they are valuable in revealing the molecular mechanisms of proteins with different subcellular locations. The final analysis confirms the reliability of the extracted rules. The source code of the proposed method is freely available at https://github.com/xypan1232/rnnloc
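
For readers unfamiliar with the Matthews correlation coefficient reported above, the sketch below computes it for the two-class case from confusion-matrix counts. This is the standard binary formula, shown only as an illustration; the study involves several subcellular locations, and the abstract does not say which multi-class form was used. The function name `binary_mcc` is hypothetical.

```python
import math


def binary_mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from binary confusion-matrix counts.

    Ranges from -1 (total disagreement) through 0 (no better than chance)
    to +1 (perfect prediction).
    """
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Convention: return 0.0 when a row or column of the matrix is empty.
    return numerator / denominator if denominator else 0.0
```

Because it uses all four cells of the confusion matrix, the coefficient stays informative when classes are imbalanced, which is one common reason for preferring it to plain accuracy.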

19.
We review the recently discovered phenomenon of protein splicing, which is the excision of an internal protein sequence at the protein level rather than at the RNA level. The means by which examples of protein splicing have been identified are described, and the similarities of the internally spliced protein products (or inteins) are discussed. Comparisons are made between inteins and group I RNA introns. We describe the evidence supporting excision of inteins by a post-translational autocatalytic reaction of a full-length polypeptide precursor, rather than by RNA splicing. Some of the proposed mechanistic schemes, and the evidence supporting them, are examined.

20.
Bovine pancreatic ribonuclease A (RNase A) catalyzes the cleavage of P-O5' bonds in RNA on the 3' side of pyrimidines to form cyclic 2',3'-phosphates. It has several high-affinity binding sites that make it a possible target for many organic and inorganic molecules. Ligand binding to RNase A can alter the protein's secondary structure and its catalytic activity. In this review, the effects of several drugs such as AZT (anti-AIDS), cis-Pt (antitumor), aspirin (anti-inflammatory), and vitamin C (antioxidant) on the stability and conformation of RNase A in vitro are compared. The results of UV-visible, FTIR, and CD spectroscopic analysis of RNase complexes with aspirin, AZT, cis-Pt, and vitamin C at physiological conditions are discussed here. Spectroscopic results showed one major binding for each drug-RNase adduct, with K_AZT = 5.29 (±1.6) × 10^4 M^-1, K_aspirin = 3.57 (±1.4) × 10^4 M^-1, K_cis-Pt = 5.66 (±1.9) × 10^3 M^-1, and K_ascorbate = 3.50 (±1.5) × 10^3 M^-1. Major protein unfolding occurred, with a reduction of alpha-helix from 29% (free protein) to 20% and an increase of beta-sheet from 39% (free protein) to 45% in the aspirin-, ascorbate-, and cis-Pt-RNase complexes, while a minor increase of alpha-helix was observed for the AZT-RNase adduct.
