Similar articles
1.
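Fu Xin, Xu Zhenyuan 《生物信息学》 (Chinese Journal of Bioinformatics) 2007,5(3):113-116
This study applies a new graph-theoretic method for analyzing DNA sequences (fragments): it uses complex networks to study the topological structure of organisms, characterizing the network topology mainly by measuring the clustering coefficient. A correlation network of the selected DNA sequences (fragments) is constructed from the prefix-suffix association properties of the sequences; the network's distribution follows a power law and it has a large clustering coefficient. The results show that the constructed network has the characteristics of both small-world and scale-free networks, indicating that DNA sequences are not entirely random but have a deterministic structure with random perturbations.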
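The abstract above does not give the exact network construction, so the following is only a minimal sketch under stated assumptions: two fragments are linked when the last `k` characters of one equal the first `k` characters of the other, and the clustering coefficient is the standard local definition (fraction of a node's neighbor pairs that are themselves linked). The function names and the choice of `k` are hypothetical and are not taken from the paper.

```python
from itertools import combinations


def overlap_graph(fragments, k):
    """Build an undirected graph (assumed construction, not the paper's):
    link two distinct fragments when the k-suffix of one equals the
    k-prefix of the other."""
    adj = {f: set() for f in fragments}
    for a in fragments:
        for b in fragments:
            if a != b and a[-k:] == b[:k]:
                adj[a].add(b)
                adj[b].add(a)
    return adj


def clustering_coefficient(adj, node):
    """Local clustering coefficient: linked neighbor pairs / possible pairs."""
    neighbors = adj[node]
    degree = len(neighbors)
    if degree < 2:
        return 0.0
    linked = sum(1 for u, v in combinations(neighbors, 2) if v in adj[u])
    return 2.0 * linked / (degree * (degree - 1))


def average_clustering(adj):
    """Mean of the local clustering coefficients over all nodes."""
    return sum(clustering_coefficient(adj, n) for n in adj) / len(adj)
```

For example, `average_clustering(overlap_graph(["ACA", "ATA", "AGA"], 1))` describes three fragments that all overlap one another, so every node's neighbors are linked.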

2.
3.
Fan Yetian, Tang Xiwei, Hu Xiaohua, Wu Wei, Ping Qing 《BMC bioinformatics》2017,18(13):470-21

Background

Essential proteins are indispensable to the survival and development process of living organisms. To understand the functional mechanisms of essential proteins, which can be applied to the analysis of disease and design of drugs, it is important to identify essential proteins from a set of proteins first. As traditional experimental methods designed to test out essential proteins are usually expensive and laborious, computational methods, which utilize biological and topological features of proteins, have attracted more attention in recent years. Protein-protein interaction networks, together with other biological data, have been explored to improve the performance of essential protein prediction.

Results

The proposed method SCP is evaluated on Saccharomyces cerevisiae datasets and compared with five other methods. The results show that our method SCP outperforms the other five methods in terms of accuracy of essential protein prediction.

Conclusions

In this paper, we propose a novel algorithm named SCP, which combines the ranking by a modified PageRank algorithm based on subcellular compartments information, with the ranking by Pearson correlation coefficient (PCC) calculated from gene expression data. Experiments show that subcellular localization information is promising in boosting essential protein prediction.
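The abstract names the Pearson correlation coefficient (PCC) on gene expression data as one of the two ranking ingredients but does not say how the rankings are combined, so the sketch below covers only the PCC between two expression profiles. The function name `pearson` and the convention of returning 0.0 for a constant profile are assumptions made for illustration.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length profiles."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("profiles must have equal length of at least 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if norm_x == 0.0 or norm_y == 0.0:
        # Assumed convention: a constant profile is treated as uncorrelated.
        return 0.0
    return cov / (norm_x * norm_y)
```

Two profiles that rise and fall together give a value near 1, and profiles that move in opposite directions give a value near -1.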

4.
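Studies have shown that many neurodegenerative diseases are associated with the localization of proteins in the Golgi apparatus, so correctly identifying sub-Golgi proteins can aid the development of drugs for these diseases. This paper builds a dataset of two classes of sub-Golgi proteins, extracts features including amino acid composition, conjoint triad information, average chemical shifts, and Gene Ontology annotations, and uses a support vector machine for prediction. Under 5-fold cross-validation, the overall prediction accuracy is 87.43%.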
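The 5-fold cross-validation mentioned above can be sketched as follows. This is a generic illustration, not the authors' pipeline: the classifier is abstracted as a hypothetical callback `fit_predict(train_idx, test_idx)` that returns predicted labels for the test indices, and the shuffling seed is an arbitrary assumption.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train, test) index lists for k disjoint folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train, test


def cross_validated_accuracy(labels, fit_predict, k=5, seed=0):
    """Overall accuracy: correct test predictions pooled over all folds."""
    correct = 0
    for train, test in k_fold_indices(len(labels), k, seed):
        predicted = fit_predict(train, test)
        correct += sum(1 for i, p in zip(test, predicted) if p == labels[i])
    return correct / len(labels)
```

Each sample is used for testing exactly once, so the pooled accuracy is computed over the whole dataset.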

5.
Diverse procedures for identifying antigenic determinants on proteins have been developed, including experimental as well as computational approaches. However, most of these techniques focus on continuous epitopes, whereas fast and reliable identification and verification of discontinuous epitopes remain difficult. In this paper, we describe a computational workflow for the detection of discontinuous epitopes on proteins. The workflow uses a given protein 3D structure as input, and combines a per-residue solvent accessibility constraint with epitope-to-paratope shape complementarity measures and binding energies for assigning antigenic determinants in the conformational context. We have developed the procedure on a given set of 26 antigen-antibody complexes with a known structure, and have further expanded the available paratope shapes by generating a virtual paratope library in order to improve the screening for candidate residues constituting discontinuous epitopes. Applying the workflow to the 26 given antigens with known discontinuous epitopes resulted in the correct identification of the spatial proximity of 12 antigen-antibody interaction sites. Combining solvent accessibility, shape complementarity and binding energies for the identification of discontinuous epitopes clearly outperforms approaches solely considering accessibility and residue distance constraints.

6.

Background

Essential proteins play an indispensable role in the cellular survival and development. There have been a series of biological experimental methods for finding essential proteins; however they are time-consuming, expensive and inefficient. In order to overcome the shortcomings of biological experimental methods, many computational methods have been proposed to predict essential proteins. The computational methods can be roughly divided into two categories, the topology-based methods and the sequence-based ones. The former use the topological features of protein-protein interaction (PPI) networks while the latter use the sequence features of proteins to predict essential proteins. Nevertheless, it is still challenging to improve the prediction accuracy of the computational methods.

Results

Compared with nonessential proteins, essential proteins appear more frequently in certain subcellular locations and are more evolutionarily conserved. By integrating the information of subcellular localization, orthologous proteins and PPI networks, we propose a novel essential protein prediction method, named SON, in this study. The experimental results on S. cerevisiae data show that the prediction accuracy of SON clearly exceeds that of nine competing methods: DC, BC, IC, CC, SC, EC, NC, PeC and ION.

Conclusions

We demonstrate that, by integrating the information of subcellular localization and orthologous proteins with PPI networks, the accuracy of predicting essential proteins can be improved. Our proposed method SON is effective for predicting essential proteins.

7.
Protein structural annotation and classification is an important and challenging problem in bioinformatics. Research towards analysis of sequence-structure correspondences is critical for better understanding of a protein's structure, function, and its interaction with other molecules. Clustering of protein domains based on their structural similarities provides valuable information for protein classification schemes. In this article, we attempt to determine whether structure information alone is sufficient to adequately classify protein structures. We present an algorithm that identifies regions of structural similarity within a given set of protein structures, and uses those regions for clustering. In our approach, called STRALCP (STRucture ALignment-based Clustering of Proteins), we generate detailed information about global and local similarities between pairs of protein structures, identify fragments (spans) that are structurally conserved among proteins, and use these spans to group the structures accordingly. We also provide a web server at http://as2ts.llnl.gov/AS2TS/STRALCP/ for selecting protein structures, calculating structurally conserved regions and performing automated clustering.

8.
Single-stranded DNA-binding proteins (SSBs) play vital roles in all aspects of DNA metabolism in all three domains of life and are characterized by the presence of one or more OB fold ssDNA-binding domains. Here, using the genetically tractable euryarchaeon Haloferax volcanii as a model, we present the first genetic analysis of SSB function in the archaea. We show that genes encoding the OB fold and zinc finger-containing RpaA1 and RpaB1 proteins are individually non-essential for cell viability but share an essential function, whereas the gene encoding the triple OB fold RpaC protein is essential. Loss of RpaC function can however be rescued by elevated expression of RpaB, indicative of functional overlap between the two classes of haloarchaeal SSB. Deletion analysis is used to demonstrate important roles for individual OB folds in RpaC and to show that conserved N- and C-terminal domains are required for efficient repair of DNA damage. Consistent with a role for RpaC in DNA repair, elevated expression of this protein leads to enhanced resistance to DNA damage. Taken together, our results offer important insights into archaeal SSB function and establish the haloarchaea as a valuable model for further studies.

9.

Background

Understanding the information-processing capabilities of signal transduction networks, how those networks are disrupted in disease, and rationally designing therapies to manipulate diseased states require systematic and accurate reconstruction of network topology. Data on networks central to human physiology, such as the inflammatory signalling networks analyzed here, are found in a multiplicity of on-line resources of pathway and interactome databases (Cancer CellMap, GeneGo, KEGG, NCI-Pathway Interactome Database (NCI-PID), PANTHER, Reactome, I2D, and STRING). We sought to determine whether these databases contain overlapping information and whether they can be used to construct high reliability prior knowledge networks for subsequent modeling of experimental data.

Results

We have assembled an ensemble network from multiple on-line sources representing a significant portion of all machine-readable and reconcilable human knowledge on proteins and protein interactions involved in inflammation. This ensemble network has many features expected of complex signalling networks assembled from high-throughput data: a power law distribution of both node degree and edge annotations, and topological features of a "bow tie" architecture in which diverse pathways converge on a highly conserved set of enzymatic cascades focused around PI3K/AKT, MAPK/ERK, JAK/STAT, NFκB, and apoptotic signaling. Individual pathways exhibit "fuzzy" modularity that is statistically significant but still involving a majority of "cross-talk" interactions. However, we find that the most widely used pathway databases are highly inconsistent with respect to the actual constituents and interactions in this network. Using a set of growth factor signalling networks as examples (epidermal growth factor, transforming growth factor-beta, tumor necrosis factor, and wingless), we find a multiplicity of network topologies in which receptors couple to downstream components through myriad alternate paths. Many of these paths are inconsistent with well-established mechanistic features of signalling networks, such as a requirement for a transmembrane receptor in sensing extracellular ligands.

Conclusions

Wide inconsistencies among interaction databases, pathway annotations, and the numbers and identities of nodes associated with a given pathway pose a major challenge for deriving causal and mechanistic insight from network graphs. We speculate that these inconsistencies are at least partially attributable to cell- and context-specificity of cellular signal transduction, which is largely unaccounted for in available databases, but the absence of standardized vocabularies is an additional confounding factor. As a result of discrepant annotations, it is very difficult to identify biologically meaningful pathways from interactome networks a priori. However, by incorporating prior knowledge, it is possible to successively build out network complexity with high confidence from a simple linear signal transduction scaffold. Such reduced complexity networks appear suitable for use in mechanistic models while being richer and better justified than the simple linear pathways usually depicted in diagrams of signal transduction.

10.
ABSTRACT: BACKGROUND: Identification of essential proteins plays a significant role in understanding the minimal requirements for cellular survival and development. Many computational methods have been proposed for predicting essential proteins by using the topological features of protein-protein interaction (PPI) networks. However, most of these methods ignore the intrinsic biological meaning of proteins. Moreover, PPI data contain many false positives and false negatives. To overcome these limitations, many research groups have recently started to focus on identifying essential proteins by integrating PPI networks with other biological information. However, none of their methods has been widely acknowledged. RESULTS: Considering that essential proteins are more evolutionarily conserved than nonessential proteins and that essential proteins frequently bind each other, we propose an iterative method, named ION, for predicting essential proteins by integrating orthology with PPI networks. Unlike other methods, ION identifies essential proteins based not only on the connections between proteins but also on their orthologous properties and the features of their neighbors. ION is applied to predict essential proteins in S. cerevisiae. Experimental results show that ION achieves higher identification accuracy than eight other existing centrality methods in terms of area under the curve (AUC). Moreover, ION identifies a large number of essential proteins that are ignored by the eight other centrality methods because of their low connectivity. Many proteins ranked in the top 100 by ION are both essential and belong to complexes with certain biological functions. Furthermore, no matter how many reference organisms are selected, ION outperforms all eight other existing centrality methods, while using as many reference organisms as possible can improve the performance of ION. Additionally, ION also shows good prediction performance in E. coli K-12. CONCLUSIONS: The accuracy of predicting essential proteins can be improved by integrating orthology with PPI networks.
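The abstract compares methods by area under the curve (AUC) without defining the curve. Assuming it refers to the ROC curve of a score-based ranking of proteins, the AUC equals the probability that a randomly chosen essential protein scores higher than a randomly chosen nonessential one, with ties counted as one half. The sketch below computes that quantity directly; the function name and the input layout are hypothetical.

```python
def roc_auc(scores, is_essential):
    """ROC AUC via pairwise comparison of positive and negative scores."""
    positives = [s for s, flag in zip(scores, is_essential) if flag]
    negatives = [s for s, flag in zip(scores, is_essential) if not flag]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(positives) * len(negatives))
```

The pairwise loop is quadratic in the number of proteins, which is acceptable for a small illustration; a rank-based formulation would be preferable for large datasets.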

11.
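Differential analysis is of great significance for revealing the growth, development, and aging of organisms and the onset of disease, and network-based differential analysis has become a research focus in systems biology. Network nodes often carry out a function by interacting with their local structure, so changes in a node's relationship to its local structure are likely to affect its function. Using simulation experiments, this paper compares the performance of two local-structure measures, the graphlet vector and the node clustering coefficient, and uses each to design an algorithm for mining gene clusters whose modularity changes in differential networks. In experiments on gene expression data from 12 mouse tissues in the AGEMAP database, most clusters are highly significantly enriched for aging-related GO terms.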

12.
An activity coefficient model for proteins   Cited by: 2 (self-citations: 0, by others: 2)
Modeling of the properties of biochemical components is gaining increasing interest due to its potential for further application within the area of biochemical process development. Generally protein solution properties such as protein solubility are expressed through component activity coefficients which are studied here. The original UNIQUAC model is chosen for the representation of protein activity coefficients and, to the best of our knowledge, this is the first time it has been directly applied to protein solutions. Ten different protein-salt-water systems with four different proteins, serum albumin, alpha-chymotrypsin, beta-lactoglobulin and ovalbumin, are investigated. A root-mean-squared deviation of 0.54% is obtained for the model by comparing calculated protein activity coefficients and protein activity coefficients deduced from osmotic measurements through virial expansion. Model predictions are used to analyze the effect of salt concentrations, pH, salt types, and temperature on protein activity coefficients and also on protein solubility and demonstrate consistency with results from other references. (c) 1997 John Wiley & Sons, Inc. Biotechnol Bioeng 55: 65-71, 1997.

13.
14.
Cbl promotes clustering of endocytic adaptor proteins   Cited by: 2 (self-citations: 0, by others: 2)
The ubiquitin ligases c-Cbl and Cbl-b play a crucial role in receptor downregulation by mediating multiple monoubiquitination of receptors and promoting their sorting for lysosomal degradation. Their function is modulated through interactions with regulatory proteins including CIN85 and PIX, which recognize a proline-arginine motif in Cbl and thus promote or inhibit receptor endocytosis. We report the structures of SH3 domains of CIN85 and beta-PIX in complex with a proline-arginine peptide from Cbl-b. Both structures reveal a heterotrimeric complex containing two SH3 domains held together by a single peptide. Trimerization also occurs in solution and is facilitated by the pseudo-symmetrical peptide sequence. Moreover, ternary complexes of CIN85 and Cbl are formed in vivo and are important for the ability of Cbl to promote epidermal growth factor receptor (EGFR) downregulation. These results provide molecular explanations for a novel mechanism by which Cbl controls receptor downregulation.

15.
We perform a computational study using a new approach to the analysis of protein sequences. The contextual alignment model, proposed recently by Gambin et al. (2002), is based on the assumption that, while constructing an alignment, the score of a substitution of one residue by another depends on the surrounding residues. The contextual alignment scores calculated in this model were used for hierarchical clustering of several protein families from the database of Clusters of Orthologous Groups (COG). The clustering was also constructed using the standard approach. The comparative analysis shows that the contextual model results in more consistent clustering trees. The difference, although small, is without exception in favour of the contextual model. The consistency of the family of trees is measured by several consensus and agreement methods, as well as by the inter-tree distance approach.

16.
The lamellar membrane at the leading edge of motile cells participates in a series of complex movements that involve the assembly and reorganization of actin bundles and networks, both structures formed by actin crosslinking proteins. Immunofluorescence microscopy localizes within lamellipodia and filopodia several crosslinking proteins including fascin, fimbrin, α-actinin and filamin. While these proteins may organize actin into bundles and networks, fimbrin and α-actinin may play an additional role of linking the cytoskeleton to cell-substratum adhesion sites.

17.
Multiconstrained gene clustering based on generalized projections   Cited by: 1 (self-citations: 0, by others: 1)

Background  

Gene clustering for annotating gene functions is one of the fundamental issues in bioinformatics. The best clustering solution is often regularized by multiple constraints such as gene expressions, Gene Ontology (GO) annotations and gene network structures. How to integrate multiple constraints into an optimal clustering solution remains an unsolved problem.

18.
c-Jun is essential for organization of the epidermal leading edge   Cited by: 9 (self-citations: 0, by others: 9)
The migration of epithelial layers requires specific and coordinated organization of the cells at the leading edge of the sheet. Mice that are conditionally deleted for the c-jun protooncogene in epidermis are born at expected frequencies, but with open eyes and with defects in epidermal wound healing. Keratinocytes lacking c-Jun are unable to migrate or elongate properly in culture at the border of scratch assays. Histological analyses in vitro and in vivo demonstrate an inability to activate EGF receptor at the leading edge of wounds, and we demonstrate that this can be rescued by supplementation with conditioned medium or the EGF receptor ligand HB-EGF. Lack of c-Jun prevents EGF-induced expression of HB-EGF, indicating that c-jun controls formation of the epidermal leading edge through its control of an EGF receptor autocrine loop.

19.
20.
The heart sound signal is first separated into cycles, where the cycle detection is based on an instantaneous cycle frequency. The heart sound data of one cardiac cycle can be decomposed into a number of atoms characterized by timing delay, frequency, amplitude, time width and phase. To segment heart sounds, we made a hypothesis that the atoms of a heart sound congregate as a cluster in time–frequency domains. We propose an atom density function to indicate clusters. To suppress clusters of murmurs and noise, weighted density function by atom energy is further proposed to improve the segmentation of heart sounds. Therefore, heart sounds are indicated by the hybrid analysis of clustering and medical knowledge. The segmentation scheme is automatic and no reference signal is needed. Twenty-six subjects, including 3 normal and 23 abnormal subjects, were tested for heart sound signals in various clinical cases. Our statistics show that the segmentation was successful for signals collected from normal subjects and patients with moderate murmurs.
