Similar literature
1.
Pazos F  Valencia A 《Proteins》2002,47(2):219-227
Deciphering the interaction links between proteins has become one of the main tasks of experimental and bioinformatic methodologies. Reconstruction of complex networks of interactions in simple cellular systems by integrating predicted interaction networks with available experimental data is becoming one of the most demanding needs in the postgenomic era. On the basis of the study of correlated mutations in multiple sequence alignments, we propose a new method (in silico two-hybrid, i2h) that directly addresses the detection of physically interacting protein pairs and identifies the most likely sequence regions involved in the interactions. We have applied the system to several test sets, showing that it can discriminate between true and false interactions in a significant number of cases. We have also analyzed a large collection of E. coli protein pairs as a first step toward the virtual reconstruction of its complete interaction network.
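
The idea of correlated mutations mentioned in this abstract can be illustrated with a minimal sketch. It assumes an identity-based similarity between residues and a Pearson correlation between the resulting pairwise-similarity vectors; the abstract does not give the scoring details, so the function names and the scoring scheme here are illustrative assumptions, not the published i2h implementation.

```python
from itertools import combinations
from math import sqrt

def pair_similarities(column):
    """Similarity (1.0 identical, 0.0 different) for every pair of sequences at one column.
    Assumption: identity scoring; a substitution matrix could be used instead."""
    return [1.0 if a == b else 0.0 for a, b in combinations(column, 2)]

def correlated_mutation_score(col_i, col_j):
    """Pearson correlation between the pairwise-similarity vectors of two alignment columns."""
    x, y = pair_similarities(col_i), pair_similarities(col_j)
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a fully conserved column carries no covariation signal
    return cov / sqrt(var_x * var_y)

# Toy columns: one from protein A's alignment and one from protein B's,
# with rows listed in the same species order (hypothetical data).
print(correlated_mutation_score("AAGG", "LLVV"))
```

Columns that change together across the same species score close to 1, while a conserved column scores 0 under this sketch.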

2.
Domains are basic evolutionary units of proteins and most proteins have more than one domain. Advances in domain modeling and collection are making it possible to annotate a large fraction of known protein sequences by a linear ordering of their domains, yielding their architecture. Protein domain architectures link evolutionarily related proteins and underscore their shared functions. Here, we attempt to better understand this association by identifying the evolutionary pathways by which extant architectures may have evolved. We propose a model of evolution in which architectures arise through rearrangements of inferred precursor architectures and acquisition of new domains. These pathways are ranked using a parsimony principle, whereby scenarios requiring the fewest number of independent recombination events, namely fission and fusion operations, are assumed to be more likely. Using a data set of domain architectures present in 159 proteomes that represent all three major branches of the tree of life allows us to estimate the history of over 85% of all architectures in the sequence database. We find that the distribution of rearrangement classes is robust with respect to alternative parsimony rules for inferring the presence of precursor architectures in ancestral species. Analyzing the most parsimonious pathways, we find 87% of architectures to gain complexity over time through simple changes, among which fusion events account for 5.6 times as many architectures as fission. Our results may be used to compute domain architecture similarities, for example, based on the number of historical recombination events separating them. Domain architecture "neighbors" identified in this way may lead to new insights about the evolution of protein function.

3.
The hordeiviral movement protein encoded by the first gene of the triple gene block (TGBp1) of Poa semilatent virus (PSLV) interacts with viral genomic RNAs to form RNP particles, which are considered to be a form of the viral genome capable of cell-to-cell and long-distance transport in infected plants. The PSLV TGBp1 contains a C-terminal NTPase/helicase domain (HELD) and an N-terminal extension region consisting of two structurally and functionally distinct domains: an extreme N-terminal domain (NTD) and an internal domain (ID). This study demonstrates that transient expression of TGBp1 fused to GFP in Nicotiana benthamiana leaves results in faint but obvious fluorescence in the nucleolus in addition to cytosolic distribution. Mutagenesis of the basic amino acids inside the NTD clusters A (116KSKRKKKNKK125) and B (175KKATKKESKKQTK187) reveals that these clusters are indispensable for nuclear and nucleolar targeting of PSLV TGBp1 and may contain nuclear and nucleolar localization signals or their elements. The PSLV TGBp1 is able to bind fibrillarin, the major nucleolar protein (AtFib2 from Arabidopsis thaliana), in vitro. This protein–protein interaction occurs between the glycine-arginine-rich (GAR) domain of fibrillarin and the first 82 amino acid residues of TGBp1. The interaction of TGBp1 with fibrillarin is also visualized in vivo by bimolecular fluorescence complementation (BiFC) during co-expression of TGBp1 (or its deletion mutants) and fibrillarin as fusions to different halves of YFP in N. benthamiana plants. The sites responsible for nuclear/nucleolar localization and fibrillarin binding have been located within the intrinsically disordered TGBp1 NTD. These data suggest that specific functions of hordeivirus TGBp1 may depend on its interaction with nucleolar components.

4.
Accurate and large‐scale prediction of protein–protein interactions directly from amino‐acid sequences is one of the great challenges in computational biology. Here we present a new Bayesian network method that predicts interaction partners using only multiple alignments of amino‐acid sequences of interacting protein domains, without tunable parameters, and without the need for any training examples. We first apply the method to bacterial two‐component systems and comprehensively reconstruct two‐component signaling networks across all sequenced bacteria. Comparisons of our predictions with known interactions show that our method infers interaction partners genome‐wide with high accuracy. To demonstrate the general applicability of our method we show that it also accurately predicts interaction partners in a recent dataset of polyketide synthases. Analysis of the predicted genome‐wide two‐component signaling networks shows that cognates (interacting kinase/regulator pairs, which lie adjacent on the genome) and orphans (which lie isolated) form two relatively independent components of the signaling network in each genome. In addition, while most genes are predicted to have only a small number of interaction partners, we find that 10% of orphans form a separate class of ‘hub’ nodes that distribute and integrate signals to and from up to tens of different interaction partners.

5.
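Based on the hypothesis that protein-protein interactions result from the combined action of multiple domains, a new method for predicting protein-protein interactions is proposed. A support vector machine is used to analyze the physicochemical properties of the amino acids in the sequences of domain-combination pairs to obtain their sequence feature values, while statistical analysis is used to obtain their frequency feature values. Finally, the two kinds of features are fused to estimate the likelihood that a given pair of domain combinations interacts, and protein-protein interactions are predicted on that basis. The method can predict interactions between all domain combinations and performs well in predicting protein-protein interactions.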

6.
Xing Qin  Shuangge Ma  Mengyun Wu 《Biometrics》2023,79(3):1761-1774
Genetic interactions play an important role in the progression of complex diseases, explaining variation in disease phenotype that is missed by main genetic effects. Comparatively, there are fewer studies on survival time, given its challenging characteristics such as censoring. In recent biomedical research, two-level analysis of both genes and their involved pathways has received much attention and been demonstrated to be more effective than single-level analysis. However, such analysis is usually limited to main effects. Pathways are not isolated, and their interactions have also been suggested to make important contributions to the prognosis of complex diseases. In this paper, we develop a novel two-level Bayesian interaction analysis approach for survival data. This approach is the first to conduct the analysis of lower-level gene–gene interactions and higher-level pathway–pathway interactions simultaneously. Significantly advancing from the existing Bayesian studies based on the Markov Chain Monte Carlo (MCMC) technique, we propose a variational inference framework based on the accelerated failure time model with effective priors to accommodate two-level selection as well as censoring. Its computational efficiency is highly desirable for high-dimensional interaction analysis. We examine the performance of the proposed approach using extensive simulation. The application to TCGA melanoma and lung adenocarcinoma data leads to biologically sensible findings with satisfactory prediction accuracy and selection stability.

7.
Sim J  Kim SY  Lee J 《Proteins》2005,59(3):627-632
Successful prediction of protein domain boundaries provides valuable information not only for the computational structure prediction of multidomain proteins but also for experimental structure determination. Since protein sequences of multiple domains may contain much information regarding evolutionary processes such as gene-exon shuffling, this information can be detected by analyzing the position-specific scoring matrix (PSSM) generated by PSI-BLAST. We have presented a method, PPRODO (Prediction of PROtein DOmain boundaries), that predicts domain boundaries of proteins from sequence information by a neural network. The network is trained and tested using the values obtained from the PSSM generated by PSI-BLAST. A 10-fold cross-validation technique is performed to obtain the parameters of neural networks using a nonredundant set of 522 proteins containing 2 contiguous domains. PPRODO provides good and consistent results for the prediction of domain boundaries, with accuracy of about 66% using the ±20 residue criterion. The PPRODO source code, as well as all data sets used in this work, are available from http://gene.kias.re.kr/~jlee/pprodo/.
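
The ±20 residue criterion quoted in this abstract can be read as a simple tolerance check. The sketch below is one plausible reading, assuming one predicted and one true boundary per two-domain protein and that a distance of exactly 20 residues still counts as a hit; the function names and the toy numbers are hypothetical.

```python
def boundary_is_correct(predicted, true_boundary, tolerance=20):
    """A prediction counts as correct if it lies within `tolerance` residues of the true boundary.
    Assumption: the comparison is inclusive (a distance of exactly 20 is a hit)."""
    return abs(predicted - true_boundary) <= tolerance

def boundary_accuracy(predictions, true_boundaries, tolerance=20):
    """Fraction of proteins whose predicted boundary satisfies the tolerance criterion."""
    hits = sum(
        boundary_is_correct(p, t, tolerance)
        for p, t in zip(predictions, true_boundaries)
    )
    return hits / len(predictions)

# Hypothetical residue positions for three two-domain proteins.
print(boundary_accuracy([100, 150, 300], [110, 200, 320]))
```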

8.
The overall function of a multi‐domain protein is determined by the functional and structural interplay of its constituent domains. Traditional sequence alignment‐based methods commonly utilize domain‐level information and provide classification only at the level of domains. Such methods cannot take into account the contributions of other domains in the protein or of domain‐linker regions, and therefore cannot classify multi‐domain proteins. An alignment‐free protein sequence comparison tool, CLAP (CLAssification of Proteins), was previously developed in our laboratory specifically to handle multi‐domain protein sequences without a requirement of defining domain boundaries and sequential order of domains. Through this method we aim to achieve a biologically meaningful classification scheme for multi‐domain protein sequences. In this article, CLAP‐based classification has been explored on 5 datasets of multi‐domain proteins and we present detailed analysis for proteins containing (1) Tyrosine phosphatase and (2) SH3 domain. At the domain level, the CLAP‐based classification scheme resulted in a clustering similar to that obtained from an alignment‐based method. CLAP‐based clusters obtained for full‐length datasets were shown to consist of proteins with similar functions and domain architectures. Our study demonstrates that multi‐domain proteins could be classified effectively by considering full‐length sequences without a requirement of identification of domains in the sequence.

9.
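The germ cell deficient (gcd) mouse mutant is an infertile mutant mouse discovered in the early 1990s; its infertility arises because the number of primordial germ cells at the embryonic stage is lower than normal. Loss of FancL (also called Pog) is the cause of the gcd mutant phenotype, and deletion of the FancL gene may affect the proliferation/survival of primordial germ cells in the mouse embryo and the meiosis of spermatocytes in the adult mouse. FANCL is a ubiquitin E3 ligase containing a PHD domain and is one component of the Fanconi anemia complex. In germ cells, FANCL interacts with GGN1 and GGN3, and GGN1 and GGN2 in turn interact specifically with a novel protein, GGNBP. However, the function of the GGNBP protein remains unclear. To study the function of GGNBP and to reveal more proteins involved in this process, Clontech's newly developed yeast two-hybrid System 3 was used, with GGNBP as bait, to screen an adult mouse testis cDNA library for genes encoding proteins that interact with it. A novel gene expressed mainly in the testis was found, whose encoded protein product interacts specifically with GGNBP in the yeast system.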

10.
Huang Y  Cao H  Liu Z 《Proteins》2012,80(6):1610-1619
Since the proposal of three-dimensional (3D) domain swapping, many 3D domain-swapped structures have been reported. However, when compared with the vast protein structure space, it is still unclear whether 3D domain swapping is a general mechanism for protein assembly. Here, we investigated this possibility by constructing a dataset consisting of more than 500 domain-swapped structures. The domain-swapped structures were mapped into the protein structure space. We found that about 10% of protein folds and 5% of protein families contain domain-swapped structures. When comparing the domain-swapped structures in a family/superfamily, we found that proteins within a family/superfamily can swap in different ways. Interface analysis revealed that the hinge loops contributed more than half of the open interface in 70% of bona fide domain-swapped dimers, indicating that the hinge loops play an important role in stabilizing the domain-swapped conformations. Our study supports the suggestion that domain swapping is a general property of all proteins and will facilitate further understanding of the mechanism of 3D domain swapping.

11.
In this paper, the Nearest Neighbour Algorithm (NNA) was developed for predicting protein subcellular location, based on an approach that combines the "functional domain composition" [K.C. Chou, Y.D. Cai, J. Biol. Chem. 277 (2002) 45765] and the pseudo-amino acid composition [K.C. Chou, Proteins Struct. Funct. Genet. 43 (2001) 246; Correction Proteins Struct. Funct. Genet. 2044 (2001) 2060]. Very high success rates were observed, suggesting that such a hybrid approach may become a useful high-throughput tool in the area of bioinformatics and proteomics.
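
A nearest-neighbour classifier of the kind named in this abstract assigns a query protein the label of its closest training example. The sketch below assumes Euclidean distance and small made-up feature vectors; the distance measure, the feature encoding, and the labels are assumptions for illustration and are not taken from the paper.

```python
from math import dist

def nearest_neighbour_predict(query, training_set):
    """Return the label of the training example closest to `query`.
    `training_set` is a list of (feature_vector, label) pairs.
    Assumption: Euclidean distance; other similarity measures are possible."""
    _, label = min(training_set, key=lambda item: dist(query, item[0]))
    return label

# Hypothetical two-dimensional feature vectors standing in for a hybrid
# functional-domain / pseudo-amino-acid composition encoding.
training = [
    ((0.0, 0.0), "cytoplasm"),
    ((1.0, 1.0), "nucleus"),
]
print(nearest_neighbour_predict((0.9, 0.8), training))  # prints "nucleus"
```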

12.
Zaki N  Berengueres J  Efimov D 《Proteins》2012,80(10):2459-2468
Detecting protein complexes from protein‐protein interaction (PPI) networks is becoming a difficult challenge in computational biology. There is ample evidence that many disease mechanisms involve protein complexes, and being able to predict these complexes is important to the characterization of the relevant disease for diagnostic and treatment purposes. This article introduces a novel method for detecting protein complexes from PPI networks by using a protein ranking algorithm (ProRank). ProRank quantifies the importance of each protein based on the interaction structure and the evolutionary relationships between proteins in the network. A novel way of identifying essential proteins, which are known for their critical role in mediating cellular processes and constructing protein complexes, is proposed and analyzed. We evaluate the performance of ProRank using two PPI networks on two reference sets of protein complexes created from Munich Information Center for Protein Sequence, containing 81 and 162 known complexes, respectively. We compare the performance of ProRank to some of the well-known protein complex prediction methods (ClusterONE, CMC, CFinder, MCL, MCode and Core) in terms of precision and recall. We show that ProRank predicts more complexes correctly at a competitive level of precision and recall. The level of accuracy achieved using ProRank in comparison to other recent methods for detecting protein complexes is a strong argument in favor of the proposed method. © 2012 Wiley Periodicals, Inc.
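
Precision and recall for predicted complexes require a rule for deciding when a predicted complex matches a reference complex. The sketch below assumes a Jaccard-overlap rule with a hypothetical threshold of 0.5; the abstract does not state the matching rule used in the paper, so both the rule and the threshold are illustrative assumptions.

```python
def jaccard(a, b):
    """Jaccard overlap between two non-empty sets of protein identifiers."""
    return len(a & b) / len(a | b)

def precision_recall(predicted, reference, threshold=0.5):
    """Precision: fraction of predicted complexes matching some reference complex.
    Recall: fraction of reference complexes matched by some predicted complex.
    Assumption: a match means Jaccard overlap >= threshold (hypothetical rule)."""
    matched_pred = sum(
        any(jaccard(p, r) >= threshold for r in reference) for p in predicted
    )
    matched_ref = sum(
        any(jaccard(p, r) >= threshold for p in predicted) for r in reference
    )
    return matched_pred / len(predicted), matched_ref / len(reference)

# Toy complexes with made-up protein names.
predicted = [{"A", "B", "C"}, {"X", "Y"}]
reference = [{"A", "B", "C", "D"}, {"M", "N"}]
print(precision_recall(predicted, reference))  # (0.5, 0.5)
```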

13.
Recent Bayesian methods for the analysis of infectious disease outbreak data using stochastic epidemic models are reviewed. These methods rely on Markov chain Monte Carlo methods. Both temporal and non-temporal data are considered. The methods are illustrated with a number of examples featuring different models and datasets.

14.
WW domains mediate protein-protein interactions in a number of different cellular functions by recognizing proline-containing peptide sequences. We determined peptide recognition propensities for 42 WW domains using NMR spectroscopy and peptide library screens. As potential ligands, we studied both model peptides and peptides based on naturally occurring sequences, including phosphorylated residues. Thirty-two WW domains were classified into six groups according to detected ligand recognition preferences for binding the motifs PPx(Y/poY), (p/phi)P(p,g)PPpR, (p/phi)PPRgpPp, PPLPp, (p/xi)PPPPP, and (poS/poT)P (motifs according to modified Seefeld Convention 2001). In addition to these distinct binding motifs, group-specific WW domain consensus sequences were identified. For PPxY-recognizing domains, phospho-tyrosine binding was also observed. Based on the sequences of the PPx(Y/poY)-specific group, a profile hidden Markov model was calculated and used to predict PPx(Y/poY)-recognition activity for WW domains, which were not assayed. PPx(Y/poY)-binding was found to be a common property of NEDD4-like ubiquitin ligases.
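
Ligand motifs such as PPxY can be located in a candidate sequence with a simple pattern scan. The sketch below assumes plain one-letter amino acid codes, where x is any residue; phosphorylated residues (poY) cannot be represented in that alphabet, so only the unmodified PPxY form is matched. The example sequence is made up.

```python
import re

# Lookahead so that overlapping occurrences are all reported.
PPXY = re.compile(r"(?=(PP[A-Z]Y))")

def find_ppxy(sequence):
    """Return (1-based start position, matched motif) for each PPxY occurrence."""
    return [(m.start() + 1, m.group(1)) for m in PPXY.finditer(sequence)]

# Hypothetical peptide sequence.
print(find_ppxy("MGPPPYSAPPLYQ"))  # [(3, 'PPPY'), (9, 'PPLY')]
```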

15.
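Bioinformatics methods for predicting protein-protein interactions   Total citations: 8 (self-citations: 0; citations by others: 8)
The research paradigm of the post-genomic era has shifted from the original sequence-structure-function to gene expression-system dynamics-physiological function. Establishing the complete network of protein-protein interactions, i.e., the protein interactome, will help deepen understanding of cell structure and function from a systems perspective and provide a theoretical basis for the discovery of new drug targets and for drug design. A series of experimental methods for systematically analyzing protein interactions has been established, and in recent years a variety of bioinformatics methods for predicting protein interactions have emerged. These methods are not only a valuable complement to traditional experimental methods but can also extend the predictive range of experimental methods; at the same time, some important concepts in molecular evolution and molecular biology were established in the course of developing them. This article reviews the principles of nine bioinformatics methods, their evaluation, and their outstanding problems, and analyzes the prospects for the field.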

16.
17.
The degree to which an amino acid site is free to vary is strongly dependent on its structural and functional importance. An amino acid that plays an essential role is unlikely to change over evolutionary time. Hence, the evolutionary rate at an amino acid site is indicative of how conserved this site is and, in turn, allows evaluation of its importance in maintaining the structure/function of the protein. When using probabilistic methods for site-specific rate inference, few alternatives are possible. In this study we use simulations to compare the maximum-likelihood and Bayesian paradigms. We study the dependence of inference accuracy on such parameters as number of sequences, branch lengths, the shape of the rate distribution, and sequence length. We also study the possibility of simultaneously estimating branch lengths and site-specific rates. Our results show that a Bayesian approach is superior to maximum-likelihood under a wide range of conditions, indicating that the prior that is incorporated into the Bayesian computation significantly improves performance. We show that when branch lengths are unknown, it is better first to estimate branch lengths and then to estimate site-specific rates. This procedure was found to be superior to estimating both the branch lengths and site-specific rates simultaneously. Finally, we illustrate the difference between maximum-likelihood and Bayesian methods when analyzing site-conservation for the apoptosis regulator protein Bcl-x(L).

18.
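Clustering of protein networks is an important means of identifying functional modules; it not only helps in understanding the organization of biological systems but is also important for predicting protein function. Given that effective analysis software for protein network clustering algorithms is currently lacking, this paper designs and implements ClusterE, a new analysis platform for protein network clustering algorithms. The platform implements clustering evaluation methods including recall, precision, sensitivity, specificity, and functional enrichment analysis, and integrates clustering algorithms including FAG-EC, Dpclus, Monet, IPC-MCE, and IPCA. It can visualize the results of protein network cluster analysis and can also compare and analyze multiple clustering algorithms visually under different evaluation metrics. The platform is readily extensible: both the clustering algorithms and the clustering evaluation methods are integrated into the system as plug-ins.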

19.
Protein structure docking is the process in which the quaternary structure of a protein complex is predicted from individual tertiary structures of the protein subunits. Protein docking is typically performed in two main steps. The subunits are first docked while keeping them rigid to form the complex, which is then followed by structure refinement. Structure refinement is crucial for the practical use of computational protein docking models, as it aims to correct the conformations of interacting residues and atoms at the interface. Here, we benchmarked the performance of eight existing protein structure refinement methods in refining protein complex models. We show that the fraction of native contacts between subunits is by far the most straightforward metric to improve. However, backbone-dependent metrics based on the root-mean-square deviation (RMSD) proved more difficult to improve via refinement.
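
The fraction of native contacts mentioned in this abstract can be computed once inter-subunit contacts have been listed for the native complex and for a model. The sketch below assumes contacts are already given as (residue in subunit A, residue in subunit B) pairs; how contacts are defined (for example, a distance cutoff) is outside this sketch, and the residue numbers are made up.

```python
def fraction_native_contacts(native_contacts, model_contacts):
    """Fraction of native inter-subunit contacts that are reproduced in the model.
    Each contact is a (residue_in_A, residue_in_B) pair.
    Assumption: contact lists were derived beforehand with the same contact definition."""
    native = set(native_contacts)
    if not native:
        raise ValueError("the native complex has no contacts to compare against")
    return len(native & set(model_contacts)) / len(native)

# Hypothetical residue-pair contacts.
native = [(1, 5), (2, 6), (3, 7), (4, 8)]
model = [(1, 5), (2, 6), (9, 9)]
print(fraction_native_contacts(native, model))  # 0.5
```

Non-native contacts in the model, such as (9, 9) here, do not lower this score; they only fail to raise it.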

20.
The type III secreted protein Tir from Enterohemorrhagic Escherichia coli (EHEC O157:H7) plays a central role in adherence and pedestal formation during infection. Little is known about how Tir domains outside of the amino-terminus contribute to efficient Tir secretion and translocation. We found a 6 amino acid (519-524) carboxy-terminal region which was required for efficient Tir secretion and translocation. Interestingly, EHEC O157:H7 Tir(Delta)519-524 was efficiently secreted when expressed in the related pathogen enteropathogenic E. coli. These data suggest that this region may play a role in maintaining EHEC O157:H7 Tir in a secretion-competent conformation.
