Similar literature
1.
Conserved domains represent essential building blocks of most known proteins. Owing to their role as modular components carrying out specific functions, they form a network based both on functional relations and direct physical interactions. We have previously shown that domain interaction networks provide substantially novel information with respect to networks built on full-length protein chains. In this work we present a comprehensive web resource for interactively exploring the Domain Interaction MAp (DIMA). The tool aims at integration of multiple data sources and prediction techniques, two of which have been implemented so far: domain phylogenetic profiling and experimentally demonstrated domain contacts from known three-dimensional structures. A powerful yet simple user interface enables the user to compute, visualize, navigate and download domain networks based on specific search criteria. Availability: http://mips.gsf.de/genre/proj/dima

2.
Zhou Y, Wang R, Li L, Xia X, Sun Z. Journal of Molecular Biology, 2006, 359(4): 1150-1159
Identifying potential protein interactions is of great importance in understanding the topologies of cellular networks, which is much needed and valued in current systematic biological studies. The development of computational methods to predict protein-protein interactions has been spurred on by the massive sequencing efforts of the genomic revolution. Among these methods is phylogenetic profiling, which assumes that proteins under similar evolutionary pressures with similar phylogenetic profiles might be functionally related. Here, we introduce a method for inferring functional linkages between proteins from their evolutionary scenarios. The term evolutionary scenario refers to a series of events that occurred in speciation over time, which can be reconstructed given a phylogenetic profile and a species tree. Common evolutionary pressures on two proteins can then be inferred by comparing their evolutionary scenarios, which is a direct indication of their functional linkage. This scenario method has proven to perform better than the classical phylogenetic profile method when applied to the same test set. In addition, the predicted results of the two methods are found to be fairly different, suggesting the possibility of merging them in order to achieve better performance. We analyzed the influence of the topology of the phylogenetic tree on the performance of this method, and found it to be robust to perturbations in the topology of the tree. However, if a completely random tree is incorporated, performance will decline significantly. The evolutionary scenario method was used for inferring functional linkages in 67 species, and 40,006 linkages were predicted. We examine our predictions for budding yeast and find that almost all predicted linkages are supported by further evidence.
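
For orientation, the classical phylogenetic profile comparison that this abstract uses as its baseline can be sketched as follows. This is only a minimal illustration, not the authors' scenario method: the Jaccard similarity, the threshold, and the names `jaccard_similarity`, `linked_pairs`, `protA`, `protB` and `protC` are all assumptions chosen for the example.

```python
from itertools import combinations


def jaccard_similarity(profile_a, profile_b):
    """Jaccard similarity of two binary presence/absence profiles."""
    if len(profile_a) != len(profile_b):
        raise ValueError("profiles must cover the same set of genomes")
    both = sum(1 for a, b in zip(profile_a, profile_b) if a and b)
    either = sum(1 for a, b in zip(profile_a, profile_b) if a or b)
    return both / either if either else 0.0


def linked_pairs(profiles, threshold):
    """Return (name, name, score) for pairs whose similarity reaches the threshold."""
    pairs = []
    for (name_a, prof_a), (name_b, prof_b) in combinations(sorted(profiles.items()), 2):
        score = jaccard_similarity(prof_a, prof_b)
        if score >= threshold:
            pairs.append((name_a, name_b, score))
    return pairs


# Hypothetical profiles over six genomes (1 = homolog present, 0 = absent).
profiles = {
    "protA": [1, 1, 0, 1, 0, 0],
    "protB": [1, 1, 0, 1, 0, 1],
    "protC": [0, 0, 1, 0, 1, 1],
}

print(linked_pairs(profiles, threshold=0.7))
```

Here only `protA` and `protB` share most of their presence pattern, so they are the only pair reported as a candidate functional linkage. A scenario-based method would additionally need a species tree, which this sketch deliberately leaves out.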

3.
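Phylogenetic profiling is currently one of the more widely studied non-homology-based methods for the functional annotation of biological macromolecules. To address shortcomings of existing algorithms, the method is improved in two respects: first, a weight-based phylogenetic profile is constructed; second, an improved clustering algorithm is used to analyze the similarity of the profiles. One hundred Escherichia coli K12 proteins downloaded from NCBI were used as experimental data, and similar profiles were analyzed with the improved algorithm, the classical hierarchical clustering algorithm, and the K-means clustering algorithm. The results show that the proposed algorithm clusters similar profiles with markedly higher accuracy than the latter two clustering algorithms.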

4.
Predicting functions of proteins and alternatively spliced isoforms encoded in a genome is one of the important applications of bioinformatics in the post-genome era. Because experimental characterization of all proteins encoded in a genome by biochemical studies is impractical, bioinformatics methods provide powerful tools for function annotation and prediction. These methods also help minimize the growing sequence-to-function gap. Phylogenetic profiling is a bioinformatics approach to identify the influence of a trait across species and can be employed to infer the evolutionary history of proteins encoded in genomes. Here we propose an improved phylogenetic profile-based method which considers the co-evolution of the reference genome to derive the basic similarity measure, and the background phylogeny of target genomes for profile generation and for assigning weights to target genomes. The ordering of genomes and the runs of consecutive matches between the proteins were used to define phylogenetic relationships in the approach. We used the Escherichia coli K12 genome as the reference genome, and its 4195 proteins were used in the current analysis. We compared our approach with two existing methods, and our initial results show that its predictions outperform both. In addition, we have validated our method using a targeted protein-protein interaction network derived from the protein-protein interaction database STRING. Our preliminary results indicate that function prediction can be improved by placing the coevolution-based similarity measures and the runs on the same scale instead of computing them on different scales. Our method can be applied at the whole-genome level for annotating hypothetical proteins from prokaryotic genomes.
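
The idea of scoring two profiles by runs of consecutive matches over an ordered, weighted list of genomes can be illustrated with a small sketch. The abstract does not give the actual scoring formula, so everything here is an assumption made for illustration: the functions `match_runs` and `weighted_run_score`, the `min_run` cutoff, and the example weights.

```python
def match_runs(profile_a, profile_b):
    """Lengths of maximal runs of consecutive genomes where both profiles agree."""
    runs, current = [], 0
    for a, b in zip(profile_a, profile_b):
        if a == b:
            current += 1
        else:
            if current:
                runs.append(current)
            current = 0
    if current:
        runs.append(current)
    return runs


def weighted_run_score(profile_a, profile_b, weights, min_run=2):
    """Fraction of total genome weight covered by matching runs of length >= min_run."""
    if not (len(profile_a) == len(profile_b) == len(weights)):
        raise ValueError("profiles and weights must have the same length")
    total = sum(weights)
    if total == 0:
        return 0.0
    score, run = 0.0, []  # run holds the weights of the current matching run
    for a, b, w in zip(profile_a, profile_b, weights):
        if a == b:
            run.append(w)
        else:
            if len(run) >= min_run:
                score += sum(run)
            run = []
    if len(run) >= min_run:
        score += sum(run)
    return score / total


# Hypothetical profiles over seven genomes, listed in phylogenetic order.
profile_a = [1, 1, 0, 1, 0, 0, 1]
profile_b = [1, 1, 0, 0, 0, 0, 0]
# Hypothetical weights that down-weight two closely related genomes.
weights = [1.0, 0.5, 0.5, 1.0, 1.0, 1.0, 1.0]

print(match_runs(profile_a, profile_b))
print(weighted_run_score(profile_a, profile_b, weights))
```

Because the genomes are kept in a fixed order, a long run of agreement spans a whole clade, while isolated single matches fall below `min_run` and contribute nothing. Expressing the result as a fraction of total weight keeps the score between 0 and 1, which is one simple way to put run-based and similarity-based measures on a common scale.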

5.
6.
We have proposed a rapid phylogenetic classification at the strain level by MALDI-TOF MS using ribosomal protein matching profiling. In this study, the S10-spc-alpha operon, encoding half of the ribosomal subunit proteins and highly conserved in eubacterial genomes, was selected for construction of the ribosomal protein database as biomarkers for bacterial identification by MALDI-TOF MS analysis to establish a more reliable phylogenetic classification. Our method revealed that the 14 reliable and reproducible ribosomal subunit proteins with m/z less than 15,000, except for L14, coded in the S10-spc-alpha operon were significantly useful biomarkers for bacterial classification at species and strain levels by MALDI-TOF MS analysis of genus Pseudomonas strains. The obtained phylogenetic tree was consistent with that based on genetic sequence (gyrB). Since S10-spc-alpha operons of genus Pseudomonas strains were sequenced using specific primers designed based on nucleotide sequences of genome-sequenced strains, the ribosomal subunit proteins encoded in the S10-spc-alpha operon were suitable biomarkers for construction and correction of the database. MALDI-TOF MS analysis using these 14 selected ribosomal proteins is a rapid, efficient, and versatile bacterial identification method with a validation procedure for the obtained results.

7.
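Domains are evolutionarily conserved sequence units and the standard building blocks of protein structure and function. A typical interaction between two proteins involves binding between specific domains, and identifying interacting domains is important for thoroughly understanding protein function and evolution at the domain level, constructing protein interaction networks, and analyzing biological pathways. To date, through further mining of experimental data and computational prediction from a variety of input data, a number of interacting or functionally linked domain pairs have been identified, and content-rich, regularly updated domain interaction databases have been built from them. This review surveys eight computational methods for predicting domain interactions. It introduces the five public domain interaction databases 3DID, iPfam, InterDom, DIMA and DOMINE, with relevant information and recent developments. It also outlines, with examples, applications of domain interactions in the computational prediction and confidence assessment of protein interactions, in protein domain annotation, and in biological pathway analysis.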

8.
9.
In order to simplify and meaningfully categorize large sets of protein sequence data, it is commonplace to cluster proteins based on the similarity of those sequences. However, it quickly becomes clear that the sequence flexibility allowed a given protein varies significantly among different protein families. The degree to which sequences are conserved not only differs for each protein family, but also is affected by the phylogenetic divergence of the source organisms. Clustering techniques that use similarity thresholds for protein families do not always allow for these variations and thus cannot be confidently used for applications such as automated annotation and phylogenetic profiling. In this work, we applied a spectral bipartitioning technique to all proteins from 53 archaeal genomes. Comparisons between different taxonomic levels allowed us to study the effects of phylogenetic distances on cluster structure. Likewise, by associating functional annotations and phenotypic metadata with each protein, we could compare our protein similarity clusters with both protein function and associated phenotype. Our clusters can be analyzed graphically and interactively online.

10.
The BRCT (Breast Cancer Carboxyl Terminus) domain is widely distributed in proteins involved in DNA metabolism and cell cycle regulation. In most of the representative members of the BRCT family, this domain usually comprises about 90-100 amino acid residues and is generally present as a single motif or in tandem repeats. Although the members of the BRCT family share little sequence similarity, structural studies have demonstrated a relatively conserved structure of two or three alpha-helices surrounding the central beta-sheets. This report describes an in silico analysis aimed at understanding the sequential, structural, and phylogenetic features of the BRCT domain in higher plant genomes. Based on database searches, 25 BRCT domain-containing proteins were identified, and many of them were found to be involved in multiple DNA damage repair pathways. We further used homology modeling to address the structure-function relations of the BRCT domain in connection with the DNA damage repair mechanism in plants.

11.
Emerson RO, Thomas JH. Journal of Virology, 2011, 85(22): 12043-12052
SCAN is a protein domain frequently found at the N termini of proteins encoded by mammalian tandem zinc finger (ZF) genes, whose structure is known to be similar to that of retroviral gag capsid domains and whose multimerization has been proposed as a model for retroviral assembly. We report that the SCAN domain is derived from the C-terminal portion of the gag capsid (CA) protein from the Gmr1-like family of Gypsy/Ty3-like retrotransposons. On the basis of sequence alignments and phylogenetic distributions, we show that the ancestral host SCAN domain (ESCAN for extended SCAN) was exapted from a full-length CA gene from a Gmr1-like retrotransposon at or near the root of the tetrapod animal branch. A truncated variant of ESCAN that corresponds to the annotated SCAN domain arose shortly thereafter and appears to be the only form extant in mammals. The Anolis lizard has a large number of tandem ZF genes with N-terminal ESCAN or SCAN domains. We predict DNA binding sites for all Anolis ESCAN-ZF and SCAN-ZF proteins and demonstrate several highly significant matches to Anolis Gmr1-like sequences, suggesting that at least some of these proteins target retroelements. SCAN is known to mediate protein dimerization, and the CA protein multimerizes to form the core retroviral and retrotransposon capsid structure. We speculate that the SCAN domain originally functioned to target host ZF proteins to retroelement capsids.

12.
Butyrophilins (BTN) belong to the immunoglobulin (Ig) superfamily of transmembrane proteins. These molecules are of increasing interest to immunologists, as they share structural homology with B7 family members at the extracellular domain level. Moreover, a role in the negative regulation of lymphocyte activation has been suggested for almost all the BTN that have been studied. In addition, the expression of some BTN family members has been reported to be associated with autoimmune diseases. Over the last few years, the number of BTN and BTN-like members has greatly increased. In this study, the butyrophilin family in mammals has been revisited, using phylogenetic analysis to identify all the family members and the phylogenetic relations among them, and to establish a standard nomenclature. Fourteen BTN groups were identified, not all of which are conserved between mammalian species. In addition, an overview of expression profiles and functional BTN data demonstrates that these molecules represent a new area of investigation for the design of future strategies in the modulation of the immune system.

13.
Homology model building of the HMG-1 box structural domain.
Nucleoproteins belonging to the HMG-1/2 family possess homologous domains approximately 75 amino acids in length. These domains, termed HMG-1 boxes, are highly structured, compact, and mediate the interaction between HMG-1 box-containing proteins and DNA in a variety of biological contexts. Homology model building experiments on HMG-1 box sequences 'threaded' through the 1H-NMR structure of an HMG-1 box from rat indicate that the domain does not have rigid sequence requirements for its formation. Energy calculations indicate that the structure of all HMG-1 box domains is stabilized primarily through hydrophobic interactions. We have found structural relationships in the absence of statistically significant sequence similarity, identifying several candidate proteins which could possibly assume the same three-dimensional conformation as the rat HMG-1 box motif. The threading technique provides a method by which significant structural similarities in a diverse protein family can be efficiently detected, and the 'structural alignment' derived by this method provides a rational basis through which phylogenetic relationships and the precise sites of interaction between HMG-1 box proteins and DNA can be deduced.

14.
Reliable methods for profiling secretory proteins are highly desirable for the identification of biomarkers of disease progression. Secreted proteins are often masked by high amounts of protein supplements in the culture medium. We have developed an efficient method for the enrichment and analysis of the secretome of different cancer cell lines, free of essential contaminants. The method is based on the optimization of cell incubation conditions in protein-free medium. Secreted proteins are concentrated and fractionated using a reversed-phase tC2 Sorbent, followed by peptide mass fingerprinting for protein identification. An average of 88 proteins were identified in each cancer cell line, of which more than 76% are known to be secreted or possess a signal peptide or a transmembrane domain. Given the importance of secreted proteins as a source for early detection and diagnosis of disease, this approach may help to discover novel candidate biomarkers with potential clinical significance.

15.
Most eubacteria, and all eukaryotes examined thus far, encode homologs of the DNA mismatch repair protein MutS. Although eubacteria encode only one or two MutS-like proteins, eukaryotes encode at least six distinct MutS homolog (MSH) proteins, corresponding to conserved (orthologous) gene families. This suggests evolution of individual gene family lines of descent by several duplication/specialization events. Using quantitative phylogenetic analyses (RASA, or relative apparent synapomorphy analysis), we demonstrate that comparison of complete MutS protein sequences, rather than highly conserved C-terminal domains only, maximizes information about evolutionary relationships. We identify a novel, highly conserved middle domain, and clearly delineate an N-terminal domain, previously implicated in mismatch recognition, that shows family-specific patterns of aromatic and charged amino acids. Our final analysis, in contrast to previous analyses of MutS-like sequences, yields a stable phylogenetic tree consistent with the known biochemical functions of MutS/MSH proteins, which now assigns all known eukaryotic MSH proteins to a monophyletic group whose branches correspond to the respective specialized gene families. The rooted phylogenetic tree suggests their derivation from a mitochondrial MSH1-like protein, itself the descendant of the MutS of a symbiont in a primitive eukaryotic precursor.

16.
“Phylogenetic profiling” is based on the hypothesis that during evolution functionally or physically interacting genes are likely to be inherited or eliminated in a codependent manner. Creating presence–absence profiles of orthologous genes is now a common and powerful way of identifying functionally associated genes. In this approach, correctly determining orthology, as a means of identifying functional equivalence between two genes, is a critical and nontrivial step and largely explains why previous work in this area has mainly focused on using presence–absence profiles in prokaryotic species. Here, we demonstrate that eukaryotic genomes have a high proportion of multigene families whose phylogenetic profile distributions are poor in presence–absence information content. This feature makes them prone to orthology mis-assignment and unsuited to standard profile-based prediction methods. Using CATH structural domain assignments from the Gene3D database for 13 complete eukaryotic genomes, we have developed a novel modification of the phylogenetic profiling method that uses genome copy number of each domain superfamily to predict functional relationships. In our approach, superfamilies are subclustered at ten levels of sequence identity (from 30% to 100%) and phylogenetic profiles are built at each level. All the profiles are compared using normalised Euclidean distances to identify those with correlated changes in their domain copy number. We demonstrate that two protein families will “auto-tune” with strong co-evolutionary signals when their profiles are compared at the similarity levels that capture their functional relationship. Our method finds functional relationships that are not detectable by the conventional presence–absence profile comparisons, and it does not require a priori any fixed criteria to define orthologous genes.
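
The comparison of copy-number profiles across identity levels can be sketched roughly as below. The abstract does not specify how the Euclidean distance is normalised, so this example assumes each profile is scaled by its own maximum copy number and the distance is divided by the number of genomes; the functions `normalise`, `normalised_euclidean` and `best_match`, and all the example data, are hypothetical.

```python
import math


def normalise(profile):
    """Scale a copy-number profile by its maximum (assumed normalisation)."""
    peak = max(profile)
    return [x / peak for x in profile] if peak else [0.0] * len(profile)


def normalised_euclidean(profile_a, profile_b):
    """Root-mean-square difference between two scaled copy-number profiles."""
    if len(profile_a) != len(profile_b) or not profile_a:
        raise ValueError("profiles must be non-empty and cover the same genomes")
    a, b = normalise(profile_a), normalise(profile_b)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))


def best_match(levels_a, levels_b):
    """Compare every pair of identity levels; return (distance, level_a, level_b)."""
    return min(
        (normalised_euclidean(pa, pb), la, lb)
        for la, pa in levels_a.items()
        for lb, pb in levels_b.items()
    )


# Hypothetical copy numbers in four genomes, keyed by percent sequence identity.
family_x = {30: [5, 5, 5, 5], 60: [2, 4, 8, 4]}
family_y = {30: [3, 3, 3, 1], 60: [1, 2, 4, 2]}

print(best_match(family_x, family_y))
```

In this toy data the two families only show matching expansions and contractions when both are subclustered at the 60% level, which is the kind of level-dependent signal the abstract refers to as "auto-tuning". Scaling by the maximum makes profiles comparable when one family is simply larger than the other.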

17.
18.
The blue copper proteins and their relatives have been compared by sequence alignments, by comparison of three-dimensional structures, and by construction of phylogenetic trees. The group contains proteins varying in size from 100 residues to over 2,300 residues in a single chain, containing from zero to nine copper atoms, and with a broad variation in function ranging from electron carrier proteins and oxidases to the blood coagulation factors V and VIII. Difference matrices show the sequence difference to be over 90% for many pairs in the group, yet alignment scores and other evidence suggest that they all evolved from a common ancestor. We have attempted to delineate how this evolution took place and in particular to define the mechanisms by which these proteins acquired an ever-increasing complexity in structure and function. We find evidence for six such mechanisms in this group of proteins: domain enlargement, in which a single domain increases in size from about 100 residues up to 210; domain duplication, which allows for a size increase from about 170 to about 1,000 residues; segment elongation, in which a small segment undergoes multiple successive duplications that can increase the chain size 50-fold; domain recruitment, in which a domain coded elsewhere in the genome is added on to the peptide chain; subunit formation, to form multisubunit proteins; and glycosylation, which in some cases doubles the size of the protein molecule. Size increase allows for the evolution of new catalytic properties, in particular the oxidase function, and for the formation of coagulation factors with multiple interaction sites and regulatory properties. The blood coagulation system is examined as an example in which a system of interacting proteins evolved by successive duplications of larger parts of the genome. The evolution of size, functionality, and diversity is compared with the general question of increase in size and complexity in biology.

19.
Predictome: a database of putative functional links between proteins
The current deluge of genomic sequences has spawned the creation of tools capable of making sense of the data. Computational and high-throughput experimental methods for generating links between proteins have recently been emerging. These methods effectively act as hypothesis machines, allowing researchers to screen large sets of data to detect interesting patterns that can then be studied in greater detail. Although the potential use of these putative links in predicting gene function has been demonstrated, a central repository for all such links for many genomes would maximize their usefulness. Here we present Predictome, a database of predicted links between the proteins of 44 genomes based on the implementation of three computational methods (chromosomal proximity, phylogenetic profiling and domain fusion) and large-scale experimental screenings of protein–protein interaction data. The combination of data from various predictive methods in one database allows for their comparison with each other, as well as visualization of their correlation with known pathway information. As a repository for such data, Predictome is an ongoing resource for the community, providing functional relationships among proteins as new genomic data emerge. Predictome is available at http://predictome.bu.edu.

20.