Similar articles
1.
The rapid growth in protein structural data and the emergence of structural genomics projects have increased the need for automatic structure analysis and tools for function prediction. Small molecule recognition is critical to the function of many proteins; therefore, determination of ligand binding site similarity is important for understanding ligand interactions and may allow their functional classification. Here, we present a binding sites database (SitesBase) that, given a known protein-ligand binding site, allows rapid retrieval of other binding sites with similar structure independent of overall sequence or fold similarity. Each match is also annotated with sequence similarity and fold information to aid interpretation of structural and functional similarity. Similarity in ligand binding sites can indicate common binding modes and recognition of similar molecules, allowing potential inference of function for an uncharacterised protein or providing additional evidence of common function where sequence or fold similarity is already known. Alternatively, the resource can provide valuable information for detailed studies of molecular recognition, including structure-based ligand design and the understanding of ligand cross-reactivity. Here, we show examples of atomic similarity between superfamily or more distant fold relatives as well as between seemingly unrelated proteins. Assignment of unclassified proteins to structural superfamilies is also undertaken and in most cases substantiates assignments made using sequence similarity. Correct assignment is also possible where sequence similarity fails to find significant matches, illustrating the potential use of binding site comparisons for newly determined proteins.

2.
Zeng LC, Han ZG, Ma WJ. FEBS Letters, 2005, 579(25): 5443-5453
The categorization of genes by structural distinctions relevant to biological characteristics is very important for understanding gene functions and predicting the functional implications of uncharacterized genes. An effective and efficient strategy was needed to deal with the complexity of the large olfactomedin-like (OLF) gene family, whose members share sequence similarity but play diversified roles in many important biological processes, because simple highest-hit homology analysis gave incomplete results and led to inappropriate annotation of some uncharacterized OLF members. In light of evolutionary information that may facilitate the classification of the OLF family and the proper association of novel OLF genes with characterized homologs, we performed phylogenetic analysis on all 116 OLF proteins currently available, including two novel members cloned by our group. The OLF family segregated into seven subfamilies, and members with similar domain compositions or functional properties all fell into relevant subfamilies. Furthermore, our Northern blot analysis and previous studies revealed that the typical human OLF members in each subfamily exhibited tissue-specific expression patterns, which in turn supported the segregation of the OLF subfamilies with functional divergence. Interestingly, the phylogenetic tree topology for the OLF domains alone was almost identical to that of the full-length tree, representing the unique phylogenetic feature of full-length OLF proteins and their particular domain compositions. Moreover, each of the major functional domains of OLF proteins kept the same phylogenetic feature in defining a similar topology of the tree. This indicates that the OLF domain and the various domains in flanking non-OLF regions have coevolved and are likely to be functionally interdependent.
Expanded by a plausible gene duplication and domain coupling scenario, the OLF family comprises seven evolutionarily and functionally distinct subfamilies, in which each member shares similar structural and functional characteristics, including the composition of coevolved and interdependent domains. The phylogenetically classified and preliminarily assessed subfamily framework may greatly facilitate the study of OLF proteins. Furthermore, it also demonstrates a feasible and reliable strategy to categorize novel genes and predict the functional implications of uncharacterized proteins based on the comprehensive phylogenetic classification of the subfamilies and their relevance to preliminary functional characteristics.

3.
4.
5.
In genetic language a peculiar arrangement of biological information is provided by overlapping genes, in which the same region of DNA can code for functionally unrelated messages. In this work, the informational content of overlapping genes belonging to prokaryotic and eukaryotic viruses was analyzed. Using information theory indices, we identified in the regions of overlap a first pattern, exhibiting a more uniform base composition and more severe constraints in base ordering with respect to the nonoverlapping regions. This pattern was found to be peculiar to coliphage, avian hepatitis B virus, human lentivirus, and plant luteovirus families. A second pattern, characterized by the occurrence of similar compositional constraints in both types of coding regions, was found to be limited to plant tymoviruses. At the level of codon usage, a low degree of correlation between overlapping and nonoverlapping coding regions characterized the first pattern, whereas a close link was found in tymoviruses, indicating a fine adaptation of the overlapping frame to the original codon choice of the virus. As a result of codon usage correlation analysis, deductions concerning the origin and evolution of several overlapping frames were also proposed. Comparison of amino acid composition revealed an increased frequency of amino acid residues with a high level of degeneracy (arginine, leucine, and serine) in the proteins encoded by overlapping genes; this peculiar feature of overlapping genes can be viewed as a way in which they may expand their coding ability and gain new, specialized functions. Received: 28 October 1996 / Accepted: 29 January 1997
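
The abstract above does not say which information theory indices were used. As a minimal sketch, assuming Shannon entropy over k-mer frequencies is a reasonable stand-in, the following shows how base composition (k = 1) and base ordering (k = 2) could be quantified; the function name `kmer_entropy` is hypothetical and not taken from the paper.

```python
import math
from collections import Counter


def kmer_entropy(seq: str, k: int = 1) -> float:
    """Shannon entropy (in bits) of the overlapping k-mers in a sequence.

    Assumption: this is an illustrative index, not the one used in the paper.
    k = 1 measures uniformity of base composition (maximum 2 bits for DNA);
    k = 2 also reflects constraints on the ordering of adjacent bases.
    """
    seq = seq.upper()
    # Count every overlapping window of length k.
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts.values())
    if total == 0:
        return 0.0
    entropy = 0.0
    for n in counts.values():
        p = n / total
        entropy -= p * math.log2(p)
    return entropy
```

A more uniform base composition gives a k = 1 value closer to 2 bits, while stronger ordering constraints lower the k = 2 value relative to twice the k = 1 value.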

6.
Fibrillar collagens are the principal structural molecules of connective tissues. The assembly of collagen fibrils is regulated by quantitatively minor fibrillar collagens, types V and XI. This regulatory role has been attributed to a unique amino-terminal propeptide domain of these collagens. The structure of the amino-terminal propeptide has yet to be determined. Low sequence similarity necessitated a secondary structure-based method to carry out homology modeling based upon the determined structure of LNS family members, named for a common structure in the laminin LG5 domain, the neurexin 1B domain and the sex hormone binding globulin. Distribution of amino acids within the model suggested glycosaminoglycan interaction and calcium binding. These activities were tested experimentally. Sequence analyses of existing genes for collagens indicate that 16 known collagen alpha chains may contain an LNS domain. A similar approach may prove useful for structure/function studies of similar domains in other collagens. This will provide mechanistic details of the organization and assembly of the extracellular matrix and the underlying basis of structural integrity in connective tissues. The absolute requirement for collagen XI in skeletal growth is indicated by collagen XI deficiencies such as chondrodystrophies found in the cho/cho mouse and in humans with Stickler syndrome.

7.
Ecological Complexity, 2008, 5(2): 132-139
Understanding the processes underlying food-web structure and organization remains one of the major tasks of ecology. While first attempts were mostly based on niche theory, with body size of species imposing a hierarchical structure for consumer species, it has been recently suggested that phylogenetic constraints may be more fundamental to understanding who eats whom in natural communities. Models of food-web structure built on basic evolutionary assumptions are able to adequately reproduce the topology of real food-webs. Here, we analyze different implications of phylogenetic constraints on trophic structure, and present preliminary results. Our exploration of the relationship between trophic and taxonomic similarity in food-webs shows that phylogeny and trophic structure are closely linked. Interestingly, the relationship is stronger for trophic similarity between prey (similarity measured by shared predator species, or predatory similarity) than between consumer species (similarity measured by shared prey species, or dietary similarity). When relating body mass of prey and predators, slopes of major axis regressions within taxonomic groups differ markedly from the global pattern; similar differences between taxonomic levels appear when exploring the relationship between body mass of predators and the range in body mass of their prey, and vice versa. These results are important for understanding how evolutionary processes shaping body sizes can affect food-web structure.

8.
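Relationship between the hydrophobicity and evolution of cytochrome molecules   Total citations: 1 (self-citations: 1, citations by others: 0)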
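Building on our earlier results, this paper computes the hydrophobic similarity between the one-dimensional structures of cytochrome molecules, constructs the corresponding molecular phylogenetic tree, and discusses the evolutionary relationships among cytochrome molecules. The results show that studying the evolutionary relationships among molecules in terms of the hydrophobic similarity and nonlinear three-dimensional structure of proteins not only yields conclusions largely consistent with those obtained by other methods, but also overcomes, to some extent, the limitations of some of those methods and gives better results.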
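
The abstract above does not define its hydrophobic similarity measure. The sketch below is one possible reading, assuming two already aligned, equal-length sequences are compared by the Pearson correlation of their Kyte-Doolittle hydropathy profiles; the choice of scale, the correlation measure, and the function names are all assumptions made for illustration.

```python
import math

# Kyte-Doolittle hydropathy index per amino acid (one-letter codes).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(seq):
    """Map a protein sequence to its per-residue hydropathy values."""
    return [KD[res] for res in seq.upper()]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("inputs must have equal length of at least 2")
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("a constant profile has no defined correlation")
    return sxy / math.sqrt(sxx * syy)


def hydrophobic_similarity(seq_a, seq_b):
    """Hypothetical similarity score in [-1, 1] for two aligned sequences."""
    return pearson(hydropathy_profile(seq_a), hydropathy_profile(seq_b))
```

Converting such pairwise scores into distances (for example, 1 minus the score) would give a matrix that a standard clustering method could turn into a tree, though the abstract does not state how its tree was built.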

9.
The Parkinson disease gene LRRK2: evolutionary and structural insights   Total citations: 8 (self-citations: 0, citations by others: 8)
Mutations in the human leucine-rich repeat kinase 2 (LRRK2) gene are associated with both familial and sporadic Parkinson disease (PD). LRRK2 belongs to a gene family known as Roco. Roco genes encode large proteins with several protein domains. In particular, all Roco proteins have a characteristic GTPase domain, named Roc, plus a domain of unknown function called COR. In addition, LRRK2 and several other Roco proteins also contain a protein kinase domain. In this study, I use a combination of phylogenetic and structural analyses of the COR, Roc, and kinase domains present in Roco proteins to describe the origin and evolutionary history of LRRK2. Phylogenetic analyses using these domains demonstrate that LRRK2 emerged from a duplication that occurred after the protostome-deuterostome split. The duplication was followed by the acquisition by LRRK2 proteins of a specific type of N-terminal repeat, described here for the first time. This repeat is absent in the proteins encoded by the paralogs of LRRK2, called LRRK1, and in protostome LRRK proteins. These results suggest that Drosophila or Caenorhabditis LRRK genes may not be good models for understanding human LRRK2 function. Genes in the slime mold Dictyostelium discoideum with structures very similar to those found in animal LRRK genes, including the protein kinase domain, have been described. However, phylogenetic analyses suggest that this structural similarity is due to independent acquisitions of distantly related protein kinase domains. Finally, I confirm in an extensive sequence analysis that the Roc GTPase domain is related to, but still substantially different from, small GTPases such as Rab, Ras, or Rho. Modeling based on known kinase structures suggests that mutations in LRRK2 that cause familial PD may alter the local 3-dimensional folding of the LRRK2 protein without affecting its overall structure.

10.
The recently described inhibitor of cysteine proteinases from Trypanosoma cruzi, chagasin, was found to have close homologs in several eukaryotes, bacteria and archaea, the first protein inhibitors of cysteine proteases in prokaryotes. These previously uncharacterized 110-130 residue-long proteins share a well-conserved sequence motif that corresponds to two adjacent beta-strands and the short loop connecting them. Chagasin-like proteins also have other conserved, mostly aromatic, residues, and share the same predicted secondary structure. These proteins adopt an all-beta fold with eight predicted beta-strands of the immunoglobulin type. The phylogenetic distribution of the chagasins generally correlates with the presence of papain-like cysteine proteases. Previous studies have uncovered similar trends in cysteine proteinase binding by two unrelated inhibitors, stefin and p41, that belong to the cystatin and thyroglobulin families, respectively. A hypothetical model of chagasin-cruzipain interaction suggests that chagasin may dock to the cruzipain active site in a similar manner, with the conserved NPTTG motif of chagasin forming a loop that is similar to the wedge structures formed at the active sites of papain and cathepsin L by stefin and p41.

11.
Co-evolution and co-adaptation in protein networks   Total citations: 2 (self-citations: 0, citations by others: 2)
Juan D, Pazos F, Valencia A. FEBS Letters, 2008, 582(8): 1225-1230
Interacting or functionally related proteins have been repeatedly shown to have similar phylogenetic trees. Two main hypotheses have been proposed to explain this fact. One involves compensatory changes between the two protein families (co-adaptation). The other states that the tree similarity may be an indirect consequence of the involvement of the two proteins in similar cellular processes, which in turn would be reflected by similar evolutionary pressure on the corresponding sequences. There are published data supporting both propositions, and the currently available information is compatible with both hypotheses being true, in a scenario in which both sets of forces shape the tree similarity at different levels.
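
The abstract above does not specify how tree similarity is measured. One widely used proxy, assumed here for illustration, is the linear correlation between the pairwise evolutionary distance matrices of two protein families over the same ordered set of species. The sketch below takes the matrices as given; the function names are hypothetical.

```python
import math


def upper_triangle(matrix):
    """Flatten the strict upper triangle of a square distance matrix."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i + 1, n)]


def tree_similarity(dist_a, dist_b):
    """Pearson correlation between two distance matrices.

    Assumption: both matrices are square, symmetric, and list the same
    species in the same order. Values near 1 suggest similar trees.
    """
    x, y = upper_triangle(dist_a), upper_triangle(dist_b)
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("matrices must match and cover at least 3 species")
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("constant distances have no defined correlation")
    return sxy / math.sqrt(sxx * syy)
```

A high score alone cannot separate the two hypotheses, since shared evolutionary pressure and direct co-adaptation would both raise the correlation.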

12.
In this work, we report likely recurrent horizontal (lateral) gene transfer events of genes encoding pore-forming toxins of the aerolysin family between species belonging to different kingdoms of life. Clustering based on pairwise similarity and phylogenetic analysis revealed several distinct aerolysin sequence groups, each containing proteins from multiple kingdoms of life. These results strongly support at least six independent transfer events between distantly related phyla in the evolutionary history of one protein family and discount selective retention of ancestral genes as a plausible explanation for this patchy phylogenetic distribution. We discuss the possible roles of these proteins and show evidence for a convergent new function in two extant species. We hypothesize that certain gene families are more likely to be maintained following horizontal gene transfer from a commensal or pathogenic organism to its host if they 1) can function alone; and 2) are immediately beneficial for the ecology of the organism, as in the case of pore-forming toxins, which can be utilized in multicellular organisms for defense and predation.

13.
Lipocalins constitute a superfamily of extracellular proteins that are found in all three kingdoms of life. Although very divergent in their sequences and functions, they show remarkable similarity in 3-D structures. Lipocalins bind and transport small hydrophobic molecules. Earlier sequence-based phylogenetic studies of lipocalins highlighted that they have a long evolutionary history. However, the molecular and structural basis of their functional diversity is not completely understood. The main objective of the present study is to understand the functional diversity of the lipocalins using a structure-based phylogenetic approach. The present study, with 39 protein domains from the lipocalin superfamily, suggests that the clusters of lipocalins obtained by structure-based phylogeny correspond well with the functional diversity. Detailed analysis of each of the clusters and sub-clusters reveals that the 39 lipocalin domains cluster according to their mode of ligand binding, even though the clustering was performed on the basis of gross domain structure. The outliers in the phylogenetic tree are often from single-member families. The structure-based phylogenetic approach has also provided pointers for assigning putative functions to the domains of unknown function in the lipocalin family. The approach employed in the present study can be used in the future for the functional identification of new lipocalin proteins and may be extended to other protein families whose members show poor sequence similarity but high structural similarity.

14.
15.
Aromatic prenyltransferases transfer prenyl moieties onto aromatic acceptor molecules, catalyzing an electrophilic substitution of the aromatic ring with formation of carbon–carbon bonds. They give rise to an astounding diversity of primary and secondary metabolites in plants, fungi and bacteria. This review describes a recently discovered family of aromatic prenyltransferases. The structure of these enzymes shows a type of β/α fold with antiparallel β strands. Due to the α-β-β-α architecture of this fold, this group of enzymes was designated as ABBA prenyltransferases. They lack the (N/D)DxxD motif which is characteristic of many other prenyltransferases. At present, 14 genes with sequence similarity to ABBA prenyltransferases can be identified in the database. A phylogenetic analysis of these genes separates them into two clades. One of them comprises the 4-hydroxyphenylpyruvate 3-dimethylallyltransferases CloQ and NovQ involved in aminocoumarin antibiotic biosynthesis in Streptomyces strains, as well as four genes of unknown function from fungal genomes. The other clade comprises genes involved in the biosynthesis of prenylated naphthoquinones and prenylated phenazines in different streptomycetes. ABBA prenyltransferases are soluble biocatalysts which can easily be obtained as homogeneous proteins in significant amounts. Their substrates are accommodated in a surprisingly spacious central cavity, which explains their promiscuity for different aromatic substrates. Therefore, the enzymes of this family represent attractive tools for the chemoenzymatic synthesis of bioactive molecules.

16.
Zhao N, Pang B, Shyu CR, Korkin D. PLoS ONE, 2011, 6(5): e19554
Interactions between proteins play a key role in many cellular processes. Studying protein-protein interactions that share similar interaction interfaces may shed light on their evolution and could be helpful in elucidating the mechanisms behind the stability and dynamics of protein complexes. When two complexes share structurally similar subunits, the similarity of the interaction interfaces can be found through a structural superposition of the subunits. However, accurate detection of similarity between protein complexes containing subunits of unrelated structure remains an open problem. Here, we present an alignment-free machine learning approach to measure interface similarity. The approach relies on a feature-based representation of protein interfaces and does not depend on the superposition of the interacting subunit pairs. Specifically, we develop an SVM classifier of similar and dissimilar interfaces and derive a feature-based interface similarity measure. Next, the similarity measure is applied to a set of 2,806×2,806 binary complex pairs to build a hierarchical classification of protein-protein interactions. Finally, we explore case studies of similar interfaces from each level of the hierarchy, considering cases when the subunits forming interactions are either homologous or structurally unrelated. The analysis suggests that the positions of charged residues in homologous interfaces are not necessarily conserved and may exhibit more complex conservation patterns.

17.
Sinorhizobium meliloti strain 1021, a nitrogen-fixing, root-nodulating bacterial microsymbiont of alfalfa, has a 3.5 Mbp circular chromosome and two megaplasmids, including the 1.3 Mbp pSymA carrying nonessential 'accessory' genes for nitrogen fixation (nif), nodulation and host specificity (nod). A related bacterium, psyllid-vectored 'Ca. Liberibacter asiaticus,' is an obligate phytopathogen with a reduced genome that was previously analyzed for genes orthologous to genes on the S. meliloti circular chromosome. In general, proteins encoded by pSymA genes are more similar in sequence alignment to those encoded by S. meliloti chromosomal orthologs than to orthologous proteins encoded by genes carried on the 'Ca. Liberibacter asiaticus' genome. Only two 'Ca. Liberibacter asiaticus' proteins were identified as having orthologous proteins encoded on pSymA but not also encoded on the chromosome of S. meliloti. These two orthologous gene pairs encode a Na+/K+ antiporter (shared with intracellular pathogens of the family Bartonellaceae) and a Co++, Zn++ and Cd++ cation efflux protein that is shared with the phytopathogen Agrobacterium. Another shared protein, a redox-regulated K+ efflux pump, may regulate cytoplasmic pH and homeostasis. The pSymA and 'Ca. Liberibacter asiaticus' orthologs of the latter protein are more similar in amino acid alignment than the pSymA-encoded protein is to its S. meliloti chromosomal homolog. About 182 pSymA-encoded proteins have sequence similarity (E-value ≤ 1E-10) with 'Ca. Liberibacter asiaticus' proteins, often present as multiple orthologs of single 'Ca. Liberibacter asiaticus' proteins. These proteins are involved in amino acid uptake, cell surface structure, chaperonins, electron transport, export of bioactive molecules, cellular homeostasis, regulation of gene expression, signal transduction and synthesis of amino acids and metabolic cofactors.
The presence of multiple orthologs defies mutational analysis and is consistent with the hypothesis that these proteins may be of particular importance in host/microbe interaction and that their duplication likely facilitates their ongoing evolution.

18.
Kinch LN, Grishin NV. Proteins, 2002, 48(1): 75-84
Nitrogen regulatory (PII) proteins are signal transduction molecules involved in controlling nitrogen metabolism in prokaryotes. PII proteins integrate the signals of intracellular nitrogen and carbon status into the control of enzymes involved in nitrogen assimilation. Using elaborate sequence similarity detection schemes, we show that five clusters of orthologs (COGs) and several small divergent protein groups belong to the PII superfamily and predict their structure to be a (βαβ)2 ferredoxin-like fold. Proteins from the newly emerged PII superfamily are present in all major phylogenetic lineages. The PII homologs are quite diverse, with below random (as low as 1%) pairwise sequence identities between some members of distant groups. Despite this sequence diversity, evidence suggests that the different subfamilies retain the PII trimeric structure important for ligand-binding site formation and maintain a conservation of conservations at residue positions important for PII function. Because most of the orthologous groups within the PII superfamily are composed entirely of hypothetical proteins, our remote homology-based structure prediction provides the only information about them. Analogous to structural genomics efforts, such prediction gives clues to the biological roles of these proteins and allows us to hypothesize about locations of functional sites on model structures or rationalize available experimental information. For instance, conserved residues in one of the families map in close proximity to each other on the PII structure, allowing for a possible metal-binding site in the proteins coded by the locus known to affect sensitivity to divalent metal ions. The presented analysis pushes the limits of sequence similarity searches and exemplifies one of the extreme cases of reliable sequence-based structure prediction.
In conjunction with structural genomics efforts to shed light on protein function, our strategies make it possible to detect homology between highly diverse sequences and are aimed at understanding the most remote evolutionary connections in the protein world.

19.
Genome-level information coupled with phylogenetic analysis of specific genes and gene families allows for a better understanding of the structure and function of their protein products. In this study, we examine the mammalian uroplakins (UPs) Ia and Ib, members of the tetraspanin superfamily that interact with uroplakins UPII and UPIIIa/IIIb, respectively, using a phylogenetic analysis of these genes from whole genome sequences. These proteins interact to form urothelial plaques that play a central role in the permeability barrier function of the apical urothelial surface of the urinary bladder. Since these plaques are found exclusively in mammalian urothelium, it is enigmatic that UP-like genomic sequences were recently found in lower vertebrates without a typical urothelium. We have cloned full-length UP-related cDNAs from frog (Xenopus laevis), chicken (Gallus gallus), and zebrafish (Danio rerio), and combined these data with sequence information from their orthologs in all the available fully sequenced and annotated animal genomes. Phylogenetic analyses of all the available uroplakin sequences, and an understanding of their distribution in several animal taxa, suggest that: (i) the UPIa/UPIb and UPII/UPIII genes evolved by gene duplication in the common ancestor of vertebrates; (ii) uroplakins can be lost in different combinations in vertebrate lineages; and (iii) there is a strong co-evolutionary relationship between UPIa and UPIb and their partners UPII and UPIIIa/IIIb, respectively. The co-evolution of the tetraspanin UPs and their associated proteins may fine-tune the structure and function of uroplakin complexes, enabling them to perform diverse species- and tissue-specific functions. The structure and function of uroplakins, which are also expressed in Xenopus kidney, oocytes and fat body, are much more versatile than hitherto appreciated.

20.
The collagens constitute a large family of extracellular matrix components primarily responsible for maintaining the structure and biological integrity of connective tissue. These proteins exhibit considerable diversity in size, sequence, tissue distribution, and molecular composition. Fourteen types of homo- and/or heterotrimeric molecules, thus far reported, are encoded by a minimum of 27 genes. Nineteen of these genes, including several that are closely linked, have been assigned to 10 separate autosomes, and one collagen gene has been mapped to the X chromosome. We have isolated a 2.1-kb human cDNA clone coding for a collagen molecule different in sequence and structure from types I-XIV collagens. This polypeptide has been designated the alpha 1 chain of type XV collagen. To determine the location of the corresponding gene, the cDNA clone was hybridized to rodent-human hybrid DNAs and to human metaphase chromosomes. The results obtained using the hybrid cell lines showed that this newly identified collagen gene, COL15A1, is present in the pter→q34 region of chromosome 9. In situ hybridization allowed sublocalization to 9q21→q22, a region to which no other collagen genes had previously been assigned. Our data further demonstrate the complex arrangement of the many collagen genes in the human genome.
