Similar literature
 Found 20 similar documents; search time: 31 ms
1.
Understanding the evolution of a protein, including both close and distant relationships, often reveals insight into its structure and function. Fast and easy access to such up-to-date information facilitates research. We have developed a hierarchical evolutionary classification of all proteins with experimentally determined spatial structures, and presented it as an interactive and updatable online database. ECOD (Evolutionary Classification of protein Domains) is distinct from other structural classifications in that it groups domains primarily by evolutionary relationships (homology), rather than topology (or “fold”). This distinction highlights cases of homology between domains of differing topology to aid in understanding of protein structure evolution. ECOD uniquely emphasizes distantly related homologs that are difficult to detect, and thus catalogs the largest number of evolutionary links among structural domain classifications. Placing distant homologs together underscores the ancestral similarities of these proteins and draws attention to the most important regions of sequence and structure, as well as conserved functional sites. ECOD also recognizes closer sequence-based relationships between protein domains. Currently, approximately 100,000 protein structures are classified in ECOD into 9,000 sequence families clustered into close to 2,000 evolutionary groups. The classification is assisted by an automated pipeline that quickly and consistently classifies weekly releases of PDB structures and allows for continual updates. This synchronization with PDB uniquely distinguishes ECOD among all protein classifications. Finally, we present several case studies of homologous proteins not recorded in other classifications, illustrating how ECOD can be used to further biological and evolutionary studies.

2.
The complete complement of C1q-domain-containing proteins in Homo sapiens   (Total citations: 2; self-citations: 0; citations by others: 2)
The C-terminal domains of the A, B, and C chains of the C1q subcomponent of the C1 complex represent a common structural motif, the C1q domain, that is found in a diverse range of proteins. We analyzed the human genome for the complete complement of this family and have identified a total of 31 independent gene sequences. The predominant organization of C1q-domain-containing (C1qDC) proteins includes a leading signal peptide, a collagen-like region of variable length, and a C-terminal C1q domain. There are 15 highly conserved residues within the C1q domain, among which 8 are invariant within the human gene set and these are predicted to cluster within the hydrophobic core of the protein. We suggest a 3-subfamily classification based on sequence homology. For some C1qDC-encoding genes, strict orthology has been retained throughout vertebrate evolution and these examples suggest a highly specific functional role for C1qDC proteins that has been under significant selective pressure. Alternatively, individual species have co-opted C1qDC proteins for roles that are highly specific to their biology, suggesting an evolutionary strategy of gene duplication and functional diversification. A more extensive analysis of the evolutionary relationship of C1qDC proteins reveals an ancient rooting, with clear members found in eubacterial species. Curiously, we have been unable to identify C1qDC-encoding genes in many eukaryotic genomes, such as Saccharomyces cerevisiae and C. elegans, suggesting that the retention or loss of this gene family throughout evolution has been sporadic.

3.
Alternative RNA splicing in multicellular organisms is regulated by a large group of proteins of mainly unknown origin. To predict the functions of these proteins, classification of their domains at the sequence and structural level is necessary. We have focused on four groups of splicing regulators, the heterogeneous nuclear ribonucleoprotein (hnRNP), serine/arginine (SR), embryonic lethal, abnormal vision (ELAV)-like, and CUG-BP and ETR-like factor (CELF) proteins, that show increasing diversity among metazoa. Sequence and phylogenetic analyses were used to obtain a broader understanding of their evolutionary relationships. Surprisingly, when we characterised sequence similarities across full-length sequences and conserved domains of ten metazoan species, we found some hnRNPs were more closely related to SR, ELAV-like and CELF proteins than to other hnRNPs. Phylogenetic analyses and the distribution of the RRM domains suggest that these proteins diversified before the last common ancestor of the metazoans studied here through domain acquisition and duplication to create genes of mixed evolutionary origin. We propose that these proteins were derived independently rather than through the expansion of a single protein family. Our results highlight inconsistencies in the current classification system for these regulators, which does not adequately reflect their evolutionary relationships, and suggest that a domain-based classification scheme may have more utility.

4.
L-myo-Inositol 1-phosphate synthase (MIPS, EC 5.5.1.4), the key enzyme in the inositol and phosphoinositide biosynthetic pathway, is present throughout evolutionarily diverse organisms and is considered an ancient protein/gene. Analysis by multiple sequence alignment, phylogenetic tree generation and comparison of newly determined crystal structures provides new insight into the origin and evolutionary relationships among the various MIPS proteins/genes. The evolution of the MIPS protein/gene among the prokaryotes seems more diverse and complex than amongst the eukaryotes. However, conservation of a 'core catalytic structure' among the MIPS proteins implies an essential function of the enzyme in cellular metabolism throughout the biological kingdom.

5.
Beta-barrel proteins are the main transit points across the mitochondrial outer membrane. Mitochondrial porin, the voltage-dependent, anion-selective channel (VDAC), is responsible for the passage of small molecules between the mitochondrion and the cytosol. Through interactions with other mitochondrial and cellular proteins, it is involved in regulating organellar and cellular metabolism and likely contributes to mitochondrial structure. Tom40 is part of the translocase of the outer membrane, and acts as the channel for passage of preproteins during their import into the organelle. These proteins appear to share a common evolutionary origin and structure. In the current study, the evolutionary relationships between and within both proteins were investigated through phylogenetic analysis. The two groups have a common origin and have followed independent, complex evolutionary pathways, leading to the generation of paralogues in animals and plants. Structures of diverse representatives were modeled, revealing common themes rather than sites of high identity in both groups. Within each group, intramolecular coevolution was assessed, revealing a new set of sites potentially involved in structure-function relationships in these molecules. A weak link between Tom40 and proteins related to the mitochondrial distribution and morphology protein, Mdm10, was identified. This article is part of a Special Issue entitled: VDAC structure, function, and regulation of mitochondrial metabolism.

6.
The organelle paralogy hypothesis is one model for the acquisition of nonendosymbiotic organelles, generated from molecular evolutionary analyses of proteins encoding specificity in the membrane traffic system. GTPase activating proteins (GAPs) for the ADP‐ribosylation factor (Arf) GTPases are additional regulators of the kinetics and fidelity of membrane traffic. Here we describe molecular evolutionary analyses of the Arf GAP protein family. Of the 10 subfamilies previously defined in humans, we find that 5 were likely present in the last eukaryotic common ancestor. Of the 3 most recently derived subfamilies, 1 was likely present in the ancestor of opisthokonts (animals and fungi) and apusomonads (flagellates classified as the sister lineage to opisthokonts), while 2 arose in the holozoan lineage. We also propose a novel ancient subfamily (ArfGAPC2), present in diverse eukaryotes but frequently lost, including in the opisthokonts. Surprisingly few ancient domains accompanying the ArfGAP domain were identified, in marked contrast to the extensively decorated human Arf GAPs. Phylogenetic analyses of the subfamilies reveal patterns of single and multiple gene duplications specific to the Holozoa, to some degree mirroring evolution of Arf GAP targets, the Arfs. Conservation, and lack thereof, of various residues in the ArfGAP structure provides contextualization of previously identified functional amino acids and their application to Arf GAP biology in general. Overall, our results yield insights into current Arf GAP biology, reveal complexity in the ancient eukaryotic ancestor and integrate the Arf GAP family into a proposed mechanism for the evolution of nonendosymbiotic organelles.

7.
The furanosidase superfamily contains the GH32, GH43, GH62, GH68, GH117, DUF377 (GH130), and DUF1861 families of glycoside hydrolases and their homologues. Catalytic domains of these families have a five-bladed β-propeller tertiary structure. Iterative screening of the protein database supports their relationship as well as evolutionary connections with domains from the GH33 and GH93 families of glycoside hydrolases. The latter two have a six-bladed β-propeller structure. Among the detected homologues we found 441 unclassified proteins. These proteins are combined into 39 groups based on homology: FURAN1-FURAN39. FURAN8 and FURAN36 can be considered as separate subfamilies within the GH43 and GH32 families of glycoside hydrolases, respectively. The remaining 37 groups are new families of hypothetical glycoside hydrolases.

8.
Sulfur-oxidizing chemoautotrophic (thioautotrophic) bacteria are now known to occur as endosymbionts in phylogenetically diverse bivalve hosts found in a wide variety of marine environments. The evolutionary origins of these symbioses, however, have remained obscure. Comparative 16S rRNA sequence analysis was used to investigate whether thioautotrophic endosymbionts are monophyletic or polyphyletic in origin and to assess whether phylogenetic relationships inferred among these symbionts reflect those inferred among their hosts. 16S rRNA gene sequences determined for endosymbionts from nine newly examined bivalve species from three families (Vesicomyidae, Lucinidae, and Solemyidae) were compared with previously published 16S rRNA sequences of thioautotrophic symbionts and free-living bacteria. Distance and parsimony methods were used to infer phylogenetic relationships among these bacteria. All newly examined symbionts fall within the gamma subdivision of the Proteobacteria, in clusters containing previously examined symbiotic thioautotrophs. The closest free-living relatives of these symbionts are bacteria of the genus Thiomicrospira. Symbionts of the bivalve superfamily Lucinacea and the family Vesicomyidae each form distinct monophyletic lineages which are strongly supported by bootstrap analysis, demonstrating that host phylogenies inferred from morphological and fossil evidence are congruent with phylogenies inferred for their respective symbionts by molecular sequence analysis. The observed congruence between host and symbiont phylogenies indicates shared evolutionary history of hosts and symbiont lineages and suggests an ancient origin for these symbioses. Correspondence to: D.L. Distel

9.
Many dissimilar protein sequences fold into similar structures. A central and persistent challenge facing protein structural analysis is the discrimination between homology and convergence for structurally similar domains that lack significant sequence similarity. Classic examples are the OB-fold and SH3 domains, both small, modular beta-barrel protein superfolds. The similarities among these domains have variously been attributed to common descent or to convergent evolution. Using a sequence profile-based phylogenetic technique, we analyzed all structurally characterized OB-fold, SH3, and PDZ domains with less than 40% mutual sequence identity. An all-against-all, profile-versus-profile analysis of these domains revealed many previously undetectable significant interrelationships. The matrices of scores were used to infer phylogenies based on our derivation of the relationships between sequence similarity E-values and evolutionary distances. The resulting clades of domains correlate remarkably well with biological function, as opposed to structural similarity, indicating that the functionally distinct sub-families within these superfolds are homologous. This method extends phylogenetics into the challenging "twilight zone" of sequence similarity, providing the first objective resolution of deep evolutionary relationships among distant protein families.

10.
The current classification of parvoviruses is based on virus host range and helper virus dependence, while little data on evolutionary relationships among viruses are available. We identified and analyzed 472 sequences of parvoviruses, among which there were (virtually) full-length genomes of all 41 viruses currently recognized as individual species within the family Parvoviridae. Our phylogenetic analysis of full-length genomes as well as open reading frames distinguished three evolutionary groups of parvoviruses from vertebrates: (i) the human helper-dependent adeno-associated virus (AAV) serotypes 1 to 6 and the autonomous avian parvoviruses; (ii) the bovine, chipmunk, and autonomous primate parvoviruses, including human viruses B19 and V9; and (iii) the parvoviruses from rodents (except for chipmunks), carnivores, and pigs. Each of these three evolutionary groups could be further subdivided, reflecting both virus-host coevolution and multiple cross-species transmissions in the evolutionary history of parvoviruses. No parvoviruses from invertebrates clustered with vertebrate parvoviruses. Our analysis provided evidence for negative selection among parvoviruses, the independent evolution of their genes, and recombination among parvoviruses from rodents. The topology of the phylogenetic tree of autonomous human and simian parvoviruses matched exactly the topology of the primate family tree, as based on the analysis of primate mitochondrial DNA. Viruses belonging to the AAV group were not evolutionarily linked to other primate parvoviruses but were linked to the parvoviruses of birds. The two lineages of human parvoviruses may have resulted from independent ancient zoonotic infections. Our results provide an argument for reclassification of Parvovirinae based on evolutionary relationships among viruses.

11.
The ubiquitin-dependent protein degradation pathway plays diverse roles in eukaryotes. Previous studies indicate that both F-box and Kelch motifs are common in a variety of organisms. F-box proteins are subunits of E3 ubiquitin ligase complexes called SCFs (SKP1, Cullin1, F-box protein, and Rbx1); they have an N-terminal F-box motif that binds to SKP1 (S-phase kinase associated protein), and often have C-terminal protein-protein interaction domains, which specify the protein substrates for degradation via the ubiquitin pathway. One of the most frequently found protein interaction domains in F-box proteins is the Kelch repeat domain. Although both the F-box and Kelch repeats are ancient motifs, Kelch repeat-containing F-box proteins (KFBs) have previously been reported only for human and Arabidopsis. The recent sequencing of the rice genome and other plant genomes provides an opportunity to examine the possible evolutionary history of KFBs. We carried out extensive BLAST searches to identify putative KFBs in selected organisms, and analyzed their relationships phylogenetically. We also analyzed both gene duplication and gene expression of the KFBs in rice and Arabidopsis. Our study indicates that KFBs originated before the divergence of animals and plants, and that plant KFBs underwent rapid gene duplications.

12.
The evolutionary relationships of proteobacteria, which comprise the largest and phenotypically most diverse division among prokaryotes, are examined based on the analyses of available molecular sequence data. Sequence alignments of different proteins have led to the identification of numerous conserved inserts and deletions (referred to as signature sequences), which either are unique characteristics of various proteobacterial species or are shared by only members from certain subdivisions of proteobacteria. These signature sequences provide molecular means to define the proteobacterial phyla and their various subdivisions and to understand their evolutionary relationships to the other groups of eubacteria as well as the eukaryotes. Based on signature sequences that are present in different proteins it is now possible to infer that the various eubacterial phyla evolved from a common ancestor in the following order: low-G+C Gram-positive --> high-G+C Gram-positive --> Deinococcus-Thermus (green nonsulfur bacteria) --> cyanobacteria --> Spirochetes --> Chlamydia-Cytophaga-Aquifex-green sulfur bacteria --> Proteobacteria-1 (epsilon and delta) --> Proteobacteria-2 (alpha) --> Proteobacteria-3 (beta) --> Proteobacteria-4 (gamma). An unexpected but important aspect of the relationship deduced here is that the main eubacterial phyla are related to each other linearly rather than in a tree-like manner, suggesting that the major evolutionary changes within Bacteria have taken place in a directional manner. The identified signatures permit placement of prokaryotes into different groups/divisions and could be used for determinative purposes. These signatures generally support the origin of mitochondria from an alpha-proteobacterium and provide evidence that the nuclear cytosolic homologs of many genes are also derived from proteobacteria.

13.
During the past decade, ancient gene duplications were recognized as one of the main forces in the generation of diverse gene families and the creation of new functional capabilities. New tools developed to search data banks for homologous sequences and an increased availability of reliable three-dimensional structural information led to the recognition that proteins with diverse functions can belong to the same superfamily. Analyses of the evolution of these superfamilies promise to provide insights into early evolution but are complicated by several important evolutionary processes. Horizontal transfer of genes can lead to a vertical spread of innovations among organisms; therefore, finding a certain property in some descendants of an ancestor does not guarantee that it was present in that ancestor. Complete or partial gene conversion between duplicated genes can yield phylogenetic trees with several, apparently independent gene duplications, suggesting an often surprising parallelism in the evolution of independent lineages. Additionally, the breakup of domains within a protein and the fusion of domains into multifunctional proteins make the delineation of superfamilies a task that remains difficult to automate.

14.
Protein sequence and structure comparisons show that the catalytic domains of Class I aminoacyl-tRNA synthetases, a related family of nucleotidyltransferases involved primarily in coenzyme biosynthesis, nucleotide-binding domains related to the UspA protein (USPA domains), photolyases, electron transport flavoproteins, and PP-loop-containing ATPases together comprise a distinct class of alpha/beta domains designated the HUP domain after HIGH-signature proteins, UspA, and PP-ATPase. Several lines of evidence are presented to support the monophyly of the HUP domains, to the exclusion of other three-layered alpha/beta folds with the generic "Rossmann-like" topology. Cladistic analysis, with patterns of structural and sequence similarity used as discrete characters, identified three major evolutionary lineages within the HUP domain class: the PP-ATPases; the HIGH superfamily, which includes class I aaRS and related nucleotidyltransferases containing the HIGH signature in their nucleotide-binding loop; and a previously unrecognized USPA-like group, which includes USPA domains, electron transport flavoproteins, and photolyases. Examination of the patterns of phyletic distribution of distinct families within these three major lineages suggests that the Last Universal Common Ancestor of all modern life forms encoded 15-18 distinct alpha/beta ATPases and nucleotide-binding proteins of the HUP class. This points to an extensive radiation of HUP domains before the last universal common ancestor (LUCA), during which the multiple class I aminoacyl-tRNA synthetases emerged only at a late stage. Thus, substantial evolutionary diversification of protein domains occurred well before the modern version of the protein-dependent translation machinery was established, i.e., still in the RNA world.

15.
Viruses are the most abundant life form and infect practically all organisms. Consequently, these obligate parasites are a major cause of human suffering and economic loss. The Rossmann‐like fold is the most populated fold among α/β‐folds in the Protein Data Bank, and proteins containing a Rossmann‐like fold constitute 22% of all known protein 3D structures. Thus, analysis of viral proteins containing Rossmann‐like domains could provide an understanding of viral biology and evolution and could suggest possible targets for antiviral therapy. We provide functional and evolutionary analysis of viral proteins containing a Rossmann‐like fold found in the evolutionary classification of protein domains (ECOD) database developed in our lab. We identified 81 protein families of bacterial, archaeal, and eukaryotic viruses in light of their evolution‐based ECOD classification and Pfam taxonomy. We defined their functional significance using enzymatic EC number assignments as well as domain‐level family annotations.

16.
The PIN-domain toxin-antitoxin array in mycobacteria   (Total citations: 3; self-citations: 0; citations by others: 3)
PIN-domains (homologues of the pilT N-terminal domain) are small protein domains of approximately 140 amino acids. They are found in a diverse range of organisms, and recent evidence from bioinformatics, biochemistry, structural biology and microbiology suggests that the majority of the prokaryotic PIN-domain proteins are the toxic components of toxin-antitoxin (TA) operons. Several microorganisms have a large cohort of these operons. For example, the genome of Mycobacterium tuberculosis encodes 48 PIN-domain proteins, of which 38 are thought to be involved in TA interactions. This large array of PIN-domain TA operons raises questions as to their evolutionary origin and contemporary functional significance. We suggest that the evolutionary origin of genes encoding mycobacterial PIN-domain TA operons is linked to the mobile gene pool, but that TA operons can become resident within the chromosome of host cells from where they might be recruited to fulfil a variety of roles associated with retardation of cell growth and persistence in stressful environments.

17.
Reconstructing the evolution of the mitochondrial ribosomal proteome   (Total citations: 4; self-citations: 1; citations by others: 3)
For production of proteins that are encoded by the mitochondrial genome, mitochondria rely on their own mitochondrial translation system, with the mitoribosome as its central component. Using extensive homology searches, we have reconstructed the evolutionary history of the mitoribosomal proteome that is encoded by a diverse subset of eukaryotic genomes, revealing an ancestral ribosome of alpha-proteobacterial descent that more than doubled its protein content in most eukaryotic lineages. We observe large variations in the protein content of mitoribosomes between different eukaryotes, with mammalian mitoribosomes sharing only 74 and 43% of their proteins with yeast and Leishmania mitoribosomes, respectively. We detected many previously unidentified mitochondrial ribosomal proteins (MRPs) and found that several have increased in size compared to their bacterial ancestral counterparts by addition of functional domains. Several new MRPs have originated via duplication of existing MRPs as well as by recruitment from outside of the mitoribosomal proteome. Using sensitive profile–profile homology searches, we found hitherto undetected homology between bacterial and eukaryotic ribosomal proteins, as well as between fungal and mammalian ribosomal proteins, detecting two novel human MRPs. These newly detected MRPs constitute, along with evolutionarily conserved MRPs, excellent new screening targets for human patients with unresolved mitochondrial oxidative phosphorylation disorders.

18.
Studies of human mitochondrial (mt) DNA genomes demonstrate that the root of the human phylogenetic tree occurs in Africa. Although 2 mtDNA lineages with an African origin (haplogroups M and N) were the progenitors of all non-African haplogroups, macrohaplogroup L (including haplogroups L0-L6) is limited to sub-Saharan Africa. Several L haplogroup lineages occur most frequently in eastern Africa (e.g., L0a, L0f, L5, and L3g), but some are specific to certain ethnic groups, such as haplogroup lineages L0d and L0k that previously have been found nearly exclusively among southern African "click" speakers. Few studies have included multiple mtDNA genome samples belonging to haplogroups that occur in eastern and southern Africa but are rare or absent elsewhere. This lack of sampling in eastern Africa makes it difficult to infer relationships among mtDNA haplogroups or to examine events that occurred early in human history. We sequenced 62 complete mtDNA genomes of ethnically diverse Tanzanians, southern African Khoisan speakers, and Bakola Pygmies and compared them with a global pool of 226 mtDNA genomes. From these, we infer phylogenetic relationships amongst mtDNA haplogroups and estimate the time to most recent common ancestor (TMRCA) for haplogroup lineages. These data suggest that Tanzanians have high genetic diversity and possess ancient mtDNA haplogroups, some of which are either rare (L0d and L5) or absent (L0f) in other regions of Africa. We propose that a large and diverse human population has persisted in eastern Africa and that eastern Africa may have been an ancient source of dispersion of modern humans both within and outside of Africa.

19.
We present evidence of remarkable genome-wide mobility and evolutionary expansion for a class of protein domains whose borders locate close to the borders of their encoding exons. These exon-bordering domains are more numerous and widely distributed in the human genome than other domains. They also co-occur with more diverse domains to form a larger variety of domain architectures in human proteins. A systematic comparison of nine animal genomes from nematodes to mammals revealed that exon-bordering domains expanded faster than other protein domains in both abundance and distribution, as well as the diversity of co-occurring domains and the domain architectures of harboring proteins. Furthermore, exon-bordering domains exhibited a particularly strong preference for class 1-1 intron phase. Our findings suggest that exon-bordering domains were amplified and interchanged within a genome more often and/or more successfully than other domains during evolution, probably the result of extensive exon shuffling and gene duplication events. The diverse biological functions of these domains underscore the important role they play in the expansion and diversification of animal proteomes.

20.
Summary. We have found ragweed allergen Ra3 to be related to the type 1 copper proteins; it is most closely related to stellacyanin and basic blue protein. The type 1 copper proteins form a diverse group of proteins, most of which are involved in electron transport. However, key amino acids believed to be involved in copper binding are absent from the allergen sequence; thus, the allergen is not likely to be functionally related to the type 1 copper proteins. We have grouped these proteins into one superfamily and we depict the relationships among them by an evolutionary tree. As indicated by this tree, an ancient gene duplication resulted in the divergence of plastocyanin from the line leading to basic blue protein, stellacyanin, and allergen Ra3. This paper is dedicated to the memory of Professor Margaret O. Dayhoff, whose contributions to the study of protein evolution made this investigation possible.
