Similar articles
 Found 20 similar articles (search time: 31 ms)
1.
The Escherichia coli ATP-binding cassette (ABC) proteins   Total citations: 8 (self-citations: 1; citations by others: 7)
The recent completion of the Escherichia coli genome sequence (Blattner et al., 1997) has permitted an analysis of the complement of genomically encoded ATP-binding cassette (ABC) proteins. A total of 79 ABC proteins makes this the largest paralogous family of proteins in E. coli. These 79 proteins include 97 ABC domains (as some proteins include more than one ABC domain) and are components of 69 independent functional systems (as many systems involve more than one ABC domain). The ABC domains are often, but not exclusively, the energy-generating domains of multicomponent membrane-bound transporters. Thus, 57 of the 69 systems are ABC transporters, of which 44 are periplasmic-binding protein-dependent uptake systems and 13 are presumed exporters. The genes encoding these ABC transporters occupy almost 5% of the genome. Of the 12 systems that are not obviously transport related, the function of only one, the excision repair protein UvrA, is known. A phylogenetic analysis suggests that the majority of ABC proteins can be assigned to 10 subfamilies. Together with statistical and, importantly, biological evidence, this analysis provides insight into the evolution and function of the ABC proteins.

2.
Vertebrate evolution has been largely driven by the duplication of genes that allow for the acquisition of new functions. The ATP-binding cassette (ABC) proteins constitute a large and functionally diverse family of membrane transporters. The members of this multigene family are found in all cellular organisms, most often engaged in the translocation of a wide variety of substrates across lipid membranes. Because of the diverse function of these genes, their large size, and the large number of orthologs, ABC genes represent an excellent tool to study gene family evolution. We have identified ABC proteins from the sea squirt (Ciona intestinalis), zebrafish (Danio rerio), and chicken (Gallus gallus) and, using phylogenetic analysis, identified those genes with a one-to-one orthologous relationship to human ABC proteins. All ABC protein subfamilies found in Ciona and zebrafish correspond to the human subfamilies, with the exception of a single ABCH subfamily gene found only in zebrafish. Multiple gene duplication and deletion events were identified in different lineages, indicating an ongoing process of gene evolution. As many ABC genes are involved in human genetic diseases and important drug transport phenotypes, the understanding of ABC gene evolution is important to the development of animal models and functional studies.

3.
The tumor suppressor p53 is mutated in ~50% of all human cancer cases worldwide. It is commonly assumed that the phylogenetic history of this important tumor suppressor has been thoroughly studied; however, few detailed studies of the entire extended p53 protein family have been reported, and none comprehensively and simultaneously consider functional, molecular, and phylogenetic data. Herein we examine a diverse collection of reported p53-like protein sequences, including representatives from the arthropods, nematodes, and protists, with the goal of answering several important questions. First, what evidence supports these highly divergent proteins being true homologues to the p53 family? Second, is the inferred overall family phylogeny concordant with known structures and functions? Third, does the extended p53 family possess recognizable conserved sites outside of the within-chordate, highly-conserved DNA-binding domain? Our study shows that the biochemical and functional evidence of p53 homology for nematodes, arthropods, and protists is inconsistent with their implied phylogenetic relationship within the overall family. Although these divergent sequences are always reported as functionally similar to human p53, our results confirm and extend the hypothesis that p63 is a far more appropriate protein for comparison. Within these divergent sequences, we find minimal conservation within the DNA-binding domain, and no conservation elsewhere. Taken together, our findings suggest that these sequences are not bona fide homologues of the extended p53 family and provide baseline criteria for the future identification and characterization of distant p53-family homologues.

4.
Functional classification of proteins from sequences alone has become a critical bottleneck in understanding the myriad of protein sequences that accumulate in our databases. The great diversity of homologous sequences hides, in many cases, a variety of functional activities that cannot be anticipated. Their identification appears critical for a fundamental understanding of the evolution of living organisms and for biotechnological applications. ProfileView is a sequence-based computational method, designed to functionally classify sets of homologous sequences. It relies on two main ideas: the use of multiple profile models whose construction explores evolutionary information in available databases, and a novel definition of a representation space in which to analyze sequences with multiple profile models combined together. ProfileView classifies protein families by enriching known functional groups with new sequences and discovering new groups and subgroups. We validate ProfileView on seven classes of widespread proteins involved in the interaction with nucleic acids, amino acids and small molecules, and in a large variety of functions and enzymatic reactions. ProfileView agrees with the large set of functional data collected for these proteins from the literature regarding the organization into functional subgroups and residues that characterize the functions. In addition, ProfileView resolves undefined functional classifications and extracts the molecular determinants underlying protein functional diversity, showing its potential to select sequences towards accurate experimental design and discovery of novel biological functions. On protein families with complex domain architecture, ProfileView functional classification reconciles domain combinations, unlike phylogenetic reconstruction. ProfileView outperforms the functional classification approach PANTHER, the two k-mer-based methods CUPP and eCAMI, and a neural network approach based on Restricted Boltzmann Machines. It overcomes the time complexity limitations of the latter.

5.
MOTIVATION: The completion of the Arabidopsis genome offers the first opportunity to analyze all of the membrane protein sequences of a plant. The majority of integral membrane proteins including transporters, channels, and pumps contain hydrophobic alpha-helices and can be selected based on TransMembrane Spanning (TMS) domain prediction. By clustering the predicted membrane proteins based on sequence, it is possible to sort the membrane proteins into families of known function, based on experimental evidence or homology, or unknown function. This provides a way to identify target sequences for future functional analysis. RESULTS: An automated approach was used to select potential membrane protein sequences from the set of all predicted proteins and cluster the sequences into related families. The recently completed sequence of Arabidopsis thaliana, a model plant, was analyzed. Of the 25,470 predicted protein sequences, 4589 (18%) were identified as containing two or more membrane spanning domains. The membrane protein sequences clustered into 628 distinct families containing 3208 sequences. Of these, 211 families (1764 sequences) either contained proteins of known function or showed homology to proteins of known function in other species. However, 417 families (1444 sequences) contained only sequences with no known function and no homology to proteins of known function. In addition, 1381 sequences did not cluster with any family and no function could be assigned to 1337 of these.
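The counts reported in this abstract can be cross-checked with a few lines of arithmetic. The sketch below only restates numbers given in the abstract; the variable names are illustrative and are not taken from the paper.

```python
# Counts restated from the abstract; variable names are illustrative only.
predicted_proteins = 25470
membrane_proteins = 4589            # two or more predicted membrane-spanning domains

families_known, seqs_known = 211, 1764      # known function, or homology to known function
families_unknown, seqs_unknown = 417, 1444  # no known function and no such homology
unclustered = 1381                          # sequences that joined no family
unclustered_no_function = 1337              # unclustered sequences with no assigned function

total_families = families_known + families_unknown
clustered_seqs = seqs_known + seqs_unknown

# Share of all predicted proteins selected as membrane proteins, in percent.
membrane_percent = round(100 * membrane_proteins / predicted_proteins)

# Membrane proteins with no assignable function, clustered or not.
no_function_total = seqs_unknown + unclustered_no_function

print(total_families, clustered_seqs, clustered_seqs + unclustered, membrane_percent)
# prints: 628 3208 4589 18
```

The family and sequence subtotals add up to the reported 628 families and 3208 sequences, and clustered plus unclustered sequences recover the 4589 membrane proteins.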

6.
The minimal set of proteins necessary to maintain a vertebrate cell forms an interesting core of cellular machinery. The known proteome of the human red blood cell consists of about 1400 proteins. We treated this protein complement of one of the simplest human cells as a model and asked questions about its function and origins. The proteome was mapped onto phylogenetic profiles, i.e. vectors of species possessing homologues of human proteins. A novel clustering approach was devised, utilising similarity in the phylogenetic spread of homologues as a distance measure. The clustering based on phylogenetic profiles yielded several distinct protein classes differing in phylogenetic taxonomic spread, presumed evolutionary history and functional properties. Notably, small clusters of proteins common to vertebrates or Metazoa and other multicellular eukaryotes involve biological functions specific to multicellular organisms, such as apoptosis or cell-cell signaling, respectively. Also, a eukaryote-specific cluster is identified, featuring GTP-ase signalling and ubiquitination. Another cluster, made up of proteins found in most organisms, including bacteria and archaea, involves basic molecular functions such as oxidation-reduction and glycolysis. Approximately one third of erythrocyte proteins do not fall in any of the clusters, reflecting the complexity of protein evolution in comparison to our simple model. In essence, the clustering obtained divides the proteome into old and new parts, the former originating from bacterial ancestors, the latter from inventions within multicellular eukaryotes. Thus, the model human cell proteome appears to be made up of protein sets distinct in their history and biological roles. The current work shows that the phylogenetic profile concept allows protein clustering in a way relevant to both biological function and evolutionary history.
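To make the idea of a phylogenetic profile concrete, here is a minimal sketch that represents each protein by the set of species in which a homologue is found and compares two profiles with the Jaccard distance. The abstract does not say which distance the authors used, so the Jaccard choice is an assumption, and the protein names and species lists are invented toy data.

```python
def jaccard_distance(profile_a, profile_b):
    """Distance between two phylogenetic profiles given as sets of species.

    0.0 means identical species spread, 1.0 means no species in common.
    The Jaccard measure is an assumption, not the paper's stated metric.
    """
    union = profile_a | profile_b
    if not union:
        return 0.0  # two empty profiles are treated as identical
    return 1.0 - len(profile_a & profile_b) / len(union)


# Hypothetical profiles: species in which a homologue of each protein occurs.
toy_profiles = {
    "protein_A": {"human", "mouse", "yeast", "E. coli"},  # broadly distributed
    "protein_B": {"human", "mouse", "yeast"},             # eukaryote-specific
    "protein_C": {"human", "mouse"},                      # vertebrate-specific
}

names = sorted(toy_profiles)
for i, first in enumerate(names):
    for second in names[i + 1:]:
        dist = jaccard_distance(toy_profiles[first], toy_profiles[second])
        print(first, second, round(dist, 3))
```

Any standard clustering routine could then be run on the resulting pairwise distances; proteins with a similar taxonomic spread end up close together.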

7.
Schriml LM, Dean M. Genomics, 2000, 64(1): 24-31
ATP-binding cassette (ABC) genes encode a family of transport proteins known to be involved in a number of human genetic diseases. In this study, we characterized the ABC superfamily in Mus musculus through in silico gene identification and mapping and phylogenetic analysis of mouse and human ABC genes. By querying dbEST with amino acid sequences from the conserved ATP-binding domains, we identified and partially sequenced 18 new mouse ABC genes, bringing the total number of mouse ABC genes to 34. Twelve of the new ABC genes were mapped in the mouse genome to the X chromosome and to 10 of the 19 autosomes. Phylogenetic relationships of mouse and human ABC genes were examined with maximum parsimony and neighbor-joining analyses that demonstrated that mouse and human ABC orthologs are more closely related than are mouse paralogs. The mouse ABC genes could be grouped into the seven previously described human ABC subfamilies. Three mouse ABC genes mapped to regions implicated in cholesterol gallstone susceptibility.

8.
Ger MF, Rendon G, Tilson JL, Jakobsson E. PLoS ONE, 2010, 5(10): e12827
Voltage-gated and ligand-gated ion channels are used in eukaryotic organisms for the purpose of electrochemical signaling. There are prokaryotic homologues to major eukaryotic channels of these sorts, including voltage-gated sodium, potassium, and calcium channels, ACh-receptor and glutamate-receptor channels. The prokaryotic homologues have been less well characterized functionally than their eukaryotic counterparts. In this study we identify likely prokaryotic functional counterparts of eukaryotic glutamate receptor channels by comprehensive analysis of the prokaryotic sequences in the context of known functional domains present in the eukaryotic members of this family. In particular, we searched the nonredundant protein database for all proteins containing the following motif: the two sections of the extracellular glutamate binding domain flanking two transmembrane helices. We discovered 100 prokaryotic sequences containing this motif, with a wide variety of functional annotations. Two groups within this family have the same topology as eukaryotic glutamate receptor channels. Group 1 has a potassium-like selectivity filter. Group 2 is most closely related to eukaryotic glutamate receptor channels. We present analysis of the functional domain architecture for the group of 100, a putative phylogenetic tree, comparison of the protein phylogeny with the corresponding species phylogeny, consideration of the distribution of these proteins among classes of prokaryotes, and orthologous relationships between prokaryotic and human glutamate receptor channels. We introduce a construct called the Evolutionary Domain Network, which represents a putative pathway of domain rearrangements underlying the domain composition of present channels. We believe that scientists interested in ion channels in general, and ligand-gated ion channels in particular, will be interested in this work. The work should also be of interest to bioinformatics researchers who are interested in the use of functional domain-based analysis in evolutionary and functional discovery.

9.
The thioredoxin/glutaredoxin family consists of small heat-stable proteins that have a highly conserved CXXC active site and that participate in the regulation of many redox reactions. We have searched the human genome sequence to find putative pseudogenes (non-functional copies of protein-coding genes) for all known members of this family. This survey has resulted in the identification of seven processed pseudogenes for human Trx1 and two more for human Grx1. No evidence for the presence of processed pseudogenes has been found for the remaining members of this family. In addition, we have been unable to detect any non-processed pseudogenes derived from any member of the family in the human genome. The seven thioredoxin pseudogenes can be divided into two groups: Trx1-psi2, -psi3, -psi4, -psi5 and -psi6 arose from the functional ancestor, whereas Trx1-psi1 and -psi7 originated from Trx1-psi2 and -psi6, respectively. In all cases, the pseudogenes originated after the human/rodent radiation as shown by phylogenetic analysis. This is also the case for Grx1-psi1 and Grx1-psi2, which are placed between rodent and human sequences in the phylogenetic tree. Our study provides a molecular record of the recent evolution of these two genes in the hominid lineage.

10.
In order to simplify and meaningfully categorize large sets of protein sequence data, it is commonplace to cluster proteins based on the similarity of those sequences. However, it quickly becomes clear that the sequence flexibility allowed a given protein varies significantly among different protein families. The degree to which sequences are conserved not only differs for each protein family, but also is affected by the phylogenetic divergence of the source organisms. Clustering techniques that use similarity thresholds for protein families do not always allow for these variations and thus cannot be confidently used for applications such as automated annotation and phylogenetic profiling. In this work, we applied a spectral bipartitioning technique to all proteins from 53 archaeal genomes. Comparisons between different taxonomic levels allowed us to study the effects of phylogenetic distances on cluster structure. Likewise, by associating functional annotations and phenotypic metadata with each protein, we could compare our protein similarity clusters with both protein function and associated phenotype. Our clusters can be analyzed graphically and interactively online.
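Spectral bipartitioning in its generic textbook form splits a similarity graph by the sign of the Fiedler vector, the eigenvector of the graph Laplacian with the second-smallest eigenvalue. The sketch below illustrates that general idea in pure Python on an invented six-node similarity matrix. It is not the authors' implementation: the function name, the shifted power iteration and the toy weights are all assumptions made for illustration.

```python
def spectral_bipartition(weights, iterations=200):
    """Split a graph in two by the sign of an approximate Fiedler vector.

    `weights` is a symmetric similarity matrix with a zero diagonal.
    Generic illustration only, not the method of any specific paper.
    """
    n = len(weights)
    if n < 2:
        return list(range(n)), []
    degree = [sum(row) for row in weights]
    # Graph Laplacian L = D - W.
    lap = [[(degree[i] if i == j else 0.0) - weights[i][j] for j in range(n)]
           for i in range(n)]
    # Every eigenvalue of L lies in [0, 2 * max degree], so shift * I - L
    # is positive semidefinite and reverses the eigenvalue order.
    shift = 2.0 * max(degree)
    # Deterministic mean-zero start vector (orthogonal to the constant vector).
    vec = [i - (n - 1) / 2 for i in range(n)]
    for _ in range(iterations):
        nxt = [shift * vec[i] - sum(lap[i][j] * vec[j] for j in range(n))
               for i in range(n)]
        mean = sum(nxt) / n
        nxt = [x - mean for x in nxt]          # project out the constant vector
        norm = sum(x * x for x in nxt) ** 0.5
        vec = [x / norm for x in nxt]
    positive = [i for i in range(n) if vec[i] >= 0]
    negative = [i for i in range(n) if vec[i] < 0]
    return positive, negative


# Toy similarity graph: two tight triangles joined by one weak edge (2-3).
toy_weights = [
    [0.0, 1.0, 1.0, 0.0, 0.0, 0.0],
    [1.0, 0.0, 1.0, 0.0, 0.0, 0.0],
    [1.0, 1.0, 0.0, 0.1, 0.0, 0.0],
    [0.0, 0.0, 0.1, 0.0, 1.0, 1.0],
    [0.0, 0.0, 0.0, 1.0, 0.0, 1.0],
    [0.0, 0.0, 0.0, 1.0, 1.0, 0.0],
]

print(spectral_bipartition(toy_weights))
```

Because the split follows the graph structure rather than a fixed similarity cutoff, this kind of partitioning avoids the per-family threshold problem the abstract describes. Applying it recursively to each half would give a hierarchy of clusters.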

11.
The "A Disintegrin And Metalloproteinase" (ADAM) protein family and the "A Disintegrin-like And Metalloproteinase with ThromboSpondin motifs" (ADAMTS) protein family are two related families of human proteins. The similarities and differences between these two families have been investigated using phylogenetic trees and homology modeling. The phylogenetic analysis indicates that the two families are well differentiated, even when only the common metalloprotease domain is taken into account. Within the ADAM family, several proteins are lacking the binding motif for the catalytic zinc in the active site and thus presumably lack any catalytic activity. These proteins tend to cluster within the ADAM phylogenetic tree and are expressed in specific tissues, suggesting a functional differentiation. The present analysis allows us to propose the following: (i) ADAMTS proteins have a conserved role in the human organism as proteases, with some differentiation in terms of substrate specificity; (ii) ADAM proteins can act as proteases and/or mediators of intermolecular interactions; (iii) proteolytically active ADAMs tend to be more ubiquitously expressed than the inactive ones.

12.
The serine/arginine-rich (SR) protein family is involved in the pre-mRNA splicing process, and the DNA sequences of the corresponding genes are highly conserved in metazoan organisms. The mammalian SR proteins consist of one or two characteristic RNA binding domains (RBD), containing the signature sequences RDAEDA and SWQDLKD, and an RS (arginine/serine-rich) domain. We used the amino acid and nucleotide sequences deposited in the GenBank and Swiss-Prot databases to perform a phylogenetic analysis using bioinformatics tools. The resulting phylogenetic trees suggest that this family has evolved by several gene duplication events as a result of a positive selection mechanism.

13.
Kinesin superfamily proteins (KIFs) are key players or 'hub' proteins in the intracellular transport system, which is essential for cellular function and morphology. The KIF superfamily is also the first large protein family in mammals whose constituents have been completely identified and confirmed both in silico and in vivo. Numerous studies have revealed the structures and functions of individual family members; however, the relationships between members or a perspective of the whole superfamily structure until recently remained elusive. Here, we present a comprehensive summary based on a large, systematic phylogenetic analysis of the kinesin superfamily. All available sequences in public databases, including genomic information from all model organisms, were analyzed to yield the most complete phylogenetic kinesin tree thus far, comprising 14 families. This comprehensive classification builds on the recently proposed standardized nomenclature for kinesins and allows systematic analysis of the structural and functional relationships within the kinesin superfamily.

14.
The PDZ and LIM domain-containing protein family is encoded by a diverse group of genes whose phylogeny has not yet been analyzed. In mammals, ten genes are found that encode both a PDZ domain and one or several LIM domains. These genes are: ALP, RIL, Elfin (CLP36), Mystique, Enigma (LMP-1), Enigma homologue (ENH), ZASP (Cypher, Oracle), LMO7 and the two LIM domain kinases (LIMK1 and LIMK2). As conventional alignment and phylogenetic procedures of full-length sequences fell short of elucidating the evolutionary history of these genes, we started to analyze the PDZ and LIM domain sequences themselves. Using information from most sequenced eukaryotic lineages, our phylogenetic analysis is based on full-length cDNA-derived, EST-derived and genomic PDZ and LIM domain sequences of over 25 species, ranging from yeast to humans. Plant and protozoan homologs were not found. Our phylogenetic analysis identifies a number of domain duplication and rearrangement events, and shows a single convergent event during evolution of the PDZ/LIM family. Further, we describe the separation of the ALP and Enigma subfamilies in lower vertebrates and identify a novel consensus motif, which we call 'ALP-like motif' (AM). This motif is highly conserved between ALP subfamily proteins of diverse organisms. Here we used a combinatorial approach to define the relation of the PDZ and LIM domain encoding genes and to reconstruct their phylogeny. This analysis allowed us to classify the PDZ/LIM family and to suggest a meaningful model for the molecular evolution of the diverse gene architectures found in this multi-domain family.

15.
16.
ABC1K atypical kinases in plants: filling the organellar kinase void   Total citations: 1 (self-citations: 0; citations by others: 1)
Surprisingly few protein kinases have been demonstrated in chloroplasts or mitochondria. Here, we discuss the activity of bc1 complex kinase (ABC1K) protein family, whose members we suggest are located in mitochondria and plastids, thus filling the kinase void. The ABC1Ks are atypical protein kinases and their ancestral function is the regulation of quinone synthesis. ABC1Ks have proliferated from one or two members in non-photosynthetic organisms to more than 16 members in algae and higher plants. In this review, we reconstruct the evolutionary history of the ABC1K family, provide a functional domain analysis for angiosperms and a nomenclature for ABC1Ks in Arabidopsis (Arabidopsis thaliana), rice (Oryza sativa) and maize (Zea mays). Finally, we hypothesize that targets of ABC1Ks include enzymes of prenyl-lipid metabolism as well as components of the organellar gene expression machineries.

17.
Physicochemical properties of amino acids are important factors in determining protein structure and function. Most approaches make use of averaged properties over entire domains or even proteins to analyze their structure or function. This level of coarseness tends to hide the richness of the variability in the different properties across functional domains. This paper studies the conservation of physicochemical properties in a functionally similar family of proteins using a novel wavelet-based technique known as multiresolution analysis. Such an analysis can help uncover characteristics that can otherwise remain hidden. We have studied the protein kinase family of sequences and our findings are as follows: (a) a number of different properties are conserved over the functional catalytic domain irrespective of the sequence identities; (b) conservation of properties can be observed at different frequency levels and they agree well with the known structural/functional properties of the subdomains for the protein kinase family; (c) structural differences between the different kinase family members are reflected in the waveforms; and (d) functionally important mutations show distortions in the waveforms of conserved properties. The potential usefulness of the above findings in identifying functionally similar sequences in the twilight and midnight zones is demonstrated through a simple prediction model for the protein kinase family which achieved a recall of 93.7% and a precision of 96.75% in cross-validation tests.
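As a rough illustration of what a wavelet multiresolution analysis does to a per-residue property signal, the sketch below applies an unnormalized Haar decomposition, separating a signal into a coarse approximation plus detail coefficients at successively lower frequencies. The abstract does not name the wavelet family or the property scales used, so the Haar wavelet, the function names and the toy signal are assumptions for illustration only.

```python
def haar_step(signal):
    """One Haar level: pairwise averages (approximation) and half-differences (detail).

    A trailing unpaired value is dropped when the length is odd.
    """
    pairs = [(signal[i], signal[i + 1]) for i in range(0, len(signal) - 1, 2)]
    approx = [(a + b) / 2 for a, b in pairs]
    detail = [(a - b) / 2 for a, b in pairs]
    return approx, detail


def haar_multiresolution(signal, levels):
    """Repeatedly split the approximation, collecting details from fine to coarse."""
    approx = list(signal)
    details = []
    for _ in range(levels):
        if len(approx) < 2:
            break
        approx, detail = haar_step(approx)
        details.append(detail)
    return approx, details


# Hypothetical per-residue property values along a short sequence window.
toy_signal = [4.0, 2.0, 6.0, 8.0]
coarse, details = haar_multiresolution(toy_signal, levels=2)
print(coarse)   # [5.0]
print(details)  # [[1.0, -1.0], [-2.0]]
```

The first detail list captures residue-to-residue variation, while later lists capture variation over longer stretches. Comparing such coefficients across aligned family members is one way conservation "at different frequency levels" could be examined.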

18.
The recent completion of the sequencing project of the opportunistic human pathogenic yeast, Candida albicans (http://www.ncbi.nlm.nih.gov/), led us to analyze and classify its ATP-binding cassette (ABC) proteins, which constitute one of the largest superfamilies of proteins. Some of its members are multidrug transporters responsible for the commonly encountered problem of antifungal resistance. TBLASTN searches together with domain analysis identified 81 nucleotide-binding domains, which belong to 51 different putative open reading frames. Considering that each allelic pair represents a single ABC protein of the Candida genome, the total number of putative members of this superfamily is 28. Domain organization, sequence-based analysis and self-organizing map-based clustering led to the classification of Candida ABC proteins into 6 distinct subfamilies. Each subfamily from C. albicans has an equivalent in Saccharomyces cerevisiae, suggesting a close evolutionary relationship between the two yeasts. Our searches also led to the identification of a new motif for each subfamily in Candida that could be used to identify sequences from the corresponding subfamily in other organisms. It is hoped that the inventory of Candida ABC transporters thus created will provide new insights into the role of ABC proteins in antifungal resistance as well as help in the functional characterization of the superfamily of these proteins.

19.
Haspin (haploid germ cell-specific nuclear protein kinase) is reported to be a serine/threonine kinase that may play a role in cell-cycle cessation and differentiation of haploid germ cells. In addition, Haspin mRNA can be detected in diploid cell lines and tissues. Here, Haspin-like proteins are identified in several major eukaryotic phyla (including yeasts, plants, flies, fish, and mammals) and an extended group in Caenorhabditis elegans. The Haspin-like proteins have a complete but divergent eukaryotic protein kinase domain sequence. Although clearly related to one another and to other eukaryotic protein kinases, the Haspin-related proteins lack conservation of a subset of residues that are almost invariant in known kinases and possess distinctive inserted regions. In fact, phylogenetic analysis indicates that the Haspin-like proteins form a novel eukaryotic protein kinase family distinct from those previously defined. The identification of related proteins in model organisms provides some initial insight into their functional properties and will provide new experimental avenues by which to determine the function of the Haspin proteins in mammalian cells.

20.
We develop a procedure called RiPE (Retrieval-induced Phylogeny Environment) that automatically performs an evolutionary analysis of a protein (sub)family, (i) by retrieving the relevant sequences via a homology search, (ii) by using the search report to construct the alignment using only homologous subsequences (taking into account their neighborhood with a low chance of homology), (iii) by realigning, and (iv) by generating phylogenetic trees based on the alignment. In a first implementation of our scheme, we start with the available proteome data of model organisms, perform a PSI-BLAST search, use MView to convert hits into a multiple alignment, and perform realignment and tree building. As a test case, we have investigated the human ABC transporters of the subfamily G, starting with the five known human ABCG transporters. Our method retrieved homologous sequences not previously analyzed, generating a tree that is more plausible and better supported than previously published trees. The RiPE 0.1 prototype is available at the RiPE website, http://ifg-izkf.uni-muenster.de/fuellen/RiPE/ripe.html.
