Similar articles
 20 similar articles found; search took 318 ms
1.
The group of 2502 transmembrane (TM) protein sequences with seven TM segments (7-tms) registered in SWISS-PROT 46.0 contains 2200 G-protein-coupled receptors (GPCRs), indicating that GPCR candidates can be detected with a reliability of 87.9% in the eukaryotic genomes merely by correctly predicting the number of TM segments as 7-tms. The predictive accuracies of TM topology-prediction methods proposed so far are not as high as expected; even the best method, HMMTOP 2.0, can only achieve a capture rate of 7-tms sequences of 77.6%. It is necessary to improve this performance as much as possible, even if by only a few percentage points, in order to identify as many novel GPCR candidate genes as possible among the increasing number of newly sequenced genomes. In this study, we propose a simple but useful prediction method for detecting as many 7-tms TM protein sequences as GPCR candidates in eukaryotic genomes as possible. This is achieved by employing a two-step prediction procedure. The first step involves collecting 7-tms sequences by the best prediction method (HMMTOP 2.0), and the second involves picking up the remaining 7-tms sequences by the second-best method (TMHMM 2.0). By this procedure, the capture rate of 7-tms TM protein sequences in SWISS-PROT can be improved considerably from 77.6% to 84.5%, and the number of GPCR candidate sequences predicted as 7-tms in the human genome (Build 35) is increased from 790 (by HMMTOP 2.0) to 903. These 790 and 903 candidate sequences include, respectively, 587 and 636 of the known human GPCRs of the 717 registered in SWISS-PROT 46.0, demonstrating that the proposed combinatorial method is effective in detecting GPCR candidate genes in eukaryotic genomes.
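
The two-step procedure described in this abstract can be illustrated with a minimal sketch. It assumes that each predictor's output has already been reduced to a mapping from sequence identifier to predicted number of TM segments; the function name `combine_7tms`, the identifiers, and the toy counts are hypothetical and are not taken from the paper or from the HMMTOP or TMHMM programs.

```python
def combine_7tms(primary, secondary, target=7):
    """Two-step capture of sequences predicted to have `target` TM segments.

    Step 1 keeps every id the primary predictor calls `target`-TM.
    Step 2 adds ids missed in step 1 that the secondary predictor calls `target`-TM.
    """
    first = {sid for sid, n in primary.items() if n == target}
    second = {sid for sid, n in secondary.items()
              if n == target and sid not in first}
    return first, second


# Toy predicted TM-segment counts (hypothetical ids and values).
best_method = {"seqA": 7, "seqB": 6, "seqC": 7, "seqD": 8}
second_best = {"seqA": 7, "seqB": 7, "seqC": 5, "seqD": 8}

first, second = combine_7tms(best_method, second_best)
captured = first | second  # all candidates after both steps

# Share of GPCRs among 7-tms entries, from the counts quoted in the abstract.
reliability = 2200 / 2502
```

The second step can only add candidates, so the capture rate cannot fall below that of the primary method; whether it also adds false positives depends on the secondary predictor, which this sketch does not model.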

2.
Prosaposin is a multifunctional protein encoded by a single-copy gene. It contains four saposin domains (A, B, C, and D) occurring as tandem repeats connected by linker sequences. Because the saposin domains are similar to one another, it is deduced that they were created by sequential duplications of an ancestral domain. There are two types of evolutionary scenarios that may explain the creation of the four-domain gene: (1) two rounds of tandem internal gene duplication and (2) three rounds of duplications. An evolutionary and phylogenetic analysis of saposin DNA and amino acid sequences from human, mouse, rat, chicken, and zebrafish indicates that the first evolutionary scenario is the most likely. Accordingly, an ancestral saposin-unit duplication produced a two-domain gene, which, subsequently, underwent a second complete tandem duplication to give rise to the present four-domain structure of the prosaposin gene. Received: 8 February 2001 / Accepted: 29 June 2001

3.
4.
We now know that the evolution of multidomain proteins has frequently involved genetic duplication events. These, however, are sometimes difficult to trace because of low sequence similarity between duplicated segments. Spectrin, the major component of the membrane skeleton that provides elasticity to the cell, contains tandemly repeated sequences of 106 amino acid residues. The same repeats are also present in α-actinin, dystrophin and utrophin. Sequence alignments and phylogenetic trees of these domains allow us to interpret the evolutionary relationship between these proteins, concluding that spectrin evolved from α-actinin by an elongation process that included two duplications of a block of seven repeats. This analysis shows how a modular protein unit can be used in the evolution of large cytoskeletal structures.

5.
Many proteins, especially in eukaryotes, contain tandem repeats of several domains from the same family. These repeats have a variety of binding properties and are involved in protein–protein interactions as well as binding to other ligands such as DNA and RNA. The rapid expansion of protein domain repeats is assumed to have evolved through internal tandem duplications. However, the exact mechanisms behind these tandem duplications are not well understood. Here, we have studied the evolution, function, protein structure, gene structure, and phylogenetic distribution of domain repeats. For this purpose we have assigned Pfam-A domain families to 24 proteomes with more sensitive domain assignments in the repeat regions. These assignments confirmed previous findings that eukaryotes, and in particular vertebrates, contain a much higher fraction of proteins with repeats compared with prokaryotes. The internal sequence similarity in each protein revealed that the domain repeats are often expanded through duplications of several domains at a time, while the duplication of one domain is less common. Many of the repeats appear to have been duplicated in the middle of the repeat region. This is in strong contrast to the evolution of other proteins that mainly works through additions of single domains at either terminus. Further, we found that some domain families show distinct duplication patterns, e.g., nebulin domains have mainly been expanded with a unit of seven domains at a time, while duplications of other domain families involve varying numbers of domains. Finally, no common mechanism for the expansion of all repeats could be detected. We found that the duplication patterns show no dependence on the size of the domains. Further, repeat expansion in some families can possibly be explained by shuffling of exons. However, exon shuffling could not have created all repeats.

6.
Summary: We have implemented a routine procedure for screening protein sequences for evidence of intragenic duplications. We tested 163 protein sequences representing 116 superfamilies of unrelated proteins. Twenty superfamilies contain proteins with internal gene duplications. The intragenic duplications detected can be divided into two major types. (1) One or more duplications of all or part of a gene produce a protein with two or several detectable regions of sequence homology. Sequences from 18 superfamilies contained this type of duplication. (2) Repeated reduplication of a small DNA segment can produce a protein that is repetitive over most of its length. Three superfamilies contain such repetitive sequences. We also investigated the limits of detection of ancient duplications using sequences derived by random mutation of a model sequence consisting of ten 10-residue repeats. The original repetitive nature of the sequence was usually detected after 250 point mutations even though the ancestral segment could not be accurately reconstructed.
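
The mutation experiment mentioned at the end of this abstract can be mimicked with a small simulation. This is only a sketch under stated assumptions: the repeat unit `ACDEFGHIKL`, the uniform substitution model, and the `repeat_identity` score are illustrative choices, not the authors' screening procedure, which the abstract does not specify.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def make_repeat_sequence(unit, copies=10):
    """Build a model sequence made of `copies` tandem copies of `unit`."""
    return unit * copies


def mutate(seq, n_mutations, rng):
    """Apply `n_mutations` point substitutions at uniformly random positions.

    A position may be hit more than once; each hit changes the residue
    to a different one.
    """
    residues = list(seq)
    for _ in range(n_mutations):
        pos = rng.randrange(len(residues))
        choices = [a for a in AMINO_ACIDS if a != residues[pos]]
        residues[pos] = rng.choice(choices)
    return "".join(residues)


def repeat_identity(seq, period):
    """Fraction of positions i for which seq[i] == seq[i + period]."""
    pairs = len(seq) - period
    matches = sum(seq[i] == seq[i + period] for i in range(pairs))
    return matches / pairs


model = make_repeat_sequence("ACDEFGHIKL")      # ten 10-residue repeats
mutated = mutate(model, 250, random.Random(0))  # seeded for repeatability
score_before = repeat_identity(model, 10)
score_after = repeat_identity(mutated, 10)
```

Comparing `score_after` with the same score computed at other periods, or on shuffled sequences, would be one simple way to ask whether any periodic signal survives; a real screen would use a proper alignment statistic instead.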

7.
We propose a new method for classifying and identifying transmembrane (TM) protein functions at proteome scale by applying a single-linkage clustering method based on TM topology similarity, which is calculated simply from comparing the lengths of loop regions. In this study, we focused on 87 prokaryotic TM proteomes consisting of 31 proteobacteria, 22 gram-positive bacteria, 19 other bacteria, and 15 archaea. Prior to performing the clustering, we first categorized individual TM protein sequences as "known," "putative" (similar to "known" sequences), or "unknown" by using the homology search and the sequence similarity comparison against SWISS-PROT to assess the current status of the functional annotation of the TM proteomes based on sequence similarity only. More than three-quarters, that is, 75.7% of the TM protein sequences are functionally "unknown," with only 3.8% and 20.5% of them being classified as "known" and "putative," respectively. Using our clustering approach based on TM topology similarity, we succeeded in increasing the rate of TM protein sequences functionally classified and identified from 24.3% to 60.9%. Obtained clusters correspond well to functional superfamilies or families, and the functional classification and identification are successfully achieved by this approach. For example, in an obtained cluster of TM proteins with six TM segments, 109 sequences out of 119 sequences annotated as "ATP-binding cassette transporter" are properly included and 122 "unknown" sequences are also contained.
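
Single-linkage clustering on loop lengths, as described in this abstract, can be sketched as follows. The abstract does not give the similarity formula, so the rule used here is an assumption made purely for illustration: two proteins are linked when they have the same number of loops and every corresponding loop length differs by at most `tol` residues. All names and values are hypothetical.

```python
def loops_similar(a, b, tol=5):
    """Assumed similarity rule: same loop count, each loop within `tol` residues."""
    return len(a) == len(b) and all(abs(x - y) <= tol for x, y in zip(a, b))


def single_linkage(loop_lengths, tol=5):
    """Group proteins so that any two linked (directly or via a chain) share a cluster."""
    ids = list(loop_lengths)
    parent = {i: i for i in ids}

    def find(x):
        # Union-find root lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i, a in enumerate(ids):
        for b in ids[i + 1:]:
            if loops_similar(loop_lengths[a], loop_lengths[b], tol):
                parent[find(a)] = find(b)  # merge the two clusters

    clusters = {}
    for i in ids:
        clusters.setdefault(find(i), set()).add(i)
    return sorted(clusters.values(), key=lambda c: sorted(c))


# Toy loop-length profiles (hypothetical proteins and lengths).
profiles = {
    "p1": [10, 20, 30],
    "p2": [12, 22, 28],
    "p3": [16, 25, 30],
    "p4": [50, 5, 5],
    "p5": [10, 20],
}
clusters = single_linkage(profiles)
```

In this toy input `p1` and `p3` are not directly similar (their first loops differ by 6) but both are similar to `p2`, so they end up in one cluster. That chaining behaviour is the defining property of single linkage.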

8.
Many α-helical membrane proteins contain internal symmetries, indicating that they might have evolved through a gene duplication and fusion event. Here, we have characterized internal duplications among membrane proteins of known structure and in three complete genomes. We found that the majority of large transmembrane (TM) proteins contain an internal duplication. The duplications found showed a large variability both in the number of TM-segments included and in their orientation. Surprisingly, an approximately equal number of antiparallel duplications and parallel duplications were found. However, of all 11 superfamilies with an internal duplication, only for one, the AcrB Multidrug Efflux Pump, could the duplicated unit be found in its nonduplicated form. An evolutionary analysis of the AcrB homologs indicates that several independent fusions have occurred, including the fusion of the SecD and SecF proteins into the 12-TM-protein SecDF in Brucella and Staphylococcus aureus. In one additional case, the Vitamin B12 transporter-like ABC transporters, the protein had undergone an additional fusion to form a protein with 20 TM-helices in several bacterial genomes. Finally, homologs to all human membrane proteins were used to detect the presence of duplicated and nonduplicated proteins. This confirmed that only in rare cases can homologs with different duplication status be found, although internal symmetry is frequent among these proteins. One possible explanation is that duplication and fusion events frequently happen simultaneously and that there is almost always a strong selective advantage for the fused form.

9.
Evolution of type II DNA methyltransferases. A gene duplication model   Total citations: 30 (self-citations: 0; citations by others: 30)
On the basis of consensus sequences, which had previously been defined for two groups of closely related cytosine-specific and adenine-specific DNA methyltransferases, homologies can be detected that indicate a common origin for these proteins. Intramolecular comparisons of several of these enzymes reveal homology relationships, which suggests that gene duplication is a phylogenetic principle in the evolution of the Mtases. One or two duplications of an ancestral gene encoding a 12,000 to 16,000 Mr protein, followed by divergent evolution, may have led to very different protein structures and could explain the differences in amino acid sequences, molecular weights and biochemical properties. Intermolecular and intramolecular homologies were also recognized in type II restriction endonucleases, suggesting a very similar evolutionary pathway.

10.
α-Prolamins are the major seed storage proteins of species of the grass tribe Andropogoneae. They are unusually rich in glutamine, proline, alanine, and leucine residues and their sequences show a series of tandem repeats presumed to be the result of multiple intragenic duplication. Two new sequences of α-prolamin clones from Coix (pBCX25.12 and pBCX25.10) are compared with similar clones from maize and Sorghum in order to investigate evolutionary relationships between the repeat motifs and to propose a schematic model for their three-dimensional structure based on hydrophobic membrane-helix propensities and helical “wheels.” A scheme is proposed for the most recent events in the evolution of the central part of the molecule (repeats 3 to 8) which involves two partial intragenic duplications and in which contemporary odd-numbered and even-numbered repeats arise from common ancestors, respectively. Each pair of repeats is proposed to form an antiparallel α-helical hairpin, and the helices of the molecule as a whole are proposed to be arranged on a hexagonal net. The majority of helices show six faces of alternating hydrophobic and polar residues, which give rise to interstitial holes around each helix which alternate in chemical character. The model is consistent with proteins which contain different numbers of repeats, with oligomerization and with the dense packaging of α-prolamins within the protein body of the seed endosperm. © 1993 Wiley-Liss, Inc.

11.
Evidence has been presented that 5+5 TMS and 7+7 TMS inverted repeat fold transporters are members of a single superfamily named the Amino acid‐Polyamine‐organoCation (APC) superfamily. However, the evolutionary relationship between the 5+5 and the 7+7 topological types has not been established. We have identified a common fold, consisting of a spiny membrane helix/sheet, followed by a U‐like structure and a V‐like structure that is recurrent between domain duplicated units of 5+5 and 7+7 inverted repeat folds. This fold is found in the following protein structures: AdiC, ApcT, LeuT, Mhp1, BetP, CaiT, and SglT (all 5+5 TMS repeats), as well as UraA and SulP (7+7 TMS repeats). AdiC, LeuT and Mhp1 have two extra TMSs after the second duplicated domain, SglT has four extra C‐terminal TMSs, and BetP has two extra TMSs before the first duplicated domain. UraA and SulP on the other hand have two extra TMSs at the N‐terminus of each duplicated domain unit. These observations imply that multiple hairpin and domain duplication events occurred during the evolution of the APC superfamily. We suggest that the five TMS architecture was primordial and that families gained two TMSs on either side of this basic structure via dissimilar hairpin duplications either before or after intragenic duplication. Evidence for homology between TMSs 1–2 of AdiC and TMSs 1–2 and 3–4 of UraA suggests that the 7+7 topology arose via an internal duplication of the N‐terminal hairpin loop within the five TMS repeat unit followed by duplication of the 7 TMS domain. Proteins 2014; 82:336–346. © 2013 Wiley Periodicals, Inc.

12.
Katju V, Lynch M. Genetics, 2003, 165(4): 1793-1803
The significance of gene duplication in provisioning raw materials for the evolution of genomic diversity is widely recognized, but the early evolutionary dynamics of duplicate genes remain obscure. To elucidate the structural characteristics of newly arisen gene duplicates at infancy and their subsequent evolutionary properties, we analyzed gene pairs with ≤10% divergence at synonymous sites within the genome of Caenorhabditis elegans. Structural heterogeneity between duplicate copies is present very early in their evolutionary history and is maintained over longer evolutionary timescales, suggesting that duplications across gene boundaries in conjunction with shuffling events have at least as much potential to contribute to long-term evolution as do fully redundant (complete) duplicates. The median duplication span of 1.4 kb falls short of the average gene length in C. elegans (2.5 kb), suggesting that partial gene duplications are frequent. Most gene duplicates reside close to the parent copy at inception, often as tandem inverted loci, and appear to disperse in the genome as they age, as a result of reduced survivorship of duplicates located in proximity to the ancestral copy. We propose that illegitimate recombination events leading to inverted duplications play a disproportionately large role in gene duplication within this genome in comparison with other mechanisms.

13.
Summary: Hybridization experiments indicated that the maize genome contains a family of sequences closely related to the Ds1 element originally characterized from the Adh1-Fm335 allele of maize. Examples of these Ds1-related segments were cloned and sequenced. They also had the structural properties of mobile genetic elements, i.e., similar length and internal sequence homology with Ds1, 10- or 11-bp terminal inverted repeats, and characteristic duplications of flanking genomic DNA. All sequences with 11-bp terminal inverted repeats were flanked by 8-bp duplications, but the duplication flanking one sequence with 10-bp inverted repeats was only 6 bp. Similar Ds1-related sequences were cloned from Tripsacum dactyloides. They showed no more divergence from the maize sequences than the individual maize sequences showed when compared with each other. No consensus sequence was evident for the sites at which these sequences had inserted in genomic DNA.

14.
A defective LDL receptor gene in a child with familial hypercholesterolemia produces a receptor precursor that is 50,000 daltons larger than normal (apparent Mr 170,000 vs. 120,000). The elongated protein resulted from a 14 kilobase duplication that encompasses exons 2 through 8. The duplication arose from an unequal crossing-over between homologous repetitive elements (Alu sequences) in intron 1 and intron 8. The mutant receptor has 18 contiguous cysteine-rich repeat sequences instead of the normal nine. Seven of these duplicated repeats are derived from the ligand-binding domain, and two repeats are part of the epidermal growth factor precursor homology region. The elongated receptor undergoes normal carbohydrate processing, its apparent molecular weight increases to 210,000, and the receptor reaches the cell surface where it binds reduced amounts of LDL but undergoes efficient internalization and recycling. The current findings support an evolutionary model in which homologous recombination between repetitive elements in introns leads to exon duplication during evolution of proteins.

15.
Ankyrins are membrane adaptor molecules that play important roles in coupling integral membrane proteins to the spectrin-based cytoskeleton network. Human mutations of ankyrin genes lead to severe genetic diseases such as fatal cardiac arrhythmias and hereditary spherocytosis. To elucidate the evolutionary history of ankyrins, we have identified novel ankyrin sequences in insect, fish, frog, chicken, dog, and chimpanzee genomes and explored the phylogenetic relationships of the ankyrin gene family. Our data demonstrate that duplication of ankyrin genes occurred at two different stages. The first duplication resulted from an independent evolution event specific to Arthropoda after its divergence from Chordata. Following the separation from Urochordata, expansion of ankyrins in vertebrates involved ancestral genome duplications. We did not find evidence of coordinated arrangements of gene families of ankyrin-associated membrane proteins on paralogous chromosomes. In addition, evolution of the 24 ANK-repeats strikingly correlated with the exon boundary sites of ankyrin genes, which might have occurred before its duplication in vertebrates. Such correlation is speculated to bring functional diversity and complexity. Moreover, based on the phylogenetic analysis of the ANK-repeat domain, we put forward a novel model for the putative primordial ankyrin that contains the fourth six-ANK-repeat subdomain and the spectrin-binding domain. These findings will provide guides for future studies concerning structure, function, evolutionary origins of ankyrins, and possibly other cytoskeletal proteins.

16.
X-ray crystallography has revealed that many integral membrane proteins consist of two domains with a similar fold but opposite (antiparallel) orientation in the membrane. The proteins are believed to have evolved by gene duplication and gene fusion events from a dual topology ancestral membrane protein that adopted both orientations in the membrane and formed antiparallel homodimers. Here, we present a detailed analysis of the DUF606 family of bacterial membrane proteins that contains the entire collection of intermediate states of such an evolutionary pathway: single genes that would code for dual topology homodimeric proteins, paired genes coding for homologous proteins with a fixed but opposite orientation in the membrane that would form heterodimers, and fused genes that encode antiparallel two-domain fusion proteins. Two types of paired genes can be discriminated corresponding to the order in which the genes coding for the two oppositely oriented proteins occur in the operon. On the protein level, the heterodimers resulting from the two types of gene pairs are indistinguishable. In contrast, two types of fused genes, corresponding to the two possible orders in which the oppositely oriented domains are present in the encoded proteins, do result in discernible types of proteins. The large number of genetic and protein states in the DUF606 family allowed for a detailed phylogenetic analysis that revealed a total of nine independent duplication events in the DUF606 family, five of which resulted in paired genes, and four resulted in fused genes. Noticeably, there was no evidence for a sequential mechanism in which fusions evolve from a pair of genes. Rather, an evolutionary mechanism is proposed by which antiparallel two-domain proteins are the direct result of a gene duplication event. Combining the phylogeny of proteins and hosting microorganisms allowed for a reconstruction of the evolutionary pathway.

17.
The nucleotide sequences of the introns that are located between the C4 exon and the first membrane exon of mouse and rat immunoglobulin epsilon-chain genes have been determined. The rat intron sequence was found to contain four separate clusters of repetitive sequences all of which consisted of (dC-dA)n.(dG-dT)n dinucleotide repeats. A comparison between this chromosomal region in mouse and rat revealed four deletions or duplications, three of which have occurred inside or at the borders of the CA clusters. Rearrangements have occurred inside or at the borders of all four repeats after the evolutionary separation of mouse and rat. The sequence comparison reveals in addition a duplication, connected to the CA repeats, which has occurred early in evolution, before the evolutionary divergence of mouse and rat. These findings suggest that (dC-dA)n.(dG-dT)n sequences are potential targets for recombination events.

18.
Structure and evolution of the apolipoprotein multigene family   Total citations: 8 (self-citations: 0; citations by others: 8)
We present the complementary DNA and deduced amino acid sequence of rat apolipoprotein A-II (apoA-II), and the results of a detailed statistical analysis of the nucleotide and amino acid sequences of all the apolipoprotein gene sequences published to date: namely, those of human and rat apoA-I, apoA-II and apoE, rat apoA-IV, and human apoC-I, C-II and C-III. Our results indicate that the apolipoprotein genes have very similar genomic structures, each having a total of three introns at the same locations. Using the exon/intron junctions as reference points, we have obtained an alignment of the coding regions of all the genes studied. It appears that the mature peptide regions of these genes are almost completely made up of tandem repeats of 11 codons. The part of the mature peptide region encoded by exon 3 contains a common block of 33 codons, whereas the part encoded by exon 4 contains a much more variable number of internal repeats of 11 codons. These genes have apparently evolved from a primordial gene through multiple partial (internal) and complete gene duplications. On the basis of the degree of homology of the various sequences, and the pattern of the internal repeats in these genes, we propose an evolutionary tree for the apolipoprotein genes and give rough estimates of the divergence times between these genes. Our results show that apoA-II has evolved extremely rapidly and that apoA-I and apoE also have evolved at high rates but some regions are better conserved than the others. The rate of evolution of individual regions seems to be related to the stringency of their functional requirements.

19.
Protein domain repeats are common in proteins that are central to the organization of a cell, in particular in eukaryotes. They are known to evolve through internal tandem duplications. However, the understanding of the underlying mechanisms is incomplete. To shed light on repeat expansion mechanisms, we have studied the evolution of the muscle protein Nebulin, a protein that contains a large number of actin-binding nebulin domains. Nebulin proteins have evolved from an invertebrate precursor containing two nebulin domains. Repeat regions have expanded through duplications of single domains, as well as duplications of a super repeat (SR) consisting of seven nebulins. We show that the SR has evolved independently into large regions in at least three instances: twice in the invertebrate Branchiostoma floridae and once in vertebrates. In-depth analysis reveals several recent tandem duplications in the Nebulin gene. The events involve both single-domain and multidomain SR units or several SR units. There are single events, but frequently the same unit is duplicated multiple times. For instance, an ancestor of human and chimpanzee underwent two tandem duplications. The duplication junction coincides with an Alu transposon, thus suggesting duplication through Alu-mediated homologous recombination. Duplications in the SR region consistently involve multiples of seven domains. However, the exact unit that is duplicated varies both between species and within species. Thus, multiple tandem duplications of the same motif did not create the large Nebulin protein. Finally, analysis of segmental duplications in the human genome reveals that duplications are more common in genes containing domain repeats than in those coding for nonrepeated proteins. In fact, segmental duplications are found three to six times more often in long repeated genes than expected by chance.

20.
1. The pituitary hormones can be divided into 4 families; within each the members are structurally related and have probably evolved from a common ancestor by a process of gene duplication and divergence. 2. Recent structural studies have revealed much about the evolution of proteins. The roles of point mutation, gene duplication and partial gene duplication in molecular evolution have been highlighted, and the nature of the evolutionary forces involved has been extensively debated. The information available about the evolution of proteins in general provides a background for consideration of pituitary hormone evolution. 3. The structure and function of the mammalian neurohypophysial hormones (oxytocin and vasopressin) have been studied in detail. Related (structurally similar) peptides have been found in the neurohypophyses of lower vertebrates and have been characterized in many instances. Several schemes have been proposed for the evolution of these hormones. 4. The vasopressins of the pig and its relatives show a genetic polymorphism. The roles of neurohypophysial hormones in lower vertebrates are very varied and not fully understood. 5. The ACTHs and MSHs are members of a second family of pituitary hormones. They are polypeptides of moderate size. Studies on amino-acid sequences have been carried out for ACTHs and MSHs from several mammals. α-MSH is identical in all cases studied in detail, but β-MSH and ACTH vary to some extent. There is considerable sequence homology between the hormones in this family - indicating a common phylogenetic origin and several gene duplications. 6. Dogfish MSH is the only non-mammalian hormone of the ACTH-MSH family to have been studied in detail. Two MSHs have been isolated from this species; both resemble the α-MSH of mammals in amino-acid sequence. ACTH-like and MSH-like hormones exist in many other vertebrate groups, but have not been characterized fully. 7. Structure-function relationships have been widely studied in the ACTH-MSH family, and have some interesting evolutionary implications. Polymorphism of β-MSHs is found in some mammals. 8. A third family of protein hormones includes pituitary prolactin and growth hormone, and placental lactogen. These are proteins of moderate size which have been shown to be widely distributed among the vertebrates. Species specificity can be recognized with regard to biological, immunological and structural properties. 9. Amino-acid sequences have been determined for growth hormones and prolactins from several mammals. There is sequence homology between growth hormone and prolactin. Human placental lactogen closely resembles human growth hormone. A phylogenetic tree has been constructed for this protein family. Rates of evolution within the group are rather variable. 10. The fourth family of pituitary hormones (FSH, LH, TSH and some related placental hormones) are all glycoproteins and have a subunit structure. Extensive sequence studies have been carried out on the hormones from some mammals, and show that there is considerable homology between the various subunits. The α-subunits of human TSH, LH and HCG (and probably FSH) are very similar. The β-subunits are different, but homologous. Evolution of this family clearly took place by a series of gene duplications followed by gene divergence. Schemes whereby this could have occurred have been discussed. Related hormones occur in lower vertebrates, but have not been fully characterized. Some lower vertebrates may possess only one gonadotrophin. 11. The pituitary hormones provide an interesting range of evolutionary problems, and are useful models for the study of molecular evolution. The evolutionary processes involved in their diversification have been discussed, with particular reference to the co-evolution of hormones and their receptors. Neutral mutations and gene duplications may have played a role in providing co-existing variation of hormones and receptors. 12. A speculative model for the evolution of neurohypophysial hormones is proposed, as an example of how molecular evolution may have operated in this and other hormone groups. 13. Homologies have been proposed between the various families of pituitary hormones, and between pituitary proteins and other entero-secretory proteins. The pituitary protein hormones were probably elaborated from smaller molecules by a process of partial gene duplication.
