Similar literature
20 similar articles found.
1.
Summary We have implemented a routine procedure for screening protein sequences for evidence of intragenic duplications. We tested 163 protein sequences representing 116 superfamilies of unrelated proteins. Twenty superfamilies contain proteins with internal gene duplications. The intragenic duplications detected can be divided into two major types. (1) One or more duplications of all or part of a gene produce a protein with two or several detectable regions of sequence homology. Sequences from 18 superfamilies contained this type of duplication. (2) Repeated reduplication of a small DNA segment can produce a protein that is repetitive over most of its length. Three superfamilies contain such repetitive sequences. We also investigated the limits of detection of ancient duplications using sequences derived by random mutation of a model sequence consisting of ten 10-residue repeats. The original repetitive nature of the sequence was usually detected after 250 point mutations even though the ancestral segment could not be accurately reconstructed.
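
The mutation experiment described in this abstract can be illustrated with a minimal sketch. It is not the authors' procedure: the function names, the choice to model a point mutation as a single amino acid replacement, and the lag-based identity score used as a stand-in for "detecting" repetitiveness are all assumptions made for illustration.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def make_repeat_sequence(unit_len=10, copies=10, rng=None):
    """Build a model sequence made of `copies` identical random units."""
    rng = rng or random.Random()
    unit = "".join(rng.choice(AMINO_ACIDS) for _ in range(unit_len))
    return unit * copies


def mutate(seq, n_mutations, rng=None):
    """Apply n random point mutations (assumed here to be residue replacements).

    The same position may be hit more than once, so fewer than
    n_mutations positions may end up differing from the input.
    """
    rng = rng or random.Random()
    residues = list(seq)
    for _ in range(n_mutations):
        pos = rng.randrange(len(residues))
        # Always substitute a residue different from the current one.
        choices = [aa for aa in AMINO_ACIDS if aa != residues[pos]]
        residues[pos] = rng.choice(choices)
    return "".join(residues)


def lag_identity(seq, lag):
    """Fraction of positions i where seq[i] == seq[i + lag]."""
    pairs = list(zip(seq, seq[lag:]))
    return sum(a == b for a, b in pairs) / len(pairs)


if __name__ == "__main__":
    rng = random.Random(42)
    model = make_repeat_sequence(rng=rng)
    mutated = mutate(model, 250, rng=rng)
    # Compare the signal at the true repeat period with other lags.
    for lag in (7, 10, 13):
        print(lag, round(lag_identity(mutated, lag), 3))
```

A lag-identity score is only a crude proxy; a real screen would score residue similarity and assess statistical significance rather than count exact matches.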

2.
Genomic DNA fragments bearing proline-rich protein (PRP) genes expressed specifically in hamster parotid glands have been isolated and characterized. Complete exonic sequences as well as intronic and a considerable portion of the flanking sequences are reported for a PRP gene, H29. H29 is interrupted by three intervening sequences, with consensus splice junctions, and it likely encodes the acidic hamster PRP Hp43a. Exceedingly high homology of the 5'-untranslated region and the sequence encoding the signal peptide is observed with other PRPs of all species studied. Significant homology was also detected among the repetitive sequences of the mature acidic PRPs from human, mouse, hamster, and rat. This conservation of the internal repeats of the PRPs suggested that proline-rich protein gene evolution involved intragenic duplication of internal repeats and gene duplication and conversion. Both hamster and mouse PRP genes (H29 and mouse proline-rich protein gene, respectively) share considerable sequence similarity in the 5'-flanking regions for about 100 base pairs upstream. The remainder of the upstream sequences were heterologous except for three oligonucleotide regions with 60-70% sequence conservation. These three regions are thought to be involved in the regulation of the tissue-specific PRP gene induction.

3.
Nishizawa M, Nishizawa K. Proteins, 1999, 37(2): 284-292
We showed previously that the use of arginine versus lysine residues in eukaryote proteins is correlated positively with local GC content of the genome within approximately 50 residues. Cumulative analyses show that the tendency for self-clustering (or repetitive use) generally is the case for all types of amino acids except for certain hydrophobic types. The degree to which each of the amino acids is used recurrently is weak for ancient proteins (or protein domains), those that are conserved through both eukaryotes and prokaryotes, but strong for modern proteins, which are unique to organisms of particular phyla. These findings support the idea that repetitiveness occurs due to a propensity of genomic DNA to cause tandem genomic duplication. A protein sequence with high repetitiveness tends to be unique in the homology search, which may indicate the weaker constraints and, hence, more arbitrary use of amino acids. Simulation analyses suggest that tandem gene duplications on a very small scale (1 or 2 codons) are an important causal factor in maintaining repetitiveness in the presence of concomitant occurrence of substitutive point mutation. For yeast proteins, approximately 1.3 duplication events per 1,000 residues on average are likely to occur, whereas 10 events of substitution mutation occur. It also is suggested that duplication enhances the probability of occurrence of some peptide motifs, such as those found in zinc fingers and segments with extreme physicochemical characteristics, and, thus, that local repetitiveness is a genomic factor influencing the evolution of eukaryote proteins.

4.
Carbamoylphosphate synthetase (CPS) catalyzes the first committed step in pyrimidine biosynthesis, arginine biosynthesis, or the urea cycle. Organisms may contain either one generalized or two specific CPS enzymes, and these enzymes may be heterodimeric (encoded by linked or unlinked genes), monomeric, or part of a multifunctional protein. In order to help elucidate the evolution of CPS, we have performed a comprehensive phylogenetic analysis using the 21 available complete CPS sequences, including a sequence from Sulfolobus solfataricus P2 which we report in this paper. This is the first report of a complete CPS gene sequence from an archaeon, and sequence analysis suggests that it encodes an enzyme similar to heterodimeric CPSII. We confirm that internal similarity within the synthetase domain of CPS is the result of an ancient gene duplication that preceded the divergence of the Bacteria, Archaea, and Eukarya, and use this internal duplication in phylogenetic tree construction to root the tree of life. Our analysis indicates with high confidence that this archaeal sequence is more closely related to those of Eukarya than to those of Bacteria. In addition to this ancient duplication which created the synthetase domain, our phylogenetic analysis reveals a complex history of further gene duplications, fusions, and other events which have played an integral part in the evolution of CPS.

5.
Summary Using computer programs that analyze the evolutionary history and probability of relationship of protein sequences, we have investigated the gene duplication events that led to the present configuration of immunoglobulin C regions, with particular attention to the origins of the homology regions (domains) of the heavy chains. We conclude that all of the sequenced heavy chains share a common ancestor consisting of four domains and that the two shorter heavy chains, alpha and gamma, have independently lost most of the second domain. These conclusions allow us to align corresponding regions of these sequences for the purpose of deriving evolutionary trees. Three independent internal gene duplications are postulated to explain the observed pattern of relationships among the four domains: first a duplication of the ancestral single domain C region, followed by independent duplications of the resulting first and last domains. In these studies there was no evidence of crossing-over and recombination between ancestral chains of different classes; however, certain types of recombinations would not be detectable from the available sequence data.

6.
7.
We investigated the evolution of transmembrane (TM) topology by detecting partial sequence repeats in TM protein sequences and analyzing them in detail. A total of 377 sequences that seem to have evolved by internal gene duplication events were found among 38,124 predicted TM protein sequences (except for single-spannings) from 87 prokaryotic genomes. Various types of internal duplication patterns were identified in these sequences. The majority of them are diploid-type (including quasi-diploid-type) duplications, in which a primordial protein sequence was duplicated internally to become an extant TM protein with twice as many TM segments as the primordial one, and the remaining ones are partial duplications including triploid-type. The diploid-type repeats are recognized in many 8-tms, 10-tms and 12-tms TM protein sequences, suggesting that diploid-type duplication was a principal mechanism in the evolutionary development of these types of TM proteins. The "positive-inside" rule is satisfied in whole sequences of both 10-tms and 8-tms TM proteins and in both halves of 10-tms proteins, but not necessarily in the second half of 8-tms proteins, providing fit examples of "internal divergent topology evolution" that likely occurred after a diploid-type internal duplication event. From analyzing the partial duplication patterns, several evolutionary pathways were recognized for 6-tms TM proteins, i.e. from primordial 2-tms, 3-tms and 4-tms TM proteins to extant 6-tms proteins. Similarly, the duplication pattern analysis revealed plausible evolutionary scenarios in which 7-tms TM proteins have arisen from 3-tms, 4-tms and 5-tms TM protein precursors via partial internal gene duplications.

8.
Conserved genes have found their way into the mainstream of molecular systematics. Many of these genes are members of multigene families. A difficulty with using single genes of multigene families for phylogenetic inference is that genes from one species may be paralogous to those from another taxon. We focus attention on this problem using heat shock 70 (HSP70) genes. Using polymerase chain reaction techniques with genomic DNA, we isolated and sequenced 123 distinct sequences from 12 species of sharks. Phylogenetic analysis indicated that the sequences cluster with constitutively expressed cytoplasmic heat shock-like genes. Three highly divergent gene clades were sampled. A number of similar sequences were sampled from each species within each distinct gene clade. Comparison of published species trees with an HSP70 gene tree inferred using Bayesian phylogenetic analysis revealed several cases of gene duplication and differential sorting of gene lineages within this group of sharks. Gene tree parsimony based on the objective criteria of duplication and losses showed that previously published hypotheses of species relationships and two novel hypotheses based on Bayesian phylogenetics were concordant with the history of HSP70 gene duplication and loss. By contrast, two published hypotheses based on morphological data were not significantly different from the null hypothesis of a random association between species relatedness and the HSP70 gene tree. These results suggest that gene tree parsimony using data from multigene families can be used for inferring species relationships or testing published alternative hypotheses. More importantly, the results suggest that systematic studies relying on phylogenetic inferences from HSP70 genes may be plagued by unrecognized paralogy of sampled genes.
Our results underscore the distinction between gene and species trees and highlight an underappreciated source of discordance between gene trees and organismal phylogeny, i.e., unrecognized paralogy of sampled genes.

9.
The "ovalbumin Y" gene, one of three which constitute the ovalbumin gene family in chicken, has been completely sequenced. The exact location of exons can be derived from the comparison with the ovalbumin gene sequence and from the map previously established by electron microscopy analysis. During evolution of the Y gene, selective pressure has operated to retain a sequence coding for an ovalbumin-like protein. The location of splice junctions, the length of protein coding exons and the reading phase are as in the ovalbumin gene. The overall homology between the Y and ovalbumin protein coding sequences is 72.6% (resulting in a 58% homology for the amino acid sequences). A significantly high number of base changes within coding sequences are present in clusters, which appear in several cases to be correlated with the occurrence of direct repeats. The 3' untranslated sequences of the Y and ovalbumin mRNAs have diverged much more, and the Y sequence contains a peculiar U(T) rich region. Corresponding introns of the ovalbumin and Y genes differ extensively both in sequence and in length. They share, however, characteristic biases in their base distribution.
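
Percentages such as the 72.6% and 58% figures quoted in this abstract are typically obtained by counting matching positions in an alignment. The sketch below shows one common way to compute such a value; the function name, the gap symbol, and the decision to skip gapped columns are assumptions for illustration, and the abstract does not say how the authors computed their numbers.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over the ungapped columns of two aligned sequences.

    Both inputs must already be aligned to the same length; columns
    where either sequence has a gap are skipped (an assumed convention).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared
```

A lower value at the protein level than at the nucleotide level is expected in general, because a single nonsynonymous base change within a codon is enough to alter the encoded amino acid.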

10.
Structural analysis of oligomycin sensitivity-conferring protein (OSCP) revealed repeating sequences (residues 1-89, 105-190) suggesting an evolution of the protein by gene duplication. In addition to the reported homology with the delta-subunit of Escherichia coli F1ATPase, OSCP also shows a certain homology with the b-subunit of E. coli F0 and the ADP/ATP carrier of mitochondria.

11.
Structure and evolution of the Xenopus laevis albumin genes (total citations: 4; self-citations: 0; citations by others: 4)
The 68K and 74K albumin genes of Xenopus laevis arose by duplication approximately 30 million years ago. Electron microscopic analysis showed that both genes contain 15 coding sequences. The lengths of corresponding coding sequences are almost identical and are extremely similar to those of mammalian albumin genes. A block of four coding sequences, which in mammals codes for one protein domain, is repeated three times. The corresponding introns are usually different in length and have therefore diverged as a result of insertion/deletion events. The extensive homology between these gene sequences is neither confined to nor most extensive in the coding sequences and similar amounts of homologous sequences are found in the flanking DNAs as in the gene regions. Various structures were formed in the 5'-flanking DNA by mutually exclusive pairing of different homology regions. Analysis of the two 74K albumin gene sequences isolated suggests that the X. laevis genome may contain one 68K albumin gene and two very closely related 74K albumin genes.

12.
13.
Prosaposin is a multifunctional protein encoded by a single-copy gene. It contains four saposin domains (A, B, C, and D) occurring as tandem repeats connected by linker sequences. Because the saposin domains are similar to one another, it is deduced that they were created by sequential duplications of an ancestral domain. There are two types of evolutionary scenarios that may explain the creation of the four-domain gene: (1) two rounds of tandem internal gene duplication and (2) three rounds of duplications. An evolutionary and phylogenetic analysis of saposin DNA and amino acid sequences from human, mouse, rat, chicken, and zebrafish indicates that the first evolutionary scenario is the most likely. Accordingly, an ancestral saposin-unit duplication produced a two-domain gene, which, subsequently, underwent a second complete tandem duplication to give rise to the present four-domain structure of the prosaposin gene. Received: 8 February 2001 / Accepted: 29 June 2001

14.
15.
Summary Several spontaneous Lac deletion derivatives of the β-galactosidase gene of Lactobacillus bulgaricus were analyzed for their phenotypic stability. We found that one of these mutants, lac139, carrying a deletion of 30 bp within the gene, was able to revert to a Lac+ phenotype. Genetic analysis of revertants indicated that an internal region of 72 bp was duplicated immediately next to the deletion site. The region involved in the duplication event is flanked by direct repeated sequences of 13 bp in length. Both events, the deletion and the duplication, were mediated by the presence of such short direct repeats. Enzymatic studies of the purified proteins indicated identical kinetic parameters, but showed considerable instability of the revertant protein.

16.
17.
The isolation, characterization, and expression of a novel cDNA encoding a Trypanosoma cruzi polypeptide (TcAc2), homologous to various small stress proteins and glutathione S-transferases, are described. The deduced amino-acid sequence revealed two domains sharing 27% identity and an additional 27% similarity to each other, suggesting that the molecule may have evolved from a single domain by a process of gene duplication and fusion. The TcAc2 cDNA was subcloned into the pGEX-2T vector for expression in E. coli. In vitro translation products of epimastigote mRNA, immunoprecipitated with anti-TXepi serum, showed a major radioactive band of 52 kDa. Immunoprecipitation of [35S]methionine-labelled epimastigote and trypomastigote antigens after pulse chase experiments, using anti-TcAc2 fusion protein antibodies, showed that the protein is released into the culture medium. Moreover, Western blot analysis revealed a single band of 52 kDa with epimastigote, trypomastigote and amastigote antigens. Primary structure homology searches revealed that each TcAc2 domain contained within its N-terminus significant homology to Solanum tuberosum pathogenesis-related protein PRI, soybean heat shock protein 26-A, auxin regulated clone pCNT103 from Nicotiana tabacum and Drosophila melanogaster glutathione S-transferase 27 (GST27). This finding was supported by a comparison of hydrophobicity profiles of TcAc2 and these proteins. Most of them play a central role in protection mechanisms against stress. Based on the homology between TcAc2, glutathione S-transferases (GST) and small stress proteins, it is likely that the TcAc2 gene product may play a crucial role in the parasite's adaptation to its microenvironment. These molecules could be considered as members of the GST superfamily, where the T. cruzi protein may take a particular place because of its internal gene duplication.

18.
19.
Evolution of type II DNA methyltransferases. A gene duplication model (total citations: 30; self-citations: 0; citations by others: 30)
On the basis of consensus sequences, which had previously been defined for two groups of closely related cytosine-specific and adenine-specific DNA methyltransferases, homologies can be detected that indicate a common origin for these proteins. Intramolecular comparisons of several of these enzymes reveal homology relationships, which suggests that gene duplication is a phylogenetic principle in the evolution of the Mtases. One or two duplications of an ancestral gene encoding a 12,000 to 16,000 Mr protein, followed by divergent evolution, may have led to very different protein structures and could explain the differences in amino acid sequences, molecular weights and biochemical properties. Intermolecular and intramolecular homologies were also recognized in type II restriction endonucleases, suggesting a very similar evolutionary pathway.

20.
Mouse M and P lysozymes are the products of separate genes, are specifically expressed in separate tissues, and are adapted to different functions. The lysozyme genes have assumed these markedly different characteristics following their generation by gene duplication 30-50 million years ago. The discovery of the lysozyme P gene only 5 kb upstream from the M gene in tandem repeat has enabled an investigation of the molecular basis of their duplication and subsequent divergence. The duplication is shown to have involved recombination between two B2 repeat sequences flanking the original gene. The resulting downstream copy has retained the myeloid specificity of expression along with just 1.7 kb of upstream sequences, while the upstream copy is inactive in macrophages and has become expressed instead in the small intestine. Although multiple gene conversion events have served to maintain a generally high homology between the genes, certain regions have been found to be specific for either one of the gene pair: two repetitive sequences peculiar to the P region may serve to protect the coding regions from gene conversion, while sequences unique to the M gene may be more directly involved in differential regulation.
