Similar literature
 Found 20 similar documents; search took 31 ms
1.
Retropseudogenes for human chromosomal protein HMG-17   Total citations: 5 (self-citations: 0, citations by others: 5)
The human genome contains multiple copies of sequences homologous to the cDNA coding for non-histone chromosomal protein HMG-17. To study the mechanism of generation and dispersion of the HMG-17 multigene family, a human genomic library was screened and 70 clones were isolated and studied by Southern transfer and restriction site analysis. The results suggest that most of the clones contain unique sequences. Sequence analysis of two genomic clones indicates that they contain elements typical of processed retropseudogenes. Even though both sequences contained open reading frames, the sequences lacked introns, were flanked by short, direct repeats and lacked elements associated with functional genes. The sequences of the two pseudogenes were 85% homologous to each other and each was 90% homologous to the human cDNA. Based on the sequence difference in the open reading frame between the pseudogenes and the cDNA, it can be estimated that the sequences arose approximately ten million years ago from a common precursor. The present paper, which is the first study on genes coding for this nucleosomal binding protein, indicates that the HMG-17 multigene family is the largest known human retropseudogene family.

2.
The enzyme alpha 1,3-galactosyltransferase (alpha1,3-GT), which catalyzes synthesis of terminal alpha-galactosyl epitopes (Gal alpha1,3Gal beta1-4GlcNAc-R), is produced in non-primate mammals, prosimians and new-world monkeys, but not in old-world monkeys, apes and humans. We cloned and sequenced a cDNA that contains the coding sequence of the feline alpha1,3-GT gene. Flow cytometric analysis demonstrated that the alpha-galactosyl epitope was expressed on the surface of a human cell line transduced with an expression vector containing this cDNA, and that this alpha-galactosyl epitope expression was reduced by alpha-galactosidase treatment. The open reading frame of the feline alpha1,3-GT cDNA is 1,113 base pairs in length and encodes 371 amino acids. The nucleotide sequence and the deduced amino acid sequence of the feline alpha1,3-GT gene are 88-90% and 85-87% similar, respectively, to the reported sequences of the bovine, porcine, marmoset and cebus monkey alpha1,3-GT genes, while they are 88% and 82-83% similar, respectively, to those of the orangutan and human alpha1,3-GT pseudogenes, and 81% and 77% similar, respectively, to the murine alpha1,3-GT gene. Thus, the alpha1,3-GT genes and pseudogenes of mammals are highly similar. Ratios of non-synonymous nucleotide changes among the primate pseudogenes as well as the primate genes are still higher than the ratios of non-primates, suggesting that the primate alpha1,3-GT genes tend to be divergent.

3.
To study reductive evolutionary processes in bacterial genomes, we examine sequences in the Rickettsia genomes which are unconstrained by selection and evolve as pseudogenes, one of which is the metK gene, which codes for AdoMet synthetase. Here, we sequenced the metK gene and three surrounding genes in eight different species of the genus Rickettsia. The metK gene was found to contain a high incidence of deletions in six lineages, while the three genes in its surroundings were functionally conserved in all eight lineages. A more drastic example of gene degradation was identified in the metK downstream region, which contained an open reading frame in Rickettsia felis. Remnants of this open reading frame could be reconstructed in five additional species by eliminating sites of frameshift mutations and termination codons. A detailed examination of the two reconstructed genes revealed that deletions strongly predominate over insertions and that there is a strong transition bias for point mutations which is coupled to an excess of GC-to-AT substitutions. Since the molecular evolution of these inactive genes should reflect the rates and patterns of neutral mutations, our results strongly suggest that there is a high spontaneous rate of deletions as well as a strong mutation bias toward AT pairs in the Rickettsia genomes. This may explain the low genomic G + C content (29%), the small genome size (1.1 Mb), and the high noncoding content (24%), as well as the presence of several pseudogenes in the Rickettsia prowazekii genome.

4.
Unannotated mammalian genome databases (dog, cow, opossum) were searched for candidate connexin genes, using sequences from annotated genomes (man, mouse). All 18 'multi-species' connexin genes, i.e., orthologs of connexin26, 29/31.3 (duplicated in opossum), 30, 30.2/31.9, 30.3, 31, 31.1, 32, 36, 37, 39/40.1, 40, 43, 45, 44/46, 47, 50, and 57/62, were found in dog, cow and opossum. Connexin25 and 58 have been considered specific for man, but evident orthologs of connexin25 were found in dog, cow and opossum, and orthologs of connexin58 were found in dog and cow. Moreover, a connexin43-like sequence (approx. 80% identical to connexin43) was found in man, chimpanzee, dog and cow. In the three former species, the sequences were located on the X chromosome. In man, chimpanzee and cow, there were stop codons in all reading frames; these sequences are therefore judged as pseudogenes, called here Cx43pX. In the dog, the sequence contained an open reading frame for a protein of 35.7 kDa (connexin35.7). We suggest that these sequences are orthologs of connexin33, previously considered as a rodent-specific connexin gene. Thus, connexin25, 33 and 58 are not species-specific genes. However, the opossum may possess a candidate, connexin39.2, without obvious orthologs in other mammals. Furthermore, pseudogenes of primate connexin31.3 and opossum connexin35 (one of the two orthologs of primate connexin31.3) were detected. These results suggest that the structure of the mammalian connexin gene family should be revised, especially with regard to the so-called 'species-specific' connexins.

5.
6.
cDNA clones representing the entire genome of human rhinovirus 2 have been obtained and used to determine the complete nucleotide sequence. The genome consists of 7102 nucleotides and possesses a long open reading frame of 6450 nucleotides; this reading frame is initiated 611 nucleotides from the 5' end and stops 42 nucleotides from the polyA tract. The N-terminal sequences of three of the viral capsid proteins have been elucidated, thus defining the positions of three cleavage sites on the polyprotein. The extensive amino acid sequence homology with poliovirus and human rhinovirus 14 enabled the other cleavage sites to be predicted. Cleavages in the 3' half of the molecule appear to take place predominantly at Gln-Gly pairs, whereas those in the 5' half (including the capsid proteins) are more heterogeneous.
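The coordinates quoted in this abstract can be checked with a few lines of arithmetic. The sketch below is illustrative only and is not taken from the paper: it assumes that "initiated 611 nucleotides from the 5' end" means the first nucleotide of the reading frame is at 1-based position 611, and that the 7102-nucleotide length excludes the polyA tract. The function name `genome_layout` is hypothetical.

```python
def genome_layout(genome_len, orf_start, orf_len):
    """Return (5' untranslated length, ORF end position, 3' untranslated length).

    Positions are 1-based and inclusive. This convention is an assumption;
    the abstract does not state how its coordinates are counted.
    """
    five_prime_utr = orf_start - 1            # nucleotides before the ORF
    orf_end = orf_start + orf_len - 1         # last nucleotide of the ORF
    three_prime_utr = genome_len - orf_end    # nucleotides after the ORF
    return five_prime_utr, orf_end, three_prime_utr


# Figures quoted in the abstract for human rhinovirus 2.
utr5, orf_end, utr3 = genome_layout(genome_len=7102, orf_start=611, orf_len=6450)
print(utr5, orf_end, utr3)
```

Under these assumptions, the computed 3' untranslated length agrees with the 42 nucleotides stated in the abstract; a different counting convention would shift the result by one.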

7.
We have isolated five genomic DNA clones which contain nucleotide sequences hybridizable to a cDNA for human ubiquinone-binding protein in Complex III (QP). Nucleotide sequence analysis revealed that two of them contained different types of pseudogenes suggesting molecular evolution of the gene, and that the other three clones contained the overlapping fragments from the same QP gene. The gene spans 4.5 to 5 kb in length. The sequences of exons in the gene were determined and found to be identical to the corresponding parts of the human QP cDNA. The exon-intron boundaries follow the GT/AG rule. Two CAAT boxes were found in the promoter region. It is concluded from these results that the isolated human QP gene is functional. Genomic Southern blot analysis showed that the gene is present in a single copy in the human genome.

8.
9.
The 78,101 base pair long sequence of a cluster of 22-kDa alpha zein genes in the maize inbred BSSS53 was determined. Each zein gene is contained within a repeat unit that varies in length. If such a repeat, or amplicon, is aligned along the entire sequence, a 10.5-fold sequence amplification is delineated. Because of insertions and deletions in intergenic regions, many of the zein genes are spaced over different distances. Only three out of 10 zein-related sequences have an intact open reading frame, indicating an unusually large number of genes unable to contribute to the accumulation of normal-size 22-kDa zein proteins. It is proposed that the seven remaining zein-related sequences be considered gene reserves because of their potential to be restored by gene conversion. Intergenic insertions in the cluster range from 1,098 to 14,896 base pairs. Although they are composed of transposable element sequences, they also contain additional open reading frames, two of them showing homology to rice cDNA sequences. The average amplicon is 4,423 base pairs long, with the sequence surrounding each zein gene more than 90% conserved. Coincidentally, the size of the amplicon is equivalent to the average gene density (one gene within 4,640 bp) in the Arabidopsis thaliana genome, one of the smallest in plants. Successive steps of amplification and insertion of DNA might explain to a certain degree how genome size variation has been generated in plants.

10.
The rat ribosomal protein L35a gene comprises a multigene family which contains 15-20 members as shown by the Southern blot analysis using L35a cDNA as a probe. We isolated 15 independent clones which contained distinct genes from a rat genomic library. Analysis of the restriction sites showed that all of them lacked the intervening sequences. Thermal stability of the hybrid molecules between these genes and the cDNA indicated that the similarity of the genes to the cDNA sequence varied. The nucleotide sequences of three genes gRL35a-A, gRL35a-B and gRL35a-G were determined. They shared some characteristics; namely: they lacked the intervening sequences, they contained (A)-rich tracts, and they were flanked by direct repeats. Two genes, gRL35a-A and gRL35a-B, contained a sequence completely identical to that of the cDNA. The nucleotide sequence of the 5' flanking region of gRL35a-B showed a significant homology with that of the same region of mouse ribosomal protein L32-related unmutated processed genes. Although this region of gRL35a-B contained the sequences homologous to the TATA box and the CCAAT box, gRL35a-B was not transcribed in an in vitro assay system. Thus, the L35a gene family comprises mostly processed pseudogenes. Further, Southern blot analysis in various animals indicated that the multigene construction of this ribosomal protein gene was a feature of mammalian genes. The origin and the evolutionary aspect of processed pseudogenes are discussed.

11.
12.
As a first step towards understanding the molecular mechanisms through which the expression of the gene (OAT) encoding ornithine aminotransferase (OAT) is regulated in a tissue-specific manner, we have used a near full length OAT cDNA to isolate related sequences from a rat genomic DNA library. Twenty-one unique clones representing five contigs and spanning approximately 140 kb of genomic DNA were isolated and characterized. From these clones we have identified a single expressed OAT gene and three processed pseudogenes. The comparison of the EcoRI, BamHI, and HindIII fragments contained within these genomic clones with those detected in total genomic DNA by the cDNA probe suggests that essentially all of the OAT-related sequences in the rat genome have been isolated. Thus, the tissue-specific regulation of OAT gene expression appears to be effected through a single expressed gene. Data are presented which suggest that the OAT-1, OAT-2, and OAT-3 pseudogenes arose approximately 28.5, 7.3, and 25.1 Myr ago, respectively. Mutation rates are presented for each codon position of the expressed rat and human OAT genes. The region of the rat genome flanking the boundary of the OAT-3 pseudogene is of additional interest as it shares considerable identity to sequences contained within expressed genes and flanking other processed pseudogenes.

13.
14.
C Magoulas, D A Hickey. Génome, 1992, 35(1): 133-139
Several cDNA and genomic clones were isolated from Drosophila melanogaster gene libraries by hybridization with a region of a mammalian gene that contains a simple repetitive sequence of six GCN repeats. One of the cDNA clones, E6, was completely sequenced and it was shown that it contains a region of 16 GCN repeats; these repeats encode a polyalanine stretch within a long open reading frame. The sequencing of three different genomic clones (A, B, and D) revealed that all the isolated Drosophila clones are similar to one another in a short region containing variable numbers of the GCN repeat. The genomic clone B was found to be the genomic counterpart of the cDNA clone E6. The other genomic clones, A and D, also hybridize with Drosophila cDNA clones at high stringency. These results indicate that the short GCN repetitive sequences, which we have named ala, are found within transcribed regions of the Drosophila genome. These Drosophila genes containing the ala repeat do not show significant sequence similarity to any presently known gene; we have named these novel genes ala-A, ala-B, and ala-D. The cDNA clone from gene ala-B was named ala-E6.

15.
J Y Tso, X H Sun, T H Kao, K S Reece, R Wu. Nucleic Acids Research, 1985, 13(7): 2485-2502
Full length cDNAs encoding the glycolytic enzyme glyceraldehyde-3-phosphate dehydrogenase (GAPDH) from rat and man have been isolated and sequenced. Many GAPDH gene-related sequences have been found in both genomes based on genomic blot hybridization analysis. Only one functional gene product is known. Results from genomic library screenings suggest that there are 300-400 copies of these sequences in the rat genome and approximately 100 in the human genome. Some of these related sequences have been shown to be processed pseudogenes. We have isolated several rat cDNA clones corresponding to these pseudogenes indicating that some pseudogenes are transcribed. Rat and human cDNAs are 89% homologous in the coding region, and 76% homologous in the first 100 base pairs of the 3'-noncoding region. Comparison of these two cDNA sequences with those of the chicken, Drosophila and yeast genes allows the analysis of the evolution of the GAPDH genes in detail.

16.
To identify additional members of the murine N-formyl-Met-Leu-Phe peptide receptor family (fMLF-R), a mouse macrophage cDNA library was screened using the open reading frame of murine N-formyl peptide receptor. Four individual hybridizing cDNA clones were maintained through tertiary screening. One cDNA clone was a truncated, polyadenylated version of the previously described murine-fMLF-R. The other three cDNA clones varied in length, but contained identical open reading frame sequences. One clone, 8C10, was selected for further study and shared 70% sequence identity with murine-fMLF-R and 89% sequence identity with murine lipoxin A4 receptor cDNA. When placed into the pcDNA-3 expression vector and cotransfected with Galpha16 cDNA into COS-1 cells, 8C10 cDNA induced the production of inositol-1,4,5-triphosphate when concentrations of 1-1600 nM lipoxin A4 (LXA4) were tested as ligands. Northern blot analysis of murine organs indicated that the 8C10 message is present in lung, spleen, and adipose tissue. Moreover, mice treated with LPS demonstrated increased expression of 8C10 message in spleen and adipose tissue, while showing a slight reduction in lung. We have also characterized the 8C10 structural gene from a 129Sv/J genomic library and have determined its size to be >6.1 kb in length and comprised of two exons separated by a 4.8-kb intron. Collectively, these data indicate that this homologue receptor is closely related to the murine LXA4 receptor and functionally responds to LXA4 as a ligand.

17.
Sequences in the human genome with homology to the murine mammary tumor virus (MMTV) pol gene were isolated from a human phage library. Ten clones with extensive pol homology were shown to define five separate loci. These loci share common sequences immediately adjacent to the pol-like segments and, in addition, contain a related repeat element which bounds this region. This organization is suggestive of a proviral structure. We estimate that the human genome contains 30 to 40 copies of these pol-related sequences. The pol region of one of the cloned segments (HM16) and the complete MMTV pol gene were sequenced and compared. The nucleotide homology between these pol sequences is 52% and is concentrated in the terminal regions. The MMTV pol gene contains a single long open reading frame encoding 899 amino acids and is demarcated from the partially overlapping putative gag gene by termination codons and a shift in translational reading frame. The pol sequence of HM16 is multiply terminated but does contain open reading frames which encode 370, 105, and 112 amino acid residues in separate reading frames. We deduced a composite pol protein sequence for HM16 by aligning it to the MMTV pol gene and then compared these sequences with other retroviral pol protein sequences. Conserved sequences occur in both the amino and carboxyl regions which lie within the polymerase and endonuclease domains of pol, respectively.

18.
A "gene-island" sequencing strategy has been developed that expedites the targeted acquisition of orthologous gene sequences from related species for comparative genome analysis. A 152-kb bacterial artificial chromosome (BAC) clone from sorghum (Sorghum bicolor) encoding phytochrome A (PHYA) was fully sequenced, revealing 16 open reading frames with a gene density similar to many regions of the rice (Oryza sativa) genome. The sequences of genes in the orthologous region of the maize (Zea mays) and rice genomes were obtained using the gene-island sequencing method. BAC clones containing the orthologous maize and rice PHYA genes were identified, sheared, subcloned, and probed with the sorghum PHYA-containing BAC DNA. Sequence analysis revealed that approximately 75% of the cross-hybridizing subclones contained sequences orthologous to those within the sorghum PHYA BAC and less than 25% contained repetitive and/or BAC vector DNA sequences. The complete sequence of four genes, including up to 1 kb of their promoter regions, was identified in the maize PHYA BAC. Nine orthologous gene sequences were identified in the rice PHYA BAC. Sequence comparison of the orthologous sorghum and maize genes aided in the identification of exons and conserved regulatory sequences flanking each open reading frame. Within genomic regions where micro-colinearity of genes is absolutely conserved, gene-island sequencing is a particularly useful tool for comparative analysis of genomes between related species.

19.
This report provides the complete nucleotide sequences of the full-length cDNA encoding squalene synthase (SQS) and its genomic DNA sequence from a triterpene-producing fungus, Ganoderma lucidum. The cDNA of the squalene synthase (SQS) (GenBank Accession Number: DQ494674) was found to contain an open reading frame (ORF) of 1,404 bp encoding a 468-amino-acid polypeptide, whereas the SQS genomic DNA sequence (GenBank Accession Number: DQ494675) consisted of 1,984 bp and contained four exons and three introns. Only one gene copy was present in the G. lucidum genome. The deduced amino acid sequence of Ganoderma lucidum squalene synthase (Gl-SQS) exhibited a high homology with other fungal squalene synthase genes and contained six conserved domains. A phylogenetic analysis revealed that G. lucidum SQS belonged to the fungi SQS group, and was more closely related to the SQS of U. maydis than to those of other fungi. A gene expression analysis showed that the expression level was relatively low in mycelia incubated for 12 days, increased after 14 to 20 days of incubation, and reached a relatively high level in the mushroom primordia. Functional complementation of Gl-SQS in a SQS-deficient strain of Saccharomyces cerevisiae confirmed that the cloned cDNA encoded a squalene synthase.
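The relationship between ORF length and polypeptide length quoted above is a division by three. The following is a minimal sketch, not the authors' method; the helper name `codons_in_orf` is hypothetical. Whether the quoted ORF length counts the stop codon is not stated in the abstract, so the function returns only the number of codons.

```python
def codons_in_orf(orf_len_bp):
    """Number of complete codons in an ORF of the given length in base pairs."""
    if orf_len_bp % 3 != 0:
        # An intact reading frame should be a whole number of codons.
        raise ValueError("ORF length is not a multiple of three")
    return orf_len_bp // 3


# 1,404 bp is the G. lucidum SQS ORF length quoted in this entry.
print(codons_in_orf(1404))
# 1,113 bp is the feline alpha1,3-GT ORF length quoted in entry 2.
print(codons_in_orf(1113))
```

Both codon counts equal the amino acid counts reported in the respective abstracts (468 and 371), which suggests, as an inference rather than a stated fact, that the quoted lengths do not include a stop codon.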

20.