Similar literature
1.
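Tian Tian, Yuan Huan, Chen Bin 《Acta Entomologica Sinica》2020,63(8):1016-1027
[Aim] To clarify the basic characteristics of the mitochondrial genomes of the aquatic groups of Adephaga, and to analyze the phylogenetic relationships of these groups based on mitochondrial genome sequences. [Methods] The complete mitochondrial genomes of Dineutus mellyi and Eretes sticticus were sequenced with Illumina HiSeq X Ten technology and annotated, and the secondary structures of their tRNA genes were predicted. Together with the published mitochondrial genome sequences of 17 species of aquatic Adephaga (Coleoptera), the protein-coding genes (PCGs) of the mitochondrial genomes of these 19 species were subjected to comparative genomic analysis, including AT content, codon usage bias, and selection pressure. Based on the amino acid and nucleotide sequences of the 13 PCGs, the phylogenetic relationships of aquatic Adephaga were reconstructed using the maximum likelihood (ML) and Bayesian inference (BI) methods, and the phylogenetic positions of Noteridae and Meruidae were further evaluated by FcLM analysis. [Results] The complete mitochondrial genomes of D. mellyi and E. sticticus are 16 123 bp (GenBank accession no.: MN781126) and 16 196 bp (GenBank accession no.: MN781132) in length, respectively, and each contains 13 PCGs, 22 tRNA genes, 2 rRNA genes, and 1 D-loop region (control region). The base composition of the PCGs in all 19 mitochondrial genomes shows an A+T bias, and codon usage likewise favors A+T-rich codons; the 13 PCGs share the same evolutionary pattern, and all are under purifying selection. The phylogenetic relationships of aquatic Adephaga based on the amino acid sequences of the 13 mitochondrial PCGs are (Gyrinidae+(Haliplidae+((Aspidytidae+(Amphizoidae+Dytiscidae))+(Hygrobiidae+(Meruidae+Noteridae))))). [Conclusion] The results indicate that Gyrinidae is the basal group of aquatic Adephaga, followed by Haliplidae and Dytiscoidea; Noteridae and Meruidae are sister groups and together form a clade within Dytiscoidea; and Amphizoidae is more closely related to Dytiscidae.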

2.
Okayasu T  Sorimachi K 《Amino acids》2009,36(2):261-271
We recently classified 23 bacteria into two types based on their complete genomes; “S-type” as represented by Staphylococcus aureus and “E-type” as represented by Escherichia coli. Classification was characterized by concentrations of Arg, Ala or Lys in the amino acid composition calculated from the complete genome. Based on these previous classifications, not only prokaryotic but also eukaryotic genome structures were investigated by amino acid compositions and nucleotide contents. Organisms consisting of 112 bacteria, 15 archaea and 18 eukaryotes were classified into two major groups by cluster analysis using GC contents at the three codon positions calculated from complete genomes. The 145 organisms were classified into “AT-type” and “GC-type” represented by high A or T (low G or C) and high G or C (low A or T) contents, respectively, at every third codon position. Reciprocal changes between G or C and A or T contents at the third codon position occurred almost synchronously in every codon among the organisms. Correlations between amino acid concentrations (Ala, Ile and Lys) and the nucleotide contents at the codon position were obtained in both “AT-type” and “GC-type” organisms, but with different regression coefficients. In certain correlations of amino acid concentrations with GC contents, eukaryotes, archaea and bacteria showed different behaviors; thus these kingdoms evolved differently. All organisms are basically classifiable into two groups having characteristic codon patterns; organisms with low GC and high AT contents at the third codon position and their derivatives, and organisms with an inverse relationship.  相似文献   
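The quantity underlying the "AT-type"/"GC-type" grouping above, G+C content at each of the three codon positions, can be illustrated with a minimal Python sketch. The function name `gc_by_codon_position` is hypothetical and not taken from the paper; the sketch handles a single in-frame coding sequence and does not reproduce the authors' cluster analysis.

```python
def gc_by_codon_position(cds: str) -> tuple:
    """Return the G+C fraction at codon positions 1, 2 and 3 of one coding sequence.

    Assumes `cds` is in frame, starting at the first base of a codon.
    """
    seq = cds.upper()
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    fractions = []
    for pos in range(3):
        bases = seq[pos:usable:3]  # every third base, starting at this codon position
        gc = sum(base in "GC" for base in bases)
        fractions.append(gc / len(bases) if bases else 0.0)
    return tuple(fractions)


if __name__ == "__main__":
    # Codons ATG and GCC: position 1 = A,G; position 2 = T,C; position 3 = G,C
    print(gc_by_codon_position("ATGGCC"))
```

Averaging the third-position value over all genes in a genome would give a per-genome figure of the kind the abstract describes; how the authors weighted or clustered those values is not stated here.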

3.
We analyzed the nucleotide contents of several completely sequenced genomes, and we show that nucleotide bias can have a dramatic effect on the amino acid composition of the encoded proteins. By surveying the genes in 21 completely sequenced eubacterial and archaeal genomes, along with the entire Saccharomyces cerevisiae genome and two Plasmodium falciparum chromosomes, we show that biased DNA encodes biased proteins on a genomewide scale. The predicted bias affects virtually all genes within the genome, and it could be clearly seen even when we limited the analysis to sets of homologous gene sequences. Parallel patterns of compositional bias were found within the archaea and the eubacteria. We also found a positive correlation between the degree of amino acid bias and the magnitude of protein sequence divergence. We conclude that mutational bias can have a major effect on the molecular evolution of proteins. These results could have important implications for the interpretation of protein-based molecular phylogenies and for the inference of functional protein adaptation from comparative sequence data.  相似文献   

4.
Methods to infer the ancestral conditions of life are commonly based on geological and paleontological analyses. Recently, several studies used genome sequences to gain information about past ecological conditions taking advantage of the property that the G+C and amino acid contents of bacterial and archaeal ribosomal DNA genes and proteins, respectively, are strongly influenced by the environmental temperature. The adaptation to optimal growth temperature (OGT) since the Last Universal Common Ancestor (LUCA) over the universal tree of life was examined, and it was concluded that LUCA was likely to have been a mesophilic organism and that a parallel adaptation to high temperature occurred independently along the two lineages leading to the ancestors of Bacteria on one side and of Archaea and Eukarya on the other side. Here, we focus on Archaea to gain a precise view of the adaptation to OGT over time in this domain. It has been often proposed on the basis of indirect evidence that the last archaeal common ancestor was a hyperthermophilic organism. Moreover, many results showed the influence of environmental temperature on the evolutionary dynamics of archaeal genomes: Thermophilic organisms generally display lower evolutionary rates than mesophiles. However, to our knowledge, no study tried to explain the differences of evolutionary rates for the entire archaeal domain and to investigate the evolution of substitution rates over time. A comprehensive archaeal phylogeny and a non homogeneous model of the molecular evolutionary process allowed us to estimate ancestral base and amino acid compositions and OGTs at each internal node of the archaeal phylogenetic tree. The last archaeal common ancestor is predicted to have been hyperthermophilic and adaptations to cooler environments can be observed for extant mesophilic species. 
Furthermore, mesophilic species present both long branches and high variation of nucleotide and amino acid compositions since the last archaeal common ancestor. The increase of substitution rates observed in mesophilic lineages along all their branches can be interpreted as an ongoing adaptation to colder temperatures and to new metabolisms. We conclude that environmental temperature is a major factor that governs evolutionary rates in Archaea.

5.
Order Chiroptera is a unique group of mammals whose members have attained self-powered flight as their main mode of locomotion. Much speculation persists regarding bat evolution; however, lack of sufficient molecular data hampers evolutionary and conservation studies. Of ~ 1200 species, complete mitochondrial genome sequences are available for only eleven. Additional sequences should be generated if we are to resolve many questions concerning these fascinating mammals. Herein, we describe the complete mitochondrial genomes of three bats: Corynorhinus rafinesquii, Lasiurus borealis and Artibeus lituratus. We also compare the currently available mitochondrial genomes and analyze codon usage in Chiroptera. C. rafinesquii, L. borealis and A. lituratus mitochondrial genomes are 16438 bp, 17048 bp and 16709 bp, respectively. Genome organization and gene arrangements are similar to other bats. Phylogenetic analyses using complete mitochondrial genome sequences support previously established phylogenetic relationships and suggest utility in future studies focusing on the evolutionary aspects of these species. Comprehensive analyses of available bat mitochondrial genomes reveal distinct nucleotide patterns and synonymous codon preferences corresponding to different chiropteran families. These patterns suggest that mutational and selection forces are acting to different extents within Chiroptera and shape their mitochondrial genomes.  相似文献   

6.
The purpose of this research was to search for evolutionarily conserved fungal sequences to test the hypothesis that fungi have a set of core genes that are not found in other organisms, as these genes may indicate what makes fungi different from other organisms. By comparing 6355 predicted or known yeast (Saccharomyces cerevisiae) genes to the genomes of 13 other fungi using Standalone TBLASTN at an e-value <1E-5, a list of 3340 yeast genes was obtained with homologs present in at least 12 of 14 fungal genomes. By comparing these common fungal genes to complete genomes of animals (Fugu rubripes, Caenorhabditis elegans), plants (Arabidopsis thaliana, Oryza sativa), and bacteria (Agrobacterium tumefaciens, Xylella fastidiosa), a list of common fungal genes with homologs in these plants, animals, and bacteria was produced (938 genes), as well as a list of exclusively fungal genes without homologs in these other genomes (60 genes). To ensure that the 60 genes were exclusively fungal, these were compared using TBLASTN to the major sequence databases at GenBank: NR (nonredundant), EST (expressed sequence tags), GSS (genome survey sequences), and HTGS (unfinished high-throughput genome sequences). This resulted in 17 yeast genes with homologs in other fungal genomes, but without known homologs in other organisms. These 17 core fungal genes were not found to differ from other yeast genes in GC content or codon usage patterns. More intensive study is required of these 17 genes and other common fungal genes to discover unique features of fungi compared to other organisms. Reviewing Editor: Prof. David Gottman

7.
Mouse, chicken and Xenopus laevis homologues to rig (rat insulinoma gene) cDNA were isolated and their nucleotide sequences were determined. Each homologue encoded a 145-amino acid protein; the amino acid sequence remained invariant in the murine and avian genes, and there were only 6 amino acid substitutions in the salientian gene. The evolutionary rate calculated for rig mRNA was sufficiently low to be viewed as evidence that rig is vital to vertebrate species. Southern blot analysis indicated that haploid sets of the mammalian genomes contain several copies of rig or rig-related sequences, whereas there appeared to be only one copy in the amphibian and bird genomes. The possibility that rig belongs to the class of housekeeping genes is discussed.  相似文献   

8.
Echinostoma hortense (Digenea: Echinostomatidae) is one of the intestinal flukes with medical importance in humans. However, the mitochondrial (mt) genome of this fluke has not been known yet. The present study has determined the complete mt genome sequences of E. hortense and assessed the phylogenetic relationships with other digenean species for which the complete mt genome sequences are available in GenBank using concatenated amino acid sequences inferred from 12 protein-coding genes. The mt genome of E. hortense contained 12 protein-coding genes, 22 transfer RNA genes, 2 ribosomal RNA genes, and 1 non-coding region. The length of the mt genome of E. hortense was 14,994 bp, which was somewhat smaller than those of other trematode species. Phylogenetic analyses based on concatenated nucleotide sequence datasets for all 12 protein-coding genes using maximum parsimony (MP) method showed that E. hortense and Hypoderaeum conoideum gathered together, and they were closer to each other than to Fasciolidae and other echinostomatid trematodes. The availability of the complete mt genome sequences of E. hortense provides important genetic markers for diagnostics, population genetics, and evolutionary studies of digeneans.  相似文献   

9.

Background  

It has been suggested previously that genome and proteome sequences show characteristics typical of natural-language texts such as "signature-style" word usage indicative of authors or topics, and that the algorithms originally developed for natural language processing may therefore be applied to genome sequences to draw biologically relevant conclusions. Following this approach of 'biological language modeling', statistical n-gram analysis has been applied for comparative analysis of whole proteome sequences of 44 organisms. It has been shown that a few particular amino acid n-grams are found in abundance in one organism but occurring very rarely in other organisms, thereby serving as genome signatures. At that time proteomes of only 44 organisms were available, thereby limiting the generalization of this hypothesis. Today nearly 1,000 genome sequences and corresponding translated sequences are available, making it feasible to test the existence of biological language models over the evolutionary tree.  相似文献   
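As a rough illustration of the n-gram idea described above, the following Python sketch counts overlapping amino acid n-grams in a proteome and picks out those that are frequent in one organism but rare in the others. The function names and the thresholds `min_count` and `max_other` are hypothetical choices for this example, not values from the study.

```python
from collections import Counter


def ngram_counts(proteins, n=3):
    """Count overlapping amino acid n-grams across a list of protein sequences."""
    counts = Counter()
    for protein in proteins:
        seq = protein.upper()
        for i in range(len(seq) - n + 1):
            counts[seq[i:i + n]] += 1
    return counts


def signature_ngrams(target, others, min_count=2, max_other=0):
    """Return n-grams seen at least `min_count` times in `target` and at most
    `max_other` times in every Counter in `others` (illustrative thresholds)."""
    return sorted(
        gram
        for gram, count in target.items()
        if count >= min_count and all(other[gram] <= max_other for other in others)
    )


if __name__ == "__main__":
    organism_a = ngram_counts(["MKKKA", "KKKL"])
    organism_b = ngram_counts(["MKKAL"])
    print(signature_ngrams(organism_a, [organism_b]))
```

Raw counts are used here for simplicity; a comparison across proteomes of very different sizes would more plausibly use normalized frequencies.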

10.
Goto N  Kurokawa K  Yasunaga T 《Gene》2007,401(1-2):172-180
To date, the complete genome sequences of more than 250 organisms have been determined. This information can now be used to determine whether there exist any invariant sequences that are conserved among all organisms, from bacteria to plants, animals, and humans. The existence of invariant sequences would strongly suggest that these sequences have been inherited unchanged from the last common ancestor of all life, and that they have essential functions. We have developed a new software program to identify invariant sequences conserved among the currently sequenced genomes and applied this analysis to the complete genome sequences of 266 organisms. We have identified 3 invariant DNA sequences longer than or equal to 11 bp and 6 invariant amino acid sequences longer than or equal to 6 aa. The longest invariant DNA sequence, AAGTCGTAACAAGGT (15 bp), was found in the 16S/18S rRNA gene. Two 8 aa sequences, GHVDHGKT in IF2 and EF-Tu and DTPGHVDF in EF-G, were the longest invariant amino acid sequences detected. These sequences could be essential elements from the genome of the last common ancestor and may have remained unchanged throughout evolution.
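The abstract does not describe how the authors' program works. One naive way to find sequences shared by every genome, shown here only as a sketch on toy strings, is to intersect the sets of k-mers from each sequence. The names `kmers` and `invariant_kmers` are hypothetical, and the sketch ignores reverse complements and the memory cost of real genomes.

```python
def kmers(seq, k):
    """Return the set of all overlapping substrings of length k in seq."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def invariant_kmers(sequences, k):
    """Return the k-mers present in every sequence (forward strand only)."""
    if not sequences:
        return set()
    shared = kmers(sequences[0], k)
    for seq in sequences[1:]:
        shared &= kmers(seq, k)
        if not shared:  # nothing left to intersect, stop early
            break
    return shared


if __name__ == "__main__":
    toy_genomes = ["ACGTACGTTT", "GGACGTACGA", "TACGTACG"]
    print(sorted(invariant_kmers(toy_genomes, 6)))
```

Running the search from large k downward and stopping at the first non-empty result would give the longest shared sequence under these simplifying assumptions.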

11.
In the present study, the complete mitochondrial DNA (mtDNA) sequences of the pig nodule worm Oesophagostomum quadrispinulatum were determined for the first time, and the mt genome of Oesophagostomum dentatum from China was also sequenced for comparative analysis of their gene contents and genome organizations. The mtDNA sequences of O. dentatum China isolate and O. quadrispinulatum were 13,752 and 13,681 bp in size, respectively. Each of the two mt genomes comprises 36 genes, including 12 protein-coding genes, two ribosomal RNA and 22 transfer RNA genes, but lacks the ATP synthetase subunit 8 gene. All genes are transcribed in the same direction and have a nucleotide composition high in A and T. The contents of A+T are 75.79% and 77.52% for the mt genomes of O. dentatum and O. quadrispinulatum, respectively. Phylogenetic analyses using concatenated amino acid sequences of the 12 protein-coding genes, with three different computational algorithms (maximum likelihood, maximum parsimony and Bayesian inference), all revealed that O. dentatum and O. quadrispinulatum represent distinct but closely-related species. These data provide novel and useful markers for studying the systematics, population genetics and molecular diagnosis of the two pig nodule worms.  相似文献   

12.
Genomic trees have been constructed based on the presence and absence of families of protein-encoding genes observed in 27 complete genomes, including genomes of 15 free-living organisms. This method does not rely on the identification of suspected orthologs in each genome, nor the specific alignment used to compare gene sequences because the protein-encoding gene families are formed by grouping any protein with a pairwise similarity score greater than a preset value. Because of this all inclusive grouping, this method is resilient to some effects of lateral gene transfer because transfers of genes are masked when the recipient genome already has a homolog (not necessarily an ortholog) of the incoming gene. Of 71 genes suspected to have been laterally transferred to the genome of Aeropyrum pernix, only approximately 7 to 15 represent genes where a lateral gene transfer appears to have generated homoplasy in our character dataset. The genomic tree of the 15 free-living taxa includes six different bacterial orders, six different archaeal orders, and two different eukaryotic kingdoms. The results are remarkably similar to results obtained by analysis of rRNA. Inclusion of the other 12 genomes resulted in a tree only broadly similar to that suggested by rRNA with at least some of the differences due to artifacts caused by the small genome size of many of these species. Very small genomes, such as those of the two Mycoplasma genomes included, fall to the base of the Bacterial domain, a result expected due to the substantial gene loss inherent to these lineages. Finally, artificial "partial genomes" were generated by randomly selecting ORFs from the complete genomes in order to test our ability to recover the tree generated by the whole genome sequences when only partial data are available. The results indicated that partial genomic data, when sampled randomly, could robustly recover the tree generated by the whole genome sequences.
Received: 30 May 2001 / Accepted: 10 October 2001
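The grouping step described above (any two proteins whose pairwise similarity score exceeds a preset value fall into the same family) can be sketched as single-linkage clustering with a union-find structure, followed by a presence/absence table per genome. This is an assumption about one reasonable reading of the abstract, not the authors' code; `gene_families`, `presence_absence`, the score scale, and the threshold are all hypothetical.

```python
def gene_families(scored_pairs, threshold):
    """Group proteins into families by single linkage.

    scored_pairs: iterable of (protein_a, protein_b, similarity_score).
    Two proteins are joined when their score is greater than `threshold`.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b, score in scored_pairs:
        root_a, root_b = find(a), find(b)  # registers both proteins
        if score > threshold and root_a != root_b:
            parent[root_a] = root_b

    families = {}
    for protein in list(parent):
        families.setdefault(find(protein), set()).add(protein)
    # Sort for a reproducible column order in the character matrix
    return sorted(families.values(), key=min)


def presence_absence(families, genome_of):
    """Return {genome: [0/1 per family]}; genome_of maps protein -> genome."""
    genomes = sorted(set(genome_of.values()))
    return {
        genome: [int(any(genome_of[p] == genome for p in fam)) for fam in families]
        for genome in genomes
    }


if __name__ == "__main__":
    pairs = [("a1", "b1", 90), ("b1", "c1", 80), ("a2", "b2", 30)]
    owner = {"a1": "A", "a2": "A", "b1": "B", "b2": "B", "c1": "C"}
    fams = gene_families(pairs, threshold=50)
    print(presence_absence(fams, owner))
```

The resulting 0/1 rows are the kind of character data from which a tree could then be inferred; the tree-building method itself is outside this sketch.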

13.
Mitochondrial (mt) genome sequences provide useful markers for investigating population genetic structures, systematics and phylogenetics of organisms. Although Taenia multiceps, T. hydatigena, and T. taeniaeformis are common taeniid tapeworms of ruminants, pigs, dogs, or cats, causing significant economic losses, no published study on their mt genomes is available. The complete mt genomes of T. multiceps, T. hydatigena, and T. taeniaeformis were amplified in two overlapping fragments and then sequenced. The sizes of the entire mt genome were 13700 bp for T. multiceps, 13489 bp for T. hydatigena, and 13647 bp for T. taeniaeformis. Each of the three genomes contains 36 genes, consisting of 12 genes for proteins, 2 genes for rRNA, and 22 genes for tRNA, which are the same as the mt genomes of all other cestode species studied to date. All genes are transcribed in the same direction and have a nucleotide composition high in A and T. The contents of A+T of the complete genomes are 71.3% for T. multiceps, 70.8% for T. hydatigena, and 73.0% for T. taeniaeformis. The AT bias had a significant effect on both the codon usage pattern and amino acid composition of proteins. T. multiceps and T. hydatigena had two noncoding regions, but T. taeniaeformis had only one. Phylogenetic analyses based on concatenated amino acid sequences of 12 protein-coding genes revealed that T. multiceps, T. hydatigena, and T. taeniaeformis were more closely related to the other members of the Taenia genus, consistent with results of previous morphological and molecular studies. The present study determined the complete mt genome sequences for three Taenia species of animal and human health significance, providing useful markers for studying the systematics, population genetics, and molecular epidemiology of these cestode parasites of animals and humans.  相似文献   

14.
The complete nucleotide sequences of the genomes of the type 2 (P712, Ch, 2ab) and type 3 (Leon 12a1b) poliovirus vaccine strains were determined. Comparison of the sequences with the previously established genome sequence of type 1 (LS-c, 2ab) poliovirus vaccine strain revealed that 71% of the nucleotides in the genome RNAs were common, that the 5' and 3' termini of the genomes were highly homologous, and that more than 80% of the nucleotide differences in the coding region occurred in the third letter position of in-phase codons, resulting in a low frequency of amino acid difference. These results strongly suggested that the serotypes of poliovirus derived from a common prototype. A comparison of the amino acid sequences predicted from the genome sequences showed highest variation in the capsid protein region, whereas non-structural proteins are highly conserved. Initiation of polyprotein synthesis occurs in all three strains more than 740 nucleotides downstream from the 5' end. An analysis of the non-coding region suggests that small peptides that could potentially originate from this region are conserved. The amino acid sequences immediately surrounding the cleavage signals, however, show a higher than average degree of variation. The analysis of the amino acid sequences of the capsid protein VP1 of all serotypes has led to the prediction of potential antigenic sites on the virion involved in neutralization.

15.
16.
Mycobacteriophages are viruses that infect mycobacterial hosts. Expansion of a collection of sequenced phage genomes to a total of 60—all infecting a common bacterial host—provides further insight into their diversity and evolution. Of the 60 phage genomes, 55 can be grouped into nine clusters according to their nucleotide sequence similarities, 5 of which can be further divided into subclusters; 5 genomes do not cluster with other phages. The sequence diversity between genomes within a cluster varies greatly; for example, the 6 genomes in Cluster D share more than 97.5% average nucleotide similarity with one another. In contrast, similarity between the 2 genomes in Cluster I is barely detectable by diagonal plot analysis. In total, 6858 predicted open-reading frames have been grouped into 1523 phamilies (phams) of related sequences, 46% of which possess only a single member. Only 18.8% of the phams have sequence similarity to non-mycobacteriophage database entries, and fewer than 10% of all phams can be assigned functions based on database searching or synteny. Genome clustering facilitates the identification of genes that are in greatest genetic flux and are more likely to have been exchanged horizontally in relatively recent evolutionary time. Although mycobacteriophage genes exhibit a smaller average size than genes of their host (205 residues compared with 315), phage genes in higher flux average only 100 amino acids, suggesting that the primary units of genetic exchange correspond to single protein domains.  相似文献   

17.
Tekaia F  Yeramian E  Dujon B 《Gene》2002,297(1-2):51-60
Can we infer the lifestyle of an organism from the characteristic properties of its genome? More precisely, what are the relations between easily quantifiable properties from genomic sequences, such as amino-acid compositions, and more subtle characteristics concerning for example lifestyles or evolutionary trends? Here, we seek a global picture for such properties, based on a large number (56) of complete genomes, including significant numbers of representatives from the three domains of life. We consider the amino acid compositions of the predicted proteomes, and we use correspondence analysis, as a multivariate method to extract the relevant information from the large-scale data. From these analyses we derive a series of conclusions, concerning lifestyles, as well as physico-chemical and evolutionary trends: (1) correspondence analysis of the amino acid compositions permits discrimination between the three known lifestyles (mesophily/thermophily/hyperthermophily). (2) For various organisms, amino-acid composition properties are essentially driven by GC content, and to a significantly lesser extent by growth temperatures associated with lifestyles. Roughly speaking, the respective contributions of these two components are 57 and 20%. It is notable that these proportions are essentially unchanged with respect to a previous analysis (Nature 393 (1998) 537), which involved only 15 genomes, available at the time. (3) In terms of amino acid compositional biases, two specific 'signatures' for thermophily (in a broad sense, including hyperthermophily) can be detected. First, thermophilic species display a relative abundance in glutamic acid (Glu), concomitantly with the depletion in glutamine. Second, in thermophilic species, the relative abundance in Glu (negative charge) is significantly correlated (Pearson correlation coefficient r=0.83 with P<0.0001), with the increase in the lumped 'pool' lysine+arginine (positive charges). 
This correlation (absent in mesophiles) could be interpreted on a physico-chemical basis, relevant to the thermostability of proteins. (4) Statistically significant differences are observed between the average lengths of the genes in the surveyed species, which follow their distribution between the three domains of life. Also a significant difference is observed between the average lengths of thermophilic (283.0 ± 5.8) versus mesophilic (340 ± 9.4) genes. It is thus possible that the 'general' shortening of the primary sequences in thermophilic proteins plays a role in thermostability. (5) Considering various combinations of conservation properties (genes conserved exclusively in eukaryotes, in archaea, in bacteria, in combinations of two domains, etc.) correspondence analysis reveals a trend towards thermophilic-hyperthermophilic profiles for the most conserved subset of genes (ancient genes). (6) When limited to the subset of species-specific genes, correspondence analysis leads to a different picture for the clustering of genomes following amino-acid compositions: for example, the 'core' specific part of a genome can bear lifestyle signatures different from those of the complete genome. Various results are discussed both on methodological and biological grounds. The evolutionary perspectives opened by our analyses are noted.

18.
Thirty-nine human parainfluenza type 1 (HPIV-1) genomes were sequenced from samples collected in Milwaukee, Wisconsin from 1997–2010. Following sequencing, phylogenetic analyses of these sequences plus any publicly available HPIV-1 sequences (from GenBank) were performed. Phylogenetic analysis of the whole genomes, as well as individual genes, revealed that the current HPIV-1 viruses group into three different clades. Previous evolutionary studies of HPIV-1 in Milwaukee revealed that there were two genotypes of HPIV-1 co-circulating in 1991 (previously described as HPIV-1 genotypes C and D). The current study reveals that there are still two different HPIV-1 viruses co-circulating in Milwaukee; however, both groups of HPIV-1 viruses are derived from genotype C, indicating that genotype D may no longer be in circulation in Milwaukee. Analyses of genetic diversity indicate that while most of the genome is under purifying selection, some regions of the genome are more tolerant of mutation. In the 39 HPIV-1 genomes sequenced in this study, the nucleotide sequence of the L gene is the most conserved while the sequence of the P gene is the most variable. Over the entire protein coding region of the genome, 81 variable amino acid residues were observed and, as with nucleotide diversity, the P protein seemed to be the most tolerant of mutation (and contains the greatest proportion of non-synonymous to synonymous substitutions) while the M protein appears to be the least tolerant of amino acid substitution.

19.
Breton S  Burger G  Stewart DT  Blier PU 《Genetics》2006,172(2):1107-1119
Marine mussels of the genus Mytilus have an unusual mode of mitochondrial DNA (mtDNA) transmission termed doubly uniparental inheritance (DUI). Female mussels are homoplasmic for the F mitotype, which is inherited maternally, while males are usually heteroplasmic, carrying a mixture of the maternal F mitotype and the paternally inherited M genome. Two classes of M genomes have been observed: "standard" M genomes and "recently masculinized" M genomes. The latter are more similar to F genomes at the sequence level but are transmitted paternally like standard M genomes. In this study we report the complete sequences of two standard male M. edulis and one recently masculinized male M. trossulus mitochondrial genome. A comparative analysis, including the previously sequenced M. edulis F and M. galloprovincialis F and M mtDNAs, reveals that these genomes are identical in gene order, but highly divergent in nucleotide and amino acid sequence. The large amount (>20%) of nucleotide substitutions that fall in coding regions implies that there are several amino acid replacements between the F and M genomes, which likely have an impact on the structural and functional properties of the mitochondrial proteome. Correlation of the divergence rate of different protein-coding genes indicates that mtDNA-encoded proteins of the M genome are still under selective constraints, although less highly than genes of the F genome. The mosaic F/M control region of the masculinized F genome provides evidence for lineage-specific sequences that may be responsible for the different mode of transmission genetics. This analysis shows the value of comparative genomics to better understand the mechanisms of maintenance and segregation of mtDNA sequence variants in mytilid mussels.  相似文献   

20.
By analyses of short DNA sequences, we have deduced the overall arrangement of genes in the (A + T)-rich coding sequences of herpesvirus saimiri (HVS) relative to the arrangements of homologous genes in the (G + C)-rich coding sequences of the Epstein-Barr virus (EBV) genome and the (A + T)-rich sequences of the varicella-zoster virus (VZV) genome. Fragments of HVS DNA from 13 separate sites within the 111 kilobase pairs of the light DNA coding sequences of the genome were subcloned into M13 vectors, and sequences of up to 350 bases were determined from each of these sites. Amino acid sequences predicted for fragments of open reading frames defined by these sequences were compared with a library of the protein sequences of major open reading frames predicted from the complete DNA sequences of VZV and EBV. Of the 13 short amino acid sequences obtained from HVS, only 3 were recognizably homologous to proteins encoded by VZV, but all 13 HVS sequences were unambiguously homologous to gene products encoded by EBV. The HVS reading frames identified by this method included homologs of the major capsid polypeptides, glycoprotein H, the major nonstructural DNA-binding protein, thymidine kinase, and the homolog of the regulatory gene product of the BMLF1 reading frame of EBV. Locally as well as globally, the order and relative orientation of these genes resembled that of their homologs on the EBV genome. Despite the major differences in their nucleotide compositions and in the nature and arrangements of reiterated DNA sequences, the genomes of the lymphotropic herpesviruses HVS and EBV encode closely related proteins, and they share a common organization of these coding sequences which differs from that of the neurotropic herpesviruses, VZV and herpes simplex virus.  相似文献   
