Similar literature
1.
2.
The nucleotide composition of genomes undergoes dramatic variations among all three kingdoms of life. GC content, an important characteristic for a genome, is related to many important functions, and therefore GC content and its distribution are routinely reported for sequenced genomes. Traditionally, GC content distribution is assessed by computing GC contents in windows that slide along the genome. Disadvantages of this routinely used window-based method include low resolution and low sensitivity. Additionally, different window sizes result in different GC content distribution patterns within the same genome. We proposed a windowless method, the GC profile, for displaying GC content variations across the genome. Compared to the window-based method, the GC profile has the following advantages: 1) higher sensitivity, because of variation-amplifying procedures; 2) higher resolution, because boundaries between domains can be determined at one single base pair; 3) uniqueness, because the GC profile is unique for a given genome; and 4) the capacity to show both global and regional GC content distributions. These characteristics are useful in identifying horizontally transferred genomic islands and homogeneous GC-content domains. Here, we review the applications of the GC profile in identifying genomic islands and genome segmentation points, and in serving as a platform to integrate with other algorithms for genome analysis. A web server generating GC profiles and implementing relevant genome segmentation algorithms is available at: www.zcurve.net.
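
The window-size dependence mentioned in this abstract can be illustrated with a minimal sliding-window sketch. This is not the GC profile method itself; the function names `gc_content` and `windowed_gc` and the toy sequence are assumptions made for illustration only.

```python
def gc_content(seq):
    """Fraction of G or C bases in seq (case-insensitive); 0.0 for empty input."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def windowed_gc(seq, window, step):
    """Return (start, GC fraction) for each full window of length `window`."""
    return [
        (start, gc_content(seq[start:start + window]))
        for start in range(0, len(seq) - window + 1, step)
    ]


# Hypothetical toy sequence: an AT-only stretch followed by a GC-only stretch.
toy = "ATATATATAT" + "GCGCGCGCGC"
coarse = windowed_gc(toy, window=10, step=10)  # two windows, sharp contrast
fine = windowed_gc(toy, window=4, step=4)      # five windows, one straddles the boundary
```

With the 4-base windows, the window starting at position 8 straddles the boundary and reports an intermediate value, so the apparent position and sharpness of the transition depend on the window chosen.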

3.
4.
The elemental composition of proteins influences the quantities of different elements required by organisms. Here, we considered variation in the sulphur content of whole proteomes among 19 Archaea, 122 Eubacteria and 10 eukaryotes whose genomes have been fully sequenced. We found that different species vary greatly in the sulphur content of their proteins, and that average sulphur content of proteomes and genome base composition are related. Forces contributing to variation in proteomic sulphur content appear to operate quite uniformly across the proteins of different species. In particular, the sulphur content of orthologous proteins was frequently correlated with mean proteomic sulphur contents. Among prokaryotes, proteomic sulphur content tended to be greater in anaerobes, relative to non-anaerobes. Thermophiles tended to have lower proteomic sulphur content than non-thermophiles, consistent with the thermolability of cysteine and methionine residues. This work suggests that persistent environmental growth conditions can influence the evolution of elemental composition of whole proteomes in a manner that may have important implications for the amount of sulphur used by living organisms to build proteins. It extends previous studies that demonstrated links between transient changes in environmental conditions and the elemental composition of subsets of proteins expressed under these conditions.

5.
Codon usage and base composition in sequences from the A + T-rich genome of Rickettsia prowazekii, a member of the alpha Proteobacteria, have been investigated. Synonymous codon usage patterns are roughly similar among genes, even though the data set includes genes expected to be expressed at very different levels, indicating that translational selection has been ineffective in this species. However, multivariate statistical analysis differentiates genes according to their G + C contents at the first two codon positions. To study this variation, we have compared the amino acid composition patterns of 21 R. prowazekii proteins with that of a homologous set of proteins from Escherichia coli. The analysis shows that individual genes have been affected by biased mutation rates to very different extents: genes encoding proteins highly conserved among other species being the least affected. Overall, protein coding and intergenic spacer regions have G + C content values of 32.5% and 21.4%, respectively. Extrapolation from these values suggests that R. prowazekii has around 800 genes and that 60–70% of the genome may be coding. Correspondence to: S.G.E. Andersson

6.
7.
Since base composition of translational stop codons (TAG, TAA, and TGA) is biased toward a low G+C content, a differential density for these termination signals is expected in random DNA sequences of different base compositions. The expected length of reading frames (DNA segments of sense codons flanked by in-phase stop codons) in random sequences is thus a function of GC content. The analysis of DNA sequences from several genome databases stratified according to GC content reveals that the longest coding sequences (exons in vertebrates and genes in prokaryotes) are GC-rich, while the shortest ones are GC-poor. Exon lengthening in GC-rich vertebrate regions does not result, however, in longer vertebrate proteins, perhaps because of the lower number of exons in the genes located in these regions. The effects on coding-sequence lengths constitute a new evolutionary meaning for compositional variations in DNA GC content. Correspondence to: J. L. Oliver
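
The dependence of reading-frame length on GC content in random sequences can be sketched under a simple model. The model below is an assumption for illustration, not taken from the paper: bases are drawn independently, with P(G) = P(C) = gc/2 and P(A) = P(T) = (1 - gc)/2, and the number of sense codons before the first in-frame stop follows a geometric distribution. The function names are hypothetical.

```python
def stop_codon_probability(gc):
    """Chance that a random codon is TAA, TAG, or TGA.

    Assumes independent bases with P(G) = P(C) = gc / 2 and
    P(A) = P(T) = (1 - gc) / 2.
    """
    if not 0.0 <= gc <= 1.0:
        raise ValueError("gc must be a fraction between 0 and 1")
    at_base = (1.0 - gc) / 2.0  # probability of A (same for T)
    gc_base = gc / 2.0          # probability of G (same for C)
    # TAA contributes at_base**3; TAG and TGA each contribute at_base**2 * gc_base.
    return at_base ** 3 + 2.0 * at_base ** 2 * gc_base


def expected_sense_codons(gc):
    """Mean number of sense codons before the first in-frame stop codon."""
    p = stop_codon_probability(gc)
    if p == 0.0:
        return float("inf")  # no A or T available, so no stop codon can form
    return (1.0 - p) / p
```

Under these assumptions, a sequence with equal base frequencies has a stop-codon probability of 3/64 per codon, and the expected open stretch grows as GC content rises.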

8.
Lactobacillus plantarum is a versatile and flexible species that is encountered in a variety of niches and can utilize a broad range of fermentable carbon sources. To assess if this versatility is linked to a variable gene pool, microarrays containing a subset of small genomic fragments of L. plantarum strain WCFS1 were used to perform stringent genotyping of 20 strains of L. plantarum from various sources. The gene categories with the most genes conserved in all strains were those involved in biosynthesis or degradation of structural compounds like proteins, lipids, and DNA. Conversely, genes involved in sugar transport and catabolism were highly variable between strains. Moreover, besides the obvious regions of variance, like prophages, other regions varied between the strains, including regions encoding plantaricin biosynthesis, nonribosomal peptide biosynthesis, and exopolysaccharide biosynthesis. In many cases, these variable regions colocalized with regions of unusual base composition. Two large regions of flexibility were identified between 2.70 and 2.85 and 3.10 and 3.29 Mb of the WCFS1 chromosome, the latter being close to the origin of replication. The majority of genes encoded in these variable regions are involved in sugar metabolism. This functional overrepresentation and the unusual base composition of these regions led to the hypothesis that they represented lifestyle adaptation regions in L. plantarum. The present study consolidates this hypothesis by showing that there is a high degree of gene content variation among L. plantarum strains in genes located in these regions of the WCFS1 genome. Interestingly, based on our genotyping data L. plantarum strains clustered into two clearly distinguishable groups, which coincided with an earlier proposed subdivision of this species based on conventional methods.

9.
A mathematical model is presented that describes the concentration of an amino acid in total cell protein as a function of its concentration in individual cell proteins or in sets of cell proteins. The resulting equation makes it possible to calculate how the makeup of cell proteins must change to obtain a specified alteration in the content of an amino acid in the total cell protein. It is recognized that protein species or sets of proteins that are distinguished by being richer or poorer in a key amino acid than the overall protein must undergo considerable variations in content. The necessary extent of these shifts suggests that the amino acid composition of total cell protein is not likely to be affected significantly by variations in the cultivation conditions.
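
The abstract does not reproduce the equation. A simple mass-balance form that is consistent with its description is sketched below; the symbols $c$, $c_i$, $f_i$ and the numerical values are assumptions introduced for illustration and may differ from the authors' formulation.

```latex
% Assumed notation: c_i = concentration of the amino acid in protein set i,
% f_i = fraction of total cell protein contributed by set i.
c = \sum_i f_i \, c_i , \qquad \sum_i f_i = 1 .

% Two sets only (one rich in the amino acid, one poor):
c = f\,c_1 + (1 - f)\,c_2
\quad\Longrightarrow\quad
f = \frac{c - c_2}{c_1 - c_2} .

% Hypothetical numbers: c_1 = 0.10, c_2 = 0.05, f = 0.2 gives c = 0.06.
% Raising c to 0.07 requires f = (0.07 - 0.05)/(0.10 - 0.05) = 0.4,
% i.e. the rich set must double its share of total protein.
```

In this illustrative case a modest rise in the overall content demands a large shift in the protein makeup, which is the kind of relationship the abstract describes.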

10.
Nuclear DNA content, chromatin structure, and DNA composition were investigated in four Agave species: two diploid, Agave tequilana Weber and Agave angustifolia Haworth var. marginata Hort., and two pentaploid, Agave fourcroydes Lemaire and Agave sisalana Perrine. It was determined that the genome size of pentaploid species is nearly 2.5 times that of diploid ones. Cytophotometric analyses of chromatin structure were performed following Feulgen or DAPI staining to determine optical density profiles of interphase nuclei. Pentaploid species showed higher frequencies of condensed chromatin (heterochromatin) than diploid species. On the other hand, a lower frequency of A-T rich (DAPI stained) heterochromatin was found in pentaploid species than in diploid ones, indicating that heterochromatin in pentaploid species is made up of sequences with base compositions different from those of diploid species. Since thermal denaturation profiles of extracted DNA showed minor variations in the base composition of the genomes of the four species, it is supposed that, in pentaploid species, the large heterochromatin content is not due to an overrepresentation of G-C repetitive sequences but rather to the condensation of nonrepetitive sequences, such as, for example, redundant gene copies switched off in the polyploid complement. It is suggested that speciation in the genus Agave occurs through point mutations and minor DNA rearrangements, as is also indicated by the relative stability of the karyotype of this genus. Key words: Agave, DNA cytophotometry, DNA melting profiles, chromatin structure, genome size.

11.
The base compositional correlations that hold among various coding and noncoding regions of the canine genome have been analysed. The distribution pattern of genes, on the basis of GC(3) composition, shows a wide range similar to that observed in human. However, the largest number of genes was observed in the range of 65-75% GC(3) composition. The correlation between the canine coding DNA sequences and the different noncoding regions (introns and flanking regions) is found to be significant, and in many cases the degree of correlation is similar to that in the human genome. We found that these correlations are not limited to the GC content alone, but also hold at the level of the frequency of individual bases. The present study suggests that canines belong to the predicted 'general mammalian pattern' of genome composition along with human beings.
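
GC(3) here refers to the G+C content at the third codon position. A minimal sketch of that calculation follows; the function name `gc3` and the example sequences are assumptions for illustration, and the sketch counts every codon, including a terminal stop codon if one is present.

```python
def gc3(cds):
    """Fraction of codons whose third base is G or C.

    Assumes `cds` is an in-frame coding sequence; trailing bases that do
    not complete a codon are ignored.
    """
    cds = cds.upper()
    third_bases = cds[2::3]  # bases at positions 3, 6, 9, ... of the sequence
    if not third_bases:
        raise ValueError("sequence must contain at least one full codon")
    return sum(base in "GC" for base in third_bases) / len(third_bases)


# Hypothetical example: codons ATG, GCC, AAA, TAA have third bases G, C, A, A.
example_gc3 = gc3("ATGGCCAAATAA")
```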

12.
A novel method to calculate the G+C content of genomic DNA sequences. (Total citations: 2; self-citations: 0; citations by others: 2)
The base composition of a DNA fragment or genome is usually measured by the proportion of A+T or G+C in the sequence. The G+C content along genomic sequences is usually calculated using an overlapping or non-overlapping sliding window method. The result and accuracy of such an approach depend on the size of the window and the moving distance adopted. In this paper, a novel windowless technique to calculate the G+C content of genomic sequences is proposed. By this method, the G+C content can be calculated at different "resolutions". In an extreme case, the G+C content may be computed at a specific point, rather than in a window of finite size. This is particularly useful for analyzing the fine variation of base composition along genomic sequences. As the first example, the variation of G+C content along each of the 16 yeast chromosomes is analyzed. The G+C-rich regions longer than 5 kb are detected and listed in detail. It is found that each chromosome consists of several alternating G+C-rich and G+C-poor regions, i.e., a mosaic structure. Another example is the analysis of the G+C content for each of the two chromosomes of the Vibrio cholerae genome. Based on the variations of the G+C content in each chromosome, it is shown that some fragments in the Vibrio cholerae genome may have been transferred from other species. In particular, the position and size of the large integron island on the smaller chromosome were precisely predicted. This method would be a useful tool for analyzing genomic sequences.
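
The abstract does not say how the windowless calculation is carried out, so the following is not the authors' method. It is a generic prefix-sum sketch, under that stated assumption, showing one way G+C content can be read off between any two coordinates without fixing a window size in advance. The names `gc_prefix_sums` and `gc_between` are hypothetical.

```python
from itertools import accumulate


def gc_prefix_sums(seq):
    """prefix[i] = number of G or C bases among the first i bases of seq."""
    flags = [1 if base in "GC" else 0 for base in seq.upper()]
    return [0] + list(accumulate(flags))


def gc_between(prefix, start, end):
    """G+C fraction of seq[start:end], computed from the prefix sums."""
    if not 0 <= start < end <= len(prefix) - 1:
        raise ValueError("need 0 <= start < end <= sequence length")
    return (prefix[end] - prefix[start]) / (end - start)


# Hypothetical toy sequence with a GC-only core flanked by AT.
prefix = gc_prefix_sums("ATGCGCAT")
core_gc = gc_between(prefix, 2, 6)   # bases GCGC
whole_gc = gc_between(prefix, 0, 8)  # entire sequence
```

Because the cumulative counts are computed once, any segment, down to a single base, can be queried afterwards, which is the sense in which the resolution is not tied to a window.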

13.
14.
Prior studies on subfractions of mouse and kangaroo rat DNA have suggested that variations in base concentration within a given genome may not be great enough to account for Q-banding. To examine this with another species, calf DNA was subfractionated by CsCl ultracentrifugation into GC-rich satellites and the main band DNA was further fractionated into AT-rich, intermediate and GC-rich portions. The effect of varying concentrations of these DNAs on quinacrine and Hoechst 33258 fluorescence was examined. Although with both compounds there was less fluorescence in the presence of the GC-rich satellites than main band fractions, these results per se did not answer the question of whether the variation in base composition alone was adequate to account for chromosome banding. To answer this the fluorescence observed in the presence of DNA of a given base composition was related to the fluorescence observed in the presence of DNA of 40% GC content (F/F40). This allowed the derivation of a term B which indicated the relative change in fluorescence per 1% change in base composition of DNA. To determine the percent change in fluorescence observed in Q-banding, the photoelectric recordings of Caspersson et al. (1971) were used. From these data we conclude: 1. Quinacrine is twice as sensitive to changes in base composition as Hoechst 33258. 2. Variation in the base content of DNA along the chromosome is sufficient to account for most Q-banding, except possibly for some of the extremes of quinacrine fluorescence. This was further examined with daunomycin. Even though daunomycin gives good fluorescent banding, DNAs varying in base composition from 100 to 40% GC content all resulted in the same relative fluorescence of 0.03. However, in the presence of poly (dA-dT) the relative fluorescence was 0.85, indicating a great sensitivity to very AT-rich DNA. This suggests that with daunomycin and possibly other fluorochromes, stretches of very AT-rich DNA may be more important in fluorescent banding than simple variation in mean base composition.

15.
Comparison of the human and mouse genomes has revealed that significant variations in evolutionary rates exist among genomic regions and that a large part of this variation is interchromosomal. We confirm in this work, using a large collection of introns, that human chromosome 19 is the one that shows the highest divergence with respect to mouse. To search for other differences among chromosomes, we examine the distribution of gene functions in human and mouse chromosomes using the Gene Ontology definitions. We found by correspondence analysis that among the strongest clusterings of gene functions in human chromosomes is a group of genes coding for DNA binding proteins in chromosome 19. Interestingly, chromosome 19 also has a very high GC content, a feature that has been proposed to promote an opening of the chromatin, thereby facilitating binding of proteins to the DNA helix. In the mouse genome, however, a similar aggregation of genes coding for DNA binding proteins and high GC content cannot be found. This suggests that the distribution of genes coding for DNA binding proteins and the variations of the chromatin accessibility to these proteins are different in the human and mouse genomes. It is likely that the overall high synonymous and intron rates in chromosome 19 are a by-product of the high GC content of this chromosome. Affiliation: Department of Physiology and Molecular Biodiversity, Institut de Biologia Molecular de Barcelona, CSIC, Jordi Girona 18, 08034 Barcelona, Spain

16.
The codon usage in the Vibrio cholerae genome is analyzed in this paper. Although there are many more genes on chromosome 1 than on chromosome 2, the codon usage patterns of genes on the two chromosomes are quite similar, indicating that the two chromosomes may have coexisted in the same cell for a very long time. Unlike the base frequency pattern observed in other genomes, the G+C content at the third codon position of the V. cholerae genome varies within a rather small interval. The most notable feature of codon usage in the V. cholerae genome is that a fraction of genes show significant bias in base choice at the second codon position. The 2,006 known genes can be classified into two clusters according to the base frequencies at this position. The smaller cluster contains 227 genes, most of which code for proteins involved in transport and binding functions. The products encoded by these genes have a significant bias in amino acid composition compared with other genes. The codon usage patterns for the 1,836 ORFs of unknown function are also analyzed, which is useful for studying their functions.

17.
The rates and patterns of molecular evolution in many eukaryotic organisms have been shown to be influenced by the compartmentalization of their genomes into fractions of distinct base composition and mutational properties. We have examined the Drosophila genome to explore relationships between the nucleotide content of large chromosomal segments and the base composition and rate of evolution of genes within those segments. Direct determination of the G + C contents of yeast artificial chromosome clones containing inserts of Drosophila melanogaster DNA ranging from 140 to 340 kb revealed significant heterogeneity in base composition. The G + C content of the large segments studied ranged from 36.9% G + C for a clone containing the hunchback locus in polytene region 85, to 50.9% G + C for a clone that includes the rosy region in polytene region 87. Unlike other organisms, however, there was no significant correlation between the base composition of large chromosomal regions and the base composition at fourfold degenerate nucleotide sites of genes encompassed within those regions. In contrast to the situation seen in mammals, there was also no significant association between base composition and rate of nucleotide substitution. These results suggest that nucleotide sequence evolution in Drosophila differs from that of many vertebrates and does not reflect distinct mutational biases, as a function of base composition, in different genomic regions. Significant negative correlations between codon-usage bias and rates of synonymous site divergence, however, provide strong support for an argument that selection among alternative codons may be a major contributor to variability in evolutionary rates within Drosophila genomes.

18.
Evolution of genome size and DNA base composition in reptiles (Total citations: 2; self-citations: 2; citations by others: 0)
E. Olmo, Genetica, 1981, 57(1): 39-50
The evolution of genome size and base composition of DNA from various reptiles has been studied. DNA amount was measured cytophotometrically and GC concentration estimated by thermal denaturation. The Reptilia appear to be a fairly homogeneous group with respect to DNA quantity, although chelonians stand out because of their higher inter- and intrafamilial variability and DNA content. Quantitative DNA variations do not show a single evolutionary trend, but rather seem to have followed different patterns within each group. The differences in genome size between related species seem to be mainly the result of duplication or loss of DNA sequences characterized by a similar mean denaturation temperature. This agrees with observations of other authors that quantitative variations in reptiles are mainly due to differences in the amount of repetitive DNA. Several hypotheses on the significance of quantitative DNA variations in reptiles are discussed.

19.
Psyllids, like aphids, feed on plant phloem sap and are obligately associated with prokaryotic endosymbionts acquired through vertical transmission from an ancestral infection. We have sequenced 37 kb of DNA of the genome of Carsonella ruddii, the endosymbiont of psyllids, and found that it has a number of unusual properties revealing a more extreme case of degeneration than was previously reported from studies of eubacterial genomes, including that of the aphid endosymbiont Buchnera aphidicola. Among the unusual properties are an exceptionally low guanine-plus-cytosine content (19.9%), almost complete absence of intergenic spaces, operon fusion, and lack of the usual promoter sequences upstream of 16S rDNA. These features suggest the synthesis of long mRNAs and translational coupling. The most extreme instances of base compositional bias occur in the genes encoding proteins that have less highly conserved amino acid sequences; the guanine-plus-cytosine content of some protein-coding sequences is as low as 10%. The shift in base composition has a large effect on proteins: in polypeptides of C. ruddii, half of the residues consist of five amino acids with codons low in guanine plus cytosine. Furthermore, the proteins of C. ruddii are reduced in size, with an average of about 9% fewer amino acids than in homologous proteins of related bacteria. These observations suggest that the C. ruddii genome is not subject to constraints that limit the evolution of other known eubacteria.

20.
Zhang Naixin (张乃心), Zhang Yujuan (张玉娟), Yu Guo (余果), Chen Bin (陈斌), Acta Entomologica Sinica (《昆虫学报》), 2013, 56(4): 398-407
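This study examines the structural features of dipteran mitochondrial genomes and designs universal primers for sequencing them, to provide a reference and basis for future research on dipteran mitochondrial genomes. Using comparative genomics and bioinformatics methods, we analyzed the structural features, base composition, and conserved regions of the 26 completely sequenced dipteran mitochondrial genomes and, on that basis, designed universal primers for dipteran genome sequencing. The results show that dipteran mitochondrial genomes are 14,503 to 19,517 bp long and structurally conserved, containing 37 coding genes (13 protein-coding genes, 22 tRNA genes, and 2 rRNA genes) together with a non-coding region of highly variable length (the AT-rich region). Gene order within the genome is stable; apart from a few genes, it matches the gene order of Drosophila melanogaster. Base composition is unbalanced, with AT content between 72.59% and 85.15%, and base usage is biased, favoring A and C. Nucleotide and amino acid sequences are conserved across the whole genome, and 11 conserved regions were identified. Within the conserved regions, 26 pairs of universal primers for sequencing dipteran mitochondrial genomes were designed, with all amplified target fragments within 1,200 bp. The primer set was used to sequence the complete mitochondrial genome of the onion fly Delia antiqua, and the results showed it to be efficient and suitable.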
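
AT content and strand base bias of the kind reported here are commonly summarized with AT content, AT skew, and GC skew. The abstract does not state which statistics the authors used, so the sketch below is an assumption for illustration; the function name `composition_stats` and the toy sequence are hypothetical. A preference for A and C corresponds to a positive AT skew and a negative GC skew.

```python
def composition_stats(seq):
    """Return AT content, AT skew (A-T)/(A+T) and GC skew (G-C)/(G+C).

    A skew is reported as 0.0 when its denominator is zero.
    """
    seq = seq.upper()
    a, t, g, c = (seq.count(base) for base in "ATGC")
    total = a + t + g + c  # ambiguous symbols such as N are ignored
    if total == 0:
        raise ValueError("sequence contains no A, T, G, or C")
    at_skew = (a - t) / (a + t) if (a + t) else 0.0
    gc_skew = (g - c) / (g + c) if (g + c) else 0.0
    return {"at_content": (a + t) / total, "at_skew": at_skew, "gc_skew": gc_skew}


# Hypothetical toy sequence: A=3, T=1, C=3, G=1.
stats = composition_stats("AAATCCCG")
```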
