Similar articles
 Found 20 similar articles (search time: 15 ms)
1.
This paper analyses the compositional correlations that hold in the chicken genome. Significant linear correlations were found among the regions studied (coding sequences and their first, second, and third codon positions; 5′ and 3′ flanking regions; and introns), as is the case in the human genome. We found that these compositional correlations are not limited to global GC levels but extend even to individual bases. Furthermore, an analysis of 1037 coding sequences confirmed a correlation among GC3, GC2, and GC1. The implications of these results are discussed. Received: 9 December 1998 / Accepted: 18 April 1999
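
The quantities GC1, GC2, and GC3 referred to throughout this listing are the GC fractions at the first, second, and third codon positions of a coding sequence. The following is a minimal sketch of how they could be computed, assuming an in-frame sequence given as a plain string; the function name `gc_by_codon_position` is illustrative and does not come from any of the cited papers.

```python
def gc_by_codon_position(cds: str) -> tuple[float, float, float]:
    """Return (GC1, GC2, GC3) as fractions for an in-frame coding sequence."""
    seq = cds.upper()
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    if usable == 0:
        raise ValueError("sequence is shorter than one codon")
    values = []
    for pos in range(3):
        # every third base, starting at codon position pos (0-based)
        bases = seq[pos:usable:3]
        values.append(sum(b in "GC" for b in bases) / len(bases))
    return tuple(values)


# Codons ATG GCC AAG TGA: position 1 = A,G,A,T; position 2 = T,C,A,G; position 3 = G,C,G,A
print(gc_by_codon_position("ATGGCCAAGTGA"))  # (0.25, 0.5, 0.75)
```

Computing the three values per gene and then correlating them across genes is the kind of analysis the abstract describes; the correlation step itself is not shown here.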

2.
Several groups have addressed the issue of the influence of GC on expression levels in mammalian genes. In general, GC-rich genes appeared to be more highly expressed than GC-poor ones. Recently, expression levels of GC3-rich and GC3-poor versions of genes (GC3 is the GC level of third codon positions), inserted in vector plasmids, were compared in order to eliminate differences associated with their genomic context. Transfection experiments showed that GC3-rich genes were expressed more efficiently than their GC3-poor counterparts, indicating that GC3 dramatically and intrinsically boosts expression efficiency. Here we show that, while the protocols used eliminated the original genomic context, they replaced it with plasmid contexts, whose compositional properties affected the results.

3.
Synonymous codon choices vary considerably among Schistosoma mansoni genes. Principal components analysis detects a single major trend among genes, which highly correlates with GC content in third codon positions and exons, but does not discriminate among putatively highly and lowly expressed genes. The effective number of codons used in each gene, and its distribution when plotted against GC3, suggests that codon usage is shaped mainly by mutational biases. The GC contents of exons, GC3, 5′, 3′, and flanking (5′ + 3′ + introns) regions are all correlated with one another, suggesting that variations in GC content may exist among different regions of the S. mansoni genome. We propose that this genome structure might be among the most important factors shaping codon usage in this species, although the action of selection on certain sequences cannot be excluded. Received: 10 March 1997 / Accepted: 27 June 1997
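
Plots of the effective number of codons (ENC) against GC3 are usually read against the curve expected when codon usage is determined by base composition alone. A minimal sketch of that reference curve follows, assuming Wright's (1990) approximation ENC = 2 + s + 29 / (s² + (1 − s)²), where s is the GC fraction at synonymous third positions; the abstract does not state which formula was used, so this is an assumption, and the function name is illustrative.

```python
def expected_enc(gc3s: float) -> float:
    """Expected ENC under compositional (mutational) bias alone (Wright's approximation)."""
    if not 0.0 <= gc3s <= 1.0:
        raise ValueError("gc3s must be a fraction between 0 and 1")
    return 2.0 + gc3s + 29.0 / (gc3s ** 2 + (1.0 - gc3s) ** 2)


# The curve peaks at GC3s = 0.5 and falls toward the extremes of composition.
for s in (0.0, 0.25, 0.5):
    print(s, expected_enc(s))
```

Genes lying on or just below this curve are commonly interpreted as shaped mainly by mutational bias, and genes far below it as subject to additional constraints such as selection.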

4.
Jabbari K, Bernardi G. Gene, 2000, 247(1-2): 287-292
In the present work we show that in the Drosophila genome (which covers a 37-51% GC range at a DNA size of approx. 50 kb) a linear correlation holds between the GC (or GC3) levels of coding sequences and the GC levels of the long (>50 kb) genomic sequences embedding them. This correlation allows us to position the two compositional distributions, (a) of coding sequences and (b) of long DNA segments, relative to each other and to calculate gene concentration across the compositional range of the Drosophila genome. Using this approach, we show that gene concentration increases with increasing GC of the regions embedding the genes, reaching a 7-fold higher level in the GC-richest regions compared with the GC-poorest regions. The gene distribution of the Drosophila genome is, therefore, similar to (although less striking than) that of the human genome, whereas it is very different from that of the Arabidopsis genome, which has about the same size as the Drosophila genome.

5.
A new method to determine entropic profiles in DNA sequences is presented. It is based on the chaos-game representation (CGR) of gene structure, a technique which produces a fractal-like picture of DNA sequences. First, the CGR image was divided into squares 4^(-m) in size (m being the desired resolution), and the point density counted. Second, appropriate intervals were adjusted, and then a histogram of densities was prepared. Third, Shannon's formula was applied to the probability-distribution histogram, thus obtaining a new entropic estimate for DNA sequences, the histogram entropy, a measurement that goes with the level of constraints on the DNA sequence. Lastly, the entropic profile for the sequence was drawn by considering the entropies at each resolution level, thus providing a way to summarize the complexity of large genomic regions or even entire genomes at different resolution levels. The application of the method to DNA sequences reveals that entropic profiles obtained in this way, as opposed to previously published ones, clearly discriminate between random and natural DNA sequences. Entropic profiles also show a different degree of variability within and between genomes. The results of these analyses are discussed in relation both to the genome compartmentalization in vertebrates and to the differential action of compositional and/or functional constraints on DNA sequences.
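
The four steps described above can be sketched as follows. This is a simplified illustration, not the authors' implementation: the corner assignment (A, C, G, T at the four corners of the unit square) is one common CGR convention, and the "appropriate intervals" of the histogram are assumed here to be unit-width bins, one per distinct point count. All function names are hypothetical.

```python
import math
from collections import Counter

# Assumed corner convention for the unit square
CORNERS = {"A": (0.0, 0.0), "C": (0.0, 1.0), "G": (1.0, 1.0), "T": (1.0, 0.0)}


def cgr_points(seq):
    """Chaos-game representation: move halfway toward the corner of each base."""
    x, y = 0.5, 0.5
    points = []
    for base in seq.upper():
        if base not in CORNERS:
            continue  # skip ambiguous symbols such as N
        cx, cy = CORNERS[base]
        x, y = (x + cx) / 2, (y + cy) / 2
        points.append((x, y))
    return points


def histogram_entropy(seq, m):
    """Shannon entropy (bits) of the histogram of cell densities at resolution m."""
    side = 2 ** m  # side * side cells, each of area 4**(-m)
    cells = Counter()
    for x, y in cgr_points(seq):
        cells[(min(int(x * side), side - 1), min(int(y * side), side - 1))] += 1
    # Histogram of densities: how many cells hold 0, 1, 2, ... points
    densities = Counter(cells.values())
    densities[0] = side * side - len(cells)
    total = side * side
    return -sum((n / total) * math.log2(n / total) for n in densities.values() if n > 0)


def entropic_profile(seq, max_m):
    """Histogram entropy at each resolution level 1..max_m."""
    return [histogram_entropy(seq, m) for m in range(1, max_m + 1)]


print(entropic_profile("ACGTTGCAAGGCTTAACCGG", 3))
```

At resolution m, the cell a point falls into is determined by the last m bases, so the cell counts are effectively m-mer counts; the entropy is taken over the distribution of those counts rather than over the m-mers themselves.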

6.
The Poland–Fixman–Freire formalism was adapted for modeling of calorimetric DNA melting profiles, and applied to plasmid pBR322 and long random sequences. We studied the influence of the difference (H_GC − H_AT) between the helix-coil transition enthalpies of AT and GC base pairs on the calorimetric melting profile and on the normalized calorimetric melting profile. A strong alteration of the DNA calorimetric profile with H_GC − H_AT was demonstrated. In contrast, there is a relatively slight change in the normalized profiles and in the corresponding ordinary (optical) normalized differential melting curves (DMCs). For fixed H_GC − H_AT, the average relative deviation (S) between the DMC and the normalized calorimetric profile, and the difference between their melting temperatures (T_cal − T_m), are weakly dependent on peculiarities of the multipeak fine structure of DMCs. At the same time, both the deviation S and the difference (T_cal − T_m) increase with the temperature melting range of the helix-coil transition. It is shown that the local deviation between the DMC and the normalized calorimetric profile increases in regions of narrow peaks distant from the melting temperature.

7.
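Correlation analysis of local GC levels in human protein-coding genes   Cited by: 2 (self-citations: 0, citations by others: 2)
Chen Xianggui, Hu Jun, Yang Xiao. 《遗传》 (Hereditas (Beijing)), 2008, 30(9): 1169-1174
GC content is an important feature of the base composition of genomic DNA sequences and carries information about gene structure, function, and evolution. In this study, the DNA sequences of 7,992 non-redundant human protein-coding genes were extracted from public databases, and the local GC content of different gene regions and the correlations among them were analyzed. The results show that local GC content within genes is heterogeneous: the 5′ untranslated region has the highest GC level (62.56%), whereas the 3′ untranslated region has the lowest (43.97%). The GC content of the 3′ flanking sequence is a good proxy for the GC level of the long DNA segment in which the gene resides. Although the GC content of open reading frames is higher than that of introns, 3′ untranslated regions, and 3′ flanking sequences, the GC contents of these four regions are highly correlated with one another. The mean GC content at the third codon position (GC3) is 58.09%, significantly higher than that at the first and second codon positions, and it is highly correlated with the GC level of the open reading frame (correlation coefficient 0.91). GC3 also correlates well with the GC levels of introns, 3′ untranslated regions, and 3′ flanking sequences; the slope of the linear regression of GC3 on the GC content of the 3′ flanking sequence is 1.25. GC3 can therefore serve as a sensitive indicator of variation in the GC level of the region in which a gene resides. In contrast, the GC levels of the first and second codon positions, the 5′ flanking sequence, and the 5′ untranslated region correlate only weakly with those of other gene regions. These results suggest that bases at the third codon position of protein-coding regions, in introns, in 3′ untranslated regions, and in 3′ flanking sequences may have undergone similar evolutionary processes, whereas the first and second codon positions, 5′ flanking sequences, and 5′ untranslated regions have experienced different mutation and selection because of functional requirements.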
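
The regression slope and correlation coefficient reported above are ordinary least-squares quantities. A minimal sketch of how they could be obtained from per-gene GC values follows; the function name `linear_fit` and the four data points are purely illustrative and are not taken from the study.

```python
import math


def linear_fit(x, y):
    """Least-squares fit y = slope * x + intercept, plus Pearson's r."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length, at least 2")
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    r = sxy / math.sqrt(sxx * syy)
    return slope, intercept, r


# Hypothetical values: GC of the 3' flanking sequence (x) and GC3 (y) for four genes
flank_gc = [0.40, 0.45, 0.50, 0.55]
gc3 = [0.45, 0.52, 0.57, 0.64]
print(linear_fit(flank_gc, gc3))
```

A slope above 1, such as the 1.25 reported in the abstract, means that GC3 changes faster than the GC level of the surrounding region, which is why GC3 is described as a sensitive indicator.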

8.
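In this study, young leaves from twigs of two wild Elaeagnus angustifolia Linn. plants, grown hydroponically in a greenhouse, were used as material. Total DNA was extracted from each by the CTAB method and sequenced de novo with second-generation sequencing, and assembly yielded the complete chloroplast genome sequences of the two plants. Codon usage bias in the protein-coding genes and its causes were then analyzed in detail, laying a foundation for chloroplast genetic engineering and molecular phylogenetic studies of E. angustifolia. The results were as follows. (1) The assembled chloroplast genome is 150,546 bp long and consists of a large single-copy (LSC) region of 81,113 bp, a small single-copy (SSC) region of 18,445 bp, and a pair of inverted repeats (IRs) of 25,494 bp each that separate them; annotation identified 132 genes, including 86 protein-coding genes, 38 tRNA genes, and 8 rRNA genes. (2) The GC content at the third codon position (GC3) of the protein-coding genes is 28.47%, clearly lower than the GC content of the whole chloroplast genome (37%) and lower than that at the first (GC1) and second (GC2) positions, indicating a preference for codons ending in A or T; UCU, CCU, UGU, GCU, CUU, GAU, UCA, and UAA are the optimal codons. (3) Relative synonymous codon usage (RSCU) analysis showed that no single factor determines the codon usage pattern: codon bias is shaped jointly by mutation, selection, and other factors, and sequence differences arising from natural selection on expression affect codon bias more strongly than mutation does. Neutrality plot, effective number of codons (ENC-plot), and parity rule 2 (PR2-plot) analyses indicated that codon usage bias in the E. angustifolia chloroplast genome is influenced more by selection. (4) Phylogenetic trees built from the chloroplast gene sequences of six Elaeagnaceae species and one jujube species by maximum likelihood, maximum parsimony, and Bayesian methods were consistent with the clustering based on codon usage bias, indicating that codon usage bias in chloroplast genomes is related to the phylogenetic relationships among species.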
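
Relative synonymous codon usage (RSCU) is the observed count of a codon divided by the count expected if all synonymous codons for that amino acid were used equally. A minimal sketch follows, assuming the standard genetic code and in-frame coding sequences; the function name `rscu` and the example sequence are illustrative, and the tools named in these abstracts (such as CodonW) may differ in details such as the handling of stop codons.

```python
from collections import Counter, defaultdict
from itertools import product

# Standard genetic code, codons enumerated in TCAG order
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def rscu(cds_list):
    """Return {codon: RSCU} pooled over a list of in-frame coding sequences."""
    counts = Counter()
    for cds in cds_list:
        seq = cds.upper().replace("U", "T")  # accept RNA or DNA alphabets
        for i in range(0, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if codon in CODON_TABLE:  # skips codons with ambiguous bases
                counts[codon] += 1
    families = defaultdict(list)
    for codon, aa in CODON_TABLE.items():
        families[aa].append(codon)
    result = {}
    for codons in families.values():
        total = sum(counts[c] for c in codons)
        if total == 0:
            continue  # amino acid absent from the data set
        for c in codons:
            # observed count / (family total / number of synonymous codons)
            result[c] = counts[c] * len(codons) / total
    return result


values = rscu(["GCTGCTGCTGCC"])  # four alanine codons: GCT x3, GCC x1
print(values["GCT"], values["GCC"], values["GCA"])  # 3.0 1.0 0.0
```

A codon with RSCU above 1 is used more often than expected under equal usage, which is the criterion behind statements such as "codons with RSCU > 1 mostly end in A or U".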

9.
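To determine the codon usage pattern of the chloroplast genome of the Yao medicinal plant Zijiuniu (紫九牛) and its causes, this study took 50 protein-coding sequences of the Zijiuniu chloroplast genome as the object of study and analyzed their codon bias with Codon W 1.4.2 and the online tools CUSP and Chips. The results were as follows. (1) There were 29 codons with RSCU > 1, of which 28 end in A/U, indicating that synonymous codons in the chloroplast genome preferentially end in A/U. (2) The GC content by codon position followed GC1 (47.38%) > GC2 (39.81%) > GC3 (29.60%), and 40 genes had ENC values greater than 45, indicating that codon bias in the Zijiuniu chloroplast genome is weak. (3) Neutrality plot and ENC-plot analyses showed that codon bias in the Zijiuniu chloroplast genome is affected both by selection and by mutation. (4) From the constructed sets of highly and lowly expressed genes, 15 optimal codons were identified: UUG, AUU, GUU, GUA, UCU, CCU, ACU, ACA, GCU, CAA, AAC, GAA, UGU, CGU, and GGU. This study provides a basis for characterizing the Zijiuniu chloroplast genome and analyzing its genetic diversity.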

10.
Sequencing and annotation of a contiguous stretch of genomic DNA (112.3 kb) from the oomycete plant pathogen Phytophthora infestans revealed the order, spacing and genomic context of four members of the elicitin (inf) gene family. Analysis of the GC content at the third codon position (GC3) of six genes encoded in the region, and a set of randomly selected coding regions as well as random genomic regions, showed that a high GC3 value is a general feature of Phytophthora genes that can be exploited to optimize gene prediction programs for Phytophthora species. At least one-third of the annotated 112.3-kb P. infestans sequence consisted of transposons or transposon-like elements. The most prominent were four Tc3/gypsy and Tc1/copia type retrotransposons and three DNA transposons that belong to the Tc1/mariner, Pogo and PiggyBac groups, respectively. Comparative analysis of other available genomic sequences suggests that transposable elements are highly heterogeneous and ubiquitous in the P. infestans genome. Electronic Supplementary Material: Supplementary material is available for this article at

11.
Genes that have experienced accelerated evolutionary rates on the human lineage during recent evolution are candidates for involvement in human-specific adaptations. To determine the forces that cause increased evolutionary rates in certain genes, we analyzed alignments of 10,238 human genes to their orthologues in chimpanzee and macaque. Using a likelihood ratio test, we identified protein-coding sequences with an accelerated rate of base substitutions along the human lineage. Exons evolving at a fast rate in humans have a significant tendency to contain clusters of AT-to-GC (weak-to-strong) biased substitutions. This pattern is also observed in noncoding sequence flanking rapidly evolving exons. Accelerated exons occur in regions with elevated male recombination rates and exhibit an excess of nonsynonymous substitutions relative to the genomic average. We next analyzed genes with significantly elevated ratios of nonsynonymous to synonymous rates of base substitution (dN/dS) along the human lineage, and those with an excess of amino acid replacement substitutions relative to human polymorphism. These genes also show evidence of clusters of weak-to-strong biased substitutions. These findings indicate that a recombination-associated process, such as biased gene conversion (BGC), is driving fixation of GC alleles in the human genome. This process can lead to accelerated evolution in coding sequences and excess amino acid replacement substitutions, thereby generating significant results for tests of positive selection.

12.
The nucleosome formation potential of introns, intergenic spacers and exons of human genes is shown here to negatively correlate with among-tissues breadth of gene expression. The nucleosome formation potential is also found to negatively correlate with the GC content of genomic sequences; the slope of regression line is steeper in exons compared with noncoding DNA (introns and intergenic spacers). The correlation with GC content is independent of sequence length; in turn, the nucleosome formation potential of introns and intergenic spacers positively (albeit weakly) correlates with sequence length independently of GC content. These findings help explain the functional significance of the isochores (regions differing in GC content) in the human genome as a result of optimization of genomic structure for epigenetic complexity and support the notion that noncoding DNA is important for orderly chromatin condensation and chromatin-mediated suppression of tissue-specific genes.

13.
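Codon usage bias analysis of the chloroplast genome of the camphor tree (Cinnamomum camphora)   Cited by: 3 (self-citations: 0, citations by others: 3)
Qin Zheng, Zheng Yongjie, Gui Lijing, Xie Gu'ai, Wu Yanfang. 《广西植物》 (Guihaia), 2018, 38(10): 1346-1355
To analyze the codon usage bias pattern of the chloroplast genome of the camphor tree (Cinnamomum camphora), this study used CodonW, EMBOSS, R, and other software to systematically analyze the codon usage pattern and bias of 53 coding sequences of the camphor tree chloroplast genome. The results show that the effective number of codons (ENC) of the chloroplast genes ranges from 36.82 to 59.30, indicating that codon bias is weak. Relative synonymous codon usage (RSCU) analysis found 32 codons with RSCU > 1, of which 28 end in A or U, indicating that A and U are preferred at the third codon position. Neutrality plot analysis found no significant correlation between GC3 and GC12, with a regression slope of 0.049, suggesting that codon bias is influenced mainly by natural selection. ENC-plot analysis found that most genes fall below the expected curve, likewise indicating that selection is the main factor affecting codon bias. Nine codons (UUU, CUU, UCA, ACA, UAU, AAU, GAU, UGA, and GGA) were identified as the optimal codons of the camphor tree chloroplast genome.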

14.
Genomic clones encoding the S2- and S6-RNases of Nicotiana alata Link and Otto, which are the allelic stylar products of the self-incompatibility (S) locus, were isolated and sequenced. Analysis of genomic DNA by pulsed-field gel electrophoresis and Southern blotting indicates the presence of only a single S-RNase gene in the N. alata genome. The sequences of the open-reading frames in the genomic and corresponding cDNA clones were identical. The organization of the genes was similar to that of other S-RNase genes from solanaceous plants. No sequence similarity was found between the DNA flanking the S2- and S6-RNase genes, despite extensive similarities between the coding regions. The DNA flanking the S6-RNase gene contained sequences that were moderately abundant in the genome. These repeat sequences are also present in other members of the Nicotianae.

15.
Synonymous codon usage of 53 protein-coding genes in the chloroplast genome of Coffea arabica was analyzed for the first time to identify the possible factors contributing to codon bias. All preferred synonymous codons were found to be A/T-ending codons, as chloroplast genomes are rich in AT. No difference in preference for preferred codons was observed between the two strands, viz., the leading and lagging strands. Complex correlations between total base compositions (A, T, G, C, GC) and silent base contents (A3, T3, G3, C3, GC3) revealed that compositional constraints played a crucial role in shaping the codon usage pattern of the C. arabica chloroplast genome. The ENC vs. GC3 plot grouped the majority of the analyzed genes on or just below the left side of the expected GC3 curve, indicating the influence of base compositional constraints in regulating codon usage. However, some genes lying far below the continuous curve confirmed the influence of other factors on codon usage in those genes. The influence of compositional constraints was further confirmed by correspondence analysis, as axes 1 and 3 had significant correlations with silent base contents. Correlations of ENC with axes 1 and 4 and of CAI with axes 1 and 2 indicated a minor influence of selection, but an exact separation of highly and lowly expressed genes could not be seen. From the present study, we concluded that mutational pressure combined with weak selection influenced the pattern of synonymous codon usage across the genes in the chloroplast genome of C. arabica.

16.
Comparative genomics has revealed that variations in bacterial and archaeal genome DNA sequences cannot be explained by only neutral mutations. Virus resistance and plasmid distribution systems have resulted in changes in bacterial and archaeal genome sequences during evolution. The restriction-modification system, a virus resistance system, leads to avoidance of palindromic DNA sequences in genomes. Clustered, regularly interspaced, short palindromic repeats (CRISPRs) found in genomes represent yet another virus resistance system. Comparative genomics has shown that bacteria and archaea have failed to gain any DNA with GC content higher than the GC content of their chromosomes. Thus, horizontally transferred DNA regions have lower GC content than the host chromosomal DNA does. Some nucleoid-associated proteins bind DNA regions with low GC content and inhibit the expression of genes contained in those regions. This form of gene repression is another type of virus resistance system. On the other hand, bacteria and archaea have used plasmids to gain additional genes. Virus resistance systems influence plasmid distribution. Interestingly, the restriction-modification system and nucleoid-associated protein genes have been distributed via plasmids. Thus, GC content and genomic signatures do not reflect bacterial and archaeal evolutionary relationships.

17.

Background  

Microarray-CGH experiments are used to detect and map chromosomal imbalances, by hybridizing targets of genomic DNA from a test and a reference sample to sequences immobilized on a slide. These probes are genomic DNA sequences (BACs) that are mapped on the genome. The signal has a spatial coherence that can be handled by specific statistical tools. Segmentation methods seem to be a natural framework for this purpose. A CGH profile can be viewed as a succession of segments that represent homogeneous regions in the genome whose BACs share the same relative copy number on average. We model a CGH profile by a random Gaussian process whose distribution parameters are affected by abrupt changes at unknown coordinates. Two major problems arise: determining which parameters are affected by the abrupt changes (the mean and the variance, or the mean only), and selecting the number of segments in the profile.
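
To make the segmentation model concrete, the sketch below fits a piecewise-constant mean to a profile by dynamic programming. It is an illustration under stated assumptions, not the authors' procedure: the number of segments `k` is taken as given, only the mean is allowed to change (a common variance is assumed), and the function name `segment_means` and the example profile are hypothetical.

```python
def segment_means(signal, k):
    """Split signal into k contiguous segments minimizing within-segment squared error.

    Returns a list of (start, end) index pairs, end exclusive.
    """
    n = len(signal)
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and len(signal)")
    # Prefix sums give the squared error of any interval in O(1)
    s1, s2 = [0.0], [0.0]
    for v in signal:
        s1.append(s1[-1] + v)
        s2.append(s2[-1] + v * v)

    def cost(i, j):
        total = s1[j] - s1[i]
        return (s2[j] - s2[i]) - total * total / (j - i)

    inf = float("inf")
    # best[s][j]: minimal cost of covering signal[:j] with s segments
    best = [[inf] * (n + 1) for _ in range(k + 1)]
    back = [[0] * (n + 1) for _ in range(k + 1)]
    best[0][0] = 0.0
    for s in range(1, k + 1):
        for j in range(s, n + 1):
            for i in range(s - 1, j):
                c = best[s - 1][i] + cost(i, j)
                if c < best[s][j]:
                    best[s][j] = c
                    back[s][j] = i
    # Recover segment boundaries from the back-pointers
    bounds, j = [], n
    for s in range(k, 0, -1):
        i = back[s][j]
        bounds.append((i, j))
        j = i
    return bounds[::-1]


# Hypothetical log-ratio profile with a gained region in the middle
profile = [0.0, 0.1, -0.1, 0.0, 1.0, 1.1, 0.9, 0.0, 0.1]
print(segment_means(profile, 3))  # [(0, 4), (4, 7), (7, 9)]
```

The two problems named in the abstract are exactly what this sketch leaves out: it does not decide whether the variance should also change between segments, and it does not choose `k`, which in practice requires a model selection criterion.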

18.
Ilyina T. S., Romanova Yu. M. Molecular Biology, 2002, 36(2): 171-179
Data on the structural organization and evolutionary role of specific bacterial DNA regions known as genomic islands are reviewed. Emphasis is placed on the most extensively studied genomic islands, pathogenicity islands (PAIs), which are present in the chromosome of Gram-negative and Gram-positive pathogenic bacteria and absent from related nonpathogenic strains. PAIs are long DNA regions that harbor virulence genes and often differ in GC content from the remainder of the bacterial genome. Many PAIs occur in the tRNA gene loci, which provide a convenient target for foreign gene insertion. Some PAIs are highly homologous to each other and contain sequences similar to ISs, phage att sites, and plasmid ori sites, along with functional or defective integrase and transposase genes, suggesting horizontal transfer of PAIs among bacteria.

19.
Repeated sequence signatures are characteristic features of all genomic DNA. We have made a rigorous search for repeat genomic sequences in the human pathogens Neisseria meningitidis, Neisseria gonorrhoeae and Haemophilus influenzae and found that by far the most frequent 9–10mers residing within coding regions are the DNA uptake sequences (DUS) required for natural genetic transformation. More importantly, we found a significantly higher density of DUS within genes involved in DNA repair, recombination, restriction-modification and replication than in any other annotated gene group in these organisms. Pasteurella multocida also displayed high frequencies of a putative DUS identical to that previously identified in H. influenzae and with a skewed distribution towards genome maintenance genes, indicating that this bacterium might be transformation competent under certain conditions. These results imply that the high frequency of DUS in genome maintenance genes is conserved among phylogenetically divergent species and thus is of significant biological importance. Increased DUS density is expected to enhance DNA uptake and the over-representation of DUS in genome maintenance genes might reflect facilitated recovery of genome preserving functions. For example, transient and beneficial increase in genome instability can be allowed during pathogenesis simply through loss of antimutator genes, since these DUS-containing sequences will be preferentially recovered. Furthermore, uptake of such genes could provide a mechanism for facilitated recovery from DNA damage after genotoxic stress.
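
A search for the most frequent 9–10mers can be illustrated with a simple k-mer count. The sketch below is an assumption about how such a search might be set up, not the authors' pipeline: each k-mer is pooled with its reverse complement so that occurrences on either strand are counted together, and the function names and the toy sequence are hypothetical.

```python
from collections import Counter

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def top_kmers(seq, k, n=5):
    """Return the n most frequent k-mers, pooling each k-mer with its reverse complement."""
    seq = seq.upper()
    counts = Counter()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if set(kmer) <= set("ACGT"):  # skip windows containing ambiguous bases
            # use the alphabetically smaller of the two strands as the key
            counts[min(kmer, reverse_complement(kmer))] += 1
    return counts.most_common(n)


# Toy example: ACGTT occurs twice and is stored under its reverse complement AACGT
print(top_kmers("ACGTTACGTT", 5, 1))  # [('AACGT', 2)]
```

Restricting the input to annotated coding regions and comparing counts between functional gene groups would be the next step toward the density comparison the abstract describes.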

20.

Background

Spirodela polyrhiza is a species of the order Alismatales, which represents the basal lineage of monocots with more ancestral features than the Poales. The complete sequence of its mitochondrial (mt) genome could provide clues for understanding the evolution of mt genomes in plants.

Methods

The Spirodela polyrhiza mt genome was sequenced from total genomic DNA, without physical separation of chloroplast and nuclear DNA, using the SOLiD platform. Using a genome copy number sensitive assembly algorithm, the mt genome was successfully assembled. Gap closure and accuracy were determined with PCR products sequenced with the dideoxy method.

Conclusions

This is the most compact monocot mitochondrial genome with 228,493 bp. A total of 57 genes encode 35 known proteins, 3 ribosomal RNAs, and 19 tRNAs that recognize 15 amino acids. There are about 600 RNA editing sites predicted and three lineage specific protein-coding-gene losses. The mitochondrial genes, pseudogenes, and other hypothetical genes (ORFs) cover 71,783 bp (31.0%) of the genome. Imported plastid DNA accounts for an additional 9,295 bp (4.1%) of the mitochondrial DNA. Absence of transposable element sequences suggests that very few nuclear sequences have migrated into Spirodela mtDNA. Phylogenetic analysis of conserved protein-coding genes suggests that Spirodela shares a common ancestor with other monocots, but there is no obvious synteny between Spirodela and rice mtDNAs. After eliminating genes, introns, ORFs, and plastid-derived DNA, nearly four-fifths of the Spirodela mitochondrial genome is of unknown origin and function. Although it contains a similar chloroplast DNA content and range of RNA editing as other monocots, it is void of nuclear insertions, active gene loss, and comprises large regions of sequences of unknown origin in non-coding regions. Moreover, the lack of synteny with known mitochondrial genomic sequences sheds new light on the early evolution of monocot mitochondrial genomes.
