Similar literature
 Found 20 similar documents; search time 15 ms
1.
Abstract

The codon usage in the Vibrio cholerae genome is analyzed in this paper. Although there are many more genes on chromosome 1 than on chromosome 2, the codon usage patterns of genes on the two chromosomes are quite similar, indicating that the two chromosomes may have coexisted in the same cell for a very long time. Unlike the base frequency pattern observed in other genomes, the G+C content at the third codon position of the V. cholerae genome varies within a rather small interval. The most notable feature of codon usage in the V. cholerae genome is that a fraction of genes show significant bias in base choice at the second codon position. The 2006 known genes can be classified into two clusters according to the base frequencies at this position. The smaller cluster contains 227 genes, most of which code for proteins involved in transport and binding functions. The products encoded by these genes show significant bias in amino acid composition compared with other genes. The codon usage patterns of the 1836 ORFs of unknown function are also analyzed, which is useful for studying their functions.

2.
Codon usage in Aspergillus nidulans.   Total citations: 17; self-citations: 0; citations by others: 17
Summary: Synonymous codon usage in genes from the ascomycete (filamentous) fungus Aspergillus nidulans has been investigated. A total of 45 gene sequences has been analysed. Multivariate statistical analysis has been used to identify a single major trend among genes. At one end of this trend are lowly expressed genes, whereas at the other extreme lie genes known or expected to be highly expressed. The major trend is from nearly random codon usage (in the lowly expressed genes) to codon usage that is highly biased towards a set of 19–20 optimal codons. The G+C content of the A. nidulans genome is close to 50%, indicating little overall mutational bias, and so the codon usage of lowly expressed genes is as expected in the absence of selection pressure at silent sites. Most of the optimal codons are C- or G-ending, making highly expressed genes more G+C-rich at silent sites.

3.
The aim of this study was to analyze patterns of nucleotide composition and codon usage in the pea aphid genome (Acyrthosiphon pisum). A collection of 60,000 expressed sequence tags (ESTs) from the pea aphid was used to automatically reconstruct 5809 coding sequences (CDSs), based on similarity with known proteins and on coding style recognition. Reconstructions were manually checked for ribosomal proteins, leading to a tentative reconstruction of the near-complete set of this category. Pea aphid coding sequences showed a shift toward AT (especially at the third codon position) compared to Drosophila homologues. Genes with a putative high level of expression (ribosomal and other genes with high EST support) remained more GC3-rich and had a distinct codon usage from bulk sequences: they exhibited a preference for C-ending codons and CGT (for arginine), which thus appeared optimal for translation. However, the discrimination was not as strong as in Drosophila, suggesting a reduced degree of translational selection. The space of variation in codon usage for A. pisum appeared to be larger than in Drosophila, with a substantial fraction of genes that remained GC3-rich. Some of those (in particular some structural proteins) also showed high levels of codon bias and a very strong preference for C-ending codons, which could be explained either by strong translational selection or by other mechanisms. Finally, genomic traces were analyzed to build 206 fragments containing a full CDS, which allowed the correlations between the GC contents of coding and noncoding (flanking and intron) sequences to be studied.

4.
Synonymous codon usage varies both between organisms and among genes within a genome, and arises due to differences in G + C content, replication strand skew, or gene expression levels. Correspondence analysis (CA) is widely used to identify major sources of variation in synonymous codon usage among genes and provides a way to identify horizontally transferred or highly expressed genes. Four methods of CA have been developed based on three kinds of input data: absolute codon frequency, relative codon frequency, and relative synonymous codon usage (RSCU) as well as within-group CA (WCA). Although different CA methods have been used in the past, no comprehensive comparative study has been performed to evaluate their effectiveness. Here, the four CA methods were evaluated by applying them to 241 bacterial genome sequences. The results indicate that WCA is more effective than the other three methods in generating axes that reflect variations in synonymous codon usage. Furthermore, WCA reveals sources that were previously unnoticed in some genomes; e.g. synonymous codon usage related to replication strand skew was detected in Rickettsia prowazekii. Though CA based on RSCU is widely used, our evaluation indicates that this method does not perform as well as WCA.
Key words: correspondence analysis, synonymous codon usage, horizontal gene transfer, strand-specific mutational bias, translational selection
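
The three kinds of input data named in this abstract can be made concrete with a small calculation. The following is a minimal sketch, not the authors' pipeline: it counts codons in one coding sequence (absolute frequency) and derives RSCU, defined here as the observed count of a codon divided by the count expected if all synonymous codons for that amino acid were used equally. The function names `codon_counts` and `rscu` are hypothetical, and the standard genetic code is assumed.

```python
from collections import Counter, defaultdict

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code (assumed): codon -> amino acid, "*" marks stop codons.
CODON_TABLE = {a + b + c: AMINO[16 * i + 4 * j + k]
               for i, a in enumerate(BASES)
               for j, b in enumerate(BASES)
               for k, c in enumerate(BASES)}


def codon_counts(cds):
    """Absolute codon frequencies of an in-frame coding sequence."""
    cds = cds.upper().replace("U", "T")
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    codons = (cds[i:i + 3] for i in range(0, usable, 3))
    # Codons containing ambiguous bases (e.g. N) are skipped.
    return Counter(c for c in codons if c in CODON_TABLE)


def rscu(cds):
    """RSCU per codon; amino acids absent from the sequence are omitted."""
    counts = codon_counts(cds)
    families = defaultdict(list)
    for codon, aa in CODON_TABLE.items():
        families[aa].append(codon)
    values = {}
    for family in families.values():
        total = sum(counts[c] for c in family)
        if total == 0:
            continue
        for c in family:
            # observed / expected, where expected = total / family size
            values[c] = counts[c] * len(family) / total
    return values


if __name__ == "__main__":
    # Toy sequence: four alanine codons, three GCT and one GCC.
    for codon, value in sorted(rscu("GCTGCTGCTGCC").items()):
        print(codon, round(value, 2))
```

A value above 1 marks a codon used more often than expected under equal usage. In this sketch the stop codons are treated as one synonymous family and single-codon amino acids (Met, Trp) always score 1; published tools may handle both cases differently.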

5.
Burkholderia pseudomallei is a recognized biothreat agent and the causative agent of melioidosis. Codon usage biases of all protein-coding genes (length greater than or equal to 300 bp) from the complete genome of B. pseudomallei K96243 have been analyzed. As B. pseudomallei is a GC-rich organism (68.5%), overall codon usage data analysis indicates that codons ending in G and/or C are indeed predominant in this organism. Multivariate statistical analysis, however, indicates that there is a single major trend in the codon usage variation among the genes in this organism, which has a strong positive correlation with the expressivities of the genes. The majority of the lowly expressed genes are scattered towards the negative end of the major axis whereas the highly expressed genes are clustered towards the positive end. At the same time, from the significant correlations between axis 1 coordinates and the GC and GC3s contents of each sequence, and the clearly significant negative correlations between the ‘Effective Number of Codons’ values and the GC and GC3s contents, we inferred that codon usage bias was also affected by gene nucleotide composition. In addition, some other factors such as the lengths of the genes as well as the hydrophobicity of genes also influence the codon usage variation among the genes in this organism in a minor way. Notably, 21 codons have been defined as ‘optimal codons’ of B. pseudomallei. In summary, our work has provided a basic understanding of the mechanisms of codon usage bias and some more useful information for improving the expression of target genes in vivo and in vitro. Sheng Zhao and Qin Zhang contributed equally to this work.

6.
Plant chloroplast genes have a codon use that reflects the genome compositional bias of a high A+T content, with the single exception of the highly translated psbA gene, which codes for the photosystem II D1 protein. The codon usage of plant psbA corresponds more closely to the limited tRNA population of the chloroplast and is very similar to the codon use observed in the chloroplast genes of the green alga Chlamydomonas reinhardtii. This pattern of codon use may be an adaptation for increased translation efficiency. A correspondence between codon use of plant psbA and Chlamydomonas chloroplast genes and the tRNAs coded by the chloroplast genome, however, is not observed in all synonymous codon groups. It is shown here that the degree of correspondence between codon use and tRNA population in different synonymous groups is correlated with the second codon position composition. Synonymous groups with an A or T at the second codon position have a high representation of codons for which a complementary tRNA is coded by the chloroplast genome. Those with a G or C at the second position have an increased representation of codons that bind a chloroplast tRNA by wobble. It is proposed that the difference between synonymous groups in terms of codon adaptation to the tRNA population in plant psbA and Chlamydomonas chloroplast genes may be the result of differences in second position composition.

7.
Genetic analysis of Rickettsia prowazekii has been hindered by the lack of selectable markers and efficient mechanisms for generating rickettsial gene knockouts. We have addressed these problems by adapting a gene that codes for rifampin resistance for expression in R. prowazekii and by incorporating this selection into a transposon mutagenesis system suitable for generating rickettsial gene knockouts. The arr-2 gene codes for an enzyme that ADP-ribosylates rifampin, thereby destroying its antibacterial activity. Based on the published sequence, this gene was synthesized by PCR with overlapping primers that contained rickettsial codon usage base changes. This R. prowazekii-adapted arr-2 gene (Rparr-2) was placed downstream of the strong rickettsial rpsL promoter (rpsLP), and the entire construct was inserted into the Epicentre EZ::TN transposome system. A purified transposon containing rpsLP-Rparr-2 was combined with transposase, and the resulting DNA-protein complex (transposome) was electroporated into competent rickettsiae. Following selection with rifampin, rickettsiae with transposon insertions in the genome were identified by PCR and Southern blotting and the insertion sites were determined by rescue cloning and inverse PCR. Multiple insertions into widely spaced areas of the R. prowazekii genome were identified. Three insertions were identified within gene coding sequences. Transposomes provide a mechanism for generating random insertional mutations in R. prowazekii, thereby identifying nonessential rickettsial genes.

8.
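Analysis of codon usage bias in the chloroplast genome of the camphor tree (Cinnamomum camphora)   Total citations: 3; self-citations: 0; citations by others: 3
Qin Zheng  Zheng Yongjie  Gui Lijing  Xie Gu'ai  Wu Yanfang 《Guihaia》(广西植物) 2018,38(10):1346-1355
To characterize the codon usage bias pattern of the chloroplast genome of the camphor tree (Cinnamomum camphora), this study used CodonW, EMBOSS, R and related programs to systematically analyze codon usage patterns and bias in 53 coding sequences of the camphor tree chloroplast genome. The results showed that the effective number of codons (ENC) of the chloroplast genes ranged from 36.82 to 59.30, indicating weak codon bias. Relative synonymous codon usage (RSCU) analysis found 32 codons with RSCU > 1, 28 of which end in A or U, indicating that A and U are preferred at the third codon position. Neutrality plot analysis found that the correlation between GC3 and GC12 was not significant and that the slope of the regression line was 0.049, indicating that codon bias is mainly shaped by natural selection. ENC-plot analysis found that most genes fall below the curve, likewise indicating that selection is the main factor affecting codon bias. Nine codons (UUU, CUU, UCA, ACA, UAU, AAU, GAU, UGA, GGA) were identified as optimal codons of the camphor tree chloroplast genome.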
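
The neutrality plot reported in this abstract regresses per-gene GC12 (mean GC content at codon positions 1 and 2) on GC3. A slope near 0 is usually read as selection dominating, and a slope near 1 as mutation pressure dominating. The sketch below fits that regression line by ordinary least squares. The function name `neutrality_slope` and the sample values are hypothetical illustrations, not data from the study.

```python
def neutrality_slope(gc3, gc12):
    """Least-squares slope and intercept of GC12 regressed on GC3.

    gc3 and gc12 are equal-length sequences with one value per gene.
    """
    if len(gc3) != len(gc12) or len(gc3) < 2:
        raise ValueError("need two equal-length lists with at least 2 genes")
    n = len(gc3)
    mean_x = sum(gc3) / n
    mean_y = sum(gc12) / n
    sxx = sum((x - mean_x) ** 2 for x in gc3)
    if sxx == 0:
        raise ValueError("GC3 values are all identical; slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(gc3, gc12))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


if __name__ == "__main__":
    # Made-up percentages for three genes, for illustration only.
    slope, intercept = neutrality_slope([20.0, 30.0, 40.0], [41.0, 42.0, 43.0])
    print(f"slope={slope:.3f} intercept={intercept:.2f}")
```

With a slope such as the 0.049 quoted above, GC12 barely tracks GC3, which is the basis for the abstract's conclusion that selection outweighs mutation pressure.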

9.
To understand the synonymous codon usage pattern in the mitochondrial genome of Antheraea assamensis, we analyzed the 13 mitochondrial protein‐coding genes of this species using a bioinformatic approach, as no such work had yet been reported. The nucleotide composition analysis suggested that the percentages of A, T, G, and C were 33.73, 46.39, 9.7 and 10.17, respectively, and that the overall GC content was 19.86%, that is, lower than 50%, so the genes were AT rich. The mean effective number of codons of the mitochondrial protein‐coding genes was 36.30, which indicated low codon usage bias (CUB). Relative synonymous codon usage analysis identified overrepresented and underrepresented codons in each gene, and the pattern of codon usage differed among genes. Neutrality plot analysis revealed a narrow range of distribution for GC content at the third codon position, and some points were diagonally distributed, suggesting that both mutation pressure and natural selection influenced the CUB.
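
The composition figures in this abstract are simple proportions: the overall GC content is the sum of the G and C percentages (9.7 + 10.17 = 19.87, which matches the reported 19.86 up to rounding). A minimal sketch of how such values, and the GC content at a given codon position, could be computed from a coding sequence follows. The function names are hypothetical and the toy sequences are not taken from the study.

```python
def base_percentages(seq):
    """Percentage of A, C, G and T among the unambiguous bases of seq."""
    seq = seq.upper().replace("U", "T")
    total = sum(seq.count(b) for b in "ACGT")
    if total == 0:
        raise ValueError("sequence contains no A, C, G or T")
    return {b: 100.0 * seq.count(b) / total for b in "ACGT"}


def gc_at_position(cds, position):
    """GC percentage at codon position 1, 2 or 3 of an in-frame CDS."""
    if position not in (1, 2, 3):
        raise ValueError("position must be 1, 2 or 3")
    usable = len(cds) - len(cds) % 3  # drop a trailing partial codon
    bases = cds.upper()[position - 1:usable:3]
    if not bases:
        raise ValueError("sequence is shorter than one codon")
    return 100.0 * sum(b in "GC" for b in bases) / len(bases)


if __name__ == "__main__":
    print(base_percentages("AAAATTTTGC"))     # AT-rich toy sequence
    print(gc_at_position("ATGGCCTTATTT", 3))  # GC3 of four toy codons
```

GC3 as computed here counts every third position. GC3s, used in several of the abstracts in this list, restricts the count to synonymous third positions and would need the genetic code as an extra input.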

10.
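Wang Yan  Zhao Yichen  Zhao Degang 《Guihaia》(广西植物) 2021,41(2):274-282
To understand the codon usage pattern of Eucommia ulmoides genes, this study took the codons of the E. ulmoides genome as its subject. CodonW was used on 320 protein-coding genes of E. ulmoides for relative synonymous codon usage (RSCU) analysis, for ENC-GC3s association analysis of the ENC values of the coding genes, and for PR2-plot bias analysis of their base usage frequencies. CUSP and the Codon Usage Database were used to compare the GC content and codon usage frequencies of E. ulmoides genes with those of four representative species: tobacco, Arabidopsis thaliana, Escherichia coli and Saccharomyces cerevisiae. The results showed that 30 codons had RSCU > 1, of which 18 end in G/C and 12 end in A/U, indicating that E. ulmoides genes prefer codons ending in G/C and that the preference is relatively strong. The effective number of codons (ENC) ranged from 30 to 60; codons in this range lay far from the standard curve, with small ENC values and relatively strong bias. PR2-plot bias analysis of base usage frequencies showed G > C and U > A. The GC content comparison showed that the GC12, GC3 and mean GC content of E. ulmoides were all higher than those of the representative species. The codon usage frequency comparison showed that the codon preferences of E. ulmoides are relatively close to those of tobacco and S. cerevisiae and differ considerably from those of A. thaliana and E. coli. E. ulmoides is a valuable traditional Chinese medicinal material unique to China; analyzing its codon usage pattern and the rules governing its codon preferences provides a theoretical basis for improving and expressing exogenous genes in E. ulmoides genetic engineering.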

11.
The codon usage in the Vibrio cholerae genome is analyzed in this paper. Although there are many more genes on chromosome 1 than on chromosome 2, the codon usage patterns of genes on the two chromosomes are quite similar, indicating that the two chromosomes may have coexisted in the same cell for a very long time. Unlike the base frequency pattern observed in other genomes, the G+C content at the third codon position of the V. cholerae genome varies within a rather small interval. The most notable feature of codon usage in the V. cholerae genome is that a fraction of genes show significant bias in base choice at the second codon position. The 2,006 known genes can be classified into two clusters according to the base frequencies at this position. The smaller cluster contains 227 genes, most of which code for proteins involved in transport and binding functions. The products encoded by these genes show significant bias in amino acid composition compared with other genes. The codon usage patterns of the 1,836 ORFs of unknown function are also analyzed, which is useful for studying their functions.

12.
Codon usage patterns in the 16 chromosomes of Saccharomyces cerevisiae coincided with each other, and the same result was obtained for Encephalitozoon cuniculi, which has 11 chromosomes, although the function of each chromosome differs. In addition, preferential codon usage in the regenerated coding systems for Leu and Lys differed between Saccharomyces cerevisiae and Encephalitozoon cuniculi. These results cannot be explained by Darwin's natural selection theory or by the neutral theory proposed against Darwin's. Furthermore, the codon usage patterns were examined in both prokaryotes and eukaryotes. The use of G or C at the third codon position was much lower than that of T or A in Ureaplasma urealyticum, whereas, inversely, the use of G or C at the third codon position was much higher than that of T or A in Mycobacterium tuberculosis. Additionally, Candida albicans and Plasmodium falciparum also showed a very low usage of G or C at the third codon position. It is a difficult leap to speculate that the inverse codon usage change occurred over the genome during biological evolution. Thus, the present results strongly suggest that organisms were derived from different origins, indicating that the origin of life was plural, based on genomic structures.

13.
Summary: The vestigial plastid genome of Epifagus virginiana (beechdrops), a nonphotosynthetic parasitic flowering plant, is functional but lacks six ribosomal protein and 13 tRNA genes found in the chloroplast DNAs of photosynthetic flowering plants. Import of nuclear gene products is hypothesized to compensate for many of these losses. Codon usage and amino acid usage patterns in Epifagus plastid genes have not been affected by the tRNA gene losses, though a small shift in the base composition of the whole genome (toward A + T richness) is apparent. The ribosomal protein and tRNA genes that remain have had a high rate of molecular evolution, perhaps due to relaxation of constraints on the translational apparatus. Despite the compactness and extensive gene loss, one translational gene (infA, encoding initiation factor 1) that is a pseudogene in tobacco has been maintained intact in Epifagus.
Offprint requests to: J.D. Palmer

14.
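Using young leaves from twigs of two wild Elaeagnus angustifolia Linn. plants that had been grown hydroponically in a greenhouse, this study extracted total DNA from each plant by the CTAB method, sequenced the total DNA de novo with second-generation sequencing, and assembled the complete chloroplast genome sequences of the two plants. Codon usage bias in the protein-coding genes and its causes were then analyzed in detail, laying a foundation for chloroplast genetic engineering and molecular phylogenetic studies of E. angustifolia. The results were as follows. (1) The assembled E. angustifolia chloroplast genome is 150 546 bp long and consists of a long single-copy (LSC) region of 81 113 bp and a short single-copy (SSC) region of 25 494 bp, separated by a pair of inverted repeats (IRs) of 18 445 bp; annotation yielded 132 genes, comprising 86 protein-coding genes, 38 tRNA genes and 8 rRNA genes. (2) The GC content at the third codon position (GC3) of the protein-coding genes is 28.47%, clearly lower than the GC content of the whole chloroplast genome (37%) and also lower than the GC content at the first (GC1) and second (GC2) positions, indicating a preference for codons ending in A or T; UCU, CCU, UGU, GCU, CUU, GAU, UCA and UAA are the optimal codons. (3) Relative synonymous codon usage (RSCU) analysis found that no single factor determines the codon usage pattern: codon bias is shaped jointly by mutation, selection and other factors, and sequence differences arising from natural selection on expression affect codon bias more markedly than mutation does. Neutrality plot analysis, effective number of codons (ENC-plot) analysis and parity rule 2 (PR2-plot) analysis indicated that codon usage bias in the E. angustifolia chloroplast genome is influenced more strongly by selection. (4) Phylogenetic trees built from chloroplast gene sequences of six Elaeagnaceae species and one jujube by maximum likelihood, maximum parsimony and Bayesian methods agreed with the clustering based on codon usage bias, indicating that codon usage bias in chloroplast genomes is related to the phylogenetic relationships among species.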

15.
Rao Y  Wu G  Wang Z  Chai X  Nie Q  Zhang X 《DNA research》2011,18(6):499-512
Synonymous codons are used with different frequencies both among species and among genes within the same genome and are controlled by neutral processes (such as mutation and drift) as well as by selection. Up to now, a systematic examination of the codon usage for the chicken genome has not been performed. Here, we carried out a whole genome analysis of the chicken genome by the use of the relative synonymous codon usage (RSCU) method and identified 11 putative optimal codons, all of them ending with uracil (U), which departs significantly from the pattern observed in other eukaryotes. Optimal codons in the chicken genome are most likely the ones corresponding to highly expressed transfer RNAs (tRNAs) or tRNA gene copy numbers in the cell. Codon bias, measured as the frequency of optimal codons (Fop), is negatively correlated with G + C content and recombination rate, but positively correlated with gene expression, protein length, gene length and intron length. The positive correlation between codon bias and protein, gene and intron length is quite different from other multicellular organisms, as this trend has been found only in unicellular organisms. Our data showed that regional G + C content explains a large proportion of the variance of codon bias in chicken. Stepwise selection model analyses indicate that the G + C content of the coding sequence is the most important factor for codon bias. It appears that variation in the G + C content of CDSs accounts for over 60% of the variation of codon bias. This study suggests that both mutation bias and selection contribute to codon bias. However, mutation bias is the driving force of the codon usage in the Gallus gallus genome. Our data also provide evidence that the negative correlation between codon bias and recombination rates in G. gallus is determined mostly by recombination-dependent mutational patterns.

16.
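To analyze codon usage bias in the chloroplast genomes of Firmiana pulcherrima (美丽梧桐) and Firmiana major (云南梧桐), this study selected 52 protein-coding sequences from the chloroplast genome of each species and analyzed their codon usage patterns and bias with CodonW, CUSP and SPSS. The results showed the following. (1) The GC contents of F. pulcherrima and F. major are 38.12% and 38.05%, respectively, indicating that the chloroplast genomes are rich in A/T bases. (2) The effective number of codons (ENC) ranges from 36.91 to 56.46 and from 36.55 to 58.04, respectively, indicating that most codons show weak bias. (3) Relative synonymous codon usage (RSCU) analysis showed that each species has 29 codons with RSCU > 1, of which 28 end in A or U. (4) The neutrality plots showed that the correlation between GC3 and GC12 is not significant, with regression slopes of 0.195 and 0.304, respectively, indicating that codon bias is mainly shaped by natural selection. (5) In the ENC-plot analysis most genes lie around and below the curve, and most ENC ratios fall between -0.04 and 0.10, indicating that mutation also affects the formation of codon bias. In addition, 17 and 18 codons were identified as optimal codons of F. pulcherrima and F. major, respectively. These results indicate that codon usage bias in the chloroplast genomes of F. pulcherrima and F. major is probably shaped jointly by selection and mutation, and that the usage patterns of the two species are quite similar but differ to some extent, which may be related to evolutionary mechanisms of environmental adaptation.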
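
The ENC-plot in this abstract compares each gene's observed ENC with the value expected if codon usage were determined by base composition alone. A minimal sketch follows, assuming Wright's standard curve ENC = 2 + s + 29 / (s² + (1 − s)²), where s is GC3s expressed as a fraction, and assuming the common definition ENC ratio = (expected − observed) / expected. The abstract does not state its exact formula, so both are assumptions, and the function names are hypothetical.

```python
def expected_enc(gc3s):
    """Expected ENC under no selection (Wright's curve); gc3s is in [0, 1]."""
    if not 0.0 <= gc3s <= 1.0:
        raise ValueError("gc3s must be a fraction between 0 and 1")
    return 2.0 + gc3s + 29.0 / (gc3s ** 2 + (1.0 - gc3s) ** 2)


def enc_ratio(observed_enc, gc3s):
    """(expected - observed) / expected; the definition is assumed here."""
    expected = expected_enc(gc3s)
    return (expected - observed_enc) / expected


if __name__ == "__main__":
    # Illustrative gene, not taken from the study: GC3s = 0.30, ENC = 48.
    print(round(expected_enc(0.30), 2))
    print(round(enc_ratio(48.0, 0.30), 3))
```

A ratio near 0 places the gene on the expected curve, consistent with composition (mutation) explaining its codon usage. A clearly positive ratio places it below the curve, the pattern usually attributed to selection.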

17.
Hua J  Li M  Dong P  Xie Q  Bu W 《Molecular biology reports》2009,36(7):1757-1765
The first complete mitochondrial genome of the dobsonfly Protohermes concolorus Yang et Yang, 1988 (Megaloptera: Corydalidae) was sequenced in this study. The genome was a circular molecule of 15,851 bp containing the typical 37 genes, arranged in the same order as in the putative ancestor of hexapods. Sequence overlaps were observed between several neighboring genes, which made the genome relatively compact. The tRNA-Ser (GCT) could not be folded into the typical secondary structure because its DHU arm was replaced with a simple loop. Six of the 13 protein genes were terminated with a single T adjacent to a downstream tRNA gene on the same strand. The variation of GC content caused the different nucleotide substitution patterns of the protein genes. The genome was AT-biased with a total A + T content of 75.83%, which was also demonstrated by the codon usage. The control region was the most AT-rich region, with a sub-region of even higher A + T content. Protein genes on the two strands presented opposite CG-skew trends, which was also reflected in the codon usage. For most of the amino acids, the protein coding sequences did not prefer to use the cognate codons of the corresponding tRNAs, and the codon usage of the protein genes was not random. The variation of nucleotide substitution patterns of protein genes was significantly correlated with the GC content. The phylogenetic analyses based on all 13 protein genes showed that Megaloptera was the sister group of the other holometabolous insects except Coleoptera.

18.
Patterns in codon usage were examined for the coding regions of the 23 known lepidopteran hemolymph proteins. Coding triplets are GC rich at the third position and a significant linear relationship between GC content of silent and nonsilent (replacement) sites was demonstrated. Intron GC content was significantly lower than in coding regions and no relationship between intron GC content and the same at silent and nonsilent sites was found. Though hemolymph proteins are all produced by the same tissue (fat body), significantly less bias was observed when all moth sequences were pooled than when sequences of the two major species were analyzed separately, as predicted by the genome hypothesis. In cases where no statistically significant bias was observed, polar or acidic basic amino acids were almost exclusively involved. Calculation of codon adaptation indices (CAI) was of limited value in quantifying the degree of codon bias and probably reflects the complexity of multicellular-organism life cycles and the changing patterns of gene expression over different developmental stages.
Correspondence to: D.R. Frohlich

19.
Genes with atypical G+C content and pattern of codon usage in a certain genome are possibly of exotic origin, and this idea has been applied to identify horizontal events. In this way, it was postulated that a total of 755 genes in the E. coli genome are relics of horizontal events after the divergence of E. coli from the Salmonella lineage 100 million years ago (Lawrence and Ochman, 1998). In this paper we propose a new way to study sequence composition more thoroughly. We found that although the 755 genes differ in composition from other genes in the E. coli genome, the difference is minor. If we accepted that these genes are horizontally transferred, then (1) it would be more likely that they were transferred from genomes evolutionarily closely related to E. coli; but (2) the dating method used by Lawrence and Ochman (1997, 1998) largely underestimated the average age of introduced sequences in the E. coli genome, in particular, most of the 755 genes should be introduced into E. coli before, instead of after, the divergence of E. coli from the Salmonella lineage. Our study reveals that atypical G+C content and pattern of codon usage are not reliable indicators of horizontal gene transfer events.
Received: 27 September 2000 / Accepted: 9 April 2001

20.
Phytophthora is a genus entirely comprised of destructive plant pathogens. It belongs to the Stramenopila, a unique branch of eukaryotes, phylogenetically distinct from plants, animals, or fungi. Phytophthora genes show a strong preference for usage of codons ending with G or C (high GC3). The presence of high GC3 in genes can be utilized to differentiate coding regions from noncoding regions in the genome. We found that both selective pressure and mutation bias drive codon bias in Phytophthora. Indicative for selection pressure is the higher GC3 value of highly expressed genes in different Phytophthora species. Lineage specific GC increase of noncoding regions is reminiscent of whole-genome mutation bias, whereas the elevated Phytophthora GC3 is primarily a result of translation efficiency-driven selection. Heterogeneous retrotransposons exist in Phytophthora genomes and many of them vary in their GC content. Interestingly, the most widespread groups of retroelements in Phytophthora show high GC3 and a codon bias that is similar to host genes. Apparently, selection pressure has been exerted on the retroelement’s codon usage, and such mimicry of host codon bias might be beneficial for the propagation of retrotransposons.
Reviewing Editor: Dr. Yves van de Peer
