Similar articles
Found 20 similar articles (search time: 34 ms)
1.
Gasior SL  Preston G  Hedges DJ  Gilbert N  Moran JV  Deininger PL 《Gene》2007,390(1-2):190-198
The human Long Interspersed Element-1 (LINE-1) and the Short Interspersed Element (SINE) Alu comprise 28% of the human genome. They share the same L1-encoded endonuclease for insertion, which recognizes an A+T-rich sequence. Under a simple model of insertion distribution, this nucleotide preference would lead to the prediction that the populations of both elements would be biased towards A+T-rich regions. Genomic L1 elements do show an A+T-rich bias. In contrast, Alu is biased towards G+C-rich regions when compared to the genome average. Several analyses have demonstrated that relatively recent insertions of both elements show less G+C content bias relative to older elements. We have analyzed the repetitive element and G+C composition of more than 100 pre-insertion loci derived from de novo L1 insertions in cultured human cancer cells, which should represent an evolutionarily unbiased set of insertions. An A+T-rich bias is observed in the 50 bp flanking the endonuclease target site, consistent with the known target site for the L1 endonuclease. The L1, Alu, and G+C content of 20 kb of the de novo pre-insertion loci shows a different set of biases than that observed for fixed L1s in the human genome. In contrast to the insertion sites of genomic L1s, the de novo L1 pre-insertion loci are relatively L1-poor, Alu-rich and G+C neutral. Finally, a statistically significant cluster of de novo L1 insertions was localized in the vicinity of the c-myc gene. These results suggest that the initial insertion preference of L1, while A+T-rich in the initial vicinity of the break site, can be influenced by the broader content of the flanking genomic region and have implications for understanding the dynamics of L1 and Alu distributions in the human genome.

2.
In special coordinates (codon position-specific nucleotide frequencies), bacterial genomes form two straight lines in 9-dimensional space: one line for eubacterial genomes, another for archaeal genomes. All 348 distinct bacterial genomes available in GenBank in April 2007 belong to these lines with high accuracy. The main challenge now is to explain the observed high accuracy. The new phenomenon of complementary symmetry for codon position-specific nucleotide frequencies is observed. The results of analysis of several codon usage models are presented. We demonstrate that the mean-field approximation, which is also known as the context-free model, the complete independence model, or the Segre variety, can serve as a reasonable approximation to the real codon usage. The first two principal components of codon usage correlate strongly with genomic G+C content and the optimal growth temperature, respectively. The variation of codon usage along the third component is related to the curvature of the mean-field approximation. The first three eigenvalues in codon usage PCA explain 59.1%, 7.8% and 4.7% of variation. Codon usage in eubacterial and archaeal genomes is clearly distributed along two third-order curves with genomic G+C content as a parameter.
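The "codon position-specific nucleotide frequencies" mentioned in this abstract can be illustrated with a minimal Python sketch. It assumes that the nine coordinates come from the twelve base-by-position frequencies after dropping one base per codon position (the four frequencies at each position sum to 1); the abstract does not spell out this construction, and the function names `codon_position_frequencies` and `nine_dim_vector` are hypothetical.

```python
from collections import Counter

BASES = "ACGT"


def codon_position_frequencies(cds: str) -> dict:
    """Return the frequency of each base at codon positions 1, 2 and 3.

    Assumes `cds` is an in-frame coding sequence; codons containing
    characters other than A, C, G, T are skipped.
    """
    seq = cds.upper()
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    counts = {pos: Counter() for pos in (1, 2, 3)}
    for i in range(0, usable, 3):
        codon = seq[i:i + 3]
        if all(base in BASES for base in codon):
            for pos, base in enumerate(codon, start=1):
                counts[pos][base] += 1
    freqs = {}
    for pos in (1, 2, 3):
        total = sum(counts[pos].values())
        for base in BASES:
            freqs[(base, pos)] = counts[pos][base] / total if total else 0.0
    return freqs


def nine_dim_vector(freqs: dict) -> list:
    """Drop T at each position (assumed convention) to get 9 coordinates."""
    return [freqs[(base, pos)] for pos in (1, 2, 3) for base in "ACG"]


if __name__ == "__main__":
    f = codon_position_frequencies("ATGGCCAAATAA")
    print(nine_dim_vector(f))
```

In practice one vector would be computed per genome from all of its annotated coding sequences, and the resulting points could then be examined with PCA as the abstract describes.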

3.
The effects of experimental burial and erosion on the seagrass Zostera noltii were assessed through in situ manipulation of the sediment level (−2 cm, 0 cm, +2 cm, +4 cm, +8 cm and +16 cm). Shoot density, leaf and sheath length, internode length, C and N content and carbohydrates of leaves and rhizomes were examined 1, 2, 4 and 8 weeks after disturbance. Both burial and erosion resulted in a decrease in shoot density for all the sediment levels. The threshold for total shoot loss was between 4 cm and 8 cm of burial, particularly during the 2nd week. A laboratory experiment confirmed that shoots did not survive more than 2 weeks under complete burial. There was no evidence of induced flowering by burial or erosion. Likewise, no clear evidence was found of sediment level effects on leaf and sheath length. Longer rhizome internodes were observed as a response to both burial and erosion, suggesting an attempt by the plant to relocate the leaf-producing meristems closer to the sediment surface or to find new sediment away from the eroded area. The C content of leaves and rhizomes, as well as the non-structural carbohydrates (mainly the starch in rhizomes), decreased significantly over the experimental period, indicating the internal mobilization of carbon to meet the plant's demands as a consequence of light deprivation. The significant decrease of N content in leaves, and its simultaneous increase in rhizomes, suggests the internal translocation of nitrogen from leaves to rhizomes. About 50% of the N lost by the leaves was recovered by the rhizomes. Our results indicated that Z. noltii has a high sensitivity to burial and erosion disturbance, which should be considered in the management of coastal activities.

4.
Sueoka N 《Gene》2002,300(1-2):141-154
The intra-strand Parity Rule 2 of DNA (PR2) states that A=T and G=C within each strand. Useful corollaries of PR2 are G/(G+C)=A/(A+T)=0.5, G/(G+A)=C/(C+T)=G+C, G/(G+T)=C/(C+A)=G+C. Here, A, T, G, and C represent relative contents of the four nucleotide residues in a specific strand of DNA, so that A+T+G+C=1. Thus, deviations from PR2 are a sign of strand-specific (or asymmetric) mutation and/or selection pressures. The present study delineates the symmetric and asymmetric effects of mutations on the intra-genomic heterogeneity of the G+C content in the human genome. The results of this study on the human genome are: (1) When both two- and four-codon amino acids were combined, only slight departures from PR2 were observed in the total ranges of G+C content of the third-codon position. Thus, the G+C heterogeneity is likely to be caused by symmetric mutagenesis between the two strands. (2) The above result makes the deamination of cytosine due to double-strand breathing of DNA [Mol. Biol. Evol. 17 (2000) 1371] and/or incorporation of the oxidized guanine (8-oxo-guanine) opposite adenine during DNA replication (dGTP-oxidation hypothesis) the most likely candidates for the major cause of the diversity of the G+C content. (3) Patterns of amino acid-specific PR2-biases detected by plotting PR2 corollaries against the G+C content of the third codon position revealed that eight four-codon amino acids can be divided into three types by the second codon letter: (a) C2-type (Ala, Pro, Ser4, and Thr), (b) G2-type (Arg4 and Gly), and (c) T2-type (Leu4 and Val). (4) Most of the asymmetric plot patterns of the above three classes in PR2 biases can be explained by C2→T2 deamination of C2pG3 of C2-type to T2pG3 (T2-type) in both human and chicken. This explains the existence of some preferred codons in human and chicken. However, these (asymmetric) biases hardly contribute to the overall G+C content diversity of the third codon position.
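The PR2 corollaries quoted in this abstract can be checked numerically on a single strand. The following is a minimal sketch, not the author's procedure: the function name `pr2_ratios` and the dictionary keys are hypothetical, and the input is assumed to be one strand containing only A, T, G and C.

```python
def pr2_ratios(strand: str) -> dict:
    """Compute G+C content and PR2 ratios for one DNA strand.

    Under PR2 (A=T and G=C within the strand), G/(G+C) and A/(A+T)
    both equal 0.5, and G/(G+A) and C/(C+T) both equal the G+C content.
    """
    seq = strand.upper()
    a, t, g, c = (seq.count(base) for base in "ATGC")
    total = a + t + g + c
    if total == 0:
        raise ValueError("sequence contains no A, T, G or C")

    def ratio(num: int, den: int) -> float:
        # Return NaN when a ratio is undefined (for example, no G or C).
        return num / den if den else float("nan")

    return {
        "GC_content": (g + c) / total,
        "G/(G+C)": ratio(g, g + c),
        "A/(A+T)": ratio(a, a + t),
        "G/(G+A)": ratio(g, g + a),
        "C/(C+T)": ratio(c, c + t),
    }


if __name__ == "__main__":
    # A strand that obeys PR2, then one that is skewed towards A and G.
    print(pr2_ratios("AATTGGCC"))
    print(pr2_ratios("AAATGGGC"))
```

Values of G/(G+C) or A/(A+T) that depart from 0.5 indicate the strand asymmetry that the abstract attributes to strand-specific mutation or selection pressure.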

5.
Pérez-Brocal V  Latorre A  Gil R  Moya A 《Gene》2005,345(1):73-80
Preliminary analysis of two selected genomic regions of Buchnera aphidicola BCc, the primary endosymbiont of the cedar aphid Cinara cedri, has revealed a number of interesting features when compared with the corresponding homologous regions of the three B. aphidicola genomes previously sequenced, which are associated with different aphid species. Both regions exhibit a significant reduction in length and gene number in B. aphidicola BCc, as could be expected, since it possesses the smallest bacterial genome. However, the observed genome reduction is not uniform across the two regions, as it appears to depend on the nature of their gene content. The region fpr-trxA, which contains mainly metabolic genes, has lost almost half of its genes (45.6%) and has reduced its length by 52.9%. The reductive process in the region rrl-aroK, which contains mainly ribosomal protein genes, is less dramatic, since it has lost 9.3% of its genes and has reduced its length by 15.5%. Length reduction is mainly due to the loss of protein-coding genes, not to the shortening of ORFs or intergenic regions. In both regions, G+C content is about 4% lower in BCc than in the other B. aphidicola strains. However, when only conserved genes and intergenic regions of the four B. aphidicola strains are compared, the G+C reduction is higher in the fpr-trxA region.

6.
The mean (G + C) composition (51.0%) and standard deviation (± 3.8%) of published DNA sequences accounting for 10% of the E. coli genome are in excellent agreement with the principal overall distribution determined by high resolution melting. While differences in base and neighbor characteristics are small and uniform throughout all regions of the genome, it is found that the (G + C) content of sequences varies in segmented fashion within boundaries corresponding to coding (53% G + C) and noncoding (46% G + C) regions, with variances in the latter being six-fold greater than in coding regions. The variance in different regions shows a strong negative dependence on (G + C) content of the region, reflecting the condition that A-T and G-C base pairs are preferred neighbors of A-T and C-G pairs, respectively, with the bias increasing with decreasing (G + C) content. Neighbor analysis indicates the most extreme positive biases occur in AA, TT, GC and CG throughout all regions, but particularly in noncoding regions. Extraordinary numbers of oligomeric strings of (A)n, etc., are the further consequence of this bias. These and other characteristics point to the existence of inherent biases in neighbor frequencies levied during replication or repair, and which reflect, in turn, neighbor influences during mutation. The bias in codon usage noted by Grantham and others is seen here as due, in part, to the adaptation of coding sequences to this microenvironment through selection among synonymous codons so as to preserve inherent neighbor biases.
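One common way to quantify the kind of neighbor bias described in this abstract is the ratio of observed to expected dinucleotide frequencies. The abstract does not state which statistic was used, so the sketch below is an assumption: the function name `neighbor_bias` is hypothetical, the expected frequency is taken as the product of the two single-base frequencies, and the input is assumed to contain only A, C, G and T.

```python
from collections import Counter


def neighbor_bias(seq: str) -> dict:
    """Return observed/expected frequency ratios for all 16 dinucleotides.

    A ratio above 1 means the pair occurs more often than expected from
    base composition alone; a ratio below 1 means it occurs less often.
    """
    seq = seq.upper()
    n = len(seq)
    if n < 2:
        raise ValueError("need at least two bases")
    base_freq = {base: seq.count(base) / n for base in "ACGT"}
    # Count overlapping neighbor pairs along the strand.
    pairs = Counter(seq[i:i + 2] for i in range(n - 1))
    total_pairs = n - 1
    bias = {}
    for first in "ACGT":
        for second in "ACGT":
            expected = base_freq[first] * base_freq[second]
            observed = pairs[first + second] / total_pairs
            # Undefined when one of the bases never occurs.
            bias[first + second] = observed / expected if expected else float("nan")
    return bias


if __name__ == "__main__":
    ratios = neighbor_bias("AAAACCCCAAGGTTTT")
    for pair in ("AA", "TT", "GC", "CG"):
        print(pair, ratios[pair])
```

Computing the ratios separately for coding and noncoding segments would allow the regional comparison the abstract reports.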

7.
Past analyses of the genome of the yeast Saccharomyces cerevisiae have revealed substantial regional variation in G+C content. Important questions remain, though, as to the origin, nature, significance, and generality of this variation. We conducted an extensive analysis of the yeast genome to try to answer these questions. Our results indicate that open reading frames (ORFs) with similar G+C contents at silent codon positions are significantly clustered on chromosomes. This clustering can be explained by very short range correlations of silent-site G+C contents at neighboring ORFs. ORFs of high silent-site G+C content are disproportionately concentrated on shorter chromosomes, which causes a negative relationship between chromosome length and G+C content. Contrary to previous reports, there is no correlation between gene density and silent-site G+C content in yeast. Chromosome III is atypical in many regards, and possible reasons for this are discussed.

8.
Heliothis virescens ascovirus 3a (HvAV-3a), a member of the family Ascoviridae, has the highest diversity among ascovirus species that have been reported in Australia, Indonesia, China, and the United States. To understand the diversity and origin of this important ascovirus, the complete genome of the HvAV Indonesia strain (HvAV-3g), isolated from Spodoptera exigua, was determined to be 199,721 bp, with a G+C content of 45.9%. Therefore, HvAV-3g has the largest genome among the reported ascovirus genomes to date. There are 194 predicted open reading frames (ORFs) encoding proteins of 50 or more amino acid residues. In comparison to HvAV-3e reported from Australia, HvAV-3g has all the ORFs in HvAV-3e with 6 additional ORFs unique to HvAV-3g, including 1 peptidase C26 gene with the highest identity to Drosophila spp. and 2 gas vesicle protein U (GvpU) genes with identities to Bacillus megaterium. The five unique homologous regions (hrs) and 25 baculovirus repeat ORFs (bro) of HvAV-3g are highly variable.

9.
N Ohta  N Sato    T Kuroiwa 《Nucleic acids research》1998,26(22):5190-5198
The complete nucleotide sequence of the mitochondrial genome of a very primitive unicellular red alga, Cyanidioschyzon merolae, has been determined. The mitochondrial genome of C. merolae contains 34 genes for proteins including unidentified open reading frames (ORFs) (three subunits of cytochrome c oxidase, apocytochrome b protein, three subunits of F1F0-ATPase, seven subunits of NADH ubiquinone oxidoreductase, three subunits of succinate dehydrogenase, four proteins implicated in c-type cytochrome biogenesis, 11 ribosomal subunits and two unidentified open reading frames), three genes for rRNAs and 25 genes for tRNAs. The G+C content of this mitochondrial genome is 27.2%. The genes are encoded on both strands. The genome size is comparatively small for a plant mitochondrial genome (32 211 bp). The mitochondrial genome resembles those of plants in its gene content because it contains several ribosomal protein genes and ORFs shared by other plant mitochondrial genomes. In contrast, it resembles those of animals in the genome organization, because it has very short intergenic regions and no introns. The gene set in this mitochondrial genome is a subset of that of Reclinomonas americana, an amoeboid protozoan. The results suggest that plant mitochondria originate from the same ancestor as other mitochondria and that most genes were lost from the mitochondrial genome at a fairly early stage of the evolution of the plants.

10.
Sequence organization of the mitochondrial genome of yeast--a review   Cited by: 3 (self-citations: 0; citations by others: 3)
M de Zamaroczy  G Bernardi 《Gene》1985,37(1-3):1-17
We have compiled the available primary structural data for the mitochondrial genome of Saccharomyces cerevisiae and have estimated the size of the remaining gaps, which represent 12-13% of the genome. The lengths of sequenced regions and of gaps lead to a new assessment of genome sizes; these range (in round figures) from 85 000 bp for the long genomes, to 78 000 bp for the short genomes, to 74 000 bp for the supershort genome of Saccharomyces carlsbergensis. These values are 8-11% higher than those previously estimated from restriction fragments. Interstrain differences concern not only facultative intervening sequences (introns) and mini-inserts, but also insertions/deletions in intergenic sequences. The primary structure appears to be extremely conserved in genes and ori sequences, and highly conserved in intergenic sequences. Since coding sequences represent at most 33-35% of the genome, at least two thirds of the genome is formed by noncoding and yet highly conserved sequences. The G + C level of genes or exons is 25%, and that of intronic open reading frames (ORFs) 22%; increasingly lower values are shown by intronic closed reading frames (CRFs), 20%, ori sequences, 19%, intergenic ORFs, 17.5% and intergenic sequences, 15%.

11.
A new method of isolating host-independent Bdellovibrio bacteriovorus has been developed. Filtered suspensions of host-dependent cells are dropped in small volumes onto 0.2 μm membranes laid on rich media agar. Significant growth is observed within 1–2 days; these cells were confirmed to be B. bacteriovorus using microscopic observations and PCR.

12.
Molar content of guanine plus cytosine (G + C) and optimal growth temperature (OGT) are main factors characterizing the frequency distribution of amino acids in prokaryotes. Previous work, using multivariate exploratory methods, has emphasized ascertainment of biological factors underlying variability between genomes, but the strength of each identified factor on amino acid content has not been quantified. We combine the flexibility of the phylogenetic mixed model (PMM) with the power of Bayesian inference via Markov Chain Monte Carlo (MCMC) methods, to obtain a novel evolutionary picture of amino acid usage in prokaryotic genomes. We implement a Bayesian PMM which incorporates the feature that evolutionary history makes observed data interdependent. As in previous studies with PMM, we present a variance partition; however, attention is also given to the posterior distribution of "systematic effects" that may shed light on the relative importance of and relationships between evolutionary forces acting at the genomic level. In particular, we analyzed influences of G + C, OGT, and respiratory metabolism. Estimates of G + C effects were significant for amino acids coded by G + C or molar content of adenine plus thymine (A + T) in first and second bases. OGT had an important effect on 12 amino acids, probably reflecting complex patterns of protein modifications, to cope with varying environments. The effect of respiratory metabolism was less clear, probably due to the already reported association of G + C with aerobic metabolism. A "heritability" parameter was always high and significant, reinforcing the importance of accommodating phylogenetic relationships in these analyses. "Heritable" component correlations displayed a pattern that tended to cluster "pure" G + C (A + T) in first and second codon positions, suggesting an inherited departure from linear regression on G + C.

13.
The complete genome sequences of Choristoneura occidentalis and C. rosaceana nucleopolyhedroviruses (ChocNPV and ChroNPV, respectively) (Baculoviridae: Alphabaculovirus) were determined and compared with each other and with those of other baculoviruses, including the genome of the closely related C. fumiferana NPV (CfMNPV). The ChocNPV genome was 128,446 bp in length (1147 bp smaller than that of CfMNPV), had a G+C content of 50.1%, and contained 148 open reading frames (ORFs). In comparison, the ChroNPV genome was 129,052 bp in length, had a G+C content of 48.6% and contained 149 ORFs. ChocNPV and ChroNPV shared 144 ORFs, and had 77% sequence identity with each other and 96.5% and 77.8% sequence identity, respectively, with CfMNPV. Five homologous regions (hrs), with sequence similarities to those of CfMNPV, were identified in ChocNPV, whereas the ChroNPV genome contained three hrs featuring up to 14 repeats. Both genomes encoded three inhibitors of apoptosis (IAP-1, IAP-2, and IAP-3), as reported for CfMNPV, and the ChocNPV IAP-3 gene represented the most divergent functional region of this genome relative to CfMNPV. Two ORFs were unique to ChocNPV, and four were unique to ChroNPV. ChroNPV ORF chronpv38 is a eukaryotic initiation factor 5 (eIF-5) homolog that has also been identified in the C. occidentalis granulovirus (ChocGV) and is believed to be the product of horizontal gene transfer from the host. Based on levels of sequence identity and phylogenetic analysis, both ChocNPV and ChroNPV fall within group I alphabaculoviruses, where ChocNPV appears to be more closely related to CfMNPV than does ChroNPV. Our analyses suggest that it may be appropriate to consider ChocNPV and CfMNPV as variants of the same virus species.

14.
15.
Comparison of heteroduplexes (HD) between DNAs of different transposable phages of Pseudomonas aeruginosa belonging to two previously described subgroups (D3112 and B3) revealed two types of structure (composition) of the bacteriophages, designated "type A" and "type B". The properties of genome structure of type A (phages of D3112 subgroup) are as follows: high level of conservation (up to 70% of genomes of different phages are represented as blocks of homologous DNA sequences); substitutions in genomes revealed as nonhomology regions in HD are, as a rule, small and located in certain sites; the distribution of the nonhomologous regions in HD of these phages is highly reproducible in independent experiments. Bacteriophages of subgroup B3 have genomes of type B: only a small part (approx. 30%) of genomes retain homology general for all of the phages; the nonhomologous regions are distributed in a large number of sites in HD; the sizes of nonhomologous regions are substantially larger than for the phages of subgroup D3112; distribution of the regions in HD is highly variable, which is characteristic of DNAs with partial homology. There is no difference between genomes of types A and B in G + C content (approx. 61-63%). Viable recombinants can be formed in crosses between phages of different genome types not only in regions with earlier revealed large DNA/DNA homology (right ends of genomes), but also in central portions of the genomes. Nevertheless, functional incompatibility of some regions of phage genomes of types A and B was demonstrated.

16.
Analysis of the mitochondrial DNA of a liverwort Marchantia polymorpha by electron microscopy and restriction endonuclease mapping indicated that the liverwort mitochondrial genome was a single circular molecule of about 184,400 base-pairs. We have determined the complete sequence of the liverwort mitochondrial DNA and detected 94 possible genes in the sequence of 186,608 base-pairs. These included genes for three species of ribosomal RNA, 29 genes for 27 species of transfer RNA and 30 open reading frames (ORFs) for functionally known proteins (16 ribosomal proteins, 3 subunits of H(+)-ATPase, 3 subunits of cytochrome c oxidase, apocytochrome b protein and 7 subunits of NADH ubiquinone oxidoreductase). Three ORFs showed similarity to ORFs of unknown function in the mitochondrial genomes of other organisms. Furthermore, 29 ORFs were predicted as possible genes by using the index of G + C content in first, second and third letters of codons (42.0 ± 10.9%, 37.0 ± 13.2% and 26.4 ± 9.4%, respectively) obtained from the codon usages of identified liverwort genes. To date, 32 introns belonging to either group I or group II introns have been found in the coding regions of 17 genes including ribosomal RNA genes (rrn18 and rrn26), a transfer RNA gene (trnS) and a pseudogene (psi nad7). RNA editing was apparently lacking in liverwort mitochondria since the nucleotide sequences of the liverwort mitochondrial DNA were well-conserved at the DNA level.

17.
18.
Li W  Zou H  Tao M 《Antonie van Leeuwenhoek》2007,92(4):417-427
The mechanism of translation initiation is responsible for shaping the mRNA sequences downstream of the start codon. However, this region has not been systematically analyzed in prokaryotes. We used sequence logos and statistical methods to analyze the patterns of overrepresented sequences in this region for 125 species of bacteria and 23 species of archaea. The specific positions are compared to the first 33 amino acids in the proteins. At the 2nd amino acid position, Lys, Ser or Thr is highly overrepresented for 68% to 84% of the genomes examined and Ala is highly overrepresented for 57% of the genomes. Overrepresentation of Lys2 is negatively correlated with the G + C content and overrepresentation of Ser2 or Thr2 is positively correlated with the G + C content of genomes. Ile at the 4th to the 8th positions was found to be overrepresented for 91% of the genomes analyzed and this seemed to be conserved for both bacteria and archaea. Organisms growing at high temperatures have a relatively low extent of nucleotide bias at the 5′ termini of open reading frames (ORFs). The extent of overrepresenting A and underrepresenting G at ORF 5′ termini is reduced in thermophiles and hyperthermophiles for both archaea and bacteria. Electronic supplementary material: The online version of this article (doi:) contains supplementary material, which is available to authorized users.

19.
Ren Zhang M.D. 《Amino acids》1997,12(2):167-177
Summary: Based on the genetic code and a simple theorem for the geometrical property of the regular tetrahedron, each amino acid is mapped onto a unique point in a 3-dimensional tetrahedral space. The distribution of the 20 mapping points for 20 amino acids is studied in detail. It is found that the mapping points for the hydrophobic and hydrophilic amino acids are distributed in distinct regions of the 3-dimensional space. A plane that satisfactorily separates the two kinds of points has been calculated using Fisher's algorithm. It is shown that the codons coding for the hydrophobic amino acids are constituted dominantly by the bases of the keto group, i.e., G and T, while the codons coding for the hydrophilic amino acids are constituted dominantly by the bases of the amino group, i.e., A and C. The biological implications of the mapping points and the separating plane are discussed in some detail.

20.
Bacteriophage B3 is a transposable phage of Pseudomonas aeruginosa. In this report, we present the complete DNA sequence and annotation of the B3 genome. DNA sequence analysis revealed that the B3 genome is 38,439 bp long with a G+C content of 63.3%. The genome contains 59 proposed open reading frames (ORFs) organized into at least three operons. Of these ORFs, the predicted proteins from 41 ORFs (68%) display significant similarity to other phage or bacterial proteins. Many of the predicted B3 proteins are homologous to those encoded by the early genes and head genes of Mu and Mu-like prophages found in sequenced bacterial genomes. Only two of the predicted B3 tail proteins are homologous to other well-characterized phage tail proteins; however, several Mu-like prophages and transposable phage D3112 encode approximately 10 highly similar proteins in their predicted tail gene regions. Comparison of the B3 genomic organization with that of Mu revealed evidence of multiple genetic rearrangements, the most notable being the inversion of the proposed B3 immunity/early gene region, the loss of Mu-like tail genes, and an extreme leftward shift of the B3 DNA modification gene cluster. These differences illustrate and support the widely held view that tailed phages are genetic mosaics arising by the exchange of functional modules within a diverse genetic pool.
