Similar articles
 20 similar articles found
1.
The correlation between genomic G+C content and optimal growth temperature in prokaryotes has gained renewed interest after Musto et al. [H. Musto, H. Naya, A. Zavala, H. Romero, F. Alvarez-Valin, G. Bernardi, Correlations between genomic GC levels and optimal growth temperatures in prokaryotes, FEBS Lett. 573 (2004) 73-77] reported that positive correlations exist in 15 families studied. We have reanalyzed their data and found that, when genome size and data quality were adjusted for, there was no significant evidence of a relationship between optimal temperature and GC content for two of the families that had previously shown strongly significant correlations. Using updated temperature optima for Halobacteriaceae species, we found that the correlation is not significant in this family. For the family Enterobacteriaceae, when genome size and optimal temperature are included in a multiple linear regression, only genome size is significant as a predictor of GC content. We showed that more rigorous statistical methods than simple two-factor correlation analysis should be used for analyzing the complex intrinsic and extrinsic factors that affect genomic GC content. We further found that a positive correlation between temperature and genomic GC is only evident in free-living species with low optimal growth temperatures.

2.

Background

GC content varies greatly between different genomic regions in many eukaryotes. To determine whether this organization, known as isochore organization, influences gene expression patterns, the relationship between GC content and gene expression has been investigated in human and mouse. However, this question is still a matter of debate. Among avian species, chicken (Gallus gallus) is the best-studied representative with a complete genome sequence. The distinctive features and organization of its sequence make it a good model for exploring important issues in genome structure and evolution.

Methods

Only nuclear genes with complete information on protein-coding sequence and with no evidence of multiple splicing forms were included in this study. Chicken protein-coding sequences, complete mRNA sequences (or full-length cDNA sequences), and 5′ untranslated region (5′ UTR) sequences were downloaded from Ensembl, and chicken expression data were taken from a previous study. Three indices, i.e. expression level, expression breadth and maximum expression level, were used to measure the expression pattern of a given gene. CpG islands were identified using hgTables of the UCSC Genome Browser. Correlation analysis between variables was performed with SAS Proprietary Software Release 8.1.

Results

In chicken, the GC content of the 5′ UTR is significantly and positively correlated with expression level, expression breadth, and maximum expression level, whereas the GC content of coding sequences, of introns, and at the third codon position is negatively correlated with expression level and expression breadth and is not correlated with maximum expression level. These significant trends are independent of recombination rate, chromosome size and gene density. Furthermore, multiple linear regression analysis indicated that GC content in genes could explain approximately 10% of the variation in gene expression.

Conclusions

GC content is significantly associated with gene expression pattern and could be one of the important regulation factors in the chicken genome.

3.

Background

Introns comprise a large fraction of eukaryotic genomes, yet little is known about their functional significance. Regulatory elements have been mapped to some introns, though these are believed to account for only a small fraction of genome wide intronic DNA. No consistent patterns have emerged from studies that have investigated general levels of evolutionary constraint in introns.

Results

We examine the relationship between intron length and levels of evolutionary constraint by analyzing inter-specific divergence at 225 intron fragments in Drosophila melanogaster and Drosophila simulans, sampled from a broad distribution of intron lengths. We document a strongly negative correlation between intron length and divergence. Interestingly, we also find that divergence in introns is negatively correlated with GC content. This relationship does not account for the correlation between intron length and divergence, however, and may simply reflect local variation in mutational rates or biases.

Conclusion

Short introns make up only a small fraction of total intronic DNA in the genome. Our finding that long introns evolve more slowly than average implies that, while the majority of introns in the Drosophila genome may experience little or no selective constraint, most intronic DNA in the genome is likely to be evolving under considerable constraint. Our results suggest that functional elements may be ubiquitous within longer introns and that these introns may have a more general role in regulating gene expression than previously appreciated. Our finding that GC content and divergence are negatively correlated in introns has important implications for the interpretation of the correlation between divergence and levels of codon bias observed in Drosophila.

4.
We report the draft genome of the strain Lactobacillus gigeriorum CRBIP 24.85T, isolated from a chicken crop. The total length of the 60 scaffolds is about 1.9 Mb, with a GC content of 38% and 2,062 protein-coding sequences (CDS).
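
A GC content figure such as the 38% quoted above is the share of G and C among the called bases of an assembly. The following is a minimal sketch of that calculation, assuming the sequence is available as a plain string; the function name `gc_content` is hypothetical, and excluding ambiguous symbols such as N from the denominator is an assumption rather than something the abstract specifies.

```python
def gc_content(seq: str) -> float:
    """Return the G+C fraction of a DNA sequence.

    Assumption: symbols other than A, C, G and T (for example N)
    are left out of both numerator and denominator.
    """
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at
    if total == 0:
        raise ValueError("sequence contains no A, C, G or T")
    return gc / total


if __name__ == "__main__":
    # Toy sequence, not real genome data.
    print(f"{gc_content('ATGCGCNNAT'):.2f}")
```

For a multi-scaffold draft assembly, the same function could be applied to the concatenation of all scaffolds so that longer scaffolds carry proportionally more weight.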

5.

Background

DNA word frequencies, normalized for genomic AT content, are remarkably stable within prokaryotic genomes and are therefore said to reflect a “genomic signature.” The genomic signatures can be used to phylogenetically classify organisms from arbitrary sampled DNA. Genomic signatures can also be used to search for horizontally transferred DNA or DNA regions subjected to special selection forces. Thus, the stability of the genomic signature can be used as a measure of genomic homogeneity. The factors associated with the stability of the genomic signatures are not known, and this motivated us to investigate further. We analyzed the intra-genomic variance of genomic signatures based on AT content normalization (0th order Markov model) as well as genomic signatures normalized by smaller DNA words (1st and 2nd order Markov models) for 636 sequenced prokaryotic genomes. Regression models were fitted, with intra-genomic signature variance as the response variable, to a set of factors representing genomic properties such as genomic AT content, genome size, habitat, phylum, oxygen requirement, optimal growth temperature and oligonucleotide usage variance (OUV, a measure of oligonucleotide usage bias), measured as the variance between genomic tetranucleotide frequencies and Markov chain approximated tetranucleotide frequencies, as predictors.

Principal Findings

Regression analysis revealed that OUV was the most important factor (p<0.001) determining intra-genomic homogeneity as measured using genomic signatures. This means that the less random the oligonucleotide usage is in the sense of higher OUV, the more homogeneous the genome is in terms of the genomic signature. The other factors influencing variance in the genomic signature (p<0.001) were genomic AT content, phylum and oxygen requirement.

Conclusions

Genomic homogeneity in prokaryotes is intimately linked to genomic GC content, oligonucleotide usage bias (OUV) and aerobiosis, while oligonucleotide usage bias (OUV) is associated with genomic GC content, aerobiosis and habitat.
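
The zero-order signature described in this abstract can be illustrated as the ratio of observed tetranucleotide frequencies to the frequencies expected from AT content alone. The sketch below is one plausible reading, not the authors' implementation: the function name `zero_order_signature` is hypothetical, and it assumes that A and T each occur with probability AT/2 and that G and C each occur with probability GC/2.

```python
from collections import Counter
from itertools import product


def zero_order_signature(seq: str, k: int = 4) -> dict[str, float]:
    """Observed/expected ratio for every DNA word of length k.

    Expected frequencies come from a zero-order model driven only by
    AT content (assumption: p(A) = p(T) = AT/2, p(G) = p(C) = GC/2).
    """
    seq = seq.upper()
    called = [b for b in seq if b in "ACGT"]
    if not called:
        raise ValueError("sequence contains no A, C, G or T")
    at = sum(b in "AT" for b in called) / len(called)
    gc = 1.0 - at

    # Count overlapping words, skipping windows with ambiguous symbols.
    words = Counter(
        seq[i:i + k]
        for i in range(len(seq) - k + 1)
        if set(seq[i:i + k]) <= set("ACGT")
    )
    total = sum(words.values())
    if total == 0:
        raise ValueError("sequence is shorter than the word length")

    signature = {}
    for word in map("".join, product("ACGT", repeat=k)):
        n_at = sum(b in "AT" for b in word)
        expected = (at / 2) ** n_at * (gc / 2) ** (k - n_at)
        observed = words[word] / total
        signature[word] = observed / expected if expected > 0 else 0.0
    return signature
```

Intra-genomic variance of the signature could then be estimated by computing this dictionary in sliding windows along a genome and taking the variance of each word's ratio across windows; window size and step are choices the abstract does not specify.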

6.
We show the negative link between genome size and metabolic intensity in tetrapods, using the heart index (relative heart mass) as a unified indicator of metabolic intensity in poikilothermal and homeothermal animals. We found two separate regression lines of heart index on genome size for reptiles-birds and amphibians-mammals (the slope of regression is steeper in reptiles-birds). We also show a negative correlation between GC content and nucleosome formation potential in vertebrate DNA, and, consistent with this relationship, a positive correlation between genome GC content and nuclear size (independent of genome size). It is known that there are two separate regression lines of genome GC content on genome size for reptiles-birds and amphibians-mammals: reptiles-birds have relatively higher GC content (for their genome sizes) compared to amphibians-mammals. Our results suggest uniting all these data into one concept. The slope of negative regression between GC content and nucleosome formation potential is steeper in exons than in non-coding DNA (where nucleosome formation potential is generally higher), which indicates a special role of non-coding DNA for orderly chromatin organization. Chromatin condensation and nuclear size are supposed to be key parameters that accommodate the effects of both genome size and GC content and connect them with metabolic intensity. Our data suggest that the reptile-bird clade evolved special relationships among these parameters, whereas mammals preserved the amphibian-like relationships. Surprisingly, mammals, although acquiring a more complex general organization, seem to retain certain genome-related properties that are similar to amphibians. At the same time, the slope of regression between nucleosome formation potential and GC content is steeper in poikilothermal than in homeothermal genomes, which suggests that mammals and birds acquired certain common features of genomic organization.

7.
The origin of the Hordelymus genome has been debated for years, and no consensus has been reached. In this study, we sequenced and analyzed the RPB2 (RNA polymerase subunit II) gene from Hordelymus europaeus (L.) Harz and from the potential diploid ancestor species suggested in previous studies. The focus of this study was to examine the phylogenetic relationship of the Hordelymus genomes with those of its potential donors, Hordeum, Psathyrostachys, and Taeniatherum species. Two distinguishable copies of the sequence were obtained from H. europaeus. The obvious difference between the two copies is a 24 bp indel (insertion/deletion). Phylogenetic analysis showed a strong affinity between the Hordeum genome and Hordelymus, with 85% bootstrap support. These results suggest that one genome in tetraploid H. europaeus is closely related to the genome in Hordeum species. The other genome in H. europaeus is sister to the genomes of the Triticeae species examined here, which corresponds well with the recently published EF-G data. No obvious relationship was found between Hordelymus and either the Ta genome donor, Taeniatherum caput-medusae, or the Ns genome donor, Psathyrostachys juncea. Our data do not support the presence of Ta and Ns genomes in H. europaeus, and further confirm that H. europaeus is an allopolyploid.

8.
9.
《Experimental mycology》1992,16(4):302-307
The base composition and complexity of genomic DNA from Puccinia sorghi have been estimated by thermal denaturation, analytical ultracentrifugation, and reassociation kinetics. The buoyant density of genomic DNA in CsCl was found to be 1.7021 g/ml, which corresponds to a GC content of 43%. From thermal denaturation curves the GC content was estimated to be 41%. The haploid genome size of P. sorghi was estimated to be 4.7 × 10^7 bp, half of which represented a moderately repetitive fraction. The size of the P. sorghi genome is similar to that of other basidiomycete fungi; however, the amount of repetitive DNA is greater than that reported for most other fungi.
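
The conversion from buoyant density to GC content can be reproduced with the commonly cited linear CsCl relation ρ = 1.660 + 0.098·GC (GC as a mole fraction), usually attributed to Schildkraut, Marmur and Doty. The abstract does not say which formula was used, so applying this relation here is an assumption; the function name `gc_from_buoyant_density` is hypothetical.

```python
def gc_from_buoyant_density(rho: float) -> float:
    """Estimate the GC mole fraction from CsCl buoyant density (g/ml).

    Assumes the linear relation rho = 1.660 + 0.098 * GC.
    """
    return (rho - 1.660) / 0.098


if __name__ == "__main__":
    gc = gc_from_buoyant_density(1.7021)
    print(f"estimated GC content: {gc:.1%}")  # about 43%
```

Under this assumption, the reported density of 1.7021 g/ml gives a value that rounds to the 43% stated in the abstract.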

10.
Helicobacter pylori infection is strongly associated with gastric cancer. In the present study, the relationship between interleukin-1B (IL-1B) polymorphism, H. pylori infection, and prevalence of gastric cancer (GC) in patients from North India was evaluated using genomic DNA directly extracted from biopsy tissues for PCR-RFLP. A total of 136 GC cases and 110 healthy controls were included for studying polymorphisms in the genotypes of IL-1B−511, −31, +3954 and IL-1RN, both in the presence and absence of active H. pylori infection. Results showed that the frequency of IL-1RN 2/2 was significantly higher in GC cases (21.32%) than in controls (9.09%), with an odds ratio (OR) of 4.391 (95% CI 1.093-10.131). The risk of GC was also found to be higher for other genotypes, namely IL-1B −511 TT (χ² = 18.975, p < 0.001), −31 CC (χ² = 21.219, p < 0.001), +3954 CT (χ² = 21.082, p < 0.001) and IL-1RN 1/2 (χ² = 30.543, p < 0.001), with active H. pylori infection. Our findings indicate that the IL-1B and IL-1RN polymorphisms are associated with the development of GC and that H. pylori infection markedly increases the risk of GC in the North Indian population. Additionally, IL-1B−511 C/C and IL-1RN 2/2 polymorphisms seem to be involved in the development of GC in H. pylori uninfected patients.

11.
Coding sequence (CDS) architecture affects gene expression levels in organisms. Codon optimization can increase the gene expression level. Therefore, understanding codon usage patterns has important implications for research on genetic engineering and exogenous gene expression. To date, the codon usage patterns of many model plants have been analyzed. However, the relationship between CDS architecture and gene expression in Arachis duranensis remains poorly understood. According to the results of genome sequencing, A. duranensis has many resistance genes that can be used to improve the cultivated peanut. In this study, bioinformatic approaches were used to estimate A. duranensis CDS architectures, including frequency of the optimal codon (Fop), polypeptide length and GC contents at the first (GC1), second (GC2) and third (GC3) codon positions. In addition, Arachis RNA-seq datasets were downloaded from PeanutBase. The relationships between gene expression and CDS architecture were assessed under normal growth as well as under nematode and drought stress conditions. A total of 26 codons with high frequency were identified, which preferentially ended with A or T in A. duranensis CDSs under the above-mentioned three conditions. A similar CDS architecture was found in differentially expressed genes (DEGs) under nematode and drought stresses. The GC1 content differed between DEGs and non-differentially expressed genes (NDEGs) under both drought and nematode stresses. The expression levels of DEGs were affected by different CDS architectures compared with NDEGs under drought stress. In addition, no correlation was found between differential gene expression and CDS architecture under either nematode or drought stress. These results aid the understanding of gene expression in A. duranensis.
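
The position-specific GC measures named in this abstract (GC1, GC2, GC3) can be sketched as follows. This is an illustration under stated assumptions, not the study's pipeline: the function name `gc_by_codon_position` is hypothetical, the CDS is assumed to be in frame with a length divisible by three, and ambiguous bases are skipped.

```python
def gc_by_codon_position(cds: str) -> tuple[float, float, float]:
    """Return (GC1, GC2, GC3) for an in-frame coding sequence.

    Assumptions: the sequence starts at codon position 1, its length
    is a multiple of 3, and non-ACGT symbols are ignored.
    """
    cds = cds.upper()
    if len(cds) == 0 or len(cds) % 3 != 0:
        raise ValueError("CDS length must be a non-zero multiple of 3")
    fractions = []
    for pos in range(3):
        # Every third base, starting at this codon position.
        called = [b for b in cds[pos::3] if b in "ACGT"]
        if not called:
            raise ValueError(f"no called bases at codon position {pos + 1}")
        fractions.append(sum(b in "GC" for b in called) / len(called))
    return tuple(fractions)


if __name__ == "__main__":
    # Toy CDS (ATG GCC TAA), not taken from A. duranensis.
    gc1, gc2, gc3 = gc_by_codon_position("ATGGCCTAA")
    print(f"GC1={gc1:.2f} GC2={gc2:.2f} GC3={gc3:.2f}")
```

Whether the stop codon is included in such counts varies between studies; the sketch includes it, which is a simplification.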

12.
13.
Polymerase chain reaction (PCR) based on single primers of arbitrary nucleotide sequence provides a powerful marker system for genome analysis because each primer amplifies multiple products, and cloning, sequencing, and hybridization are not required. We have evaluated this typing system for the mouse by identifying optimal PCR conditions; characterizing effects of GC content, primer length, and multiplexed primers; demonstrating considerable variation among a panel of inbred strains; and establishing linkage for several products. Mg²⁺, primer, template, and annealing conditions were identified that optimized the number and resolution of amplified products. Primers with 40% GC content failed to amplify products readily, primers with 50% GC content resulted in reasonable amplification, and primers with 60% GC content gave the largest number of well-resolved products. Longer primers did not necessarily amplify more products than shorter primers of the same proportional GC content. Multiplexed primers yielded more products than either primer alone and usually revealed novel variants. A strain survey showed that most strains could be readily distinguished with a modest number of primers. Finally, linkage for seven products was established on five chromosomes. These characteristics establish single primer PCR as a powerful method for mouse genome analysis.

14.
In order to explore the relationship between unacetylated arginine-rich histones and condensed chromatin structure, the extent of histone acetylation was examined in cultured cell lines derived from three species of deer mice. These species differ considerably in their genomic content of heterochromatin but contain essentially the same euchromatin content. Cells of Peromyscus eremicus, containing 34–36% more constitutive heterochromatin than Peromyscus boylii or Peromyscus crinitus cells were found to contain 28–35% more unacetylated histone H4, 22–29% more unacetylated histone H3, and 18–22% more unacetylated histone H2B. This relationship between unacetylated histones and heterochromatin content was further explored by inducing hyperacetylation of P. eremicus and P. boylii histones through treatment of cells with 15 mM sodium butyrate for 24 h. It was found that the percentages of unacetylated histones H3 and H4 remaining after butyrate treatment were proportional to the amount of constitutive heterochromatin in the genome. These data support the concept that a small core of histones in constitutive heterochromatin is inaccessible to acetylation. It was also found that the acetylated state of isolated histones was sensitive to the method of histone extraction. Thus concern must be given to preparative procedures when studying histone acetylation in order to minimize these acetate losses.

15.
This study presents, for the first time, the complete chloroplast (cp) genomes of two Cleomaceae species, Dipterygium glaucum and Cleome chrysantha, in order to evaluate their evolutionary relationship. The cp genome is 158,576 bp in length with 35.74% GC content in D. glaucum and 158,111 bp with 35.96% GC content in C. chrysantha. The inverted repeats (IRs) are 26,209 bp and 26,251 bp each, the LSC regions 87,738 bp and 87,184 bp, and the SSC regions 18,420 bp and 18,425 bp, respectively. Both chloroplast genomes contain 136 genes, including 80 protein-coding genes, 31 tRNA genes and four rRNA genes; 117 genes are unique, while the remaining 19 are duplicated in the IR regions. The analysis of repeats shows that the cp genomes include all types of repeats, with palindromic repeats occurring most frequently. The analysis also indicates that the total number of simple sequence repeats (SSRs) was 323 in D. glaucum and 313 in C. chrysantha; the majority of the SSRs in these plastid genomes were mononucleotide A/T repeats located in the intergenic spacers. Moreover, comparative analysis of the four cp sequences revealed four hotspot genes (atpF, rpoC2, rps19, and ycf1); these variable regions could be used as molecular markers for species authentication as well as resources for inferring phylogenetic relationships of the species. All relationships in the phylogenetic tree are highly supported, which indicates that the complete chloroplast genome is a useful source of data for inferring phylogenetic relationships within the Cleomaceae and other families. The simple sequence repeats identified will be useful for identification, genetic diversity, and other evolutionary studies of the species. This study reports the first cp genomes of the genera Dipterygium and Cleome. The findings of this study will be beneficial for biological disciplines such as evolutionary and genetic diversity studies of the species within the core Cleomaceae.
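
The region lengths quoted in this abstract can be checked against the reported genome sizes, since a quadripartite chloroplast genome consists of one LSC, one SSC and two IR copies. The sketch below does that arithmetic with the figures from the abstract; the function name `plastome_length` and the dictionary layout are illustrative choices.

```python
def plastome_length(lsc: int, ssc: int, ir: int) -> int:
    """Total length of a quadripartite plastome: LSC + SSC + 2 x IR."""
    return lsc + ssc + 2 * ir


# Region lengths (bp) and reported totals as given in the abstract.
genomes = {
    "Dipterygium glaucum": {"lsc": 87_738, "ssc": 18_420, "ir": 26_209, "reported": 158_576},
    "Cleome chrysantha": {"lsc": 87_184, "ssc": 18_425, "ir": 26_251, "reported": 158_111},
}

if __name__ == "__main__":
    for species, g in genomes.items():
        total = plastome_length(g["lsc"], g["ssc"], g["ir"])
        status = "matches" if total == g["reported"] else "differs from"
        print(f"{species}: {total} bp ({status} reported {g['reported']} bp)")
```

For both species the sum of the regions equals the reported genome length.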

16.
Understanding species evolution and improvement requires information on their genome origin and differentiation. Among the species in the family Gramineae, the genome identities of the Agropyron-Elytrigia-Leymus group are still ambiguous. In order to delineate the genome relationships, nucleotide sequence analysis of the rDNA ITS regions was carried out among species in the genera Elytrigia, Agropyron, Psathyrostachys, Leymus, and Pascopyrum containing E, St, P, Ns, and Xm genomes. The ITS-1 and ITS-2 showed a narrow range of variation in length except for the presence of a pentanucleotide (TGGGG) indel in some haplotypes, whereas higher numbers of nucleotide substitutions were observed in most genera. There were 187 variable sites in the ITS-1, 5.8S, and ITS-2 regions, in which a few genome-specific mutations were observed. While the level of variation was similar between ITS-1 and ITS-2, the ratio of transition to transversion mutations differed among the ITS-1, 5.8S, and ITS-2 segments. GC contents of the ITS regions ranged from 55% to 65% among genomes, and those of the haplotypes of the P and H genomes were slightly higher than the others. In phylogenetic analysis, the ITS haplotypes were classified into two groups, one containing H, Ns, and NsXm genomes and another containing P, St, and E genomes, which is congruent with the genome affinities from other studies. Among the four genomes in Pascopyrum smithii (2n=8x=56, StStNsNsHHXmXm), the haplotypes of the H and St genomes were identified with the reference diploid species, but haplotypes of the Ns and Xm genomes were not found in the present analysis.

17.
Assessment of the environmental factors that control species richness (S) is a central issue in ecology. In this study, aquatic macrophyte S was estimated at 235 sampling sites distributed in 8 arms of a large (1350 km²) subtropical reservoir (Itaipu Reservoir, Brazil). Morphometric variables (area, shoreline development and length of shoreline, all measured for each arm; n = 8) and environmental variables measured at each sampling site (extinction coefficient of light (k), electrical conductivity, fetch, distance from the main reservoir body; n = 235) were used to predict aquatic macrophyte S at two spatial scales. At the arm scale, linear regression analysis indicated that length of shoreline was a better predictor of S than area. At the sampling-site scale, multiple regression analysis indicated that S was significantly predicted by electrical conductivity, fetch and distance from the main body. However, other relationships of predictive interest were demonstrated by using non-traditional regression approaches. This analysis started with the visual inspection of scatter plots. The bivariate relationship between S and fetch, for example, showed an envelope or 'left triangle' pattern. The relationship between the number of submerged species and k showed an asymmetrical left triangle pattern. Using randomization procedures, it was demonstrated that these patterns were not generated by chance alone. Beta diversity (estimated within the arms) was significantly and positively correlated with spatial environmental variability. Overall, these results indicate that the prediction of aquatic macrophyte assemblage variables in large waterbodies, especially S, is more complex than previous studies have suggested.

18.
Salvia miltiorrhiza is an important medicinal plant with great economic and medicinal value. The complete chloroplast (cp) genome sequence of Salvia miltiorrhiza, the first sequenced member of the Lamiaceae family, is reported here. The genome is 151,328 bp in length and exhibits a typical quadripartite structure of the large (LSC, 82,695 bp) and small (SSC, 17,555 bp) single-copy regions, separated by a pair of inverted repeats (IRs, 25,539 bp). It contains 114 unique genes, including 80 protein-coding genes, 30 tRNAs and four rRNAs. The genome structure, gene order, GC content and codon usage are similar to the typical angiosperm cp genomes. Four forward, three inverted and seven tandem repeats were detected in the Salvia miltiorrhiza cp genome. Simple sequence repeat (SSR) analysis among the 30 asterid cp genomes revealed that most SSRs are AT-rich, which contribute to the overall AT richness of these cp genomes. Additionally, fewer SSRs are distributed in the protein-coding sequences compared to the non-coding regions, indicating an uneven distribution of SSRs within the cp genomes. Entire cp genome comparison of Salvia miltiorrhiza and three other Lamiales cp genomes showed a high degree of sequence similarity and a relatively high divergence of intergenic spacers. Sequence divergence analysis discovered the ten most divergent and ten most conserved genes as well as their length variation, which will be helpful for phylogenetic studies in asterids. Our analysis also supports that both regional and functional constraints affect gene sequence evolution. Further, phylogenetic analysis demonstrated a sister relationship between Salvia miltiorrhiza and Sesamum indicum. The complete cp genome sequence of Salvia miltiorrhiza reported in this paper will facilitate population, phylogenetic and cp genetic engineering studies of this medicinal plant.

19.
Underivatized estrone (ES), equilin (EQ), equilenin (EQN) and their corresponding 17α-diols 17α-estradiol (ESD), 17α-dihydroequilin (DHEQ) and 17α-dihydroequilenin (DHEQN) were separated by TLC, RP-HPLC and capillary GC. Their dipole moments (μ) and Randić's connectivity indices (¹χ) were determined as parameters of importance for the separation. The number of H atoms was taken as an additive structural parameter of importance for the quantitative structure-chromatographic retention relationship study (QSRR). Principal component analysis (PCA) was applied in order to find similarities and dissimilarities between 9 TLC and 10 RP-HPLC systems. PCA indicated that proton donor-proton acceptor interactions play the most important role for the TLC and RP-HPLC separation. The two-dimensional non-linear map of PC variables showed that the keto-estrogens (ES, EQ and EQN) and the corresponding diols (ESD, DHEQ and DHEQN) form two separate clusters. The relationship between GC retention of equine estrogens characterized by Kováts indices (KI), their ¹χ and μ was expressed by the equation KI/100 = a/¹χ + b/μ² + c. The biological activity of the estrogens was related to log 1/μ².
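
The first-order Randić connectivity index used in this abstract is conventionally defined on the hydrogen-suppressed molecular graph as the sum, over all bonds, of 1/√(dᵢ·dⱼ), where dᵢ and dⱼ are the numbers of heavy-atom neighbours of the two bonded atoms. A minimal sketch of that definition follows; the function name `randic_index` and the edge-list representation are assumptions for illustration, and the example molecules are simple alkanes rather than the estrogens studied.

```python
from math import sqrt


def randic_index(bonds: list[tuple[int, int]]) -> float:
    """First-order Randic connectivity index of a hydrogen-suppressed graph.

    `bonds` lists each bond once as a pair of atom labels.
    """
    degree: dict[int, int] = {}
    for a, b in bonds:
        degree[a] = degree.get(a, 0) + 1
        degree[b] = degree.get(b, 0) + 1
    return sum(1.0 / sqrt(degree[a] * degree[b]) for a, b in bonds)


# Carbon skeletons only (hydrogens suppressed).
n_butane = [(0, 1), (1, 2), (2, 3)]    # linear chain C-C-C-C
isobutane = [(0, 1), (0, 2), (0, 3)]   # central carbon with three methyls

if __name__ == "__main__":
    print(f"n-butane:  {randic_index(n_butane):.4f}")
    print(f"isobutane: {randic_index(isobutane):.4f}")
```

The branched isomer gets the lower index, which is the kind of structural sensitivity that makes the descriptor useful in retention models such as the KI equation quoted above; the coefficients a, b and c of that equation are not given in the abstract.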

20.