Similar articles
 Found 20 similar articles (search took 15 ms)
1.
The global, rather than local, variation in G+C content along the nuclear DNA sequences of various organisms was studied using GenBank sequence data. When long DNA sequences of the genomes of Escherichia coli and Saccharomyces cerevisiae were examined, the levels of their G+C content (G+C%) were found to be within a narrow range around that of the whole genome. The G+C% levels for sequences of vertebrate genomes, however, were found to cover a wide range, showing that each of these genomes is a mosaic of sequences with different G+C% levels, in each of which the sequence is fairly homogeneous in its G+C% over a very long distance. Through surveying a human genetic map and GenBank DNA sequences, the global variations in G+C% along the human genome DNA were found to be correlated with chromosome band structures.

2.
A positive correlation was found between the G + C content at the third codon position in genes of vertebrates and the G + C content of the genome portion surrounding each gene. Exons of genes with a high G + C% at the third codon position are surrounded by G + C-rich introns and G + C-rich flanking sequences, and those with a low G + C% at that position are surrounded by A + T-rich introns and flanking sequences. Analysis of G + C content distribution along DNA sequences using a DNA Sequence Data Bank supported the view that the vertebrate genome is a mosaic of regions with clear differences in their G + C content. The biological significance of the variation in G + C content throughout the vertebrate genome is discussed in connection with chromosomal banding.

3.
Shewanella putrefaciens has been considered the main spoilage bacterium of low-temperature stored marine seafood. However, psychrotrophic Shewanella have been reclassified during recent years, and the purpose of the present study was to determine whether any of the new Shewanella species are important in fish spoilage. More than 500 H2S-producing strains were isolated from iced stored marine fish (cod, plaice, and flounder) caught in the Baltic Sea during winter or summer time. All strains were identified as Shewanella species by phenotypic tests. Different Shewanella species were present on newly caught fish. During the warm summer months the mesophilic human pathogenic S. algae dominated the H2S-producing bacterial population. After iced storage, a shift in the Shewanella species was found, and most of the H2S-producing strains were identified as S. baltica. The 16S rRNA gene sequence analysis confirmed the identification of these two major groups. Several isolates could only be identified to the genus Shewanella level and were separated into two subgroups with low (44%) and high (47%) G+C mol%. The low G+C% group was isolated during winter months, whereas the high G+C% group was isolated on fish caught during summer and only during the first few days of iced storage. Phenotypically, these strains were different from the type strains of S. putrefaciens, S. oneidensis, S. colwelliana, and S. affinis, but the high G+C% group clustered close to S. colwelliana by 16S rRNA gene sequence comparison. The low G+C% group may constitute a new species. S. baltica and the low G+C% group of Shewanella spp. strains grew well in cod juice at 0 °C, but three high G+C Shewanella spp. were unable to grow at 0 °C. In conclusion, the spoilage reactions of iced Danish marine fish remain unchanged (i.e., trimethylamine-N-oxide reduction and H2S production); however, the main H2S-producing organism was identified as S. baltica.

4.
5.
Pseudomonas sp. 14-3 is an Antarctic bacterium that shows high stress resistance in association with high polyhydroxybutyrate (PHB) production. In this paper genes involved in PHB biosynthesis (phaRBAC) were found within a genomic island named pha-GI. Numerous mobile elements or proteins associated with them, such as an integrase, insertion sequences, a bacterial group II intron, a complete Type I protein secretion system and IncP plasmid-related proteins were detected among the 28 ORFs identified in this large genetic element (32.3 kb). The G+C distribution was not homogeneous, likely reflecting a mosaic structure that contains regions from diverse origins. pha-GI has strong similarities with genomic islands found in diverse Proteobacteria, including Burkholderiales species and Azotobacter vinelandii. The G+C content, phylogeny inference and codon usage analysis showed that the phaBAC cluster itself has a complex mosaic structure and indicated that the phaB and phaC genes were acquired by horizontal transfer, probably derived from Burkholderiales. These results describe for the first time a pha cluster located within a genomic island, and suggest that horizontal transfer of pha genes is a mechanism of adaptability to stress conditions such as those found in the extreme Antarctic environment.

6.
The sequences of the human genome compiled in DNA databases are now about 10 megabase pairs (Mb), and thus the size of the sequences is several times the average size of chromosome bands at high resolution. By surveying this large quantity of data, it may be possible to clarify the global characteristics of the human genome, that is, correlation of gene sequence data (kb-level) to cytogenetic data (Mb-level). By extensively searching the GenBank database, we calculated codon usages in about 2000 human sequences. The highest G + C percentage at the third codon position was 97%, and that of about 250 sequences was 80% or more. The lowest G + C% was 27%, and that in about 150 sequences was 40% or less. A major portion of the GC-rich genes was found to be on special subsets of R-bands (T-bands and/or terminal R-bands). AT-rich genes, however, were mainly on G-bands or non-T-type internal R-bands. Average G + C% at the third position differed among chromosomes and was related to T-band density, quinacrine dullness, and mitotic chiasmata density in the respective chromosomes.
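The third-codon-position G + C percentage quoted in the abstract above can be illustrated with a minimal sketch. It assumes an in-frame coding sequence given as a plain string; the function name `gc3_percent` is hypothetical and is not taken from the cited study.

```python
def gc3_percent(cds: str) -> float:
    """Return the G+C percentage at third codon positions.

    Assumes `cds` is an in-frame coding sequence; trailing bases that
    do not complete a codon are ignored.
    """
    seq = cds.upper()
    usable = len(seq) - len(seq) % 3      # drop an incomplete final codon
    third = seq[2:usable:3]               # every third base, starting at index 2
    if not third:
        raise ValueError("sequence is shorter than one codon")
    gc = sum(base in "GC" for base in third)
    return 100.0 * gc / len(third)


# Codons ATG GCC AAA TTC have third bases G, C, A, C.
print(gc3_percent("ATGGCCAAATTC"))
```

Real annotation records would also need handling for ambiguous bases and for locating the reading frame, which this sketch leaves out.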

7.
An isochore map of the human genome based on the Z curve method   Cited by: 4 (self-citations: 0; citations by others: 4)
Zhang CT, Zhang R. Gene, 2003, 317(1-2): 127-135
The distribution of the G+C content in the human genome has been studied by using a windowless technique derived from the Z curve method. The most important findings presented in this paper are twofold. First, abrupt variations of the G+C content along human chromosome sequences are the main variation patterns of G+C content. It is found that at some sites, the G+C content changes abruptly from a G+C-rich region to a G+C-poor region, or vice versa. Second, it is shown that long domains with relatively homogeneous G+C content along each chromosome do exist. These domains are thought to be isochores, which usually have sharp boundaries. Consequently, 56 isochores longer than 3 Mb have been identified in chromosomes 1-22, X and Y. Boundaries, size and G+C content of each isochore identified are listed in detail. As an example to demonstrate the power of the method, the boundary between the Classes III and II isochores of the MHC sequence has been determined and found to be at 2,477,936, which is in good agreement with the experimental evidence. A homogeneity index is introduced to measure the homogeneity of G+C content in isochores. We emphasize that the homogeneity of G+C content is relative. Isochores in which the G+C content remains absolutely constant do not exist. Isochore structures appear to be a basic organization of the human genome. Due to the relevance to many important biological functions, the clarification of isochore structures will provide much insight into the understanding of the human genome.
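The idea of reading G+C content from a cumulative curve rather than from fixed windows can be sketched as follows. This is a simplified illustration, not the authors' implementation: it assumes the relevant Z curve component is the running count of (A+T) minus (G+C), and the published method may differ in detail (for instance in how the curve is detrended). The names `z_component` and `gc_between` are hypothetical.

```python
from itertools import accumulate


def z_component(seq: str) -> list[int]:
    """Cumulative (A+T) - (G+C) count; z[n] covers the first n bases, z[0] = 0.

    Assumption: ambiguous symbols such as N contribute 0 to the curve.
    """
    steps = (1 if b in "AT" else -1 if b in "GC" else 0 for b in seq.upper())
    return [0] + list(accumulate(steps))


def gc_between(z: list[int], m: int, n: int) -> float:
    """Average G+C fraction of bases m+1..n, read from the slope of z.

    Over n - m unambiguous bases, z[n] - z[m] = (n - m) * (1 - 2 * GC),
    so GC = (1 - slope) / 2.
    """
    if not 0 <= m < n < len(z):
        raise ValueError("need 0 <= m < n <= sequence length")
    slope = (z[n] - z[m]) / (n - m)
    return 0.5 * (1.0 - slope)


z = z_component("GGCCATAT")
print(gc_between(z, 0, 4), gc_between(z, 4, 8), gc_between(z, 0, 8))
```

Because the curve is computed once, any pair of positions can be queried afterwards, and a sharp change in slope marks a candidate boundary between G+C-rich and G+C-poor domains.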

8.
9.
10.
A novel method to calculate the G+C content of genomic DNA sequences.   Cited by: 2 (self-citations: 0; citations by others: 2)
The base composition of a DNA fragment or genome is usually measured by the proportion of A+T or G+C in the sequence. The G+C content along genomic sequences is usually calculated using an overlapping or non-overlapping sliding window method. The result and accuracy of such an approach depend on the size of the window and the moving distance adopted. In this paper, a novel windowless technique to calculate the G+C content of genomic sequences is proposed. By this method, the G+C content can be calculated at different "resolutions". In an extreme case, the G+C content may be computed at a specific point, rather than in a window of finite size. This is particularly useful to analyze the fine variation of base composition along genomic sequences. As the first example, the variation of G+C content along each of the 16 yeast chromosomes is analyzed. The G+C-rich regions longer than 5 kb are detected and listed in detail. It is found that each chromosome consists of several alternating G+C-rich and G+C-poor regions, i.e., a mosaic structure. Another example is to analyze the G+C content for each of the two chromosomes of the Vibrio cholerae genome. Based on the variations of the G+C content in each chromosome, it is shown that some fragments in the Vibrio cholerae genome may have been transferred from other species. In particular, the position and size of the large integron island on the smaller chromosome were precisely predicted. This method would be a useful tool for analyzing genomic sequences.
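For contrast with the windowless technique, the conventional sliding-window calculation mentioned in the abstract above can be sketched as below. The function name `sliding_gc` and its parameters are illustrative assumptions; setting `step` equal to `window` gives non-overlapping windows, while a smaller `step` gives overlapping ones.

```python
def sliding_gc(seq: str, window: int, step: int) -> list[tuple[int, float]]:
    """Return (start index, G+C fraction) for each full window along seq."""
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    seq = seq.upper()
    results = []
    # Only full-length windows are reported; a short tail is skipped.
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        gc = chunk.count("G") + chunk.count("C")
        results.append((start, gc / window))
    return results


# The same sequence gives different profiles for different step sizes.
print(sliding_gc("GGGGAAAA", window=4, step=2))
print(sliding_gc("GGGGAAAA", window=4, step=4))
```

The two calls show the dependence on window size and moving distance that the abstract describes: the overlapping run reports an intermediate value that the non-overlapping run never sees.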

11.
Characteristics of human and mouse orthologous gene sequences which have large G+C content variations were investigated in this study. The orthologous gene pairs were classified into two groups according to the deviation between human and mouse G+C content at the third codon position (GC3) and were subsequently analyzed. In one group, mouse genes had higher GC3 than the corresponding human genes and in the other group, human genes had higher GC3 than mouse genes. Furthermore, the orthologous pairs were separated based on the deviation between human or mouse GC3 and the G+C content at the third codon position of identical codons (IC3), to examine the effect of increased or decreased G+C content in human or mouse sequences. The nucleotide substitution patterns between human and mouse sequences in the two groups were remarkably distinct, and consistent with the state of G+C-rich or G+C-poor sequences. The effect of increase or decrease of G+C content in human or mouse sequences was not clear in the nucleotide substitution patterns. The chromosomal locations of human and mouse orthologous gene pairs were different between the two groups. The genes located on an identical syntenic segment showed the trend of having similar G+C content. Moreover, the same gene order of some genes on different chromosomes of both species demonstrated the gene rearrangements between human and mouse. Our study indicated that the chromosomal locations and rearrangements are associated with the GC3 variation between human and mouse sequences. Key words: human mouse orthologs, G+C content variation, nucleotide substitution, gene location, gene rearrangement.

12.
13.
14.
The distribution of interspersed repetitive DNA sequences in the human genome   Cited by: 25 (self-citations: 0; citations by others: 25)
The distribution of interspersed repetitive DNA sequences in the human genome has been investigated, using a combination of biochemical, cytological, computational, and recombinant DNA approaches. "Low-resolution" biochemical experiments indicate that the general distribution of repetitive sequences in human DNA can be adequately described by models that assume a random spacing, with an average distance of 3 kb. A detailed "high-resolution" map of the repetitive sequence organization along 400 kb of cloned human DNA, including 150 kb of DNA fragments isolated for this study, is consistent with this general distribution pattern. However, a higher frequency of spacing distances greater than 9.5 kb was observed in this genomic DNA sample. While the overall repetitive sequence distribution is best described by models that assume a random distribution, an analysis of the distribution of Alu repetitive sequences appearing in the GenBank sequence database indicates that there are local domains with varying Alu placement densities. In situ hybridization to human metaphase chromosomes indicates that local density domains for Alu placement can be observed cytologically. Centric heterochromatin regions, in particular, are at least 50-fold underrepresented in Alu sequences. The observed distribution for repetitive sequences in human DNA is the expected result for sequences that transpose throughout the genome, with local regions of "preference" or "exclusion" for integration.

15.
Watanabe Y, Tenzen T, Nagasaka Y, Inoko H, Ikemura T. Gene, 2000, 252(1-2): 163-172
The human genome is composed of long-range G+C% mosaic structures, which are thought to be related to chromosome bands. Replication timing during S phase is associated with chromosomal band zones; thus, band boundaries are thought to correspond to regions where replication timing switches. The proximal limit of the human X-inactivation center (XIC) has been localized cytologically to the junction zone between Xq13.1 and Xq13.2. Using PCR-based quantification of the newly replicated DNA from cell-cycle fractionated THP-1 cells, the replication timing in and around the XIC was determined at the genome sequence level. We found two regions where replication timing changes from the early to late period during S phase. One is located near a large inverted duplication proximal to the XIC, and the other is near the XIST locus. We propose that the 1 Mb late-replicated zone (from the large inverted duplication to XIST) corresponds to the G-band Xq13.2. Several common characteristics were observed in the XIST region and the MHC class II-III junction, which was previously defined as a band boundary. These characteristics included differential high-density clustering of Alu and LINE repeats, and the presence of polypurine/polypyrimidine tracts, MER41A, MER57 and MER58B.

16.
The N-myc amplification of human neuroblastomas was characterized by the amplified DNA cloned from the cell line MC-NB-1 using the phenol emulsion reassociation technique (PERT). A number of PERT clones exhibiting amplification in this cell line were tested for amplification in other neuroblastoma cell lines. In almost all cell lines examined, only a few clones were co-amplified with N-myc and most of the others were exclusively amplified in a subset of the cell lines. The total aggregate size of the Hind III fragment identified by the PERT clones was approximately 350 kb. Most of the PERT clones were mapped to human chromosome (chr) 2p23-2pter, where the N-myc gene is located. Four types of amplicons, the 100, 420, 480 and 520 kb fragments, shown to be Not I fragments, were identified by hexagonal field gel electrophoresis. Three fragments are ordered in a head-to-tail array, and the remaining fragment is ordered either in a tail-to-head array or in some other arrangement. Despite the extremely unusual construction of the amplified sequences in this cell line as compared with others, there was a low degree of sequence heterogeneity among the amplicons within this cell line. These observations lead to the idea that the complex rearrangements that give rise to the heterogeneous organization of the amplified sequences among the different cell lines precede the amplification of these sequences.

17.

Background

Previous studies have shown that microRNA precursors (pre-miRNAs) have considerably more stable secondary structures than other native RNAs (tRNA, rRNA, and mRNA) and artificial RNA sequences. However, pre-miRNAs with ultra-stable secondary structures have not been investigated. It is not known whether pre-miRNA sequences tend towards or against ultra-stable structures. Furthermore, the relationship between the structural thermodynamic stability of pre-miRNAs and their evolution remains unclear.

Results

We investigated the correlation between pre-miRNA sequence conservation and structural stability as measured by adjusted minimum folding free energies in pre-miRNAs isolated from human, mouse, and chicken. The analysis revealed that conserved and non-conserved pre-miRNA sequences had structures with similar average stabilities. However, the relatively ultra-stable and unstable pre-miRNAs were more likely to be non-conserved than pre-miRNAs with moderate stability. Non-conserved pre-miRNAs had more G+C than A+U nucleotides, while conserved pre-miRNAs contained more A+U nucleotides. Notably, the U content of conserved pre-miRNAs was higher than that of non-conserved pre-miRNAs. Further investigations showed that conserved and non-conserved pre-miRNAs exhibited different structural element features, even though they had comparable levels of stability.

Conclusions

We propose that there is a correlation between structural thermodynamic stability and sequence conservation for pre-miRNAs from human, mouse, and chicken genomes. Our analyses suggest that pre-miRNAs with relatively ultra-stable or unstable structures were less favoured by natural selection than those with moderately stable structures. Comparison of nucleotide compositions between non-conserved and conserved pre-miRNAs indicated the importance of U nucleotides in the pre-miRNA evolutionary process. Several characteristic structural elements were also detected in conserved pre-miRNAs.

18.
Characterization of the human p53 gene.   Cited by: 54 (self-citations: 5; citations by others: 49)
Cosmid and lambda clones containing the human p53 gene were isolated and characterized in detail. The gene is 20 kilobases (kb) long and has 11 exons, the first and second exons being separated by an intron of 10 kb. Restriction fragments upstream of sequences known to be within the first identified exon were tested for promoter activity by cloning them in front of the chloramphenicol acetyltransferase gene and transfecting the resulting constructs into HeLa cells. A 0.35-kb DNA fragment was identified that had promoter activity. Results of primer extension experiments indicated that the mRNA cap site falls within this fragment, as expected. Analysis of the sequence upstream of the presumptive cap site indicated that the human p53 promoter may be of an unusual type.

19.
Butyrate producers constitute an important bacterial group in the human large intestine. Butyryl-CoA is formed from two molecules of acetyl-CoA in a process resembling beta-oxidation in reverse. Three different arrangements of the six genes coding for this pathway have been found in low mol% G+C-content gram-positive human colonic bacteria using DNA sequencing and degenerate PCR. Gene arrangements were strongly conserved within phylogenetic groups defined by 16S rRNA gene sequence relationships. In the case of one of the genes, encoding beta-hydroxybutyryl-CoA dehydrogenase, however, sequence relationships were strongly suggestive of horizontal gene transfer between lineages. The newly identified gene for butyryl-CoA CoA-transferase, which performs the final step in butyrate formation in most known human colonic bacteria, was not closely linked to these central pathway genes.

20.
Normal human plasma alpha 2HS-glycoprotein has earlier been shown to be comprised of two polypeptide chains. Recently, the amino acid and carbohydrate sequences of the short chain were elucidated (Gejyo, F., Chang, J.-L., Bürgi, W., Schmid, K., Offner, G. D., Troxler, R.F., van Halbeck, H., Dorland, L., Gerwig, G. J., and Vliegenthart, J.F.G. (1983) J. Biol. Chem. 258, 4966-4971). In the present study, the amino acid sequence of the long chain of this protein, designated A-chain, was determined and found to consist of 282 amino acid residues. Twenty-four amino acid doublets were found; the most abundant of these are Pro-Pro and Ala-Ala which each occur five times. Of particular interest is the presence of three Gly-X-Pro and one Gly-Pro-X sequences that are characteristic of the repeating sequences of collagens. Chou-Fasman evaluation of the secondary structure suggested that the A-chain contains 29% alpha-helix, 24% beta-pleated sheet, and 26% reverse turns and, thus, approximately 80% of the polypeptide chain may display ordered structure. Four glycosylation sites were identified. The two N-glycosidic oligosaccharides were found in the center region (residues 138 and 158), whereas the two O-glycosidic heterosaccharides, both linked to threonine (residues 238 and 252), occur within the carboxyl-terminal region. The N-glycans are linked to Asn residues in beta-turns, while the O-glycans are located in short random segments. Comparison of the sequence of the amino- and carboxyl-terminal 30 residues with protein sequences in a data bank demonstrated that the A-chain is not significantly related to any known proteins. However, the proline-rich carboxyl-terminal region of the A-chain displays some sequence similarity to collagens and the collagen-like domains of complement subcomponent C1q.
