Similar articles
20 similar articles found (search time: 31 ms)
1.
Informativeness of human (dC-dA)n.(dG-dT)n polymorphisms   Total citations: 133 (self-citations: 0; citations by others: 133)
J L Weber. Genomics 1990, 7(4): 524-530
Abundant human interspersed repetitive DNA sequences of the form (dC-dA)n.(dG-dT)n have been shown to exhibit length polymorphisms. Examination of over 100 human (dC-dA)n.(dG-dT)n sequences revealed that the sequences differed from each other both in numbers of repeats and in repeat sequence type. Using a set of precise classification rules, the sequences were divided into three categories: perfect repeat sequences without interruptions in the runs of CA or GT dinucleotides (64% of total), imperfect repeat sequences with one or more interruptions in the run of repeats (25%), and compound repeat sequences with adjacent tandem simple repeats of a different sequence (11%). Informativeness of (dC-dA)n.(dG-dT)n markers in the perfect sequence category was found to increase with increasing average numbers of repeats. PIC values ranged from 0 at about 10 or fewer repeats to above 0.8 for sequences with about 24 or more repeats. (dC-dA)n.(dG-dT)n polymorphisms in the imperfect sequence category showed lower informativeness than expected on the basis of the total numbers of repeats. The longest run of uninterrupted CA or GT repeats was found to be the best predictor of informativeness of (dC-dA)n.(dG-dT)n polymorphisms regardless of the repeat sequence category.
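
The two quantities this abstract relies on, the PIC value of a marker and the longest uninterrupted run of CA or GT repeats, can be sketched as follows. This is only an illustration: the abstract does not give a formula, so the standard definition PIC = 1 - Σ p_i² - Σ_{i<j} 2 p_i² p_j² is assumed, and the function names `pic` and `longest_ca_gt_run` are hypothetical.

```python
import re


def pic(allele_freqs):
    """Polymorphism information content from allele frequencies or counts.

    Assumes the standard formula
    PIC = 1 - sum(p_i^2) - sum_{i<j} 2 * p_i^2 * p_j^2;
    the abstract itself does not state which formula was used.
    """
    values = list(allele_freqs)
    total = sum(values)
    p = [v / total for v in values]  # normalise counts to frequencies
    homozygosity = sum(x * x for x in p)
    cross = sum(
        2 * p[i] ** 2 * p[j] ** 2
        for i in range(len(p))
        for j in range(i + 1, len(p))
    )
    return 1 - homozygosity - cross


def longest_ca_gt_run(seq):
    """Length, in repeat units, of the longest uninterrupted (CA)n or (GT)n run."""
    runs = re.finditer(r"(?:CA)+|(?:GT)+", seq.upper())
    return max((len(m.group(0)) // 2 for m in runs), default=0)
```

For example, a marker with two equally frequent alleles gives `pic([0.5, 0.5]) == 0.375`, and an interrupted tract such as `"CACACATTCACA"` has a longest run of 3 even though it contains 5 CA units in total.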

2.
Some bacterial genomes are known to have low CpG dinucleotide frequencies. Although the causes are not clearly understood, the frequency of CpG is suppressed significantly in the genome of Mycoplasma genitalium, but not in that of Mycoplasma pneumoniae. We compared orthologous gene pairs of the two closely related species to analyze CpG substitution patterns between these two genomes. We also divided genome sequences into three regions: protein-coding, noncoding, and RNA-coding, and obtained the CpG frequencies for each region for each organism. It was found that the observed/expected ratio of CpG dinucleotides is low in both the protein-coding and noncoding regions, while that ratio is in the normal range in the RNA-coding region. Our results indicate that CpG suppression of the Mycoplasma genome is not caused by (1) biased amino acid usage; (2) biased synonymous codon usage; or (3) methylation effects by the CpG methyltransferase in the genomes of their hosts. Instead, we consider it likely that a certain global pressure, such as genome-wide pressure for the advantages of DNA stability or replication, has the effect of decreasing CpG over the entire genome, which, in turn, resulted in the biased codon usage.
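
The observed/expected CpG ratio mentioned above can be illustrated with a short sketch. The abstract does not say which normalisation was used; the version below assumes the common form O/E = (number of CpG × sequence length) / (number of C × number of G), and the function name `cpg_obs_exp` is hypothetical.

```python
def cpg_obs_exp(seq):
    """Observed/expected CpG ratio of a DNA sequence.

    Assumed formula (not stated in the abstract):
    O/E = (#CpG * N) / (#C * #G), where N is the sequence length.
    Returns 0.0 when the sequence has no C or no G.
    """
    s = seq.upper()
    n_c = s.count("C")
    n_g = s.count("G")
    if n_c == 0 or n_g == 0:
        return 0.0
    n_cpg = s.count("CG")  # "CG" cannot overlap itself, so count() is exact
    return n_cpg * len(s) / (n_c * n_g)
```

A ratio near 1 means CpG occurs about as often as the base composition predicts; values well below 1 indicate suppression. Applying the function separately to protein-coding, noncoding, and RNA-coding segments would mirror the per-region comparison described in the abstract.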

3.
4.
We isolated clones and determined the sequence of portions of mouse and human cellular DNA which cross-hybridize strongly with the IR3 repetitive region of Epstein-Barr virus. The sequences were found to be tandem arrays of a simple sequence based on the triplet GGA, very similar to the IR3 repeat. The cellular repeats have distinct differences from the viral repeat region, however, and their sequences do not appear capable of being translated into a purely glycine-plus-alanine protein domain like the portion of the Epstein-Barr nuclear antigen coded by IR3. Although the relationship between IR3 and the cellular repeats is left unclear, the cellular repeats have many interesting features. The tandem arrays are about 1 to several kilobases long, much shorter than satellite tandem repeats and larger than other interspersed, tandem repeats. Each of the repeats is a distinct variation, perhaps diverged from a common sequence, (GGA)n. This family is present in the genomes of all species tested and appears to be a ubiquitous feature of all higher eucaryotic genomes.

5.
Mechanisms and applications of immune stimulatory CpG oligodeoxynucleotides   Total citations: 11 (self-citations: 0; citations by others: 11)
Immune stimulation has been widely recognized as an undesirable side effect of certain antisense oligodeoxynucleotides (ODN) which can interfere with their therapeutic application. It is now clear that these dose-dependent immune stimulatory effects primarily result from the presence of an unmethylated CpG dinucleotide in particular base contexts ('CpG motif'). The sequence-specific immune activation is not just an experimental artifact, but is actually a highly evolved immune defense mechanism whose actual 'goal' is the detection of microbial nucleic acids. In contrast to vertebrate DNA, in which CpG dinucleotides are 'suppressed' and are highly methylated, microbial genomes do not generally feature CpG suppression or methylation [1]. Immune effector cells such as B cells, macrophages, dendritic cells, and natural killer cells appear to have evolved pattern recognition receptors (PRR) that, by binding the microbe-restricted structure of CpG motifs, trigger protective immune responses. Although the specific immune activation appears to have a variety of potential therapeutic applications, it is generally undesirable in antisense ODN. Immune stimulation may be avoided in antisense oligos by the selection of CpG-free target sequences, by the use of ODN backbones that do not support immune stimulation, or by selective modifications of the cytosine in any CpG dinucleotides.

6.
Genome organization of herpesvirus aotus type 2.   Total citations: 2 (self-citations: 1; citations by others: 1)
Herpesvirus aotus type 2, a virus commonly found in owl monkeys without overt disease, has a similar genome structure to the oncogenic herpesviruses of nonhuman primates (herpesvirus saimiri, herpesvirus ateles). Virion DNA of herpesvirus aotus type 2 (M-DNA) has a unique 110-kilobase-pair region of low G + C content (40.2%, L-DNA), inserted between stretches of repetitive H-DNA (68.7% G + C, about 41 kilobase pairs per molecule) that are variable in length. A minority of virions contain defective genomes that consist of repetitive H-DNA only. The H-DNA is composed of various types of repeat units that are related in sequence to each other. The two dominant types of repeats (2.3 and 2.7 kilobase pairs) were cloned and compared by restriction enzyme cleavages and partial nucleotide sequencing. They are homologous in at least 1.3 kilobase pairs. The two forms of repeat units are randomly arranged and oriented in tandem. Reassociation kinetics did not allow detection of sequence homologies between H- and L-DNA of herpesvirus aotus type 2 and the respective sequences of oncogenic primate herpesviruses.

7.
We consider the substitution model T92+CpG of DNA sequence evolution, which takes into account the hypermutability of CpG dinucleotides, an effect that can be especially observed in vertebrate genomes. We provide an exact method to simulate the evolution of finite DNA sequences under this model and numerical procedures to infer evolutionary times in two cases: between an ancestral and a present sequence and between two homologous sequences. We show on simulated data that our new numerical method yields very accurate estimations of divergence times. In a context of strong CpG hypermutability, it clearly outperforms the classical estimation procedure that is solely based on the model T92 without CpG influence. Supplementary Material is available at www.liebertonline.com/cmb.

8.
CpG islands in vertebrate genomes   Total citations: 120 (self-citations: 0; citations by others: 120)

9.
Simple sequence repeats (SSRs) or microsatellites constitute a countable portion of genomes. However, the significance of SSRs in organelle genomes has not been completely understood. The availability of organelle genome sequences allows us to understand the organization of SSRs in their genic and intergenic regions. In the current study we surveyed the patterns of SSRs in mitochondrial genomes of different taxa of plants. A total of 16 mitochondrial genomes, from algae to angiosperms, were considered to analyze the pattern of simple sequence repeats present in them. Based on this study, the mononucleotide repeats of A/T were found to be more prevalent in mitochondrial genomes than other repeat types. The dinucleotide repeats, TA/AT, were the second most numerous, whereas tri-, tetra-, and pentanucleotide repeats were fewer in number and present in intronic or intergenic portions only. Mononucleotide repeats prevailed in protein-coding exonic portions of all organisms. These results indicate that the microsatellite pattern in mitochondrial genomes is different from that in nuclear genomes, and the study also focuses on organization and diversity at SSR loci in mitochondrial genomes. This is a novel report of microsatellite polymorphism in plant mitochondria at the whole-genome level.
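
A survey of this kind starts by locating perfect SSRs in each genome sequence. The sketch below does so with regular-expression backreferences. It is only an illustration: the abstract does not describe the search tool or its thresholds, so the function name `find_ssrs` and the default minimum repeat counts (10 for mono-, 6 for di-, 5 for trinucleotide units) are assumptions.

```python
import re


def find_ssrs(seq, min_repeats=None):
    """Locate perfect simple sequence repeats in a DNA sequence.

    min_repeats maps unit length -> minimum number of tandem copies.
    The defaults are hypothetical thresholds, not those of the study.
    Returns (start, unit, copies) tuples, grouped by unit length.
    """
    thresholds = min_repeats or {1: 10, 2: 6, 3: 5}
    s = seq.upper()
    hits = []
    for unit_len, min_copies in sorted(thresholds.items()):
        # One unit followed by at least (min_copies - 1) further copies of it.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_copies - 1))
        for m in pattern.finditer(s):
            unit = m.group(1)
            # Skip units that are repeats of a shorter motif (e.g. "AA"),
            # since those tracts belong to the shorter unit class.
            if (unit + unit).find(unit, 1) != len(unit):
                continue
            hits.append((m.start(), unit, len(m.group(0)) // unit_len))
    return hits
```

Counting the returned hits by unit length, and by whether `start` falls in a genic or intergenic interval, would give the kind of tallies the abstract reports.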

10.
Repetitive sequences are a major constituent of many eukaryote genomes and play roles in gene regulation, chromosome inheritance, nuclear architecture, and genome stability. The identification of repetitive elements has traditionally relied on in-depth, manual curation and computational determination of close relatives based on DNA identity. However, the rapid divergence of repetitive sequence has made identification of repeats by DNA identity difficult even in closely related species. Hence, the presence of unidentified repeats in genome sequences affects the quality of gene annotations and annotation-dependent analyses (e.g. microarray analyses). We have developed an enhanced repeat identification pipeline using two approaches. First, the de novo repeat finding program PILER-DF was used to identify interspersed repetitive elements in several recently finished Dipteran genomes. Repeats were classified, when possible, according to their similarity to known elements described in Repbase and GenBank, and also screened against annotated genes as one means of eliminating false positives. Second, we used a new program called RepeatRunner, which integrates results from both RepeatMasker nucleotide searches and protein searches using BLASTX. Using RepeatRunner with PILER-DF predictions, we masked repeats in thirteen Dipteran genomes and conclude that combining PILER-DF and RepeatRunner greatly enhances repeat identification in both well-characterized and un-annotated genomes.

11.
Microsatellites are widely distributed in plant genomes and comprise unstable regions that undergo mutational changes at rates much greater than that observed for non-repetitive sequences. They demonstrate intrinsic genetic instability, manifested as frequent length changes due to insertions or deletions of repeat units. Detailed analysis of 1600 clones containing genomic sequences of Vicia bithynica revealed the presence of microsatellite repeats in its genome. Based on the screening of a partial DNA library of plasmids, 13 clones harbouring (GA/TC)n tracts of various lengths of repeated motif were identified for further analysis of their internal sequence organization. Sequence analyses revealed the precise length, number of repeats, interruptions within tracts, as well as sequence composition flanking the repeat motifs. Representative plasmids containing different lengths of (GA/TC)n embedded in their original flanking sequence were used to investigate the genetic stability of the repeats. In the study presented herein, we employed a well characterised and tractable bacterial genetic system. Recultivations of Escherichia coli harbouring plasmids containing (GA/TC)n inserts demonstrated that the genetic instability of (GA/TC)n microsatellites depends highly on their length (number of repeats). These observations are in agreement with similar studies performed on repetitive sequences from humans and other organisms.

12.
Two novel repetitive sequence families were isolated from Turritis glabra (2n = 2x = 12). These two repeat families are similar to those of centromeric repeats in Arabidopsis thaliana, are co-localized on one chromosome pair, and differ by about 20% from each other. Phylogenetic analysis revealed that the two repeat families of T. glabra are more similar to each other than to the centromeric repeat families of other Arabidopsis and related species. The relationships of satellite sequences reflected the species phylogeny, indicating that the replacement of satellite sequences has occurred in each species lineage independently, and shared variants could not have existed for a long time between species.

13.
Parvoviruses are rapidly evolving viruses that infect a wide range of hosts, including vertebrates and invertebrates. Extensive methylation of the parvovirus genome has been recently demonstrated. A global pattern of methylation of CpG dinucleotides is seen in vertebrate genomes, compared to “fractional” methylation patterns in invertebrate genomes. It remains unknown if the loss of CpG dinucleotides occurs in all viruses of a given DNA virus family that infect host species spanning across vertebrates and invertebrates. We investigated the link between the extent of CpG dinucleotide depletion among autonomous parvoviruses and the evolutionary lineage of the infected host. We demonstrate major differences in the relative abundance of CpG dinucleotides among autonomous parvoviruses which share similar genome organization and common ancestry, depending on the infected host species. Parvoviruses infecting vertebrate hosts had significantly lower relative abundance of CpG dinucleotides than parvoviruses infecting invertebrate hosts. The strong correlation of CpG dinucleotide depletion with the gain in TpG/CpA dinucleotides and the loss of TpA dinucleotides among parvoviruses suggests a major role for CpG methylation in the evolution of parvoviruses. Our data present evidence that links the relative abundance of CpG dinucleotides in parvoviruses to the methylation capabilities of the infected host. In sum, our findings support a novel perspective of host-driven evolution among autonomous parvoviruses.

14.
Han L, Su B, Li WH, Zhao Z. Genome Biology 2008, 9(5): R79

Background  

CpG islands, which are clusters of CpG dinucleotides in GC-rich regions, are considered gene markers and represent an important feature of mammalian genomes. Previous studies of CpG islands have largely been on specific loci or within one genome. To date, there seems to be no comparative analysis of CpG islands and their density at the DNA sequence level among mammalian genomes and of their correlations with other genome features.

15.
Simmen MW. Genomics 2008, 92(1): 33-40
In mammalian genomes CpGs occur at one-fifth their expected frequency. This is accepted as resulting from cytosine methylation and deamination of 5-methylcytosine leading to TpG and CpA dinucleotides. The corollary that a CpG deficit should correlate with TpG excess has not hitherto been systematically tested at a genomic level. I analyzed genome sequences (human, chimpanzee, mouse, pufferfish, zebrafish, sea squirt, fruitfly, mosquito, and nematode) to do this and generally to assess the hypothesis that CpG deficit, TpG excess, and other data are accountable in terms of 5-methylcytosine mutation. In all methylated genomes local CpG deficit decreases with higher G + C content. Local TpG surplus, while positively associated with G + C level in mammalian genomes but negatively associated with G + C in nonmammalian methylated genomes, is always explicable in terms of the CpG trend under the methylation model. Covariance of dinucleotide abundances with G + C demonstrates that correlation analyses should control for G + C. Doing this reveals a strong negative correlation between local CpG and TpG abundances in methylated genomes, in accord with the methylation hypothesis. CpG deficit also correlates with CpT excess in mammals, which may reflect enhanced cytosine mutation in the context 5'-YCG-3'. Analyses with repeat-masked sequences show that the results are not attributable to repetitive elements.

16.
Repeat elements are important components of eukaryotic genomes. One limitation in our understanding of repeat elements is that most analyses rely on reference genomes that are incomplete and often contain missing data in highly repetitive regions that are difficult to assemble. To overcome this problem we develop a new method, REPdenovo, which assembles repeat sequences directly from raw shotgun sequencing data. REPdenovo can construct various types of repeats that are highly repetitive and have low sequence divergence within copies. We show that REPdenovo is substantially better than existing methods both in terms of the number and the completeness of the repeat sequences that it recovers. The key advantage of REPdenovo is that it can reconstruct long repeats from sequence reads. We apply the method to human data and discover a number of potentially new repeat sequences that have been missed by previous repeat annotations. Many of these sequences are incorporated into various parasite genomes, possibly because the filtering process for host DNA involved in the sequencing of the parasite genomes failed to exclude the host-derived repeat sequences. REPdenovo is a powerful new computational tool for annotating genomes and for addressing questions regarding the evolution of repeat families. The software tool, REPdenovo, is available for download at https://github.com/Reedwarbler/REPdenovo.

17.
Abundant human interspersed repetitive DNA sequences of the form (dC-dA)n · (dG-dT)n have been shown to exhibit length polymorphisms. Examination of over 100 human (dC-dA)n · (dG-dT)n sequences revealed that the sequences differed from each other both in numbers of repeats and in repeat sequence type. Using a set of precise classification rules, the sequences were divided into three categories: perfect repeat sequences without interruptions in the runs of CA or GT dinucleotides (64% of total), imperfect repeat sequences with one or more interruptions in the run of repeats (25%), and compound repeat sequences with adjacent tandem simple repeats of a different sequence (11%). Informativeness of (dC-dA)n · (dG-dT)n markers in the perfect sequence category was found to increase with increasing average numbers of repeats. PIC values ranged from 0 at about 10 or fewer repeats to above 0.8 for sequences with about 24 or more repeats. (dC-dA)n · (dG-dT)n polymorphisms in the imperfect sequence category showed lower informativeness than expected on the basis of the total numbers of repeats. The longest run of uninterrupted CA or GT repeats was found to be the best predictor of informativeness of (dC-dA)n · (dG-dT)n polymorphisms regardless of the repeat sequence category.

18.
Zhao Z, Zhang F. Gene 2006, 366(2): 316-324
We analyzed n-mers (n=3-8) in the local environment of 8,249,446 human SNPs and compared their distribution with that in the genome reference sequences. The results revealed that the short sequences that contained at least one CpG dinucleotide occurred more frequently in the local SNP sequences than in the genome sequences. To exclude the hypermutability effect of the methylated CpG dinucleotides on the sequence context of SNPs, we examined the distribution patterns for each of the six categories of substitution. We observed a similar pattern (i.e., CpG-containing n-mers vs. non-CpG-containing n-mers) in SNP categories A/G, C/T and C/G but the opposite pattern in category A/T. We next identified 34,928 putative CpG islands in the human genome and located 133,591 SNPs within these islands. In the CpG islands, CpG SNPs were 3.92-fold less prevalent relative to the presence of CpG dinucleotides. Conversely, in the human genome, the frequency of CpG dinucleotides at the polymorphic sites was 6.09 times that in the genome reference sequences. These results support the previous views of mutational suppression at the CpG sites in the CpG islands and hypermutability of the methylated CpG dinucleotides that are prevalent in the non-CpG island sequences in the human genome. Our study represents a comprehensive investigation of the sequence context of SNPs in the human genome and in human CpG islands.

19.
We screened plant genome sequences, primarily from rice and Arabidopsis thaliana, for CpG islands, and identified DNA segments rich in CpG dinucleotides within these sequences. These CpG-rich clusters appeared in the analysed sequences as discrete peaks and occurred at the frequencies of one per 4.7 kb in rice and one per 4.0 kb in A. thaliana. In rice and A. thaliana, most of the CpG-rich clusters were associated with genes, which suggests that these clusters are useful landmarks in genome sequences for identifying genes in plants with small genomes. In contrast, in plants with larger genomes, only a few of the clusters were associated with genes. These plant CpG-rich clusters satisfied the criteria used for identifying human CpG islands, which suggests that these CpG clusters may be regarded as plant CpG islands. The position of each island relative to the 5'-end of its associated gene varied considerably. Genes in the analysed sequences were grouped into five classes according to the position of the CpG islands within their associated genes. A large proportion of the genes belonged to one of two classes, in which a CpG island occurred near the 5'-end of the gene or covered the whole gene region. The position of a plant CpG island within its associated gene appeared to be related to the extent of tissue-specific expression of the gene; the CpG islands of most of the widely expressed rice genes occurred near the 5'-end of the genes.
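
A sliding-window scan is one simple way to screen a sequence for CpG-rich clusters of the kind described above. The sketch below is only an illustration: the abstract does not list the criteria it applied, so the thresholds used here (200 bp window, G + C content of at least 50%, observed/expected CpG of at least 0.6) are assumed from commonly cited values for human CpG islands, and the function name `cpg_island_windows` is hypothetical.

```python
def cpg_island_windows(seq, window=200, step=1, min_gc=0.5, min_oe=0.6):
    """Return merged (start, end) intervals whose windows look CpG-island-like.

    A window qualifies when its G + C fraction is >= min_gc and its
    observed/expected CpG ratio, (#CpG * window) / (#C * #G), is >= min_oe.
    All thresholds are assumptions, not values taken from the abstract.
    Overlapping or touching qualifying windows are merged.
    """
    s = seq.upper()
    islands = []
    for start in range(0, len(s) - window + 1, step):
        w = s[start:start + window]
        n_c, n_g = w.count("C"), w.count("G")
        gc = (n_c + n_g) / window
        oe = w.count("CG") * window / (n_c * n_g) if n_c and n_g else 0.0
        if gc >= min_gc and oe >= min_oe:
            end = start + window
            if islands and start <= islands[-1][1]:
                islands[-1] = (islands[-1][0], end)  # extend previous interval
            else:
                islands.append((start, end))
    return islands
```

On a toy sequence built as `"AT" * 100 + "CG" * 100 + "AT" * 100`, the scan reports a single interval `(100, 500)`: windows extend up to 100 bp into the AT-rich flanks before their G + C content drops below 50%. Comparing the reported intervals with gene coordinates would then show where each island lies relative to the 5'-end of its gene.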

20.